Installation

Introduction

Humio is a fast and flexible platform for logs and metrics, available self-hosted or through Humio Cloud.

Humio Cloud

Humio Cloud currently runs in two regions: EU (cloud.humio.com) and US (cloud.us.humio.com). Throughout the documentation, “http://$YOUR_HUMIO_URL/” is a placeholder, where $YOUR_HUMIO_URL is the URL for your particular Humio Cloud installation.

Humio Self-hosted

If you choose to self-host your Humio instance, there are two primary ways of installing it:

  • Running it in a Docker container, or
  • Running it as a JAR file

If you are getting started with Humio, we recommend running it as a Docker container, since the container includes the external dependencies Humio needs: Kafka and Zookeeper. If you plan on running Humio on bare metal, please refer to our Bare Metal Installation Guide.

For information on how to choose hardware, and how to size your Humio installation, see Humio instance sizing.

Guides

Hardware Requirements

Hardware requirements depend on:

  • how much data you will be ingesting, and
  • how many concurrent searches you will be running.

Scaling Your Environment

Humio is built to scale, and it scales well across the nodes in a cluster. Running a cluster of three or more Humio nodes provides higher capacity in terms of both ingest and search performance, and also allows high availability by replicating data to more than one node.

If you want to run Humio as a cluster, please review Cluster Setup.

Estimating Resources

Here are a few guidelines to help you determine what hardware you’ll need.

  1. Assume data compresses 9x on ingest. Test your installation; better compression means better performance.
  2. You need to be able to hold 48 hours of compressed data in 80% of your RAM.
  3. You want enough hyper-threads/vCPUs (each giving you 1 GB/s of search throughput) to be able to search 24 hours of data in less than 10 seconds.
  4. You need disk space to hold your compressed data. Never fill your disk more than 80%.

Example Setup

Your machine has 64 GB of RAM, 8 hyper-threads (4 cores), and 1 TB of storage. It can hold 460 GB of ingested data, compressed, in RAM and can process 8 GB/s.
In this case, 10 seconds of query time runs through 80 GB of data, so this machine fits an ingest rate of 80 GB/day, with more than 5 days of data available for fast querying. You can store 7.2 TB of data before your disk is 80% full, which corresponds to 90 days at an 80 GB/day ingest rate.
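The guidelines and the example above can be turned into a small calculation. The following Python sketch is illustrative only: the `Machine` class and the `estimate` function are hypothetical names, not part of Humio, and the constants are simply the rule-of-thumb values from the list (9x compression, 80% usable RAM and disk, 1 GB/s per vCPU, a 10-second query budget).

```python
from dataclasses import dataclass

# Rule-of-thumb values taken from the guidelines; adjust to your own measurements.
COMPRESSION_RATIO = 9       # guideline 1: assumed compression on ingest
USABLE_FRACTION = 0.8       # guidelines 2 and 4: use at most 80% of RAM and disk
SEARCH_GB_PER_VCPU = 1      # guideline 3: about 1 GB/s of search per vCPU
QUERY_BUDGET_SECONDS = 10   # guideline 3: search one day of data in under 10 s


@dataclass
class Machine:
    ram_gb: float
    vcpus: int
    disk_gb: float


def estimate(machine: Machine) -> dict:
    """Apply the guidelines to a single machine (hypothetical helper)."""
    # Ingested data whose compressed form fits in 80% of RAM.
    ram_resident_gb = machine.ram_gb * USABLE_FRACTION * COMPRESSION_RATIO
    # Search throughput, and the daily ingest that can be scanned in the budget.
    search_gb_per_s = machine.vcpus * SEARCH_GB_PER_VCPU
    daily_ingest_gb = search_gb_per_s * QUERY_BUDGET_SECONDS
    # Ingested data that fits on disk while keeping 20% of the disk free.
    disk_resident_gb = machine.disk_gb * USABLE_FRACTION * COMPRESSION_RATIO
    return {
        "ram_resident_gb": ram_resident_gb,
        "search_gb_per_s": search_gb_per_s,
        "daily_ingest_gb": daily_ingest_gb,
        "fast_query_days": ram_resident_gb / daily_ingest_gb,
        "disk_resident_gb": disk_resident_gb,
        "retention_days": disk_resident_gb / daily_ingest_gb,
    }


result = estimate(Machine(ram_gb=64, vcpus=8, disk_gb=1000))
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

For the 64 GB, 8 vCPU, 1 TB machine this reproduces the figures in the example: about 460 GB resident in RAM, 80 GB/day of ingest, and 90 days of retention. Measure your own compression ratio and substitute it before relying on the result.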

This example assumes that all data has the same retention settings. But you can configure Humio to automatically delete some events before others, allowing some data to be kept for several years while other data gets deleted after one week, for example.

For more details, refer to our Instance Sizing Reference.

Non-cloud: Bare-Metal/VM/Kubernetes Workers

Assumptions:

  • 30 Day Retention on NVME
  • 20% Overhead left on NVME
  • 10x Compression
  • Secondary Storage can extend retention at slower speeds (SAN/NAS/RAID)
  • Kafka 5x Compression - 24 Hour Storage
  • Humio does not provide a self-hosted Kubernetes solution for Kafka and Zookeeper

Zookeeper/Kafka clusters are separate from Humio clusters to avoid resource contention and allow independent management.
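As a rough illustration of how the assumptions above translate into capacity, the sketch below computes the minimum storage for a single copy of the data. It is not an official sizing formula: the function names are hypothetical, "20% overhead" is interpreted here as keeping 20% of the disk free, and replication is deliberately not modelled, so the sizing tiers may include additional headroom.

```python
def humio_primary_storage_tb(ingest_tb_per_day, retention_days=30,
                             compression=10, free_fraction=0.2):
    """Raw NVME capacity for one copy of the data (hypothetical helper)."""
    compressed_tb = ingest_tb_per_day * retention_days / compression
    # Keep `free_fraction` of the disk empty, so divide by the usable share.
    return compressed_tb / (1 - free_fraction)


def kafka_storage_tb(ingest_tb_per_day, hours=24, compression=5):
    """Capacity for one copy of the Kafka queue (hypothetical helper)."""
    return ingest_tb_per_day * (hours / 24) / compression


tiers = [("X-Small", 1), ("Small", 3), ("Medium", 5), ("Large", 10), ("X-Large", 30)]
for tier, ingest in tiers:
    print(f"{tier}: Humio >= {humio_primary_storage_tb(ingest):.2f} TB, "
          f"Kafka >= {kafka_storage_tb(ingest):.2f} TB")
```

At 1 TB/day, for example, 30 days of data at 10x compression is 3 TB, which needs 3.75 TB of disk if 20% is kept free.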

X-Small - 1 TB/Day Ingestion

Component  Instances  vCPU               Memory             Storage     Total Storage
Humio      3          16                 64 GB              NVME 2 TB   6 TB
Kafka      3          4                  8 GB               SSD 500 GB  1.5 TB
Zookeeper  3          Shared with Kafka  Shared with Kafka  SSD 50 GB   150 GB

Small - 3 TB/Day Ingestion

Component  Instances  vCPU               Memory             Storage     Total Storage
Humio      3          32                 128 GB             NVME 6 TB   18 TB
Kafka      3          4                  8 GB               SSD 1 TB    3 TB
Zookeeper  3          Shared with Kafka  Shared with Kafka  SSD 50 GB   150 GB

Medium - 5 TB/Day Ingestion

Component  Instances  vCPU               Memory             Storage     Total Storage
Humio      6          32                 128 GB             NVME 6 TB   36 TB
Kafka      3          8                  16 GB              SSD 1 TB    3 TB
Zookeeper  3          Shared with Kafka  Shared with Kafka  SSD 50 GB   150 GB

Large - 10 TB/Day Ingestion

Component  Instances  vCPU               Memory             Storage     Total Storage
Humio      12         32                 128 GB             NVME 6 TB   72 TB
Kafka      6          8                  16 GB              SSD 1 TB    6 TB
Zookeeper  3          Shared with Kafka  Shared with Kafka  SSD 50 GB   150 GB

X-Large - 30 TB/Day Ingestion

Component  Instances  vCPU               Memory             Storage     Total Storage
Humio      30         32                 128 GB             NVME 7 TB   210 TB
Kafka      6          8                  16 GB              SSD 1.5 TB  13.5 TB
Zookeeper  3          Shared with Kafka  Shared with Kafka  SSD 50 GB   150 GB

AWS: EC2/EKS Workers

Assumptions:

  • Retention on NVME Varies due to fixed size, but is > 30 days
  • 20% Overhead left on NVME
  • 10x Compression
  • S3 Bucket storage used for longer retention
  • Amazon Managed Streaming for Apache Kafka (MSK) for Zookeeper/Kafka
  • Humio does not provide a self-hosted Kubernetes solution for Kafka and Zookeeper

Zookeeper/Kafka clusters are separate from Humio clusters to avoid resource contention and allow independent management.

AWS EKS

AWS Reference Architecture

X-Small - 1 TB/Day Ingestion

Component  Instances  EC2 Instance Type / vCPU  Memory  Storage      Total Storage
Humio      3          i3.2xlarge / 8            61 GB   NVME 1.9 TB  5.7 TB
Kafka      3          kafka.m5.xlarge / 4       16 GB   EBS 500 GB   1.5 TB
Zookeeper  MSK        MSK                       MSK     MSK          MSK

Small - 3 TB/Day Ingestion

Component  Instances  EC2 Instance Type / vCPU  Memory  Storage      Total Storage
Humio      3          i3.4xlarge / 16           122 GB  NVME 3.8 TB  11.4 TB
Kafka      3          kafka.m5.xlarge / 4       16 GB   EBS 500 GB   1.5 TB
Zookeeper  MSK        MSK                       MSK     MSK          MSK

Medium - 5 TB/Day Ingestion

Component  Instances  EC2 Instance Type / vCPU  Memory  Storage      Total Storage
Humio      6          i3.8xlarge / 32           244 GB  NVME 7.6 TB  45.6 TB
Kafka      3          kafka.m5.2xlarge / 8      16 GB   EBS 1.5 TB   4.5 TB
Zookeeper  MSK        MSK                       MSK     MSK          MSK

Large - 10 TB/Day Ingestion

Component  Instances  EC2 Instance Type / vCPU  Memory  Storage      Total Storage
Humio      12         i3.8xlarge / 32           244 GB  NVME 7.6 TB  91.2 TB
Kafka      6          kafka.m5.2xlarge / 8      16 GB   EBS 1.5 TB   9 TB
Zookeeper  MSK        MSK                       MSK     MSK          MSK

X-Large - 30 TB/Day Ingestion

Component  Instances  EC2 Instance Type / vCPU  Memory  Storage      Total Storage
Humio      30         i3.8xlarge / 32           244 GB  NVME 7.6 TB  228 TB
Kafka      9          kafka.m5.2xlarge / 8      16 GB   EBS 2 TB     18 TB
Zookeeper  MSK        MSK                       MSK     MSK          MSK
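The assumption that NVME retention on AWS stays above 30 days can be checked against the total storage listed for each AWS tier. The sketch below is a hedged illustration: `retention_days` is a hypothetical helper, it uses the 10x compression and 20% free-space assumptions from this section, and the `replicas` parameter (defaulting to a single copy of the data) is an assumption of this example rather than something the tiers specify.

```python
def retention_days(total_nvme_tb, ingest_tb_per_day,
                   compression=10, free_fraction=0.2, replicas=1):
    """Days of data that fit on NVME (hypothetical helper)."""
    usable_tb = total_nvme_tb * (1 - free_fraction)
    # Each day consumes ingest / compression per replica.
    return usable_tb * compression / (ingest_tb_per_day * replicas)


# (total NVME storage in TB, ingest in TB/day), copied from the AWS tiers.
aws_tiers = {
    "X-Small": (5.7, 1),
    "Small": (11.4, 3),
    "Medium": (45.6, 5),
    "Large": (91.2, 10),
    "X-Large": (228, 30),
}

for name, (total_tb, ingest) in aws_tiers.items():
    print(f"{name}: about {retention_days(total_tb, ingest):.1f} days")
```

With a single copy of the data, every tier comes out above 30 days; the Small tier is the tightest. Setting `replicas` to 2 halves each figure, so the result depends strongly on the replication you configure.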

GCP/GKE

Assumptions:

  • 30 Day Retention on NVME
  • 20% Overhead left on NVME
  • 10x Compression
  • GCS Bucket storage used for longer retention
  • Humio does not provide a self-hosted Kubernetes solution for Kafka and Zookeeper

Zookeeper/Kafka clusters are separate from Humio clusters to avoid resource contention and allow independent management.

X-Small - 1 TB/Day Ingestion

Component  Instances  Machine Type / vCPU  Memory             Storage                 Total Storage
Humio      3          n-standard-16 / 16   122 GB             NVME 3 TB (8x375GB)     9 TB
Kafka      3          n-standard-8 / 8     32 GB              PD-SSD 500 GB           1.5 TB
Zookeeper  3          Shared with Kafka    Shared with Kafka  PD-SSD 50 GB            150 GB

Small - 3 TB/Day Ingestion

Component  Instances  Machine Type / vCPU  Memory             Storage                 Total Storage
Humio      3          n-standard-16 / 16   122 GB             NVME 5 TB (14x375GB)    18 TB
Kafka      3          n-standard-8 / 8     32 GB              PD-SSD 500 GB           1.5 TB
Zookeeper  3          Shared with Kafka    Shared with Kafka  PD-SSD 50 GB            150 GB

Medium - 5 TB/Day Ingestion

Component  Instances  Machine Type / vCPU  Memory             Storage                 Total Storage
Humio      6          n-standard-32 / 32   128 GB             NVME 6 TB (16x375GB)    36 TB
Kafka      6          n-standard-8 / 8     32 GB              PD-SSD 1 TB             6 TB
Zookeeper  3          Shared with Kafka    Shared with Kafka  PD-SSD 50 GB            150 GB

Large - 10 TB/Day Ingestion

Component  Instances  Machine Type / vCPU  Memory             Storage                 Total Storage
Humio      12         n-standard-32 / 32   128 GB             NVME 6 TB (16x375GB)    72 TB
Kafka      6          n-standard-8 / 8     32 GB              PD-SSD 1 TB             6 TB
Zookeeper  3          Shared with Kafka    Shared with Kafka  PD-SSD 50 GB            150 GB

X-Large - 30 TB/Day Ingestion

Component  Instances  Machine Type / vCPU  Memory             Storage                 Total Storage
Humio      30         n-standard-64 / 64   256 GB             NVME 7.5 TB (20x375GB)  225 TB
Kafka      9          n-standard-8 / 8     32 GB              PD-SSD 1.5 TB           13.5 TB
Zookeeper  3          Shared with Kafka    Shared with Kafka  PD-SSD 50 GB            150 GB

Configuration Options

Please refer to the configuration reference page.