Digest nodes are responsible for executing real-time queries and compiling incoming events into segment files.
Whenever parsing is completed, new events are placed in a Kafka queue called the Digest Queue.
A cluster divides data into partitions (or buckets). We cannot know in advance which partition a given event will be placed in; partitions are chosen at random to spread the workload evenly across digest nodes.
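The random spreading of events across partitions can be sketched as follows. This is an illustrative model only, not Humio's actual implementation; the partition count and function name are assumptions:

```python
import random

PARTITION_COUNT = 24  # illustrative; a real cluster configures its own count


def choose_partition(event: dict) -> int:
    """Pick a partition at random so load spreads evenly across digest nodes."""
    return random.randrange(PARTITION_COUNT)


# Every event is equally likely to land in any partition.
partition = choose_partition({"message": "user logged in"})
assert 0 <= partition < PARTITION_COUNT
```

Because the choice is uniform, no partition (and therefore no digest node) is systematically favored over another.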
Each partition of that queue must have a node assigned to handle the work placed in it; otherwise the events in that partition would never be processed and saved.
Humio clusters have a set of rules that associate partitions in the Digest Queue with the nodes that are responsible for processing events on that queue. These are called Digest Rules.
You can see the current Digest Rules for your own cluster by going to the Cluster Management Page and selecting the Digest Rules tab on the right-hand side:
When a node is assigned to at least one digest partition, it is considered to be a Digest Node.
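Conceptually, the Digest Rules form a mapping from partition numbers to responsible nodes. A minimal sketch, assuming a hypothetical rules table (the partition and node IDs are made up for illustration):

```python
# Hypothetical digest-rule table: partition number -> responsible node ID.
digest_rules = {
    0: 1,
    1: 1,
    2: 3,
}


def digest_nodes(rules: dict) -> set:
    """A node assigned to at least one digest partition is a digest node."""
    return set(rules.values())


assert digest_nodes(digest_rules) == {1, 3}  # only nodes 1 and 3 take part in digest
```

Any cluster node absent from the rule table's values plays no role in the digest phase.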
Example Digest Rules
The table shows three digest rules. Node 1 will receive roughly 2x more work than Node 3. This is because Node 1 is assigned to two partitions, while Node 3 is only handling events on a single partition.
If a node is not assigned to a partition, it will not take part in the digest phase.
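The relative workload implied by a set of digest rules can be read off by counting how many partitions each node owns. A small sketch using the same hypothetical rules table as above (IDs are illustrative):

```python
from collections import Counter

# Hypothetical rules: node 1 owns two partitions, node 3 owns one.
digest_rules = {0: 1, 1: 1, 2: 3}

# Count partitions per node; since events are spread uniformly over
# partitions, this count is proportional to each node's digest workload.
partitions_per_node = Counter(digest_rules.values())

assert partitions_per_node[1] == 2 * partitions_per_node[3]
```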
When removing a digest node, you must first assign another cluster node to take over any digest partitions the node is assigned to. If you do not, there will be no node to process the incoming events in those partitions; work will pile up, and in the worst case data might be lost.
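The safety check described above can be sketched as: reassign a node's partitions, then verify nothing in the rule table still points at it. This is an illustrative model (the rule-table shape and function name are assumptions), not a Humio API:

```python
def safe_to_remove(node_id: int, rules: dict) -> bool:
    """True only if no digest partition still points at the node.

    Reassign its partitions first, or events arriving in those
    partitions would have no node to process them.
    """
    return node_id not in rules.values()


rules = {0: 1, 1: 1, 2: 3}
assert not safe_to_remove(1, rules)  # node 1 still owns partitions 0 and 1
rules[0] = rules[1] = 3              # hand its partitions over to node 3
assert safe_to_remove(1, rules)      # now node 1 can leave the cluster
```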
Humio is introducing High-Availability for ingest nodes, which will allow you to assign failover nodes in each digest rule. This means you will not have to be quite so careful, and digest nodes will be able to fail without impacting the system's ability to function.