Ingest Data into Humio

After establishing a Humio Cloud account or installing Humio on a server, you’ll want to put in place a system to feed data automatically into Humio. This is known as ingesting data, and there are a few ways to go about it.

In most cases you will want to use a data shipper or one of our platform integrations. If you are interested in getting some data into Humio quickly, take a look at the Ingesting Application Logs guide.

Humio is optimized for live streaming of events in real time. If you ship data that is not live, you need to observe some basic rules so that the resulting events are stored in Humio as efficiently as if they had been received live. See Backfilling.
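In practice, those rules amount to giving each event its original timestamp and shipping batches in roughly chronological order. The following is a minimal sketch in Python, assuming the structured ingest endpoint (part of the Ingest API covered below), a Humio Cloud URL, and a placeholder ingest token; consult the Backfilling documentation for the authoritative rules.

```python
# Minimal backfill sketch: ship historical events in chronological order,
# each carrying its original timestamp, via the structured ingest endpoint.
# Assumptions: the requests package is installed; URL and token are placeholders.
import requests

HUMIO_URL = "https://cloud.humio.com/api/v1/ingest/humio-structured"
INGEST_TOKEN = "YOUR-INGEST-TOKEN"  # hypothetical placeholder

historical_events = [
    # Already sorted oldest-to-newest so Humio can store them efficiently.
    {"timestamp": "2020-03-01T12:00:00+00:00",
     "attributes": {"level": "INFO", "msg": "service started"}},
    {"timestamp": "2020-03-01T12:00:05+00:00",
     "attributes": {"level": "WARN", "msg": "slow response"}},
]

response = requests.post(
    HUMIO_URL,
    headers={"Authorization": f"Bearer {INGEST_TOKEN}"},
    json=[{"tags": {"host": "server1"}, "events": historical_events}],
)
response.raise_for_status()
```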

Using the Ingest API

Figure 1, Ingest API Flow

If your needs are simple and you can tolerate potential data loss (due to network problems, for example), you can also use Humio’s Ingest API, either directly or through one of Humio’s client libraries.

As Figure 1 illustrates, this is by far the simplest flow, and it is entirely appropriate for some scenarios, such as analytics.
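To make that flow concrete, here is a minimal sketch in Python that posts a single raw log line to the unstructured ingest endpoint. The URL and token are placeholders, and a client library can do the same with less ceremony.

```python
# Minimal direct-ingest sketch: POST one raw log line to Humio.
# Assumptions: the requests package is installed; URL and token are placeholders.
import requests

HUMIO_URL = "https://cloud.humio.com/api/v1/ingest/humio-unstructured"
INGEST_TOKEN = "YOUR-INGEST-TOKEN"  # hypothetical placeholder

payload = [{
    "fields": {"host": "webhost1"},  # metadata attached to every message below
    "messages": [
        '192.168.0.10 - - [01/Feb/2021:13:48:26 +0000] "GET / HTTP/1.1" 200 512'
    ],
}]

response = requests.post(
    HUMIO_URL,
    headers={"Authorization": f"Bearer {INGEST_TOKEN}"},
    json=payload,
)
response.raise_for_status()  # no buffering or retries here, hence the data-loss caveat
```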

See the Ingest API reference page for more information. For a list of supported software, see the Software Libraries reference page.

Data Shippers

Figure 2, Ingest Process

A data shipper is a system tool that reads files and system properties on a server and sends the data to Humio. Data shippers take care of buffering, retransmitting lost messages, handling log file rolling, surviving network disconnects, and a slew of other things so that your data or logs are sent to Humio reliably.

In Figure 2, “Your Application” is writing logs to a log file. The data shipper reads the data and pre-processes it (for example, converting a multiline stack trace into a single event). It then ships the data to Humio using one of our ingest APIs.
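To make that pre-processing step concrete, here is an illustrative Python sketch of the multiline handling a shipper performs for you: lines that start with whitespace are treated as continuations of a stack trace and folded into the preceding event. Real shippers do this with configurable patterns; this is only a toy model of the idea.

```python
# Illustrative sketch of multiline pre-processing: fold indented
# continuation lines (e.g. stack-trace frames) into the preceding event.
def merge_multiline(lines):
    events = []
    for line in lines:
        # A line starting with whitespace continues the previous event.
        if events and line[:1] in (" ", "\t"):
            events[-1] += "\n" + line
        else:
            events.append(line)
    return events

raw = [
    "ERROR Unhandled exception",
    "  at com.example.Main.run(Main.java:42)",
    "  at com.example.Main.main(Main.java:12)",
    "INFO Request handled in 12ms",
]
# The three stack-trace lines become one event; the INFO line stays separate.
assert len(merge_multiline(raw)) == 2
```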

Data shipping can be done in a couple of ways. You can find a list of supported data shippers in the Data Shippers section of the documentation.

Platform Integrations

Figure 3, Platform Integration

Depending on your platform, the data flow will look slightly different from Figure 3. Some systems use a built-in logging subsystem; others have you run a container with a data shipper inside it. Usually you will tag containers or instances in some way to indicate which repository and parser should be used at ingestion.

If you want to get logs and metrics from your deployment platform, such as a Kubernetes cluster or your company’s PaaS, see the Provisioning and Containers installation sections.

Take a look at the individual integration pages for more details.