Query Functions

Query functions are specified using a name followed by parentheses containing their parameters. Parameters are supplied as named parameters:

sum(field=bytes_send) // calculate the sum of the field bytes_send

A function may have one unnamed parameter. The sum function accepts the field parameter as its unnamed parameter, so it can be written as sum(bytes_send).

A function is either a Transformation Function or an Aggregate Function.

Transformation Functions (also sometimes referred to as Filter Functions) can filter events and add, remove, and modify fields.

Aggregate Functions combine events into a new result, often a single number or row. For example, the count function returns one event with a single field _count.
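
The two kinds compose naturally in a pipeline: transformation functions prepare and filter events, and an aggregate function then combines them. A minimal sketch (the loglevel field and this particular combination are illustrative, not taken from a specific example below):

loglevel=ERROR | length(@rawstring, as=msgLength) | avg(msgLength) // average raw-string length of ERROR events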

Each query function is described below:

avg

( Aggregate Function )

Calculates the average for a field over a set of events. The result is returned in a field named _avg.

Parameters

Name Type Required Default Description
field string Yes field to extract a number from and calculate average over
as string No _avg name of output field

field    is the unnamed parameter

Examples

Example 1

Find the average bytes sent in http responses

avg(field=bytes_send)

bucket

( Aggregate Function )

Bucket is an extension of the groupby function for grouping by time. Look at timechart before using this function. This function divides the search time interval into buckets. Each event is put into a bucket based on its timestamp.
Events are grouped by their bucket, represented by a field named _bucket. Its value is the bucket's start time in milliseconds (UTC).
Bucket takes all the same parameters as groupby. The _bucket field is added to the fields being grouped by.

Parameters

Name Type Required Default Description
limit number No 20 Defines the maximum number of series to produce (defaults to 20). A warning is produced if this limit is exceeded, unless the parameter is specified explicitly.
timezone string No Defines the time zone used when computing bucket boundaries, for example UTC or '+02:00'.
span string No Defines the time span for each bucket. The time span is defined as a relative time modifier like 1hour or "3 weeks". If not provided the search time interval is divided into 127 buckets.
buckets number No Defines the number of buckets. The time span is defined by splitting the query time interval into this many buckets. 0..1500
field [string] No specifies which fields to group by. Note it is possible to group by multiple fields.
function [Aggregate] No count(as=_count) Specifies which aggregate functions to perform on each group. Default is to count(as=_count) the elements in each group

span    is the unnamed parameter

Examples

Example 1

Divides the search time interval into buckets. As the time span is not specified, the search interval is divided into 127 buckets. Events in each bucket are counted.

bucket(function=count())

Example 2

Count different http status codes over time. Bucket them into time intervals of 1 minute. Notice we group by two fields: status_code and the implicit _bucket

bucket(1min, field=status_code, function=count())

Example 3

Show response time percentiles over time. Calculate percentiles per minute (bucket time into 1 minute intervals)

bucket(span=60sec, function=percentile(field=responsetime, percentiles=[50, 75, 99, 99.9]))
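
Example 4

A sketch using the buckets parameter (the responsetime field is hypothetical): split the query interval into 24 buckets and compute the average response time in each bucket.

bucket(buckets=24, function=avg(responsetime))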

cidr

( Filter Function )

Filter events using CIDR subnets

Parameters

Name Type Required Default Description
subnet [string] Yes Specifies a CIDR subnet to filter on.
field string Yes Specifies the field to run the CIDR expression against.
negate bool No false Only let addresses not in the given subnet pass through. (Also lets events without the given field pass through.)

field    is the unnamed parameter

Examples

Example 1

Match events for which the 'ipAddress' attribute is in the IP range 77.243.48.0/20

cidr(ipAddress, subnet="77.243.48.0/20")

Example 2

Match events for which the 'ipAddress' attribute is in the IP range 77.243.48.0/20 or 192.168.0.0/16

cidr(ipAddress, subnet=["77.243.48.0/20", "192.168.0.0/16"])
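
Example 3

A sketch using the negate parameter (the subnet value is illustrative): keep only events whose ipAddress is outside the 10.0.0.0/8 range.

cidr(ipAddress, subnet="10.0.0.0/8", negate=true)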

collect

( Aggregate Function )

Collect fields from multiple events into one event.

Parameters

Name Type Required Default Description
multival bool No true Collect resulting value as multivalue
fields [string] No The names of the fields to keep.
maxlen number No 2000 the maximum length of the generated @rawstring.

fields    is the unnamed parameter

Examples

Example 1

Collect the URLs visited per client session (each visitor defined as non-active after 1 minute)

groupby(client_ip, function=session(maxpause=1m, collect([url])))
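
Example 2

A minimal sketch (the userID and url field names are hypothetical): collect the values of two fields from all matching events into a single event.

collect([userID, url])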

concat

( Filter Function )

Concatenates the values of a list of fields into a value in a new field

Parameters

Name Type Required Default Description
field [string] Yes Fields to concatenate
as string No _concat The output name of the field to set (defaults to _concat)

field    is the unnamed parameter

Examples

Example 1

Concatenates the values of the fields f1 and f2 from each event into the field combined.

concat([f1, f2], as="combined")

count

( Aggregate Function )

Counts events streaming through the function. The result is put in a field named _count. It is possible to specify a field, and then only events containing that field are counted. It is also possible to do a distinct count. When there are many distinct values, Humio will not try to keep them all in memory; estimation is used instead, so the result will not be a precise count.

Parameters

Name Type Required Default Description
field string No Only events with this field are counted
distinct bool No Count distinct values. When there are many distinct values, Humio uses estimation and the result is not exact.
as string No _count name of output field

field    is the unnamed parameter

Examples

Example 1

Count the number of events in the search time period

count()

Example 2

Count the number of requests for each http method (GET, PUT, POST, etc.)

groupby(field=http_method, function=count())

Example 3

Count the number of requests over time

timechart(function=count())

Example 4

Count all events containing the field statuscode

count(field=statuscode)

Example 5

Count unique values. Count the number of different statuscodes

count(statuscode, distinct=true)

counterAsRate

( Aggregate Function )

Calculates the rate for a counter field. This function can show the rate at which a counter changes. The result is returned in a field named _rate.
counterAsRate is often expected to be used as the function in a timechart (or groupby). NOTE: this function requires at least two data points to calculate a rate. When used in a timechart, it is important to have at least two points in each bucket to get a rate.
This function expects the field to have monotonically increasing values over time. If this is not the case, no result is returned. Counters are often reset at server restarts or deployments; computing the rate across a reset would not return a result. Using counterAsRate in a timechart returns a rate for each bucket where the counter was not reset and nothing for the buckets where it was.

Parameters

Name Type Required Default Description
field string Yes field to extract a number from and calculate rate over
as string No _rate name of output field

field    is the unnamed parameter

Examples

Example 1

Show the rate of a counter over time

timechart(function=counterasrate(counter))
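
Example 2

A sketch of counterAsRate inside a groupby (the host and counter field names are hypothetical): show the rate of a counter per host.

groupBy(host, function=counterAsRate(counter))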

default

( Filter Function )

Creates a field named by the field parameter and assigns it the given value. If the field already exists on an event, the field keeps its existing value.

Parameters

Name Type Required Default Description
value string Yes The default value that will be assigned to field if not already set.
replaceEmpty bool No false If the field’s value is the empty string, override the value with the default.
field string Yes The field to set the default value for.

value    is the unnamed parameter

Examples

Example 1

Set the default value of the field `minutes` to `0` so it can be used in a calculation. If we did not do this, the event would be discarded during the eval step because eval requires all used fields to be present.

default(field=minutes, value=0)

Example 2

You can use the field operator `=~` and the default parameter (`value`) to write it like:

minutes =~ default(0)

Example 3

By default empty values are kept as the field does indeed exist when it has the empty value. You can set `replaceEmpty=true` to replace empty values with the default as well.

default(field=message, value="N/A", replaceEmpty=true)

drop

( Filter Function )

This is a filter that lets you remove attributes/columns from a result set.

Parameters

Name Type Required Default Description
fields [string] Yes The names of the fields to discard.

fields    is the unnamed parameter

Examples

Example 1

drop the single field `header`

drop(header)

Example 2

drop multiple fields

drop([header,value])

eval

( Filter Function )

Creates a new field by evaluating the provided expression.
The eval expression must always start with an assignment (field=expression); the result is stored in a field with that name.
In the expression it is possible to use names of fields, strings, and numbers.
The available operators are == and !=, as well as +, -, *, and /, plus parenthesized expressions.
In the context of an eval expression (unlike filters), identifiers always denote field values, so e.g. eval( is_warning = (loglevel==WARN) ) is most likely wrong; you want to write (loglevel=="WARN").
The order of evaluation of arguments is left to right.

Parameters

Takes no parameters

Examples

Example 1

Get response size in KB

eval(responsesize = responsesize / 1024)

Example 2

Add fields together

eval(c = a + b)

Example 3

Scale a count to the timespan: the count should be per minute, not per 5 minutes as the bucket span is.

timechart(method, span=5min) | eval(_count=_count/5)
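
Example 4

A minimal sketch of a boolean comparison, as described above (the loglevel field is illustrative): set is_warning for events whose loglevel is WARN.

eval(is_warning = (loglevel == "WARN"))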

format

( Filter Function )

Format a string using a printf-style format string.
The formatted string is put in a new field named by the as parameter (default is _format). The fields used as input parameters to the formatting are named using the field parameter, which can be an array.
This function is backed by Java’s Formatter class.
For detailed documentation follow the link.
At the moment, fields can only be used as datetime values if they are in ISO 8601 format or if they are milliseconds since the beginning of the epoch (1 January 1970 00:00:00 UTC).

Parameters

Name Type Required Default Description
format string Yes The formatter string. See the Java documentation
field [string] Yes Fields to insert into the formatter string. These are the names of fields on events (not actual values).
as string No _format The output name of the field with the formatted string
timezone string No When formatting dates and times it is possible to specify a timezone. Examples: Europe/Copenhagen, UTC, America/New_York, +01

format    is the unnamed parameter

Examples

Example 1

format a float to have 2 decimals

format("%.2f", field=price, as=price) | table(price)

Example 2

Concatenate 2 fields with a comma as separator

format(format="%s,%s", field=[a, b], as="combined") | table(combined)

Example 3

Get the hour of day out of the event's @timestamp

format("%tH", field=@timestamp, as=hour) | table(hour)

formatTime

( Filter Function )

Formats a timestamp into a string according to strftime, similar to unix strftime. See the Java docs for valid date/time escapes.

Parameters

Name Type Required Default Description
format string Yes Format string. See the Java docs for valid date/time escapes.
field string No @timestamp The field from which to get the time to format
as string Yes Specifies the output field
locale string No Specifies the locale such as US or en_GB
timezone string No Specifies the timezone such as GMT or EST

format    is the unnamed parameter

Examples

Example 1

format time

time := formatTime("%Y/%m/%d %H:%M:%S", field=@timestamp, locale=en_US, timezone=Z)

geohash

( Filter Function )

Calculates a geohash value given two fields representing latitude and longitude.

Parameters

Name Type Required Default Description
lat string No ip.lat The field to use for latitude.
lon string No ip.lon The field to use for longitude.
precision number No 12 The precision to use in the calculation. Usually 12 is enough.
as string No _geohash The name of the field that is produced by the function.

There is no unnamed parameter

Examples

Example 1

Calculate the geohash value of a set of coordinates.

geohash(lat=myLatField, lon=myLonField)
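
Example 2

A sketch using the precision parameter (same hypothetical field names as Example 1): calculate a coarser geohash with 6 characters of precision.

geohash(lat=myLatField, lon=myLonField, precision=6)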

groupBy

( Aggregate Function )

Like GROUP BY in SQL. Groups events by the specified fields and executes aggregate functions on each group.
Returns events containing the fields specified in the field parameter and the fields returned by each aggregate function (for example a _count field if function=count()).
When showing time series data, the timechart and bucket functions are extensions of groupBy that group by time. Look at timechart and bucket.
groupBy limits the number of groups to what is configured in MAX_STATE_LIMIT (default 20000). After that, no more groups are created; in that case, see if the top function is a better match. It is on the roadmap to remove this limitation. At the moment groupBy is implemented to work entirely in memory and cannot spill to disk.

Parameters

Name Type Required Default Description
field [string] Yes specifies which fields to group by. Note it is possible to group by multiple fields.
function [Aggregate] No count(as=_count) Specifies which aggregate functions to perform on each group. Default is to count(as=_count) the elements in each group
limit string No Limit for the number of group elements [0..∞]. Defaults to the value of the configuration parameter MAX_STATE_LIMIT, which by default is 20000.

field    is the unnamed parameter

Examples

Example 1

Count different http status codes

groupBy(field=status_code, function=count())

Example 2

group by http method and http statuscode and count the events in each group

groupBy(field=[method, status_code], function=count())

Example 3

Find the maximum response time for each device while also counting number of requests for each device

groupBy(field=device, function=[max(responsetime, as=time), count()]) | sort(time)

holtwinters

( Filter Function )

Used to generate a trendline for a periodic dataset. The key parameter period specifies the length of the period to be applied. This is used after a timechart, as in timechart(...) | holtwinters(1week). The function implements triple exponential smoothing and adds an extra graph for every series in the chart. The computation assumes a periodic data set (the period parameter specifies the length of the period), such as daily or weekly traffic measurements. The function works best when the period is a multiple of the timechart's span (the width of each bucket), as the smoothing is done in units of buckets.

For example, a 30 day chart with a 7 day period would have 4 periods from day 2 to 30 (the first 2 days being ignored). Based on this, the first 3 periods (in this case period 1, 2 and 3) are fed into the smoothing/forecast computation, producing a “forecast” for the fourth period. Smoothing data for period 2, 3 and 4 are emitted, the last period being entirely computed.

Parameters

Name Type Required Default Description
period string Yes Defines the trend’s period, such as 1week or 1day
alpha number No 0.5 data smoothing factor in the interval [0..1]
beta number No 0 trend smoothing factor in the interval [0..1]
gamma number No 0 seasonal change smoothing in the interval [0..1]

period    is the unnamed parameter

Examples

Example 1

Show an event count graph with daily aggregates and compute the weekly trend.

timechart(function=count(), span=1d) | holtwinters(period=7d)

in

( Filter Function )

Filters events where the given field matches any of the given values.

Parameters

Name Type Required Default Description
field string Yes The field to filter records by.
values [string] Yes The values to match the field against. An event whose field matches any of these values passes through. Values can contain * wildcards.

field    is the unnamed parameter

Examples

Example 1

Filter by Log Level

loglevel =~ in(values=["ERROR", "WARN"])
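
Example 2

A sketch using wildcards in values (the statuscode field is illustrative): match events whose statuscode starts with 4 or 5.

statuscode =~ in(values=["4*", "5*"])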

ipLocation

( Filter Function )

Determine country, city, and longitude/latitude for an IP address (IPv4 or IPv6). The attributes ip.country, ip.city, ip.lon and ip.lat are added to the event.

In order to use your own GeoLite2 database, place a copy of GeoLite2-City.mmdb in the humio-data directory.

This product includes GeoLite2 data created by MaxMind, available from https://www.maxmind.com

Parameters

Name Type Required Default Description
field string No ip The field from which to get the IP address
as string No The prefix to add to the fields created by the ipLocation function. Defaults to the name of the field from which the IP address was read.

field    is the unnamed parameter

Examples

Example 1

Based on the field ip, the attributes ip.country, ip.city, ip.lon and ip.lat are added to the event.

ipLocation()

Example 2

Based on the field address, the attributes address.country, address.city, address.lon and address.lat are added to the event.

ipLocation(field=address)

Example 3

Based on the field ip, the attributes address.country, address.city, address.lon and address.lat are added to the event.

ipLocation(as=address)

kvParse

( Filter Function )

Key-value parse events. This function can run an extra key-value parser on events. It is used to parse key-values of the form:

  • key=value
  • key="value"
  • key='value'

So for a log line like this:

2017-02-22T13:14:01.917+0000 [main thread] INFO UserService - creating new user id=123, name='john doe' email=john@doe

The key-value parser extracts the fields:

  • id: 123
  • name: john doe
  • email: john@doe

Use the field parameter to specify which fields should be key-value parsed. Specify @rawstring to key-value parse the rawstring

Parameters

Name Type Required Default Description
field [string] No @rawstring Fields that should be key-value parsed
as string No Prefix for all resolved field keys
override bool No false Override existing values for keys that already exist in the event.

field    is the unnamed parameter

Examples

Example 1

Key-value parse the log line: `creating new user id=123, name='john doe' email=john@doe`. This will add the fields `id=123`, `name=john doe`, and `email=john@doe` to the event.

kvParse()

Example 2

Key-value parse the log line: `creating new user id=123, name='john doe' email=john@doe loglevel=ERROR`. Assuming the event already has a `loglevel` field, replacing the value of that field with `ERROR` requires the `override=true` parameter.

kvParse(override=true)

Example 3

Key-value parse a nested field. In this example we use JSON input: `{"service": "paymentService", "type": "payment", "metadata":"host=server5,transactionID=123,processingTime=100"}` and parse out the key-values in the metadata field

parseJSON() | kvParse(metadata)

Example 4

Key-value parse the log line and export fields with a prefix: `creating new user id=123, name='john doe' email=john@doe`. This will add the fields `user.id=123`, `user.name=john doe`, and `user.email=john@doe` to the event.

kvParse(as="user")

length

( Filter Function )

Returns the number of characters in a string field.

Parameters

Name Type Required Default Description
field string Yes The name of the input field whose length to measure.
as string No _length name of output field

field    is the unnamed parameter

Examples

Example 1

The number of characters in the @rawstring field

length(@rawstring)

Example 2

The number of characters in the @rawstring field, putting it in the field rawLength

length(@rawstring, as="rawLength")

lookup

( Filter Function )

Enrich events with metadata. Data can be managed in the Files tab in the UI or be uploaded as CSV or JSON files using the files endpoint.

Parameters

Name Type Required Default Description
from string Yes Specifies the source file.
include [string] No Specifies fields to include, defaults to all
on [string] Yes Specifies which field(s) to join on. If passed as an array, the first is the column in the file and the second is the field on the event.

from    is the unnamed parameter

Examples

Example 1

lookup rows from table "users.csv" where the 'userid' column equals the 'id' field of this event

lookup("users.csv", on=[userid,id])

lowercase

( Filter Function )

Lowercases the contents of a string field, replacing the value in the input fields.

Parameters

Name Type Required Default Description
field [string] Yes The name of the input field or fields (in []) to lowercase.
locale string No The name of the locale to use, as ISO 639 language and an optional ISO 3166 country, such as ‘da’,‘da_DK’ or ‘en_US’. When not specified, uses the system locale
include string No values What to lowercase. One of (‘values’, ‘fields’, ‘both’). Defaults to ‘values’. If set to ‘fields’ or ‘both’, then the ‘field’ names are matched against event fields non-case-sensitive, the matching fields names are lower-cased, and the resulting key-value pair added to the event, with the value being lower-cased only for ‘both’.

field    is the unnamed parameter

Examples

Example 1

With an event with a field "Bar=CONTENTS", you get 'contents' in the 'Bar' field

lowercase("Bar")

Example 2

With an event with a field "BAR=CONTENTS", you get 'CONTENTS' in the 'bar' field, while BAR is still CONTENTS.

lowercase("BaR", include="field")

Example 3

With an event with a field "BAR=CONTENTS", you get 'contents' in the 'bar' field, while BAR is still CONTENTS.

lowercase(field=["foo","bar"], include="both")

match

( Filter Function )

Search using a CSV or JSON file and enrich entries.

To use this function, you need to upload a CSV or JSON file using the lookup api.

You can use it to do something like field IN values, where the values are all the values in a given column of the CSV file you specify.

The default behavior (when strict=true) is that this function works like an 'INNER JOIN'. With strict=false, this function works like the deprecated lookup function, i.e., it just enriches events that match, but lets all events pass through even if they don't match.

Parameters

Name Type Required Default Description
file string Yes Specifies the source file.
strict bool No true If true (default) only yield events that match a key in the file; if false let all events through (works like the function ‘lookup’).
include [string] No Specifies columns to include. If no argument is given, all columns from the corresponding row are included in the output event.
column string No specifies which column in the file to use for the match. Defaults to the value of the ‘field’ parameter.
field string Yes specifies which field in the event (log line) that must match the given column value
glob bool No false If true, the key column in the underlying file is interpreted as a globbing pattern with *.

file    is the unnamed parameter

Examples

Example 1

matches events for which the 'id' field matches the value of 'userid' in the table "users.csv".

match(file="users.csv", column=userid, field=id, include=[])

Example 2

matches events for which the 'id' field matches the value of 'userid' in the table "users.csv", and add all other columns of the matching row to those events.

id =~ match(file="users.csv", column=userid)
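
Example 3

A sketch using strict=false (same file and fields as the examples above): enrich events whose 'id' matches a 'userid' in "users.csv", but let non-matching events pass through unchanged.

match(file="users.csv", column=userid, field=id, strict=false)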

max

( Aggregate Function )

Finds the biggest number for the specified field over a set of events. Result is returned in a field named _max

Parameters

Name Type Required Default Description
field string Yes field to extract a number from
as string No _max name of output field

field    is the unnamed parameter

Examples

Example 1

What was the maximum responsetime

max(responsetime)

min

( Aggregate Function )

Finds the smallest number for the specified field over a set of events. Result is returned in a field named _min

Parameters

Name Type Required Default Description
field string Yes field to extract a number from
as string No _min name of output field

field    is the unnamed parameter

Examples

Example 1

What was the minimum responsetime

min(responsetime)

moment

( Aggregate Function )

EXPERIMENTAL. Calculates percentiles over numbers.
Returns one event with a field for each of the percentiles specified in the percentiles parameter. Fields are named by prefixing the values specified in the percentiles parameter with _. For example, the event could contain the fields _50, _75 and _99. Input can be a number OR the special format for the moment sketch "[min,max,Σv⁰,Σv¹,Σv²,...]" used for client-side computed sketches.

Parameters

Name Type Required Default Description
field string Yes Specifies the field for which to calculate percentiles. The field must contain numbers
k number No 5 Specifies the order of the underlying polynomial.
percentiles [number] No [50, 75, 99] Specifies which percentiles to calculate. An example is [50, 75, 99]
as string No prefix of output fields

field    is the unnamed parameter

Examples

Example 1

Calculate 50 75 99 and 99.9 percentiles for events with the field responsetime.

moment(field=responsetime, percentiles=[50, 75, 99, 99.9])

Example 2

In a timechart, calculate percentiles for both of the fields r1 and r2.

timechart(function=[moment(field=r1,as=r1),moment(field=r2,as=r2)])

now

( Filter Function )

Assigns the current time as milliseconds since 1970.

When used in a non-live query, the value of now is fixed to when the query was issued. For live queries, the value of now is the live value of the current system time, which can differ between Humio nodes.

Parameters

Name Type Required Default Description
as string No _now name of output field

as    is the unnamed parameter

Examples

Example 1

assign `curr` the value of 'now'

curr := now()

Example 2

use now() in an assignment

isOld := (now()-@timestamp) > 1000

parseHexString

( Filter Function )

parseHexString parses the input from hex-encoded bytes, decoding the resulting bytes as a string using the selected character set. If the input field has a prefix (other than "0x" and "16#"), use regex or replace to remove the prefix before using parseHexString. Any non-hex characters in the input are ignored; the decoding attempts to decode all characters in the field that match 0-9 or A-F. Case is ignored.

Parameters

Name Type Required Default Description
field string Yes Specifies the field containing the hex string to use as input
charset string No UTF-8 The charset to use when transforming bytes to string. Valid charsets are: (‘ISO-8859-1’, ‘UTF-8’)
as string No _parsehexstring Name of output field

field    is the unnamed parameter

Examples

Example 1

Parses the string '48656c6c6f576f726c64' from field=foo into the field text, getting the value 'HelloWorld'

parseHexString(foo, as=text, charset="ISO-8859-1")

Example 2

Parses the string '0x4 865 6c6c6f576f726c6420 plus F 0 9 F 9 8 8 0' from field=hex into the field text, using UTF-8 and getting the value 'HelloWorld 😀' where the smiley is the result of decoding the trailing digits.

hex := "0x4 865 6c6c6f576f726c6420 plus F 0 9 F 9 8 8 0" | text := parseHexString(hex)

parseInt

( Filter Function )

Converts an integer from any radix (or base), such as hexadecimal or octal, to base 10, the decimal radix expected as input by all other functions. E.g. converting "FF" to "255" using radix=16, or "77" to "63" using radix=8. The conversion is always unsigned.
If the input field has a prefix (other than "0x" and "16#"), use regex to remove the prefix before using parseInt.

Parameters

Name Type Required Default Description
field string Yes The name of the input field.
as string No The output name of the field to set (defaults to the same as the input field)
radix number No 16 Input Integer base (2 to 36).
endian string No big Input Digit-pair ordering (little, big) for hexadecimal.

field    is the unnamed parameter

Examples

Example 1

Shows how to parse a hexadecimal string in little endian as an integer. An input event with the field "hexval" with the value "8001" results in the field "centigrades" having the value (1*256)+128=384.

parseInt(hexval, as="centigrades", radix="16", endian="little")

Example 2

Shows how to parse a hexadecimal string in big endian as an integer. An input event with the field "hexval" with the value "8001" results in the field "centigrades" having the value (128*256)+1=32769.

parseInt(hexval, as="centigrades", radix="16", endian="big")

Example 3

Shows how to parse a binary string as an integer. An input event with the field "bitval" with the value "00011001" results in the field "flags" having the value 16+8+1=25.

parseInt(bitval, as="flags", radix="2")

parseJson

( Filter Function )

Parse data as JSON. The specified fields will be parsed as JSON. Specify field=@rawstring to parse the rawstring into JSON. It is possible to prefix the names of the extracted fields using the prefix parameter

Parameters

Name Type Required Default Description
field [string] No @rawstring Fields that should be parsed as JSON
prefix string No Prefix the name of the extracted JSON fields with the value of this parameter

field    is the unnamed parameter

Examples

Example 1

If the whole event sent to Humio is JSON like `{"service": "userService", "timestamp": "2017-12-18T20:39:35Z", "msg": "user with id=47 logged in"}`

parseJson() | parseTimestamp(field=timestamp)

Example 2

If a field in the incoming event contains JSON, like `2017-12-18T20:39:35Z user id=47 logged in details="{"name": "Peter", "email": "peter@test.com", "id":47}"`. In the example below the details field is extracted using the `kvParse` function and then `parseJson` is used to parse the JSON inside the details field.

/(?<timestamp>\S+)/ | parseTimestamp(field=timestamp) | kvParse() | parseJson(field=details)

Example 3

It is possible to prefix the names of the extracted JSON fields. This can be useful for avoiding collisions with existing fields with the same name. For example the input line `added new user details="{"email": "foo@test.com", "name": "Peter"}"` could be parsed into these fields: `user.email=foo@test.com`, `user.name=Peter`.

kvParse() | parseJson(field=details, prefix="user.")

parseTimestamp

( Filter Function )

Parse a string into a timestamp.

This function is important for creating parsers, as it is used to parse the timestamp for an incoming event.

Before parsing the timestamp, the part of the log containing the timestamp should be captured into a field.
This is done using functions like regex() and parseJson() before parseTimestamp.

The format string is specified using Java's DateTimeFormatter. Humio also supports specifying the following in the format string:

  • unixtimeMillis UTC time since 1970 in milliseconds
  • unixtime UTC time since 1970 in seconds

If the timestamp is parsed, a field '@timestamp' is created containing the parsed timestamp in UTC milliseconds, along with a @timezone field containing the original timezone.

It is possible to parse time formats leaving out the year designator, as is sometimes seen in time formats from Syslog. For example, Mar 15 07:48:13 can be parsed using the format MMM d HH:mm:ss. In this case Humio will guess the year.

Parameters

Name Type Required Default Description
format string No Pattern used to parse the timestamp. Default value will parse an ISO 8601 date using the format string yyyy-MM-dd'T'HH:mm:ss[.SSS]XXX. The format string is specified using Javas DateTimeFormatter. The following formats are also possible: millis or unixTimeMillis - is the epoch time in millis (UTC). seconds or unixTimeSeconds - is the epoch time in seconds (UTC).
field string Yes The field holding the timestamp to be parsed
timezone string No If the timestamp does not contain a timezone, it can be specified using this parameter. A list of the available timezones can be found here. Examples are Europe/London, America/New_York and UTC
as string No @timestamp Name of output field that will contain the parsed timestamp. The timestamp is represented as milliseconds since 1970 in UTC. Humio expects to find the timestamp in the field @timestamp, so do not change this when creating parsers.
timezoneAs string No @timezone Name of output field that will contain the parsed timezone. Humio expects to find the timezone in the field @timezone, so do not change this when creating parsers
addErrors bool No true Add errors to the event if it was not possible to parse a timestamp

format    is the unnamed parameter

Examples

Example 1

Events having a timestamp in ISO8601 format can be parsed using the default format. An example is a timestamp like `2017-12-18T20:39:35Z server is starting. binding port=8080`

/(?<timestamp>\S+)/ | parseTimestamp(field=timestamp)

Example 2

Parse timestamps in an accesslog with timestamps like `192.168.1.19 [02/Apr/2014:16:29:32 +0200] GET /hello/test/123 ...`

/(?<client>\S+) \[(?<timestamp>.+)\] (?<method>\S+) (?<url>\S+)/ | parseTimestamp("dd/MMM/yyyy:HH:mm:ss Z", field=timestamp)

Example 3

Parse a timestamp without a timezone like `2015-12-18T20:39:35`

parseTimestamp("yyyy-MM-dd'T'HH:mm:ss", field=timestamp, timezone="America/New_York")

Example 4

Parse an event with a timestamp not containing year like `Feb 9 12:22:44 hello world`

/(?<timestamp>\S+\s+\S+\s+\S+)/ | parseTimestamp("MMM [ ]d HH:mm:ss", field=timestamp, timezone="Europe/London")

parseUrl

( Filter Function )

Extracts URL components from a field. The attributes url.scheme, url.username, url.password, url.host, url.port, url.path, url.fragment and url.query are added to the event

Parameters

Name Type Required Default Description
field string No url The field from which to parse URL components.
as string No Use a prefix for the attributes added to the event.

field    is the unnamed parameter

Examples

Example 1

Parses the field named url and adds URL components to the event.

parseUrl()

Example 2

Parses the field named endpoint and adds URL components to the event.

parseUrl(field=endpoint)

Example 3

Parses the field named endpoint and adds URL components to the event with url as a prefix.

url := parseUrl(field=endpoint)

Example 4

Parses the field named endpoint and adds URL components to the event with apiEndpoint as a prefix.

parseUrl(field=endpoint, as=apiEndpoint)

percentile

( Aggregate Function )

Calculates percentiles over numbers.
Returns one event with a field for each of the percentiles specified in the percentiles parameter. Fields are named by prefixing the values specified in the percentiles parameter with _. For example, the event could contain the fields _50, _75 and _99.

Parameters

Name Type Required Default Description
field string Yes Specifies the field for which to calculate percentiles. The field must contain numbers
percentiles [number] No [50, 75, 99] Specifies which percentiles to calculate. An example is [50, 75, 99]
as string No prefix of output fields

field    is the unnamed parameter

Examples

Example 1

Calculate 50 75 99 and 99.9 percentiles for events with the field responsetime.

percentile(field=responsetime, percentiles=[50, 75, 99, 99.9])

Example 2

In a timechart, calculate percentiles for both of the fields r1 and r2.

timechart(function=[percentile(field=r1,as=r1),percentile(field=r2,as=r2)])

range

( Aggregate Function )

Finds the numeric range between the smallest and largest numbers for the specified field over a set of events. The result is returned in a field named _range

Parameters

Name Type Required Default Description
field string Yes field to extract a number from
as string No _range name of output field

field    is the unnamed parameter

Examples

Example 1

What was the range in responsetime

range(responsetime)

rdns

( Aggregate Function )

Resolves hostnames for events using RDNS lookup.

Parameters

Name Type Required Default Description
server string No Specifies a DNS server address.
field string Yes Specifies the field to run the RDNS lookup against.
as string No hostname Specifies the field into which the resolved value is stored.

field    is the unnamed parameter

Examples

Example 1

Resolve ipAddress (if present) using the server 8.8.8.8, and store the resulting DNS name in 'dnsName'

rdns(ipAddress, server="8.8.8.8", as=dnsName)

Example 2

Resolve ipAddress (if present) and store the resulting DNS name in 'hostname'

rdns(ipAddress)

regex

( Filter Function )

Extract new fields using a regular expression. The regular expression can contain one or more named capturing groups; fields with the names of the groups will be added to the events. Using " in an already quoted string requires escaping, which is sometimes necessary when writing regular expressions. See Example 3.
Humio uses Java regular expressions

Parameters

Name Type Required Default Description
regex string Yes Specifies a regular expression. The regular expression can contain one or more named capturing groups. Fields with the names of the groups will be added to the events.
field string No @rawstring Specifies the field to run the regular expression against. Default is running against @rawstring
strict bool No true specifies if events not matching the regular expression should be filtered out of the result set. Strict is the default
flags string No m specifies other regex flags “m” is multi-line, “i” is ignore_case, “d” is dotall i.e., dot includes newline
repeat bool No false If set to true, multiple matches yields multiple events

regex    is the unnamed parameter

Examples

Example 1

Extract the domain name of the http referrer field. Often this field contains a full URL, so we can have many different URLs from the same site; in this case we want to count all referrals from the same domain. This adds a field named refdomain to events matching the regular expression.

regex("https?://(www.)?(?<refdomain>.+?)(/|$)", field=referrer) | groupby(refdomain, function=count()) | sort(field=_count, type=number, reverse=true)

Example 2

Extract the userid from the url field. The new value is stored in a field named userid

regex(regex=".*/user/(?<userid>\S+)/pay", field=url)

Example 3

Shows how to escape " in the regular expression. This is necessary because the regular expression is itself in quotes. Extract the user and message from events like: 'Peter: "hello"' and 'Bob: "good morning"'

regex("(?<name>\S+): \"(?<msg>\S+)\"")

rename

( Filter Function )

Renames a field.

Parameters

Name Type Required Default Description
field string Yes The name of the field to rename
as string Yes The new name of the field

field    is the unnamed parameter

Examples

Example 1

rename 'badName' to 'goodName'

 rename(field=badName, as=goodName) 

Example 2

rename 'badName' to 'goodName', using assignment syntax

goodName := rename(badName) 

replace

( Filter Function )

Replaces each substring of the specified field's value that matches the given regular expression with the given replacement.
Humio uses Java regular expressions

Parameters

Name Type Required Default Description
regex string Yes The regular expression to match
with string No The string to substitute for each match (defaults to “”)
replacement string No The string to substitute for each match (same as with)
field string No @rawstring Specifies the field to run the replacement on. Default is running against @rawstring

regex    is the unnamed parameter

Examples

Example 1

Correct a spelling mistake

replace(regex=rpoperties, with=properties)

Example 2

Get the integer part of a number. This example uses regex capturing groups

replace("(\d+)\..*", with="$1", field=a)

round

( Filter Function )

Rounds an input field using how=round (the default), how=floor, or how=ceil.

Parameters

Name Type Required Default Description
field string Yes The name of the field to round.
as string No The output name of the field to round (defaults to the same as the input field)
how string No round how to round (round, ceil, floor).

field    is the unnamed parameter

Examples

Example 1

Round a number

round(myvalue)

Example 2

Round down (floor) the maximum value in each time bucket

timechart(function=max(value)) | round(_max, how=floor)

sample

( Filter Function )

Samples the event stream. Events that do not have the field being sampled are discarded

Parameters

Name Type Required Default Description
field string No @timestamp The name of the field to use for sampling events.
percentage number No 1 Keep this percentage of the events.

percentage    is the unnamed parameter

Examples

Example 1

Sample events keeping only 2% of the events

sample(percentage=2)

Example 2

Sample events keeping only 0.1% of the events to allow groupby to find the most common hosts without hitting the groupby-limit

sample(percentage=0.1) | groupby(host) | sort()

sankey

( Aggregate Function )

A companion function to the Sankey widget; produces data compatible with that widget.

Parameters

Name Type Required Default Description
source string Yes The field containing the source node ID.
target string Yes The field containing the target node ID.
weight Aggregate No count(as=_count) A function used to calculate the weight of the edges. Good candidates are functions like sum, count or max.

There is no unnamed parameter

Examples
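
Example 1

A minimal sketch (the src_host and dst_host field names are hypothetical): build Sankey data with edges from source host to destination host, weighted by event count (the default).

sankey(source=src_host, target=dst_host)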

select

( Filter Function )

Specify a set of fields to select from each event. You most likely want to use the table function instead. Table is an aggregate function that can also sort events while limiting the number of events.
A use-case for select is when you want to export a few fields from a large number of events into e.g. a CSV file. When viewed in the UI, you get the latest 200 events, but when exporting the result, you get all matching events.

Parameters

Name Type Required Default Description
fields [string] Yes The names of the fields to keep.

fields    is the unnamed parameter

Examples

Example 1

Look at HTTP GET methods and create an unsorted table with the fields statuscode and responsetime

method=GET | select([statuscode, responsetime])

Example 2

Get a table of timestamp and rawstring for all events in range. In the Humio UI this will get limited to 200 entries, but exporting the result as e.g. CSV will export all matching events in the time window searched.

select([@timestamp, @rawstring])

session

( Aggregate Function )

Collects events into sessions, which are series of events that are no further than maxpause apart (defaults to 15m), and then performs an aggregate function across the events that make up the session.

Parameters

Name Type Required Default Description
maxpause string No 15m Defines the maximum pause between events in a session, i.e., events further apart than this will become separate sessions. Defaults to 15m
function [Aggregate] No count(as=_count) Specifies which aggregate functions to perform on each session. Default is to count(as=_count) the elements in each group

function    is the unnamed parameter

Examples

Example 1

Count unique visitors (each visitor defined as non-active for 15 minutes)

groupby(client_ip, function=session(maxpause=15m)) | count()

Example 2

Find the visits with most clicks

groupby(cookie_id, function=session(maxpause=15m, count(as=clicks))) | sort(clicks)

Example 3

Find the minimum and maximum values of the field bet within each session

groupby(cookie_id, function=session([max(bet),min(bet)]))

shannonEntropy

( Filter Function )

Calculates an entropy measure from a string of characters.

Parameters

Name Type Required Default Description
field string Yes The name of the input field.
as string No _shannonentropy The output name of the field to set

field    is the unnamed parameter

Examples

Example 1

Shows how to calculate a shannon entropy value for the string in a field. An input event with the field "dns_name" with the value "example.com" results in the field "entropy" having the rounded value 3.095795.

entropy := shannonEntropy(dns_name)

sort

( Aggregate Function )

Sort events by a field.
Setting the type field tells sort how to compare the individual values, either using lexicographical order (strings) or numerical magnitude (number, hex). The default type=any will make sort try to detect the type by looking at the values. type=hex support numbers as strings starting with either “0x” , “0X” or no prefix.
Warning: sorting is done in memory - so do not sort huge amounts of events. This is typicaly not a problem if the result has been aggregated. Typically sort is put last in the query after an aggregating function.

Parameters

Name Type Required Default Description
field [string] No _count Fields to sort by
type string No any type of the field we sort. Can be any, string, number, or hex. When set to any, sort tries to detect the type from the first value it finds for each field. If the value matches regex /-?0[xX]/ then hex is selected.
reverse bool No should sorting be reversed (i.e., descending)
order string No sorting (ascending or descending). descending is default
limit number No limit result size. If no limit is specified a default limit of 200 is used

field    is the unnamed parameter

Examples

Example 1

Count the different http status codes for a webserver and sort them descending by their count

groupby(field=statuscode, function=count()) | sort(field=_count, type=number, order=desc)

Example 2

Find the 50 slowest requests from service A

service=my-service-a | sort(responsetime, reverse=true, limit=50)

Example 3

Sort all results by statuscode, then by response_size within each statuscode

#type=accesslog | sort([statuscode, response_size])

split

( Filter Function )

Split an event structure created by a JSON array into distinct events. When Humio ingests JSON arrays, each array entry is turned into a separate attribute named [0], [1], … This function takes such an event and splits it into multiple events based on the prefix of such [N] attributes, allowing for aggregate functions across array values. It is not very efficient, so it should only be used after some aggressive filtering.

Parameters

Name Type Required Default Description
field string No _events Field to split by
strip bool No false Strip the field prefix when splitting (default is false)

field    is the unnamed parameter

Examples

Example 1

In GitHub events, a PushEvent contains an array of commits, and each commit gets expanded into sub-attributes of payload.commits[0], payload.commits[1], .... Humio cannot sum/count, etc. across such attributes. This expands each PushEvent into one event per commit so they can be counted.

type=PushEvent | split(payload.commits) | groupby(payload.commits.author.email) | sort()

splitString

( Filter Function )

Split a string by specifying a regular expression to split by

Parameters

Name Type Required Default Description
field string No @rawstring Field that needs splitting
by string Yes String/regex to split by
index number No Emit only this index after splitting. Can be negative: -1 designates the last element.
as string No _splitstring Emit selected attribute using this name

field    is the unnamed parameter

Examples

Example 1

Assuming an event has the @rawstring "2007-01-01 test bar" you can split the string into attributes `part[0]`, `part[1]`, and `part[2]`

 ... | part := splitString(field=@rawstring, by=" ")

Example 2

Assuming an event has the @rawstring "2007-01-01 test bar" you can pick out the date part using

 ... | date := splitString(field=@rawstring, by=" ", index=0)

Example 3

Assuming an event has the @rawstring "<2007-01-01>test;bar" you can split the string into attributes `part[0]`, `part[1]`, and `part[2]`. In this case, the splitting string is a regex specifying any one of the characters `<`, `>`, or `;`

 ... | part := splitString(field=@rawstring, by="[<>;]")

Example 4

Split an event into multiple events by newlines. The first function `splitString()` creates @rawstring[0], @rawstring[1], ... for each line, and the following `split()` creates the multiple events from the 'array' of rawstrings.

 ... | splitString(by="\n", as=@rawstring) | split(@rawstring)

stats

( Aggregate Function )

Used to compute multiple aggregate functions over the input. For example, ... | stats(function=[min(), max()]) is equivalent to just [min(), max()] and produces one row of data containing both the min and max results.

Parameters

Name Type Required Default Description
function [Aggregate] No count(as=_count) Specifies which aggregate functions to perform on each group. Default is to count(as=_count) the elements

function    is the unnamed parameter

Examples

Example 1

This is equivalent to just `count()`

stats(function=count())

Example 2

find the maximum and minimum

[min_response := min(responsetime), max_response := max(responsetime)]

stdDev

( Aggregate Function )

Calculates the standard deviation for a field over a set of events. Result is returned in field named _stddev

Parameters

Name Type Required Default Description
field string Yes field to extract a number from and calculate standard deviation over
as string No _stddev name of output field

field    is the unnamed parameter

Examples

Example 1

Find the standard deviation of bytes sent in http responses

stdDevBytes := stdDev(field=bytes_send)

stripAnsiCodes

( Filter Function )

Removes ANSI color codes and movement commands.

Parameters

Name Type Required Default Description
field string No @rawstring Specifies the field where to remove ANSI escape codes. Default is running against @rawstring
as string No name of output field, default is to replace contents of input field

field    is the unnamed parameter

Examples

Example 1

Remove the ANSI escape codes from the `message` field.

message := "\x1b[93;41mColor" | stripAnsiCodes(message) | @display := message

Example 2

Remove all ANSI escape codes from `@rawstring`

stripAnsiCodes()

subnet

( Filter Function )

Compute a subnet from an IPv4 field; by default the result is emitted into a _subnet field.

Parameters

Name Type Required Default Description
bits number Yes Specifies the prefix bits to include in the subnet, e.g. 23.
as string No _subnet Specifies the output field (defaults to _subnet).
field string Yes Specifies the input field.

field    is the unnamed parameter

Examples

Example 1

Compute the subnet for 'ipAddress' using a 23-bit prefix; emit into the subnet field

subnet(ipAddress, bits=23, as=subnet)

sum

( Aggregate Function )

Calculates the sum for a field over a set of events. Result is returned in a field named _sum

Parameters

Name Type Required Default Description
field string Yes field to extract a number from and sum over
as string No _sum name of output field

field    is the unnamed parameter

Examples

Example 1

How many bytes did our webserver send per minute

bucket(function=sum(bytes_send))

table

( Aggregate Function )

Represent the data as a table.
Specify a list of fields to select. Columns in the table are sorted in the specified field order. This is an aggregate function and it will limit the number of events returned using the limit parameter. It is possible to specify how the table is sorted using the sortby parameter.
(See the select function for similar tabular output that does not limit the number of events returned and does not sort the result, and is thus better suited for exporting a large amount of data to a file.)

Parameters

Name Type Required Default Description
fields [string] Yes The names of the fields to select.
sortby [string] No @timestamp Fields to sort by
type string No any type of the field we sort. Can be any, string, number, or hex. When set to any, sort tries to detect the type from the first value it finds for each field. If the value matches regex /-?0[xX]/ then hex is selected.
reverse bool No should sorting be reversed (i.e., descending)
order string No sorting (ascending or descending). descending is default
limit number No limit result size. If no limit is specified a default limit of 200 is used

fields    is the unnamed parameter

Examples

Example 1

Look at HTTP GET methods and create a table with the fields statuscode and responsetime

method=GET | table([statuscode, responsetime])

Example 2

Show the name and responsetime of the 50 slowest requests

table([name, responsetime], sortby=responsetime, reverse=true)

tail

( Aggregate Function )

Returns the newest events.

Parameters

Name Type Required Default Description
limit number No 200 The maximum number of events to return. The maximum allowed is 10000.

limit    is the unnamed parameter

Examples

Example 1

Select the 10 newest with loglevel=ERROR

loglevel=ERROR | tail(10)

Example 2

Select the 100 latest events and group them by loglevel

tail(limit=100) | groupby(loglevel)

timeChart

( Aggregate Function )

Draw a linechart where the x-axis is time. Time is grouped into buckets

Parameters

Name Type Required Default Description
limit number No 20 Defines the maximum number of series to produce (defaults to 20). A warning is produced if this limit is exceeded, unless the parameter is specified explicitly.
span string No Defines the time span for each bucket. The time span is defined as a relative time modifier like 1hour or "3 weeks". If not provided the search time interval is divided into 127 buckets.
buckets number No Defines the number of buckets. The time span is defined by splitting the query time interval into this many buckets. 0..1500
timezone string No Defines the time zone used when computing bucket boundaries, for example UTC or '+02:00'.
series string No Each value in the field specified by this parameter becomes a series on the graph
unit [string] No Each value is a unit conversion for the given column. For instance: “bytes/span to Kbytes/day” converts a sum of bytes into Kb/day automatically taking the time span into account. If present, this array must be either length 1 (apply to all series) or have the same length as the function parameter. Default is no conversion. The documentation has a section on this conversion
function [Aggregate] No count(as=_count) Specifies which aggregate functions to perform on each group. Default is to count(as=_count) the elements in each group

series    is the unnamed parameter

Examples

Example 1

Show the number of different http methods over time. This is done by dividing events into time buckets of 1 minute. Count the http methods (GET, POST, PUT etc). The timechart will have a line for each http method

timechart(span=1min, series=method, function=count())

Example 2

Show the number of different http methods over time. This is done by dividing the search interval into 1000 buckets. Count the http methods (GET, POST, PUT etc). The timechart will have a line for each http method

timechart(buckets=1000, series=method, function=count())

Example 3

Graph response time percentiles

timechart(function=percentile(field=responsetime, percentiles=[50, 75, 90, 99, 99.9]))

Example 4

We use Coda Hale metrics to print rates of various events once per minute. Such lines include 1-minute average rates "m1=N" where N is some number. This example displays all such meters, converting the rates from events/sec to Ki/day.

type=METER rate_unit=events/second | timechart(name, function=avg(m1), unit="events/sec to Ki/day", span=5m)

Example 5

Upon completion of every Humio request, we issue a log entry which (among other things) prints the size=N of the result. When summing such sizes you would need to be aware of the span, but using a unit conversion we can display the number in Mbytes/hour, and the graph will be agnostic to the span.

timechart(function=sum(size), unit="bytes/bucket to Mbytes/hour", span=30m)

top

( Aggregate Function )

Find the most common values of a field. It is also possible to find the occurrences of a field using the value of another field.

This function is implemented using a streaming approximation algorithm when the data set becomes huge. It is implemented using DataSketches. By default a warning is issued if the result's precision is worse than 5 percent; this can be adjusted using the error parameter. The implementation uses a maxMapSize with value 32768. Details about the precision are found here. Only results falling within the threshold are returned.

Parameters

Name Type Required Default Description
field [string] Yes Which fields to group by and count. If none of the fields are present, the event is not counted. The top function works like groupby([*fields*], function=count()) | sort(_count)
sum string No Change semantics from counting to summing the value of a sum field. If specified, the top works like groupby([*fields*], function=sum(*sum*)) | sort(_sum)
limit number No 10 The number of results to return
as string No The name of the count field created by top. Defaults to _count, but changed to _sum if the sum parameter is used.
rest string No If specified, an extra row is returned holding the count of all the other values not in the top results.
percent bool No false If set to true, add a column named percent containing the count in percentage of the total
error number No 5 Show a warning if the result is not precise enough. This parameter specifies the error threshold in percent. Default is 5 percent

field    is the unnamed parameter

Examples

Example 1

Find top ten users in the logs and show their count

top(user)

Example 2

Find top 20 ip addresses requesting most bytes from webserver

top(field=client, sum=bytes_sent, limit=20, as=bytes)
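
Example 3

A sketch using the percent parameter (the statuscode field is illustrative): show the five most common status codes together with their share of the total as a percentage.

top(statuscode, limit=5, percent=true)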

transpose

( Aggregate Function )

Transpose a (table-like) query result by creating an event (row) for each column (attribute name), in which attributes are named row[1], row[2], …

For example, given a query that returns a table, such as groupby(loglevel)

loglevel _count
ERROR 2
WARN 400
INFO 200

the result can be transposed to groupby(loglevel) | transpose()

column row[1] row[2] row[3]
_count 2 400 200
loglevel ERROR WARN INFO

To use the loglevel row as the header, use ... | transpose(header=loglevel)

column ERROR WARN INFO
_count 2 400 200

Parameters

Name Type Required Default Description
limit number No 5 Maximum number of rows to transpose (limited to 1000).
pivot string No Use this field as header AND column value.
header string No Use this field as header value.
column string No column Field to use as column value.

pivot    is the unnamed parameter

Examples

Example 1

Given a count of different log levels, transpose this into a single row with counts for each log level.

groupby(loglevel) | transpose(header=loglevel) | drop(column)

worldMap

( Aggregate Function )

A helper function to produce data compatible with the World Map widget. It takes either ip-addresses or lat/lon as input and buckets points using a geohashing algorithm.

Parameters

Name Type Required Default Description
ip string No The field containing the ip-address to look up geo-coordinates for.
lat string No A field containing the latitude to use for geohash bucketing.
lon string No A field containing the longitude to use for geohash bucketing
magnitude Aggregate No count(as=_count) A function used to calculate the magnitude (weight) of each bucket. This value is used to determine the size or opacity of the world map markers.

There is no unnamed parameter

Examples

Example 1

Plot ip-addresses on the world map. The magnitude is the number of observations in each bucket (the default).

worldMap(ip=myIpField)

Example 2

Plot existing geo-coordinates (latitude/longitude) on the world map. The `worldmap()` function will automatically bucket the locations to reduce the number of points.

worldMap(lat=location.latitude, lon=location.longitude)

Example 3

Plot ip addresses on the world map and use average latency as magnitude of the points.

worldMap(ip=myIpField, magnitude=avg(latency))