You can define your own custom functions in jq
to encapsulate whatever logic you need.
Functions act just like builtins: they take an input and emit zero, one or more outputs.
You can define a jq
function using the following syntax:
# no arguments
def funcname: expression;
# or with arguments
def funcname(args): expression;
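As a concrete instance of both forms, here is a minimal sketch. The names double and increment_by are made up for illustration and are not jq builtins.

```jq
# no arguments: doubles its input
def double: . * 2;

# one argument: adds n to its input
def increment_by(n): . + n;

5 | double | increment_by(1)   # => 11
```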
A definition starts with the def keyword and ends with a semi-colon.
As with all jq syntax, you can use arbitrary whitespace for readability.
Functions must be defined before they are used: this is an error:
def A: B(10);
def B(n): n + 1;
A
# => error: B/1 is not defined
This implies you have to place functions at the top of your jq
code, prior to the "main" expression.
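Reordering the two definitions from the failing example, so that B comes first, should be enough to make it work:

```jq
def B(n): n + 1;   # defined first, so A can refer to it
def A: B(10);

A
# => 11
```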
Functions can be nested:
def A:
  def B(n): n + 1;
  B(10)
;
A
# => 11
Here, the B function is only visible in the body of A.
A function introduces a new scope for variables and nested functions.
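To illustrate that scoping, this sketch reuses the nested definition and then tries to call B from outside A. It is expected to be rejected, since B is not visible at the top level.

```jq
def A:
  def B(n): n + 1;   # B exists only inside the body of A
  B(10)
;

A, B(1)
# => error: B/1 is not defined
```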
Function arguments are separated by semi-colons, not commas. For example, here is a function that takes a number as input, adds one number to it, and multiplies the result by another:
def add_mul(adder; multiplier): (. + adder) * multiplier;
10 | add_mul(5; 4) # => 60
Semi-colons are needed because the comma already has a purpose in jq: it is an operator that joins streams.
Using a comma instead of a semi-colon makes jq look for a 1-argument add_mul function.
No such function exists, so the program is rejected before it runs:
10 | add_mul(5, 4)
# error: add_mul/1 is not defined
The comma in 5, 4
concatenates the numbers 5 and 4 into a stream.
When we call a function with a stream as an argument, jq
will call that function multiple times, once for each value in the stream.
This is an example of the "implicit iteration" inherent in jq
streams.
10 | add_mul(5, 4)
is equivalent to the following.
(10 | add_mul(5)), (10 | add_mul(4))
Now we can see how the add_mul/1 is not defined
error pops up.
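The same iteration can be seen succeeding when a 1-argument function does exist. In this sketch, plus is a hypothetical helper introduced only for illustration:

```jq
def plus(n): . + n;

10 | plus(5, 4)
# => 15
# => 14
```

The single call produces two outputs, one for each value in the argument stream.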
Function arguments are filters, not values. In this sense, they act like what other languages describe as callbacks.
Using the add_mul function as an example:
10 | add_mul(. + 5; . - 2) # => 200
What's happening here?
The adder argument gets the expression . + 5.
In the function body, . + adder becomes . + . + 5 and, since . == 10, that evaluates to 25.
The multiplier argument is the expression . - 2, which evaluates to 8.
The result is 25 * 8 == 200.
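Because an argument is a filter, a function can run it more than once, each time against a different input. A minimal sketch, with apply_twice as a made-up name:

```jq
# f runs on the original input, then again on its own output
def apply_twice(f): f | f;

3 | apply_twice(. * 2)   # => 12
```

If the argument were a plain value, the second application would have nothing to recompute; here . * 2 is evaluated first with 3 and then with 6.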
Sometimes you'll want to "materialize" an argument into a variable:
def my_func(arg):
arg as $arg
| other stuff ...
;
There's a shorthand for this:
def my_func($arg):
other stuff ...
;
Take note that this is just "syntactic sugar": the name arg with no $ is still in scope in the function.
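A small sketch makes this visible. The function name show is hypothetical; its body uses both the variable and the filter form of the same argument:

```jq
# $n is the materialized value, n is the original filter
def show($n): [$n, n];

show(1 + 2)   # => [3, 3]
```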
For example, I wrote something like this to solve an exercise:
# function that encodes the input value
def code:
# expression here
;
def equals($code):
(. | code) as $this_code
| $code == $this_code
;
("some key value" | code) as $key
| ["array", "of", "values"]
| map(select(equals($key)))
and I was surprised that every value of the array equalled the key.
This happened because jq saw the equals function as:
def equals(code):
code as $code
| (. | code) as $this_code
| $code == $this_code
;
The argument code shadowed the previously defined function code.
That meant (. | code) simply output the argument instead of calculating a new code based on the input value.
Thus $this_code and $code were always the same.
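One way to avoid the clash is to give the parameter a name that differs from the function. The sketch below assumes a stand-in body for code (ascii_downcase) and a renamed parameter, $expected; neither is from the original exercise.

```jq
# stand-in encoding, assumed purely for illustration
def code: ascii_downcase;

# $expected no longer shadows the code function
def equals($expected):
  code == $expected
;

("Of" | code) as $key
| ["array", "of", "values"]
| map(select(equals($key)))
# => ["of"]
```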
Functions have an arity -- the number of arguments they take.
Functions can use the same name with different arities.
The builtin range function demonstrates this: range/1, range/2 and range/3 all co-exist.
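For reference, this is how the three arities of range behave when each stream is collected into an array:

```jq
[range(3)],          # => [0, 1, 2]      upto
[range(2; 5)],       # => [2, 3, 4]      from; upto
[range(0; 10; 3)]    # => [0, 3, 6, 9]   from; upto; by
```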
This can be useful for defining recursive functions that carry state via arguments.
For example map
could be implemented like:
def my_map($accumulator; func):
  if length == 0
  then $accumulator
  else first as $elem | .[1:] | my_map($accumulator + [$elem | func]; func)
  end
;
def my_map(func):
  my_map([]; func)
;
[1, 2, 3, 4] | my_map(. * 10) # => [10, 20, 30, 40]
jq performs tail-call optimization, but only for 0-arity functions.
A jq module is a file containing only functions.
Modules are included into a jq program with the include or import directives.
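A rough sketch of both directives follows. The module file name my_funcs.jq, its contents, and the alias f are all assumptions made for illustration, and the program assumes jq is run with -L . so that the current directory is on the module search path.

```jq
# Assumed contents of a hypothetical module file, my_funcs.jq:
#
#     def double: . * 2;
#
include "my_funcs";        # double becomes callable directly
import "my_funcs" as f;    # double is also callable as f::double

5 | double, f::double
# => 10
# => 10
```

With include, the module's functions are used as if they had been defined in the program itself, while import keeps them behind the chosen prefix.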