Suppose we want to sum an array of numbers. There are many ways to accomplish this goal.
In many languages, this problem is expressed as a loop with an accumulator variable.
It can also be written as a recursive function. In pseudo-code, it might look like this.
function Add(X, Sum=0):
    if X is empty then
        return Sum
    else
        return Add(rest(X), Sum + first(X))
    end
end
This method of dividing the problem into smaller pieces can also be described as "reducing towards the base case."
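As a rough jq counterpart to the pseudo-code above, the following sketch defines a recursive sum. The name sum_rec is hypothetical (it is not a jq builtin), and the sketch assumes jq 1.5 or later for the $sum parameter syntax.

```jq
# sum_rec is a hypothetical helper name, not part of jq.
# The input (.) is the array still to be summed; $sum is the accumulator.
def sum_rec($sum):
  if length == 0 then
    $sum                          # base case: nothing left to add
  else
    .[0] as $first                # first(X)
    | .[1:]                       # rest(X) becomes the new input
    | sum_rec($sum + $first)      # recurse with the updated accumulator
  end;

# Zero-argument entry point that starts the accumulator at 0.
def sum_rec: sum_rec(0);
```

With these definitions in scope, [10, 20, 30, 40] | sum_rec produces 100, and [] | sum_rec produces 0.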
Reduce is a way to combine all the elements of a data structure into a single value. The process iterates over the data structure, applying a function to each element to update the accumulated result.
In jq, this process is implemented in the reduce filter.
In other languages, it might be called "fold", "fold-left", "inject", or "aggregate".
The jq reduce expression looks like this.
reduce STREAM_EXPRESSION as $var (INITIAL_VALUE; UPDATE_EXPRESSION)
STREAM_EXPRESSION is a stream of items, each stored in the $var variable in turn. To iterate over an array, use .[], for example: $myArray | .[]
INITIAL_VALUE is the starting value of the accumulated result (known as the "accumulator").
UPDATE_EXPRESSION combines ("folds") the current value ($var) into the accumulator. Within this expression, . is the value of the accumulator. After the last item, the accumulated value is the output of reduce.
Let's look at an example: adding up the numbers in an array.
The add filter does just this, but we'll see how to implement it.
If we use [10, 20, 30, 40] as the input and take zero as the initial state, this is what each step looks like.
# | state | element | reducer | result |
---|---|---|---|---|
1 | 0 | 10 | 0 + 10 | 10 |
2 | 10 | 20 | 10 + 20 | 30 |
3 | 30 | 30 | 30 + 30 | 60 |
4 | 60 | 40 | 60 + 40 | 100 |
In jq syntax, that looks like this.
0 + 10 | . + 20 | . + 30 | . + 40
We can express that with the reduce filter.
[10, 20, 30, 40] | reduce .[] as $n (0; . + $n) # => 100
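To watch the accumulator change step by step, one option is jq's foreach. It takes the same stream, initial value, and update expression, but emits every intermediate state rather than only the last one. This is a small sketch for inspecting the states listed in the table, not a replacement for reduce.

```jq
# Same stream, initial value, and update expression as the reduce above,
# but foreach emits the accumulator after every step.
[10, 20, 30, 40] | [foreach .[] as $n (0; . + $n)]   # => [10, 30, 60, 100]
```

The last element of that array is the value reduce would return.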
The add builtin is actually implemented with reduce, but uses "null" as the initial state (any data type can be added to null).
def add: reduce .[] as $x (null; . + $x);
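Because null works as a neutral starting value for +, the same definition handles strings, arrays, and an empty input as well as numbers. The sketch below uses a local copy named my_add (a hypothetical name, chosen to avoid shadowing the builtin) to illustrate this.

```jq
# my_add is a hypothetical local copy of the definition shown above.
def my_add: reduce .[] as $x (null; . + $x);

([1, 2, 3] | my_add),             # => 6
(["a", "b", "c"] | my_add),       # => "abc"
([[1, 2], [3]] | my_add),         # => [1, 2, 3]
([] | my_add)                     # => null (the initial state is returned unchanged)
```

The parentheses matter here: the comma operator binds more tightly than the pipe, so each pipeline is wrapped to keep the four examples separate.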
In the reducing expression, . is the accumulator, not the original input.
If the input is an object that you need to reference inside the reducing expression, you need to store it in a variable first.
{"apple": 10, "banana": 16, "carrot": 4}
| . as $obj
| reduce (keys | .[]) as $key (0; . + $obj[$key]) # => 30
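Another way to avoid the extra variable, sketched here as an alternative rather than a recommendation, is to put the values themselves in the stream. Both .[] on an object and to_entries are standard jq; the variable names $count and $entry are illustrative.

```jq
# .[] on an object emits its values, so no lookup by key is needed.
# to_entries emits {"key": ..., "value": ...} pairs for when the key matters too.
{"apple": 10, "banana": 16, "carrot": 4}
| (reduce .[] as $count (0; . + $count)),                 # => 30
  (reduce to_entries[] as $entry (0; . + $entry.value))   # => 30
```

In both forms the stream carries everything the update expression needs, so . can stay reserved for the accumulator.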
The accumulator can be any type of data. For example, you may want to reverse an array.
["A", "B", "C", "D"]
| reduce .[] as $elem ([]; [$elem] + .) # => ["D", "C", "B", "A"]
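The accumulator can also be an object. As a further sketch, the following counts how often each string occurs; the input data is made up for illustration.

```jq
# The accumulator starts as an empty object; each element increments its own key.
# A missing key reads as null, and null + 1 is 1, so no explicit initialization is needed.
["apple", "pear", "apple", "fig", "apple"]
| reduce .[] as $word ({}; .[$word] += 1)   # => {"apple": 3, "pear": 1, "fig": 1}
```

Here the update expression modifies one field of the accumulator rather than replacing the whole value, which is a common pattern when building lookup tables with reduce.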