jq works by passing the incoming JSON data through a single expression (written as a pipeline of filters) to achieve the desired transformed data.
jq command-line options
The jq language is implemented by the jq program.
This program provides several handy command-line options to control how the input is consumed and how the output is presented.
In the examples below you'll encounter:
-n or --null-input
Normally the jq program is given a file to read, or you send data to its input.
The --null-input option allows you to generate JSON data without any inputs.
-c or --compact-output
jq pretty-prints its output by default.
Nicely formatted data is extremely useful for humans to read.
Machines do not need it, however: the --compact-output option removes the formatting whitespace to minimize the size of the resulting JSON.
-f filename or --from-file filename
Read the jq program from filename instead of providing it on the command line.
sed and awk both use the -f option for the same purpose.
You will see this used in the test scripts for the practice exercises.
See the manual for details about all the options.
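As a rough sketch of how these three options behave in practice (the file name count.jq and the sample data are made up for illustration):

```shell
# -n: no input is read; the program builds JSON from scratch
jq -n '{greeting: "hello"}'

# -c: the same kind of program, but printed on a single line
jq -n -c '{greeting: "hello", items: [1, 2]}'
# → {"greeting":"hello","items":[1,2]}

# -f: read the jq program from a file (count.jq is a hypothetical name)
printf '%s\n' '.items | length' > count.jq
echo '{"items": [1, 2]}' | jq -f count.jq
# → 2
```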
Filters are also known as Expressions.
A filter takes an input and produces an output.
As in a Unix shell, you can join filters with a pipe (|), which connects the output of one filter to the input of another.
.
This is the simplest filter.
It simply passes its input to its output.
For example, jq pretty-prints by default, so passing JSON to the . filter gives nicely formatted output for free!
$ echo '[1, 2, 3]' | jq '.'
[
  1,
  2,
  3
]
This will be a quick introduction to working with arrays. We will cover this topic in greater detail later.
Array elements are accessed with brackets, and are zero-indexed.
$ echo '[10, 20, 30]' | jq '.[1]'
20
A filter can build an array by wrapping an expression in [ and ].
With a known list of elements:
$ jq -n '[1, 2, 3]'
To collect a stream of elements: for example, range is a function that outputs a stream of numbers
$ jq -n 'range(10; 70; 15)'
10
25
40
55
Using [] collects the results of the expression into an array:
$ jq -c -n '[range(10; 70; 15)]'
[10,25,40,55]
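Construction and indexing combine naturally. The sketch below reuses the range call from above and also uses the one-argument form of range, which (per the jq manual) counts from 0 up to, but not including, its argument:

```shell
# Collect a stream into an array
jq -n -c '[range(5)]'
# → [0,1,2,3,4]

# Build an array from a stream, then pick the element at index 1
jq -n '[range(10; 70; 15)] | .[1]'
# → 25
```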
The comma is not just syntax that separates array elements. Comma is an operator that joins streams.
For example, [1, 2, 3] is a filter that uses the array constructor [] to collect the result of joining the three expressions 1, 2 and 3.
Did you notice the semicolons in range(10; 70; 15) above?
Because commas have a specific purpose in the jq language, functions that take multiple arguments use semicolons to separate the arguments.
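To see the comma acting as a stream-joining operator rather than as array syntax, here is a small sketch (the sample object is invented for illustration):

```shell
# Outside of [], the comma simply emits each result in turn
jq -n '1, 2, 3'
# → 1
# → 2
# → 3

# Any expressions can be joined, including ones that emit several values
jq -n -c '[range(3), 10, range(2)]'
# → [0,1,2,10,0,1]

# Both sides of the comma receive the same input
echo '{"a": 1, "b": 2}' | jq '.a, .b'
# → 1
# → 2
```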
A quick introduction to objects.
As in many programming languages, use dots to access object properties:
$ echo '{"foo": {"bar": "qux"}}' | jq '.foo.bar'
"qux"
Brackets can be used for objects too, but then quotes are needed for string literals. This is one method to work with keys containing spaces.
$ echo '{"foo bar": "qux"}' | jq '.["foo bar"]'
"qux"
You can construct an object with {} and key: value pairs.
Quotes are not required around keys that are "simple" strings.
jq -n '{question: (6 * 9), answer: 42}'
outputs
{
  "question": 54,
  "answer": 42
}
To treat the key as an expression, you must wrap it in parentheses (the following produces the same output as above).
echo '[["question", "answer"], [54, 42]]' \
| jq '{(.[0][0]): .[1][0], (.[0][1]): .[1][1]}'
It is quite common to want to extract a subset of keys from a large object.
For example, to extract id and name from
{
"id": 101,
"name": "alpha widget",
"specifications": {...}
}
We could write
{id: .id, name: .name}
This is so common that there is a shorthand syntax for it:
{id, name}
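A runnable sketch of both forms follows. The contents of "specifications" are elided in the text above, so the value used here is a placeholder invented for the example:

```shell
# Placeholder input: the "specifications" value is made up for illustration
input='{"id": 101, "name": "alpha widget", "specifications": {"weight": 1}}'

# Long form
echo "$input" | jq -c '{id: .id, name: .name}'
# → {"id":101,"name":"alpha widget"}

# Shorthand form, same result
echo "$input" | jq -c '{id, name}'
# → {"id":101,"name":"alpha widget"}
```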
For example, given file.json containing
{
"key1": "value1",
"key2": [5, 15, 25]
}
Let's calculate the length of the key2 array:
$ jq '.key2 | length' file.json
3
We're piping the output of the .key2 expression as the input to length, which unsurprisingly outputs the number of elements in the array.
This is an aspect of jq that takes some getting used to: most (but not all) functions act like filters where you pass data to the filter's input, not as an argument.
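The difference between "data on the input" and "data as an argument" might look like this. The builtins sort and split are not introduced in this text; they are standard jq functions used here only for illustration:

```shell
# length and sort take their data from the input, with no arguments
jq -n '"hello" | length'
# → 5
jq -n -c '[3, 1, 2] | sort'
# → [1,2,3]

# split takes the separator as an argument,
# but the string to split still arrives on the input
jq -n -c '"a,b,c" | split(",")'
# → ["a","b","c"]
```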
In this example, the input JSON data is ignored and has no impact on the output:
$ echo '{"answer": 42}' | jq '6 * 9'
54
A filter can output more than one value.
For example, the .[] filter outputs each element of an array as a separate value:
$ jq -n -c '[1, 2, 3]'
[1,2,3]
$ jq -n -c '[1, 2, 3] | .[]'
1
2
3
Piping such a filter into another will execute the second filter once for each value:
$ jq -n -c '[1, 2, 3] | .[] | . * 2'
2
4
6
This is like implicit iteration.
Once you understand this technique, you'll realize very powerful jq filters can be very concise.
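The same implicit iteration works on arrays of objects. In this sketch (the input records are invented), the object constructor runs once per element without any explicit loop:

```shell
echo '[{"name": "alpha", "qty": 2}, {"name": "beta", "qty": 5}]' \
  | jq -c '.[] | {name, total: (.qty * 10)}'
# → {"name":"alpha","total":20}
# → {"name":"beta","total":50}
```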
Parentheses are used to group sub-expressions together to enforce the order of operations, just like in other languages.
In jq, the places where they are needed can be surprising.
For example, let's say we want to construct an array with 2 elements: the square root of 9, and e raised to the power 1.
The two individual expressions are 9 | sqrt and 1 | exp.
We expect the output to be the array [3, 2.7...]:
$ jq -n '[ 9|sqrt, 1|exp ]'
[
  20.085536923187668,
  2.718281828459045
]
Why didn't we get what we expected? The pipe binds more loosely than the comma, so jq evaluates that as if it were written like this:
[ ((9|sqrt), 1) | exp ]
jq builds a stream of two elements (3 and 1) which are each given to exp.
We need to ensure that exp only takes one number as input.
In other words, we need to enforce that the pipe is evaluated before the comma.
$ jq -n '[ 9|sqrt, (1|exp) ]'
[
  3,
  2.718281828459045
]
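The same precedence rule shows up with simpler arithmetic. A minimal sketch using only the comma, the pipe and multiplication:

```shell
# Without parentheses, both 1 and 2 flow through the pipe
jq -n -c '[1, 2 | . * 10]'
# → [10,20]

# With parentheses, only 2 is piped into the multiplication
jq -n -c '[1, (2 | . * 10)]'
# → [1,20]
```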
From the manual:
jq supports the same set of datatypes as JSON - numbers, strings, booleans, arrays, objects (which in JSON-speak are hashes with only string keys), and "null".
You'll learn more about these in subsequent exercises.
Whitespace is not significant in jq.
Use spaces/tabs/newlines as you see fit to format your code.
We are unaware of any existing jq style guides.
Values in jq are immutable.
Filters that modify a value will output a new value.
This implies that jq does not have global variables: you'll need to get used to passing state from one filter to another.
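One way to observe immutability is shown below. The sketch uses the = assignment operator, which is not covered in this text; it outputs an updated copy and leaves the original input untouched:

```shell
# The first expression emits a modified copy;
# the second emits the input, which is unchanged
jq -n -c '{"a": 1} | (.a = 2), .'
# → {"a":2}
# → {"a":1}
```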
The values false and null are considered false. Any other value (including the number zero and the empty string/array/object) is true.
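A quick check of these rules, as a sketch that uses jq's if/then/else/end conditional (not otherwise covered in this text):

```shell
jq -n -c '[false, null, 0, "", [], {}] | map(if . then "truthy" else "falsy" end)'
# → ["falsy","falsy","truthy","truthy","truthy","truthy"]
```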
Without going into great depth (functions will be a topic for another exercise), here are some useful builtin functions:
Given an array as input, length outputs the number of elements in the array.
$ jq -n '[10, 20, 30, 40] | length'
4
The + operator does different things depending on the type of its operands: it adds numbers, it concatenates strings, it appends arrays, it merges objects.
$ jq -c -n '
3 + 4,
"foo" + "bar",
["a", "b"] + ["c"],
{"m": 10} + {"n": 20}
'
7
"foobar"
["a","b","c"]
{"m":10,"n":20}
add is a function that takes an array and returns an item with all the elements added together using the rules of +.
[1, 2, 3] | add outputs 6.
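Because add follows the rules of +, it also works on arrays of strings, arrays, or objects. A short sketch; the null result for an empty array is the behavior documented in the jq manual:

```shell
jq -n -c '["a", "b", "c"] | add'
# → "abc"
jq -n -c '[[1, 2], [3]] | add'
# → [1,2,3]
jq -n -c '[{"m": 10}, {"n": 20}] | add'
# → {"m":10,"n":20}

# An empty array has nothing to add together
jq -n -c '[] | add'
# → null
```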
Given an array as input and a filter as an argument, map outputs an array where the filter is applied to each element:
$ jq -c -n '[10, 20, 30, 40] | map(. / 5)'
[2,4,6,8]
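Given some input and a filter as an argument, select outputs the input unchanged if the filter produces a true value for it; otherwise it outputs nothing (not the null value, truly no output).
For example, given some numbers, select the ones divisible by 3: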
$ jq -n 'range(10) | select(. % 3 == 0)'
0
3
6
9
Recall that range outputs a stream of numbers.
select will be invoked once for each number.
Only the numbers "passing" the expression are output.
You often need to select elements of an array. There are a couple of ways to do this.
With the input ["Anne", "Bob", "Cathy", "Dave"], select the names having length 4.
Use map and select together:
map(select(length == 4))
Or explode the array into elements, select on that stream, and collect the results:
[ .[] | select(length == 4) ]
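Run against the input above, both forms should produce the same array. A minimal sketch:

```shell
names='["Anne", "Bob", "Cathy", "Dave"]'

# map and select together
echo "$names" | jq -c 'map(select(length == 4))'
# → ["Anne","Dave"]

# explode, select, and collect
echo "$names" | jq -c '[ .[] | select(length == 4) ]'
# → ["Anne","Dave"]
```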
Comments start with a # character and continue to the end of the line.
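Since whitespace is not significant, a program can be spread over several lines and annotated. A small sketch:

```shell
jq -n '
  # build an array, then sum it
  [1, 2, 3]   # the array constructor
  | add       # add the elements together
'
# → 6
```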