Charlists are created using the ~c sigil.
~c"hello"
Note that in Elixir versions before 1.15, charlists are written and displayed with single quotes, as in 'hello'.
A charlist is a list of integers. Each integer is the Unicode value of a character, also known as its code point.
[65, 66, 67]
# => ~c"ABC"
~c"" === []
# => true
is_list(~c"hello")
# => true
Because charlists are lists, you can work with them just like any other list: using recursion and pattern matching, or using the List module.
[first_letter | _] = ~c"cat"
first_letter
# => 99
List.first(~c"hello")
# => 104
List.pop_at(~c"hello", 0)
# => {104, ~c"ello"}
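As a minimal sketch of the recursion and pattern matching mentioned above, the following module walks a charlist one code point at a time. The module name CharlistSketch and its functions are hypothetical, introduced only for illustration.

```elixir
defmodule CharlistSketch do
  # Hypothetical helper: counts the code points in a charlist.
  def len([]), do: 0
  def len([_head | tail]), do: 1 + len(tail)

  # Hypothetical helper: returns the last code point, or nil if empty.
  def last([]), do: nil
  def last([char]), do: char
  def last([_head | tail]), do: last(tail)
end

CharlistSketch.len(~c"hello")
# => 5
CharlistSketch.last(~c"hello")
# => 111
```

In real code, the built-in length/1 and List.last/1 do the same job; the sketch only shows that a charlist matches the usual [head | tail] pattern.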
You can concatenate two lists together using ++.
~c"hi" ++ ~c"!"
# => ~c"hi!"
The longer the first list is, the slower the concatenation, so avoid repeatedly appending to lists of arbitrary length.
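One common way to avoid repeated appending is to prepend each element and reverse the result once at the end. The sketch below contrasts the two approaches; the module AppendSketch and its function names are assumptions made for this example, not part of any library.

```elixir
defmodule AppendSketch do
  # Slower: every acc ++ [char] copies the whole accumulated list.
  def build_by_appending(chars) do
    Enum.reduce(chars, [], fn char, acc -> acc ++ [char] end)
  end

  # Faster: prepending is constant time; reverse once at the end.
  def build_by_prepending(chars) do
    chars
    |> Enum.reduce([], fn char, acc -> [char | acc] end)
    |> Enum.reverse()
  end
end

AppendSketch.build_by_appending(~c"hello")
# => ~c"hello"
AppendSketch.build_by_prepending(~c"hello")
# => ~c"hello"
```

Both functions return the same list; they differ only in how much copying happens along the way.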
If a list of integers contains only code points of printable ASCII characters, it will be displayed as a charlist, even if it was defined using the [] syntax.
~c"ABC"
# => ~c"ABC"
[65, 66, 67]
# => ~c"ABC"
If a list of integers contains even one code point of an unprintable character (e.g. 0-6, 14-26, 28-31), it will be displayed as a list, even if it was defined using the ~c"" syntax.
~c"ABC\0"
# => [65, 66, 67, 0]
[65, 66, 67, 0]
# => [65, 66, 67, 0]
Printability can be checked with the List.ascii_printable? function.
List.ascii_printable?([65, 66, 67])
# => true
List.ascii_printable?([65, 66, 67, 0])
# => false
Keep in mind that those are two different ways of displaying the same data. The values are strictly equal.
~c"ABC" === [65, 66, 67]
# => true
When printing a list with IO.inspect, you can use the :charlists option to control how lists are printed.
IO.inspect(~c"ABC", charlists: :as_charlists)
# => prints ~c"ABC"
IO.inspect(~c"ABC", charlists: :as_lists)
# => prints [65, 66, 67]
You can prepend a character with ? to get its code point.
?A
# => 65
[?:, ?)]
# => ~c":)"
Charlists and strings consisting of the same characters are not considered equal.
~c"hello" == "hello"
# => false
Each value in a charlist is the Unicode code point of a character, whereas in a string the code points are encoded as UTF-8 bytes.
IO.inspect(~c"tschüss", charlists: :as_lists)
# => prints [116, 115, 99, 104, 252, 115, 115]
IO.inspect("tschüss", binaries: :as_binaries)
# => prints <<116, 115, 99, 104, 195, 188, 115, 115>>
Note how ü, code point 252, is encoded in UTF-8 as 195 and 188.
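The difference between code points and UTF-8 bytes also shows up when measuring size. The following sketch uses only built-in functions and assumes the ü is typed as the single precomposed character with code point 252.

```elixir
# A charlist has one element per code point.
length(~c"tschüss")
# => 7

# A string has one byte per ASCII character, but two bytes for ü.
byte_size("tschüss")
# => 8

# String.length counts characters, not bytes.
String.length("tschüss")
# => 7

# Encoding code point 252 as UTF-8 yields the two bytes 195 and 188.
<<252::utf8>> === <<195, 188>>
# => true
```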
In practice, charlists are rarely used. Their main use case is interfacing with Erlang, in particular when using older libraries that do not accept binaries as arguments.
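As a small illustration of the Erlang use case, the built-in Erlang functions for converting between integers and their textual form work with charlists rather than binaries. These two functions are chosen here only as examples.

```elixir
# Erlang parses the number from a charlist.
:erlang.list_to_integer(~c"42")
# => 42

# Erlang returns the digits as a charlist.
:erlang.integer_to_list(42)
# => ~c"42"
```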
When working with Elixir, use strings to store text. The String module offers a wide choice of text-processing functions that are not available for charlists.
Charlists can be converted to strings with to_string.
to_string(~c"hello")
# => "hello"
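The conversion also works in the other direction, which may be useful when a charlist has to be passed to Erlang. A minimal sketch using the built-in to_charlist and String.to_charlist functions:

```elixir
# Kernel.to_charlist/1 turns a string into a charlist.
to_charlist("hello")
# => ~c"hello"

# String.to_charlist/1 does the same for strings specifically.
String.to_charlist("hello")
# => ~c"hello"

# Converting there and back yields the original string.
"hello" |> to_charlist() |> to_string()
# => "hello"
```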