One piece of advice new bash programmers will hear over and over:
Quote your variables!
Why is that?
When you expand a variable (like $name) you almost always want to use the value of the variable as a single string (or number).
You don't want the value to be unexpectedly handled as two or more separate strings.
Quoting is the primary mechanism to ensure that a piece of text is treated as a single word. There are two main types of quotes.
Double quotes allow parameter expansion, arithmetic expansion and command substitution within them.
echo "This is bash version $BASH_VERSION"
You can include a literal double quote inside a double-quoted string by preceding it with a backslash.
echo "She said \"Let me teach you about bash.\""
Single quotes do not allow any expansions inside them.
echo 'This is bash version $BASH_VERSION'
# ...^..................................^
# prints exactly: This is bash version $BASH_VERSION
It is not possible to embed a single quote inside a single-quoted string.
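There are still a few common ways to get an apostrophe into a string. The sketch below shows three of them; the variable names are only illustrative.

```bash
# Close the single-quoted string, add a backslash-escaped quote, reopen the string.
a='It'\''s a trap'
# Double quotes can hold a single quote directly.
b="It's a trap"
# ANSI-C quoting (a bash feature) understands backslash escapes.
c=$'It\'s a trap'
printf '%s\n' "$a" "$b" "$c"
```

All three variables end up holding the same text, so the choice is mostly a matter of readability.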
What's wrong with echo $name, exactly?
When the shell parses a line of code, it performs several kinds of expansions, in order. Two of the last expansions are Word Splitting and Filename Expansion.
Any results of parameter expansion, command substitution or arithmetic expansion that did not occur within double quotes are eligible for "word splitting".
The shell takes the expanded result and splits it on any sequence of characters present in the value of $IFS.
By default, this is space, tab and newline.
We can see this in action using a shell loop:
sentence="The quick brown fox jumps."
i=0
for word in "$sentence"; do echo "$((++i)) $word"; done
# ..........^.........^
outputs
1 The quick brown fox jumps.
But unquoted:
i=0
for word in $sentence; do echo "$((++i)) $word"; done
# .........^.........^
outputs
1 The
2 quick
3 brown
4 fox
5 jumps.
Let's see it with a different value of $IFS:
csv="first,second,third"
IFS=","
for word in $csv; do echo "$word"; done
outputs
first
second
third
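Assigning IFS like this changes it for the rest of the script. One way to limit the change, sketched here with an illustrative array name (fields), is to put the assignment in front of a single read command, so it applies to that command only.

```bash
csv="first,second,third"
# The IFS assignment applies only to this one read command.
IFS=, read -r -a fields <<< "$csv"
for field in "${fields[@]}"; do echo "$field"; done
echo "${#fields[@]} fields"
```

The loop quotes "${fields[@]}" so that each array element stays a single word, even if it contains spaces.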
A technique to disable word splitting is to set $IFS to an empty string:
sentence="The quick brown fox jumps."
IFS= # or explicitly, IFS=""
i=0
for word in $sentence; do echo "$((++i)) $word"; done
# .........^.........^
outputs
1 The quick brown fox jumps.
However, leaving variables unquoted is not recommended unless you also disable Filename Expansion. We'll see how to do that below.
"Glob" patterns are used to express a concise pattern to match a set of files.
For example, you might match all the Markdown files in a directory with *.md.
Glob patterns have existed since the invention of Unix.
The glob wildcard characters are:
* matches zero or more of any character,
? matches exactly one of any character,
[...] matches exactly one character from the set of characters listed inside the brackets.
    [abc] matches exactly one character that is either an a or a b or a c.
    To include a close bracket in the set, put it first (or directly after a leading ^).
        [][] is a character set that will match either a close or an open bracket.
        Many regular expression dialects write this as [[\]].
    [x-y] matches exactly one character that is in the range from x to y.
        To include a hyphen in the set, put it first (after the optional ^) or make it the last character before the closing bracket.
        [-xy] or [xy-] each match exactly one character that is a - or an x or a y.
        Many regular expression dialects also accept a backslash-escaped hyphen anywhere in the set: [x\-y].
        [0-9A-Fa-f] matches a hexadecimal digit.
    [^abc] matches exactly one character that is NOT an a or a b or a c,
    [^x-y] matches exactly one character that is NOT in the range from x to y.
    [[:character_class:]] matches one character that is in the named "character_class". The character classes are:
        lower -- lowercase letters
        upper -- uppercase letters
        alpha -- letters
        digit -- digits
        alnum -- letters and digits
        space -- whitespace
        blank -- horizontal whitespace (space and tab)
        xdigit -- hexadecimal digits
        word -- characters allowed in an identifier (letters, numbers, underscore)
        punct -- punctuation
        cntrl -- control characters
        graph -- "visible" characters
        print -- visible characters plus space
    Some classes are built from others:
        the [[:alpha:]] class consists of [[:lower:][:upper:]]
        the [[:alnum:]] class consists of [[:alpha:][:digit:]]
        the [[:word:]] class consists of [[:alnum:]_]
            the underscore is in both [[:word:]] and [[:punct:]]
        the [[:graph:]] class consists of [[:alnum:][:punct:]]
        the [[:print:]] class consists of [[:graph:] ] -- just space (octal 040), not any other whitespace.
You can iterate over a list of files with a for loop:
for file in *.csv; do
do_something_with "$file"
done
Note that the variable "$file" is quoted.
The glob pattern can return files that contain spaces in the name.
We need to quote the variable so that the do_something_with command receives exactly one argument, the file name.
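To see the difference the quotes make, here is a small sketch that loops over two files in a scratch directory. The count_args function is a hypothetical stand-in for do_something_with that only reports how many arguments it received, and the sketch assumes the path returned by mktemp contains no whitespace.

```bash
# Hypothetical scratch directory; assumes its path contains no whitespace.
tmpdir=$(mktemp -d)
touch "$tmpdir/annual report.csv" "$tmpdir/data.csv"

# Hypothetical stand-in for do_something_with: reports its argument count.
count_args() { echo "$#"; }

results=()
for file in "$tmpdir"/*.csv; do
    quoted=$(count_args "$file")    # one argument, whatever the name contains
    unquoted=$(count_args $file)    # word splitting breaks the name apart
    results+=("${file##*/}: quoted=$quoted unquoted=$unquoted")
done
printf '%s\n' "${results[@]}"
rm -r "$tmpdir"
```

Under those assumptions, the file with a space in its name is passed as two arguments when the variable is unquoted, while the quoted form always passes exactly one.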
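Glob patterns are not regular expressions.
The regular expression . (any one character) corresponds to the glob ?
The regular expression .* (zero or more characters) corresponds to the glob *
    As a glob, .* means "match a filename where the first character is a literal dot followed by zero or more of any characters".
The regular expression .+ (one or more characters) corresponds to the glob ?*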
Bash provides extended patterns that are closer to regular expressions.
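Bash uses the same wildcard syntax when it compares a string to a pattern inside [[ ... == ... ]], which makes it possible to try out the bracket expressions and the glob/regex differences without creating any files. The matches function below is a hypothetical helper written for this sketch. Note that a leading dot is only special during filename expansion, not in string matching.

```bash
# Hypothetical helper: succeeds when the whole string matches the glob pattern.
# The right-hand side is left unquoted so that it is treated as a pattern.
matches() { [[ $1 == $2 ]]; }

matches "f" '[0-9A-Fa-f]'   && echo "f is a hex digit"
matches "g" '[0-9A-Fa-f]'   || echo "g is not a hex digit"
matches "]" '[][]'          && echo "] is in the set [][]"
matches "-" '[xy-]'         && echo "- is in the set [xy-]"
matches "_" '[[:punct:]]'   && echo "_ counts as punctuation"
matches ".bashrc" '.*'      && echo "the glob .* matches .bashrc"
matches "bashrc" '.*'       || echo "the glob .* does not match bashrc"
matches "" '?*'             || echo "the glob ?* needs at least one character"
```

The patterns are passed in single quotes so that the shell does not try to expand them as filenames before the function sees them.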
There are times when you want to suppress filename expansion. Handling user input is one such time. Consider this code snippet:
read -p "Enter something: " user_input
echo $user_input
If the user enters an asterisk (*), what will be output?
The list of files in the current directory.
That's probably not what you want to do.
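The effect can be reproduced without an interactive prompt. In this sketch the asterisk is assigned directly to user_input, standing in for what the user typed, and the commands run inside a hypothetical scratch directory that holds two files.

```bash
# Hypothetical scratch directory holding two known files.
tmpdir=$(mktemp -d)
touch "$tmpdir/one.txt" "$tmpdir/two.txt"
cd "$tmpdir" || exit 1

user_input='*'                 # stands in for what the user typed at the prompt
unquoted=$(echo $user_input)   # the shell expands * before echo runs
quoted=$(echo "$user_input")   # echo receives the asterisk itself
echo "unquoted: $unquoted"
echo "quoted:   $quoted"

cd - >/dev/null || exit 1
rm -r "$tmpdir"
```

The unquoted form prints the two file names, while the quoted form prints the asterisk the user entered.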
Another example is to pass patterns as arguments to programs that expect patterns.
The tr program is often used for text manipulation. To lowercase a variable value, you can do this:
var="My Puppet"
echo "$var" | tr '[:upper:]' '[:lower:]'
outputs
my puppet
A common mistake is to forget to quote tr's arguments: tr [:upper:] [:lower:]
Why is that a problem?
Unquoted glob patterns will be expanded to the list of files that match.
Suppose the person using your script has files named p, u and w in the current directory:
touch p u w
var="My Puppet"
echo "$var" | tr [:upper:] [:lower:]
This results in an error!
The unquoted [:upper:] pattern matches any of :, u, p, e or r.
The unquoted [:lower:] pattern matches any of :, l, o, w, e or r.
Bash matches the files p and u for the first pattern, and the file w for the second pattern, and tr is invoked like
echo "$var" | tr p u w
And that's the wrong number of arguments for tr.
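One way to check what a command will actually receive is to put echo in front of it, since echo simply prints its arguments. The sketch below does that in a hypothetical scratch directory containing the same three files.

```bash
# Hypothetical scratch directory so the demo leaves no files behind.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch p u w

# echo stands in for running the command: it just prints its arguments.
unquoted=$(echo tr [:upper:] [:lower:])
quoted=$(echo tr '[:upper:]' '[:lower:]')
echo "unquoted: $unquoted"
echo "quoted:   $quoted"

cd - >/dev/null || exit 1
rm -r "$tmpdir"
```

With the quotes in place, the two character classes reach tr unchanged.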
You can use the set command to disable filename expansion.
echo * # prints a list of files
set -f # disable filename expansion
echo * # prints a literal asterisk
set +f # enable filename expansion
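With filename expansion switched off, an unquoted variable can be split into words without any risk of a glob character being replaced by file names. A minimal sketch, assuming $IFS still has its default value and using illustrative variable names:

```bash
input="a * b"
set -f                  # disable filename expansion
words=($input)          # unquoted on purpose: only word splitting happens
set +f                  # enable filename expansion again
printf '<%s>\n' "${words[@]}"
```

The array receives three elements, and the asterisk is kept as a literal character.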
There are several settings for the builtin shopt command that control how filename expansion operates.
A few interesting ones are:
shopt -s nocaseglob -- perform case insensitive matching
shopt -s extglob -- enable extended patterns
shopt -s nullglob -- if no files match, replace the pattern with nothing.
    The default behaviour is to leave the pattern in place as a literal string.
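The nullglob setting matters most in loops. The sketch below counts loop iterations over a pattern that matches nothing, first with the default settings and then with nullglob enabled. The scratch directory and variable names are hypothetical.

```bash
# Hypothetical empty scratch directory: no .csv files exist in it.
tmpdir=$(mktemp -d)

count=0
for file in "$tmpdir"/*.csv; do count=$((count + 1)); done
default_count=$count
echo "default:  $default_count iteration(s)"   # the literal pattern is looped over once

shopt -s nullglob
count=0
for file in "$tmpdir"/*.csv; do count=$((count + 1)); done
echo "nullglob: $count iteration(s)"           # the pattern expands to nothing
shopt -u nullglob

rmdir "$tmpdir"
```

Without nullglob, the loop body runs once with the unmatched pattern itself as the value of file, which is a common source of "no such file" errors.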