A str in Python is an immutable sequence of Unicode code points.
These could include letters, diacritical marks, positioning characters, numbers, currency symbols, emoji, punctuation, space and line break characters, and more.
Being immutable, a str object's value in memory doesn't change; methods that appear to modify a string return a new str object instead.
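A minimal sketch of what immutability means in practice. The variable names are illustrative only and are not part of the exercise:

```python
greeting = "hello"

# Methods such as upper() and replace() leave the original untouched
# and return a new str object.
shouted = greeting.upper()
swapped = greeting.replace("h", "j")

print(greeting)  # hello
print(shouted)   # HELLO
print(swapped)   # jello

# Assigning to an index is rejected, because a str cannot change in place.
assignment_error = None
try:
    greeting[0] = "j"
except TypeError as error:
    assignment_error = str(error)

print(assignment_error)
```

To "change" a string, rebind the name to the new value, for example greeting = greeting.upper().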
A str literal can be declared via single ' or double " quotes. The escape character \ is available as needed.
>>> single_quoted = 'These allow "double quoting" without "escape" characters.'
>>> double_quoted = "These allow embedded 'single quoting', so you don't have to use an 'escape' character."
>>> escapes = 'If needed, a \'backslash\' can be used as an escape character within a string when switching quote styles won\'t work.'
Multi-line strings are declared with ''' or """.
>>> triple_quoted = '''Three single quotes or "double quotes" in a row allow for multi-line string literals.
Line break characters, tabs and other whitespace are fully supported.
You\'ll most often encounter these as "doc strings" or "doc tests" written just below the first line of a function or class definition.
They\'re often used with auto documentation ✍ tools.
'''
Strings can be concatenated using the + operator.
This approach should be used sparingly: every + builds a new str, so long chains are neither very performant nor easy to maintain.
>>> language = "Ukrainian"
>>> number = "nine"
>>> word = "дев'ять"
>>> sentence = word + " " + "means" + " " + number + " in " + language + "."
>>> print(sentence)
дев'ять means nine in Ukrainian.
If a list, tuple, set or other collection of individual strings needs to be combined into a single str, <str>.join(<iterable>) is a better option:
# str.join() makes a new string from the iterable's elements.
>>> chickens = ["hen", "egg", "rooster"]
>>> ' '.join(chickens)
'hen egg rooster'
# Any string can be used as the joining element.
>>> ' :: '.join(chickens)
'hen :: egg :: rooster'
>>> ' 🌿 '.join(chickens)
'hen 🌿 egg 🌿 rooster'
Code points within a str can be referenced by 0-based index number from the left:
>>> creative = '창의적인'
>>> creative[0]
'창'
>>> creative[2]
'적'
>>> creative[3]
'인'
Indexing also works from the right, starting at index -1:
>>> creative = '창의적인'
>>> creative[-4]
'창'
>>> creative[-2]
'적'
>>> creative[-1]
'인'
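The relationship between the two indexing directions can be sketched as follows. The word used here is an illustrative stand-in, not taken from the exercise:

```python
word = "python"

# A negative index -n refers to the same position as len(word) - n.
last = word[-1]
same_last = word[len(word) - 1]
first = word[-len(word)]

print(last, same_last, first)  # n n p

# An index beyond either end of the string raises IndexError.
out_of_range = False
try:
    word[len(word)]
except IndexError:
    out_of_range = True

print(out_of_range)  # True
```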
There is no separate "character" or "rune" type in Python, so indexing a string produces a new str of length 1:
>>> website = "exercism"
>>> type(website[0])
<class 'str'>
>>> len(website[0])
1
>>> website[0] == website[0:1] == 'e'
True
Substrings can be selected via slice notation, using <str>[<start>:<stop>:<step>] to produce a new string.
Results exclude the stop index.
If no start is given, the starting index will be 0.
If no stop is given, the stop index will be the end of the string.
If no step is given, the step will be 1.
>>> moon_and_stars = '🌟🌟🌙🌟🌟⭐'
>>> sun_and_moon = '🌞🌙🌞🌙🌞🌙🌞🌙🌞'
>>> moon_and_stars[1:4]
'🌟🌙🌟'
>>> moon_and_stars[:3]
'🌟🌟🌙'
>>> moon_and_stars[3:]
'🌟🌟⭐'
>>> moon_and_stars[:-1]
'🌟🌟🌙🌟🌟'
>>> moon_and_stars[:-3]
'🌟🌟🌙'
>>> sun_and_moon[::2]
'🌞🌞🌞🌞🌞'
>>> sun_and_moon[:-2:2]
'🌞🌞🌞🌞'
>>> sun_and_moon[1:-1:2]
'🌙🌙🌙🌙'
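A few further slice behaviours, sketched with a plain ASCII string so the positions are easy to count. The variable names are illustrative; the negative step is standard slice behaviour but is not needed for the exercise:

```python
letters = "abcdefgh"

every_other = letters[::2]   # start and stop omitted, step of 2
backwards = letters[::-1]    # a negative step walks from right to left
clipped = letters[5:100]     # a stop past the end is clipped, not an error
whole_copy = letters[:]      # everything omitted selects the full string

print(every_other)  # aceg
print(backwards)    # hgfedcba
print(clipped)      # fgh
print(whole_copy)   # abcdefgh
```

Unlike indexing, slicing never raises IndexError for out-of-range positions; it returns whatever part of the string falls inside the requested range.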
Strings can also be broken into smaller strings via <str>.split(<separator>), which will return a list of substrings.
The list can then be further indexed or split, if needed.
Using <str>.split() without any arguments will split the string on whitespace.
>>> cat_ipsum = "Destroy house in 5 seconds mock the hooman."
>>> cat_ipsum.split()
['Destroy', 'house', 'in', '5', 'seconds', 'mock', 'the', 'hooman.']
>>> cat_ipsum.split()[-1]
'hooman.'
>>> cat_words = "feline, four-footed, ferocious, furry"
>>> cat_words.split(', ')
['feline', 'four-footed', 'ferocious', 'furry']
Separators for <str>.split() can be more than one character.
The whole string is used for split matching.
>>> colors = """red,
orange,
green,
purple,
yellow"""
>>> colors.split(',\n')
['red', 'orange', 'green', 'purple', 'yellow']
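The difference between splitting on whitespace and splitting on an explicit single space can be sketched like this. The sample text is made up for illustration:

```python
messy = "  mock   the hooman  "

# With no argument, runs of whitespace count as one separator and
# leading or trailing whitespace is ignored.
on_whitespace = messy.split()

# With an explicit " " separator, every single space splits,
# which produces empty strings between consecutive spaces.
on_spaces = messy.split(" ")

# join() reverses a split when given the same separator.
rejoined = " ".join(on_whitespace)

print(on_whitespace)  # ['mock', 'the', 'hooman']
print(on_spaces)      # ['', '', 'mock', '', '', 'the', 'hooman', '', '']
print(rejoined)       # mock the hooman
```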
Strings support all common sequence operations.
Individual code points can be iterated through in a loop via for item in <str>.
Indexes with items can be iterated through in a loop via for index, item in enumerate(<str>).
>>> exercise = 'လေ့ကျင့်'
# Note that there are more code points than perceived glyphs or characters
>>> for code_point in exercise:
...     print(code_point)
...
လ
ေ
့
က
ျ
င
်
့
# Using enumerate will give both the value and index position of each element.
>>> for index, code_point in enumerate(exercise):
...     print(index, ':', code_point)
...
0 : လ
1 : ေ
2 : ့
3 : က
4 : ျ
5 : င
6 : ်
7 : ့
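Besides iteration, a handful of the other common sequence operations can be sketched as follows. The names on the left are illustrative only:

```python
website = "exercism"

length = len(website)               # number of code points
has_fragment = "exerc" in website   # membership test for a substring
first_e = website.index("e")        # position of the first match
count_e = website.count("e")        # number of non-overlapping matches
repeated = "ab" * 3                 # repetition
smallest = min(website)             # smallest code point

print(length, has_fragment, first_e, count_e)  # 8 True 0 2
print(repeated, smallest)                      # ababab c
```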
You are helping your younger sister with her English vocabulary homework, which she is finding very tedious. Her class is learning to create new words by adding prefixes and suffixes. Given a set of words, the teacher is looking for correctly transformed words with correct spelling by adding the prefix to the beginning or the suffix to the ending.
The assignment has four activities, each with a set of text or words to work with.
One of the most common prefixes in English is un, meaning "not".
In this activity, your sister needs to make negative, or "not" words by adding un to them.
Implement the add_prefix_un(<word>) function that takes word as a parameter and returns a new un-prefixed word:
>>> add_prefix_un("happy")
'unhappy'
>>> add_prefix_un("manageable")
'unmanageable'
There are four more common prefixes that your sister's class is studying: en (meaning to 'put into' or 'cover with'), pre (meaning 'before' or 'forward'), auto (meaning 'self' or 'same'), and inter (meaning 'between' or 'among').
In this exercise, the class is creating groups of vocabulary words using these prefixes, so they can be studied together. Each prefix comes in a list with common words it's used with. The students need to apply the prefix and produce a string that shows the prefix applied to all of the words.
Implement the make_word_groups(<vocab_words>) function that takes vocab_words as a parameter in the following form: [<prefix>, <word_1>, <word_2> .... <word_n>], and returns a string with the prefix applied to each word that looks like: '<prefix> :: <prefix><word_1> :: <prefix><word_2> :: <prefix><word_n>'.
>>> make_word_groups(['en', 'close', 'joy', 'lighten'])
'en :: enclose :: enjoy :: enlighten'
>>> make_word_groups(['pre', 'serve', 'dispose', 'position'])
'pre :: preserve :: predispose :: preposition'
>>> make_word_groups(['auto', 'didactic', 'graph', 'mate'])
'auto :: autodidactic :: autograph :: automate'
>>> make_word_groups(['inter', 'twine', 'connected', 'dependent'])
'inter :: intertwine :: interconnected :: interdependent'
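One possible way to approach this task, offered as a sketch rather than the reference solution. It assumes vocab_words always holds the prefix first, followed by at least one word:

```python
def make_word_groups(vocab_words):
    """Return the prefix and each prefixed word, separated by ' :: '."""
    # Unpack the first item as the prefix and collect the rest as words.
    prefix, *words = vocab_words
    # Build each new word by putting the prefix in front of it.
    prefixed = [prefix + word for word in words]
    # Join the bare prefix and the prefixed words into one string.
    return " :: ".join([prefix] + prefixed)


print(make_word_groups(["en", "close", "joy", "lighten"]))
# en :: enclose :: enjoy :: enlighten
```

Building a list first and calling join() once avoids a long chain of + operations.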
ness is a common suffix that means 'state of being'.
In this activity, your sister needs to find the original root word by removing the ness suffix.
But of course there are pesky spelling rules: if the root word originally ended in a consonant followed by a 'y', then the 'y' was changed to 'i'.
Removing 'ness' needs to restore the 'y' in those root words, e.g. happiness --> happi --> happy.
Implement the remove_suffix_ness(<word>) function that takes in a word and returns the root word without the ness suffix.
>>> remove_suffix_ness("heaviness")
'heavy'
>>> remove_suffix_ness("sadness")
'sad'
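A sketch of one way to apply the spelling rule, not the reference solution. It assumes every input really ends in ness, and that a root ending in 'i' always came from a 'y':

```python
def remove_suffix_ness(word):
    """Return the root of a word ending in 'ness', restoring a final 'y'."""
    # Slice off the last four code points, which are assumed to be 'ness'.
    root = word[:-4]
    # A root that now ends in 'i' had its 'y' changed when 'ness' was added.
    if root.endswith("i"):
        root = root[:-1] + "y"
    return root


print(remove_suffix_ness("heaviness"))  # heavy
print(remove_suffix_ness("sadness"))    # sad
```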
Suffixes are often used to change the part of speech a word is assigned to.
A common practice in English is "verbing" or "verbifying" -- where an adjective becomes a verb by adding an en suffix.
In this task, your sister is going to practice "verbing" words by extracting an adjective from a sentence and turning it into a verb. Fortunately, all the words that need to be transformed here are "regular" - they don't need spelling changes to add the suffix.
Implement the adjective_to_verb(<sentence>, <index>) function that takes two parameters: a sentence using the vocabulary word, and the index of the word once that sentence is split apart.
The function should return the extracted adjective as a verb.
>>> adjective_to_verb('I need to make that bright.', -1)
'brighten'
>>> adjective_to_verb('It got dark as the sun set.', 2)
'darken'
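A sketch of one possible approach, not the reference solution. It assumes the sentence is split on whitespace and that the only punctuation attached to a word is one of the characters stripped below:

```python
def adjective_to_verb(sentence, index):
    """Return the word at the given index of the sentence with 'en' added."""
    # Split on whitespace, then pick the word by its position.
    word = sentence.split()[index]
    # Remove punctuation that is still attached to the word.
    word = word.strip(".,!?")
    return word + "en"


print(adjective_to_verb("I need to make that bright.", -1))  # brighten
print(adjective_to_verb("It got dark as the sun set.", 2))   # darken
```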