def abbreviate(to_abbreviate):
    phrase = to_abbreviate.replace('-', ' ').replace('_', ' ').upper().split()
    acronym = ''

    for word in phrase:
        acronym += word[0]

    return acronym

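- This approach begins by using `str.replace()` to "scrub" the separator characters `-` and `_` from `to_abbreviate`, replacing each one with a space. Apostrophes inside a word (as in `Halley's`) are left in place, since only the first character of each word is used.
- The phrase is then upper-cased by calling `str.upper()`.
- Finally, the phrase is turned into a `list` of words by calling `str.split()`, which splits on any run of white space.
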
The three methods above are all chained together, with the output of one method serving as the input to the next method in the "chain".
This works because both `replace()` and `upper()` return strings, and both `upper()` and `split()` are string methods that operate on the string they are called on.
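
To make the chain concrete, the sketch below unrolls it into one call per line. The sample input and the intermediate variable names are assumptions chosen for illustration; they do not appear in the solution itself.

```python
# Sample input and intermediate names are hypothetical, for illustration only.
to_abbreviate = 'Complementary metal-oxide semiconductor'

no_hyphens = to_abbreviate.replace('-', ' ')     # str in, str out
no_underscores = no_hyphens.replace('_', ' ')    # str in, str out
upper_cased = no_underscores.upper()             # str in, str out
phrase = upper_cased.split()                     # str in, list of str out

print(phrase)  # ['COMPLEMENTARY', 'METAL', 'OXIDE', 'SEMICONDUCTOR']
```

Each step hands a string to the next one until `split()` produces the final `list`, which is why `split()` has to come last.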
However, if `split()` were called first, `replace()` and `upper()` would fail, since `split()` returns a `list` and a `list` has neither method.
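
A small sketch of that failure, assuming a sample phrase picked for illustration: calling a string method on the result of `split()` raises an `AttributeError`.

```python
# Sample phrase is hypothetical, for illustration only.
words = 'Portable Network Graphics'.split()   # a list of str, not a str

try:
    words.upper()                             # list has no upper() method
except AttributeError as error:
    message = str(error)
    print(message)
```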
After the phrase is cleaned and split into a word list, we declare an empty `acronym` string to hold our final acronym.
The `phrase` `list` is then looped over via `for word in phrase`.
The first letter of each word is selected via bracket notation (`word[0]`), and concatenated via `+=` to the `acronym` string.
When the loop is complete, `acronym` is returned from the function.

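
The loop can be traced step by step. In this sketch the word list and the `snapshots` list are assumptions added only to record how `acronym` grows on each iteration.

```python
# Hypothetical, already-cleaned word list used for illustration.
phrase = ['PORTABLE', 'NETWORK', 'GRAPHICS']
acronym = ''
snapshots = []   # not part of the solution; records each intermediate value

for word in phrase:
    acronym += word[0]          # append the first letter of the current word
    snapshots.append(acronym)

print(snapshots)  # ['P', 'PN', 'PNG']
```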
`re.findall()` or `re.finditer()` can also be used to "scrub" `to_abbreviate`.
These two functions from the `re` module will return a `list` or a lazy `iterator` of results, respectively.
As of this writing, both of these functions benchmark slower than using `str.replace()` for scrubbing.
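
For comparison, here is a rough sketch of what the regex variants might look like. The function names, the pattern `[a-zA-Z']+`, and the sample phrase are all assumptions made for this example, not a reference implementation. The `timeit` loop at the end is one way to check the relative speed on your own machine; timings will vary, so no results are shown.

```python
import re
import timeit


def abbreviate_replace(to_abbreviate):
    # The str.replace() approach described above, written with join().
    phrase = to_abbreviate.replace('-', ' ').replace('_', ' ').upper().split()
    return ''.join(word[0] for word in phrase)


def abbreviate_findall(to_abbreviate):
    # Assumed pattern: a word is a run of letters and apostrophes.
    words = re.findall(r"[a-zA-Z']+", to_abbreviate)   # returns a list of str
    return ''.join(word[0] for word in words).upper()


def abbreviate_finditer(to_abbreviate):
    # Same assumed pattern, but matches are produced lazily.
    matches = re.finditer(r"[a-zA-Z']+", to_abbreviate)
    return ''.join(match.group(0)[0] for match in matches).upper()


sample = 'Complementary metal-oxide semiconductor'   # hypothetical input

for function in (abbreviate_replace, abbreviate_findall, abbreviate_finditer):
    seconds = timeit.timeit(lambda: function(sample), number=10_000)
    print(f'{function.__name__}: {function(sample)} in {seconds:.4f}s')
```

Note that the assumed pattern treats a leading apostrophe as part of a word, so inputs with words that begin with `'` would need a different pattern.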