def abbreviate(to_abbreviate):
    phrase = to_abbreviate.replace('-', ' ').replace('_', ' ').upper().split()
    acronym = ''

    for word in phrase:
        acronym += word[0]

    return acronym

- This approach begins by using str.replace() to "scrub" to_abbreviate, replacing the non-letter characters - and _ with spaces so that they act as word separators.
- The phrase is then upper-cased by calling str.upper().
- Finally, the phrase is turned into a list of words by calling str.split(), which splits on any run of white space.

The three methods above are all chained together, with the output of one method serving as the input to the next method in the "chain".
This works because replace() and upper() each return a string, and upper() and split() are string methods that can be called on that returned string.
However, if split() were called first, replace() and upper() would fail with an AttributeError, since split() returns a list and a list has neither method.
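
To make the chaining order concrete, here is a minimal sketch that breaks the chain into separate steps. The sample text and the names step1 through step4 and error_name are illustrative assumptions, not part of the original solution.

```python
text = "Complementary metal-oxide semiconductor"

# Each call returns a new value, and the next method is called on that value.
step1 = text.replace('-', ' ')   # str
step2 = step1.replace('_', ' ')  # str
step3 = step2.upper()            # str
step4 = step3.split()            # list of str

print(step4)  # ['COMPLEMENTARY', 'METAL', 'OXIDE', 'SEMICONDUCTOR']

# Calling split() first produces a list, and a list has no upper() method.
error_name = None
try:
    text.split().upper()
except AttributeError as error:
    error_name = type(error).__name__

print(error_name)  # AttributeError
```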
After the phrase is cleaned and split into a word list, we declare an empty acronym string to hold our final acronym.
The phrase list is then looped over via for word in phrase.
The first letter of each word is selected via bracket notation (word[0]) and concatenated to the acronym string via +=.
When the loop is complete, acronym is returned from the function.
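
The loop can be traced with a small variation of the function. This is only a sketch: abbreviate_trace and its states list are hypothetical additions that record the acronym after each iteration.

```python
def abbreviate_trace(to_abbreviate):
    """Hypothetical helper: return the acronym as it looks after each loop pass."""
    phrase = to_abbreviate.replace('-', ' ').replace('_', ' ').upper().split()
    acronym = ''
    states = []

    for word in phrase:
        acronym += word[0]      # word[0] is the first character of the word
        states.append(acronym)  # record the partial acronym

    return states


print(abbreviate_trace("Portable Network Graphics"))  # ['P', 'PN', 'PNG']
```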
re.findall() or re.finditer() can also be used to "scrub" to_abbreviate.
These two functions from the re module return a list of matching strings or a lazy iterator of match objects, respectively.
As of this writing, both of these methods benchmark slower than using str.replace() for scrubbing.
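
For comparison, a regex-based version might look like the sketch below. The pattern (runs of ASCII letters and apostrophes) and the function names abbreviate_findall and abbreviate_finditer are assumptions made for illustration. Other patterns are possible, and this one ignores non-ASCII letters.

```python
import re

# Assumed pattern: a "word" is a run of ASCII letters and apostrophes.
WORD_PATTERN = r"[A-Za-z']+"


def abbreviate_findall(to_abbreviate):
    # re.findall() returns a list of the matching substrings.
    words = re.findall(WORD_PATTERN, to_abbreviate)
    return ''.join(word[0] for word in words).upper()


def abbreviate_finditer(to_abbreviate):
    # re.finditer() lazily yields match objects; group() gives the matched text.
    matches = re.finditer(WORD_PATTERN, to_abbreviate)
    return ''.join(match.group()[0] for match in matches).upper()


print(abbreviate_findall("Complementary metal-oxide semiconductor"))  # CMOS
print(abbreviate_finditer("The Road _Not_ Taken"))  # TRNT
```

Here the regex selects the words directly instead of replacing separators, so no call to str.split() is needed.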