I'm trying to replace some accented letters in Portuguese words in Python using the re module.
accentedLetters = ['à', 'á', 'â', 'ã', 'é', 'ê', 'í', 'ó', 'ô', 'õ', 'ú', 'ü']
letters = ['a', 'a', 'a', 'a', 'e', 'e', 'i', 'o', 'o', 'o', 'u', 'u']
Each character in accentedLetters should be replaced by the letter at the same position in the letters list.
My expected results are, for example:
ação => açao
frações => fraçoes
To do this, I wrote the following function:
def removeAccents(word):
    accentedLetters = ['à', 'á', 'â', 'ã', 'é', 'ê', 'í', 'ó', 'ô', 'õ', 'ú', 'ü']
    letters = ['a', 'a', 'a', 'a', 'e', 'e', 'i', 'o', 'o', 'o', 'u', 'u']
    for i, al in enumerate(accentedLetters):
        rule = re.compile(str(al), re.UNICODE)
        word = rule.sub(letters[i], word)
    print word
    return word
What am I doing wrong? Is there a better way of doing this?
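For reference, here is a minimal sketch of one possible alternative that avoids re entirely. It assumes Python 3, where every str is Unicode, and it assumes the input uses precomposed characters (a single code point per accented letter). The name remove_accents and the constants ACCENTED, PLAIN and ACCENT_TABLE are hypothetical and only for illustration.

```python
# Sketch only: assumes Python 3 and precomposed (NFC) input text.
# The names below are illustrative, not part of the original function.
ACCENTED = "àáâãéêíóôõúü"
PLAIN = "aaaaeeiooouu"

# str.maketrans maps each character of the first string to the
# character at the same position in the second string.
ACCENT_TABLE = str.maketrans(ACCENTED, PLAIN)


def remove_accents(word):
    """Replace the listed accented vowels; leave everything else (e.g. 'ç') as is."""
    return word.translate(ACCENT_TABLE)


print(remove_accents("ação"))     # açao
print(remove_accents("frações"))  # fraçoes
```

Because the table only contains the twelve vowels listed above, 'ç' is left untouched, which matches the expected results. A generic approach based on Unicode decomposition would also turn 'ç' into 'c', so it would not fit this requirement without extra handling.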