Reputation: 889
I'm trying to transform a Python string from its original form to its vowel/consonant combinations.
For example, 'Dog' becomes 'cvc' and 'Bike' becomes 'cvcv'.
In R I was able to employ the following method:
con_vowel <- gsub("[aeiouAEIOU]","V",df$col_name)
con_vowel <- gsub("[^V]","C",con_vowel)
df[["composition"]] <- con_vowel
This checks whether each character is a vowel and, if so, replaces it with 'V'. It then takes the resulting string and replaces anything that isn't 'V' with 'C', and finally places the result in a new dataframe column called 'composition'.
In Python I have written some code in an attempt to replicate this functionality, but it does not return the desired result. Please see below.
word = 'yoyo'
for i in word.lower():
    if i in "aeiou":
        word = i.replace(i ,'v')
    else: word = i.replace(i ,'c')
print(word)
The theory here is that each character would be evaluated and, if it isn't a vowel, then by deduction it must be a consonant. However the result I get is:
v
I understand why this is happening, but I am no clearer as to how to achieve my desired result.
Please note that I also need the resultant code to be applied to a dataframe column and create a new column from these results.
If you could explain the workings of your answer it would help me greatly.
Thanks in advance.
Upvotes: 9
Views: 1429
Reputation: 59579
There's a method for this: translate. It's efficient, and by default it passes through any characters that are not found in your translation table (like ' ').
You can use the string library to get all of the consonants if you want.
import pandas as pd
import string
df = pd.DataFrame(['Cat', 'DOG', 'bike', 'APPLE', 'foo bar'], columns=['words'])
vowels = 'aeiouAEIOU'
cons = ''.join(set(string.ascii_letters).difference(set(vowels)))
trans = str.maketrans(vowels+cons, 'v'*len(vowels)+'c'*len(cons))
df['translated'] = df['words'].str.translate(trans)
     words  translated
0      Cat         cvc
1      DOG         cvc
2     bike        cvcv
3    APPLE       vcccv
4  foo bar     cvv cvc
It's made for exactly this, so it's fast.
# Supporting code
import perfplot
import numpy as np  # needed for np.random.choice in the setup below
import pandas as pd
import string

def with_translate(s):
    vowels = 'aeiouAEIOU'
    cons = ''.join(set(string.ascii_letters).difference(set(vowels)))
    trans = str.maketrans(vowels+cons, 'v'*len(vowels)+'c'*len(cons))
    return s.str.translate(trans)

def with_replace(s):
    return s.replace({"[^aeiouAEIOU]":'c', '[aeiouAEIOU]':'v'}, regex=True)

perfplot.show(
    setup=lambda n: pd.Series(np.random.choice(['foo', 'bAR', 'foobar', 'APPLE', 'ThisIsABigWord'], n)),
    kernels=[
        lambda s: with_translate(s),
        lambda s: with_replace(s),
    ],
    labels=['Translate', 'Replace'],
    n_range=[2 ** k for k in range(19)],
    equality_check=None,
    xlabel='len(s)'
)
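For a single string outside pandas, the same idea can be sketched with the built-in str.translate. This is only an illustration under the same assumptions as above (ASCII letters only, 'y' treated as a consonant); the names VOWELS, CONSONANTS, TABLE and to_cv are made up for this example.

```python
import string

VOWELS = "aeiouAEIOU"
# Every ASCII letter that is not a vowel counts as a consonant here (including 'y').
CONSONANTS = "".join(ch for ch in string.ascii_letters if ch not in VOWELS)

# Map each vowel to 'v' and each consonant to 'c'.
# Characters missing from the table (spaces, digits, ...) pass through unchanged.
TABLE = str.maketrans(VOWELS + CONSONANTS, "v" * len(VOWELS) + "c" * len(CONSONANTS))

def to_cv(word):
    """Return the vowel/consonant shape of an ASCII word."""
    return word.translate(TABLE)

print(to_cv("Dog"))      # cvc
print(to_cv("foo bar"))  # cvv cvc
```

The table is built once and reused, which is why this approach stays cheap when applied to many strings.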
Upvotes: 5
Reputation: 14113
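Use the pandas .str.replace accessor with a regex to avoid the loop. Pass regex=True explicitly, because recent pandas versions treat the pattern as a literal string by default.
import pandas as pd
df = pd.DataFrame(['Cat', 'DOG', 'bike'], columns=['words'])
# lowercase first, then replace non-vowels with 'c' and vowels with 'v'
df['new_word'] = df['words'].str.lower().str.replace(r"[^aeiou]", 'c', regex=True).str.replace(r"[aeiou]", 'v', regex=True)
print(df)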
   words  new_word
0    Cat       cvc
1    DOG       cvc
2   bike      cvcv
Upvotes: 3
Reputation: 18316
vowels = set("aeiou")
word = "Dog"
new_word = ""
for char in word.lower():
    new_word += "v" if char in vowels else "c"
print(new_word)
Note that this uses a set for vowels for a faster membership test. Other than that, we traverse the lowercased version of the word and add the desired character (v or c) to the newly formed string via a ternary (conditional) expression.
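The loop above can also be wrapped in a function so it is reusable across many words. The following is a minimal sketch, not the answerer's code: the name cv_shape is hypothetical, and like the loop it maps every non-vowel (including spaces and digits) to 'c'.

```python
VOWELS = frozenset("aeiou")

def cv_shape(word):
    """Map each character of word to 'v' (vowel) or 'c' (anything else)."""
    # One pass over the lowercased word; join builds the result in a single step.
    return "".join("v" if ch in VOWELS else "c" for ch in word.lower())

words = ["Dog", "Bike", "yoyo"]
shapes = [cv_shape(w) for w in words]
print(shapes)  # ['cvc', 'cvcv', 'cvcv']
```

With pandas, a function like this could be applied to a column through Series.map, for example df["composition"] = df["col_name"].map(cv_shape), using the column names from the question.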
Upvotes: 1
Reputation: 8589
In Python strings are immutable.
Why?
There are several advantages.
One is performance: knowing that a string is immutable means we can allocate space for it at creation time, and the storage requirements are fixed and unchanging. This is also one of the reasons for the distinction between tuples and lists.
Another advantage is that strings in Python are considered as “elemental” as numbers. No amount of activity will change the value 8 to anything else, and in Python, no amount of activity will change the string “eight” to anything else.
In order to reduce confusion and potential errors, it is preferable to create a new string instead of changing the original. I have also added isalpha() to check whether we are dealing with a letter or a number/symbol and act accordingly.
Here's my code:
word = 'yoyo'

def vocals_consonants_transformation(word):
    modified_word = ""
    for char in word:
        if char.isalpha():
            # Compare in lowercase so that 'A' counts as a vowel too
            if char.lower() in "aeiou":
                modified_word += 'v'
            else:
                modified_word += 'c'
        else:
            # Keep digits, spaces and symbols unchanged
            modified_word += char
    return modified_word

print(vocals_consonants_transformation(word))
Output
cvcv
Source:
https://docs.python.org/3/faq/design.html#why-are-python-strings-immutable
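The immutability point above can be seen directly in a short sketch. The variable names here are purely illustrative: assigning to a position in a string fails, while replace leaves the original untouched and returns a new string.

```python
word = "eight"

try:
    word[0] = "E"  # strings do not support item assignment
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# replace() does not modify word; it returns a new string
replaced = word.replace("e", "E")
print(word)      # eight
print(replaced)  # Eight
```

This is why the result has to be accumulated in a new variable rather than written back into the original string character by character.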
Upvotes: 2
Reputation: 150785
You can use replace with regex=True:
import pandas as pd
words = pd.Series(['This', 'is', 'an', 'Example'])
# lowercase first, then map non-vowels to 'c' and vowels to 'v'
words.str.lower().replace({"[^aeiou]":'c', '[aeiou]':'v'}, regex=True)
Output:
0       ccvc
1         vc
2         vc
3    vcvcccv
dtype: object
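For readers coming from R's gsub, a plain-Python counterpart can be sketched with re.sub from the standard library. This is an assumption-laden illustration rather than part of the answer above: the name cv_pattern is hypothetical, only ASCII letters are converted, and a callback is used so both replacements happen in a single pass.

```python
import re

VOWELS = set("aeiou")

def cv_pattern(text):
    """Replace ASCII vowels with 'v' and ASCII consonants with 'c'."""
    # The callback decides per matched letter; non-letters are never matched,
    # so spaces, digits and punctuation are left as they are.
    return re.sub(
        r"[A-Za-z]",
        lambda m: "v" if m.group().lower() in VOWELS else "c",
        text,
    )

print(cv_pattern("Dog"))      # cvc
print(cv_pattern("foo bar"))  # cvv cvc
```

Doing it in one pass avoids any dependence on the order of the two substitutions.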
Upvotes: 4
Reputation: 23
You probably already realized this, but in your solution the for loop determines for each letter whether it is a vowel or not but does not save the result. This is why it only gives you the result of the last iteration (v, since 'o' is a vowel).
You can try creating a new, empty string and then add to it:
word = 'yoyo'
new_word = ''
for i in word.lower():
    if i in "aeiou":
        new_word += 'v'
    else:
        new_word += 'c'
print(new_word)
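To see why the loop in the question ends up printing only v, it can help to record what each iteration produces. This is a small tracing sketch; the history list is added purely for illustration.

```python
word = "yoyo"
history = []

for ch in word.lower():
    # ch is a single character, so ch.replace(...) can only ever yield 'v' or 'c'
    result = ch.replace(ch, "v") if ch in "aeiou" else ch.replace(ch, "c")
    history.append(result)

print(history)           # ['c', 'v', 'c', 'v']
print(history[-1])       # v  (all the question's loop keeps)
print("".join(history))  # cvcv
```

Each iteration computes the right character, but the question's code overwrites word every time, so only the last one survives.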
Upvotes: 1
Reputation: 3503
Try it like this:
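word = 'yoyo'
word = word.lower()
# Use uppercase placeholders so that already-replaced characters are never
# confused with the real letters 'v' and 'c' (e.g. in 'cave')
for i in word:
    if i in "aeiou":
        word = word.replace(i, 'V')
    else:
        word = word.replace(i, 'C')
print(word.lower())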
Upvotes: 1
Reputation: 1672
Try this:
word = 'yoyo'
word = list(word)
for i in range(len(word)):
    if word[i] in 'aeiou':
        word[i] = 'v'
    else:
        word[i] = 'c'
print(''.join(word))
Upvotes: 1