Reputation: 2853
I have a pandas dataframe with the columns affix, word, sense and meaning. I want to obtain all the entries in the column word whose fourth character from the end is 'a'.
The following snippet gives me the answer:
pd[(pd['affix'] == 'man') & (pd['word'].str[-4] == 'a' ) ]
The output is
     affix       word sense                                            meaning
9900   man  cameraman   who  # somebody who operates a [[movie]] [[camera]]...
9901   man  cameraman   who  # {{l|en|cameraman}} {{gloss|somebody who oper...
But if I want to obtain the entries whose fourth character from the end is a vowel, the following snippet does not work. Any help in achieving this would be appreciated.
pd[(pd['affix'] == 'man') & (pd['word'].str[-4] in ['a','e','i','o','u'] ) ]
The error shown is
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Upvotes: 1
Views: 145
Reputation: 294488
You can match with str.match
pd[(pd['affix'] == 'man') & pd['word'].str.match('.*[aeiou].{3}$')]
'.*[aeiou].{3}$' is a regular expression that says to:
'.*' match anything any number of times
'[aeiou]' followed by a single character from the list between brackets
'.{3}$' followed by any 3 characters then followed by the end of the string.
Upvotes: 3
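To try the regex filter on something small, here is a minimal sketch. The sample rows are made up for illustration (only the column names affix and word come from the question), and the frame is called df rather than pd so that it does not shadow the usual pandas import alias. It also shows one possible alternative that stays close to the original attempt: Series.isin tests membership element by element, whereas Python's in operator tries to reduce the whole Series to a single True/False, which is what raises the ValueError.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "affix": ["man", "man", "man", "ly"],
    "word": ["cameraman", "fireman", "postman", "really"],
})

is_man = df["affix"] == "man"

# Regex approach: a vowel, then exactly three more characters, then end of string.
by_regex = df[is_man & df["word"].str.match(r".*[aeiou].{3}$")]

# Alternative: take the 4th character from the end and test it elementwise.
by_isin = df[is_man & df["word"].str[-4].isin(list("aeiou"))]

print(by_regex["word"].tolist())
print(by_isin["word"].tolist())
```

Both filters should keep cameraman and fireman, and drop postman (its fourth character from the end is 't') and really (its affix is not man).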