Amrith Krishna

Reputation: 2853

Obtain words in a cell with a vowel at a specific position in pandas

I have a pandas DataFrame with the columns affix, word, sense and meaning. I want to obtain all the entries in the column word whose fourth character from the end is a.

The following snippet provides me the answer

pd[(pd['affix'] == 'man') & (pd['word'].str[-4] == 'a' )  ]

The output is

        affix   word        sense                  meaning
9900    man     cameraman   who     # somebody who operates a [[movie]] [[camera]]...
9901    man     cameraman   who     # {{l|en|cameraman}} {{gloss|somebody who oper...

But if I want to obtain the entries whose fourth character from the end is a vowel, the following snippet does not work. How can I achieve this?

  pd[(pd['affix'] == 'man') & (pd['word'].str[-4] in ['a','e','i','o','u'] )  ]

The error shown is

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Upvotes: 1

Views: 145

Answers (2)

piRSquared

Reputation: 294488

You can match with str.match

pd[(pd['affix'] == 'man') & pd['word'].str.match('.*[aeiou].{3}$')]

'.*[aeiou].{3}$' is a regular expression that says to:

  • '.*' match any character, zero or more times
  • '[aeiou]' followed by a single character from the set between the brackets
  • '.{3}$' followed by exactly 3 characters of any kind, then the end of the string.
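To see the pattern in action, here is a minimal sketch on made-up data. The sample rows are hypothetical, and the DataFrame is named df here (the question calls it pd) so that it does not shadow the usual pandas import alias.

```python
import pandas as pd

# Hypothetical sample rows, not the asker's real data.
df = pd.DataFrame({
    "affix": ["man", "man", "man", "er"],
    "word": ["cameraman", "fireman", "postman", "baker"],
})

# str.match anchors the pattern at the start of each string, so the leading
# '.*' is what lets the vowel sit anywhere before the final three characters.
vowel_fourth_from_last = df["word"].str.match(r".*[aeiou].{3}$")

# Combine both element-wise conditions with & (not the Python keyword `and`).
result = df[(df["affix"] == "man") & vowel_fourth_from_last]
matched_words = result["word"].tolist()
print(matched_words)
```

With this data, "postman" drops out because its fourth character from the end is "t", and "baker" drops out because its affix is not "man".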

Upvotes: 3

jezrael

Reputation: 863156

I think you need isin:

pd[(pd['affix'] == 'man') & (pd['word'].str[-4].isin(['a','e','i','o','u']))]
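The original attempt fails because Python's in operator needs a single True or False, while comparing a Series to a value produces one boolean per row. isin performs the membership test element-wise instead. A minimal sketch on made-up data follows. The sample rows are hypothetical, and the DataFrame is named df here rather than pd to avoid shadowing the pandas import alias.

```python
import pandas as pd

# Hypothetical sample rows, not the asker's real data.
df = pd.DataFrame({
    "affix": ["man", "man", "man", "man"],
    "word": ["cameraman", "fireman", "postman", "man"],
})

vowels = ["a", "e", "i", "o", "u"]

# .str[-4] yields NaN for words shorter than four characters,
# and isin treats NaN as "not a vowel", so such rows are simply excluded.
fourth_from_last = df["word"].str[-4]
is_vowel = fourth_from_last.isin(vowels)

result = df[(df["affix"] == "man") & is_vowel]
matched_words = result["word"].tolist()
print(matched_words)
```

Compared with the regular expression approach, this variant handles short words without any special casing and keeps the vowel list as plain data that is easy to extend.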

Upvotes: 2
