Blue Moon

Reputation: 4651

pandas: how to eliminate rows with value ending with a specific character?

I have a pandas DataFrame as follows:

mail = DataFrame({'mail' : ['[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]']})

that looks like:

                    mail
0          [email protected]
1        [email protected]
2       [email protected]
3   [email protected]
4  [email protected]
5  [email protected]
6       [email protected]

What I want to do is filter out (eliminate) all rows in which the value in the column mail ends with '@gmail.com'.

Upvotes: 8

Views: 12142

Answers (2)

Alex Riley

Reputation: 176810

You can use str.endswith and negate the result of the boolean Series with ~:

mail[~mail['mail'].str.endswith('@gmail.com')]

Which produces:

                    mail
2       [email protected]
3   [email protected]
4  [email protected]
5  [email protected]
6       [email protected]

Pandas has many other vectorised string operations, all accessible through the .str accessor. Many of these will be instantly familiar from Python's own string methods, but they come with built-in handling of NaN values.
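To illustrate the NaN handling mentioned above, here is a minimal sketch. The addresses are hypothetical placeholders, not the asker's data. It assumes the column may contain missing values and uses the na parameter of str.endswith so that the mask stays purely boolean and can be negated safely:

```python
import numpy as np
import pandas as pd

# Hypothetical addresses, for illustration only; one entry is missing.
mail = pd.DataFrame(
    {'mail': ['alice@gmail.com', 'bob@example.org', np.nan, 'carol@example.com']}
)

# na=False treats a missing value as "does not end with the suffix",
# so the result contains only True/False and ~ works as expected.
is_gmail = mail['mail'].str.endswith('@gmail.com', na=False)

# Keep every row that is not a gmail address (the NaN row is kept too).
filtered = mail[~is_gmail]
print(filtered)
```

If rows with missing addresses should be dropped as well, passing na=True instead is one way to do it, since those rows are then treated as matches and removed by the negation.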

Upvotes: 14

musically_ut

Reputation: 34288

A column that holds strings exposes a .str accessor, through which you can call the standard str methods element-wise:

In [6]: mail['mail'].str.endswith('gmail.com')
Out[6]:
0     True
1     True
2    False
3    False
4    False
5    False
6    False
Name: mail, dtype: bool

Then you can filter the DataFrame using the negation (~) of this Series:

In [7]: mail[~mail['mail'].str.endswith('gmail.com')]
Out[7]:
                    mail
2       [email protected]
3   [email protected]
4  [email protected]
5  [email protected]
6       [email protected]

A similar accessor, .dt, exposes date/time properties of a column that contains datetime data.
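As a rough sketch of the .dt accessor, assuming a hypothetical events frame with a datetime column named when (neither name comes from the question), the same boolean-mask pattern applies:

```python
import pandas as pd

# Hypothetical data, for illustration only.
events = pd.DataFrame(
    {'when': pd.to_datetime(['2015-01-31', '2015-02-14', '2016-02-29'])}
)

# .dt.month extracts the month number from each timestamp;
# comparing it with 2 gives a boolean Series usable as a filter.
in_feb = events['when'].dt.month == 2
feb_events = events[in_feb]
print(feb_events)
```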

Upvotes: 2
