Eduardo

Reputation: 2514

Get substring from pandas dataframe while filtering

Say I have a dataframe with the following information:

Name    Points          String
John        24     FTS8500001A
Richard     35     FTS6700001B
John        29     FTS2500001A
Richard     35     FTS3800001B
John        34     FTS4500001A

Here is one way to build a DataFrame from the sample above:

import pandas as pd

# Column labels, in the order the Series are concatenated below
keys = ('Name', 'Points', 'String')
names = pd.Series(('John', 'Richard', 'John', 'Richard', 'John'))
points = pd.Series((24, 35, 29, 35, 34))
strings = pd.Series(('FTS8500001A', 'FTS6700001B', 'FTS2500001A', 'FTS3800001B', 'FTS4500001A'))
# axis=1 places each Series side by side as a column
df = pd.concat((names, points, strings), axis=1, keys=keys)

I want to select every row that meets the following criteria: Name=Richard and Points=35. For those rows, I want to read the 4th and 5th characters of the String column (the two digits just after FTS).

The output I want is the numbers 67 and 38.

I’ve tried several ways to achieve this, but without success. Can you please help?

Thank you very much.
Eduardo

Upvotes: 7

Views: 25384

Answers (2)

firelynx

Reputation: 32234

Pandas string methods

You can build a boolean mask from your criteria and then apply the pandas string methods to the filtered String column:

mask_richard = df.Name == 'Richard'
mask_points = df.Points == 35
df[mask_richard & mask_points].String.str[3:5]

1    67
3    38
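
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample data with a plain dict instead of pd.concat (my choice, not the asker's), and the names mask and codes are only illustrative. Note that the slice is zero-based, so [3:5] picks positions 3 and 4, which are the 4th and 5th characters.

```python
import pandas as pd

# Sample data from the question, built from a dict for brevity
df = pd.DataFrame({
    'Name': ['John', 'Richard', 'John', 'Richard', 'John'],
    'Points': [24, 35, 29, 35, 34],
    'String': ['FTS8500001A', 'FTS6700001B', 'FTS2500001A',
               'FTS3800001B', 'FTS4500001A'],
})

# Combine both conditions; each comparison needs its own parentheses
# because & binds more tightly than ==
mask = (df['Name'] == 'Richard') & (df['Points'] == 35)

# Zero-based slice: positions 3 and 4 are the 4th and 5th characters
codes = df.loc[mask, 'String'].str[3:5]

print(codes.tolist())  # ['67', '38']
```

The result keeps the original row labels (1 and 3), so it can be aligned back to df if needed.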

Upvotes: 7

EdChum

Reputation: 394399

Use a boolean mask to filter your df and then call str and slice the string:

In [77]:
df.loc[(df['Name'] == 'Richard') & (df['Points']==35),'String'].str[3:5]

Out[77]:
1    67
3    38
Name: String, dtype: object
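
Note that the sliced values are still strings. If you need actual numbers, as the question's wording suggests, one possible follow-up is to cast the result. The sketch below assumes every selected value has digits at those two positions (astype(int) raises otherwise); the variable names selected and numbers are just for illustration, and .str.slice(3, 5) is the method form of .str[3:5].

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'Name': ['John', 'Richard', 'John', 'Richard', 'John'],
    'Points': [24, 35, 29, 35, 34],
    'String': ['FTS8500001A', 'FTS6700001B', 'FTS2500001A',
               'FTS3800001B', 'FTS4500001A'],
})

# Filter the rows and pick the String column in one .loc call
selected = df.loc[(df['Name'] == 'Richard') & (df['Points'] == 35), 'String']

# Slice the two characters after 'FTS', then convert them to integers
numbers = selected.str.slice(3, 5).astype(int)

print(numbers.tolist())  # [67, 38]
```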

Upvotes: 11
