MJP

Reputation: 1597

getting a particular value in pandas data frame

I have a data frame named df:

       season   seed    team
1609    2010    W01     1246
1610    2010    W02     1452
1611    2010    W03     1307
1612    2010    W04     1458
1613    2010    W05     1396

I need to create a new data frame in the following format:

team  frequency
1246    01 
1452    02 
1307    03
1458    04
1396    05

The frequency value comes from the seed column of df, with the leading letter removed:

W01 -> 01
W02 -> 02
W03 -> 03

How do I do this in pandas?

Upvotes: 2

Views: 169

Answers (2)

Liam Foley

Reputation: 7832

The solution below uses a lambda function to apply a regex to remove non-digit characters.

http://pythex.org/?regex=%5CD&test_string=L16a&ignorecase=0&multiline=0&dotall=0&verbose=0

import pandas as pd
import re

index=[1609,1610,1611,1612,1613,1700]
data = {'season':[2010,2010,2010,2010,2010,2010],
        'seed':['W01','W02','W03','W04','W05','L16a'],
        'team':[1246,1452,1307,1458,1396,0000]}

df = pd.DataFrame(data,index=index)

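# Raw string avoids the invalid-escape warning for \D; note that int() drops leading zeros (W01 -> 1)
df['frequency'] = df['seed'].apply(lambda x: int(re.sub(r'\D', '', x)))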
df2 = df[['team','frequency']].set_index('team')
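If the leading zeros matter (the question shows 01, 02, ...), one possible variant is to skip int() and use the vectorized string method instead of a lambda. This is only a sketch under that assumption; the names freq and result are made up for illustration, and the regex=True argument assumes a reasonably recent pandas.

```python
import pandas as pd

# Same sample data as above, including the 'L16a' seed with a trailing letter
df = pd.DataFrame(
    {
        "season": [2010] * 6,
        "seed": ["W01", "W02", "W03", "W04", "W05", "L16a"],
        "team": [1246, 1452, 1307, 1458, 1396, 0],
    },
    index=[1609, 1610, 1611, 1612, 1613, 1700],
)

# Remove every non-digit character; the values stay strings, so "01" keeps its zero
freq = df["seed"].str.replace(r"\D", "", regex=True)

# Build the two-column frame from the question (hypothetical name: result)
result = pd.DataFrame({"team": df["team"].to_numpy(), "frequency": freq.to_numpy()})
print(result)
```

Whether frequency should be a string or an integer depends on how it is used later; keeping it as a string is the only way to preserve the 01 formatting in the column itself.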

Upvotes: 2

Alex

Reputation: 19124

import pandas as pd

# Set up your DataFrame
df = pd.DataFrame({'season': [2010]*5, 'seed': ['W0' + str(i) for i in range(1,6)], 'team': [1246, 1452, 1307, 1458, 1396]}, index=range(1609, 1614))
# Drop the first character of each seed and index the result by team
s = pd.Series(df['seed'].str[1:].values, index=df['team'], name='frequency')
print(s)

yields

team
1246    01
1452    02
1307    03
1458    04
1396    05
Name: frequency, dtype: object
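The result above is a Series indexed by team. If a data frame with team and frequency as ordinary columns is wanted, as the question asks, a minimal sketch is to call reset_index() on that Series. The name out is hypothetical, and this assumes every seed has exactly one leading letter, as in the sample data.

```python
import pandas as pd

# Rebuild the sample frame from the question
df = pd.DataFrame(
    {
        "season": [2010] * 5,
        "seed": ["W0" + str(i) for i in range(1, 6)],
        "team": [1246, 1452, 1307, 1458, 1396],
    },
    index=range(1609, 1614),
)

# Slice off the first character of each seed and index the values by team
s = pd.Series(df["seed"].str[1:].values, index=df["team"], name="frequency")

# Move the team index back into a column, giving a two-column DataFrame
out = s.reset_index()
print(out)
```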

Upvotes: 1
