Reputation: 2046
I've made a dataframe from a dictionary to manipulate it.
dic_people = defaultdict(dict)
dic_people['A']['language'] = 'English'
dic_people['A']['nationality'] = 'Russia'
dic_people['A']['joined'] = 201010
dic_people['B']['language'] = 'French'
dic_people['B']['nationality'] = 'Canada'
dic_people['B']['joined'] = 201009
dic_people['C']['language'] = 'English'
dic_people['C']['nationality'] = 'Canada'
dic_people['C']['joined'] = 201008
dic_people['D']['language'] = 'French'
dic_people['D']['nationality'] = 'France'
dic_people['D']['joined'] = 201007
dic_people['E']['language'] = 'English'
dic_people['E']['nationality'] = 'Ireland'
dic_people['E']['joined'] = 201011
df = pd.DataFrame.from_dict(dic_people)
>>> df
A B C D E
joined 201010 201009 201008 201007 201011
language English French English French English
nationality Russia Canada Canada France Ireland
I want to select 2 people who 1) joined earliest and 2) speak English. So, the result will be
A C
joined 201010 201008
language English English
nationality Russia Canada
I'd like to know how to do this. I tried df[ df.loc['language'] == 'English']
, but it seems it's somewhat different when referring to rows, not columns.
Upvotes: 2
Views: 59
Reputation: 862406
Use DataFrame.loc
for filter by rows and for first 2 column select by positions by DataFrame.iloc
:
#if necessary sorting by 'joined' value in index
#df = df.sort_values('joined', axis=1)
df1 = df.loc[:, df.loc['language'] == 'English'].iloc[:, :2]
print (df1)
A C
language English English
nationality Russia Canada
joined 201010 201008
Upvotes: 6