user2242044

Reputation: 9243

Dataframe Comprehension in Pandas Python to create new Dataframe

I am pretty new to Pandas, but I would like to create one dataframe from another based on the condition that the name is Mel. It looks like my new dataframe is only a pointer to the old one (based on the index number that is printed out).

I'm essentially looking for the equivalent of this:

import pandas as pd

BabyDataSet = [['Bob', 968], ['Jessica', 155], ['Mary', 77], ['John', 578], ['Mel', 973]]
filtered_list = [x for x in BabyDataSet if x[0] == 'Mel']
print(filtered_list)
df = pd.DataFrame(data=filtered_list, columns=['Names', 'Births'])
print(df)

My code:

import pandas as pd

BabyDataSet = [['Bob', 968], ['Jessica', 155], ['Mary', 77], ['John', 578], ['Mel', 973]]
#create dataframe
df = pd.DataFrame(data=BabyDataSet, columns=['Names', 'Births'])

#create a new dataframe for Mel
new_df = df.loc[['Mel' in x for x in df['Names']]]
print(new_df)

Upvotes: 4

Views: 8087

Answers (1)

EdChum

Reputation: 394189

There is no need to iterate over the df; just pass a boolean condition to filter it:

In [216]:
new_df = df[df['Names']=='Mel']
new_df

Out[216]:
  Names  Births
4   Mel     973
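As a rough sketch of why this works: the comparison df['Names'] == 'Mel' produces a boolean Series with one value per row, and indexing df with it keeps only the rows marked True. The example below rebuilds the df from the question; the name mask is only used here for illustration.

```python
import pandas as pd

BabyDataSet = [['Bob', 968], ['Jessica', 155], ['Mary', 77], ['John', 578], ['Mel', 973]]
df = pd.DataFrame(data=BabyDataSet, columns=['Names', 'Births'])

# Element-wise comparison: one True/False value per row
mask = df['Names'] == 'Mel'

# Boolean indexing keeps only the rows where the mask is True
new_df = df[mask]
print(new_df)
```

Note that == is an exact match, whereas the 'Mel' in x test in the question is a substring test. If substring matching is really what is wanted, df['Names'].str.contains('Mel') should be the closer equivalent.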

EDIT

To reset the index, call reset_index(). As for whether new_df is a reference to the original df: it is not. Modifying new_df leaves df unchanged:

In [224]:
new_df = df[df['Names']=='Mel']
new_df = new_df.reset_index()
new_df

Out[224]:
   index Names  Births
0      4   Mel     973
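If the old index is not needed as a column, passing drop=True to reset_index() should discard it instead of keeping it. A minimal sketch, again assuming the df from the question:

```python
import pandas as pd

BabyDataSet = [['Bob', 968], ['Jessica', 155], ['Mary', 77], ['John', 578], ['Mel', 973]]
df = pd.DataFrame(data=BabyDataSet, columns=['Names', 'Births'])

# drop=True discards the old index rather than adding it as an 'index' column
new_df = df[df['Names'] == 'Mel'].reset_index(drop=True)
print(new_df)
```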

In [225]:
new_df['Names'] = 'asdas'
df

Out[225]:
     Names  Births
0      Bob     968
1  Jessica     155
2     Mary      77
3     John     578
4      Mel     973
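To make the independence explicit, one option is to call .copy() on the filtered result. This is a sketch under the assumption that you intend to modify the new frame; depending on the pandas version, assigning to a filtered frame without an explicit copy may emit a SettingWithCopyWarning.

```python
import pandas as pd

BabyDataSet = [['Bob', 968], ['Jessica', 155], ['Mary', 77], ['John', 578], ['Mel', 973]]
df = pd.DataFrame(data=BabyDataSet, columns=['Names', 'Births'])

# .copy() returns an independent DataFrame with its own data
new_df = df[df['Names'] == 'Mel'].copy()

# Changing the copy does not touch the original
new_df['Names'] = 'asdas'
print(df)
print(new_df)
```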

Upvotes: 3
