Reputation: 9243
I am pretty new to Pandas but would like to create one dataframe from another based on the condition that the name is Mel. It looks like my new dataframe is only a pointer to the old one (based on the index number that is printed out).
I'm essentially looking for the equivalent of this:
BabyDataSet = [['Bob', 968], ['Jessica', 155], ['Mary', 77], ['John', 578], ['Mel', 973]]
filtered_list = [x for x in BabyDataSet if x[0] == 'Mel']
print filtered_list
df = pd.DataFrame(data=filtered_list, columns=['Names', 'Births'])
print df
My code:
import pandas as pd
BabyDataSet = [['Bob', 968], ['Jessica', 155], ['Mary', 77], ['John', 578], ['Mel', 973]]
#create dataframe
df = pd.DataFrame(data=BabyDataSet, columns=['Names', 'Births'])
#create a new dataframe for Mel
new_df = df.ix[['Mel' in x for x in df['Names']]]
print new_df
Upvotes: 4
Views: 8087
Reputation: 394189
No need to walk the df; just pass a boolean condition to filter it:
In [216]:
new_df = df[df['Names']=='Mel']
new_df
Out[216]:
  Names  Births
4   Mel     973
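To make the filtering step a bit more explicit, here is a minimal self-contained sketch. The names baby_data and mask are just illustrative choices, not something from the question. The comparison produces a boolean Series with one entry per row, and indexing df with it keeps only the rows where the entry is True:

```python
import pandas as pd

# Same data as in the question; `baby_data` is an illustrative name.
baby_data = [['Bob', 968], ['Jessica', 155], ['Mary', 77], ['John', 578], ['Mel', 973]]
df = pd.DataFrame(data=baby_data, columns=['Names', 'Births'])

# Element-wise comparison: one True/False per row.
mask = df['Names'] == 'Mel'

# Indexing with the boolean Series keeps only the rows where it is True.
# The original row label (4) is preserved in the result.
new_df = df[mask]

print(mask.tolist())
print(new_df)
```

Note that this is an exact match on the name, whereas the list comprehension in the question's pandas attempt used 'Mel' in x, which is a substring test.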
EDIT
To reset the index, call reset_index(). As to whether new_df is a reference to the original df: it is not. Modifying new_df leaves df unchanged:
In [224]:
new_df = df[df['Names']=='Mel']
new_df = new_df.reset_index()
new_df
Out[224]:
   index  Names  Births
0      4    Mel     973
In [225]:
new_df['Names'] = 'asdas'
df
Out[225]:
     Names  Births
0      Bob     968
1  Jessica     155
2     Mary      77
3     John     578
4      Mel     973
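As a possible variation on the above (a sketch, not the only way to do it): passing drop=True to reset_index discards the old index instead of keeping it as an extra index column, and calling .copy() on the filtered result makes the independence from df explicit, which also tends to avoid a SettingWithCopyWarning on later assignment in pandas versions that emit it. The name baby_data is again just illustrative.

```python
import pandas as pd

baby_data = [['Bob', 968], ['Jessica', 155], ['Mary', 77], ['John', 578], ['Mel', 973]]
df = pd.DataFrame(data=baby_data, columns=['Names', 'Births'])

# Filter, take an explicit copy, and renumber the rows from 0.
# drop=True throws away the old index rather than adding an 'index' column.
new_df = df[df['Names'] == 'Mel'].copy().reset_index(drop=True)

# Changing the new frame does not touch the original one.
new_df['Names'] = 'asdas'

print(new_df)
print(df)
```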
Upvotes: 3