Reputation: 5
Why do I get NaN values when I add the b column, but not when I add the a column? This is the code:
import pandas as pd

df = pd.DataFrame({'grps': list('aaabbcaabcccbbc'),
                   'vals': [12,345,3,1,45,14,4,52,54,23,235,21,57,3,87]})
# extract all rows where 'a' is present in the grps column
# for each such row, put its value into column 'a' of newdf
newdf = pd.DataFrame()
newdf['a'] = df[df["grps"] == 'a']['vals']
#print(df[df["grps"] == 'b']['vals'])
newdf['b'] = df[df["grps"] == 'b']['vals']
print(newdf)
This is the output:
     a   b
0   12 NaN
1  345 NaN
2    3 NaN
6    4 NaN
7   52 NaN
Upvotes: 0
Views: 158
Reputation: 1394
Try the following:
import pandas as pd

df = pd.DataFrame({'grps': list('aaabbcaabcccbbc'),
                   'vals': [12,345,3,1,45,14,4,52,54,23,235,21,57,3,87]})
# extract all rows where 'a' is present in the grps column
# for each such row, put its value into column 'a' of newdf
newdf = pd.DataFrame()
# .values drops the original row labels, so the assignment is positional
newdf['a'] = df[df["grps"] == 'a']['vals'].values
#print(df[df["grps"] == 'b']['vals'])
newdf['b'] = df[df["grps"] == 'b']['vals'].values
print(newdf)
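The problem is that df[df["grps"] == 'b']['vals'] is a pd.Series, which carries both an index array and a value array. newdf already has an index, [0, 1, 2, 6, 7], which it got from the previous line, newdf['a'] = df[df["grps"] == 'a']['vals']. When you assign a second Series, pandas aligns it to that existing index by label. The 'b' rows have the labels [3, 4, 8, 12, 13], so none of them match and every entry in the column becomes NaN.
By adding the .values accessor you assign only the values array, so no alignment takes place. The first assignment creates a default index, which is now just [0, 1, 2, 3, 4], and the second fills the column by position. This works here because both groups contain five values.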
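To see the label mismatch directly, here is a minimal sketch using the same data. The names a_vals, b_vals and aligned are only illustrative, and reset_index(drop=True) is used as an alternative to .values that is not part of the code above. It assumes both groups have the same number of rows, which is true for this data.

```python
import pandas as pd

df = pd.DataFrame({
    'grps': list('aaabbcaabcccbbc'),
    'vals': [12, 345, 3, 1, 45, 14, 4, 52, 54, 23, 235, 21, 57, 3, 87],
})

# Each selection keeps the row labels it had in df.
a_vals = df.loc[df['grps'] == 'a', 'vals']
b_vals = df.loc[df['grps'] == 'b', 'vals']
a_labels = list(a_vals.index)
b_labels = list(b_vals.index)
print(a_labels, b_labels)

# Assigning a Series aligns on labels: the frame takes a_vals' index,
# and b_vals shares no label with it, so column 'b' is entirely NaN.
aligned = pd.DataFrame()
aligned['a'] = a_vals
aligned['b'] = b_vals

# Dropping the old labels first makes both Series line up by position.
newdf = pd.DataFrame({
    'a': a_vals.reset_index(drop=True),
    'b': b_vals.reset_index(drop=True),
})
print(newdf)
```

If the groups had different sizes, assigning .values would raise a length error, whereas the reset_index version would pad the shorter column with NaN.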
Upvotes: 1