RaduIoan

Reputation: 5

Instead of appending values, pandas appends a column of NaNs. Why?

Why do I get NaN values in the b column but not in the a column? This is the code:

import pandas as pd

df = pd.DataFrame({'grps': list('aaabbcaabcccbbc'),
                   'vals': [12,345,3,1,45,14,4,52,54,23,235,21,57,3,87]})

#extract all rows where a is present in the grps column

#for each a in a row, create an entry in a column (index 'a') in newdf from corresponding value on row 

newdf = pd.DataFrame()
newdf['a'] = df[df["grps"] == 'a']['vals']
#print(df[df["grps"] == 'b']['vals'])
newdf['b']=df[df["grps"] == 'b']['vals']

print(newdf)

This is the output:

     a   b
0   12 NaN
1  345 NaN
2    3 NaN
6    4 NaN
7   52 NaN

Upvotes: 0

Views: 158

Answers (1)

JacoSolari

Reputation: 1394

Try the following:

import pandas as pd

df = pd.DataFrame({'grps': list('aaabbcaabcccbbc'),
                   'vals': [12,345,3,1,45,14,4,52,54,23,235,21,57,3,87]})

#extract all rows where a is present in the grps column

#for each a in a row, create an entry in a column (index 'a') in newdf from corresponding value on row 

newdf = pd.DataFrame()
newdf['a'] = df[df["grps"] == 'a']['vals'].values
#print(df[df["grps"] == 'b']['vals'])
newdf['b']=df[df["grps"] == 'b']['vals'].values

print(newdf)

The problem is that df[df["grps"] == 'b']['vals'] is a pd.Series, which carries an index as well as values. newdf already has an index, taken from the previous line newdf['a'] = df[df["grps"] == 'a']['vals']: the original row labels of the 'a' rows, 0, 1, 2, 6, 7. When you assign a Series to a column, pandas aligns it on index labels. The 'b' rows have labels 3, 4, 8, 12, 13, none of which exist in newdf, so every row of the new column is filled with NaN.
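The labels involved can be checked directly. The following is a minimal sketch that reuses the df from the question and prints the index labels of each selection and how many labels they share; the names a_vals, b_vals and shared are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'grps': list('aaabbcaabcccbbc'),
                   'vals': [12,345,3,1,45,14,4,52,54,23,235,21,57,3,87]})

# Each selection keeps the row labels it had in df
a_vals = df.loc[df['grps'] == 'a', 'vals']
b_vals = df.loc[df['grps'] == 'b', 'vals']

a_labels = a_vals.index.tolist()
b_labels = b_vals.index.tolist()

# Labels present in both selections; column assignment aligns on these
shared = a_vals.index.intersection(b_vals.index)

print(a_labels)     # [0, 1, 2, 6, 7]
print(b_labels)     # [3, 4, 8, 12, 13]
print(len(shared))  # 0
```

Since the two selections have no label in common, there is nothing for pandas to line up when the 'b' Series is assigned to newdf.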

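By adding .values you assign only the underlying NumPy array of values, so no index alignment takes place. The first assignment gives newdf a default index, [0,1,2,3,4], and the second fills column b by position. Note that this works here because both selections have five rows; assigning an array whose length differs from the existing index raises a ValueError.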
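If the groups might have different sizes, one alternative (not part of the code above, just a sketch) is to drop the original labels with reset_index(drop=True) and let pd.concat line the columns up by position; a shorter column would then be padded with NaN instead of raising an error. The name parts is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'grps': list('aaabbcaabcccbbc'),
                   'vals': [12,345,3,1,45,14,4,52,54,23,235,21,57,3,87]})

# One Series per group, each renumbered 0..n-1 so they align by position
parts = {g: df.loc[df['grps'] == g, 'vals'].reset_index(drop=True)
         for g in ['a', 'b']}

# The dict keys become the column names
newdf = pd.concat(parts, axis=1)

print(newdf)
```

With this data both groups have five rows, so the result has columns a and b on the index 0 to 4 with no NaN.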

Upvotes: 1
