Python pandas column assignment between dataframe and series does not work

Question

I have a df dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame({'b':[100,100,100], 'a':[1,2,3]})
# pd.np was removed in pandas 2.0, so use numpy directly
df['c'] = np.nan
df['d'] = np.nan
df['c'] = df['c'].astype(object)
df['d'] = df['d'].astype(object)

df is:

   a    b    c    d
0  1  100  NaN  NaN
1  2  100  NaN  NaN
2  3  100  NaN  NaN

I want to do a df.apply() with my function func(x) to set values for columns c and d.

func(x) is:

def func(x):
    return pd.Series({'d':{'foo':5, 'bar':10}, 'c':300})

df.apply() is:

df[['d', 'c']] = df.apply(lambda x: func(x), axis=1)

And the result is:

   a    b                      c    d
0  1  100  {'foo': 5, 'bar': 10}  300
1  2  100  {'foo': 5, 'bar': 10}  300
2  3  100  {'foo': 5, 'bar': 10}  300

My question is: why does column c get the value stored under index d in the returned Series? And how can I get the correct column assignment? My real function and apply() call are much more complicated, which is why I return a dictionary. So df[['c', 'd']] = df.apply(lambda x: func(x), axis=1) is not a solution to my real problem.

The desired result is:

   a    b    c                      d
0  1  100  300  {'foo': 5, 'bar': 10}
1  2  100  300  {'foo': 5, 'bar': 10}
2  3  100  300  {'foo': 5, 'bar': 10}

Thank you!

jezrael · Accepted Answer

What works for me is creating a new DataFrame df1 and then concatenating it with the original df:

def func(x):
    return pd.Series({'d':{'foo':5, 'bar':10}, 'c':300})

df1 = df.apply(lambda x: func(x), axis=1)
print (df1)
     c                      d
0  300  {'bar': 10, 'foo': 5}
1  300  {'bar': 10, 'foo': 5}
2  300  {'bar': 10, 'foo': 5}

print (pd.concat([df[['a','b']], df1], axis=1))
   a    b    c                      d
0  1  100  300  {'bar': 10, 'foo': 5}
1  2  100  300  {'bar': 10, 'foo': 5}
2  3  100  300  {'bar': 10, 'foo': 5}
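
As far as I can tell, the swap happens because assigning a DataFrame to a list of columns pairs the columns by position, not by label, and the column order of the apply() result depends on the pandas version (older versions sorted the dict keys, giving c, d). A minimal sketch of an alternative that assigns by label, so the order of the returned Series does not matter; the names res and cols are only illustrative and not part of the original code:

```python
import pandas as pd

df = pd.DataFrame({'b': [100, 100, 100], 'a': [1, 2, 3]})

def func(x):
    # Same return value as in the question
    return pd.Series({'d': {'foo': 5, 'bar': 10}, 'c': 300})

# res has one column per key of the returned Series
res = df.apply(func, axis=1)

# Assign each column by its label instead of relying on column position
cols = ['d', 'c']
for col in cols:
    df[col] = res[col]

print(df)
```

Looping over the labels keeps each target column tied to the source column of the same name, whatever order apply() returns them in.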
