Weird ffill behavior with duplicate column names

Question

I have a DataFrame as below:

import numpy as np
import pandas as pd
df = pd.DataFrame({'A': [np.nan, 1, 1, np.nan], 'B': [2, np.nan, 2, 2]}, index=[1, 1, 2, 2])
df.columns = ['A', 'A']  # both columns now share the same name

Now I want to forward-fill the values within each index group. First I tried:

df.groupby(level=0).ffill()

which raises the error:

> ValueError: Buffer has wrong number of dimensions (expected 1, got 2)

This looks like a bug, so I then tried apply, which returns the expected output:

df.groupby(level=0).apply(lambda x: x.ffill())
     A    A
1  NaN  2.0
1  1.0  2.0
2  1.0  2.0
2  1.0  2.0
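For comparison, the same fill can be sketched without apply by selecting each column by position, so the duplicate names never come into play. This is only a minimal sketch, not a confirmed fix for the error above: the helper name groupby_ffill_by_position is hypothetical, and it assumes all columns are numeric, since the values are stacked back into a single array.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [np.nan, 1, 1, np.nan], 'B': [2, np.nan, 2, 2]},
                  index=[1, 1, 2, 2])
df.columns = ['A', 'A']


def groupby_ffill_by_position(frame):
    """Forward-fill each column within index groups (hypothetical helper).

    Columns are picked with iloc, so duplicate column names are never
    used for lookup. Assumes numeric columns: column_stack converts
    everything to one common dtype.
    """
    filled = [
        frame.iloc[:, i].groupby(level=0).ffill().to_numpy()
        for i in range(frame.shape[1])
    ]
    # Rebuild the frame with the original (possibly duplicated) labels.
    return pd.DataFrame(np.column_stack(filled),
                        index=frame.index,
                        columns=frame.columns)


result = groupby_ffill_by_position(df)
print(result)
```

Each column is filled as a Series grouped on its own index, which sidesteps the DataFrame-level groupby ffill that raised the ValueError.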

For reference, when the column names are unique, it runs without error; however, it creates an extra column holding the index values, and that column's name is NaN (see question 2 below):

df.columns=['C','D']
df.groupby(level=0).ffill()
   NaN    C    D
1    1  NaN  2.0
1    1  1.0  2.0
2    2  1.0  2.0
2    2  1.0  2.0

Questions:
1. Is this a bug? Why does apply still work in this situation?

2. Why does groupby on the index followed by ffill create the additional column?
