Saraaa
Saraaa

Reputation: 43

AttributeError: 'Series' object has no attribute 'columns'

I have a dataframe that looks like this:

id    userid    int1    int2    string    string2   string3
 1     90       5067    1000     aaa        100       qqq

 1     90       6945    1000     bbb        101       qqq

 1     90       9596    1010     ccc        102       qqq

The desired output:

id    userid    int1                int2               string        string2        string3

1      90      [5067,6945,9596]    [1000,1000,1010]   [aaa,bbb,ccc]  [100,101,102]   qqq

I have tried to create a new dataframe for int1,int2,string,string2 and then merge them but this solution was very slow as I have a big dataset. A different solution is to use 'groupby':

df = df.groupby(['id','userid','string3']).[['int1'],['int2'],['string'],['string2']].apply(list).reset_index()

but this gives me this error:

AttributeError: 'Series' object has no attribute 'columns'

Any help is appreciated.

Upvotes: 2

Views: 45892

Answers (2)

Celius Stingher
Celius Stingher

Reputation: 18367

You are passing individual values (lists) within a list. You don't need to use doble brackets like you do when filtering a dataframe and expect a dataframe output. Hopefully this will work better for you:

df = df.groupby(['id','userid','string3'])['int1','int2','string','string2'].apply(list).reset_index()

Upvotes: 0

ansev
ansev

Reputation: 30920

Use GroupBy.agg with as_index=False + DataFrame.reindex to return the columns in the initial order:

new_df=( df.groupby(['id','userid','string3'],as_index=False)
           .agg(list)
           .reindex(columns=df.columns) )
print(new_df)

If you want you could select the columns:

cols=['int1','int2','string','string2']
new_df=( df.groupby(['id','userid','string3'],as_index=False)[cols]
           .agg(list)
           .reindex(columns=df.columns) )

Output

   id  userid                int1                int2           string  \
0   1      90  [5067, 6945, 9596]  [1000, 1000, 1010]  [aaa, bbb, ccc]   

           string2 string3  
0  [100, 101, 102]     qqq  

Upvotes: 3

Related Questions