Reputation: 53
name date
a [01-01,01-01,01-03]
b [02-01,03-03,03-03,03-05]
.. ..
.. ..
This is my dataframe.
The data had duplicated ids and dates, so I grouped by id:
import pandas as pd
df = pd.DataFrame(data)
# keep only the id and date columns
df = df[['uid', 'dt']]
# collect every date for each uid into a list (duplicates are still included)
df = df.groupby('uid', as_index=False).agg(lambda x: x.tolist())
My desired output looks like this:
name date
a [01-01,01-03]
b [02-01,03-03,03-05]
.. ..
.. ..
Upvotes: 0
Views: 65
Reputation: 4199
If you want to remove the duplicates and keep the remaining dates in their original order, see the example below:
import pandas as pd

df = pd.DataFrame({'name': ['a', 'b'], 'date': [['01-01', '01-01', '01-03'], ['02-01', '03-03', '03-03', '03-05']]})
print('before removing duplicates')
print(df)
print('after removing duplicates and sorting based on initial order')
# set() drops the duplicates; sorting by x.index restores the order of first appearance
df['date'] = df['date'].apply(lambda x: sorted(set(x), key=x.index))
print(df)
This results in:
before removing duplicates
date name
0 [01-01, 01-01, 01-03] a
1 [02-01, 03-03, 03-03, 03-05] b
after removing duplicates and sorting based on initial order
date name
0 [01-01, 01-03] a
1 [02-01, 03-03, 03-05] b
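Two other ways to get the same result, as a rough sketch rather than a definitive solution. The first assumes the raw data still has one row per (uid, dt) pair, as in the question, and drops the duplicate rows before grouping. The second assumes the lists are already built and uses dict.fromkeys, which keeps the first occurrence of each value in order (Python 3.7+). The sample values and the names raw, lists and result are made up to match the tables above.

```python
import pandas as pd

# Made-up rows shaped like the question's data: one row per (uid, dt) pair.
raw = pd.DataFrame({
    'uid': ['a', 'a', 'a', 'b', 'b', 'b', 'b'],
    'dt': ['01-01', '01-01', '01-03', '02-01', '03-03', '03-03', '03-05'],
})

# Option 1: drop repeated (uid, dt) rows first, then collect dates per uid.
result = (
    raw.drop_duplicates(subset=['uid', 'dt'])
       .groupby('uid', as_index=False)['dt']
       .agg(list)
)
print(result)

# Option 2: the lists already exist, so deduplicate each list in place.
lists = pd.DataFrame({
    'name': ['a', 'b'],
    'date': [['01-01', '01-01', '01-03'], ['02-01', '03-03', '03-03', '03-05']],
})
# dict.fromkeys keeps only the first occurrence of each date, in order.
lists['date'] = lists['date'].apply(lambda x: list(dict.fromkeys(x)))
print(lists)
```

Option 1 avoids building the duplicated lists in the first place. Option 2 is linear in the list length, whereas sorting with key=x.index rescans the list for every element, which only matters for long lists.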
Upvotes: 0