Reputation: 155
I want to split the 2017-01-31 cell for Jobs, Steve. so that [SPGC-9456, 6.0] ends up on its own row.
What my code outputs now:
                       2017-01-31                               2017-02-01
Gates, Bill.           [[SPGC-14075, 0.5]]                      NaN
Jobs, Steve.           [[SPGC-14075, 3.5], [SPGC-9456, 6.0]]    NaN
White, John ANDERSON.  [[SPGC-14075, 1.75]]                     [[SPGC-9456, 1.75]]
What I want:
                       2017-01-31              2017-02-01
Gates, Bill.           [[SPGC-14075, 0.5]]     NaN
Jobs, Steve.           [[SPGC-14075, 3.5]      NaN
Jobs, Steve.           [SPGC-9456, 6.0]]       NaN
White, John ANDERSON.  [[SPGC-14075, 1.75]]    [[SPGC-9456, 1.75]]
Upvotes: 1
Views: 423
Reputation: 323226
I am not using your data; you can try the same approach with my sample data.
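import numpy as np
import pandas as pd

Temp = pd.DataFrame({'Index': ['str1', 'str2', 'str3'],
                     'va': [['x'], [['y'], ['z']], ['z']],
                     'va2': [np.nan, np.nan, ['YY']]}).set_index('Index')
# One output row per element of each list in 'va'.
# Series.iteritems() was removed in pandas 2.0; items() is the equivalent.
Temp_unnest = pd.DataFrame([[i, x]
                            for i, y in Temp['va'].apply(list).items()
                            for x in y], columns=list('IV'))
# Pull the other column back in by looking up the original index label.
Temp_unnest['va2'] = Temp_unnest.I.map(Temp.va2)
Temp_unnest.set_index('I', inplace=True)
Temp_unnest.columns = Temp.columns
Temp_unnest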
Out[121]:
      va   va2
I
str1  x    NaN
str2  [y]  NaN
str2  [z]  NaN
str3  z    [YY]
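The same comprehension pattern can be pointed at the frame from the question. This is only a sketch: the way df is built here is my reconstruction of the asker's data, and unnest_rows is a hypothetical helper name, not a pandas function. It assumes the index labels are unique and that the chosen column contains no NaN.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the frame shown in the question.
df = pd.DataFrame(
    {
        '2017-01-31': [[['SPGC-14075', 0.5]],
                       [['SPGC-14075', 3.5], ['SPGC-9456', 6.0]],
                       [['SPGC-14075', 1.75]]],
        '2017-02-01': [np.nan, np.nan, [['SPGC-9456', 1.75]]],
    },
    index=['Gates, Bill.', 'Jobs, Steve.', 'White, John ANDERSON.'],
)


def unnest_rows(frame, col):
    """Hypothetical helper: one output row per entry of each list in `col`."""
    # Pair every entry with the index label of the row it came from,
    # wrapping the entry in a list so cells keep the [[...]] shape.
    pairs = [(idx, [item])
             for idx, items in frame[col].items()
             for item in items]
    labels = [idx for idx, _ in pairs]
    # Selecting a label twice repeats that row (needs unique index labels).
    out = frame.loc[labels].copy()
    out[col] = [val for _, val in pairs]
    return out


result = unnest_rows(df, '2017-01-31')
print(result)
```

The other columns are copied as they are, so both Jobs, Steve. rows keep NaN under 2017-02-01.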
Upvotes: 1
Reputation: 294258
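col = '2017-01-31'
v = df[col].values.tolist()             # cell contents as plain Python lists
l = [len(x) for x in v]                 # number of entries in each cell
d = {col: [[x] for y in v for x in y]}  # one single-entry list per output row
# Repeat each index label once per entry, then overwrite the column
# with the flattened values.
df.reindex(df.index.repeat(l)).assign(**d)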
                       2017-01-31              2017-02-01
Gates, Bill.           [[SPGC-14075, 0.5]]     NaN
Jobs, Steve.           [[SPGC-14075, 3.5]]     NaN
Jobs, Steve.           [[SPGC-9456, 6.0]]      NaN
White, John ANDERSON.  [[SPGC-14075, 1.75]]    [[SPGC-9456, 1.75]]
Setup
import numpy as np
import pandas as pd

df = pd.DataFrame([
    [[['SPGC-14075', .5]], np.nan],
    [[['SPGC-14075', 3.5], ['SPGC-9456', 6.]], np.nan],
    [[['SPGC-14075', 1.75]], [['SPGC-9456', 1.75]]]
],
    'Gates, Bill.|Jobs, Steve.|White, John ANDERSON.'.split('|'),
    ['2017-01-31', '2017-02-01']
)
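On pandas 0.25 or newer, DataFrame.explode should give the same result without the manual repeat. A minimal sketch, assuming that version. The re-wrapping step is my own addition so the cells keep the [[...]] shape shown above, and it leaves any NaN cells untouched.

```python
import numpy as np
import pandas as pd

# Same data as the setup, written with keyword arguments.
df = pd.DataFrame(
    [
        [[['SPGC-14075', 0.5]], np.nan],
        [[['SPGC-14075', 3.5], ['SPGC-9456', 6.0]], np.nan],
        [[['SPGC-14075', 1.75]], [['SPGC-9456', 1.75]]],
    ],
    index=['Gates, Bill.', 'Jobs, Steve.', 'White, John ANDERSON.'],
    columns=['2017-01-31', '2017-02-01'],
)

col = '2017-01-31'

# explode() emits one row per entry of each list and repeats the index label.
exploded = df.explode(col)

# Each cell is now a bare pair such as ['SPGC-14075', 3.5]; wrap it in a
# list again, skipping anything that is not a list (for example NaN).
exploded[col] = [[pair] if isinstance(pair, list) else pair
                 for pair in exploded[col]]

print(exploded)
```

Unlike the len-based version, explode does not fail on NaN cells, so it could also be applied to the 2017-02-01 column.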
Upvotes: 2