Reputation: 831
I want to split the tag and author columns on whitespace and expand the pieces into new rows.
import pandas as pd

df = pd.DataFrame([
    {'name': 'book1', 'tag': 'a b c', 'author': 'a1 a2'},
], columns=['name', 'tag', 'author'])
print(df)
    name    tag author
0  book1  a b c  a1 a2
Expected:
[out]
    name tag author
0  book1   a     a1
1  book1   b     a2
2  book1   c    NaN
Upvotes: 1
Views: 566
Reputation: 294508
For those whose Python is recent enough to support splat (*) unpacking:
from itertools import zip_longest
import pandas as pd

pd.DataFrame(
    [n + m for *n, t, a in zip(*map(df.get, df))
     for *m, in zip_longest(*map(str.split, (t, a)))],
    columns=[*df]
)
    name tag author
0  book1   a     a1
1  book1   b     a2
2  book1   c   None
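
The comprehension above is dense, so here is a rough unrolled sketch of the same idea with plain loops. It assumes, as in the question, that tag and author are the last two columns; the names rows and expanded are only illustrative.

```python
from itertools import zip_longest
import pandas as pd

df = pd.DataFrame(
    [{'name': 'book1', 'tag': 'a b c', 'author': 'a1 a2'}],
    columns=['name', 'tag', 'author'],
)

rows = []
# itertuples(index=False, name=None) yields one plain tuple per row;
# the last two fields are assumed to be tag and author.
for *rest, tag, author in df.itertuples(index=False, name=None):
    # zip_longest pads the shorter token list with None
    for t, a in zip_longest(tag.split(), author.split()):
        rows.append([*rest, t, a])

expanded = pd.DataFrame(rows, columns=list(df.columns))
print(expanded)
```

The padding value is None here rather than NaN, which is why the output above shows None in the last row.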
Upvotes: 0
Reputation: 863291
Use DataFrame.set_index on the columns whose values should repeat, then reshape with DataFrame.stack, split with Series.str.split using expand=True to get a DataFrame, and finally reshape again with stack followed by unstack:
df1 = (df.set_index('name')
         .stack()
         .str.split(expand=True)
         .stack()
         .unstack(1)
         .reset_index(level=0)
         .reset_index(drop=True))
print(df1)
    name tag author
0  book1   a     a1
1  book1   b     a2
2  book1   c    NaN
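
To make the chain easier to follow, here is a sketch that breaks it into named intermediate steps on the sample frame from the question. The variable names (long_form, tokens, stacked, wide) are only illustrative, and the sketch assumes the name values are unique, since unstack cannot reshape duplicate index entries.

```python
import pandas as pd

df = pd.DataFrame(
    [{'name': 'book1', 'tag': 'a b c', 'author': 'a1 a2'}],
    columns=['name', 'tag', 'author'],
)

# Step 1: one long Series indexed by (name, original column name)
long_form = df.set_index('name').stack()

# Step 2: split on whitespace, one column per token position
tokens = long_form.str.split(expand=True)

# Step 3: move the token position into the index,
# giving a 3-level index of (name, column name, position)
stacked = tokens.stack()

# Step 4: pivot the original column names (level 1) back into columns
wide = stacked.unstack(1)

# Step 5: turn name back into a column and renumber the rows
df1 = wide.reset_index(level=0).reset_index(drop=True)
print(df1)
```

Printing each intermediate object is a quick way to see where the missing author value in the last row comes from: author has only two tokens, so position 2 has nothing to fill it.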
Another solution:
df1 = (df.set_index('name')
         .apply(lambda x: x.str.split(expand=True).stack())
         .reset_index(level=0)
         .reset_index(drop=True))
Upvotes: 2