Asked by: suse
Reputation: 25
The data I'm working with (https://mushroom.mathematik.uni-marburg.de/files/PrimaryData/primary_data_edited.csv) has lists in some entries, and I'd like to expand these lists as a Cartesian product of their elements. The image below shows two columns from the head of the dataframe.
The output I'd like is something like:
cap-diameter | cap-shape
10 | x
10 | f
20 | x
20 | f
5 | p
5 | x
10 | p
10 | x
...
So I don't want the Cartesian product of all of the entries in each column, just the product of the lists within each row. I think DataFrame.explode() might be a good place to start, but I'm not sure how to accomplish this. Thanks in advance.
Upvotes: 0
Views: 107
Answered by: im_vutu
Reputation: 412
Explode each column in turn:
df = df.explode('cap-diameter')  # one row per cap-diameter value
df = df.explode('cap-shape')     # then one row per cap-shape value
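To see why consecutive explodes give the per-row product rather than the product over whole columns, here is a minimal sketch. The two sample rows are hypothetical, inferred from the desired output in the question, and the sketch assumes the cells already hold Python lists. If the CSV stores them as strings, they would need to be parsed into lists first.

```python
import pandas as pd

# Hypothetical sample rows inferred from the desired output in the question.
# Assumption: each cell is a real Python list, not a string such as "[10, 20]".
df = pd.DataFrame({
    "cap-diameter": [[10, 20], [5, 10]],
    "cap-shape": [["x", "f"], ["p", "x"]],
})

# The first explode copies each row's cap-shape list onto every new row,
# so the second explode produces the Cartesian product within each original row.
result = (
    df.explode("cap-diameter")
      .explode("cap-shape")
      .reset_index(drop=True)  # explode repeats the original row labels
)

pairs = list(zip(result["cap-diameter"], result["cap-shape"]))
print(result)
```

The `reset_index(drop=True)` step is optional. Without it, every exploded row keeps the index label of the row it came from, which can be handy for tracing rows back to the source data.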
Upvotes: 2