Reputation: 395
I have a dataframe df1 like this:
topic_id topic
1 Sun
2 Cloud
3 Rain
I have an other dataframe df2 that looks like this:
article_id title topics
1 title_1 []
2 title_2 [Sun, Cloud]
3 title_3 [Rain]
My goal is to use df1 and df2 to create a dataframe df3.
I want to create a row in df3 everytime a topic in lists of topics column appears. And in this row I would like to have the article id of df2 when a topic appears and also the topic id of df1. If the list is empty nothing is created in df3.
I would like to have a new dataframe looking like this:
article_id topic_id topics
2 1 Sun
2 2 Cloud
3 3 Rain
How can I do this ?
Upvotes: 1
Views: 27
Reputation: 862511
Use DataFrame.explode
with default inner join in DataFrame.merge
, last change order of columns by list:
cols = ['article_id','topic_id','topics']
df = df2.explode('topics').merge(df1.rename(columns={'topic':'topics'}))[cols]
print (df)
article_id topic_id topics
0 2 1 Sun
1 2 2 Cloud
2 3 3 Rain
Upvotes: 2