Reputation: 461
I have two Dataframes
like the following:
DataFrame 1
ID VALUE DATE
1 google.com 12/28/2015
2 yahoo.com 12/28/2015
3 cnn.com 12/28/2015
4 facebook.com 12/28/2105
DataFrame 2
ID COMMENT
1 Bad Stuff
2 Good Stuff
3 Werid Stuff
4 Crazy Stuff
And the desired result will be here
ID VALUE DATE COMMENT
1 google.com 12/28/2015 Bad Stuff
2 yahoo.com 12/28/2015 Good Stuff
3 cnn.com 12/28/2015 Weird Stuff
4 facebooo.com 12/28/2015 Crazy Stuff
Looks and seems so simple to me but I can't quite figure it out. I've read the Pandas documentation up and down and searched SOF for a while.
Here's what I have right now.
#Now, let's try to merge this data using Pandas
indicatorPanda = pd.read_csv('/tmp/i.csv')
commentPanda = pd.read_csv(open('/tmp/e.csv', 'rU'), encoding='utf-8')
mergedPanda = pd.concat([indicatorPanda, commentPanda], axis=1)
mergedPanda.to_csv('/tmp/output.csv', index=False, header=False)
I've come close but it looks like concating on Axis 1 still keeps that second 'ID' column at the end.
EDIT: So I'm close now, I am using this:
mergedPanda = indicatorPanda.merge(commentPanda, on='id', how='left')
But I am having an issue with duplicate rows with the same ID because of multiple lines in the Dataframe2
So now, here's what I want my end result to be:
DataFrame 1
ID VALUE DATE
1 google.com 12/28/2015
2 yahoo.com 12/28/2015
3 cnn.com 12/28/2015
4 facebook.com 12/28/2105
DataFrame 2
ID COMMENT
1 Bad Stuff
1 Much Worse Stuff
2 Good Stuff
3 Werid Stuff
4 Crazy Stuff
And the desired result will be here
ID VALUE DATE COMMENT
1 google.com 12/28/2015 Bad Stuff | Much Worse Stuff
2 yahoo.com 12/28/2015 Good Stuff
3 cnn.com 12/28/2015 Weird Stuff
4 facebook.com 12/28/2015 Crazy Stuff
Is there a way to just concat the comments onto each other with some sort of character? I know we are out of the realm of regular SQL and Panda.
Upvotes: 3
Views: 77
Reputation: 42875
It looks like your DataFrames
don't have ID
set as index, so .merge()
would work as (see docs):
df3 = dataframe1.merge(dataframe2, on='ID', how='left)
Your pd.concat()
merges on index
, so it should work fine (axis=1
means horizontal, as opposed to vertical combination) if you did df.set_index('ID', inplace=True)
for both DataFrame
before merging (see docs).
Upvotes: 2
Reputation: 66
I think Merge would work well for your case.
dataframe3 = pd.merge(dataframe1, dataframe2, left_index=True, right_index=True)
Upvotes: 2