Reputation: 85
I've seen this question a few times, but the answers are not working for me. I have two dataframes, split_df
and csv_df
that I;m trying to merge on a column that is called key
in each of them.
Here's the top 5 items from split_df
key tags
2785 30000 ipv6
2785 30000 networking
2785 30000 wireless
2785 30000 standards
2785 30000 satcom
Here's the top 5 items from the csv_df
key
0 30000
1 30002
2 31288
3 33630
4 31663
When I run this merge statement:
common_df = pd.merge(split_df,csv_df, on=['key'], suffixes=('_left', '_right'))
I get:
Empty DataFrame
Columns: [key, tags]
Index: []
Process finished with exit code 0
I can't figure out why the merge isn't working for me. Any suggestions?
I have indexes on both (and have run the merge with them out as well)
csv_df.set_index('key')
split_df.set_index('key')
I am very new to Pyton and Pandas
Thanks!
Upvotes: 3
Views: 111
Reputation: 294536
The key
column in one of your dataframes is int
while the other is str
split_df.key = split_df.key.astype(int)
csv_df.key = csv_df.key.astype(int)
pd.merge(split_df,csv_df, on=['key'], suffixes=('_left', '_right'))
Upvotes: 3
Reputation: 375915
From the above code this works for me:
In [11]: pd.merge(split_df,csv_df, on=['key'], suffixes=('_left', '_right'))
Out[11]:
key tags
0 30000 ipv6
1 30000 networking
2 30000 wireless
3 30000 standards
4 30000 satcom
My suspicion is that the column type key
is different in your dataframes (and the same int
in mine).
Most likely one of them is a string (and the other a float/int).
In [12]: split_df["key"] = split_df["key"].apply(str)
In [13]: pd.merge(split_df,csv_df, on=['key'], suffixes=('_left', '_right'))
Out[13]:
Empty DataFrame
Columns: [key, tags]
Index: []
If they're both strings:
In [14]: csv_df["key"] = csv_df["key"].apply(str)
In [15]: pd.merge(split_df,csv_df, on=['key'], suffixes=('_left', '_right'))
Out[15]:
key tags
0 30000 ipv6
1 30000 networking
2 30000 wireless
3 30000 standards
4 30000 satcom
Upvotes: 3