Reputation: 1777
Have two dataframes:
df1 =
Col Date Days
0 A 20180830 30
1 A 20180927 58
2 A 20181025 86
3 B 20180830 30
4 B 20180927 58
5 B 20181025 86
6 C 20180802 2
7 C 20180809 9
8 C 20180816 16
9 C 20180823 23
df2 =
Col Lot Pct
13 A 4000 16.19
184 B 600 7.51
206 C 250 5.00
...
How to make a single dataframe:
df =
Col Date Days Lot Pct
0 A 20180830 30 4000 16.19
1 A 20180927 58 4000 16.19
2 A 20181025 86 4000 16.19
3 B 20180830 30 600 7.51
df1.join(df2, on='Col') gives KeyError: 'Col' # Check for duplicates
Upvotes: 2
Views: 7622
Reputation: 146
You can instead use pd.merge
to join to data frames in Pandas.
Example below:
import pandas as pd
data_1 = {'Col': ['A', 'A', 'B', 'B', 'C', 'C'], 'Date': [20180830, 20180927, 20180830, 20180927, 20180809, 20180816], 'Days':[1,2,3,4,5,6]}
df1 = pd.DataFrame(data=data_1)
df1
df =
Col Date Days
0 A 20180830 1
1 A 20180927 2
2 B 20180830 3
3 B 20180927 4
4 C 20180809 5
5 C 20180816 6
Create df2
data_2 = {'Col': ['A', 'B', 'C'], 'Lot': [4000, 600, 250], 'PCT': [16.19, 7.51, 5.00]}
df2 = pd.DataFrame(data=data_2)
df2 =
Col Lot PCT
0 A 4000 16.19
1 B 600 7.51
2 C 250 5.00
You can join the above two dataframes using:
pd.merge(df1, df2, on="Col")
The output will be:
Col Date Days Lot PCT
0 A 20180830 1 4000 16.19
1 A 20180927 2 4000 16.19
2 B 20180830 3 600 7.51
3 B 20180927 4 600 7.51
4 C 20180809 5 250 5.00
5 C 20180816 6 250 5.00
Upvotes: 5