Reputation: 25
I've got 3 csv files (june.csv,july.csv, august.csv)
July.csv is like:
company_name | number_of_employers | money_spend
comp1 | 20 | 1200
comp2 | 30 | 1000
August.csv is like:
company_name | number_of_employers | money_spend
comp1 | 25 | 1800
comp3 | 50 | 2000
I wanna combine these csvs to one with the structure like that:
| august |july
company_name | number_of_employers | money_spend | number_of_employers | money_spend
comp1 | 25 | 1800 |20 |1200
comp2 | 0 | 0 |30 |1000
comp3 | 50 | 2000 |0 |0
How can I do this?
Upvotes: 0
Views: 27
Reputation: 482
for the marge, you can do as following
July_df= pd.read_csv("july.csv")
August_df = pd.read_csv("august.csv")
merge(July_df, August_df, on='company_name', suffixes=['_july', '_august'])
not sure how to create the multi-index automatically
Upvotes: 0
Reputation: 195438
Try pd.read_csv
with .set_index()
. Then concatenate the dataframe using pd.concat
:
# adjust `sep=` parameter accordingly:
df_1 = pd.read_csv("july.csv", sep=r",", engine="python").set_index(
"company_name"
)
df_2 = pd.read_csv("august.csv", sep=r",", engine="python").set_index(
"company_name"
)
df_out = pd.concat({"july": df_1, "august": df_2}, axis=1).fillna(0)
print(df_out)
Prints:
july august
number_of_employers money_spend number_of_employers money_spend
company_name
comp1 20.0 1200.0 25.0 1800.0
comp2 30.0 1000.0 0.0 0.0
comp3 0.0 0.0 50.0 2000.0
Upvotes: 1