Reputation: 327
I have a few csv files for my homework. I want to combine them as in the example below. But I didn't know how to do it.
Exp1.csv
"DATE","NOW","OPEN","HIGH","LOW","Hac.","VOL %"
"01.09.2019","23,78","25,54","25,54","23,78","-","-7,04%"
"25.08.2019","25,58","23,96","26,00","23,56","2,14M","4,07%"
Exp2.csv
"DATE","NOW","OPEN","HIGH","LOW","Hac.","VOL %"
"01.09.2019","4,16","4,15","4,23","4,12","-","0,73%"
"25.08.2019","4,13","4,05","4,19","4,03","6,48M","1,98%"
I want to merge 2 files like this. I just want to get the VOL% column.
"DATE","Exp1","Exp2"
"01.09.2019","-7,04%","0,73%"
"25.08.2019","4,07%","1,98%"
Thank you everyone:) I found a solution like this and applied it.
import glob
import os
import pandas.io
path =r'/Users/baris/Documents/Files/'
all_files = glob.glob(os.path.join(path, "*.csv"))
df_from_each_file = (pandas.read_csv(f) for f in all_files)
concatenated_df = pandas.concat(df_from_each_file, axis=1)
concatenated_df_clean = (concatenated_df.drop('DATE',1).drop('NOW',1).drop('OPEN',1).drop('HIGH.',1).drop('Low',1).drop('Hac.',1)
df_dates_file = pandas.read_csv('/Users/baris/Documents/Files/Exp1.csv')
df_date_export = concatenated_df.iloc[:, 0]
final_result = pandas.concat([df_date_export,concatenated_df_clean], axis=1)
print(final_result)
Upvotes: 0
Views: 461
Reputation: 129
Use the csv module.
https://docs.python.org/3/library/csv.html
read this tutorial:
https://realpython.com/python-csv/
something like this will do the work: (educational code)
import io
import csv
target = {}
file_one_string =\
""""DATE","NOW","OPEN","HIGH","LOW","Hac.","VOL %"
"01.09.2019","23,78","25,54","25,54","23,78","-","-7,04%"
"25.08.2019","25,58","23,96","26,00","23,56","2,14M","4,07%"
"""
file_two_string = \
""""DATE","NOW","OPEN","HIGH","LOW","Hac.","VOL %"
"01.09.2019","4,16","4,15","4,23","4,12","-","0,73%"
"25.08.2019","4,13","4,05","4,19","4,03","6,48M","1,98%"
"""
with io.StringIO(file_one_string) as file_one:
csv_reader = csv.DictReader(file_one,delimiter=',',quotechar='"')
for row in csv_reader:
if 'VOL %' in row:
target[row['DATE']] ={'Exp1': row['VOL %']}
with io.StringIO(file_two_string) as file_two:
csv_reader = csv.DictReader(file_two,dialect="excel")
for row in csv_reader:
if row['DATE'] in target:
target[row['DATE']]['Exp2'] = row['VOL %']
else:
print('Missing DATE {} in file_two'.format(row['DATE']))
lines2 = [row for row in csv_reader]
with io.StringIO() as output_file:
fieldnames = ['DATE', 'Exp1', 'Exp2']
csv_writer = csv.DictWriter(output_file, fieldnames=fieldnames)
csv_writer.writeheader()
for key, value in target.items():
csv_writer.writerow({
'DATE': key,
'Exp1': value['Exp1'],
'Exp2': value['Exp2']
})
print(output_file.getvalue())
Upvotes: -1
Reputation: 153
You can use the pandas package to read and save csv. However you cannot remove columns while merging the csv files, but you can save the columns that you need Have a look at my code below. Replace the csv filename and column name with yours.
import pandas as pd
# create list of files you want to merge
all_filenames = ['test.csv','test1.csv']
# use pandas concat function to merge csv's
combined_csv = pd.concat([pd.read_csv(f) for f in all_filenames ])
# export the csv
combined_csv.to_csv( "combined_csv.csv", index=False, encoding='utf-8-sig',columns=['test1'])
Upvotes: 1
Reputation: 1607
try something like this:
df = pd.read_csv('Exp1.csv')
df1 = pd.read_csv('Exp2.csv')
df['DATE'] = pd.to_datetime(df['DATE'])
df1['DATE'] = pd.to_datetime(df['DATE'])
final_df = pd.merge(df[['DATE', 'VOL %']], df1[['DATE', 'VOL %']], on='DATE')
print(final_df)
DATE VOL %_x VOL %_y
2019-01-09 -7,04% 0,73%
2019-08-25 4,07% 1,98%
Upvotes: 0
Reputation: 2773
import csv
with open('Exp1.csv', 'r') as f1:
csv_reader = csv.reader(f1, delimiter=',')
lines1 = [row for row in csv_reader]
with open('Exp2.csv', 'r') as f2:
csv_reader = csv.reader(f2, delimiter=',')
lines2 = [row for row in csv_reader]
del lines1[0]
del lines2[0]
with open('output.csv', 'w+') as output_file:
output_file.write('"DATE","Exp1","Exp2"\n')
for index, _ in enumerate(lines1):
date = lines1[index][0]
vol1 = lines1[index][6]
vol2 = lines2[index][6]
output_file.write(f'"{date}","{vol1}","{vol2}"\n')
This assumes the following:
VOL %
is always going to be in the 7th column (like in your example)DATE
is always going to be in the 1st column (like in your example)Exp1.csv
and Exp2.csv
"DATE"
will always be the same in Exp1.csv
and Exp2.csv
Read more about the CSV module: https://docs.python.org/3/library/csv.html
Upvotes: 1