Reputation: 57
I have seen some answers to this type of question here, but not enough to really help me. I did a split on a 9- column .csv - file & wrote them to vectors for other work in c++. They are subsequently written back to a folder as Single- column .csv files that basically look similar to this:
date
20171012
20171011
20171010
20171009
20171006
20171005
20171004
Now I want to combine all those 9 simple csv files into 1 file again so that they stack up horizontally next to each other,like this in the new file:
date,value,etc...
20171012,2501593,etc..
20171011,2176309,etc..
20171010,3484064,etc..
20171009,1785852,etc..
20171006,1785852,etc..
20171005,16476641,etc..
20171004,1235406,etc..
I hope this is easy enough to understand. My code below is as follows:
import csv
data = [] # Buffer list
files = ['./CalculatedOutput/quote_date.csv', './CalculatedOutput/paper.csv', './CalculatedOutput/exch.csv', './CalculatedOutput/open.csv', './CalculatedOutput/high.csv', './CalculatedOutput/low.csv', './CalculatedOutput/close.csv', './CalculatedOutput/volume.csv', './CalculatedOutput/value.csv']
for filename in files:
with open(filename, 'r') as csvfile:
stocks = csv.reader(csvfile)
for row in stocks:
new_row = [row[0]]
data.append(new_row)
with open("CalculatedOutput/Opera.csv", "w+") as to_file:
writer = csv.writer(to_file , delimiter=",")
for new_row in data:
writer.writerow(new_row)
This code DOES move all the rows of the columns into 1 new file, but it just puts them underneath one another. How would I go about writing the columns next to each other,comma separated ? I have tried extensively with Pandas, numpy and the csv lib according to concat, merge and others, but i just can't find the right way. I don't think I'm that far off, but my python is not the best unfortunately!
Upvotes: 3
Views: 2019
Reputation: 4243
use the os to find all files in the a directory endingwith input* then using the column name in the csv automatic stack the data by row using pd.concat. assume an encoding of iso-8859-1 if not unicode.
path= 'C:\\Users\\your_username\\stacked_csv'
# get all csv files in input directory
csv_files = glob.glob(os.path.join(path, 'input*.csv'))
# read all csv files
df_list = []
for csv_file in csv_files:
df = pd.read_csv(csv_file,encoding='iso-8859-1')
df_list.append(df.dropna())
# stack all csv files
df_stacked = pd.concat(df_list, axis=0)
output_file=path+"\\result_all.csv"
# write stacked csv file
df_stacked.to_csv(output_file, index=False)
print(pd.read_csv(output_file))
Upvotes: 0
Reputation: 78556
You can open all the files using a single context manager with contextlib.ExitStack
(in Python 3) and then write to the output file after applying zip
on the iterable of files:
import csv
from contextlib import ExitStack
outfile = "CalculatedOutput/Opera.csv"
with ExitStack() as stack, open(outfile, "w+") as to_file:
# open all files
fs = [stack.enter_context(open(fname)) for fname in files]
fs = map(csv.reader, fs)
# write all rows from all files
csv.writer(to_file).writerows(zip(*fs))
Update:
If the file contains characters that cannot be decoded to UTF-8 (the default encoding for open
), you can use intermediate surrogate characters when reading, which are replaced with their original form at write time:
with ExitStack() as stack, open(outfile, "w+", errors='surrogateescape') as to_file :
fs = [stack.enter_context(open(fname, errors='surrogateescape')) for fname in files]
...
Upvotes: 3
Reputation: 18916
I read you tried pandas, what went wrong there?. With pandas we can simply use pd.concat([df1,df2....]). So let's read them in and bunch them together:
import pandas as pd
df = pd.concat((pd.read_csv(f) for f in files),axis=1) # axis1 for horizontal
df.to_csv("CalculatedOutput/Opera.csv",index=False)
Example:
Let's first create two imaginary files:
file1 = """date
20171012
20171011
20171010
20171009
20171006
20171005
20171004"""
file2 = """number
1
2
3
4
5
6
7"""
files = [io.StringIO(f) for f in [file1,file2]]
import pandas as pd
df = pd.concat([pd.read_csv(f) for f in files],axis=1)
print(df)
date number
0 20171012 1
1 20171011 2
2 20171010 3
3 20171009 4
4 20171006 5
5 20171005 6
6 20171004 7
Upvotes: 2