Reputation: 41
I am trying to parse data from several *.csv files and save them as lists for later manipulation, but I keep failing.
I have read numerous tutorials and related topics on SO and other sites, but couldn't find a solution to my problem. After several days of working on the code, I am stuck and don't know how to proceed.
# saves filepaths of *.csv files in lists (constant)
CSV_OLDFILE = glob.glob("./oldcsv/*.csv")
assert isinstance(CSV_OLDFILE, list)
CSV_NEWFILE = glob.glob("./newcsv/*.csv")
assert isinstance(CSV_NEWFILE, list)

def get_data(input):
    """copies numbers from *.csv files, saves them in list RAW_NUMBERS"""
    for i in range(0, 5): # for each of the six files
        with open(input[i], 'r') as input[i]: # open as "read"
            for line in input[i]: # parse lines for data
                input.append(int(line)) # add to list
    return input

def write_data(input):
    """writes list PROCESSED_NUMBERS_FINAL into new *.csv files"""
    for i in range(0, 5): # for each of the six files
        with open(input[i], 'w') as data: # open as "write"
            data = csv.writer(input[i])
    return data

RAW_NUMBERS = get_data(CSV_OLDFILE)
# other steps for processing data
write_data(PROCESSED_NUMBERS_FINAL)
Actual result:
TypeError: object of type '_io.TextIOWrapper' has no len()
Expected result: save data from *.csv files, manipulate and write them to new *.csv files.
I think the problem is probably that I am calling len() on a file object, but I don't know what the correct implementation should look like.
Complete backtrace:
Traceback (most recent call last):
  File "./solution.py", line 100, in <module>
    PROCESSED_NUMBERS = slowsort_start(RAW_NUMBERS)
  File "./solution.py", line 73, in slowsort_start
    (input[i], 0, len(input[i])-1))
TypeError: object of type '_io.TextIOWrapper' has no len()
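For readers following along, here is a minimal sketch of what seems to trigger the error, assuming the code shown in the question is what runs. The statement `with open(input[i], 'r') as input[i]` rebinds the list slot to the open file object, so the path string is replaced and a later `len(input[i])` receives a `TextIOWrapper`. The temporary file and the helper `read_numbers` are hypothetical names used only for illustration.

```python
import os
import tempfile


def read_numbers(path):
    """Return one int per non-empty line of the file at `path`."""
    with open(path, "r") as handle:  # separate name: the path is not overwritten
        return [int(line) for line in handle if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for one of the files in ./oldcsv
    path = os.path.join(tmp, "001.csv")
    with open(path, "w") as f:
        f.write("3\n1\n2\n")

    # Pattern from the question: the `as` target is the list slot itself.
    paths = [path]
    with open(paths[0], "r") as paths[0]:
        for line in paths[0]:
            paths.append(int(line))

    # The first slot now holds the (closed) file object, not the path.
    slot_type = type(paths[0]).__name__
    try:
        len(paths[0])
        error = None
    except TypeError as exc:
        error = str(exc)

    # Corrected shape: one list of numbers per file, paths left untouched.
    raw_numbers = [read_numbers(p) for p in [path]]

print(slot_type)
print(error)
print(raw_numbers)
```

Giving the file handle its own name and returning a fresh list per file keeps the list of paths and the list of numbers separate, which is what a later sorting step would need.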
Upvotes: 0
Views: 169
Reputation: 41
So this is the solution I found, after lots of trial-and-error and research:
import csv
import os

# initializing lists for later use
RAW_DATA = [] # unsorted numbers
SORTED_DATA = [] # sorted numbers
PROCESSED_DATA = [] # sorted and multiplied numbers

def read_data(filepath): # from oldfiles
    """returns parsed unprocessed numbers from old *.csv files"""
    with open(filepath, "r") as file: # closes the file after reading
        numbers = file.read().splitlines() # reads, gets input from rows
    return numbers

def get_data(filepath): # from oldfiles
    """fills list raw_data with parsed input from old *.csv files"""
    for i in range(0, 6): # for each of the six files
        RAW_DATA.append(read_data(filepath[i])) # add their data to list

def write_data(filepath): # parameter: newfile
    """create new *.csv files with input from sorted_data and permission 600"""
    for i in range(0, 6): # for each of the six files
        with open(filepath[i], "w", newline="\n") as file: # open with "write"
            writer = csv.writer(file) # calls method for writing
            for item in SORTED_DATA[i]: # prevents data from being treated as one object
                writer.writerow([item]) # puts each entry in row
        os.chmod(filepath[i], 0o600) # sets permission to 600 (octal)
This lets me read from files, as well as create and write to files. Given that I need a specific setup, with data only ever being found in "column A", I chose this solution. But thanks again to everybody who answered and commented!
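One thing to watch with this approach: `splitlines()` returns strings, so sorting the raw data directly would order it as text rather than as numbers. Below is a rough sketch of a variant that converts each entry to `int` and passes the data around instead of using module-level lists. The function names `read_column` and `write_column` and the temporary files are made up for this example, and it assumes one integer per row in the first column.

```python
import csv
import os
import tempfile


def read_column(path):
    """Return the integers from the first column of one CSV file."""
    with open(path, newline="") as handle:
        return [int(row[0]) for row in csv.reader(handle) if row]


def write_column(path, numbers):
    """Write one number per row and restrict the file to mode 600."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows([n] for n in numbers)
    os.chmod(path, 0o600)


with tempfile.TemporaryDirectory() as tmp:
    old_path = os.path.join(tmp, "old.csv")  # hypothetical input file
    new_path = os.path.join(tmp, "new.csv")  # hypothetical output file
    with open(old_path, "w") as f:
        f.write("10\n9\n100\n")

    with open(old_path) as f:
        as_text = sorted(f.read().splitlines())  # text order
    as_numbers = sorted(read_column(old_path))   # numeric order

    write_column(new_path, as_numbers)
    round_trip = read_column(new_path)

print(as_text)
print(as_numbers)
```

Looping over whatever paths `glob.glob` returns, rather than over `range(0, 6)`, would also avoid an `IndexError` if the number of files ever changes.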
Upvotes: 0
Reputation: 15513
Question: Expected result: read data from *.csv, manipulate numbers and write them to new *.csv.
An OOP solution that holds the numbers in a dict of dict:list.
Initialize the class object with the in_path and out_path:
import os, csv

class ReadProcessWrite:
    def __init__(self, in_path, out_path):
        self.in_path = in_path
        self.out_path = out_path
        self.number = {}
Read all files from self.in_path, filter .csv files.
Create a dict with key ['raw'] and assign all numbers from this *.csv to a list.
Note: Assuming one number per line!
    def read_numbers(self):
        for fname in os.listdir(self.in_path):
            if fname.endswith('.csv'):
                self.number[fname] = {}
                with open(os.path.join(self.in_path, fname)) as in_csv:
                    self.number[fname]['raw'] = [int(number[0]) for number in csv.reader(in_csv)]
                print('read_numbers {} {}'.format(fname, self.number[fname]['raw']))
        return self
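The `number[0]` indexing in read_numbers works because `csv.reader` yields each row as a list of strings, so a file with one number per line produces one-element lists. A small sketch of that behaviour, using an in-memory `io.StringIO` as a stand-in for a real file (the sample data is invented):

```python
import csv
import io

# One number per line, as the answer assumes.
sample = io.StringIO("1\n2\n3\n")
rows = list(csv.reader(sample))
numbers = [int(row[0]) for row in rows]

# A blank line comes back as an empty list, so row[0] would raise
# IndexError there; filtering on `if row` skips such lines.
with_blank = io.StringIO("1\n\n2\n")
blank_rows = list(csv.reader(with_blank))
safe_numbers = [int(row[0]) for row in blank_rows if row]

print(rows)
print(numbers)
print(blank_rows)
print(safe_numbers)
```

If the input files may contain empty lines, adding the `if row` filter to the list comprehension is one way to keep the "one number per line" assumption from failing.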
Process the ['raw'] numbers and assign the result to the key ['final'].
    def process_numbers(self):
        def process(numbers):
            return [n*10 for n in numbers]

        for fname in self.number:
            print('process_numbers {} {}'.format(fname, self.number[fname]['raw']))
            # other steps for processing data
            self.number[fname]['final'] = process(self.number[fname]['raw'])
        return self
Write the results from key ['final'] to self.out_path, using the same .csv filenames.
    def write_numbers(self):
        for fname in self.number:
            print('write_numbers {} {}'.format(fname, self.number[fname]['final']))
            # newline='' lets csv.writer control the line endings itself
            with open(os.path.join(self.out_path, fname), 'w', newline='') as out_csv:
                csv.writer(out_csv).writerows([[row] for row in self.number[fname]['final']])
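One detail worth knowing when writing with `csv.writer`: the `csv` module documentation recommends opening the output file with `newline=''`, because the writer emits its own `\r\n` line terminator by default and a text-mode file may otherwise translate it again on some platforms. A short sketch that inspects the raw bytes written, using a temporary file whose name is chosen only for this example:

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.csv")  # hypothetical output file

    # newline='' disables newline translation in the file object.
    with open(path, "w", newline="") as out_csv:
        csv.writer(out_csv).writerows([[n] for n in [10, 20, 30]])

    # Read back in binary mode to see exactly what was written.
    with open(path, "rb") as raw:
        data = raw.read()

print(data)
```

If plain `\n` endings are preferred, `csv.writer` also accepts a `lineterminator="\n"` argument.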
Usage:
if __name__ == "__main__":
    ReadProcessWrite('oldcsv', 'newcsv').read_numbers().process_numbers().write_numbers()
Output:
read_numbers 001.csv [1, 2, 3]
read_numbers 003.csv [7, 8, 9]
read_numbers 002.csv [4, 5, 6]
process_numbers 003.csv [7, 8, 9]
process_numbers 002.csv [4, 5, 6]
process_numbers 001.csv [1, 2, 3]
write_numbers 003.csv [70, 80, 90]
write_numbers 002.csv [40, 50, 60]
write_numbers 001.csv [10, 20, 30]
Tested with Python: 3.4.2
Upvotes: 1