Reputation: 109
I have daily temperature files which I'd like to combine into one yearly file.
e.g. Input files
Day_1.dat
Toronto -22.5
Montreal -10.6
Day_2.dat
Toronto -15.5
Montreal -1.5
Day_3.dat
Toronto -5.5
Montreal 10.6
desired output file
Toronto -22.5 -15.5 -5.5
Montreal -10.6 -1.5 10.6
This is the code I've written for this section of the program so far:
#Open files for reading (input) and appending (output)
readFileObj = gzip.open(readfilename, 'r') #call built in utility to unzip file for reading
appFileObj = open(outFileName, 'a')
for line in readfileobj:
    fileString = readFileObj.read(line.split()[-1]+'\n') # read last 'word' of each line
    outval = "" + str(float(filestring) +"\n" #buffer with a space and then signal end of line
    appFileObj.write(outval) #this is where I need formatting help to append outval
Upvotes: 2
Views: 1037
Reputation: 250921
Iterating over fileinput.input lets us loop over all the files in turn, fetching one line at a time. We split each line on whitespace and, using the city name as the key, append the corresponding temperature (or whatever the value is) to that city's list.
import fileinput
d = {}
for line in fileinput.input(['Day_1.dat', 'Day_2.dat', 'Day_3.dat']):
    city, temp = line.split()
    d.setdefault(city, []).append(temp)
Now d contains:
{'Toronto': ['-22.5', '-15.5', '-5.5'],
'Montreal': ['-10.6', '-1.5', '10.6']}
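The same grouping can also be written with collections.defaultdict instead of setdefault. This is only a sketch: the lines list is a hypothetical in-memory stand-in for the contents of the three daily files, and temps_by_city is a name chosen here for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the lines of the three daily files.
lines = [
    "Toronto -22.5", "Montreal -10.6",   # Day_1.dat
    "Toronto -15.5", "Montreal -1.5",    # Day_2.dat
    "Toronto -5.5", "Montreal 10.6",     # Day_3.dat
]

temps_by_city = defaultdict(list)  # a missing city starts out as an empty list
for line in lines:
    city, temp = line.split()
    temps_by_city[city].append(temp)

print(dict(temps_by_city))
```

Whether to use setdefault or defaultdict is mostly a matter of taste; both build the same city-to-list mapping.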
Now, we can simply iterate over this dictionary and write the data to the output file.
with open('output_file', 'w') as f:
    for city, values in d.items():
        f.write('{} {}\n'.format(city, ' '.join(values)))
Output:
$ cat output_file
Toronto -22.5 -15.5 -5.5
Montreal -10.6 -1.5 10.6
Note that before Python 3.7 dictionaries did not guarantee any particular order, so the output here could have been Montreal first and then Toronto. From Python 3.7 onwards a plain dict preserves insertion order. If order matters and you are on an older version, use collections.OrderedDict.
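A minimal sketch of the collections.OrderedDict variant, assuming the input lines look like the ones in the question. The function name group_by_city and the sample list are made up for this illustration.

```python
from collections import OrderedDict

def group_by_city(lines):
    """Group temperatures by city, keeping cities in first-seen order."""
    grouped = OrderedDict()
    for line in lines:
        city, temp = line.split()
        grouped.setdefault(city, []).append(temp)
    return grouped

# Hypothetical sample: the first two days from the question.
sample = ["Toronto -22.5", "Montreal -10.6", "Toronto -15.5", "Montreal -1.5"]
result = group_by_city(sample)
print(list(result))  # cities in the order they first appeared
```

The output rows then come out in the order the cities first appear in the input, regardless of Python version.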
Working version of your code:
import gzip

d = {}
# Assumes `filenames` is a list of all the gzip files to open.
for readfilename in filenames:
    # Populate the dictionary by collecting data from each file.
    # 'rt' opens in text mode (Python 3), so each line is a str rather than bytes.
    with gzip.open(readfilename, 'rt') as f:
        for line in f:
            city, temp = line.split()
            d.setdefault(city, []).append(temp)
# Now write to the output file
with open(outFileName, 'w') as f:
    for city, values in d.items():
        f.write('{} {}\n'.format(city, ' '.join(values)))
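To try the whole thing end to end, here is a self-contained sketch for Python 3.7+. It writes the three sample days from the question into gzip files in a temporary directory and merges them. The Day_N.dat.gz naming, the helper names day_number and merge_daily_files, and the output name year.dat are assumptions made for this example. The files are sorted by day number because a plain sorted() on the names would put Day_10 before Day_2 once there is a full year of files.

```python
import gzip
import os
import re
import tempfile

def day_number(path):
    """Extract N from a name like Day_N.dat.gz (assumed naming scheme)."""
    return int(re.search(r"Day_(\d+)", os.path.basename(path)).group(1))

def merge_daily_files(paths, out_path):
    """Append each day's value to its city's row, in day order."""
    rows = {}
    for path in sorted(paths, key=day_number):
        with gzip.open(path, "rt") as f:  # text mode: lines are str, not bytes
            for line in f:
                if not line.strip():
                    continue  # skip blank lines
                city, temp = line.split()
                rows.setdefault(city, []).append(temp)
    with open(out_path, "w") as out:
        for city, temps in rows.items():
            out.write("{} {}\n".format(city, " ".join(temps)))

# Demo data: the three sample days from the question.
sample = {
    1: "Toronto -22.5\nMontreal -10.6\n",
    2: "Toronto -15.5\nMontreal -1.5\n",
    3: "Toronto -5.5\nMontreal 10.6\n",
}
tmpdir = tempfile.mkdtemp()
paths = []
for day, text in sample.items():
    path = os.path.join(tmpdir, "Day_{}.dat.gz".format(day))
    with gzip.open(path, "wt") as f:
        f.write(text)
    paths.append(path)

out_path = os.path.join(tmpdir, "year.dat")
merge_daily_files(list(reversed(paths)), out_path)  # input order does not matter

with open(out_path) as f:
    merged = f.read()
print(merged)
```

This holds every value in memory before writing, which is fine for a year of daily readings per city.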
Upvotes: 2