MapleMatrix

Reputation: 109

Python: append last word from lines in input file to end of lines of output file

I have daily temperature files which I'd like to combine into one yearly file.
For example, given these input files:

 Day_1.dat              
 Toronto  -22.5     
 Montreal -10.6  

 Day_2.dat            
 Toronto  -15.5  
 Montreal  -1.5  

 Day_3.dat      
 Toronto   -5.5
 Montreal  10.6  

Desired output file:

Toronto  -22.5 -15.5 -5.5  
Montreal -10.6  -1.5 10.6

This is the code I've written for this section of the program so far:

    #Open files for reading (input) and appending (output)
    readFileObj = gzip.open(readfilename, 'r') #call built in utility to unzip file for    reading
    appFileObj = open(outFileName, 'a')
      for line in readfileobj:
        fileString = readFileObj.read(line.split()[-1]+'\n') # read last 'word' of each line
        outval = "" + str(float(filestring) +"\n" #buffer with a space and then signal end of line
        appFileObj.write(outval) #this is where I need formatting help to append outval

Upvotes: 2

Views: 1037

Answers (1)

Ashwini Chaudhary

Reputation: 250921

Iterating over fileinput.input lets us loop over all the files, fetching one line at a time. We split each line on whitespace and, using the city name as the key, append the corresponding temperature (or whatever the value represents) to a list stored under that key.

import fileinput
d = {}
for line in fileinput.input(['Day_1.dat', 'Day_2.dat', 'Day_3.dat']):
    city, temp = line.split()
    d.setdefault(city, []).append(temp)

Now d contains:

{'Toronto': ['-22.5', '-15.5', '-5.5'],
 'Montreal': ['-10.6', '-1.5', '10.6']}
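
To see what setdefault is doing in that loop, here is a minimal sketch that skips the files entirely. The `rows` list is made up for illustration and stands in for the lines fileinput would yield; collections.defaultdict(list) is shown next to it as an equivalent way to get the same grouping.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for lines read from the daily files.
rows = ["Toronto  -22.5", "Montreal -10.6", "Toronto  -15.5", "Montreal  -1.5"]

d = {}
for line in rows:
    city, temp = line.split()
    # setdefault returns the list already stored for city,
    # or stores a new empty list first and returns that.
    d.setdefault(city, []).append(temp)

# Same grouping with defaultdict: missing keys start as an empty list.
dd = defaultdict(list)
for line in rows:
    city, temp = line.split()
    dd[city].append(temp)

print(d)
print(dict(dd))
```

Either form works; setdefault keeps d a plain dict, while defaultdict avoids building a throwaway empty list on every call.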

Now, we can simply iterate over this dictionary and write the data to the output file.

with open('output_file', 'w') as f:
    for city, values in d.items():
        f.write('{} {}\n'.format(city, ' '.join(values)))

Output:

$ cat output_file 
Toronto -22.5 -15.5 -5.5
Montreal -10.6 -1.5 10.6

Note that before Python 3.7, dictionaries did not guarantee any particular order, so the output here could have been Montreal first and then Toronto. From Python 3.7 on, a plain dict preserves insertion order. If the order is important and you are on an older version, use collections.OrderedDict.
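
If the order matters and you cannot rely on the interpreter to keep it, a sketch with collections.OrderedDict looks like this. The `rows` list is hypothetical sample data in place of the real file lines; the only change from the code above is the type of d.

```python
from collections import OrderedDict

# Hypothetical sample rows standing in for lines read from the daily files.
rows = ["Toronto  -22.5", "Montreal -10.6", "Toronto  -15.5", "Montreal  -1.5"]

# OrderedDict remembers the order in which keys were first inserted.
d = OrderedDict()
for line in rows:
    city, temp = line.split()
    d.setdefault(city, []).append(temp)

# Cities come out in first-seen order: Toronto, then Montreal.
for city, values in d.items():
    print('{} {}'.format(city, ' '.join(values)))
```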


Working version of your code:

import gzip

d = {}
# Assuming `filenames` is a list of all the gzip files to be opened.
for readfilename in filenames:
    # Populate the dictionary by collecting data from each file.
    # 'rt' opens in text mode; plain 'r' yields bytes on Python 3.
    with gzip.open(readfilename, 'rt') as f:
        for line in f:
            city, temp = line.split()
            d.setdefault(city, []).append(temp)

# Now write to the output file
with open(outFileName, 'w') as f:
    for city, values in d.items():
        f.write('{} {}\n'.format(city, ' '.join(values)))
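
To try this end to end without real data, the sketch below first writes three small gzip files into a temporary directory and then runs the same loop over them. The file names, the `samples` contents and the temporary paths are assumptions made up for the demonstration. It opens the files in text mode ('rt'/'wt', Python 3) and also skips blank lines, which the code above does not do.

```python
import gzip
import os
import tempfile

# Hypothetical sample data mirroring the three daily files in the question.
samples = [
    "Toronto  -22.5\nMontreal -10.6\n",
    "Toronto  -15.5\nMontreal  -1.5\n",
    "Toronto   -5.5\nMontreal  10.6\n",
]

# Write each sample to its own gzip file in a scratch directory.
tmpdir = tempfile.mkdtemp()
filenames = []
for i, text in enumerate(samples, start=1):
    path = os.path.join(tmpdir, 'Day_{}.dat.gz'.format(i))
    with gzip.open(path, 'wt') as f:
        f.write(text)
    filenames.append(path)

# Collect the last word of every line under its city name.
d = {}
for readfilename in filenames:
    with gzip.open(readfilename, 'rt') as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines instead of failing on unpacking
            city, temp = line.split()
            d.setdefault(city, []).append(temp)

# Write one line per city to the combined file.
outFileName = os.path.join(tmpdir, 'year.dat')
with open(outFileName, 'w') as f:
    for city, values in d.items():
        f.write('{} {}\n'.format(city, ' '.join(values)))

with open(outFileName) as f:
    result = f.read()
print(result)
```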

Upvotes: 2
