Reputation: 303
I know that when using python to write dicts to csv files, the headers will be put in alphabetic order. So is there a way that I can write the header with the order I want?
The code and outputs of tsv file is below.
I have a dict:
my_data = {"name": name[:], "city": city[:], "state": state[:],
"stars": stars[:], "review_count": review_count[:],
"main_category": new_cat[:]}
And I used following code to write them in csv file:
with open('test.tsv','w') as file:
writer = csv.writer(file, delimiter='\t')
writer.writerow(my_data.keys())
for row in zip(*my_data.values()):
writer.writerow(list(row))
And the first several rows of the output of the tsv file is below:
city review_count name main_category state stars
Los Angeles 2 Southern California Medical Group Medical Centers CA 3.5
Cambridge 4 Harvard Square Shiatsu Massage MA 4.0
Kitchener 2 Faith & Glory Collective Tattoo ON 4.0
You can see that they are put in an alphabetic order, but what I really want is that they can be put in the order of keys in my_data like this:
name city state stars review_count main_category
Upvotes: 1
Views: 737
Reputation: 123501
The following shows a couple of ways to output the data to the tsv file in the order you want:
with open('test.tsv', 'wb') as file:
FIELDNAMES = 'name city state stars review_count main_category'.split()
writer = csv.writer(file, delimiter='\t')
writer.writerow(FIELDNAMES)
writer.writerows(row for row in zip(*(my_data[key] for key in FIELDNAMES)))
As others have suggested, this could also be done by using a csv.DictWriter
:
with open('test.tsv', 'wb') as file:
FIELDNAMES = 'name city state stars review_count main_category'.split()
writer = csv.DictWriter(file, delimiter='\t', fieldnames=FIELDNAMES)
writer.writeheader()
writer.writerows(dict(zip(FIELDNAMES, row))
for row in zip(*(my_data[key] for key in FIELDNAMES)))
As you can see there both about the same amount of code, although the first, non-DictWriter
version is a little more efficient since producing the data for each row
requires less effort, so accordingly it's probably a little faster.
Regardless of which type of writer is used, the contents of the test.tsv
file created will be exactly the same, as shown below (where →
represents the tab separators):
name→city→state→stars→review_count→main_category
Southern California Medical Group→Los Angeles→CA→3.5→2→Medical Centers
Harvard Square Shiatsu→Cambridge→MA→4.0→4→Massage
Faith & Glory Collective→Kitchener→ON→4.0→2→Tattoo
Note: If you're using Python 2, you open the output file for writing with mode 'wb'
. For Python 2, you should use mode 'w'
and also add a newline=''
keyword argument.
Upvotes: 0
Reputation: 177971
If you have a dictionaries for the fields of each row, then use a DictWriter. It has options to fill in defaults for missing fields, or to ignore extra fields that you don't want in the output.
Example:
import csv
# Sample data converted to a list of lines.
datalines = '''\
Los Angeles,2,Southern California Medical Group,Medical Centers,CA,3.5
Cambridge,4,Harvard Square Shiatsu,Massage,MA,4.0
Kitchener,2,Faith & Glory Collective,Tattoo,ON,4.0
'''.splitlines()
# Specify the fields you want in the output.
# If your dictionaries have fields you don't want in the output, use extrasaction='ignore'
fields = 'name city state stars review_count main_category'.split()
# Python 3, use newline=''. Python 2 use 'wb' instead.
with open('test.tsv','w',newline='') as file:
writer = csv.DictWriter(file, fields, delimiter='\t', extrasaction='ignore')
writer.writeheader() # Writes headers specified by "fields"
for line in datalines:
# Converts input line into a dict in OP's format.
my_data = dict(zip('city review_count name main_category state stars'.split(),line.split(',')))
print(my_data)
writer.writerow(my_data)
Output to verify dictionaries are in OP's format:
{'stars': '3.5', 'review_count': '2', 'name': 'Southern California Medical Group', 'city': 'Los Angeles', 'state': 'CA', 'main_category': 'Medical Centers'}
{'stars': '4.0', 'review_count': '4', 'name': 'Harvard Square Shiatsu', 'city': 'Cambridge', 'state': 'MA', 'main_category': 'Massage'}
{'stars': '4.0', 'review_count': '2', 'name': 'Faith & Glory Collective', 'city': 'Kitchener', 'state': 'ON', 'main_category': 'Tattoo'}
Output to test.tsv (→ for tab):
name→city→state→stars→review_count→main_category
Southern California Medical Group→Los Angeles→CA→3.5→2→Medical Centers
Harvard Square Shiatsu→Cambridge→MA→4.0→4→Massage
Faith & Glory Collective→Kitchener→ON→4.0→2→Tattoo
Upvotes: 0
Reputation: 21663
Just for one record.
import csv
output=open('temp.csv', 'w')
outputCSV = csv.DictWriter(output, delimiter = '\t', \
fieldnames = [ 'name', 'city', 'state', 'stars', 'review_count', 'main_category' ] )
outputCSV.writerow( {
'name': 'Southern Cal Med Group',
'city': 'Los Angeles',
'state': 'CA',
'review_count': '2',
'main_category': 'medical',
'stars': '3.5',
} )
output.close()
Upvotes: 1