VXMH
VXMH

Reputation: 61

How to keep leading zeros when writing to CSV file in python

I have some code that reads through a list of HTML files, takes some information from each of these, and then presents them in a CSV file.

Part of this is using the number from the file's name to display which hour that line of data is from (e.g. file name is Report_H1.html). However, in order to get these to display in the correct order, I have padded out each number with 0's to make them all 2 decimal places.

When writing these to the CSV file, the leading 0's are removed. How can I prevent this from happening?

Edited code down to just the part that is causing my issue:

import re
import csv
import os

BASE_DIRECTORY = "C:\\Users\\MHE36\\Documents"
fileList = []

for (dirpath, dirnames, filenames) in os.walk(BASE_DIRECTORY):
    for f in filenames:
        if 'html' in str(f):
            e = os.path.join(str(dirpath), str(f))
            boxName = os.path.dirname(e)
            box = os.path.basename(boxName)
            fileDict = {}
            fileList.append(fileDict)
            match = re.search(r'\d+', f)
            if match:
                fileDict['Hour'] = str(match.group(0).zfill(2))

with open('C:\\Users\\MHE36\\Documents\\test.csv', 'w') as f:
    w = csv.DictWriter(f, ['Hour'])
    w.writeheader()
    for fileDict in fileList:
        w.writerow(fileDict)

print(fileList)
print(fileDict)

and this is the output of the print commands:

[{'Hour': '01'}, {'Hour': '10'}, {'Hour': '11'}, {'Hour': '12'}, {'Hour': '13'}, {'Hour': '14'}, {'Hour': '15'}, {'Hour': '16'}, {'Hour': '02'}, {'Hour': '03'}, {'Hour': '04'}, {'Hour': '05'}, {'Hour': '06'}, {'Hour': '07'}, {'Hour': '08'}, {'Hour': '09'}, {'Hour': '01'}, {'Hour': '02'}, {'Hour': '03'}, {'Hour': '04'}, {'Hour': '05'}]
{'Hour': '05'}

In the above prints, I can see the leading zeros are being added as I want them. However, in the CSV file, I get the following:

Hour                
1   

I want it as:

Hour                
01  

Thanks in advance.

Upvotes: 0

Views: 6291

Answers (2)

Shawn Jamal
Shawn Jamal

Reputation: 180

I found this in another answer and it worked for me:

df['column']=df['column'].apply('="{}"'.format)

Upvotes: 2

bruno desthuilliers
bruno desthuilliers

Reputation: 77902

Your issue has nothing to do with Python's csv module (as you would have found out reading the csv file directly in your code editor), but with Excel considering unquoted numerics as, well, numerics.

I don't use nor have Excel (linux user here) so I can't double-check, but passing quoting=csv.QUOTE_ALL when creating the writer should fix the issue (this will add quote chars around all your numerics so excel should consider them as strings and leave them alone)

The doc: https://docs.python.org/3/library/csv.html#csv.writer

NB : if you only want to have SOME fields quoted, you'll have to do so manually when building the fileDict.

Upvotes: 3

Related Questions