Claudiu Dragan

Reputation: 39

Write data to .json file in python

I'm new to Python and I'm trying to find all the files in a folder that are over a certain size and export the data (file path and size) to a .json file.

What I have so far:

import os
import json
import sys
import io

testPath = str(sys.argv[1])
testSize = int(sys.argv[2])

try:
    to_unicode = unicode
except NameError:
    to_unicode = str

filesList = []
x = 1
j = "1"
data = {}

for path, subdirs, files in os.walk(testPath):
    for name in files:
        filesList.append(os.path.join(path, name))

for i in filesList:
    fileSize = os.path.getsize(str(i))
    if int(fileSize) >= int(testSize):
        data['unit'] = 'B'
        data['path' + j] = str(i)
        data['size' + j] = str(fileSize)
        x = x + 1
        j = str(x)


with io.open('Files.json', 'w', encoding='utf8') as outfile:
    str_ = json.dumps(data,
                      indent=4, sort_keys=True,
                      separators=(',', ': '), ensure_ascii=False)
    outfile.write(to_unicode(str_))

The problem is that the output is:

{
    "path1": "C:\\Folder\\diager.xml",
    "path2": "C:\\Folder\\diag.xml",
    "path3": "C:\\Folder\\setup.log",
    "path4": "C:\\Folder\\ESD\\log.txt",
    "size1": "1908",
    "size2": "4071",
    "size3": "5822",
    "size4": "788",
    "unit": "B"
}

But it needs to be something like this:

{
"unit": "B",
"files": [{"path": "C:\\Folder\\file1.txt", "size": "10"}, {"path": "C:\\Folder\\file2.bin", "size": "400"}]
}

I added the j variable because it would just replace the first value and I would just end up with something like this:

{
    "path": "C:\\Folder\\diager.xml",
    "size": "1908",
    "unit": "B"
}

I have no idea how to proceed... Help?

Upvotes: 1

Views: 658

Answers (2)

gchelfi

Reputation: 89

Initialize your data dictionary with:

data = {"unit": "B", "files": []}

You can then replace your main loop:

for i in filesList:
    fileSize = os.path.getsize(str(i))
    if int(fileSize) >= int(testSize):
        data['unit'] = 'B'
        data['path' + j] = str(i)
        data['size' + j] = str(fileSize)
        x = x + 1
        j = str(x)

with

for i in filesList:
    fileSize = os.path.getsize(str(i))
    if int(fileSize) >= int(testSize):
        data['files'].append({"path": str(i), "size": str(fileSize)})

Note that you no longer need your x and j variables.

Edit: To control the order of the fields, see this question. In particular, according to this nice answer, if you are using Python 3.6, you can import OrderedDict (from collections import OrderedDict) and replace data = {"unit": "B", "files": []} with data = OrderedDict(unit="B", files=[]).
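
One caveat on field order, shown here as a minimal sketch rather than a definitive fix: the code in the question passes sort_keys=True to json.dumps, which sorts the keys alphabetically and would put "files" before "unit" however the dictionary was built. Assuming Python 3.7 or later, where a plain dict keeps insertion order, leaving sort_keys out should be enough. The sample path and size are taken from the output in the question.

```python
import json

# Plain dicts keep insertion order on Python 3.7+ (assumption: that is the
# interpreter in use), so "unit" is inserted first to appear first.
data = {"unit": "B", "files": []}
data["files"].append({"path": "C:\\Folder\\diag.xml", "size": "4071"})

# sort_keys=True orders keys alphabetically, so "files" precedes "unit".
sorted_text = json.dumps(data, sort_keys=True)

# Without sort_keys, the insertion order is preserved in the output.
ordered_text = json.dumps(data)

print(sorted_text)
print(ordered_text)
```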

Upvotes: 0

amuttsch

Reputation: 1244

You can do something like this:

files = []
for i in filesList:
    fileSize = os.path.getsize(str(i))
    if int(fileSize) >= int(testSize):
        files.append({'path': str(i), 'size': fileSize})

data['unit'] = 'B'
data['files'] = files

This way, you build a list of dictionaries, one per file with its path and size, and add that list to the data dict afterwards.
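
Putting the pieces together, the whole script could look roughly like the sketch below. It assumes Python 3 (so the built-in open accepts an encoding argument), and the function names collect_large_files and write_report are made up for illustration; they do not come from the question. The size is checked while walking the tree, so the intermediate filesList is not needed.

```python
import json
import os


def collect_large_files(root, min_size):
    """Return {"unit": "B", "files": [...]} for files of at least min_size bytes."""
    files = []
    for path, _subdirs, names in os.walk(root):
        for name in names:
            full_path = os.path.join(path, name)
            size = os.path.getsize(full_path)
            if size >= min_size:
                # One dictionary per file, collected in a list.
                files.append({"path": full_path, "size": size})
    return {"unit": "B", "files": files}


def write_report(data, out_path):
    # json.dump writes straight to the file; ensure_ascii=False keeps
    # non-ASCII characters in paths readable.
    with open(out_path, "w", encoding="utf8") as outfile:
        json.dump(data, outfile, indent=4, ensure_ascii=False)
```

To mirror the command-line behaviour of the original script, you could call write_report(collect_large_files(sys.argv[1], int(sys.argv[2])), 'Files.json') after importing sys.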

Upvotes: 2
