Reputation: 39
I'm new to Python and I'm trying to find all the files in a folder that are over a certain size and export the data (file path and size) to a .json file.
What I have so far:
import os
import json
import sys
import io

testPath = str(sys.argv[1])
testSize = int(sys.argv[2])

try:
    to_unicode = unicode
except NameError:
    to_unicode = str

filesList = []
x = 1
j = "1"
data = {}

for path, subdirs, files in os.walk(testPath):
    for name in files:
        filesList.append(os.path.join(path, name))

for i in filesList:
    fileSize = os.path.getsize(str(i))
    if int(fileSize) >= int(testSize):
        data['unit'] = 'B'
        data['path' + j] = str(i)
        data['size' + j] = str(fileSize)
        x = x + 1
        j = str(x)

with io.open('Files.json', 'w', encoding='utf8') as outfile:
    str_ = json.dumps(data,
                      indent=4, sort_keys=True,
                      separators=(',', ': '), ensure_ascii=False)
    outfile.write(to_unicode(str_))
The problem is that the output is:
{
    "path1": "C:\\Folder\\diager.xml",
    "path2": "C:\\Folder\\diag.xml",
    "path3": "C:\\Folder\\setup.log",
    "path4": "C:\\Folder\\ESD\\log.txt",
    "size1": "1908",
    "size2": "4071",
    "size3": "5822",
    "size4": "788",
    "unit": "B"
}
But it needs to be something like this:
{
    "unit": "B",
    "files": [{"path": "C:\\Folder\\file1.txt", "size": "10"}, {"path": "C:\\Folder\\file2.bin", "size": "400"}]
}
I added the j variable because, without it, each file just overwrote the previous values and I ended up with something like this:
{
    "path": "C:\\Folder\\diager.xml",
    "size": "1908",
    "unit": "B"
}
I have no idea how to proceed... Help?
Upvotes: 1
Views: 658
Reputation: 89
Initialize your data dictionary with:
data = {"unit": "B", "files": []}
You can then replace your main loop:
for i in filesList:
    fileSize = os.path.getsize(str(i))
    if int(fileSize) >= int(testSize):
        data['unit'] = 'B'
        data['path' + j] = str(i)
        data['size' + j] = str(fileSize)
        x = x + 1
        j = str(x)
with
for i in filesList:
    fileSize = os.path.getsize(str(i))
    if int(fileSize) >= int(testSize):
        # append one {"path", "size"} dict per matching file
        data['files'].append({"path": str(i), "size": str(fileSize)})
Note that you no longer need your x and j variables.
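Putting the pieces together, here is a minimal sketch of the whole script with this change applied. It assumes Python 3, and the function names collect_large_files and write_report are hypothetical helpers introduced only for illustration; they are not part of the original code.

```python
import json
import os


def collect_large_files(root, min_size):
    """Build {"unit": "B", "files": [...]} for files of at least min_size bytes."""
    data = {"unit": "B", "files": []}
    for path, subdirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(path, name)
            file_size = os.path.getsize(full_path)
            if file_size >= min_size:
                # One dict per matching file, appended to the shared list
                data["files"].append({"path": full_path, "size": str(file_size)})
    return data


def write_report(data, out_path="Files.json"):
    """Write the dictionary to out_path as indented JSON."""
    with open(out_path, "w", encoding="utf8") as outfile:
        json.dump(data, outfile, indent=4, ensure_ascii=False)
```

You would then call something like write_report(collect_large_files(testPath, testSize)) after reading the two arguments from sys.argv as before.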
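Edit: To control the order of the fields, see this question. In particular, according to this nice answer, if you are using Python 3.6 or later, you can import OrderedDict (from collections import OrderedDict) and replace data = {"unit": "B", "files": []} with data = OrderedDict(unit="B", files=[]). Note that your json.dumps call passes sort_keys=True, which sorts the keys alphabetically, so you also need to remove that argument for the insertion order to be kept.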
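To illustrate how key order interacts with json.dumps, here is a small sketch, assuming Python 3.6 or later (where keyword-argument order is preserved). The sample path is made up for the example. It compares the serialized output with and without sort_keys=True, which the code in the question passes.

```python
import json
from collections import OrderedDict

data = OrderedDict(unit="B", files=[])
data["files"].append({"path": "C:\\Folder\\file1.txt", "size": "10"})

# Without sort_keys, keys are written in insertion order: unit, then files
ordered = json.dumps(data)

# With sort_keys=True, keys are sorted alphabetically: files, then unit
resorted = json.dumps(data, sort_keys=True)

print(ordered)
print(resorted)
```

So if "unit" has to come first in the file, the sort_keys=True argument would need to be dropped.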
Upvotes: 0
Reputation: 1244
You can do something like this:
files = []
for i in filesList:
    fileSize = os.path.getsize(str(i))
    if int(fileSize) >= int(testSize):
        files.append({'path': str(i), 'size': fileSize})
data['unit'] = 'B'
data['files'] = files
This way, you build a list with one dict (path and size) per matching file and add it to the data dict afterwards.
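One detail to be aware of: the snippet above stores fileSize as an integer, while the desired output in the question shows the size as a quoted string. The short sketch below (with a made-up file name) shows how the two choices serialize; wrap the value in str() if the quoted form is required.

```python
import json

# Size kept as an int: serialized as a JSON number
as_int = json.dumps({"path": "a.txt", "size": 400})
print(as_int)  # {"path": "a.txt", "size": 400}

# Size converted with str(): serialized as a JSON string
as_str = json.dumps({"path": "a.txt", "size": str(400)})
print(as_str)  # {"path": "a.txt", "size": "400"}
```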
Upvotes: 2