Reputation: 4972
I am trying to loop over folders and subfolders to access and read CSV files before transforming them into JSON. Here is the code I am working on:
cursor = conn.cursor()
try:
    # Specify the folder containing needed files
    folderPath = 'C:\\Users\\myUser\\Desktop\\toUpload'  # Or using input()
    fwdPath = 'C:/Users/myUser/Desktop/toUpload'
    for countries in os.listdir(folderPath):
        for sectors in os.listdir(folderPath+'\\'+countries):
            for file in os.listdir(folderPath+'\\'+countries+'\\'+sectors):
                data = pd.DataFrame()
                filename, _ext = os.path.splitext(os.path.basename(folderPath+'\\'+countries+'\\'+file))
                print(file + ' ' + filename+ ' ' + sectors + ' ' + countries)
                data = pd.read_csv(file)
                # cursor.execute('SELECT * FROM SECTORS')
                # print(list(cursor))
finally:
    cursor.close()
    conn.close()
The following print statement outputs the file name, the file name without its extension, and then the sector and country folder names:
print(file + ' ' + filename+ ' ' + sectors + ' ' + countries)
myfile.csv myfile WASHSector CTRYIrq
However, when it comes to reading the CSV, it takes a very long time, and in the end I get the following error:
[Errno 2] File myfile.csv does not exist
Upvotes: 0
Views: 57
Reputation: 2647
Before reading the CSV file, you should build the full path to it. os.listdir returns bare file names only, so pd.read_csv(file) looks for the file relative to the current working directory and cannot find it.
import os
# ...
path = os.path.join(folderPath, countries, sectors, file)
data = pd.read_csv(path)
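To see why the bare name fails, here is a small sketch that rebuilds the country/sector layout from the question in a temporary directory and inspects what os.listdir returns. The helper name demo_listdir_names is made up for illustration, and the folder and file names are just the ones printed in the question.

```python
import os
import tempfile


def demo_listdir_names():
    """Create a throwaway country/sector/file tree and inspect os.listdir output."""
    with tempfile.TemporaryDirectory() as folder_path:
        # Hypothetical layout mirroring the question: <root>/CTRYIrq/WASHSector/myfile.csv
        sector_dir = os.path.join(folder_path, "CTRYIrq", "WASHSector")
        os.makedirs(sector_dir)
        with open(os.path.join(sector_dir, "myfile.csv"), "w") as fh:
            fh.write("a,b\n1,2\n")

        bare = os.listdir(sector_dir)[0]       # a bare name, not a path
        full = os.path.join(sector_dir, bare)  # the path a reader function needs
        return bare, os.path.isabs(bare), os.path.isfile(full)


bare_name, is_absolute, full_path_exists = demo_listdir_names()
print(bare_name, is_absolute, full_path_exists)
```

The name that comes back carries no directory information, so anything that opens it resolves it against the current working directory unless you join it with its parent folder first.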
Also, instead of using three nested for loops, I recommend using the os.walk function. It recurses through the subdirectories automatically and gives you the directory of each file it finds:
>>> folderPath = 'C:\\Users\\myUser\\Desktop\\toUpload'
>>> for root, _, files in os.walk(folderPath):
...     for f in files:
...         pd.read_csv(os.path.join(root, f))
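One caveat with walking the whole tree is that it visits every file, not only CSVs. A minimal sketch of one way to handle that, assuming you only want files ending in .csv (the function name find_csv_files is hypothetical):

```python
import os


def find_csv_files(root_dir):
    """Return the full paths of all .csv files below root_dir, sorted."""
    csv_paths = []
    for root, _dirs, files in os.walk(root_dir):
        for name in files:
            # Compare case-insensitively so 'DATA.CSV' is picked up too
            if name.lower().endswith(".csv"):
                csv_paths.append(os.path.join(root, name))
    return sorted(csv_paths)
```

Each returned path can then be passed straight to pd.read_csv, since it already includes the directory that os.walk reported for the file.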
Upvotes: 1
Reputation: 2313
You need to give pd.read_csv the full path of the file, so change that line to:
data = pd.read_csv(folderPath+'\\'+countries+'\\'+sectors + '\\' +file)
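If you would rather not concatenate separators by hand, pathlib can produce the same full paths while still giving you the country and sector names. This is only a sketch, assuming the fixed country/sector/file layout from the question; iter_sector_csvs is a made-up helper name.

```python
from pathlib import Path


def iter_sector_csvs(folder_path):
    """Yield (country, sector, full_path) for every CSV two folder levels below folder_path."""
    base = Path(folder_path)
    # '*/*/*.csv' matches <country>/<sector>/<file>.csv directly under base
    for csv_path in sorted(base.glob("*/*/*.csv")):
        country, sector = csv_path.relative_to(base).parts[:2]
        yield country, sector, csv_path
```

The yielded csv_path is a Path object that pd.read_csv accepts directly, so the loop body could be as simple as data = pd.read_csv(csv_path).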
Upvotes: 1