Reputation: 720
I have the following step in my Python script that reads every CSV file in a directory and combines them into a single DataFrame. A few of the files (the list is about 4000 long) keep raising errors, and I'm looking for a way to skip the ones that fail.
I tried adding a try/except to this line, but that did not work either. Any ideas?
combined_csv = pd.concat([pd.read_csv(os.path.join(export, l)) for l in os.listdir(export) ])
Error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
[Finished in 178.1s with exit code 1]
Upvotes: 0
Views: 411
Reputation: 531625
You'll have to build the list item by item, so that you can resume the loop after catching an exception. Something like:
dataframes = []
for l in os.listdir(export):
    try:
        df = pd.read_csv(os.path.join(export, l))
    except UnicodeDecodeError:
        continue
    dataframes.append(df)
combined_csv = pd.concat(dataframes)
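As a rough, self-contained sketch of the same loop, the version below also records which files were skipped, which can help when tracking down the bad files among thousands. The function name `read_csvs_skipping_errors`, the `skipped` list, and the throwaway demo files are illustrative assumptions, not part of the original script.

```python
import os
import tempfile

import pandas as pd


def read_csvs_skipping_errors(directory):
    """Read every file in `directory` as CSV and return (combined, skipped)."""
    frames = []
    skipped = []  # names of files that could not be decoded as UTF-8
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            frames.append(pd.read_csv(path))
        except UnicodeDecodeError:
            skipped.append(name)
    # pd.concat raises ValueError on an empty list, so guard that case.
    combined = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
    return combined, skipped


# Demo with temporary files: two valid CSVs and one file holding non-UTF-8 bytes.
with tempfile.TemporaryDirectory() as export:
    with open(os.path.join(export, "a.csv"), "w") as fh:
        fh.write("x,y\n1,2\n")
    with open(os.path.join(export, "b.csv"), "w") as fh:
        fh.write("x,y\n3,4\n")
    with open(os.path.join(export, "c.png"), "wb") as fh:
        fh.write(b"\x89PNG,binary\n1,2\n")
    combined_csv, skipped = read_csvs_skipping_errors(export)
```

A leading byte of 0x89 is how a PNG file starts, so it may be worth inspecting the skipped names to see whether the directory contains files that are not CSVs at all.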
Unfortunately, Python currently has no expression form for catching an exception, only the try statement, so this cannot be done inside the list comprehension itself. An exception-catching expression was proposed (PEP 463), but rejected.
If you want, you can wrap the try statement in a function that returns a value you can skip. For example:
def make_csv(path):
    try:
        return pd.read_csv(path)
    except UnicodeDecodeError:
        return None

combined_csv = pd.concat([x for x in (make_csv(os.path.join(export, f)) for f in os.listdir(export)) if x is not None])
or
combined_csv = pd.concat([make_csv(os.path.join(export, f)) for f in os.listdir(export)])
if you make make_csv return an empty DataFrame (for example, pd.DataFrame()) instead of None.
Upvotes: 1