Reputation: 121
I'm trying to organize a dataset that I downloaded. Currently, I have a directory of images and an Excel spreadsheet with values for those images. I'm trying to combine those into a single dataframe that has one column for the image filename and another for the actual file. Here's the code I have currently:
for filename in os.listdir("C:\\Users\\arnav\\Dataset\\Images"):
    new_filename = filename
    if(filename[0] == '.'):
        new_filename = filename[2:]
    picture = open(new_filename)
    filename_file.loc[filename_file['Filename'] == new_filename,'File'] = picture
The filename_file dataframe has 5500 rows and 2 columns, one for the filename and one for the file. I managed to load the filenames in, so right now it has all zeros for the File column. When I run the loop I get this error:
Traceback (most recent call last):
  File "data.py", line 16, in <module>
    filename_file.loc[filename_file['Filename'] == new_filename,'File'] = picture
  File "C:\Users\arnav\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 190, in __setitem__
    self._setitem_with_indexer(indexer, value)
  File "C:\Users\arnav\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 604, in _setitem_with_indexer
    elif can_do_equal_len():
  File "C:\Users\arnav\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 554, in can_do_equal_len
    values_len = len(value)
TypeError: object of type '_io.TextIOWrapper' has no len()
I don't know why this is happening, can anyone help?
Upvotes: 1
Views: 1025
Reputation: 3318
You are trying to assign the file object for each picture rather than its contents. That fails because pandas internally calls len() on the assigned value, and a file object has no length. If you want to assign the actual file contents to cells in the File column, you have to use picture.read().
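To see the difference in isolation, here is a minimal sketch that does not touch pandas at all. The temporary file and its few bytes are a made-up stand-in for one of the images, and it is opened in binary mode on the assumption that image data should be read as raw bytes.

```python
import os
import tempfile

# Hypothetical stand-in for one image file: a few raw bytes in a temp file.
fd, path = tempfile.mkstemp(suffix=".png")
with os.fdopen(fd, "wb") as tmp:
    tmp.write(b"\x89PNG\r\n\x1a\n")

with open(path, "rb") as picture:
    # The file object itself defines no __len__, which is what trips pandas up.
    handle_has_len = hasattr(picture, "__len__")
    # The bytes returned by read() do have a length.
    contents = picture.read()

os.remove(path)
print(handle_has_len, len(contents))
```

The file object is only a handle to the data on disk; read() is what returns the data itself.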
Sidenote: You might want to use a context manager when reading from files in Python, because otherwise you would leave a lot of unclosed IO objects.
for filename in os.listdir("C:\\Users\\arnav\\Dataset\\Images"):
    new_filename = filename
    if(filename[0] == '.'):
        new_filename = filename[2:]
    # Binary mode: image data is not text, so decoding it as UTF-8 would fail.
    with open(new_filename, 'rb') as picture:
        filename_file.loc[filename_file['Filename'] == new_filename, 'File'] = picture.read()
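As an alternative to assigning row by row with .loc, the File column could also be built in one pass from the Filename column. This is only a sketch under a few assumptions: the temporary directory and the two fake files stand in for the real Images folder, read_bytes is a hypothetical helper, and each path is built by joining the directory with the filename so the script does not depend on the current working directory.

```python
import os
import tempfile

import pandas as pd


def read_bytes(path):
    """Return the raw contents of a file, closing it afterwards."""
    with open(path, "rb") as picture:  # binary mode: image data is not text
        return picture.read()


# Hypothetical stand-in for the Images directory: two tiny fake "images".
image_dir = tempfile.mkdtemp()
for name, payload in [("a.png", b"\x89PNG-a"), ("b.png", b"\x89PNG-b")]:
    with open(os.path.join(image_dir, name), "wb") as out:
        out.write(payload)

# Same layout as filename_file in the question: one row per filename.
filename_file = pd.DataFrame({"Filename": ["a.png", "b.png"]})

# Build the File column in one pass instead of assigning row by row with .loc.
filename_file["File"] = filename_file["Filename"].map(
    lambda name: read_bytes(os.path.join(image_dir, name))
)
```

Each cell in File then holds a bytes object. For 5500 images that keeps every file in memory at once, so storing only the path and reading on demand may be the better trade-off if the images are large.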
Upvotes: 1