achandra03

Reputation: 121

Pandas dataframe - TypeError: object of type '_io.TextIOWrapper' has no len()

I'm trying to organize a dataset that I downloaded. Currently, I have a directory of images and an Excel spreadsheet with values for those images. I'm trying to combine those into a single dataframe that has one column for the image filename and another for the actual file. Here's the code I have currently:

for filename in os.listdir("C:\\Users\\arnav\\Dataset\\Images"):
    new_filename = filename
    if(filename[0] == '.'):
        new_filename = filename[2:]
    picture = open(new_filename)
    filename_file.loc[filename_file['Filename'] == new_filename,'File'] = picture

The filename_file dataframe has 5500 rows with 2 columns, one for filename and one for the file. I managed to load the filenames in, so right now it has all zeros for the File column. When I run the loop I get this error:

Traceback (most recent call last):
  File "data.py", line 16, in <module>
    filename_file.loc[filename_file['Filename'] == new_filename,'File'] = picture
  File "C:\Users\arnav\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 190, in __setitem__
    self._setitem_with_indexer(indexer, value)
  File "C:\Users\arnav\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 604, in _setitem_with_indexer
    elif can_do_equal_len():
  File "C:\Users\arnav\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 554, in can_do_equal_len
    values_len = len(value)
TypeError: object of type '_io.TextIOWrapper' has no len()

I don't know why this is happening. Can anyone help?

Upvotes: 1

Views: 1025

Answers (1)

Milo

Reputation: 3318

You are assigning the file object itself (an _io.TextIOWrapper) to the cell. A file object is iterable, so pandas treats it as a sequence of values and calls len() on it, which raises the TypeError. If you want to assign the actual file contents to cells in the File column, you have to use picture.read().

Sidenote: You might want to use a context manager when reading from files in Python, because otherwise every iteration of the loop leaves an open file object behind.

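import os

image_dir = "C:\\Users\\arnav\\Dataset\\Images"
for filename in os.listdir(image_dir):
    new_filename = filename
    if filename[0] == '.':
        new_filename = filename[2:]  # drop the two-character prefix, as in the question
    # Images are binary data, so open with 'rb' rather than decoding them as text.
    # Joining with image_dir makes the path resolve regardless of the working directory.
    with open(os.path.join(image_dir, new_filename), 'rb') as picture:
        filename_file.loc[filename_file['Filename'] == new_filename, 'File'] = picture.read()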
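An alternative that avoids one .loc assignment per file is to collect the contents into a dictionary first. The sketch below uses only the standard library. The helper name read_image_bytes and the demo directory are hypothetical, and it leaves out the dot-prefix handling from the question.

```python
import os
import tempfile

def read_image_bytes(image_dir):
    """Return {filename: raw bytes} for every regular file in image_dir."""
    contents = {}
    for filename in os.listdir(image_dir):
        path = os.path.join(image_dir, filename)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        # 'rb' avoids decoding image data as text; the with block closes the file.
        with open(path, "rb") as picture:
            contents[filename] = picture.read()
    return contents

# Hypothetical demo directory holding two fake images.
demo_dir = tempfile.mkdtemp()
for name, data in [("a.png", b"\x89PNG-a"), ("b.png", b"\x89PNG-b")]:
    with open(os.path.join(demo_dir, name), "wb") as f:
        f.write(data)

image_bytes = read_image_bytes(demo_dir)
```

With the dataframe from the question, filename_file['File'] = filename_file['Filename'].map(image_bytes) should then fill the column in one step. Filenames without a matching file end up as NaN.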

Upvotes: 1
