darkknight555

Reputation: 33

How to fix TypeError: cannot concatenate object of type '<class 'pandas.io.parsers.TextFileReader'>'; only Series and DataFrame objs are valid?

I am trying to read several CSV files, concatenate them, and output them as one CSV file. I keep getting this error:

TypeError: cannot concatenate object of type '<class 'pandas.io.parsers.TextFileReader'>'; only Series and DataFrame objs are valid

I am not sure how to fix it. I am a beginner, so I would appreciate any help! Thank you! Here is the code I wrote:

import csv
import sys
import pandas as pd

csv.field_size_limit(sys.maxsize)

df1 = pd.read_csv('file1.csv', chunksize=20000)
df2 = pd.read_csv('file2.csv', chunksize=20000)
df3 = pd.read_csv('file3.csv', chunksize=20000)
df4 = pd.read_csv('file4.csv', chunksize=20000)
df5 = pd.read_csv('file5.csv', chunksize=20000)
df6 = pd.read_csv('file6.csv', chunksize=20000)

frames = [df1, df2, df3, df4, df5, df6]
result = pd.concat(frames, ignore_index=True, sort=False)
result.to_csv('new.csv')

Upvotes: 2

Views: 2332

Answers (1)

Valdi_Bo

Reputation: 30991

If you call read_csv passing the chunksize parameter, then:

  • it returns a TextFileReader object,
  • which can be used, e.g. in a loop, to read and process consecutive chunks.

An example of "chunked" CSV file reading:

reader = pd.read_csv('input.csv', chunksize=20000)
for chunk in reader:
    # Process the chunk (a DataFrame of up to 20000 rows)
    print(chunk.shape)
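
If the goal is still to combine the full contents of all files, a minimal sketch (reusing the file names from the question, which are assumptions here) is to collect the chunks, which are plain DataFrames, and concatenate those:

import pandas as pd

files = ['file1.csv', 'file2.csv', 'file3.csv',
         'file4.csv', 'file5.csv', 'file6.csv']

chunks = []
for name in files:
    # Iterating over the TextFileReader yields DataFrame chunks
    for chunk in pd.read_csv(name, chunksize=20000):
        chunks.append(chunk)

result = pd.concat(chunks, ignore_index=True, sort=False)
result.to_csv('new.csv')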

Or maybe you want to:

  • read only the initial 20000 rows from each source file,
  • concatenate them into a new DataFrame?

If this is the case, pass nrows=20000 (instead of chunksize) when reading each file. Then all returned objects will be plain DataFrames and you will be able to concat them, as in the sketch below.
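
A minimal sketch of that variant, again assuming the same six file names as in the question:

import pandas as pd

files = ['file1.csv', 'file2.csv', 'file3.csv',
         'file4.csv', 'file5.csv', 'file6.csv']

# nrows returns a plain DataFrame holding only the first 20000 rows of each file
frames = [pd.read_csv(name, nrows=20000) for name in files]

result = pd.concat(frames, ignore_index=True, sort=False)
result.to_csv('new.csv')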

Upvotes: 2
