How to build a dataframe row by row, where each row comes from a different csv?

Question

I have searched through perhaps a dozen variations of the question "How to build a dataframe row by row", but none of the solutions have worked for me, so although this is a frequently asked question, I think my case is different enough to ask again. The problem might be that I am grabbing each row from a different csv. This code shows that I can successfully create a one-row dataframe for each file in the loop:

onlyfiles = list_of_csvs 
for idx, f in enumerate(onlyfiles):
    row = pd.read_csv(mypath + f,sep="|").iloc[0:1]

But each row is its own dataframe, and so far I have not been able to combine them. I have attempted the following:

df = pd.DataFrame()
for idx, f in enumerate(onlyfiles):
    row = pd.read_csv(mypath + f,sep="|").iloc[0:1]
    df.loc(idx) = row

This raises:

    df.loc(idx) = row
    ^
SyntaxError: can't assign to function call

I think the problem is that each row, being its own dataframe, has its own headers. I've also tried df.loc(idx) = row[1] (grabbing row[:] when idx = 0), but that doesn't work either. Neither iloc(idx) nor loc(idx) works.

In the end, I want one dataframe that has the header (column names) from the first dataframe, followed by n rows, where n is the number of files.

yulGM · Accepted Answer

Try pd.concat().

Note that you can read just the first row from the file directly, instead of reading the whole file and then limiting it to the first row: pass the parameter nrows=1 to pd.read_csv.
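
To illustrate what nrows=1 does, here is a minimal sketch on an in-memory pipe-separated sample. The sample text and the names sample, first_via_slice and first_via_nrows are made up for illustration. The header line is still parsed for column names; only one data row is read.

```python
import io
import pandas as pd

# Hypothetical sample standing in for one of the pipe-separated files.
sample = "id|name\n1|alpha\n2|beta\n3|gamma\n"

# Read the whole file, then keep the first row (the approach in the question).
first_via_slice = pd.read_csv(io.StringIO(sample), sep="|").iloc[0:1]

# Let the parser stop after one data row; the header still supplies the column names.
first_via_nrows = pd.read_csv(io.StringIO(sample), sep="|", nrows=1)

print(first_via_nrows)
```

One caveat: with nrows=1 the column dtypes are inferred from that single row, so they can differ from what a full read would infer (for example, an integer in the first row of a column that contains floats further down).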

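import pandas as pd

onlyfiles = list_of_csvs
df_joint = pd.DataFrame()
# Iterate over the file names directly; enumerate() would yield (index, name) tuples.
for f in onlyfiles:
    # nrows=1 reads only the first data row after the header.
    df_ = pd.read_csv(mypath + f, sep="|", nrows=1)
    # ignore_index=True renumbers the combined rows 0..n-1.
    df_joint = pd.concat([df_joint, df_], ignore_index=True)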
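
A common variation is to collect the one-row frames in a list and call pd.concat once at the end, rather than growing df_joint inside the loop. The sketch below is self-contained under some assumptions: it writes three small pipe-separated files to a temporary directory (the file names and contents are invented for illustration) and then combines their first rows. ignore_index=True is an addition here so that the result is indexed 0 to n-1.

```python
import os
import tempfile
import pandas as pd

# Create hypothetical sample files; in practice these would be your own csvs.
mypath = tempfile.mkdtemp() + os.sep
list_of_csvs = ["a.csv", "b.csv", "c.csv"]
for i, name in enumerate(list_of_csvs):
    with open(mypath + name, "w") as fh:
        fh.write("id|value\n")
        fh.write(f"{i}|{i * 10}\n")
        fh.write("99|999\n")  # second data row, skipped by nrows=1

# Read the first data row of each file, then concatenate once.
frames = [pd.read_csv(mypath + f, sep="|", nrows=1) for f in list_of_csvs]
df_joint = pd.concat(frames, ignore_index=True)

print(df_joint)
```

pd.concat aligns the frames by column name, so the per-file headers are not a problem as long as the files share the same columns. With the default outer join, a column missing from one file is filled with NaN for that row.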

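As for the SyntaxError in the question: .loc and .iloc are indexers, so they take square brackets, not parentheses. df.loc(idx) = row is an assignment to a function call, which Python rejects before pandas is involved at all. A minimal sketch of bracket-based row assignment follows; the column names and values are invented, and it assumes the columns are known up front.

```python
import pandas as pd

# Hypothetical columns; in the question they would come from the first file's header.
df = pd.DataFrame(columns=["id", "value"])

# Hypothetical one-row frames standing in for the rows read from each csv.
rows = [pd.DataFrame({"id": [i], "value": [i * 10]}) for i in range(3)]

for idx, row in enumerate(rows):
    # Square brackets, and a Series (row.iloc[0]) rather than a one-row DataFrame.
    df.loc[idx] = row.iloc[0]

print(df)
```

Assigning one row at a time like this tends to be slower than a single pd.concat over a list of frames, so it is mainly useful for understanding why the original attempt failed.
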