Reputation: 471
I have searched through perhaps a dozen variations of the question "How to build a dataframe row by row", but none of the solutions have worked for me. Thus, though this is a frequently asked question, my case is unique enough to be a valid question. I think the problem might be that I am grabbing each row from a different csv. This code demonstrates that I am successfully making dataframes in the loop:
onlyfiles = list_of_csvs
for idx, f in enumerate(onlyfiles):
    row = pd.read_csv(mypath + f, sep="|").iloc[0:1]
But the rows are individual dataframes and cannot be combined (so far). I have attempted the following:
df = pd.DataFrame()
for idx, f in enumerate(onlyfiles):
    row = pd.read_csv(mypath + f, sep="|").iloc[0:1]
    df.loc(idx) = row
Which returns
df.loc(idx) = row
^
SyntaxError: can't assign to function call
I think the problem is that each row, or dataframe, has its own headers. I've also tried df.loc(idx) = row[1], but that doesn't work either (where we grab row[:] when idx = 0). Neither iloc(idx) nor loc(idx) works.
In the end, I want one dataframe that has the header (column names) from the first data frame, and then n rows where n is the number of files.
Upvotes: 1
Views: 266
Reputation: 1094
Try pd.concat().
Note that you can read just the first row from each file directly, instead of reading the whole file and then slicing off the first row: pass the parameter nrows=1 to pd.read_csv.
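import pandas as pd

onlyfiles = list_of_csvs
frames = []
for f in onlyfiles:
    # nrows=1 reads only the header and the first data row
    frames.append(pd.read_csv(mypath + f, sep="|", nrows=1))
# concatenate once at the end; ignore_index=True renumbers the rows 0..n-1
df_joint = pd.concat(frames, ignore_index=True)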
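If you want to try this without your own files, here is a minimal self-contained sketch of the same idea. The helper name first_rows, the file names a.csv and b.csv, and the sample values are all made up for illustration; only pd.read_csv and pd.concat come from pandas.

```python
import os
import tempfile

import pandas as pd


def first_rows(paths, sep="|"):
    """Read only the first data row of each CSV and stack them into one DataFrame."""
    # nrows=1 parses the header plus a single data row from each file.
    rows = [pd.read_csv(p, sep=sep, nrows=1) for p in paths]
    # One concat call at the end; ignore_index=True renumbers the rows 0..n-1.
    return pd.concat(rows, ignore_index=True)


# Hypothetical sample data: two small pipe-delimited files in a temp directory.
tmpdir = tempfile.mkdtemp()
paths = []
for name, body in [("a.csv", "x|y\n1|2\n3|4\n"), ("b.csv", "x|y\n5|6\n7|8\n")]:
    path = os.path.join(tmpdir, name)
    with open(path, "w") as fh:
        fh.write(body)
    paths.append(path)

df_joint = first_rows(paths)
print(df_joint)
```

As for the SyntaxError in the question: loc and iloc are indexed with square brackets, so df.loc(idx) = row is parsed as an assignment to a function call, which Python rejects before pandas is ever involved. Collecting the one-row frames in a list and concatenating once sidesteps row-wise assignment entirely.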
Upvotes: 1