CandleWax

Reputation: 2219

How to read in multiple files into pandas?

I have a folder containing hundreds of files of comma-separated data. The files themselves have no file extensions (i.e., EPI or DXPX, NOT EPI.csv or DXPX.csv).

I am trying to create a loop that reads in only the files I need (15 to 20 of them). I do not want to concat or append the dfs. I merely want to read each df into memory and be able to call the df by name.

Even though there is no extension, I can read a file in as CSV:

YRD = pd.read_csv('YRD', low_memory=False)

My expected result from the loop below is two dfs: one labeled YRD and another labeled HOUSE. However, I only get one df named raw_df, and it holds only the final file in the list. Sorry if this is a silly question, but I cannot figure out what I am missing.

df_list = ['YRD','HOUSE']

for raw_df in df_list:
    raw_df = pd.read_csv(raw_df, low_memory=False)

Upvotes: 1

Views: 180

Answers (1)

Denis

Reputation: 1543

This is because you rebind the name raw_df on every iteration, so after the loop it refers only to the last DataFrame that was read. Store each DataFrame in a container instead of reusing one variable, for example a list:

mydfs = []
for raw_df in df_list:
    mydfs.append(pd.read_csv(raw_df, low_memory=False))

Or you can put them into a dictionary keyed by file name, which lets you look each one up by name:

mydfs = {}
for raw_df in df_list:
    mydfs[raw_df] = pd.read_csv(raw_df, low_memory=False)
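
Since the question asks to call each df by name, here is a minimal sketch of the dictionary approach end to end. The helper name load_frames, the folder argument, and the sample file contents are hypothetical and only there so the example runs on its own; with real data you would pass your own folder and file names.

```python
import os
import tempfile

import pandas as pd


def load_frames(names, folder="."):
    """Read each extension-less CSV file in `names` into a dict keyed by file name."""
    return {
        name: pd.read_csv(os.path.join(folder, name), low_memory=False)
        for name in names
    }


# Demo with throwaway files so the sketch runs anywhere (sample data is made up).
with tempfile.TemporaryDirectory() as folder:
    samples = {"YRD": "a,b\n1,2\n", "HOUSE": "x,y\n3,4\n5,6\n"}
    for name, text in samples.items():
        with open(os.path.join(folder, name), "w") as fh:
            fh.write(text)

    mydfs = load_frames(["YRD", "HOUSE"], folder)

print(mydfs["YRD"])          # look up one DataFrame by its file name
print(mydfs["HOUSE"].shape)  # each entry is an ordinary, independent DataFrame
```

A dictionary is usually easier to work with than creating a separate variable per file, because you can loop over mydfs.items() and still pull out any single frame with mydfs['YRD'].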

Upvotes: 2
