Reputation: 2219
I have a folder that has hundreds of files which contain comma-separated data; however, the files themselves have no file extensions (i.e., EPI or DXPX; NOT EPI.csv or DXPX.csv).
I am trying to create a loop that reads in only the files I need (15 to 20 of them). I do not want to concat or append the dfs. I merely want to read each df into memory and be able to call the df by name.
Even though there is no extension, I can read a file in as a CSV:
YRD = pd.read_csv('YRD', low_memory=False)
My expected result from the loop below is two dfs: one named YRD and another named HOUSE. However, I only get one df, named raw_df, and it only holds the final file in the list. Sorry if this is a silly question, but I cannot figure out what I am missing.
df_list = ['YRD','HOUSE']
for raw_df in df_list:
    raw_df = pd.read_csv(raw_df, low_memory=False)
Upvotes: 1
Views: 180
Reputation: 1543
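This is because you reassign raw_df on every pass through the loop, so each new file overwrites the previous one and only the last DataFrame survives.
You should store each DataFrame somewhere new instead of reusing the same variable, for example in a list: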
mydfs=[]
for raw_df in df_list:
    mydfs.append(pd.read_csv(raw_df, low_memory=False))
Or you can put them into a dictionary keyed by file name, which lets you look each one up by name:
mydfs={}
for raw_df in df_list:
    mydfs[raw_df] = pd.read_csv(raw_df, low_memory=False)
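If it helps, here is a self-contained sketch of the dictionary approach that also shows how to get each DataFrame back by name afterwards. The temporary directory, the sample file contents, and the names data_dir, yrd and house are made up purely so the snippet runs anywhere; in your case you would point data_dir at your real folder and skip the two write_text lines.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical setup: create two tiny extension-less CSV files in a temp folder.
# Replace data_dir with the real folder and drop the write_text calls.
data_dir = Path(tempfile.mkdtemp())
(data_dir / "YRD").write_text("id,value\n1,10\n2,20\n")
(data_dir / "HOUSE").write_text("id,rooms\n1,3\n")

df_list = ["YRD", "HOUSE"]

# Build a dict mapping each file name to its DataFrame.
mydfs = {
    name: pd.read_csv(data_dir / name, low_memory=False)
    for name in df_list
}

# Look up each DataFrame by the name of the file it came from.
yrd = mydfs["YRD"]
house = mydfs["HOUSE"]

print(list(mydfs))
print(yrd.head())
```

A dictionary is usually preferable to generating separate top-level variables, because the set of names lives in one place and you can loop over mydfs.items() when you need to treat all of the files the same way.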
Upvotes: 2