Reputation: 11
I have a folder containing 3 csv files:
a.csv
b.csv
c.csv
To read all the CSV files in this folder into dataframes, I'm currently doing this:
df1 = pd.read_csv('a.csv')
df2 = pd.read_csv('b.csv')
df3 = pd.read_csv('c.csv')
Is there any way to automate the naming of the dataframes (df1, df2 and df3) and the reading of all the CSV files in that folder? Say I have 10 CSV files; I don't want to write 10 read statements by hand.
For example, I don't want to write this:
df1 = pd.read_csv('a.csv')
......
......
......
df10 = pd.read_csv('j.csv')
Thanks!
Upvotes: 1
Views: 3866
Reputation: 719
You can create a dictionary of DataFrames:
import os
import pandas as pd
from glob import glob
dfs = {os.path.splitext(os.path.basename(f))[0]: pd.read_csv(f) for f in glob('*.csv')}
# dfs['a'] is the equivalent of df1
dfs['a']
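A variation on the same idea, as a minimal sketch: wrapping the comprehension in a function makes the folder explicit, and sorting the paths makes the key order predictable (glob itself does not promise any order). The name `load_csv_folder` and its `folder` parameter are hypothetical, not part of pandas.

```python
from pathlib import Path

import pandas as pd


def load_csv_folder(folder):
    """Return {file stem: DataFrame} for each .csv file directly inside folder."""
    # load_csv_folder is a hypothetical helper name used for illustration.
    # sorted() gives a deterministic key order; subfolders are not searched.
    paths = sorted(Path(folder).glob("*.csv"))
    return {path.stem: pd.read_csv(path) for path in paths}
```

With the three files from the question in the current directory, `load_csv_folder(".")["a"]` would play the role of df1.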
Upvotes: 1
Reputation: 17550
You can do this quite easily if you're willing to access a list of dataframes rather than have df1...dfn explicitly declared:
import os
import pandas as pd

root = "YOUR FOLDER"
csvs = []  # paths of the CSV files found under root
dfs = []   # one dataframe per CSV file

# collect CSV paths (os.walk also descends into subfolders)
for dirpath, dirnames, filenames in os.walk(root):
    for file in filenames:
        if file.lower().endswith('.csv'):
            csvs.append(os.path.join(dirpath, file))

# store each dataframe in the list
for f in csvs:
    dfs.append(pd.read_csv(f))
Then access the dataframes as dfs[0], dfs[1], and so on.
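One thing a plain list loses is which file each position came from. A minimal sketch that keeps the file names next to the dataframes, assuming a single flat folder; the function name `read_csvs_in_order` is hypothetical:

```python
import os

import pandas as pd


def read_csvs_in_order(folder):
    """Return (names, frames): sorted CSV file names and their dataframes."""
    # read_csvs_in_order is a hypothetical helper; only the top level of folder is read.
    names = sorted(n for n in os.listdir(folder) if n.lower().endswith(".csv"))
    frames = [pd.read_csv(os.path.join(folder, n)) for n in names]
    return names, frames
```

After `names, dfs = read_csvs_in_order("YOUR FOLDER")`, `dfs[0]` is what the question calls df1 and `names[0]` says which file it was read from.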
Upvotes: 2
Reputation: 1207
People may downvote this solution since it asks you to play with global variables, but it does solve your problem.
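import os
import pandas as pd
folder = 'myDir'  # folder to scan
for root, dirs, filenames in os.walk(folder):
    # a restarts at 0 in every subfolder, so names can collide if there are nested folders
    for a, f in enumerate(filenames):
        fullpath = os.path.join(root, f)  # root is the directory that actually contains f
        globals()['df%d' % (a + 1)] = pd.read_csv(fullpath)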
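If the df1 ... dfN names matter but writing into globals() feels risky, a dictionary keyed by those same names is one possible middle ground. This is only a sketch under the assumption of a single flat folder; `load_numbered` is a hypothetical name.

```python
import os

import pandas as pd


def load_numbered(folder):
    """Return {'df1': DataFrame, 'df2': DataFrame, ...} in sorted file-name order."""
    # load_numbered is a hypothetical helper; numbering starts at 1 to match df1.
    names = sorted(n for n in os.listdir(folder) if n.lower().endswith(".csv"))
    return {
        "df%d" % i: pd.read_csv(os.path.join(folder, n))
        for i, n in enumerate(names, start=1)
    }
```

`frames = load_numbered('myDir')` then gives `frames['df1']`, `frames['df2']`, and so on, without creating module-level variables.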
Upvotes: 0