Reputation: 55
I am trying to import all .csv files within a directory. I would like to store each file in its own array (for example, named after the file, like file_name). I tried the following code, as suggested in the thread "import all csv files in directory as pandas dfs and name them as csv filenames":
import pandas as pd
import glob
import os
path = "E:\\9sem\\INO\\Dane\\input\\"
all_files = glob.glob(os.path.join(path, "*.csv")) #make list of paths
for file in all_files:
    # Getting the file name without extension
    file_name = os.path.splitext(os.path.basename(file))[0]
    # Reading the file content to create a DataFrame
    dfn = pd.read_csv(file)
    # Setting the file name (without extension) as the index name
    dfn.index.name = file_name
And I am stuck. I imported the data into a single DataFrame, but I don't know how to convert it to separate NumPy arrays.
Thank you for any suggestions.
Best regards, Maks
Upvotes: 0
Views: 613
Reputation: 4521
Your code would always overwrite the DataFrame with the data of the next CSV, right?
So you could either use pandas.concat to make one big DataFrame, or store the data in a dictionary. If you want to store it in a dictionary, you could change your code like this:
df_dict = dict()
for file in all_files:
    # Getting the file name without extension
    file_name = os.path.splitext(os.path.basename(file))[0]
    # Reading the file content to create a DataFrame
    df_dict[file_name] = pd.read_csv(file)
    # Setting the file name (without extension) as the index name
    df_dict[file_name].index.name = file_name
Then you can get the DataFrame with df_dict[base_name], where base_name is the name of the source file of that DataFrame.
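Since the question ultimately asks for separate NumPy arrays, here is a minimal sketch of the dictionary approach end to end. It uses hypothetical sample CSVs written to a temporary directory (standing in for the files in your input folder) and converts each DataFrame with DataFrame.to_numpy():

```python
import glob
import os
import tempfile

import pandas as pd

# Hypothetical sample data: two small CSV files in a temporary
# directory, standing in for the files in your input path.
tmp_dir = tempfile.mkdtemp()
for name in ("alpha", "beta"):
    pd.DataFrame({"x": [1, 2], "y": [3, 4]}).to_csv(
        os.path.join(tmp_dir, name + ".csv"), index=False
    )

# Build the dictionary of DataFrames, keyed by file name without extension.
df_dict = dict()
for file in glob.glob(os.path.join(tmp_dir, "*.csv")):
    file_name = os.path.splitext(os.path.basename(file))[0]
    df_dict[file_name] = pd.read_csv(file)

# One NumPy array per file, as the question asks for.
arrays = {name: df.to_numpy() for name, df in df_dict.items()}
print(sorted(arrays))         # ['alpha', 'beta']
print(arrays["alpha"].shape)  # (2, 2)
```

If you instead want everything in one structure, pandas.concat(df_dict) would stack the frames into a single DataFrame with the file names as an extra index level.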
Upvotes: 1