Tina

Reputation: 69

Read multiple CSV files then rename files based on the filenames

Currently, the code below reads all the CSV files in the path and stores the resulting dataframes in a list.

I want to save each dataframe under the name of the file it came from, e.g. echo.csv

import glob
import os

import pandas as pd

path = r'M:\Work\Experimental_datasets\device_ID\IoT_device_captures\packet_header_features' # use your path
all_files = glob.glob(os.path.join(path, "*.csv"))
li = []

for filename in all_files:
    # Fields are separated by a literal '|'; skip the 15 header rows and 2 footer rows
    df = pd.read_csv(filename, skiprows=15, sep='[|]',
        skipfooter=2, engine='python', header=None,
        names=["sum_frame_len","avg_frame_len","max_frame_len","sum_ip_len"],
        usecols=[2,3,4,5]
        )
    li.append(df)

The output I get is a list of dataframes, but I want each dataframe to be named after its file, e.g. echo.

How do I store each dataframe under its file name (for example in a dictionary) and then access it?

Upvotes: 1

Views: 281

Answers (1)

Denver

Reputation: 639

As you mentioned, a dictionary would be useful for this task. For example:

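import glob
import os

import pandas as pd

# path is the directory defined in the question
all_files = glob.glob(os.path.join(path, "*.csv"))
df_dict = {}

for filename in all_files:

    df = pd.read_csv(filename, skiprows=15, sep='[|]',
        skipfooter=2, engine='python', header=None,
        names=["sum_frame_len","avg_frame_len","max_frame_len","sum_ip_len"],
        usecols=[2,3,4,5]
        )

    # File name without directory or extension, e.g. 'echo' for '...\echo.csv';
    # splitext only strips the final extension, so names containing dots survive
    name = os.path.splitext(os.path.basename(filename))[0]
    df_dict[name] = df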

This leaves you with the dictionary df_dict, where each key is a file name without its extension (e.g. echo) and each value is the DataFrame read from that file.

You can view all the keys in the dictionary with df_dict.keys() and select a given DataFrame with df_dict[key], e.g. df_dict['echo'].
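To make the lookup concrete, here is a minimal, self-contained sketch of the same idea. The helper name load_csvs_by_name, the temporary files (echo.csv, camera.csv) and their contents are made up for illustration and are not from the question; with the real data you would pass the same read_csv arguments (skiprows, sep, names and so on) through the keyword arguments.

```python
import glob
import os
import tempfile

import pandas as pd


def load_csvs_by_name(path, **read_csv_kwargs):
    """Return {file name without extension: DataFrame} for every CSV in path.

    Hypothetical helper; extra keyword arguments are forwarded to pd.read_csv.
    """
    frames = {}
    for filename in sorted(glob.glob(os.path.join(path, "*.csv"))):
        name = os.path.splitext(os.path.basename(filename))[0]
        frames[name] = pd.read_csv(filename, **read_csv_kwargs)
    return frames


# Demo on two throwaway files (made-up names and values).
with tempfile.TemporaryDirectory() as tmp:
    for stem in ("echo", "camera"):
        with open(os.path.join(tmp, stem + ".csv"), "w") as fh:
            fh.write("sum_frame_len,avg_frame_len\n10,5\n20,10\n")
    df_dict = load_csvs_by_name(tmp)

print(sorted(df_dict))            # the keys are the file names
echo = df_dict["echo"]            # pick one DataFrame by name
for name, df in df_dict.items():  # or loop over all of them
    print(name, df.shape)
```

Note that two files with the same base name in different folders would overwrite each other in the dictionary, so this assumes all files come from a single directory, as in the question.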

Upvotes: 2
