Reputation: 191
I have files whose names follow this pattern:
XXXX____________030621_120933_D.csv
YYYY____________030621_120933_E.csv
ZZZZ____________030621_120933_F.csv
I am using glob.glob and a for loop to read each file into a pandas DataFrame, and I will merge the DataFrames at the end. I want to add a column to each DataFrame that holds the XXXX, YYYY, or ZZZZ prefix from its file name.
I can create a column called ID with df['ID'], and I want to fill it with the value from the file name. What is the easiest way to grab that from the file name when reading the CSV with pandas?
Upvotes: 0
Views: 70
Reputation: 1048
If the file names are as you have presented them, then use this code:
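import glob
import os
import pandas as pd

dir_path = 'path/to/your/directory'  # placeholder: set this to your directory
file_paths = glob.glob(os.path.join(dir_path, '*.csv'))
frames = []
for file_ in file_paths:
    df = pd.read_csv(file_)
    # Assumes the ID is the first four characters of the file name, e.g. 'XXXX'
    df['ID'] = os.path.basename(file_)[:4]
    frames.append(df)
# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
result = pd.concat(frames, ignore_index=True)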
Finding the right slice for the ID might take a bit of trial and error, but that should do it.
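If the IDs are not all the same length, one alternative to a fixed index is to take everything before the first underscore in the file name. This is only a minimal sketch, assuming the ID itself never contains an underscore; id_from_filename is a hypothetical helper name, not something provided by pandas or glob.

```python
import os


def id_from_filename(path):
    """Return the part of the file name that precedes the first underscore."""
    # Drop any directory components so only the file name is inspected
    name = os.path.basename(path)
    # Split once on the first underscore and keep the left-hand piece
    return name.split('_', 1)[0]


print(id_from_filename('XXXX____________030621_120933_D.csv'))  # XXXX
```

Inside the loop, this could replace the fixed slice, for example df['ID'] = id_from_filename(file_).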
Upvotes: 1