Reputation: 91
My goal is to group .csv files in a directory by shared characteristics in the file name. My directory contains files with names:
I would like to group these files by the numbers following the "Source" and "Receiver" parts of the file name (as shown below) so that I can later concatenate them.
Group 1
Group 2
Any ideas?
Upvotes: 0
Views: 834
Reputation: 14226
The question says you want to do this in pandas, so here is a pandas solution.
import pandas as pd

# File names to group (in practice these would come from the directory listing)
fnames = ['After_Source1_Receiver1.csv',
          'After_Source1_Receiver2.csv',
          'Before_Source1_Receiver1.csv',
          'Before_Source1_Receiver2.csv',
          'During1_Source1_Receiver1.csv',
          'During1_Source1_Receiver2.csv',
          'During2_Source1_Receiver1.csv',
          'During2_Source1_Receiver2.csv']
df = pd.DataFrame(fnames, columns=['names'])
I don't know what you want to do with the end result, but this is how you can group them.
pattern = r'Source(\d+)_Receiver(\d+)'
# str.extract returns one column per capture group, labelled 0 and 1
extracted = df['names'].str.extract(pattern)
for _, g in pd.concat([df, extracted], axis=1).groupby([0, 1]):
    print(g.names)
0 After_Source1_Receiver1.csv
2 Before_Source1_Receiver1.csv
4 During1_Source1_Receiver1.csv
6 During2_Source1_Receiver1.csv
Name: names, dtype: object
1 After_Source1_Receiver2.csv
3 Before_Source1_Receiver2.csv
5 During1_Source1_Receiver2.csv
7 During2_Source1_Receiver2.csv
Name: names, dtype: object
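If you only need the groups as plain lists of file names (for example, to concatenate each group afterwards), the same idea can be sketched without pandas. This is a minimal sketch, not part of the answer above: the helper name group_by_source_receiver is made up for illustration, and it assumes every file name of interest contains "Source<number>_Receiver<number>".

```python
import re
from collections import defaultdict

# Same file names as in the pandas example above.
fnames = ['After_Source1_Receiver1.csv',
          'After_Source1_Receiver2.csv',
          'Before_Source1_Receiver1.csv',
          'Before_Source1_Receiver2.csv',
          'During1_Source1_Receiver1.csv',
          'During1_Source1_Receiver2.csv',
          'During2_Source1_Receiver1.csv',
          'During2_Source1_Receiver2.csv']

# Same regular expression as in the pandas example.
pattern = re.compile(r'Source(\d+)_Receiver(\d+)')


def group_by_source_receiver(names):
    """Hypothetical helper: map (source, receiver) -> list of file names."""
    groups = defaultdict(list)
    for name in names:
        match = pattern.search(name)
        if match is None:
            # Assumption: names that do not match the pattern are skipped.
            continue
        key = (int(match.group(1)), int(match.group(2)))
        groups[key].append(name)
    return dict(groups)


groups = group_by_source_receiver(fnames)
for key, members in sorted(groups.items()):
    print(key, members)
```

Each list in groups could then be read with pd.read_csv and combined with pd.concat, assuming the files in a group share the same columns.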
Upvotes: 1