Reputation: 13
I have a lot of files, and I have saved all of their names to filelists.txt. Here is an example of its contents:
cpu_H1_M1_S1.out
cpu_H1_M1_S2.out
cpu_H2_M1_S1.out
cpu_H2_M1_S2.out
When the program detects _H, _M, and _S in a file name, I need it to output the numbers that follow them. For example:
_H _M _S
1 1 1
1 1 2
2 1 1
2 1 2
Thank you.
Upvotes: 0
Views: 147
Reputation: 113955
Though I have nothing against regex itself, I think it's overkill for this problem. Here's a lighter solution:
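import operator

# Digit positions in names like "cpu_H1_M1_S1.out" (single-digit values only)
five = operator.itemgetter(5)
eight = operator.itemgetter(8)
eleven = operator.itemgetter(11)
with open("filelists.txt") as f:
    # One (H, M, S) tuple of ints per line of the file
    result = [(int(five(line)), int(eight(line)), int(eleven(line))) for line in f]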
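Note that fixed character positions only work while every number is a single digit. A minimal regex-free sketch that tolerates longer numbers, assuming each name always has the form prefix_H.._M.._S...ext (the helper name `parse_name` is made up for illustration):

```python
def parse_name(name):
    """Split a name like 'cpu_H12_M3_S4.out' into (12, 3, 4) without regex."""
    stem = name.strip().rsplit(".", 1)[0]      # drop the newline and the extension
    _, h, m, s = stem.split("_")               # assumes exactly four "_"-separated fields
    return int(h[1:]), int(m[1:]), int(s[1:])  # drop the leading H / M / S letter

print(parse_name("cpu_H1_M1_S2.out"))
```

A name with a different number of underscore-separated fields raises a ValueError here, so this only suits lists where every line follows the pattern.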
Hope that helps
Upvotes: 0
Reputation: 142146
You could use a regexp:
>>> import re
>>> s = 'cpu_H2_M1_S2.out'
>>> re.findall(r'cpu_H(\d+)_M(\d+)_S(\d+)', s)
[('2', '1', '2')]
If the name doesn't match the format exactly, you'll get an empty list, which you can use to skip that name. Since findall returns a list of tuples here, you could adapt this to convert the strings to integers if you wished:
[tuple(int(i) for i in groups) for groups in re.findall(...)]
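As a minimal sketch of how this could be applied to a whole list of names and printed in the layout from the question. The function name `extract_hms` and the sample list are made up for illustration, and `re.search` is used here instead of `re.findall` so that the _H/_M/_S part may appear anywhere in the name:

```python
import re

# Assumed pattern: _H<digits>_M<digits>_S<digits> anywhere in the name.
PATTERN = re.compile(r"_H(\d+)_M(\d+)_S(\d+)")

def extract_hms(names):
    """Return (H, M, S) integer tuples for names that match; skip the rest."""
    rows = []
    for name in names:
        match = PATTERN.search(name)
        if match:  # names that do not match are ignored
            rows.append(tuple(int(g) for g in match.groups()))
    return rows

# Hypothetical sample; in practice the names would be read from filelists.txt.
names = ["cpu_H1_M1_S1.out", "cpu_H1_M1_S2.out", "cpu_H2_M1_S1.out", "notes.txt"]
print("_H _M _S")
for h, m, s in extract_hms(names):
    print(h, m, s)
```

To run it against the real list, passing the open file object (which yields one name per line) in place of `names` should work, since the trailing newline does not affect the match.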
Upvotes: 2
Reputation: 250951
Something like this, using regex:
In [12]: import re
In [13]: with open("filelists.txt") as f:
   ....:     for line in f:
   ....:         data = re.findall(r"_H\d+_M\d+_S\d+", line)
   ....:         if data:
   ....:             print([x.strip("HMS") for x in data[0].split("_")[1:]])
   ....:
['1', '1', '1']
['1', '1', '2']
['2', '1', '1']
['2', '1', '2']
Upvotes: 0