Reputation: 555
I have a list of files like this:
my_list=['l.txt','PPT_6_202008062343HLC.txt','PPT_6_202008070522HLC.txt','PPT_12_202008062343HLC.txt','PPT_12_202008070522HLC.txt']
and I want a final list that keeps only the latest of the files beginning with PPT_6 and PPT_12, plus all the other items, like this:
final_list=
['PPT_6_202008070522HLC.txt', 'PPT_12_202008070522HLC.txt', 'l.txt']
Right now I'm doing this:
from datetime import datetime

now = datetime.now()
new_arc = []
time_6 = []
time_12 = []
for i in my_list:
    if i[4:5] == '6':
        time_6.append(i)
    elif i[4:5] == '1':
        time_12.append(i)
    else:
        new_arc.append(i)
# the 12-digit timestamp sits just before 'HLC.txt', i.e. at [-19:-7]
time_6 = [max(t for t in time_6 if datetime.strptime(t[-19:-7], '%Y%m%d%H%M') < now)]
time_12 = [max(t for t in time_12 if datetime.strptime(t[-19:-7], '%Y%m%d%H%M') < now)]
final_list = time_6 + time_12 + new_arc
Is there a better way of doing this?
Upvotes: 0
Views: 102
Reputation: 1355
The best I could come up with was this:
import re

my_list = [
    'l.txt', 'PPT_6_202008062343HLC.txt', 'PPT_6_202008070522HLC.txt',
    'PPT_12_202008062343HLC.txt', 'PPT_12_202008070522HLC.txt'
]
patterns = (re.compile("PPT_6_"), re.compile("PPT_12_"))
# the timestamps sort lexicographically, so max() picks the latest file per prefix
final_list = [max(filter(pattern.match, my_list)) for pattern in patterns]
# keep every name that matches none of the prefixes
final_list += [name for name in my_list
               if not any(pattern.match(name) for pattern in patterns)]
It scans the list once per pattern, so unless you're working with a very large number of file names, I don't think it should be too bad.
Upvotes: 0
Reputation: 5975
The timestamp format in these filenames (zero-padded year, month, day, hour, minute) means you don't need datetime functions: alphabetical order is enough.
You can remove all items matching the two prefixes and then append the most recent one for each prefix, which is simply the alphabetically largest element.
p1 = [x for x in my_list if x.startswith("PPT_6")]
p2 = [x for x in my_list if x.startswith("PPT_12")]
result = [x for x in my_list if x not in p1 and x not in p2]
result.append(max(p1))
result.append(max(p2))
print(result)
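If more prefixes turn up later, the same "alphabetical maximum" idea can be done in a single pass with a dict. This is only a rough sketch: the regex and the name `latest_per_prefix` are my own, and it assumes every grouped file looks like `PPT_<number>_<12-digit timestamp>...`.

```python
import re

# Assumed naming scheme (not confirmed by the question):
# a "PPT_<number>" prefix, an underscore, then a 12-digit timestamp.
PREFIX_RE = re.compile(r"^(PPT_\d+)_\d{12}")

def latest_per_prefix(names):
    """Keep the alphabetically largest name per prefix, plus all other names."""
    latest = {}   # prefix -> newest file name seen so far
    others = []   # names that do not follow the assumed scheme
    for name in names:
        m = PREFIX_RE.match(name)
        if m is None:
            others.append(name)
            continue
        prefix = m.group(1)
        # zero-padded timestamps compare correctly as plain strings
        if prefix not in latest or name > latest[prefix]:
            latest[prefix] = name
    return list(latest.values()) + others

my_list = ['l.txt', 'PPT_6_202008062343HLC.txt', 'PPT_6_202008070522HLC.txt',
           'PPT_12_202008062343HLC.txt', 'PPT_12_202008070522HLC.txt']
print(latest_per_prefix(my_list))
# ['PPT_6_202008070522HLC.txt', 'PPT_12_202008070522HLC.txt', 'l.txt']
```

The prefixes come out in the order they first appear in the input, because dicts preserve insertion order in Python 3.7+.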
Upvotes: 1
Reputation: 6474
Since the file names already have a date order, you could simply sort on them. Then group by the prefix (PPT_6 and PPT_12). Finally, take the first item from each group.
from itertools import groupby

# get prefix up to the nth underscore
def split_nth(text, n):
    grp = text.split('_')
    return '_'.join(grp[:n])

my_list = ['l.txt', 'PPT_6_202008062343HLC.txt', 'PPT_6_202008070522HLC.txt',
           'PPT_12_202008062343HLC.txt', 'PPT_12_202008070522HLC.txt']
# skip 'l.txt' (first item); reverse order puts the latest file first in each group
sorted_list = sorted(my_list[1:], reverse=True)
groups = groupby(sorted_list, key=lambda x: split_nth(x, 2))
result = [next(v) for _, v in groups]
result.append(my_list[0])
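The snippet above relies on the ungrouped file being the first item of the list. If that can't be guaranteed, one possible variation is to separate the names by whether they have a prefix at all. This is a sketch under my own assumptions: `prefix_of` and `latest_by_group` are made-up helper names, and "has a prefix" is taken to mean "contains more than two underscore-separated fields".

```python
from itertools import groupby

def prefix_of(name, n=2):
    """Return the first n underscore-separated fields, or None if there are too few."""
    parts = name.split('_')
    return '_'.join(parts[:n]) if len(parts) > n else None

def latest_by_group(names):
    grouped = [x for x in names if prefix_of(x) is not None]
    others = [x for x in names if prefix_of(x) is None]
    # groupby only merges adjacent items, so sort by prefix first
    grouped.sort(key=prefix_of)
    # within a group, the alphabetically largest name has the latest timestamp
    result = [max(group) for _, group in groupby(grouped, key=prefix_of)]
    return result + others

my_list = ['l.txt', 'PPT_6_202008062343HLC.txt', 'PPT_6_202008070522HLC.txt',
           'PPT_12_202008062343HLC.txt', 'PPT_12_202008070522HLC.txt']
print(latest_by_group(my_list))
```

Note that the groups come out sorted by prefix as strings, so PPT_12 is listed before PPT_6 here.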
Upvotes: 1