Reputation: 159
I have a list of strings like this:
['text_1.jpg', 'othertext_1.jpg', 'text_2.jpg', 'othertext_2.jpg', ...]
In reality there are more than two entries per number, but this is the general format. I would like to split this list into a list of lists like this:
[['text_1.jpg', 'othertext_1.jpg'], ['text_2.jpg', 'othertext_2.jpg'], ...]
The sub-lists are grouped by the integer after the underscore. My current method is to first sort the list by that number, as in the first sample above, and then iterate through it, starting a new sub-list whenever the integer differs from the previous entry's.
Is there a simpler, more Pythonic way to do this?
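For reference, the sort-then-iterate approach described above might look roughly like the sketch below. The helper names `number_of` and `split_by_number` are made up for illustration, and the regex assumes every name ends in `_<digits>.jpg`.

```python
import re

def number_of(name):
    # Hypothetical helper: the integer between the last underscore and ".jpg".
    return int(re.search(r"_(\d+)\.jpg$", name).group(1))

def split_by_number(names):
    groups = []
    previous = None
    # Sort numerically first so equal numbers end up next to each other.
    for name in sorted(names, key=number_of):
        current = number_of(name)
        if current != previous:
            # The number changed, so start a new sub-list.
            groups.append([])
            previous = current
        groups[-1].append(name)
    return groups

names = ['text_2.jpg', 'othertext_1.jpg', 'text_1.jpg', 'othertext_2.jpg']
grouped = split_by_number(names)
print(grouped)
```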
Upvotes: 1
Views: 226
Reputation: 4335
A similar solution to @Andrej's:
import itertools
import re

def find_number(s):
    # The re module caches compiled patterns, so precompiling is optional;
    # feel free to call re.compile first.
    return re.search(r'_(\d+)\.jpg', s).group(1)

# groupby only groups consecutive items, so the list must already be
# ordered by number, as it is here.
l = ['text_1.jpg', 'othertext_1.jpg', 'text_2.jpg', 'othertext_2.jpg']
res = [list(v) for k, v in itertools.groupby(l, find_number)]
print(res)
# [['text_1.jpg', 'othertext_1.jpg'], ['text_2.jpg', 'othertext_2.jpg']]
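One caveat: `itertools.groupby` only merges runs of adjacent items with the same key, so if the input is not already ordered by number it needs sorting first. A minimal sketch, assuming the same `_<digits>.jpg` naming; `file_number` is a hypothetical variant of the key function that returns an `int` so that 10 sorts after 2:

```python
import itertools
import re

def file_number(name):
    # Return an int so the sort is numeric rather than lexicographic.
    return int(re.search(r"_(\d+)\.jpg$", name).group(1))

unsorted_names = ['text_2.jpg', 'text_10.jpg', 'othertext_1.jpg',
                  'text_1.jpg', 'othertext_2.jpg']

# Sort with the same key that groupby will use.
ordered = sorted(unsorted_names, key=file_number)
res_sorted = [list(group)
              for _, group in itertools.groupby(ordered, key=file_number)]
print(res_sorted)
```

Because `sorted` is stable, entries that share a number keep their original relative order inside each sub-list.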
Upvotes: 1
Reputation: 195543
Try:
import re
lst = ["text_1.jpg", "othertext_1.jpg", "text_2.jpg", "othertext_2.jpg"]
r = re.compile(r"_(\d+)\.jpg")
out = {}
for val in lst:
    num = r.search(val).group(1)
    out.setdefault(num, []).append(val)
print(list(out.values()))
Prints:
[['text_1.jpg', 'othertext_1.jpg'], ['text_2.jpg', 'othertext_2.jpg']]
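The dictionary approach does not need sorted input, and the groups come out in the order each number is first seen. If numeric order matters, or some names might not match the pattern, a variant along these lines could work. This is only a sketch: `collections.defaultdict` replaces `setdefault`, and skipping non-matching names is an assumption, not something the question specifies.

```python
import re
from collections import defaultdict

pattern = re.compile(r"_(\d+)\.jpg$")
files = ["text_10.jpg", "text_2.jpg", "othertext_10.jpg",
         "notes.txt", "othertext_2.jpg"]

groups = defaultdict(list)
for name in files:
    match = pattern.search(name)
    if match is None:
        continue  # assumption: ignore names that do not fit the pattern
    # Use int keys so that 2 sorts before 10.
    groups[int(match.group(1))].append(name)

# Emit the sub-lists in ascending numeric order.
by_number = [groups[n] for n in sorted(groups)]
print(by_number)
```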
Upvotes: 2