Reputation: 110572
I have the following list of titles:
titles = ['Saw (US)', 'Saw (AU)', 'Dear Sally (SE)']
How would I get the following:
titles = ['Saw (US)', 'Dear Sally (SE)']
Basically, I need to remove the duplicate titles. It doesn't matter which territory shows, as long as there is one (i.e., I can remove any duplicate).
Here is what I have tried, unsuccessfully:
[title for title in localized_titles if title.split(' (')[0] not in localized_titles]
Upvotes: 2
Views: 207
Reputation: 5812
If that is really the exact format of your titles, compare each title's stem against the stems that come before it in the list:
generic_titles = [t.split(' (')[0] for t in titles]
titles = [t for i, t in enumerate(titles) if generic_titles[i] not in generic_titles[:i]]
But this all breaks if the titles themselves contain other parentheses.
Upvotes: 1
Reputation: 133764
>>> from collections import OrderedDict
>>> titles = ['Saw (US)', 'Saw (AU)', 'Dear Sally (SE)']
>>> list(OrderedDict((t.rpartition(' (')[0], t) for t in titles).values())
['Saw (AU)', 'Dear Sally (SE)']
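Note that this keeps the last territory seen for each title. If the first one is preferred, a minimal sketch along the same lines could use setdefault. It assumes Python 3.7+, where a plain dict preserves insertion order, and the name first_seen is only chosen for illustration:

```python
titles = ['Saw (US)', 'Saw (AU)', 'Dear Sally (SE)']

first_seen = {}
for t in titles:
    # rpartition splits at the last ' (' so the stem is everything before the territory
    stem = t.rpartition(' (')[0]
    # setdefault stores the title only if this stem has not been stored yet
    first_seen.setdefault(stem, t)

unique_titles = list(first_seen.values())
print(unique_titles)  # ['Saw (US)', 'Dear Sally (SE)']
```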
Upvotes: 1
Reputation: 5830
Fast, and preserves order:
seen = set()
[title for title in titles
if title.split(' (')[0] not in seen and not seen.add(title.split(' (')[0])]
Upvotes: 0
Reputation: 7329
For the sake of code golf:
titles = ['('.join(x) for x in dict([x.split('(') for x in titles]).items()]
Assumes only one ( character per title, at the start of the country code.
Upvotes: 0
Reputation: 110572
Here's a roundabout way of getting there:
localized_titles, existing_stems = [], []
for item in localized:
    stem = item.split(' (')[0]
    if stem not in existing_stems:
        existing_stems.append(stem)
        localized_titles.append(item)
Upvotes: 1
Reputation: 37364
I'm not sure this is the most elegant solution, but it should work - you can use your non-territory version of the title as a dict key.
unique_titles = dict((title.rsplit(' (', 1)[0], title) for title in titles)
Or if you need to preserve order, an OrderedDict.
unique_titles.values() would be the titles including territories (one per title).
This uses the optional second argument to rsplit to limit it to at most one split, and rsplit rather than split so that the search for parentheses starts from the end of the string rather than the beginning.
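To illustrate why the direction of the split matters, here is a small sketch. The title 'Saw (Uncut) (US)' is a made-up example with an extra pair of parentheses, not one from the question:

```python
title = 'Saw (Uncut) (US)'  # hypothetical title containing extra parentheses

# split works from the left, so taking [0] cuts at the first ' ('
left_stem = title.split(' (')[0]
# rsplit with a limit of 1 cuts only at the last ' ('
right_stem = title.rsplit(' (', 1)[0]
print(left_stem)   # Saw
print(right_stem)  # Saw (Uncut)

titles = ['Saw (Uncut) (US)', 'Saw (Uncut) (AU)', 'Saw (US)']
unique_titles = dict((t.rsplit(' (', 1)[0], t) for t in titles)
print(sorted(unique_titles))  # ['Saw', 'Saw (Uncut)']
```

With a plain split, both 'Saw (Uncut)' entries would collapse into the same key as 'Saw (US)'.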
Upvotes: 2
Reputation: 1816
Try using a dictionary to keep track of which items in the array you have already seen. Let the key in the dictionary be the value in the array, and the value in the dictionary be true or false depending on whether that item has been seen yet.
You can then iterate through the array, adding each new item to the dictionary and removing items from the array if they already exist in the dictionary. It's how I do it, but I'm still learning.
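A rough sketch of that idea in Python, under two assumptions of mine: the key is the title with its territory stripped off, and the kept titles go into a new list (result is a name I picked) rather than being removed from the original while looping over it, since deleting from a list during iteration tends to skip items:

```python
titles = ['Saw (US)', 'Saw (AU)', 'Dear Sally (SE)']

seen = {}    # stem -> True once that title has been encountered
result = []  # new list, so the original is not mutated during iteration
for title in titles:
    stem = title.rsplit(' (', 1)[0]  # drop the trailing ' (XX)' territory
    if not seen.get(stem, False):
        seen[stem] = True
        result.append(title)

print(result)  # ['Saw (US)', 'Dear Sally (SE)']
```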
Upvotes: 0