Reputation: 817
For a test program I'm crawling a webpage. I'd like to crawl all activities for specific IDs, each of which is associated with a city.
For example, my initial code:
RegionIDArray = {522: "London", 4745: "London", 2718: "London", 3487: "Tokio"}
I'm now wondering if it's possible to combine all IDs that belong to the same city, e.g. London, into one key:
RegionIDArray = {522, 4745, 2718: "London}
When I try this, I get no results.
My full code so far:
import requests
from bs4 import BeautifulSoup

RegionIDArray = {522: "London", 4745: "London", 2718: "London", 3487: "Tokio"}
already_printed = set()  # POIs printed so far, used to skip duplicates

for reg in RegionIDArray:
    # Fetch the page for this region ID and parse it
    r = requests.get("https://www.getyourguide.de/-l" + str(reg) + "/")
    soup = BeautifulSoup(r.content, "lxml")
    g_data = soup.find_all("span", {"class": "intro-title"})
    for item in g_data:
        POI_final = str(item.text)
        end_final = "POI: " + POI_final
        if end_final not in already_printed:
            print(end_final)
            already_printed.add(end_final)
Is there a smart way to do this? I'd appreciate any feedback.
Upvotes: 1
Views: 89
Reputation: 71471
You can use itertools.groupby:
import itertools
RegionIDArray = {522: "London", 4745: "London", 2718: "London", 3487: "Tokio"}
new_results = {tuple(c for c, _ in b):a for a, b in itertools.groupby(sorted(RegionIDArray.items(), key=lambda x:x[-1]), key=lambda x:x[-1])}
Output:
{(3487,): 'Tokio', (4745, 522, 2718): 'London'}
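One detail worth spelling out: itertools.groupby only merges neighbouring items, which is why the items are sorted by city first. The sketch below illustrates the difference. The helper name group_ids is made up for this example, the dictionary entries are deliberately reordered so that the London IDs are not adjacent, and it assumes Python 3.7+ where dicts keep insertion order.

```python
import itertools

# Same data as above, reordered so the London IDs are not all adjacent
RegionIDArray = {522: "London", 3487: "Tokio", 4745: "London", 2718: "London"}

def group_ids(items):
    # Hypothetical helper: groupby starts a new group each time the key
    # (here the city name) changes between neighbouring items
    return [(city, [rid for rid, _ in grp])
            for city, grp in itertools.groupby(items, key=lambda kv: kv[1])]

# Without sorting, London is split into two separate groups
unsorted_groups = group_ids(RegionIDArray.items())

# Sorting by city first puts equal cities next to each other
sorted_groups = group_ids(sorted(RegionIDArray.items(), key=lambda kv: kv[1]))

print(unsorted_groups)
print(sorted_groups)
```

If the sort is skipped and the result is fed into a dict comprehension keyed by city, later groups would silently overwrite earlier ones, so the sort is not optional here.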
Upvotes: 0
Reputation: 164783
You can do this in 2 steps: first group the IDs by city, then turn each group of IDs into a single hashable key that maps to its city.
The first step is optimally processed via collections.defaultdict.
For the second step, you can use either tuple or frozenset. I opt for the latter since it is not clear that ordering is relevant.
from collections import defaultdict
RegionIDArray = {522: "London", 4745: "London", 2718: "London", 3487: "Tokio"}
d = defaultdict(list)
for k, v in RegionIDArray.items():
    d[v].append(k)
res = {frozenset(v): k for k, v in d.items()}
print(res)
{frozenset({522, 2718, 4745}): 'London',
frozenset({3487}): 'Tokio'}
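For the crawling loop in the question, the grouped dictionary could then be iterated once per city. This is only a sketch: the helper urls_by_city is a made-up name, it builds the URLs using the pattern from the question without sending any requests, and it sorts the IDs only to make the order predictable.

```python
from collections import defaultdict

RegionIDArray = {522: "London", 4745: "London", 2718: "London", 3487: "Tokio"}

# Step 1: group IDs by city
d = defaultdict(list)
for k, v in RegionIDArray.items():
    d[v].append(k)

# Step 2: use each group of IDs as a hashable key
res = {frozenset(v): k for k, v in d.items()}

def urls_by_city(grouped):
    # Hypothetical helper: maps each city to the URLs of all its IDs.
    # No HTTP requests are made here; the strings are only assembled.
    out = {}
    for ids, city in grouped.items():
        out[city] = ["https://www.getyourguide.de/-l" + str(i) + "/"
                     for i in sorted(ids)]
    return out

urls = urls_by_city(res)
for city, city_urls in urls.items():
    print(city, city_urls)
```

If the goal is just to loop per city, the intermediate dictionary d (city to list of IDs) may already be enough, and the frozenset step can be skipped.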
Upvotes: 2
Reputation: 5425
What you can do is build a reverse lookup table from each value to all the keys that map to it, like so:
def reverse(ids):
    table = {}
    for key in ids:
        if ids[key] not in table:
            table[ids[key]] = []
        table[ids[key]].append(key)
    return table
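Applied to the dictionary from the question, this might look as follows. The variant below uses dict.setdefault instead of the explicit membership check, which should behave the same way; the ID order shown in the comment assumes Python 3.7+, where dicts keep insertion order.

```python
def reverse(ids):
    table = {}
    for key, value in ids.items():
        # setdefault inserts an empty list the first time a value is seen,
        # then returns the list so the key can be appended to it
        table.setdefault(value, []).append(key)
    return table

RegionIDArray = {522: "London", 4745: "London", 2718: "London", 3487: "Tokio"}

table = reverse(RegionIDArray)
print(table["London"])  # [522, 4745, 2718]
print(table["Tokio"])   # [3487]
```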
Upvotes: -1