Reputation: 135
I am scraping Wikipedia and a disambiguation error comes up because there are multiple articles with the same title. How do I go through all of them and pull them? Additional question: how do I skip them?
import wikipedia

text = ['child', 'pca', 'united states']
df = []
for x in text:
    # raises DisambiguationError when the title matches several articles
    wiki = wikipedia.page(x)
    df.append(wiki.content)
Multiple results come up for some of the titles and the call errors out. Any ideas? I am thinking of a try: except: else: block.
Upvotes: 0
Views: 104
Reputation: 54708
The disambiguation notes have a very specific format. It should be easy for you to find them and extract the links they contain. Indeed, the disambiguation links themselves have a unique class that you can search for.
As to whether you pull them or skip them, that's entirely up to you, depending on your need.
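With the `wikipedia` package from the question, both choices can probably be handled with the try/except you were already considering. Below is a minimal sketch, not a definitive implementation. The helper name `collect_contents` and its parameters are hypothetical. It assumes the exception raised for an ambiguous title carries an `options` list of candidate titles, as `wikipedia.exceptions.DisambiguationError` does. The page getter and the exception class are passed in as arguments so the loop can be tried without network access.

```python
def collect_contents(titles, get_page, ambiguous_exc, follow_options=False):
    """Return {title: content}, skipping or expanding ambiguous titles.

    get_page       -- callable returning an object with a .content attribute
                      (assumption: wikipedia.page in real use)
    ambiguous_exc  -- exception class raised for ambiguous titles, expected
                      to expose an .options list of candidate titles
    follow_options -- False: skip ambiguous titles; True: fetch each option
    """
    contents = {}
    for title in titles:
        try:
            contents[title] = get_page(title).content
        except ambiguous_exc as err:
            if not follow_options:
                continue  # skip the ambiguous title entirely
            for option in err.options:
                try:
                    contents[option] = get_page(option).content
                except Exception:
                    # An option may itself be ambiguous or missing;
                    # this sketch simply skips it (best effort).
                    continue
    return contents
```

In real use the call would look something like `collect_contents(text, wikipedia.page, wikipedia.exceptions.DisambiguationError, follow_options=True)`. Leave `follow_options` at its default to skip the ambiguous titles instead.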
Upvotes: 1