Arica Christensen

Reputation: 135

Wikipedia Scraping ambiguous results

I am scraping Wikipedia and a disambiguation error comes up because there are multiple articles with the same title. How do I go through all of them and pull them? Additional question: how do I skip them?

 import wikipedia

 # Titles to look up; some of these are ambiguous on Wikipedia
 text = ['child', 'pca', 'united states']

 df = []

 for x in text:
     wiki = wikipedia.page(x)
     df.append(wiki.content)

Multiple results come up for some of the titles and the call errors out. Any ideas? I am thinking of a try/except/else.

Upvotes: 0

Views: 104

Answers (1)

Tim Roberts

Reputation: 54708

The disambiguation notes have a very specific format. It should be easy for you to find them and extract the links they contain. Indeed, the disambiguation links themselves have a unique class that you can search for.

As to whether you pull them or skip them, that's entirely up to you, depending on your need.
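If you would rather stay inside the `wikipedia` package from the question than parse the HTML yourself, here is a minimal sketch of both choices. It relies on `wikipedia.exceptions.DisambiguationError`, whose `options` attribute lists the candidate titles. The function name `fetch_contents` and its `follow_options` and `wiki` parameters are made up for illustration (the `wiki` parameter only exists so a stand-in module can be passed in for testing), and `auto_suggest=False` is an optional choice that stops the library from silently swapping in a different title.

```python
def fetch_contents(titles, follow_options=True, wiki=None):
    """Return {title: page text}, pulling or skipping ambiguous titles.

    Hypothetical helper, not part of the wikipedia package.
    """
    if wiki is None:
        import wikipedia as wiki  # third-party package: pip install wikipedia

    contents = {}
    for title in titles:
        try:
            contents[title] = wiki.page(title, auto_suggest=False).content
        except wiki.exceptions.DisambiguationError as err:
            if not follow_options:
                continue  # skip ambiguous titles entirely
            # err.options holds the titles listed on the disambiguation page
            for option in err.options:
                try:
                    page = wiki.page(option, auto_suggest=False)
                    contents[option] = page.content
                except (wiki.exceptions.DisambiguationError,
                        wiki.exceptions.PageError):
                    continue  # option is itself ambiguous or missing
        except wiki.exceptions.PageError:
            continue  # no article with this title
    return contents
```

Calling `fetch_contents(['child', 'pca', 'united states'])` pulls every candidate article for an ambiguous title, while `follow_options=False` skips those titles. Pulling every option can mean many requests for a single title, so you may want to cap how many options you follow.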

Upvotes: 1
