Vinay Sharma

Reputation: 381

How To Extract URLs From a Hindi Newspaper Amar Ujala

I am trying to extract the URLs of the articles listed on a Hindi newspaper site called Amar Ujala. Link: https://www.amarujala.com/india-news?src=mainmenu

In the 'Network' section of Dev Tools, it seems the required API is 'https://electionresultapi.amarujala.com/get-article-reactions', and the required URL appears under its 'Payload' section.

I tried the following code to fetch the URL:

import requests
import json

# Assuming the API expects a bearer token, replace 'Your_Auth' with your actual token
headers = {
    'Authorization': 'Bearer Your_Auth'
}

api_response = requests.post('https://electionresultapi.amarujala.com/get-article-reactions', headers=headers)
print(api_response.status_code)

data = api_response.text
parse_json = json.loads(data)
print(parse_json)

But, as expected, it only returns what is shown in the 'Response' section of Dev Tools. How can I extract the article URLs from the page linked above?

Upvotes: 0

Views: 97

Answers (1)

lib

Reputation: 31

What you have figured out so far is only part of the picture.

Vinay, I have analyzed the website a bit. As you may have noticed, the left side of the page consists of cards that contain the news items.

Every card is basically a section that wraps 5 divs. We are interested in the div with class='image_description'; there you will find an anchor tag that holds the URL you require.
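As a rough illustration of that idea, here is a minimal sketch that collects the href of every anchor nested inside a div with class 'image_description'. It uses only the standard library. The names CardLinkParser, extract_article_urls and fetch_html are made up for this example, and the browser-like User-Agent header is an assumption. If the cards are injected by JavaScript after the page loads, the raw HTML may not contain them and this approach will return nothing.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen

PAGE_URL = "https://www.amarujala.com/india-news?src=mainmenu"


class CardLinkParser(HTMLParser):
    """Collect hrefs of <a> tags nested inside <div class="image_description">."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.depth = 0  # div nesting depth inside a matching div; 0 means outside
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            classes = (attrs.get("class") or "").split()
            # Enter a matching div, or go one level deeper inside one.
            if self.depth or "image_description" in classes:
                self.depth += 1
        elif tag == "a" and self.depth and attrs.get("href"):
            # Resolve relative links against the page URL and skip duplicates.
            url = urljoin(self.base_url, attrs["href"])
            if url not in self.urls:
                self.urls.append(url)

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1


def extract_article_urls(html, base_url=PAGE_URL):
    """Return the article URLs found in an HTML string, in page order."""
    parser = CardLinkParser(base_url)
    parser.feed(html)
    return parser.urls


def fetch_html(url=PAGE_URL):
    """Download the page. The User-Agent value is an assumption, not a requirement."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")

# Usage (needs network access):
#     for link in extract_article_urls(fetch_html()):
#         print(link)
```

Keeping the parsing separate from the download makes it easy to try the extraction on HTML saved from the browser first, before pointing it at the live site.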

A little more info on what you have figured out: the URL string I mentioned above is what gets passed to the HTTP endpoint you mentioned, https://electionresultapi.amarujala.com/get-article-reactions. It is sent to that endpoint as input with the POST method.
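To make that relationship concrete, here is a hedged sketch of how such a POST request could be built. The payload key "url" and the JSON encoding are assumptions, since the real shape of the request is only visible in the 'Payload' tab of Dev Tools; copy the actual field name and content type from there. build_reactions_request is a hypothetical helper name.

```python
import json
from urllib.request import Request

REACTIONS_ENDPOINT = "https://electionresultapi.amarujala.com/get-article-reactions"


def build_reactions_request(article_url, field_name="url"):
    """Build (but do not send) a POST request carrying the article URL.

    field_name is hypothetical: replace it with the key shown in the
    'Payload' tab. A JSON body is also an assumption; the site may
    expect form-encoded data instead.
    """
    body = json.dumps({field_name: article_url}).encode("utf-8")
    return Request(
        REACTIONS_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The request can then be sent with urllib.request.urlopen. Note that this endpoint takes an article URL as input, so it cannot be used to discover the URLs in the first place.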

Hope this helps. In case of further queries, I am happy to help.

Upvotes: 0
