Satyam Singh

Reputation: 21

How do I scrape data with Python from a webpage that requires input?

I need to scrape data from a website (link) and store it in a database or similar. The page requires input: I have to enter the vehicle number and submit it, and after submitting I have to confirm that the vehicle shown is my car.

After that, I need to extract the red, yellow and green text.

Can anyone help me, please? It's important to me.

I expect to scrape the red, yellow and green data.

Upvotes: 0

Views: 696

Answers (1)

Andrej Kesely

Reputation: 195573

Try:

import requests
from bs4 import BeautifulSoup


num = "LE02WVX"  # vehicle registration number to look up

# A session persists cookies across the requests below.
with requests.session() as s:
    soup = BeautifulSoup(
        s.get("https://vehicleenquiry.service.gov.uk/").content, "html.parser"
    )

    # Read the hidden authenticity token from the second form on the page.
    authenticity_token = soup.select("form")[1].select_one(
        '[name="authenticity_token"]'
    )["value"]

    payload = {
        "utf8": "✓",
        "authenticity_token": authenticity_token,
        "wizard_vehicle_enquiry_capture_vrn[vrn]": num,
    }

    soup = BeautifulSoup(
        s.post(
            "https://vehicleenquiry.service.gov.uk/?locale=en",
            data=payload,
        ).content,
        "html.parser",
    )

    # The confirmation page carries a fresh token; read it again.
    authenticity_token = soup.select("form")[1].select_one(
        '[name="authenticity_token"]'
    )["value"]

    payload = {
        "utf8": "✓",
        "authenticity_token": authenticity_token,
        "wizard_vehicle_enquiry_capture_confirm_vehicle[confirmed]": "Yes",
    }

    soup = BeautifulSoup(
        s.post(
            "https://vehicleenquiry.service.gov.uk/ConfirmVehicle?locale=en",
            data=payload,
        ).content,
        "html.parser",
    )

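    # The yellow text is the main heading; the red and green texts are
    # the next two <h2> headings after it.
    yellow = soup.select_one("main h1")
    red = yellow.find_next("h2")
    green = red.find_next("h2")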

    print(yellow.get_text(strip=True))
    print(red.get_text(strip=True), red.find_next("div").get_text(strip=True))
    print(
        green.get_text(strip=True), green.find_next("div").get_text(strip=True)
    )

Prints:

LE02 WVX
✗Untaxed Tax due:1 July 2022
✓MOT Expires:3 April 2023
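
The question also asks about storing the result in a database. Below is a minimal sketch using Python's built-in sqlite3 module. The function name save_result, the table name vehicle_status and its columns are assumptions made for illustration and are not part of the code above. It takes the three scraped strings as plain text.

```python
import sqlite3


def save_result(conn, vrn, tax_status, mot_status):
    """Insert or update one scraped result, keyed by registration number.

    The table and column names here are hypothetical.
    """
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS vehicle_status (
            vrn TEXT PRIMARY KEY,
            tax_status TEXT,
            mot_status TEXT
        )
        """
    )
    # INSERT OR REPLACE overwrites any earlier row with the same vrn.
    conn.execute(
        "INSERT OR REPLACE INTO vehicle_status (vrn, tax_status, mot_status) "
        "VALUES (?, ?, ?)",
        (vrn, tax_status, mot_status),
    )
    conn.commit()
```

For example, after the scrape you could open a connection with sqlite3.connect("vehicles.db") (the file name is an assumption) and call save_result(conn, yellow.get_text(strip=True), red.get_text(strip=True), green.get_text(strip=True)).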

Upvotes: 2
