Reputation: 35
I've been using BeautifulSoup to scrape links from the TopCashBack website for a few years, but when I change the URL to a Screwfix link, I get no data back.
import requests
from bs4 import BeautifulSoup
s = requests.get("https://www.screwfix.com/p/128hf")
soup = BeautifulSoup(s.text, 'lxml')
print(soup)
My specific question is this: Am I receiving empty data because the Screwfix website is detecting and preventing scraping, or because their website needs a different parser to be specified other than 'lxml'?
Upvotes: 1
Views: 73
Reputation: 3056
You just need to pass a User-Agent header when making the request, and it works. The same 'lxml' parser is used below, so the parser is not the problem.
import requests
from bs4 import BeautifulSoup
# Browser-like User-Agent; the default python-requests one appears to be rejected by the site.
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36"}
s = requests.get("https://www.screwfix.com/p/128hf", headers=headers)
soup = BeautifulSoup(s.text, 'lxml')
print(soup.prettify())
Output:
<!DOCTYPE html>
<html lang="en-GB" prefix="og: https://ogp.me/ns#">
<head>
<link href="//tags.tiqcdn.com" rel="dns-prefetch"/>
<link href="//media.screwfix.com" rel="dns-prefetch"/>
<meta charset="utf-8"/>
<script type="text/javascript">
/*
.......
.......
}
</style>
<link href="https://media.screwfix.com/" rel="dns-prefetch"/>
<link crossorigin="anonymous" href="https://media.screwfix.com/" rel="preconnect"/>
<script async="" src="https://tags.tiqcdn.com/utag/kingfisher/screwfix-fusionx/prod/utag.js">
</script>
<meta content="telephone=no" name="format-detection"/>
<link href="/favicon.ico" rel="icon"/>
<meta content="Product Details Page" name="page-name"/>
<meta content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" name="viewport"/>
<title>
Evolution R255SMS-DB 255mm Electric Double-Bevel Sliding Multi-Material Mitre Saw 220-240V - Screwfix
</title>
<link href="https://www.screwfix.com/p/evolution-r255sms-db-255mm-electric-double-bevel-sliding-multi-material-mitre-saw-220-240v/128hf" rel="canonical"/>
<meta content="https://www.screwfix.com/p/evolution-r255sms-db-255mm-electric-double-bevel-sliding-multi-material-mitre-saw-220-240v/128hf" property="og:url"/>
<meta content="Screwfix.com" property="og:site_name"/>
<meta content="product" property="og:type"/>
<meta content="Evolution R255SMS-DB 255mm Electric Double-Bevel Sliding Multi-Material Mitre Saw 220-240V - Screwfix" property="og:title"/>
<meta content="Order online at Screwfix.com. Uses a single blade to cut mild steel, non-ferrous metals, plastic and wood, even if nails are embedded in the material. Provides clean and precise cuts no matter the material. Bevels to 45° in both directions and offers a maximum cross cut of 300 x 80mm both ways. Integrated laser cutting guide and positive bevel stops provide accuracy with every cut. Durable die-cast aluminium base supports a variety of materials. Features powerful 2000W motor and ergonomic over-moulded, in-line handle. On-board tool storage for convenient storage of the blade change hex key. FREE next day delivery available, free collection in 1 minute." property="og:description"/>
<meta content="221855807852136" property="fb:app_id"/>
<meta content="https://media.screwfix.com/is/image/ae235/128HF_P" property="og:image"/>
<meta content="Order Evolution R255SMS-DB 255mm Electric Double-Bevel Sliding Multi-Material Mitre Saw 220-240V at Screwfix.com. Screwfix customers rate this product 4.7/5. FREE next day delivery available, free collection in 1 minute." name="description"/>
<script data-qaid="seo-properties" type="application/ld+json">
{"@context":"https://schema.org/","@type":"Product","@id":"https://www.screwfix.com/p/evolution-r255sms-db-255mm-electric-double-bevel-sliding-multi-material-mitre-saw-220-240v/128hf","image":["https://media.screwfix.com/is/image/ae235/128HF_P","https://media.screwfix.com/is/image/ae235/128HF_A1","https://media.screwfix.com/is/image/ae235/128HF_A2","https://media.screwfix.com/is/image/ae235/128HF_A3"],"name":"Evolution R255SMS-DB 255mm Electric Double-Bevel Sliding Multi-Material Mitre Saw 220-240V","url":"https://www.screwfix.com/p/evolution-r255sms-db-255mm-electric-double-bevel-sliding-multi-material-mitre-saw-220-240v/128hf","description":"Uses a single blade to cut mild steel, non-ferrous metals, plastic and wood, even if nails are embedded in the material. Provides clean and precise cuts no matter the material. Bevels to 45° in both directions and offers a maximum cross cut of 300 x 80mm both ways. Integrated laser cutting guide and positive bevel stops provide accuracy with every cut. Durable die-cast aluminium base supports a variety of materials. Features powerful 2000W motor and ergonomic over-moulded, in-line handle. 
On-board tool storage for convenient storage of the blade change hex key.","sku":"128HF","brand":{"@type":"brand","name":"Evolution"},"offers":[{"@type":"Offer","url":"https://www.screwfix.com/p/evolution-r255sms-db-255mm-electric-double-bevel-sliding-multi-material-mitre-saw-220-240v/128hf","itemCondition":"https://schema.org/NewCondition","price":199.99,"priceCurrency":"GBP","potentialAction":{"@type":"Action","url":"https://schema.org/BuyAction"},"availableDeliveryMethod":"https://schema.org/OnSitePickup","availability":"https://schema.org/InStock"},{"@type":"Offer","url":"https://www.screwfix.com/p/evolution-r255sms-db-255mm-electric-double-bevel-sliding-multi-material-mitre-saw-220-240v/128hf","itemCondition":"https://schema.org/NewCondition","price":199.99,"priceCurrency":"GBP","potentialAction":{"@type":"Action","url":"https://schema.org/BuyAction"},"availableDeliveryMethod":"https://schema.org/ParcelService","availability":"https://schema.org/InStock"}]}
</script>
<script type="application/ld+json">
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https://www.screwfix.com/"},{"@type":"ListItem","position":2,"name":"Tools","item":"https://www.screwfix.com/c/tools/cat830034"},{"@type":"ListItem","position":3,"name":"Power Tools","item":"https://www.screwfix.com/c/tools/power-tools/cat830692"},{"@type":"ListItem","position":4,"name":"Saws","item":"https://www.screwfix.com/c/tools/saws/cat830716"},{"@type":"ListItem","position":5,"name":"Mitre Saws","item":"https://www.screwfix.com/c/tools/mitre-saws/cat830858"}]}
</script>
<meta content="26" name="next-head-count"/>
<script type="text/javascript">
/* Polyfill service v4.5.0
* Disable minification (remove `.min` from URL path) for more info */
</script>
<script async="" src="/fusionx/srvinit.js">
</script>
<link as="style" href="/_next/static/css/a67c635984c510c9.css" rel="preload"/>
<link data-n-g="" href="/_next/static/css/a67c635984c510c9.css" rel="stylesheet"/>
<link as="style" href="/_next/static/css/740fd7f34264ccbf.css" rel="preload"/>
<link data-n-p="" href="/_next/static/css/740fd7f34264ccbf.css" rel="stylesheet"/>
<link as="style" href="/_next/static/css/1e3ede1424ebbe1e.css" rel="preload"/>
<link data-n-p="" href="/_next/static/css/1e3ede1424ebbe1e.css" rel="stylesheet"/>
<link as="style" href="/_next/static/css/ffb741840bbe8528.css" rel="preload"/>
<link data-n-p="" href="/_next/static/css/ffb741840bbe8528.css" rel="stylesheet"/>
<noscript data-n-css="">
</noscript>
<script defer="" nomodule="" src="/_next/static/chunks/polyfills-78c92fac7aa8fdd8.js">
</script>
<script defer="" src="/_next/static/chunks/webpack-2c90c422d382d168.js">
</script>
<script defer="" src="/_next/static/chunks/framework-3299c5364e0ec6ff.js">
</script>
<script defer="" src="/_next/static/chunks/main-b4a7336d1ecee5f3.js">
</script>
<script defer="" src="/_next/static/chunks/pages/_app-c2b792dea6a1ff69.js">
</script>
<script defer="" src="/_next/static/chunks/3467-6975bab37d2da2eb.js">
</script>
<script defer="" src="/_next/static/chunks/8814-159c3e8774f70836.js">
</script>
<script defer="" src="/_next/static/chunks/8052-47a55f18f0e34837.js">
</script>
<script defer="" src="/_next/static/chunks/4839-2cca491ccf37c23b.js">
</script>
<script defer="" src="/_next/static/chunks/2429-658db6f90127981f.js">
</script>
<script defer="" src="/_next/static/chunks/9139-0b6116c7425866d9.js">
</script>
<script defer="" src="/_next/static/chunks/pages/p/%5B...id%5D-edf4957b337aff93.js">
</script>
<script defer="" src="/_next/static/4b0be1b1738b369fc073e17e157b925034c5bb88/_buildManifest.js">
</script>
<script defer="" src="/_next/static/4b0be1b1738b369fc073e17e157b925034c5bb88/_ssgManifest.js">
</script>
</head>
<body>
<div id="__next">
<div>
<div class="x_28qp">
<a href="#container-main">
Skip to content
</a>
</div>
<header class="hRCyfb">
<div class="rBy7NP">
</div>
<div class="_0R2uIG">
<div class="tYP_3K">
<div class="mnfaDp U_v2oc">
..
..
</body>
</html>
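Once the page comes back, the product details are probably easier to read from the `application/ld+json` script block visible in the output above than from the generated class names in the body. Here is a minimal sketch of that idea. The helper name `extract_product` and the trimmed-down sample HTML are hypothetical, made up for illustration. `strict=False` is passed to `json.loads` on the assumption that the description field can contain raw line breaks, as it appears to in the output above.

```python
import json
from bs4 import BeautifulSoup


def extract_product(html):
    """Return the first schema.org Product found in JSON-LD script tags, or None."""
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup.find_all("script", type="application/ld+json"):
        try:
            # strict=False tolerates raw control characters (e.g. newlines) inside strings
            data = json.loads(tag.string or "", strict=False)
        except json.JSONDecodeError:
            continue  # skip script blocks that are not valid JSON
        if isinstance(data, dict) and data.get("@type") == "Product":
            return data
    return None


# Hypothetical sample in the same shape as the real page, not actual site data.
sample_html = """
<html><head>
<script type="application/ld+json">
{"@type": "BreadcrumbList", "itemListElement": []}
</script>
<script type="application/ld+json">
{"@type": "Product", "name": "Example saw", "sku": "128HF",
 "description": "First line.
Second line.",
 "offers": [{"@type": "Offer", "price": 199.99, "priceCurrency": "GBP"}]}
</script>
</head><body></body></html>
"""

product = extract_product(sample_html)
print(product["name"], product["offers"][0]["price"])
```

With the request from the answer, you would call `extract_product(s.text)` instead of using the sample string. If it returns `None`, it is worth checking `s.status_code` first to see whether the request was refused again.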
Upvotes: 2