re.findall with requests doesn't match copied and pasted html (generated by requests.text)

Question

I'm trying to capture some elements from the HTML of a certain URL. When I copy and paste the HTML directly into my Python code, it works well.

import re

# Sample HTML content
html_content = """
<<>>
"""

# Regex pattern
pattern = r'{"order":\d+,"url":"(https:[^"]+\.webp)"}'

# Find matches
matches = re.findall(pattern, html_content)

# Print matches
for match in matches:
    print(match)

That works well. But when I try to do the same using requests.get directly, it doesn't work:

import re
import requests
url = "https://asuracomic.net/series/bloodhounds-regression-instinct-2d0edc16/chapter/59"
response = requests.get(url)
html_content = response.text

# Regex pattern
pattern = r'{"order":\d+,"url":"(https:[^"]+\.webp)"}'

# Find matches
matches = re.findall(pattern, html_content)

# Print matches
for match in matches:
    print(match)

Keep in mind that the HTML I'm copying and pasting was itself generated using requests.get, by saving response.text to a file:

with open('raw_html.html', 'w', encoding='utf-8') as f:
    f.write(html_content)
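
One possible explanation, offered as an assumption rather than something the question confirms: if the page embeds its JSON inside a script string, every quote in response.text is preceded by a backslash (\"). Pasting that text into a normal, non-raw Python string literal makes Python read each \" as a plain ", so the pasted copy no longer contains the same characters as response.text. The sketch below uses a made-up snippet and an example.com URL (both hypothetical) to show the difference, along with a variant of the pattern that tolerates an optional backslash before each quote.

```python
import re

# Hypothetical stand-in for response.text: JSON embedded in a script string,
# so each quote is preceded by a literal backslash (raw string keeps them).
raw_text = r'{\"order\":1,\"url\":\"https://example.com/a.webp\"}'

# The same characters pasted into a normal (non-raw) string literal:
# Python interprets each \" as a plain ", so the backslashes disappear.
pasted_text = '{\"order\":1,\"url\":\"https://example.com/a.webp\"}'

# Pattern from the question.
original = r'{"order":\d+,"url":"(https:[^"]+\.webp)"}'

# Variant that allows an optional backslash before each quote (an assumption
# about the page's escaping, not a confirmed fix).
tolerant = r'\{\\?"order\\?":\d+,\\?"url\\?":\\?"(https:[^"\\]+\.webp)\\?"\}'

print(re.findall(original, pasted_text))  # matches the pasted copy
print(re.findall(original, raw_text))     # no match on the escaped text
print(re.findall(tolerant, raw_text))     # matches the escaped text
print(re.findall(tolerant, pasted_text))  # still matches the pasted copy
```

To check whether this applies to the real page, printing repr() of a slice of response.text around the word "order" should show whether the backslashes are present.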
