Damian Kowalski

Reputation: 402

How to delete lines from a text until a keyword

I am requesting a Wikipedia page and printing all of the text it returns, like so:

import requests

def my_function(addr):
    # Fetch the page and print the raw HTML it returns
    response = requests.get(addr)
    print(response.text)

my_function("https://en.wikipedia.org/wiki/Web_scraping")

Right now I'm trying to delete the unwanted parts, namely all the text before the element with the id 'See_also'. Is there a clean and easy way to do so? I can't just delete a fixed number of lines, since this code is meant to work for different wiki pages.

Upvotes: 2

Views: 177

Answers (1)

Jakub Dóka

Reputation: 2625

You can use a regex (hooray).

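import requests
import re

def my_function(addr):
    response = requests.get(addr)
    # [\s\S]* matches every character, newlines included, up to the end of the page
    match = re.search(r"See_also[\s\S]*", response.text)
    # search() returns None when "See_also" does not appear in the page
    print(match.group(0) if match else "")

my_function("https://en.wikipedia.org/wiki/Web_scraping")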
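
A regex is not strictly required for this. Here is a minimal sketch using plain string search, assuming the heading appears in the HTML as the literal string id="See_also" (an assumption about the page markup, not something checked here). The function name keep_from_marker and the sample string are made up for illustration, so the example runs without a network request.

```python
def keep_from_marker(html: str, marker: str = 'id="See_also"') -> str:
    """Return html starting at the first occurrence of marker.

    If the marker is missing, the input is returned unchanged.
    """
    index = html.find(marker)  # -1 when the marker is not present
    if index == -1:
        return html
    return html[index:]


# Hypothetical stand-in for response.text
sample = '<p>intro</p>\n<h2 id="See_also">See also</h2>\n<ul><li>Link</li></ul>'
print(keep_from_marker(sample))
```

Searching for id="See_also" rather than the bare See_also is meant to avoid stopping early at a link to the section, such as one in a table of contents. In my_function you would pass response.text instead of sample.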

Upvotes: 2
