user14003529

Reputation:

How do I scrape a website and put data into a file?

I have an Apple Music link (as seen here), and I want to get all the song names and put them into a file.

This is what I have tried:

for i in soup.findAll('div', {'class':'song-name typography-body-tall'}):
    with open("playlist.txt", "w") as f:
         f.write(i)

But nothing is written to the file. Could I please get some help with this? Thanks in advance.

Upvotes: 0

Views: 199

Answers (2)

Moody

Reputation: 31

Besides Beautiful Soup, if you want to scrape a website's content in detail, one of the best libraries available is Scrapy. A simple way to extract content from a page with Scrapy is to use XPath selectors.

Here's the Scrapy documentation: https://docs.scrapy.org/en/latest/

XPath tutorial for scraping meta content with Scrapy: https://linuxhint.com/scrapy-with-xpath-selectors/
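To give a feel for what an XPath selector looks like, here is a minimal sketch. It does not use Scrapy itself (in Scrapy you would pass an XPath expression to response.xpath(...)); instead it uses the standard library's ElementTree, which only supports a small XPath subset and needs well-formed markup. The snippet and its class names are made up for illustration and are not Apple Music's real markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a page fragment (assumption, not real markup).
SNIPPET = """
<div>
  <div class="song-name">First Song</div>
  <div class="artist">Someone</div>
  <div class="song-name">Second Song</div>
</div>
""".strip()

root = ET.fromstring(SNIPPET)

# XPath-style query: every <div> below the root whose class attribute
# is exactly "song-name".
names = [el.text.strip() for el in root.findall(".//div[@class='song-name']")]
print(names)
```

The same expression shape (a tag name plus an attribute predicate) carries over to Scrapy, which accepts full XPath rather than this limited subset.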

Upvotes: 1

Insula

Reputation: 953

First, make sure you are actually fetching the page:

import requests
import sys
from bs4 import BeautifulSoup

response = requests.get("https://music.apple.com/gb/playlist/eminem-essentials/pl.9200aa618dc24867b2aa7f00466fd404")
response.raise_for_status()  # stop early if the page was not fetched successfully
soup = BeautifulSoup(response.text, features="html.parser")

Then collect the songs:

songs = soup.find_all('div', {'class': 'song-name typography-body-tall'})

Finally, loop over the results and write each song name to the file:

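# Extract the visible text of each matched element
song_names = [song.get_text(strip=True) for song in songs]

original_stdout = sys.stdout
with open('playlist.txt', 'w', encoding='utf-8') as f:
    sys.stdout = f  # redirect print() output into the file
    try:
        for name in song_names:
            print(name)
    finally:
        sys.stdout = original_stdout  # always restore normal printing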

Make sure to import everything I have. Importing sys is important because the loop temporarily redirects print() output into the file.
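As an alternative to redirecting sys.stdout, the names can be written with f.write() directly. The sketch below is only one possible way to do it: write_song_names is a hypothetical helper (not part of requests or Beautiful Soup) that takes plain strings, opens the file once before the loop, and writes one name per line. Opening the file with "w" inside the loop, as in the question, would truncate it on every iteration, and f.write() only accepts strings, not Tag objects.

```python
def write_song_names(names, path="playlist.txt"):
    """Write one song name per line, opening the file only once.

    Hypothetical helper for illustration; returns how many names were written.
    """
    # Convert to str and trim surrounding whitespace, then drop empty entries.
    cleaned = [str(name).strip() for name in names]
    cleaned = [name for name in cleaned if name]

    # Open once, outside the loop, so earlier lines are not overwritten.
    with open(path, "w", encoding="utf-8") as f:
        for name in cleaned:
            f.write(name + "\n")
    return len(cleaned)
```

With the songs list collected above, a call might look like write_song_names(song.get_text() for song in songs).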

Upvotes: 0
