Reputation: 1
I'm using the Python package lyricsgenius to scrape lyrics from genius.com through the website's API. In this script, I want to scrape 300 Drake songs:
import lyricsgenius
genius = lyricsgenius.Genius(API_KEY)
artist = genius.search_artist("Drake", max_songs=300, sort="title")
However, it stopped at song 106 and displayed the error message:
Song 106: "Draft Day"
"Drake & DJ Semtex Interview" is not valid. Skipping.
Timeout raised and caught:
HTTPSConnectionPool(host='api.genius.com', port=443): Read timed out. (read timeout=5)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-2-575a1d98f954> in <module>
1 genius = lyricsgenius.Genius(API_KEY)
----> 2 artist = genius.search_artist("Drake", max_songs=300, sort="title")
~/anaconda3/lib/python3.7/site-packages/lyricsgenius/api.py in search_artist(self, artist_name, max_songs, sort, per_page, get_full_info, allow_name_change, artist_id)
329 else:
330 info = {'song': song_info}
--> 331 song = Song(info, lyrics)
332
333 # Attempt to add the Song to the Artist
~/anaconda3/lib/python3.7/site-packages/lyricsgenius/song.py in __init__(self, json_dict, lyrics)
24 save_lyrics: Save the song lyrics to a JSON or TXT file.
25 """
---> 26 self._body = json_dict['song'] if 'song' in json_dict else json_dict
27 self._body['lyrics'] = lyrics
28 self._url = self._body['url']
TypeError: argument of type 'NoneType' is not iterable
How do I set it up so that it stops scraping only when it reaches 300 songs?
Upvotes: 0
Views: 1062
Reputation: 6078
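Realize that this is an issue within the lyricsgenius package itself: even if there is an internal failure, the package should report it properly. The traceback shows the timeout being caught, after which Song receives None instead of the song data, hence the TypeError.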
Check whether it works with the most recent release of the package. What is your lyricsgenius.__version__? The most recent release seems to be 1.6.0, which was published only 3 days ago. Try upgrading to it manually (pip install --upgrade lyricsgenius).
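If you would rather check the installed version without importing the package, something like the following may help. It is a minimal sketch, assuming Python 3.8 or newer (where `importlib.metadata` is in the standard library); `installed_version` is a hypothetical helper name, not part of lyricsgenius.

```python
from importlib import metadata  # standard library on Python 3.8+


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        # The package is not installed in this environment.
        return None


# Prints the installed version string, or None if lyricsgenius is missing.
print(installed_version("lyricsgenius"))
```

Compare the printed value with the latest release listed on the Python Package Index before and after upgrading.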
If the problem persists, look up the official repository for contact details. The Python Package Index sends you to https://github.com/johnwmillr/LyricsGenius/issues. The project's README says to "just open an issue".
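Until the package handles this itself, a possible stopgap is to retry the failing call. This is only a sketch under assumptions: `retry_call` is a hypothetical helper, not something lyricsgenius provides, and each retry restarts the whole search from the beginning, so it is slow and does not address the underlying bug.

```python
import time


def retry_call(func, attempts=3, delay=1.0, exceptions=(Exception,)):
    """Call func() up to `attempts` times and return its first successful result.

    If every attempt raises one of `exceptions`, the last one is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: surface the last failure
            time.sleep(delay)  # brief pause before trying again


# Hypothetical usage with the call from the question. TypeError is caught
# because that is the exception that surfaced in the traceback:
#
# artist = retry_call(
#     lambda: genius.search_artist("Drake", max_songs=300, sort="title"),
#     attempts=3,
#     delay=5.0,
#     exceptions=(TypeError,),
# )
```

Catching only TypeError keeps unrelated errors, such as an invalid API key, from being retried silently.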
Upvotes: 1