Nj3

Reputation: 119

Python: Youtube-dl status after conversion to mp3

I'm writing a script to download MP3 songs from the web. First I scrape YouTube; if the song is found, I download it with youtube-dl and convert it to MP3. If it is not found (checked with os.path.isfile), I scrape beemp3 (for this sample), mp3skulls, etc. The script below contains only the YouTube download and the file check:

from bs4 import BeautifulSoup
from urllib.request import urlopen,Request,urlretrieve
import re
import youtube_dl
import sys
import os

def ytscrape(searchurl,baseurl):
    """normal scraping"""
    req = Request(searchurl, headers={'User-Agent':'Mozilla/5.0'})
    lst[:] = []
    url = urlopen(req)
    soup = BeautifulSoup(url, 'lxml')
    for i in soup.find_all('div',{'class':['yt-lockup-content','yt-lockup-meta-info']},limit=10):
        for link,views in zip(i.select('h3 > a'),i.select('ul > li')):
            if views is not None and views.next_sibling is not None:
                lst.append([baseurl+link.get('href'),views.next_sibling.text])
    for i in lst:
        i[1] = int(re.sub(r' views|,','',i[1]))
    lst.sort(key = lambda x:x[1])
    url.close()
    return lst[-1][0]

def dl_frm_youtube(yt_lnk,dlpath):
    """passes the youtube url of the song. it extracts audio alone and saves it
    in local.
    yt_lnk : youtube url for song which is priortised based on channel/views.
    """
    ydl_opts = {'format':'bestaudio/best','outtmpl':dlpath+'\\%(title)s.%(ext)s','postprocessors':[{'key':'FFmpegExtractAudio','preferredcodec':'mp3','preferredquality':'192',}]}
    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        ydl.download([yt_lnk])
        if os.path.isfile(dlpath+'\\%(title)s.%(ext)s'):
            print('found')
        else:
            print('not found')

def main():
    song = 'numb' 
    artist = 'linkin park'
    baseurl = 'https://www.youtube.com'
    if sys.platform == 'win32':
        dlpath = os.path.join(os.environ['USERPROFILE'],'Music','spd')
        if not os.path.exists(dlpath):
            os.mkdir(dlpath)
    else:
        dlpath = '~/Music/' + song + '.mp3'
    searchurl = baseurl + '/results?search_query=' + '+' + artist.replace(chr(32),'+') + '+' + song.replace(chr(32),'+')
    dl_frm_youtube(ytscrape(searchurl,baseurl),dlpath)


lst = []
main()

When I ran the file check, it failed even though the song had downloaded and was present in the path. Because the check failed, the script went on to the next function and downloaded the song again, leaving me with two copies in my path.

So my question is: how do I set up the file check so that it prints 'found' when the file is present in dlpath?

TIA

EDIT: Following phihag's comments, I removed the irrelevant information, reduced the code to the problematic part, and hardcoded the inputs.

Upvotes: 0

Views: 1693

Answers (1)

Nj3

Reputation: 119

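Finally, I was able to find out why. It seems the %(title)s.%(ext)s placeholders are only expanded by youtube-dl when it processes the outtmpl option in ydl_opts; passed to os.path.isfile, the string is taken literally. This answer helped me.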
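
To illustrate the point with plain Python (the folder name and the info values are made up for the example): the template is just text until something substitutes the fields, so os.path.isfile was being asked about a file literally named %(title)s.%(ext)s. The substitution below is done by hand with the % operator purely for illustration; youtube-dl's own expansion also sanitises titles, so this is a sketch and not its actual implementation.

```python
import os

# Hypothetical values, only for illustration.
dlpath = os.path.join('Music', 'spd')
template = os.path.join(dlpath, '%(title)s.%(ext)s')

# What the original check passed to os.path.isfile: the raw template.
print(template)

# What the template looks like once the fields are substituted.
info = {'title': 'Numb', 'ext': 'mp3'}
filled = template % info
print(filled)
```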

I changed my code inside dl_frm_youtube to this:

with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    #ydl.download([yt_lnk])
    info = ydl.extract_info(yt_lnk, download=True)
    songname = info.get('title', None)
    #print(songname)
    if os.path.isfile(dlpath+'\\'+songname+'.mp3'):
        print('found')

and it works perfectly. Posting this answer in case someone finds it useful.
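
One caveat, offered as a hedged sketch that I have not tested against every video: info['title'] is the raw title, while the file name on disk comes from the outtmpl template and may have some characters replaced. YoutubeDL.prepare_filename(info) returns the template-expanded name before post-processing, so swapping its extension for .mp3 should give the converted file's path, assuming the FFmpegExtractAudio post-processor with preferredcodec 'mp3' as in the question. The helper name downloaded_mp3_path is made up for this example.

```python
import os

def downloaded_mp3_path(ydl, info):
    """Return the path where the converted mp3 is expected to be.

    ydl  : a youtube_dl.YoutubeDL instance (anything with prepare_filename works)
    info : the dict returned by ydl.extract_info(url, download=True)
    """
    # Name built from outtmpl, still with the original extension (e.g. .webm).
    original = ydl.prepare_filename(info)
    # Assumption: the post-processor keeps the base name and only
    # changes the extension to .mp3.
    base, _ext = os.path.splitext(original)
    return base + '.mp3'
```

Inside dl_frm_youtube this would be used as os.path.isfile(downloaded_mp3_path(ydl, info)) right after the extract_info call.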

Upvotes: 2
