theboy

Reputation: 353

Pytube only works periodically (KeyError: 'assets')

About five out of ten times, pytube raises this error when I run my small test script.

Here's the script:

import pytube
import urllib.request


from pytube import YouTube
yt = YouTube('https://www.youtube.com/watch?v=3NCyD3XoJgM')

print('Youtube video title is: ' + yt.title + '! Downloading now!')

Here's what I get:

Traceback (most recent call last):
  File "youtube.py", line 6, in <module>
    yt = YouTube('https://www.youtube.com/watch?v=3NCyD3XoJgM')
  File "C:\Users\test\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pytube\__main__.py", line 91, in __init__
    self.prefetch()
  File "C:\Users\test\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pytube\__main__.py", line 183, in prefetch
    self.js_url = extract.js_url(self.watch_html)
  File "C:\Users\test\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pytube\extract.py", line 143, in js_url
    base_js = get_ytplayer_config(html)["assets"]["js"]
KeyError: 'assets'

I am very confused. I attempted to reinstall Python plus pytube but I can't seem to remedy this issue. It's increasingly perplexing that the script works half of the time, but not the other half.

Upvotes: 5

Views: 6399

Answers (11)

shekhar chander

Reputation: 618

Here's a permanent fix to that! You can try tube_dl.

pip install tube_dl
from tube_dl import Youtube
yt = Youtube('url')
yt.Formats()[0].download()

It uses a modular approach and is up to date.

More information is available at: https://github.com/shekharchander/tube_dl/

Upvotes: 0

carloshkayser

Reputation: 197

I had the same problem, and after updating pytube to the latest version currently available, the problem disappeared.

pip install pytube==10.0.0

or

pip install --upgrade pytube
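
To confirm which version actually ended up installed after the upgrade, something like the following minimal sketch may help. It relies only on the standard-library importlib.metadata module (Python 3.8+); the helper name installed_version is illustrative and not part of pytube.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if pytube is installed, otherwise None.
print("pytube version:", installed_version("pytube"))
```

If this prints an older version than expected, pip may have installed into a different Python environment than the one running the script.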

Upvotes: 3

KetZoomer

Reputation: 2915

If you are using the package pytube or pytube3, I would recommend uninstalling that and installing pytubeX. No need to change imports. I have found it works much more reliably.

Edit: From the comments, if none of these work, try pytube4

Edit: pytube is now being maintained again!

Upvotes: 2

Rahul A Ranger

Reputation: 554

It seems the pytube module has been updated.

It works fine with the pytube package,

i.e. uninstall any pytube variations and then try pip install pytube.

Upvotes: 5

Jean-Pierre Schnyder

Reputation: 1944

To avoid this pytube problem, you can use youtube_dl instead. Here is some code that was tested on Windows and on an Android tablet (with the Pydroid3 app). Its purpose is to download the audio track of each video in a public playlist.

import os, re
import youtube_dl
from pytube import Playlist

YOUTUBE_STREAM_AUDIO = '140'
if os.name == 'posix':
    targetAudioDir = '/storage/emulated/0/Download/Audiobooks/test_youtube_dl'
    ydl_opts = {
        'outtmpl': targetAudioDir + '/%(title)s.mp3',
        'format': 'bestaudio/best',
        'quiet': False
    }
else:
    targetAudioDir = 'D:\\Users\\Jean-Pierre\\Downloads\\Audiobooks\\test_youtube_dl'
    ydl_opts = {
        'outtmpl': targetAudioDir + '\\%(title)s.%(ext)s',
        'format': 'bestaudio/best',
        'postprocessors': [{
            'key': 'FFmpegExtractAudio',
            'preferredcodec': 'mp3',
            'preferredquality': '128',
        }],
        'quiet': False
    }

playlistUrl = 'https://www.youtube.com/playlist?list=PLzwWSJNcZTMSFWGrRGKOypqN29MlyuQvn'
playlistObject = Playlist(playlistUrl)
playlistObject._video_regex = re.compile(r"\"url\":\"(/watch\?v=[\w-]*)")
    
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    for videoUrl in playlistObject.video_urls:
        meta = ydl.extract_info(videoUrl, download=False)
        videoTitle = meta['title']
        print('Video title: ' + videoTitle)
        ydl.download([videoUrl])

Upvotes: 0

Idan Cohen

Reputation: 104

Add this function to extract.py

def get_ytplayer_js(html: str) -> str:
    """Get the YouTube player base JavaScript path.

    :param str html:
        The html contents of the watch page.
    :rtype: str
    :returns:
        Path to YouTube's base.js file.
    """
    js_url_patterns = [
        r"\"jsUrl\":\"([^\"]*)\"",
    ]
    for pattern in js_url_patterns:
        regex = re.compile(pattern)
        function_match = regex.search(html)
        if function_match:
            logger.debug("finished regex search, matched: %s", pattern)
            yt_player_js = function_match.group(1)
            return yt_player_js

    raise RegexMatchError(
        caller="get_ytplayer_js", pattern="js_url_patterns"
    )

and change the function "js_url" in extract.py from:

def js_url(html: str) -> str:
    """Get the base JavaScript url.

    Construct the base JavaScript url, which contains the decipher
    "transforms".

    :param str html:
        The html contents of the watch page.
    """
    base_js = get_ytplayer_config(html)["assets"]["js"]
    return "https://youtube.com" + base_js

to:

def js_url(html: str) -> str:
    """Get the base JavaScript url.

    Construct the base JavaScript url, which contains the decipher
    "transforms".

    :param str html:
        The html contents of the watch page.
    """
    base_js = get_ytplayer_js(html)
    return "https://youtube.com" + base_js
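
To see what the regex above extracts, here is a small standalone sketch that runs the same pattern against a made-up page fragment. The sample_html string and the find_js_path name are assumptions for illustration only; the real watch page is much larger and its layout can change.

```python
import re

# Same pattern as in get_ytplayer_js above.
JS_URL_PATTERN = re.compile(r"\"jsUrl\":\"([^\"]*)\"")


def find_js_path(html: str) -> str:
    """Return the first jsUrl value found in the page source."""
    match = JS_URL_PATTERN.search(html)
    if match is None:
        raise ValueError("no jsUrl entry found in the page")
    return match.group(1)


# Hypothetical fragment; not copied from a real YouTube page.
sample_html = '<script>var cfg = {"jsUrl":"/s/player/abc123/base.js","other":1};</script>'

print("https://youtube.com" + find_js_path(sample_html))
# prints: https://youtube.com/s/player/abc123/base.js
```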

Upvotes: 5

Arpit Diwan

Reputation: 41

Fixed

The extract.py file in the repository has now been updated. If you are still getting the error after running this command in a terminal or cmd: python -m pip install git+https://github.com/nficano/pytube, it is because the command did not update your local pytube/extract.py file.

The fix is to copy all the code from the repository's extract.py and paste it over the contents of your own extract.py file. I hope this works.

Upvotes: 1

theboy

Reputation: 353

For now fixed 100% with this:

https://github.com/nficano/pytube/pull/767#issuecomment-716184994

For anyone else getting this error, run this command in a terminal or cmd: python -m pip install git+https://github.com/nficano/pytube

This installs an update to pytubeX that hasn't been released on pip yet. In the GitHub link above, the current developer explains the situation.

Upvotes: 11

Daniel

Reputation: 485

It's an issue with the pytube library files. You can fix this by manually modifying the "extract.py" file inside the pytube folder. Copy and paste this inside the file instead: https://github.com/nficano/pytube/blob/master/pytube/extract.py

Upvotes: 1

Jubiluleu

Reputation: 61

I have the same problem, but I can assure you that the top answer doesn't solve anything; it just hides the problem until it pops up again. I investigated the relevant part of the extract.py file and found the error. That code looks up a string in data taken from the source of the YouTube page where the video is, using a dictionary lookup, such as:

# Example ---------------
Vars = {
    'name': 'luis',
    'age': '27',
}
print(Vars['name'])

# prints: luis

# extract.py code -------

def js_url(html: str) -> str:
    """Get the base JavaScript url.

    Construct the base JavaScript url, which contains the decipher
    "transforms".

    :param str html:
        The html contents of the watch page.
    """
    base_js = get_ytplayer_config(html)["assets"]["js"]
    return "https://youtube.com" + base_js

The error:

base_js = get_ytplayer_config(html)["assets"]["js"]
KeyError: 'assets'

The error is raised because the player config extracted from the page does not contain an 'assets' key, so the dictionary lookup fails with a KeyError. So I wrote this script, which I believe can replace the original, although in my case other errors then appeared.

def js_url(html: str) -> str:
    """Get the base JavaScript url.

    Construct the base JavaScript url, which contains the decipher
    "transforms".

    :param str html:
        The html contents of the watch page.
    """
    base_js = html[html.find('js') + 4:html.find('.js') + 4]
    return "https://youtube.com" + base_js

The script above searches the page source as a string for what the function needs, instead of using a dictionary lookup.

I hope I have contributed to a more complete future solution :)
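
As a rough illustration of how a KeyError like this arises, the sketch below uses two made-up config dictionaries, one with an 'assets' entry and one without. The names lookup_js, config_with_assets and config_without_assets are hypothetical and are not part of pytube.

```python
from typing import Optional

# Hypothetical player configs; real ones contain many more keys.
config_with_assets = {"assets": {"js": "/s/player/abc123/base.js"}}
config_without_assets = {"args": {}}


def lookup_js(config: dict) -> Optional[str]:
    """Return the JS path if the config has one, otherwise None."""
    try:
        return config["assets"]["js"]
    except KeyError:
        # The key is missing, which is the situation behind the traceback.
        return None


print(lookup_js(config_with_assets))
print(lookup_js(config_without_assets))
```

Returning None only makes the missing key visible to the caller; some other way of finding the path is still needed when the key is absent.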

Upvotes: 5

Try replacing line 143 of extract.py

    base_js = get_ytplayer_config(html)["assets"]["js"]

with

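    try:
        base_js = get_ytplayer_config(html)["assets"]["js"]
    except KeyError:
        # A bare "pass" here would leave base_js undefined and the return
        # statement would fail, so fall back to the get_ytplayer_js helper
        # from the answer above.
        base_js = get_ytplayer_js(html)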

Upvotes: -2
