asdwasow

Reputation: 91

youtube-dl: download info.json for the videos in a YouTube playlist, skipping videos listed in archive.txt

I am trying to download the JSON metadata (NOT the videos) for all videos in a YouTube playlist with youtube-dl. When I run the same command again, I also want it to skip videos whose JSON metadata has already been downloaded. Here is the command I have tried:

youtube-dl -i --write-info-json --skip-download --download-archive archive.txt {youtube-playlist-url}  

--write-info-json: writes the video metadata to a .info.json file

--skip-download: does not download the video

--download-archive archive.txt: archive.txt contains a list of already downloaded video ids, so youtube-dl will not download those videos again

However, when --skip-download is included, the video ids are not added to archive.txt, which suggests that youtube-dl only adds a video id to archive.txt after downloading the video. Can these two options (--skip-download and --download-archive archive.txt) be used together? Or is there another way to accomplish this?

Upvotes: 3

Views: 9413

Answers (1)

Ajithkumar_sekar

Reputation: 651

youtube-dl adds an entry to archive.txt only if the video itself is downloaded. So I think your use case cannot be achieved with youtube-dl alone.

However, this behaviour can be achieved with some command-line magic:

youtube-dl --skip-download --write-info-json --download-archive archive.txt https://www.youtube.com/playlist\?list\=PLMCXHnjXnTnuFUfiWF4D0pYmJsMROz4sA |tee /dev/tty|grep "\[info] Writing video description metadata as JSON to:" |gawk '{ match($0, /-([a-zA-Z0-9_-]+)\.info\.json/, arr); if(arr[1] != "") print "youtube "arr[1] }' >> archive.txt

youtube-dl --skip-download --write-info-json --download-archive archive.txt {youtube-playlist-url} downloads the .info.json data for the playlist videos, except for the video ids listed in archive.txt

tee /dev/tty prints the youtube-dl output to the terminal and also pipes it to the next command

grep "\[info] Writing video description metadata as JSON to:" keeps only the youtube-dl output lines that contain the name of a downloaded .info.json file. The backslash matters: without it, [info] is treated as a character class and the pattern does not match

gawk '{ match($0, /-([a-zA-Z0-9_-]+)\.info\.json/, arr); if(arr[1] != "") print "youtube "arr[1] }' extracts the video id from the file name and prints it in the format youtube {video_id}. The three-argument match() is a gawk extension, so plain awk will not work here

>> archive.txt appends the output to the archive.txt file
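
One caveat with the gawk step: the regex starts at the first "-" that is followed only by id-like characters, so a title such as "My Video-Title" can leak into the extracted id. Below is a minimal sketch of a stricter extractor using POSIX sed. It assumes the default youtube-dl file name layout ({title}-{id}.info.json) and that YouTube ids are exactly 11 characters long; the function name extract_archive_ids is made up for illustration.

```shell
# Hypothetical helper: read youtube-dl log lines on stdin and print one
# "youtube {video_id}" line per .info.json file name found.
# Assumptions: default "{title}-{id}.info.json" naming, 11-character ids.
extract_archive_ids() {
  # The greedy ".*-" backtracks until exactly 11 id characters sit
  # directly in front of ".info.json", so hyphens in the title are skipped.
  sed -n 's/.*-\([A-Za-z0-9_-]\{11\}\)\.info\.json.*/youtube \1/p'
}
```

It can replace both the grep and the gawk stage, since lines without a matching file name print nothing: youtube-dl --skip-download --write-info-json --download-archive archive.txt {youtube-playlist-url} | tee /dev/tty | extract_archive_ids >> archive.txt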

Here is what happens when you run that command:
    It downloads the info.json for every video in the playlist except those listed in archive.txt, and appends the id of each video whose info.json was downloaded to archive.txt. So, if you run the same command again, youtube-dl will only download the info.json for playlist videos whose ids are not yet in archive.txt.
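
If the plain command was already run once without the pipeline, archive.txt will be missing those ids. A rough sketch for rebuilding the entries from the .info.json files already on disk follows. It makes the same assumptions as before (default {title}-{id}.info.json naming, 11-character ids), and rebuild_archive is a hypothetical name, not part of youtube-dl.

```shell
# Hypothetical helper: print "youtube {video_id}" for every *.info.json
# file in the given directory (default: the current directory).
rebuild_archive() {
  dir=${1:-.}
  for f in "$dir"/*.info.json; do
    [ -e "$f" ] || continue        # glob matched nothing
    printf '%s\n' "${f##*/}"       # file name without the directory part
  done |
    sed -n 's/.*-\([A-Za-z0-9_-]\{11\}\)\.info\.json$/youtube \1/p' |
    LC_ALL=C sort -u               # stable order, no duplicate ids
}
```

For example, rebuild_archive . > archive.txt recreates the archive from the current directory. Note that this overwrites the file, so any entries for videos without a local .info.json are lost.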

Upvotes: 11
