Reputation: 20891
I'm trying to extract a URL to download by using a regular expression, but I cannot get the quotation marks right in the lookbehind and the positive lookahead.
Can you help me fix it?
Input:document.getElementsByClassName('mdui-textfield-input')[1].innerHTML
Output:"<video><source src=\"https://drivebutler.drk1.workers.dev/0:/Cartoon%20Collection/Naruto%20Shippuden%20(Complete%20Series%20001-500)%20Naruto%20Shippuuden%20[1080p]%20[HEVC]%20[x265]%20[Batch]%20[pseudo]/Season%2015%20(Episodes%20321-348)/[AnimeRG]%20Naruto%20Shippuden%20-%20338%20[1080p]%20[x265]%20[pseudo].mkv\" type=\"video/mp4\"></video>"
The regex I use to grab the URL:
(?<=src=\\\").*?(?=\\\")
What I've tried:
document.getElementsByClassName('mdui-textfield-input')[1].innerHTML.match((?<=src=\\\").*?(?=\\\"))[0]
But the console output suggests that something is wrong.
Upvotes: 0
Views: 114
Reputation: 13356
... /src=\\"(?<url>https?:\/\/[^"]+)"/
... and always bear in mind how backslashes "behave" when they have to be written inside a string literal (input), as opposed to how a system displays them as part of output values ...
const sample = "<video><source src=\\\"https://drivebutler.drk1.workers.dev/0:/Cartoon%20Collection/Naruto%20Shippuden%20(Complete%20Series%20001-500)%20Naruto%20Shippuuden%20[1080p]%20[HEVC]%20[x265]%20[Batch]%20[pseudo]/Season%2015%20(Episodes%20321-348)/[AnimeRG]%20Naruto%20Shippuden%20-%20338%20[1080p]%20[x265]%20[pseudo].mkv\" type=\"video/mp4\"></video>"
const regXExtractUrl = (/src=\\"(?<url>https?:\/\/[^"]+)"/);
console.log(
regXExtractUrl.exec(sample)?.groups.url
);
console.log(
regXExtractUrl.exec("")?.groups.url
);
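To make the point about backslashes concrete, here is a minimal sketch. The variable names and the shortened sample values are made up for illustration and are not taken from the question. It compares a string written the way the console displays it with a string that really contains backslash characters.

```javascript
// Hypothetical, shortened sample values (not the asker's real data).

// Written as a string literal: \" is just a way to type a quotation mark.
const asDisplayed = "src=\"x\"";

// Here \\ is one real backslash, followed by \" for the quotation mark.
const withRealBackslashes = "src=\\\"x\\\"";

console.log(asDisplayed.length);              // 7
console.log(withRealBackslashes.length);      // 9
console.log(asDisplayed.includes("\\"));      // false
console.log(withRealBackslashes.includes("\\")); // true
```

If the value in memory looks like the first string, a pattern that requires a literal backslash before the quotation mark will not match it.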
different escaping ... different regex ...
const sample_A = "<video><source src=\"https://drivebutler.drk1.workers.dev/0:/Cartoon%20Collection/Naruto%20Shippuden%20(Complete%20Series%20001-500)%20Naruto%20Shippuuden%20[1080p]%20[HEVC]%20[x265]%20[Batch]%20[pseudo]/Season%2015%20(Episodes%20321-348)/[AnimeRG]%20Naruto%20Shippuden%20-%20338%20[1080p]%20[x265]%20[pseudo].mkv\" type=\"video/mp4\"></video>"
const sample_B = `<video><source src="https://drivebutler.drk1.workers.dev/0:/Cartoon%20Collection/Naruto%20Shippuden%20(Complete%20Series%20001-500)%20Naruto%20Shippuuden%20[1080p]%20[HEVC]%20[x265]%20[Batch]%20[pseudo]/Season%2015%20(Episodes%20321-348)/[AnimeRG]%20Naruto%20Shippuden%20-%20338%20[1080p]%20[x265]%20[pseudo].mkv" type="video/mp4"></video>`
const regXExtractUrl = (/src="(?<url>https?:\/\/[^"]+)"/);
console.log(
regXExtractUrl.exec(sample_A)?.groups.url
);
console.log(
regXExtractUrl.exec(sample_B)?.groups.url
);
console.log(
regXExtractUrl.exec("")?.groups.url
);
.as-console-wrapper { min-height: 100%!important; top: 0; }
Upvotes: 1
Reputation: 1006
You didn't enclose your regular expression between slashes like this:
.match(/(?<=src=\\\").*?(?=\\\")/)
See the documentation on creating regular expressions using literal notation in JavaScript.
If you want to escape a special character, you should use a single backslash. As written, \\\" escapes one backslash and one quotation mark, so the pattern expects a literal backslash in front of each quotation mark. I think you want it to be like this:
.match(/(?<=src=\").*?(?=\")/)
But you do not need to escape characters like quotation marks anyway.
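The three variants can be compared side by side. This is only a sketch: the `html` value is a shortened, hypothetical stand-in for the asker's markup, and it assumes the string in memory contains plain quotation marks without backslashes.

```javascript
// Hypothetical stand-in for the real innerHTML value (no backslashes in it).
const html = '<video><source src="https://example.com/video.mkv" type="video/mp4"></video>';

// In a regex literal, \" and " match the same single character.
const escaped = /(?<=src=\").*?(?=\")/;
const plain = /(?<=src=").*?(?=")/;

// \\\" means: one literal backslash, then a quotation mark.
const withBackslash = /(?<=src=\\\").*?(?=\\\")/;

// match() returns null when nothing matches, so ?. guards the index access.
const urlEscaped = html.match(escaped)?.[0];
const urlPlain = html.match(plain)?.[0];
const urlBackslash = html.match(withBackslash)?.[0];

console.log(urlEscaped);   // https://example.com/video.mkv
console.log(urlPlain);     // https://example.com/video.mkv
console.log(urlBackslash); // undefined
```

The first two patterns behave identically, which is why the extra escaping is optional. The third finds nothing here because the sample contains no backslash characters.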
Upvotes: 1
Reputation: 20891
I fixed it with the following:
document.getElementsByClassName('mdui-textfield-input')[1].innerHTML.match(/(?<=src=\").*?(?=\")/g)[0]
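One caveat with this one-liner: `match()` returns `null` when nothing matches, so indexing with `[0]` would throw in that case. A slightly more defensive sketch follows. The helper name `extractSrcUrl` and the sample strings are hypothetical, and the `g` flag is dropped because only the first match is used.

```javascript
// Hypothetical helper; the name extractSrcUrl is not from the thread.
function extractSrcUrl(html) {
  // Without the g flag, match() returns the first match or null.
  const match = html.match(/(?<=src=").*?(?=")/);
  return match ? match[0] : null;
}

// Shortened, made-up inputs for illustration.
const found = extractSrcUrl('<video><source src="https://example.com/a.mkv" type="video/mp4"></video>');
const missing = extractSrcUrl('<video></video>');

console.log(found);   // https://example.com/a.mkv
console.log(missing); // null
```

In the browser, the argument would be the `innerHTML` value from the expression above.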
Upvotes: 0