Reputation: 353
I have a script that needs to extract a YouTube URL from a text file.
Here's what I have in the text file (output.txt):
---------- NUMBER11.TXT
<link itemprop="url" href="http://www.youtube.com/channel/UCnxGkOGNMqQEUMvroOWps6Q">
Note that the text file starts with an empty line, which is annoying, and the URL is on line 3. Something that doesn't show up in this site's formatting is that there are also 11 spaces before the start of the line containing the href. I'd like to separate the URL from the mass of other junk.
I've tried something like this:
set /p long= < output.txt
echo %long%
set short1=%long:^<link itemprop^="url" href^="=%
echo %short1% > o1.txt
I thought this would remove the selected text from the file, but I think this is a little over my head.
I'm getting output.txt by first running curl on a YouTube video page, and then running this find command on the result:
find "href=""http://www.youtube.com/channel/" %vd% > output.txt
Maybe I'm making this more complicated than it is?
Upvotes: 0
Views: 2056
Reputation: 38622
I would suggest you parse the results directly from your curl command, instead of writing them to a text file and then running find against that file.
However, instead of find.exe, I would suggest the following method using findstr.exe. It gets the URL from any line containing href= followed by "http: or "https and subsequently followed by youtube.com.
@Echo Off
SetLocal EnableExtensions DisableDelayedExpansion
For /F Tokens^=*EOL^= %%G In (
'%__APPDIR__%findstr.exe /IR "href=\"http[s:].*youtube\.com" "output.txt"'
) Do (Set "Line=%%G" & SetLocal EnableDelayedExpansion
For /F Tokens^=2Delims^=^" %%H In ("!Line:*href=!") Do EndLocal & Echo %%H)
Pause
If you want the output stored as a variable, instead of Echoing it, change Echo %%H to Set "URL=%%H". You could then use %URL% (or "%URL%" if you need it doublequoted) elsewhere in your script.
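As a sketch of that change, here is the same script with the result stored in a variable instead of being echoed. The findstr line and the loops are reused unchanged from the script above. The initial Set "URL=" and the final Echo line are additions for illustration only. If several lines match, URL will hold the last one.

```batch
@Echo Off
SetLocal EnableExtensions DisableDelayedExpansion
Rem Start with an empty variable, so a stale value is never reported.
Set "URL="
For /F Tokens^=*EOL^= %%G In (
    '%__APPDIR__%findstr.exe /IR "href=\"http[s:].*youtube\.com" "output.txt"'
) Do (Set "Line=%%G" & SetLocal EnableDelayedExpansion
    For /F Tokens^=2Delims^=^" %%H In ("!Line:*href=!") Do EndLocal & Set "URL=%%H")
Rem Illustration only: show the result if a matching line was found.
If Defined URL Echo URL is "%URL%"
Pause
```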
Upvotes: 0
Reputation: 80033
@ECHO OFF
SETLOCAL
SET "sourcedir=U:\sourcedir"
SET "filename1=%sourcedir%\q64572433.txt"
set "url="
FOR /f "tokens=4,5delims=>= " %%a IN (%filename1%) DO if "%%~a"=="href" set "url=%%~b"
echo URL=%url%
GOTO :EOF
You would need to change the setting of sourcedir to suit your circumstances. The listing uses a setting that suits my system.
I used a file named q64572433.txt containing your data for my testing.
The for command tokenises each line of the file, using =, > and space as delimiters (the 3 characters between delims= and ").
On the line of interest, token 4 would be href and token 5 the URL, and this is the only line where href is the fourth token. When that is detected, the 5th token (in %%b) is assigned to the variable, with the quotes removed by ~ for good measure.
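To see that tokenising in action, a small diagnostic loop like the one below can print the first five tokens of every line. This is only a sketch: it assumes the data is in a file named output.txt in the current directory (as in the question), and the bracketed labels are just for display.

```batch
@ECHO OFF
SETLOCAL
REM Assumption: output.txt (the file from the question) is in the current directory.
REM Print tokens 1 to 5 of each non-empty line, split on ">", "=" and space.
FOR /f "usebackq tokens=1-5 delims=>= " %%a IN ("output.txt") DO (
 ECHO 1=[%%a] 2=[%%b] 3=[%%c] 4=[%%d] 5=[%%e]
)
GOTO :EOF
```

For the line of interest, the fourth token should be href and the fifth the URL still wrapped in its quotes, which is why the script above strips them with %%~b.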
Upvotes: 0
Reputation:
Using batch files to process text containing special characters, such as redirection operators, can cause problems, so it is not recommended, but I felt like posting an answer anyway. Given your exact example, here is one way. If your real data differs from what is in your post, which I strongly suspect it does, then this probably would not work.
@echo off
setlocal enabledelayedexpansion
for /f "usebackq delims=" %%i in ("output.txt") do for %%a in (%%i) do (
set "var=%%~a"
set "var=!var:>=!"
set "var=!var:"=!"
if "!var:~0,4!" == "http" echo !var!
)
Upvotes: 1