mmo

Reputation: 4236

Windows cmd: how to extract a pattern?

I have a question pertaining to the Windows cmd program and its string manipulation capabilities:

I am writing a script that automatically removes an old Java version and installs the latest one. For this I am trying to extract the so-called update number from a filename. The filenames have the form 'jdk-8u121-windows-x64.exe' or 'jdk-8u121-windows-i586.exe'. The update number is the number between the 'u' and '-windows' ('121' in this example).

Given a variable 'fname' containing the filename, I found that I can extract that number using, e.g.: echo %fname:~6,3%

But this expression only works as long as the update number is exactly 3 digits long. If it has only one or two digits, e.g. if the filenames later were 'jdk-9u1-windows-x64.exe' or 'jdk-9u12-windows-x64.exe', it would yield a wrong result ('1-w' or '12-' in these cases).

Is there a syntax or some other way to state: "starting from character position 6 and continuing as long as the characters are numerical digits"?

Something like "echo %fname:~6,\d+%"

'\d+' meaning "one or more digits", as in regular expressions.

I hope I have made myself clear...

Upvotes: 0

Views: 242

Answers (1)

aschipfl

Reputation: 34899

What about extracting everything between the first and second -, then splitting off everything up to and including the first u/U, so that the update number (121) is left?

set "FILENAME=jdk-8u121-windows-x64.exe"

rem // Extract everything between first and second `-`:
for /F "tokens=2 delims=-" %%F in ("%FILENAME%") do set "UPDATE=%%F"
rem // Remove everything up to and including first `u` (case-insensitively):
set "UPDATE=%UPDATE:*u=%"

echo %UPDATE%
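
To apply this to the actual installer files rather than to a hard-coded name, the same two steps could be wrapped in a loop. This is only a sketch: it assumes the installers are in the current directory and match the mask jdk-*u*-windows-*.exe (both are my assumptions, not something stated in the question). Inside a parenthesised block, delayed expansion (!UPDATE!) is needed instead of %UPDATE%, because %UPDATE% would be expanded once when the whole block is parsed:

```batch
@echo off
setlocal EnableDelayedExpansion

rem // Assumption: the installers are located in the current directory.
for %%I in ("jdk-*u*-windows-*.exe") do (
    rem // Extract everything between first and second `-`:
    for /F "tokens=2 delims=-" %%F in ("%%~nxI") do set "UPDATE=%%F"
    rem // Remove everything up to and including first `u` (case-insensitively):
    set "UPDATE=!UPDATE:*u=!"
    echo %%~nxI: update !UPDATE!
)

endlocal
```

Note that delayed expansion consumes exclamation marks, so this sketch is not suitable for file names that contain a !.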

In case the u must be treated in a case-sensitive manner (just to show the possibility, although not relevant here, because Windows treats file names case-insensitively), change the approach to this:

set "FILENAME=jdk-8u121-windows-x64.exe"

rem // Extract everything between first and second `-`:
for /F "tokens=2 delims=-" %%F in ("%FILENAME%") do set "UPDATE=%%F"
rem // Remove everything up to and including first `u` (case-sensitively):
for /F "tokens=1,* delims=u" %%E in ("%UPDATE%") do set "UPDATE=%%F"

echo %UPDATE%

If the part in front of the first - cannot contain a u/U, it can all be simplified:

set "FILENAME=jdk-8u121-windows-x64.exe"

rem // Extract everything between first `u` (case-insensitively) and second `-`:
for /F "tokens=3 delims=-Uu" %%F in ("%FILENAME%") do set "UPDATE=%%F"

echo %UPDATE%
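
There is no regular-expression syntax like \d+ in sub-string expansion, but if you want to make sure that the extracted value really consists of digits only, you could check it with findstr afterwards. This is a minimal sketch on top of the simplified variant above; the message texts are just placeholders I made up:

```batch
@echo off
setlocal

set "FILENAME=jdk-8u121-windows-x64.exe"

rem // Clear the variable so a failed extraction does not leave an old value behind:
set "UPDATE="
rem // Extract everything between first `u` (case-insensitively) and second `-`:
for /F "tokens=3 delims=-Uu" %%F in ("%FILENAME%") do set "UPDATE=%%F"

rem // `findstr` succeeds only if the whole line consists of one or more digits:
echo(%UPDATE%| findstr /R "^[0-9][0-9]*$" >nul && (
    echo Update number: %UPDATE%
) || (
    echo Unexpected file name format: %FILENAME%
)

endlocal
```

findstr has no + quantifier, hence [0-9][0-9]* is used to express "one or more digits".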

Upvotes: 1
