Piotr Dobrogost

Reputation: 42415

How to split double-quoted strings with embedded spaces, delimited by spaces, in a batch file?

I'm struggling to improve a script which I proposed as an answer to the question "How to write a batch file showing path to executable and version of Python handling Python scripts on Windows?". To prevent the Open With dialog box from appearing, I'd like to read the output of the ftype command, extract the path of the executable from it and check whether it exists.

After running this

@echo off
setlocal EnableDelayedExpansion 
rem c:\ftype Python.File ->
rem Python.File="c:\path with spaces, (parentheses) and % signs\python.exe" "%1" %*
for /f "tokens=2 delims==" %%i in ('ftype Python.File') do (
    set "reg_entry=%%i"
)

the content of reg_entry is

"c:\path with spaces and (parentheses) and % signs\python.exe" "%1" %*

How do I split this to get "c:\path with spaces, (parentheses) and % signs\python.exe", "%1" and %*?

EDIT
I tried using call after reading Aacini's answer and it almost works. It doesn't handle the % sign, however.

@echo off
setlocal EnableDelayedExpansion 
set input="c:\path with spaces and (parentheses) and %% signs\python.exe" "%%1" %%*
echo !input!
call :first_token output !input!
echo !output!
goto :eof

:first_token
set "%~1=%2"
goto :eof

Output

"c:\path with spaces and (parentheses) and % signs\python.exe" "%1" %*
"c:\path with spaces and (parentheses) and 1"

Upvotes: 2

Views: 974

Answers (4)

dbenham

Reputation: 130819

An alternative parser that is very similar to the CALL parser is the simple FOR. There are two complicating factors:

1- The FOR variable must not be expanded while delayed expansion is enabled, in case its value contains !. This is easily handled.

2- The content must not contain the wildcards * or ?. The ? can be temporarily replaced with another character and then restored. But there is no easy way to search and replace *.

Since this problem is trying to parse out a path, and paths cannot contain wildcards, this problem is easy to solve without using a CALL. I added ! to the test case for completeness.

@echo off
setlocal disableDelayedExpansion
set input="c:\path with spaces, ampersand &, carets ^ and (parentheses)! and %% signs\python.exe" "%%1" %%*
set input
set "output="
setlocal enableDelayedExpansion
for %%A in (!input!) do if not defined output endlocal & set output=%%A
set output

If we can rely on the fact that the first token will always be enclosed in quotes, then the solution is even easier. We can use FOR /F with both EOL and DELIMS set to ".

@echo off
setlocal disableDelayedExpansion
set input="c:\path with spaces, ampersand &, carets ^ and (parentheses)! and %% signs\python.exe" "%%1" %%*
set input
set "output="
setlocal enableDelayedExpansion
for /f eol^=^"^ delims^=^" %%A in ("!input!") do endlocal & set output="%%A"
set output

However, I just looked at my FTYPE output and discovered that some entries are not quoted, even though they contain spaces in the path! I don't think any of the answers on this page will handle this. In fact, the entire premise behind the question may be flawed.

Upvotes: 2

jeb

Reputation: 82247

As Aacini said, your problem can be solved with the internal parameter splitting by using the CALL statement.

To avoid losing % signs during the call, you can double them just before the call expansion.
The key line is set "input=!input:%%=%%%%!". The percent signs are halved in one of the parser phases, so this line replaces each single % with %%.

But even then this solution isn't perfect!

This solution has problems with special characters like &<>|; in your case only & matters, as it is the only one of these that is legal in a filename/path.
That can be avoided by changing the line set "%~1=%2" to set ^"%~1=%2"; this ensures that the content of %2 is protected by its own surrounding quotes.

But now you have another problem: all carets are doubled,
so another replacement is needed on the output, with set "output=!output:^^=^!".

The new code would look like this

@echo off
setlocal EnableDelayedExpansion 
set input="c:\path with spaces, exclamation mark^!, ampersand &, carets ^ and (parentheses) and %% signs\python.exe" "%%1" %%*
echo !input!
set "input=!input:%%=%%%%!"
call :first_token output !input!
set "output=!output:^^=^!"
echo !output!
goto :eof

:first_token
set ^"%~1=%2"
goto :eof

EDIT: To also handle exclamation marks (!),
change the :first_token function to

:first_token
setlocal DisableDelayedExpansion
set ^"temp=%2"
set ^"temp=%temp:!=^!%"
(
endlocal
set ^"%~1=%temp%"
)
goto :eof

Upvotes: 2

Ira Baxter

Reputation: 95324

Essentially what you have to do is to convert the entire string into its elements, much as a parser would do it. In your case, lexical analysis would probably do the trick due to Windows rules about where spaces are allowed.

Fundamentally you need to build a finite state machine in your .cmd file with labels and conditional gotos. The FSA has states which process the various parts of the element you wish to collect. In a start state, you decide if you see a blank (skip and go back to start), a double quote (go to the part of the FSA that handles doubly-quoted strings), or something nonblank (go to the part of the FSA that collects nonblank characters).

The FSA part that collects double quoted strings picks off characters until it finds another double quote; that is what lets you capture blanks inside doubly quoted strings. I think you have to check for an "escaped" double quote (two of them in a row) and if found, replace them by a single double quote and continue collecting characters.

This is pretty ugly because the CMD script has truly awful string processing capabilities. Every (ugly) thing you need to know can be found by typing HELP SET to the DOS Command prompt. In particular, substringing is of the form %VAR:~n,m% which picks off m characters starting at index n in the environment variable %VAR%. I've found it useful to SET TEMP=%VAR% and then peel characters off of %TEMP% one by one by simple sequences such as

SET CHAR=%TEMP:~0,1%
SET TEMP=%TEMP:~1%
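
The state machine described above could be sketched roughly as follows. This is only a minimal illustration, not a tested general-purpose tokenizer: the variable names (rest, token, state, q, count) and the sample path are made up for the example, it assumes the input contains no exclamation marks (delayed expansion would eat them), and it does not handle the doubled-quote escape mentioned earlier.

```batch
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical sample input; assumed to contain no "!" characters.
set rest="c:\path with spaces\python.exe" "%%1" %%*
rem q holds a single double quote so it can be compared safely.
set "q=""
set "token="
set "state=start"
set /a count=0

:next_char
if not defined rest goto flush
rem Peel one character off the front of the remaining input.
set "char=!rest:~0,1!"
set "rest=!rest:~1!"
if "!state!"=="start" (
    rem Start state: skip blanks, otherwise begin a quoted or bare token.
    if "!char!"==" " goto next_char
    set "token=!char!"
    if "!char!"=="!q!" (set "state=quoted") else set "state=bare"
    goto next_char
)
if "!state!"=="quoted" (
    rem Quoted state: keep everything, blanks included, up to the closing quote.
    set "token=!token!!char!"
    if "!char!"=="!q!" goto emit
    goto next_char
)
rem Bare state: a blank ends the token.
if "!char!"==" " goto emit
set "token=!token!!char!"
goto next_char

:emit
set /a count+=1
set "token[!count!]=!token!"
set "token="
set "state=start"
goto next_char

:flush
rem Store a token that was still being collected when the input ran out.
if defined token (
    set /a count+=1
    set "token[!count!]=!token!"
)
for /l %%N in (1,1,%count%) do echo !token[%%N]!
```

The collected tokens end up in token[1], token[2], and so on, with the surrounding quotes kept, so the first one can be passed straight to something like if exist. Reading and writing the values only through !var! expansion is what keeps characters such as %, & and parentheses from being re-parsed.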

Enjoy.

Upvotes: 1

Aacini

Reputation: 67216

This is a built-in capability of Batch. The parameters of a Batch file are separated by spaces, and a parameter may be enclosed in quotes, so just pass the value of reg_entry as the parameters of a Batch file and, inside it, take each parameter in turn:

C:\>type test.bat
@echo off
:loop
echo %1
shift
if not "%~1" == "" goto loop


C:\>echo %reg_entry%
"c:\path with spaces and (parentheses) and % signs\python.exe" "%1" %*


C:\>test %reg_entry%
"c:\path with spaces and (parentheses) and % signs\python.exe"
"%1"
%*

Upvotes: 2
