小太郎

Reputation: 5620

Parsing a command line string into argv format

I need to parse a command line string into the argv format so I can pass it to execvpe. Basically, I want a Linux equivalent of CommandLineToArgvW() from Windows. Is there a function or library I could call to do this, or do I have to write my own parser? (I was hoping I could steal from Bash if I needed to, since my program is GPL...)

Example: I have three variables:

const char* file = "someapplication";
const char* parameters = "param1 -option1 param2";
const char* environment[] = { "Something=something", NULL };

and I want to pass it to execvpe:

execvpe(file, /* parsed parameters */, environment);

PS: I do not want filename expansion, but I do want quoting and escaping.

Upvotes: 3

Views: 6197

Answers (5)

Mike Finch

Reputation: 877

The following is my first attempt at duplicating what the CommandLineToArgvW() function in the Windows shell library does. It uses only standard functions and types, and does not use Boost. Except for one call to strdup(), it is platform independent and works in both Windows and Linux environments. It handles arguments that are single quoted or double quoted.

// Snippet copied from a larger file.  I hope I added all the necessary includes.
#include <stdlib.h>   // malloc()
#include <string>
#include <string.h>   // strdup(), strchr(), strpbrk()
#include <vector>

using namespace std;

char ** CommandLineToArgv( string const & line, int & argc )
{
    typedef vector<char *> CharPtrVector;
    char const * WHITESPACE_STR = " \n\r\t";
    char const SPACE = ' ';
    char const TAB = '\t';
    char const DQUOTE = '\"';
    char const SQUOTE = '\'';
    char const TERM = '\0';


    //--------------------------------------------------------------------------
    // Copy the command line string to a character array.
    // strdup() uses malloc() to get memory for the new string.
#if defined( WIN32 )
    char * pLine = _strdup( line.c_str() );
#else
    char * pLine = strdup( line.c_str() );
#endif


    //--------------------------------------------------------------------------
    // Crawl the character array and tokenize in place.
    CharPtrVector tokens;
    char * pCursor = pLine;
    while ( *pCursor )
    {
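        // Whitespace. Skip every character that can also end an unquoted
        // token, so a stray newline is never treated as the start of a token.
        if ( *pCursor == SPACE || *pCursor == TAB || *pCursor == '\n' || *pCursor == '\r' )
        {
            ++pCursor;
        }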

        // Double quoted token.
        else if ( *pCursor == DQUOTE )
        {
            // Begin of token is one char past the begin quote.
            // Replace the quote with whitespace.
            tokens.push_back( pCursor + 1 );
            *pCursor = SPACE;

            char * pEnd = strchr( pCursor + 1, DQUOTE );
            if ( pEnd )
            {
                // End of token is one char before the end quote.
                // Replace the quote with terminator, and advance cursor.
                *pEnd = TERM;
                pCursor = pEnd + 1;
            }
            else
            {
                // End of token is end of line.
                break;
            }
        }

        // Single quoted token.
        else if ( *pCursor == SQUOTE )
        {
            // Begin of token is one char past the begin quote.
            // Replace the quote with whitespace.
            tokens.push_back( pCursor + 1 );
            *pCursor = SPACE;

            char * pEnd = strchr( pCursor + 1, SQUOTE );
            if ( pEnd )
            {
                // End of token is one char before the end quote.
                // Replace the quote with terminator, and advance cursor.
                *pEnd = TERM;
                pCursor = pEnd + 1;
            }
            else
            {
                // End of token is end of line.
                break;
            }   
        }

        // Unquoted token.
        else
        {
            // Begin of token is at cursor.
            tokens.push_back( pCursor );

            char * pEnd = strpbrk( pCursor + 1, WHITESPACE_STR );
            if ( pEnd )
            {
                // End of token is one char before the next whitespace.
                // Replace whitespace with terminator, and advance cursor.
                *pEnd = TERM;
                pCursor = pEnd + 1;
            }
            else
            {
                // End of token is end of line.
                break;
            }
        }
    }


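    //--------------------------------------------------------------------------
    // Fill the argv array. Allocate one extra slot for the terminating NULL
    // pointer that execvpe() and the other exec functions require.
    // The tokens point into pLine, so that buffer must stay allocated for as
    // long as argv is in use.
    argc = static_cast<int>( tokens.size() );
    char ** argv = static_cast<char **>( malloc( ( argc + 1 ) * sizeof( char * ) ) );
    if ( !argv )
    {
        free( pLine );
        argc = 0;
        return NULL;
    }
    int a = 0;
    for ( CharPtrVector::const_iterator it = tokens.begin(); it != tokens.end(); ++it )
    {
        argv[ a++ ] = (*it);
    }
    argv[ a ] = NULL;


    return argv;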
}

Upvotes: 0

小太郎

Reputation: 5620

I used the link given by rve in the comments (http://bbgen.net/blog/2011/06/string-to-argc-argv) and that solved my problem. Upvote his comment, not my answer!

Upvotes: 3

user735796

Reputation:

Use my nargv procedure. I've literally beaten this question to death with this answer: https://stackoverflow.com/a/10071763/735796. nargv means New Argument Vectors. It supports everything a shell would in regard to parsing a string into separate elements. It supports double quotes, single quotes, and string concatenation, for example.
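For readers who want to see what those rules look like in code, here is a minimal sketch. It is not nargv's actual implementation: the names `splitCommandLine` and `toArgv` are made up for illustration, and it assumes a simplified subset of shell quoting (single quotes, double quotes, backslash escapes, concatenation of adjacent quoted and unquoted pieces). It does no filename, variable, or command expansion.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not nargv): split a command line into words using a
// simplified subset of shell quoting rules. No expansion of any kind is done.
std::vector<std::string> splitCommandLine(const std::string& line)
{
    std::vector<std::string> words;
    std::string current;
    bool inWord = false;   // true once the current word has started
    char quote = '\0';     // active quote character, or '\0' if none

    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (quote == '\'') {
            // Inside single quotes everything is literal until the closing quote.
            if (c == '\'') quote = '\0';
            else current += c;
        } else if (quote == '"') {
            if (c == '"') {
                quote = '\0';
            } else if (c == '\\' && i + 1 < line.size() &&
                       (line[i + 1] == '"' || line[i + 1] == '\\')) {
                current += line[++i];   // simplification: only \" and \\ are escapes here
            } else {
                current += c;
            }
        } else if (c == '\'' || c == '"') {
            quote = c;                  // an opening quote also starts a (possibly empty) word
            inWord = true;
        } else if (c == '\\' && i + 1 < line.size()) {
            current += line[++i];       // unquoted backslash: take the next char literally
            inWord = true;
        } else if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
            if (inWord) {               // whitespace ends the current word, if any
                words.push_back(current);
                current.clear();
                inWord = false;
            }
        } else {
            current += c;
            inWord = true;
        }
    }
    if (quote != '\0') throw std::invalid_argument("unterminated quote");
    if (inWord) words.push_back(current);
    return words;
}

// Hypothetical helper: build a NULL-terminated pointer array for the exec
// family. The pointers refer into `words`, which must outlive the result.
std::vector<char*> toArgv(std::vector<std::string>& words)
{
    std::vector<char*> argv;
    for (std::string& w : words) argv.push_back(&w[0]);
    argv.push_back(nullptr);
    assert(argv.size() == words.size() + 1);
    return argv;
}
```

By convention argv[0] is the program name, so a caller would probably insert the file name at the front of the word list before calling `toArgv`, then pass the vector's `data()` pointer to execvpe.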

Upvotes: 1

stativ

Reputation: 1490

Maybe I'm missing something, but why don't you just pass the &argv[1] as parameters and the environment obtained using getenv() as environment?

EDIT: If you want a different separator, you can use the environment variable IFS (internal field separator) to achieve this.

Upvotes: 0

rxdazn

Reputation: 1390

char *strtok(char *s, const char *delim) is what you are looking for.

char *s will be the input string and char *delim will be " ".
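A minimal sketch of that approach, assuming the helper name `splitWithStrtok` (hypothetical) and a delimiter set of space and tab. Note that strtok() knows nothing about quoting, so a quoted argument containing a space is still split in two, which may not be what the question needs.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: split on spaces and tabs with strtok().
// strtok() writes terminators into its input, so work on a private copy.
std::vector<std::string> splitWithStrtok(const std::string& line)
{
    std::vector<char> buffer(line.begin(), line.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    for (char* tok = std::strtok(buffer.data(), " \t");
         tok != nullptr;
         tok = std::strtok(nullptr, " \t")) {
        tokens.push_back(tok);   // copies the token out of the buffer
    }
    return tokens;
}
```

strtok() also keeps hidden static state between calls, so this helper should not be used from several threads at once.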

Upvotes: 0
