Gabriel Frigo

Reputation: 326

Is there a CommandLineToArgvA function in Windows C/C++ (VS 2022)?

There is the CommandLineToArgvW() function, which is CommandLineToArgv + W, where the W means wide char (wchar_t in C/C++). But a CommandLineToArgvA() counterpart, which I would expect to exist just as GetCommandLineA() exists alongside GetCommandLineW(), apparently does not.

char:

int argc;
char **argv = CommandLineToArgvA(GetCommandLineA(), &argc);

wide char:

int argc;
wchar_t **wargv = CommandLineToArgvW(GetCommandLineW(), &argc);

Well, I searched every corner of the Internet for the term CommandLineToArgvA() and the most I found was this function in Linux Wine... I want to know, does this function exist, and if yes, is it normal that it is "hidden"? Otherwise, does it really not exist?

edit: The question was whether the Windows API has a CommandLineToArgvA function; it does not (comment by Remy Lebeau). The answer I accepted is the one that best explains how to use the existing CommandLineToArgvW function and convert the wchar_t strings to char, which gives the same result a CommandLineToArgvA function would give if it existed.

Upvotes: 2

Views: 2019

Answers (2)

Bill Yang

Reputation: 11

Just like you, I wondered why there is no CommandLineToArgvA() function, since normally there is an ANSI version for each wchar_t version. In the end, I found the answer on the Microsoft website: https://learn.microsoft.com/en-us/windows/win32/api/processenv/nf-processenv-getcommandlinea

"The command line returned by GetCommandLineA is a conversion of the Unicode command line to the 8-bit process code page.

For most code pages this conversion is lossy and the converted command line can differ from the Unicode command line, creating possible security issues like the following:

The conversion may alter strings intended for use as file names. For example, if the ANSI code page is Windows-1252, the Unicode character U+0100 (Latin capital letter A with macron: Ā) converts to 0x41 (the Latin capital letter A). If a user passes a file name containing the character Ā, a program that uses GetCommandLineA will receive it with the character A and operate on the wrong file.

The conversion may alter how the command line is parsed. For example, if the ANSI code page is Windows-1252, the Unicode character U+FF02 (Fullwidth quotation mark: ＂) converts to 0x22 (the ASCII quotation mark) and the Unicode character U+2010 (Hyphen: ‐) converts to 0x2D (the ASCII minus sign). Both of these can result in command line file arguments being misinterpreted as command line options.

To avoid this problem, use the GetCommandLineW function to receive the Unicode command line, or use an application manifest (on Windows Version 1903 or later) to set UTF-8 as the process code page."

The string from GetCommandLineA is a conversion of the Unicode command line to the 8-bit process code page. Feeding it to a CommandLineToArgvA() function could therefore cause a security problem, which is presumably why Microsoft does not provide one. In fact, the ANSI versions of Windows functions generally convert their ANSI strings to wchar_t strings under the hood. When I use GetCommandLineA to get an ANSI command line, I just reinvent the wheel and write my own function to parse it.
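To make the difference concrete, here is a small portable sketch of why converting to UTF-8 does not have the problem described in the quote. The helper name utf8_encode is hypothetical and only for illustration; on Windows you would let WideCharToMultiByte with CP_UTF8 do this work. It shows that the characters from the quote keep distinct multi-byte encodings in UTF-8 instead of collapsing to ASCII look-alikes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (illustration only): encodes one Unicode code point
// as UTF-8. Writes up to 4 bytes to out and returns the byte count, or 0
// if cp is not a valid scalar value (a surrogate, or above U+10FFFF).
size_t utf8_encode( uint32_t cp, unsigned char out[4] )
{
  assert( out != NULL );

  if (cp < 0x80)                      // 1 byte: plain ASCII
  {
    out[0] = (unsigned char)cp;
    return 1;
  }
  if (cp < 0x800)                     // 2 bytes: 110xxxxx 10xxxxxx
  {
    out[0] = (unsigned char)(0xC0 | (cp >> 6));
    out[1] = (unsigned char)(0x80 | (cp & 0x3F));
    return 2;
  }
  if (cp >= 0xD800 && cp <= 0xDFFF)   // surrogates are not encodable
    return 0;
  if (cp < 0x10000)                   // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
  {
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
  }
  if (cp <= 0x10FFFF)                 // 4 bytes: 11110xxx + 3 continuation bytes
  {
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
  }
  return 0;
}
```

With this sketch, U+0100 (Ā) becomes the two bytes C4 80 rather than 0x41, and U+FF02 and U+2010 become EF BC 82 and E2 80 90 rather than 0x22 and 0x2D, so none of them can be mistaken for an ASCII letter, quotation mark, or minus sign.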

Upvotes: 1

Dúthomhas

Reputation: 10083

I don’t think you should try parsing your own command-line string. Windows does it one way. Trying to write duplicate code to do the same thing is the Wrong Thing™ to do.

Just get the command line, then use the Windows facilities to convert it to UTF-8.

#include <stdlib.h>
#include <windows.h>
#include <shellapi.h>

#pragma comment(lib, "Shell32")

void get_command_line_args( int * argc, char *** argv )
{
  // Get the command line arguments as wchar_t strings
  wchar_t ** wargv = CommandLineToArgvW( GetCommandLineW(), argc );
  if (!wargv) { *argc = 0; *argv = NULL; return; }
  
  // Count the number of bytes necessary to store the UTF-8 versions of those strings
  // (with a source length of -1 the returned size already includes the terminating NUL)
  int n = 0;
  for (int i = 0;  i < *argc;  i++)
    n += WideCharToMultiByte( CP_UTF8, 0, wargv[i], -1, NULL, 0, NULL, NULL );
  
  // Allocate the argv[] array + all the UTF-8 strings in a single block
  // (the cast lets this compile as C++ as well as C)
  *argv = (char **)malloc( (*argc + 1) * sizeof(char *) + n );
  if (!*argv) { LocalFree( wargv ); *argc = 0; return; }
  
  // Convert all wargv[] --> argv[]
  char * arg = (char *)&((*argv)[*argc + 1]);
  for (int i = 0;  i < *argc;  i++)
  {
    // n tracks the bytes still free in the string area
    int len = WideCharToMultiByte( CP_UTF8, 0, wargv[i], -1, arg, n, NULL, NULL );
    (*argv)[i] = arg;
    arg += len;
    n   -= len;
  }
  (*argv)[*argc] = NULL;
  
  // CommandLineToArgvW allocates its result; release it with LocalFree
  LocalFree( wargv );
}

This obtains an argv just like the one main() gets: writable, and with a final NULL element.

Interface is easy enough. Don’t forget to free() the result when you are done with it. Example usage:

#include <stdio.h>
#include <stdlib.h>

void f(void)
{
  int     argc;
  char ** argv;
  get_command_line_args( &argc, &argv );
  
  for (int n = 0;  n < argc;  n++)
    printf( "  %d : %s\n", n, argv[n] );
  
  free( argv );
}

int main(void)
{
  f();
}
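A single free() is enough because the pointer table and the string data live in one allocation. The following is a minimal portable sketch of just that layout, with the Windows calls left out; pack_argv is a hypothetical name used only for this illustration, and it copies ordinary char strings instead of converting wchar_t ones.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper (illustration only): copies argc strings into ONE
// malloc'd block laid out as
//   [ argv[0] .. argv[argc-1], NULL | "str0\0" "str1\0" ... ]
// Returns NULL if the allocation fails.
char ** pack_argv( int argc, const char * const * src )
{
  assert( argc >= 0 );

  // Total bytes needed for the strings, including each terminating NUL
  size_t bytes = 0;
  for (int i = 0;  i < argc;  i++)
    bytes += strlen( src[i] ) + 1;

  // Pointer table (argc entries + final NULL) followed by the string data
  char ** argv = (char **)malloc( ((size_t)argc + 1) * sizeof(char *) + bytes );
  if (!argv) return NULL;

  // The string area starts right after the last pointer slot
  char * arg = (char *)&argv[argc + 1];
  for (int i = 0;  i < argc;  i++)
  {
    size_t len = strlen( src[i] ) + 1;
    memcpy( arg, src[i], len );
    argv[i] = arg;
    arg += len;
  }
  argv[argc] = NULL;

  return argv;   // release everything with a single free( argv )
}
```

Called as pack_argv( 3, in ) with const char * in[] = { "prog", "-x", "file name.txt" }, it returns a writable, NULL-terminated array whose strings sit directly behind the pointer slots, so freeing the array pointer releases the strings too.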

Enjoy!

Upvotes: 6
