georges619

Reputation: 286

Microsoft Text-To-Speech doesn't speak accented characters

I'm developing a Text-To-Speech application using the Microsoft SAPI library. I implemented the speaking function and discovered that accented characters (à,á,â,ä,é,è,ê,í,ì,î,ó,ò,ô,ö,ù,ú,û,ü) are not spoken. Here is my code:

int ttsSpeak( const char* text ) //Text to Speech speaking function
{
  if( SUCCEEDED(hr) )
  {
    hr = SpEnumTokens( SPCAT_VOICES, NULL, NULL, &cpEnum );

    cpEnum->Item( saveVoice, &cpVoiceToken );
    cpVoice->SetVoice( cpVoiceToken ); //Initialization of the voice

    string str( text );
    cout << str;
    std::wstring stemp = std::wstring( str.begin(), str.end() );
    LPCWSTR sw = ( LPCWSTR )stemp.c_str(); //variable allowing to speak my entered text

    printf( "Text To Speech processing\n" );
    hr = cpVoice->Speak( sw, SPF_DEFAULT, NULL ); //speak my text

    saveText = text;

    cpEnum.Release();
    cpVoiceToken.Release();
  }
  else
  {
    printf( "Could not speak entered text\n" );
  }

  return true;
}

I debugged my app and found that the variable str contains the accented characters. However, when I convert it into the wstring variable stemp, each accented character is replaced with an empty square. An LPCWSTR (Long Pointer to Constant Wide String) is then created from stemp in order to speak the entered text. Below is a picture of my variable values.

Variable values

Maybe there is something wrong in my code, but what can I do to ensure that the accented characters are spoken out?

Upvotes: 0

Views: 446

Answers (2)

georges619

Reputation: 286

I implemented the MultiByteToWideChar conversion suggested by @rveerd. Here is the code:

int ttsSpeak( const char* text ) //Text to Speech speaking function
{
  if( SUCCEEDED(hr) )
  {
    hr = SpEnumTokens( SPCAT_VOICES, NULL, NULL, &cpEnum );

    cpEnum->Item( saveVoice, &cpVoiceToken );
    cpVoice->SetVoice( cpVoiceToken ); //Initialization of the voice

    //processing conversion
    int wchars_num = MultiByteToWideChar( CP_ACP, 0, text, -1, NULL, 0 ); 
    wchar_t* wstr = new wchar_t[ wchars_num ];
    MultiByteToWideChar( CP_ACP, 0, text, -1, wstr, wchars_num );

    printf( "Text To Speech processing\n" );
    hr = cpVoice->Speak( wstr, SPF_DEFAULT, NULL ); //speak my text

    saveText = text;

    cpEnum.Release();
    cpVoiceToken.Release();
    delete[] wstr; //free the conversion buffer allocated above
  }
  else
  {
    printf( "Could not speak entered text\n" );
  }

  return true;
}

I also found a shorter way to do the conversion. Just replace the MultiByteToWideChar code with the following:

CA2W pszWide( text, CP_ACP ); //ATL conversion class, requires <atlconv.h>
hr = cpVoice->Speak( pszWide, SPF_DEFAULT, NULL );

Edit: I replaced CP_UTF7 because it is rarely used; CP_UTF8 is generally preferred. However, CP_UTF8 did not work for me, whereas CP_ACP does, presumably because my input text is in the system ANSI code page rather than UTF-8. For more information, see the link @rveerd posted.

Upvotes: 0

rveerd

Reputation: 4006

You can't simply copy a single-byte or multi-byte character string (char, std::string) to a wide character string (wchar_t, std::wstring). You need to do proper conversion between encodings or character sets.
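
To illustrate why the element-wise copy in the question fails, here is a minimal, portable sketch. The helper names naiveWiden and latin1ToWide are hypothetical and not part of any library. The second helper assumes the input bytes are ISO-8859-1 (Latin-1), which coincides with Windows-1252 for the accented letters in the question (bytes 0xA0 to 0xFF) but not for bytes 0x80 to 0x9F, so treat it as a demonstration rather than a general conversion.

```cpp
#include <string>

// Hypothetical helper reproducing the question's approach: every char is
// converted to wchar_t on its own. Where char is signed (the usual case
// with MSVC on x86/x64), a byte such as 0xE9 ('é' in Windows-1252) is a
// negative value, so the conversion does not yield the code point U+00E9.
std::wstring naiveWiden(const std::string& s) {
    return std::wstring(s.begin(), s.end());
}

// Hypothetical helper, ASSUMING the input is ISO-8859-1 (Latin-1).
// Going through unsigned char first keeps the byte value in 0..255,
// which for Latin-1 is exactly the Unicode code point.
std::wstring latin1ToWide(const std::string& s) {
    std::wstring out;
    out.reserve(s.size());
    for (char c : s) {
        out.push_back(static_cast<wchar_t>(static_cast<unsigned char>(c)));
    }
    return out;
}
```

For plain ASCII text both helpers return the same result, which is why the problem only shows up with accented characters.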

You have to determine the encoding used by each string. On Windows, std::string data is usually in a local encoding (the ANSI code page, such as Windows-1252), and std::wstring data is in UTF-16.

On Windows, you can use MultiByteToWideChar for the conversion.

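Alternatively, you can use standard functions such as mbstowcs (whole string) or std::mbtowc (one character at a time). Both interpret the input according to the current C locale.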
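
As a rough sketch of the standard-library route, the helper below wraps std::mbstowcs. The name localeToWide is hypothetical. It assumes the narrow string is encoded in the current C locale, so the program would normally call std::setlocale( LC_ALL, "" ) once at startup to pick up the user's encoding; in the default "C" locale only ASCII is guaranteed to convert. Note that MSVC may warn that mbstowcs is deprecated in favour of mbstowcs_s.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: converts a narrow string in the current C locale's
// encoding to a wide string. Returns false if the input contains a byte
// sequence that is not valid in that locale.
bool localeToWide(const std::string& narrow, std::wstring& wide) {
    // A multibyte string never produces more wide characters than it has
    // bytes, so size() + 1 (for the terminator) is always large enough.
    std::vector<wchar_t> buffer(narrow.size() + 1);

    std::size_t written =
        std::mbstowcs(buffer.data(), narrow.c_str(), buffer.size());
    if (written == static_cast<std::size_t>(-1)) {
        return false; // invalid multibyte sequence for this locale
    }

    wide.assign(buffer.data(), written);
    return true;
}
```

The resulting std::wstring can then be passed to Speak via wide.c_str(), in place of the manually allocated buffer.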

Upvotes: 2
