Reputation: 32715
I'm trying to use Process.Start
with redirected I/O to call PowerShell.exe
with a string, and to get the output back, all in UTF-8. But I don't seem to be able to make this work.
What I've tried:
-Command
parameterConsole.OutputEncoding
in both my console application and in the PowerShell script$OutputEncoding
in PowerShellProcess.StartInfo.StandardOutputEncoding
Encoding.Unicode
instead of Encoding.UTF8
In every case, when I inspect the bytes I'm given, I get different values to my original string. I'd really love an explanation as to why this doesn't work.
Here is my code:
static void Main(string[] args)
{
DumpBytes("Héllo");
ExecuteCommand("PowerShell.exe", "-Command \"$OutputEncoding = [System.Text.Encoding]::UTF8 ; Write-Output 'Héllo';\"",
Environment.CurrentDirectory, DumpBytes, DumpBytes);
Console.ReadLine();
}
static void DumpBytes(string text)
{
Console.Write(text + " " + string.Join(",", Encoding.UTF8.GetBytes(text).Select(b => b.ToString("X"))));
Console.WriteLine();
}
static int ExecuteCommand(string executable, string arguments, string workingDirectory, Action<string> output, Action<string> error)
{
try
{
using (var process = new Process())
{
process.StartInfo.FileName = executable;
process.StartInfo.Arguments = arguments;
process.StartInfo.WorkingDirectory = workingDirectory;
process.StartInfo.UseShellExecute = false;
process.StartInfo.CreateNoWindow = true;
process.StartInfo.RedirectStandardOutput = true;
process.StartInfo.RedirectStandardError = true;
process.StartInfo.StandardOutputEncoding = Encoding.UTF8;
process.StartInfo.StandardErrorEncoding = Encoding.UTF8;
using (var outputWaitHandle = new AutoResetEvent(false))
using (var errorWaitHandle = new AutoResetEvent(false))
{
process.OutputDataReceived += (sender, e) =>
{
if (e.Data == null)
{
outputWaitHandle.Set();
}
else
{
output(e.Data);
}
};
process.ErrorDataReceived += (sender, e) =>
{
if (e.Data == null)
{
errorWaitHandle.Set();
}
else
{
error(e.Data);
}
};
process.Start();
process.BeginOutputReadLine();
process.BeginErrorReadLine();
process.WaitForExit();
outputWaitHandle.WaitOne();
errorWaitHandle.WaitOne();
return process.ExitCode;
}
}
}
catch (Exception ex)
{
throw new Exception(string.Format("Error when attempting to execute {0}: {1}", executable, ex.Message),
ex);
}
}
I found that if I make this script:
[Console]::OutputEncoding = [System.Text.Encoding]::UTF8
Write-Host "Héllo!"
[Console]::WriteLine("Héllo")
Then invoke it via:
ExecuteCommand("PowerShell.exe", "-File C:\\Users\\Paul\\Desktop\\Foo.ps1",
Environment.CurrentDirectory, DumpBytes, DumpBytes);
The first line is corrupted, but the second isn't:
H?llo! 48,EF,BF,BD,6C,6C,6F,21
Héllo 48,C3,A9,6C,6C,6F
This suggests to me that my redirection code is all working fine; when I use Console.WriteLine
in PowerShell I get UTF-8 as I expect.
This means that PowerShell's Write-Output
and Write-Host
commands must be doing something different with the output, and not simply calling Console.WriteLine
.
I've even tried the following to force the PowerShell console code page to UTF-8, but Write-Host
and Write-Output
continue to produce broken results while [Console]::WriteLine
works.
$sig = @'
[DllImport("kernel32.dll")]
public static extern bool SetConsoleCP(uint wCodePageID);
[DllImport("kernel32.dll")]
public static extern bool SetConsoleOutputCP(uint wCodePageID);
'@
$type = Add-Type -MemberDefinition $sig -Name Win32Utils -Namespace Foo -PassThru
$type::SetConsoleCP(65001)
$type::SetConsoleOutputCP(65001)
Write-Host "Héllo!"
& chcp # Tells us 65001 (UTF-8) is being used
Upvotes: 52
Views: 140801
Reputation: 25
Spent some time working on a solution to my issue and thought it may be of interest. I ran into a problem trying to automate code generation using PowerShell 3.0 on Windows 8. The target IDE was the Keil compiler using MDK-ARM Essential Toolchain 5.24.1. A bit different from OP, as I am using PowerShell natively during the pre-build step. When I tried to #include the generated file, I received the error
fatal error: UTF-16 (LE) byte order mark detected '..\GITVersion.h' but encoding is not supported
I solved the problem by changing the line that generated the output file from:
out-file -FilePath GITVersion.h -InputObject $result
to:
out-file -FilePath GITVersion.h -Encoding ascii -InputObject $result
Upvotes: 0
Reputation: 21
Set the [Console]::OuputEncoding
as encoding whatever you want, and print out with [Console]::WriteLine
.
If powershell ouput method has a problem, then don't use it. It feels bit bad, but works like a charm :)
Upvotes: 2
Reputation: 2267
This is a bug in .NET. When PowerShell launches, it caches the output handle (Console.Out). The Encoding property of that text writer does not pick up the value StandardOutputEncoding property.
When you change it from within PowerShell, the Encoding property of the cached output writer returns the cached value, so the output is still encoded with the default encoding.
As a workaround, I would suggest not changing the encoding. It will be returned to you as a Unicode string, at which point you can manage the encoding yourself.
Caching example:
102 [C:\Users\leeholm]
>> $r1 = [Console]::Out
103 [C:\Users\leeholm]
>> $r1
Encoding FormatProvider
-------- --------------
System.Text.SBCSCodePageEncoding en-US
104 [C:\Users\leeholm]
>> [Console]::OutputEncoding = [System.Text.Encoding]::UTF8
105 [C:\Users\leeholm]
>> $r1
Encoding FormatProvider
-------- --------------
System.Text.SBCSCodePageEncoding en-US
Upvotes: 25
Reputation: 2772
Not an expert on encoding, but after reading these...
... it seems fairly clear that the $OutputEncoding variable only affects data piped to native applications.
If sending to a file from withing PowerShell, the encoding can be controlled by the -encoding
parameter on the out-file
cmdlet e.g.
write-output "hello" | out-file "enctest.txt" -encoding utf8
Nothing else you can do on the PowerShell front then, but the following post may well help you:.
Upvotes: 26