Reputation: 247
I am attempting to normalize a set of TAB-delimited log files using PowerShell.
Here is the current script:
(Get-ChildItem *.csv) | ForEach-Object {
    # Store the name of the file and its date
    $Name = $_.BaseName
    $FileDate = $_.CreationTime
    # Prepend the following to each message: unique identifier, hostname, date
    (Get-Content $_.FullName) -replace "^", "AR-LOG|$Name|$FileDate|" |
        # Replace the TAB delimiter with a pipe delimiter
        ForEach-Object { $_ -replace "`t", '|' } |
        # Append the resulting message in ASCII
        Out-File -Append -FilePath normalized.log -Encoding ascii
}
A snippet of the input & output can be seen here:
How can I force the output file to be ASCII and not some type of Unicode?
Edit: Further troubleshooting reveals that the input files are actually Windows-1252 encoded, which Get-Content apparently cannot handle natively(?)
Upvotes: 4
Views: 19110
Reputation: 49
Change the format of a file from ASCII to UTF8:
$filename = "c:\docs\demo.csv"
(Get-Content $filename) | Set-Content $filename -Encoding UTF8
Upvotes: 4
Reputation: 124
Can you try the ReadAllText method? It reads the whole file into a single string, whereas Get-Content returns an array of strings, one per line of the file. Because the text is then a single string, the ^ anchor needs the (?m) option to match at the start of every line rather than only at the start of the file.
(Get-ChildItem *.csv) | ForEach-Object {
    # Store the name of the file and its date
    $Name = $_.BaseName
    $FileDate = $_.CreationTime
    # Prepend the unique identifier, hostname, and date to each line,
    # then replace the TAB delimiter with a pipe delimiter
    ([IO.File]::ReadAllText($_.FullName)) -replace '(?m)^', "AR-LOG|$Name|$FileDate|" -replace "`t", '|' |
        # Append the resulting message in ASCII
        Out-File -Append -FilePath normalized.log -Encoding ascii
}
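Since the question's edit says the inputs are Windows-1252, it may also be worth passing an explicit encoding to the .NET read call. The following is only a sketch under assumptions: the sample file name and its bytes are made up for illustration, and it assumes code page 1252 is available in your PowerShell session.

```powershell
# Hypothetical sample file: "caf" + 0xE9 (e-acute in Windows-1252) + TAB + "ok" + CRLF
$sample = Join-Path ([IO.Path]::GetTempPath()) 'cp1252-sample.csv'
$sampleBytes = [byte[]](0x63, 0x61, 0x66, 0xE9, 0x09, 0x6F, 0x6B, 0x0D, 0x0A)
[IO.File]::WriteAllBytes($sample, $sampleBytes)

# Look up the Windows-1252 encoding (assumed to be available in this session)
$cp1252 = [Text.Encoding]::GetEncoding(1252)

# Read the lines with the explicit encoding instead of a default guess
$lines = [IO.File]::ReadAllLines($sample, $cp1252)

# Replace the TAB delimiter with a pipe, as in the original script
$normalized = $lines -replace "`t", '|'
$normalized
```

If this decodes your files correctly, the same `$cp1252` object could be passed as the second argument to `ReadAllText` in the script above.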
Upvotes: 0
Reputation: 2639
You should be able to use the encoding flag on Out-File, as in ... | Out-File -encoding ascii myfile.txt. And if you're using append, make sure all appends use the same encoding or you'll end up with an unusable file.
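One way to check the point about consistent encodings is to look at the raw bytes of the file after a couple of appends. This is a minimal sketch; the temp file name and the two sample lines are hypothetical and chosen only for the demonstration.

```powershell
# Hypothetical output path used only for this illustration
$out = Join-Path ([IO.Path]::GetTempPath()) 'normalized-demo.log'
if (Test-Path $out) { Remove-Item $out }

# Every append names the same encoding explicitly
'AR-LOG|host1|first'  | Out-File -Append -FilePath $out -Encoding ascii
'AR-LOG|host2|second' | Out-File -Append -FilePath $out -Encoding ascii

# Inspect the raw bytes: pure ASCII output has no BOM and no byte above 0x7F
$bytes = [IO.File]::ReadAllBytes($out)
$nonAscii = @($bytes | Where-Object { $_ -gt 0x7F })
"Non-ASCII bytes: $($nonAscii.Count)"
```

If one of the appends omitted `-Encoding` or used a different value, the byte check is where a mismatch would be expected to show up.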
Upvotes: 6