Reputation: 53
I have multiple large log files that I'd like to export to CSV. To start with, I just want to split each entry into two parts, Date and Event. The problem I'm having is that not every line starts with a date.
Here is a sample chunk of log. Date/times are always 23 characters. The rest varies with the log and event description.
I'd like the end result to look like this in Excel.
Here's what I've tried so far, but it just returns the first 23 characters of each line.
$content = Get-Content myfile.log -TotalCount 50
for ($i = 0; $i -lt $content.Length; $i++) {
    $a = $content[$i].ToCharArray()
    $b = ([string]$a[0..23]).replace(" ","")
    Write-Host $b
}
Upvotes: 1
Views: 1779
Reputation: 36277
Read the file as a single multi-line string (with -Raw), then use a regex to split on the date pattern. For each chunk, create a custom object with the two properties you want: the first value is the first 23 characters, and the second is the rest of the string, trimmed.
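# Split the raw text just before each timestamp that starts a line
(Get-Content C:\Path\To\File.log -Raw) -split '(?m)(?=^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})' |
    Where-Object { $_ } |    # drop the empty chunk produced before the first timestamp
    ForEach-Object {
        [PSCustomObject]@{
            'Col1' = $_.Substring(0, 23)       # the 23-character date/time
            'Col2' = $_.Substring(23).Trim()   # the rest of the entry, trimmed
        }
    }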
Then you can pipe that to a CSV, or do whatever you want with the data. If the files are truly massive this may not be viable, because the whole file is loaded into memory, but I would expect it to work OK on files up to a few hundred megabytes. With your sample text, that outputs:
Col1                    Col2
----                    ----
2017-09-04 12:31:11.343 General BOECD:: ProcessStartTime: ...
2017-09-04 12:31:11.479 General MelsecIoWrapper: Scan ended: device: 1, ScanStart: 9/4/2017 12:31:10 PM Display: False
2017-09-04 12:31:11.705 General BOECD:: ProcessEndTime: ...
2017-09-04 12:31:13.082 General BOECD:: DV Data:
The ... at the end of two of the lines is where the multi-line value was truncated to fit on screen; the full value is still there, intact.
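To illustrate the "pipe that to a CSV" step, here is a minimal sketch. It assumes the column names Date and Event from the question, a hypothetical output file in the temp folder, and a small in-line sample in place of the real log (the continuation line is made up for illustration):

```powershell
# Sample text standing in for the real log; the continuation line is invented.
$sample = @'
2017-09-04 12:31:11.479 General MelsecIoWrapper: Scan ended: device: 1
2017-09-04 12:31:13.082 General BOECD:: DV Data:
continuation line without a timestamp
'@

# Hypothetical output path; replace with wherever the CSV should go.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'log-export.csv'

$rows = $sample -split '(?m)(?=^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})' |
    Where-Object { $_ } |
    ForEach-Object {
        [PSCustomObject]@{
            Date  = $_.Substring(0, 23)
            Event = $_.Substring(23).Trim()
        }
    }

# -NoTypeInformation suppresses the "#TYPE" line that Windows PowerShell 5.1 writes by default.
$rows | Export-Csv -Path $csvPath -NoTypeInformation
```

Export-Csv quotes the values, so a multi-line Event should stay within a single field when the file is read back with Import-Csv; Excel generally keeps such quoted line breaks inside one cell as well.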
(?=...) is a so-called "positive lookahead assertion". Such assertions cause a regular expression to match the given pattern without actually including it in the returned match/string. In this case the match returns the empty string before a timestamp, so the string can be split there without removing the timestamp.
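The difference is easiest to see side by side. The following sketch splits the same made-up two-entry string once with the plain timestamp pattern and once with the pattern wrapped in a lookahead (the variable names and sample text are illustrative only):

```powershell
# Made-up two-entry sample; the first entry spans two lines.
$text  = "2017-09-04 12:31:11.343 first`ncontinued`n2017-09-04 12:31:11.479 second"
$stamp = '^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}'

# Plain pattern: the matched timestamp prefix is consumed by the split and lost.
$consumed = $text -split "(?m)$stamp" | Where-Object { $_ }

# Lookahead: the match is zero-width, so each chunk keeps its timestamp.
$kept = $text -split "(?m)(?=$stamp)" | Where-Object { $_ }

$consumed[1]   # .479 second
$kept[1]       # 2017-09-04 12:31:11.479 second
```

In both cases the split also yields an empty string before the first timestamp, which is why the pipeline filters with Where-Object.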
Upvotes: 3