HBasiri

Reputation: 373

Accelerate PowerShell script runtime

I'm using a PowerShell script that converts a specific log format to a tab- or comma-separated (CSV) format. It looks like this:

$filename = "filename.log"
foreach ($line in [System.IO.File]::ReadLines($filename)) {
    $x = [regex]::Split( $line , 'regex')
    $xx = $x -join ","
    $xx >> Results.csv
}  

It works, but converting a 20 MB log file takes almost 20 minutes. Is there a way to speed it up?
My system: CPU: Core i7-3720QM / RAM: 8 GB
Update: The log format is like this:

192.168.1.5:24652 172.16.30.8:80  http://www.example.com "useragent"  

I want the destination format to be:

192.168.1.5,24652,172.16.30.8,80,http://www.example.com,"useragent"

REGEX: ^([\d\.]+):(\d+)\s+([\d\.]+):(\d+)\s+([^ ]*)\s+(\".*\")$

Upvotes: 1

Views: 76

Answers (1)

Mathias R. Jessen

Reputation: 174485

As Lieven Keersmaekers points out, you can do a single -replace operation to do the work.
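
To illustrate the single -replace idea, here is a minimal sketch that applies the regex from the question to the sample line from the question, using capture-group references in the replacement. The variable names are only illustrative, and the sample line is assumed to have no trailing whitespace (the pattern is anchored with $, so a line that does not match is returned unchanged).

```powershell
# Sample line and pattern are copied from the question; variable names are illustrative.
$line    = '192.168.1.5:24652 172.16.30.8:80  http://www.example.com "useragent"'
$pattern = '^([\d\.]+):(\d+)\s+([\d\.]+):(\d+)\s+([^ ]*)\s+(\".*\")$'

# Single-quoted replacement so PowerShell does not expand $1..$6 as variables;
# the regex engine substitutes the six capture groups instead.
$csv = $line -replace $pattern, '$1,$2,$3,$4,$5,$6'
$csv   # 192.168.1.5,24652,172.16.30.8,80,http://www.example.com,"useragent"
```

This replaces the Split and -join pair with one operation per line.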

Additionally, foreach($thing in $o.GetThings()){} will initially block until GetThings() returns and then store the entire result in memory, which you don't need. You can avoid this by using the pipeline instead.

Finally, your regex can be simplified so that the engine doesn't have to parse the entire string before splitting, by matching either a : that is preceded by a digit, or a run of whitespace:

Get-Content filename.log | ForEach-Object {
    $_ -replace '(?:(?<=\d)\:|\s+)', ','
} | Out-File results.csv
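
As a rough end-to-end check of the pipeline above, the following sketch writes the sample line from the question to a temporary file, converts it, and reads the result back. The temp file names and the TrimEnd() call are assumptions added for illustration; TrimEnd() is there because trailing whitespace would otherwise turn into a trailing comma. Note also that the simplified pattern treats every run of whitespace as a separator, so it assumes the quoted user agent contains no spaces.

```powershell
# Hypothetical temp files so the sketch does not touch real logs.
$log = Join-Path ([System.IO.Path]::GetTempPath()) 'sample.log'
$csv = Join-Path ([System.IO.Path]::GetTempPath()) 'sample.csv'

# One sample line, copied from the question.
'192.168.1.5:24652 172.16.30.8:80  http://www.example.com "useragent"' |
    Out-File -FilePath $log

# Stream lines through the pipeline. Out-File sits at the end of the pipeline,
# so the output file is opened once rather than once per line as with >> in a loop.
Get-Content -Path $log |
    ForEach-Object { $_.TrimEnd() -replace '(?:(?<=\d)\:|\s+)', ',' } |
    Out-File -FilePath $csv

$result = Get-Content -Path $csv
$result   # 192.168.1.5,24652,172.16.30.8,80,http://www.example.com,"useragent"
```

Opening the output file only once is likely where most of the time is saved compared with appending via >> inside the loop.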

Upvotes: 2
