Lha

Reputation: 33

How do you count consecutive strings in a file using PowerShell?

I want to read a file and count the consecutive occurrences of each string in it. My file contains the following lines:

1
1
1
0
0
0
0
1
1
1
0
1
1
0
0
0
1
0
1
1
1
0
0

I know next to nothing about PowerShell but I do know bash, so for anyone who understands both, this is the effect I want:

[me@myplace aaa8]$ cat fule1|uniq -c
      3 1
      4 0
      3 1
      1 0
      2 1
      3 0
      1 1
      1 0
      3 1
      2 0

And if possible, please also add the PowerShell equivalent of sort -hr :D

[me@myplace aaa8]$ cat fule1|uniq -c|sort -hr
      4 0
      3 1
      3 1
      3 1
      3 0
      2 1
      2 0
      1 1
      1 0
      1 0

So basically this tells me that the longest streak in my file is 4 zeroes, and so on.

Is there a way to do this with PowerShell?

Upvotes: 2

Views: 528

Answers (1)

mklement0

Reputation: 438208

PowerShell's equivalent to the uniq utility, the Get-Unique cmdlet, unfortunately has no equivalent to the former's -c option for prepending the number of consecutive duplicate lines (as of PowerShell v6.2).

Note: Enhancing Get-Unique to support a -c-like feature and other features offered by the uniq POSIX utility is the subject of this feature request on GitHub.

Therefore, you must roll your own solution:

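function Get-UniqueWithCount {

  begin {
    # Running count for the current block of identical lines, and the line itself.
    $instanceCount = 1; $prevLine = $null
  }

  process {
    if ($_ -eq $prevLine) {
      # Same as the previous line: extend the current block.
      ++$instanceCount
    } elseif ($null -ne $prevLine) {
      # A different line: output the finished block and start a new count.
      [pscustomobject] @{ InstanceCount = $instanceCount; Line = $prevLine }
      $instanceCount = 1
    }
    $prevLine = $_
  }

  end {
    # Output the last block, unless there was no input at all.
    if ($null -ne $prevLine) {
      [pscustomobject] @{ InstanceCount = $instanceCount; Line = $prevLine }
    }
  }

}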

The above function accepts input from the pipeline (object by object as $_ in the process { ... } block). It compares each object (line) to the previous one and, if they're equal, increments the instance count; once a different line is found, the previous line is output, along with its instance count, as an object with properties InstanceCount and Line. The end { ... } block outputs the final output object for the last block of identical consecutive lines. See about_Functions_Advanced.
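One detail worth knowing: PowerShell's -eq compares strings case-insensitively, whereas uniq is case-sensitive by default. As a rough sketch only, the same logic could be written as an advanced function with a declared pipeline parameter and an opt-in case-sensitive mode. The name Get-UniqueRun, the parameter name InputLine and the -CaseSensitive switch are assumptions made up for this illustration, not existing cmdlets or parameters:

```powershell
# Hypothetical variant for illustration; Get-UniqueRun and -CaseSensitive
# are made-up names, not part of PowerShell.
function Get-UniqueRun {
  [CmdletBinding()]
  param(
    [Parameter(ValueFromPipeline)]
    [string] $InputLine,

    [switch] $CaseSensitive
  )

  begin {
    # Length of the current block (0 = nothing seen yet) and its first line.
    $count = 0
    $prev = $null
  }

  process {
    # -ceq compares case-sensitively (like uniq); -eq ignores case.
    $same = if ($CaseSensitive) { $InputLine -ceq $prev } else { $InputLine -eq $prev }
    if ($count -gt 0 -and $same) {
      ++$count
    } else {
      # A new block starts: emit the finished one, if there is one.
      if ($count -gt 0) {
        [pscustomobject] @{ InstanceCount = $count; Line = $prev }
      }
      $prev = $InputLine
      $count = 1
    }
  }

  end {
    # Emit the last block; nothing is output for empty input.
    if ($count -gt 0) {
      [pscustomobject] @{ InstanceCount = $count; Line = $prev }
    }
  }
}
```

With input 'a', 'a', 'A', 'b', this sketch should report one block of three by default, and blocks of 2, 1 and 1 with -CaseSensitive.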

Then invoke it as follows:

Get-Content fule1 | Get-UniqueWithCount

which yields:

InstanceCount Line
------------- ----
            3 1
            4 0
            3 1
            1 0
            2 1
            3 0
            1 1
            1 0
            3 1
            2 0

Since Get-UniqueWithCount conveniently outputs objects whose typed properties we can act on, the equivalent of sort -hr (human-readable numeric sort (-h), in descending (reverse) order (-r)) is easy:

Get-Content fule1 | Get-UniqueWithCount | Sort-Object -Descending InstanceCount

which yields:

InstanceCount Line
------------- ----
            4 0
            3 1
            3 1
            3 0
            3 1
            2 1
            2 0
            1 0
            1 1
            1 0
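Note that rows with equal counts come out in a different order here than in the sort -hr output from the question. If that matters, one possible approach is to sort by Line as a second key. The sketch below is self-contained: instead of reading the file, it rebuilds the ten count/line pairs shown above by hand, and it formats each row roughly the way uniq -c prints it. The variable names are illustrative only:

```powershell
# Rebuild the run-length objects from the output shown above (no file needed).
$runs = '3 1','4 0','3 1','1 0','2 1','3 0','1 1','1 0','3 1','2 0' | ForEach-Object {
  $count, $line = $_ -split ' '
  [pscustomobject] @{ InstanceCount = [int] $count; Line = $line }
}

# Sort by count first and by line second; -Descending applies to both keys.
$sorted = $runs | Sort-Object -Descending -Property InstanceCount, Line

# Right-align the count in a 7-character column, similar to uniq -c.
$formatted = $sorted | ForEach-Object { '{0,7} {1}' -f $_.InstanceCount, $_.Line }
$formatted

# The first object after sorting is the longest streak.
$longest = $sorted | Select-Object -First 1
```

Here $longest holds the streak of 4 zeroes, and $formatted lists the rows in the same order as the sort -hr output in the question.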

Upvotes: 1
