Ninja Cowgirl
Ninja Cowgirl

Reputation: 11201

Count records in csv file - Powershell

I have following PS script to get a count. Is there a way to count (minus the header) without importing the entire csv? Sometimes the csv file is very large and sometime it has no records.

Get-ChildItem 'C:\Temp\*.csv' | ForEach {
    $check = Import-Csv $_
    If ($check) { Write-Host "$($_.FullName) contains data" }
    Else { Write-Host "$($_.FullName) does not contain data" }
}

Upvotes: 0

Views: 8387

Answers (2)

beatcracker
beatcracker

Reputation: 6920

Here is the function that will check is CSV file empty (returns True if empty, False otherwise) with following features:

  • Can skip headers
  • Works in PS 2.0 (PS 2.0 hasn't -ReadCount switch for Get-Content cmdlet)
  • Doesn't load entire file in memory
  • Aware of CSV file structure (wouldn't count empty/nonvalid lines).

It accepts following arguments:

  • FileName: Path to CSV file.
  • MaxLine: Maximum lines to read from file.
  • NoHeader: If this switch is not specified, function will skip first line of the file

Usage example:

Test-IsCsvEmpty -FileName 'c:\foo.csv' -MaxLines 2 -NoHeader

function Test-IsCsvEmpty
{
    Param
    (
        [Parameter(Mandatory = $true, ValueFromPipeline = $true, ValueFromPipelineByPropertyName = $true)]
        [string]$FileName,

        [Parameter(ValueFromPipelineByPropertyName = $true)]
        [ValidateRange(1, [int]::MaxValue)]
        [int]$MaxLines = 2,

        [Parameter(ValueFromPipelineByPropertyName = $true)]
        [switch]$NoHeader
    )

    Begin
    {
        # Setup regex for CSV parsing
        $DQuotes = '"'
        $Separator = ','
        # http://stackoverflow.com/questions/15927291/how-to-split-a-string-by-comma-ignoring-comma-in-double-quotes
        $SplitRegex = "$Separator(?=(?:[^$DQuotes]|$DQuotes[^$DQuotes]*$DQuotes)*$)"
    }

    Process
    {
        # Open file in StreamReader
        $InFile = New-Object -TypeName System.IO.StreamReader -ArgumentList $FileName -ErrorAction Stop

        # Set inital values for Raw\Data lines count
        $CsvRawLinesCount = 0
        $CsvDataLinesCount = 0

        # Loop over lines in file
        while(($line = $InFile.ReadLine()) -ne $null)
        {
            # Increase Raw line counter
            $CsvRawLinesCount++

            # Skip header, if requested
            if(!$NoHeader -and ($CsvRawLinesCount -eq 1))
            {
                continue
            }

            # Stop processing if MaxLines limit is reached
            if($CsvRawLinesCount -gt $MaxLines)
            {
                break
            }

            # Try to parse line as CSV
            if($line -match $SplitRegex)
            {
                # If success, increase CSV Data line counter 
                $CsvDataLinesCount++
            }
        }
    }

    End
    {
        # Close file, dispose StreamReader
        $InFile.Close()
        $InFile.Dispose()

        # Write result to the pipeline
        if($CsvDataLinesCount -gt 0)
        {
            $false
        }
        else
        {
            $true
        }
    }
}

Upvotes: 0

Dave Sexton
Dave Sexton

Reputation: 11188

To count the rows without worrying about the header use this:

$c = (Import-Csv $_.FullName).count

However this has to read the entire file into memory. A faster way to count the file would be to use the Get-Content with the readcount flag like so:

$c = 0
Get-Content $_.FullName -ReadCount 1000 | % {$c += $_.Length}
$c -= 1

To remove the header row from the count you just subtract 1. If your files with no rows don't have an header you can avoid them counting as minus 1 like so:

$c = 0
Get-Content $_.FullName -ReadCount 1000 | % {$c += $_.Length}
$c -= @{$true = 0; $false = - 1}[$c -eq 0]

Upvotes: 1

Related Questions