Tom
Tom

Reputation: 13

Trying to count Excel Rows in Powershell

I have a set of Excel spreadsheets across multiple directories that I'm trying to get a row count on. I've had limited success so far as I'm a bit of a PowerShell neophyte and was hoping that I'd get a little better luck from asking.
I have Excel available but this is the best progress I've made so far, which as far as I can tell is giving my the occupied cell count and not the line count.
Any thoughts?

get-content (Get-ChildItem -Path .\ -Include *.xlsx -Recurse) | measure-object -line

I need to be able to deploy this across multiple directories which are similar but not of identical structure, however I can assume that every Excel file in the directory will be pertinent to the count.

There are some possible solutions for single files floating around here but mostly have to do with item counting instead of bulk checks.

Upvotes: 1

Views: 5595

Answers (1)

TheMadTechnician
TheMadTechnician

Reputation: 36332

I don't think that you can get the number of lines without actually opening the file in Excel. As such you are going to need to choose if you want this to be fast, or if you want it to be accurate. Do your files include header rows, and should those be included?

Getting the UsedRange for a file is easy once it's open in Excel, and opening files in Excel is simple enough really. The UsedRange is not always completely accurate of the actual range that has data, but more often than not it's a good indication of it. It is possible to format cells beyond those that have data, and in that case your UsedRange will include those 'blank, yet formatted' cells. If that isn't a concern then your issue is simple.

What I'll use for this will be Get-ChildItem, just like you did, to get the actual files. I'll also be using the Excel ComObject to open the program and interact with it (open files, get info, close files). I'll open each file and get the UsedRange for the last active worksheet of the file, and grab the count of the rows property. Then we will use Add-Member to append that info to the [FileInfo] object that we got from Get-ChildItem, and collect that info in a variable.

$Excel = New-Object -comobject Excel.application
$ExcelFiles = Get-ChildItem .\*.xlsx -recurse | ForEach{
    $WB = $Excel.Workbooks.Open($_.fullname)
    Add-Member -InputObject $_ -NotePropertyName 'RowCount' -NotePropertyValue ($Excel.ActiveSheet.UsedRange.Rows.Count) -PassThru
    $WB.Close()
}
$Excel.Quit()

From here you can output the information you now have however you'd like. For example, just the name of the file, and the number of rows:

$ExcelFiles | Format-Table Name,RowCount

Upvotes: 1

Related Questions