Reputation: 13
I have a set of Excel
spreadsheets across multiple directories that I'm trying to get a row count on. I've had limited success so far as I'm a bit of a PowerShell
neophyte and was hoping that I'd get a little better luck from asking.
I have Excel
available but this is the best progress I've made so far, which as far as I can tell is giving my the occupied cell count and not the line count.
Any thoughts?
get-content (Get-ChildItem -Path .\ -Include *.xlsx -Recurse) | measure-object -line
I need to be able to deploy this across multiple directories which are similar but not of identical structure, however I can assume that every Excel
file in the directory will be pertinent to the count.
There are some possible solutions for single files floating around here but mostly have to do with item counting instead of bulk checks.
Upvotes: 1
Views: 5595
Reputation: 36332
I don't think that you can get the number of lines without actually opening the file in Excel. As such you are going to need to choose if you want this to be fast, or if you want it to be accurate. Do your files include header rows, and should those be included?
Getting the UsedRange
for a file is easy once it's open in Excel, and opening files in Excel is simple enough really. The UsedRange is not always completely accurate of the actual range that has data, but more often than not it's a good indication of it. It is possible to format cells beyond those that have data, and in that case your UsedRange
will include those 'blank, yet formatted' cells. If that isn't a concern then your issue is simple.
What I'll use for this will be Get-ChildItem
, just like you did, to get the actual files. I'll also be using the Excel ComObject to open the program and interact with it (open files, get info, close files). I'll open each file and get the UsedRange
for the last active worksheet of the file, and grab the count of the rows property. Then we will use Add-Member
to append that info to the [FileInfo]
object that we got from Get-ChildItem
, and collect that info in a variable.
$Excel = New-Object -comobject Excel.application
$ExcelFiles = Get-ChildItem .\*.xlsx -recurse | ForEach{
$WB = $Excel.Workbooks.Open($_.fullname)
Add-Member -InputObject $_ -NotePropertyName 'RowCount' -NotePropertyValue ($Excel.ActiveSheet.UsedRange.Rows.Count) -PassThru
$WB.Close()
}
$Excel.Quit()
From here you can output the information you now have however you'd like. For example, just the name of the file, and the number of rows:
$ExcelFiles | Format-Table Name,RowCount
Upvotes: 1