Reputation: 149
I have a more than 100 text files and I have to count the lines for each of them. Column A
lists the file name, located in the folder specified in E1
. Several files have more than 1 million lines, causing the script to run a terrifyingly long time.
Sub counter()
Dim fso As New FileSystemObject
Dim ts As TextStream
Dim longtext As String
Dim lines As Variant
Dim GoToNum As Integer
Dim Start As Integer
GoToNum = 2
Start = 3
Do Until IsEmpty(Cells(Start, 1))
GoToNum = GoToNum + 1
Start = Start + 1
Loop
For i = 3 To GoToNum
If Cells(i, 2).Value <= Cells(2, 5).Value Then
ConOrg = Cells(1, 4).Value & "\" & Cells(i, 1).Value
Set ts = fso.OpenTextFile(ConOrg, ForReading, False)
longtext = ts.ReadAll
ts.Close
lines = Split(longtext, vbLf)
Cells(i, 3) = UBound(lines) - LBound(lines) - 1
End If
Next i
End Sub
How can I get the number of the last row (from the text file) to avoid the counting line by line?
Upvotes: 9
Views: 10731
Reputation: 21639
The fastest method depends on the size of the file(s) you're processing:
# of lines | filesize | Open statement |
FileSystemObject |
|
---|---|---|---|---|
one GIANT file | 1.7 million | 40mb × 1 | ❌ 25.2 sec | ✔️ 2.1 sec |
many sᴍᴀʟʟ files | 6 | 14b × 10,000 | ✔️ 1.3 sec | ❌ 18.9 sec |
better for sᴍᴀʟʟ files | better for BIG files |
Function countLF(fName As String) As Long
Dim st As String
Open fName For Input As #1: st = Input(LOF(1), 1): Close #1
countLF = Len(st) - Len(Replace(st, vbLf, "")) + 1
End Function
Example Usage:
Debug.Print countLF("c:\test.txt")
Function countLines(fName As String) As Long
countLines=CreateObject("Scripting.FileSystemObject").OpenTextFile(fName, 8, True).Line
End Function
Example Usage:
Debug.Print countLines("c:\test.txt")
More Benchmarking of other tested variations: (2500 tiny text files)
Binary Access/Get (4.32s) Kill=1.17s . . . Open F `For Binary Access Read As #1:ReDim...Get #1,,bytes
Line Input/LineInput (4.44s) Kill=1.11s . . . Open F For Input As #iFile...Line Input #1,st
Early Bind/ReuseObj (5.25s) Del=1.12s . . . Set o=New Scripting.FileSystemObject':st=o.OpenTextFile(F).ReadAll()
Early Bind/FreshObj (11.98s) Del=1.35s . . . Set o=New Scripting.FileSystemObject':st=o.OpenTextFile(F).ReadAll()
LateBind/ReuseObj (6.25s) Del=1.47s . . . Set o=CreateObject("Scripting.FileSystemObject")
LateBind/FreshObj (13.59s) Del=2.29s . . . With CreateObject("Scripting.FileSystemObject")
Upvotes: 12
Reputation: 7891
An alternative approach would be to use Power Query (Get & Transform Data):
let
Source = Folder.Files("C:\Users\me\MyFolder"),
#"Filtered Rows" = Table.SelectRows(Source, each [Extension] = ".txt"),
#"Added Row Count" = Table.AddColumn(#"Filtered Rows", "Rows In File", each Table.RowCount(Table.FromColumns({Lines.FromBinary([Content])})), Int64.Type),
#"Removed Columns" = Table.SelectColumns(#"Added Row Count",{"Name", "Rows In File"})
in
#"Removed Columns"
This works pretty fast.
Upvotes: 2
Reputation: 145
I dont think you can read last line with just a method.
Do While fso.AtEndOfStream <> True
fso.SkipLine
Loop
lines = fso.Line-1
something like this would not be faster?
Upvotes: -1
Reputation: 29286
Give this function a try. It uses the FileSystemObject
. Should be faster than read the whole file and split it into single lines. Inspired from Hey, Scripting guy
Function countLines(fName As String) As Long
Const ForReading = 1
Dim objFSO As Object, objTextFile As Object
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFile = objFSO.OpenTextFile(fName, ForReading)
objTextFile.ReadAll
countLines = objTextFile.Line
End Function
Upvotes: 3