Reputation: 9
I have a huge .txt file, around 500 MB, produced daily by my web apps. Each row has 21 fields separated by the pipe character (|), and the file has more than 2 million rows. To speed up processing, I currently split the input file into fixed-size chunks, and now I need to split it by a Branch field that I'm about to add.
'previous header
Date|Field_2|Field_3|Field_4|Field_5|Field_6|Field_7|Field_8|Field_9|Field_10|Field_11|Field_12|Field_13|Field_14|Field_15|Field_16|Field_17|Field_18|Field_19|Field_20|
'after adding the Branch field
Date|Branch|Field_2|Field_3|Field_4|Field_5|Field_6|Field_7|Field_8|Field_9|Field_10|Field_11|Field_12|Field_13|Field_14|Field_15|Field_16|Field_17|Field_18|Field_19|Field_20|
'I currently split the file with this code:
'got the script from http://prabhuram.com/articles/2012/02/28/splitting-large-files-using-vbscript/
Dim Counter
Const InputFile = "C:\input.txt"
Const OutputFile = "C:\output"
Const RecordSize = 1000000
Const ForReading = 1
Const ForWriting = 2
Const ForAppending = 8
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFile = objFSO.OpenTextFile (InputFile, ForReading)
Counter = 0
FileCounter = 0
Set objOutTextFile = Nothing
Do Until objTextFile.AtEndOfStream
    If Counter = 0 Or Counter = RecordSize Then
        Counter = 0
        FileCounter = FileCounter + 1
        If Not objOutTextFile Is Nothing Then objOutTextFile.Close
        Set objOutTextFile = objFSO.OpenTextFile(OutputFile & "_" & FileCounter & ".txt", ForWriting, True)
    End If
    strNextLine = objTextFile.ReadLine
    objOutTextFile.WriteLine(strNextLine)
    Counter = Counter + 1
Loop
objTextFile.Close
objOutTextFile.Close
Msgbox "Done..."
The code works fine and starts a new output file every RecordSize = 1000000 rows. Now I want to improve it: I'm adding a new field (Branch) for per-branch reporting, and I want to split the huge file into separate output files based on the branch code (example branch codes: AAA, BBB, CCC, DDD, etc.). The input file is already sorted by branch, so the script does not need any extra sorting step.
In short: one huge .txt file --> separate .txt files, one per branch code, with each output file named after its branch code (e.g. AAA.txt, and so on).
Any idea how I can accomplish this using VBScript?
Upvotes: 0
Views: 5436
Reputation: 200293
You need to write to multiple files identified by your branch code. I'd probably use a dictionary for managing them, e.g. like this:
...
Set outFiles = CreateObject("Scripting.Dictionary")
Do Until objTextFile.AtEndOfStream
    line = objTextFile.ReadLine
    ' The branch code is the second pipe-separated field (index 1).
    branchCode = Split(line, "|")(1)
    ' Open one output file per branch code the first time it is seen.
    If Not outFiles.Exists(branchCode) Then
        outFiles.Add branchCode, objFSO.OpenTextFile(OutputFile _
            & "_" & branchCode & ".txt", ForWriting, True)
    End If
    outFiles(branchCode).WriteLine line
Loop
' Close every output file once the input is exhausted.
For Each branchCode In outFiles.Keys
    outFiles(branchCode).Close
Next
...
Adjust the name of the output files as you see fit.
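Since the question says the input is already sorted by branch, a variation is possible that keeps only one output file open at a time and switches files whenever the branch code changes. The following is a minimal sketch of that idea, not a tested drop-in replacement. The Sub name SplitByBranch and its two parameters are made up for illustration, and it assumes the first line of the input is the header row (which it copies into every output file) and that blank lines or lines without a second field can be skipped.

```vbscript
Option Explicit

Const ForReading = 1
Const ForWriting = 2

' Hypothetical helper: writes one <branch code>.txt file per branch into
' outputFolder. Assumes rows are already grouped by branch, so only one
' output file is open at any time.
Sub SplitByBranch(inputPath, outputFolder)
    Dim fso, inFile, outFile, header, line, fields, branchCode, currentBranch

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set inFile = fso.OpenTextFile(inputPath, ForReading)
    Set outFile = Nothing
    currentBranch = ""
    header = ""

    ' Assumption: the first line is the header row.
    If Not inFile.AtEndOfStream Then header = inFile.ReadLine

    Do Until inFile.AtEndOfStream
        line = inFile.ReadLine
        fields = Split(line, "|")
        ' Skip blank lines and lines that have no second field.
        If UBound(fields) >= 1 Then
            branchCode = fields(1)
            ' Switch output files when the branch code changes.
            If outFile Is Nothing Or branchCode <> currentBranch Then
                If Not outFile Is Nothing Then outFile.Close
                Set outFile = fso.OpenTextFile( _
                    fso.BuildPath(outputFolder, branchCode & ".txt"), _
                    ForWriting, True)
                outFile.WriteLine header
                currentBranch = branchCode
            End If
            outFile.WriteLine line
        End If
    Loop

    If Not outFile Is Nothing Then outFile.Close
    inFile.Close
End Sub
```

With the paths from the question, it could be called as SplitByBranch "C:\input.txt", "C:\". Note that each file is opened with ForWriting, so if a branch code reappears later in an unsorted input, its file would be overwritten; in that case the dictionary approach above is the safer choice.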
Upvotes: 1