Reputation: 321
Below is an example of a parsing program. It takes text from a text file and parses the data using string manipulation, and a couple loops:
Dim myFile As String
Dim text As String
Dim textline As String
Dim cstAct as integer
Dim actOpe as integer
Dim cusNam as integer
Dim act as integer
Dim reg as integer
myFile = "put file patch to text file here"
myFile = Application.GetOpenFilename()
Here is the do loop that I would like to pause once it reaches line 3 (the next account record)
Do Until EOF(1)
Line Input #1, textline
text = text & textline
Loop
cusAct = InStr(text, "ACCOUNT ")
actOpe = InStr(text, "ACCOUNT OPEN:")
reg = InStr(text, "REGION:")
cusNam = InStr(text, "CUSTOMER NAME:")
This is the for...loop I wish to execute once the do...loop stops or 'pauses once it reaches the next record
For i = 2 To ThisWorkbook.Worksheets("b2").Range("a65536").End(xlUp).Row
ThisWorkbook.Worksheets("name").Range("a" & i).Value = Mid(text, cstAct + 6, 9)
ThisWorkbook.Worksheets("name").Range("b" & i).Value = Mid(text, actOpe + 13, 27)
ThisWorkbook.Worksheets("name").Range("c" & i).Value = Mid(text, reg + 6, 9)
ThisWorkbook.Worksheets("name").Range("d" & i).Value = Mid(text, cusNam + 20, 19)
This is where I want to resume the 'do...loop' so that new the sub strings (ie 987654321 would be the new substring that results from Mid(text, cstAct + 6, 9)) of its respective parent string (ie ACCOUNT) refresh so to speak otherwise, lines 1 and 2 will just loop over and over again.
next i
Below is an example of the sample text file:
ACCOUNT ABCDEF12
ACCOUNT OPEN: 05/10/15 ACT TYPE: PREMIUM
CUSTOMER NAME: JOHN B. SMITH CSA REP: 154983
CUSTOMER ADDRESS: 123 SOMEWHERE DRIVE SOMETHING HERE:
LAST ORDER: 06/24/2011 COUNTRY CODE: UNITED STATES
INVOICE #: 123456789 STATE CODE: CALIFORNIA
LAST MAINTENANCE: 01/02/15 COUNTY CODE: UNCODED
SOME INDICATOR: NO COMPLAINTS: NO IPM IND: DATAPREP/PERF4
SOME INDICATOR: NO STATUS: NONE AUTO RENEW: YES
SOMETHING HERE: NO
SOMETHING HERE: ABC IND:
SOMETHING HERE: 2 ABC ASSET NO: T
ACCOUNT ZXYFDG13
ACCOUNT OPEN: 05/10/15 ACT TYPE: PREMIUM
CUSTOMER NAME: JANE B. SMITH CSA REP: 154983
CUSTOMER ADDRESS: 123 SOMEWHERE DRIVE SOMETHING HERE:
LAST ORDER: 06/24/2011 COUNTRY CODE: UNITED STATES
INVOICE #: 123456789 STATE CODE: CALIFORNIA
LAST MAINTENANCE: 01/02/15 COUNTY CODE: UNCODED
SOME INDICATOR: NO COMPLAINTS: NO IPM IND: DATAPREP/PERF4
SOME INDICATOR: NO STATUS: NONE AUTO RENEW: YES
SOMETHING HERE: NO
SOMETHING HERE: ABC IND: NO
SOMETHING HERE: 2 REGION: NE
without adjusting the above code structure, output in excel will look like this:
A B C D
ROW 1 123456789 00/00/0000 NY JON SMITH
ROW 2 123456789 00/00/0000 NY JON SMITH
I am trying to get it to look like this:
A B C D
ROW 1 123456789 00/00/0000 NY JON SMITH
ROW 2 987654321 00/00/0000 FL JANE SMITH
Any thoughts on how to best do this?
Upvotes: 2
Views: 4613
Reputation: 1489
If you know the literal structure of each "record type", then you can declare them as VBA User Defined Type
structures for reading (and writing). Further, it looks like you can simplify your efforts with a slightly different code design and improve your error handling.
Consider how I would approach this problem using UDFs, which makes the code so much more readable and therefore, maintainable:-
'Always set this to ensure you have all variables declared
Option Explicit
'User Defined Types for each record format
Private Type AccountInfoType
OpenText As String * 18 'Absorb all text and prefixes up to data
OpenDate As String * 8 'Contains the data
AccTypeText As String * 24 'Absorb all text and prefixes up to data
AccType As String * 7 'Contains the data
'Add additional fields here
CRLF As String * 2 'CR/LF character
End Type
Private Type CustomerNameType
NameText As String * 18 'Absorb all text and prefixes up to data
Name As String * 20 'Contains the data
CsaRepText As String * 12 'Absorb all text and prefixes up to data
CsaRep As String * 6 'Contains the data
'Add additional fields here
CRLF As String * 2 'CR/LF character
End Type
Private Type AddressType
AddressText As String * 18 'Absorb all text and prefixes up to data
AddressData As String * 20 'Contains the data
SomethingHereText As String * 17 'Absorb remaining text
'Add additional fields here
CRLF As String * 2 'CR/LF character
End Type
Private Type LastOrderType
LastOrderText As String * 18 'Absorb all text and prefixes up to data
LastOrderDate As String * 10 'Contains the data
CountryText As String * 27 'Absorb all text and prefixes up to data
Country As String * 13 'Contains the data
'Add additional fields here
CRLF As String * 2 'CR/LF character
End Type
Private Type InvoiceType
InvoiceText As String * 18 'Absorb all text and prefixes up to data
InvoiceNumber As String * 9 'Contains the data
StateText As String * 28 'Absorb all text and prefixes up to data
State As String * 10 'Contains the data
'Add additional fields here
CRLF As String * 2 'CR/LF character
End Type
Sub ParseFile()
Dim wb As Workbook
Dim ws As Worksheet
Dim row As Long
Dim dataRecord As String
Dim accountNumber As String
Dim accountInfo As AccountInfoType
Dim customerName As CustomerNameType
Dim address As AddressType
Dim lastOrder As LastOrderType
Dim invoice As InvoiceType
Dim myFile As Variant
'Consider using proper error handling
On Error GoTo ParseFileZ
myFile = Application.GetOpenFilename()
If myFile = False Then
'Not a fan of GoTo but better than running the whole method inside if/then block
GoTo ParseFileX
End If
'I started with a new workbook. Change this to open an exsting workbook if desired
Set wb = Application.Workbooks.Add
'Set this handle to your desired worksheet
Set ws = wb.Worksheets(1)
'Set up column headers here. I chose row 3 to allow for a heading in row 1. Choose your own...
ws.Range("A3").Value = "Acc Number"
ws.Range("B3").Value = "Acc Opened"
ws.Range("C3").Value = "Region"
ws.Range("D3").Value = "Name"
'Base output row in the worksheet
row = 3
'Open the file in binary mode so that you can use User Defined Types to read each record
Open CStr(myFile) For Binary As #1
While Not EOF(1)
'Read next record
Input #1, dataRecord
'Find the first record of the next account - otherwise, skip until you get one
If Left(dataRecord, 7) = "ACCOUNT" And Len(dataRecord) = 16 Then
'Found the Account Number record. This is the start of the next account
accountNumber = Mid(dataRecord, 9, 8)
Get #1, , accountInfo 'Read the Account info record
Get #1, , customerName 'Read the Customer Name record
Get #1, , address 'Read the Address record
Get #1, , lastOrder 'Read the Last Order record
Get #1, , invoice 'read the Invoice record
'Ignore the remaining records unless you want to get more data. The "Read Next Record" loop will skip them
'Get the next row number on the output worksheet to write values to
row = row + 1
'Assign the values from the various records
ws.Cells(row, 1).Value = Trim(accountNumber)
ws.Cells(row, 2).Value = Trim(accountInfo.OpenDate)
ws.Cells(row, 3).Value = Trim(invoice.State) '(you talk about "region" but no region in data sample)
ws.Cells(row, 4).Value = Trim(customerName.Name)
'Add more cells for additional records you want to extra fields from here
End If
Wend
'We're finished. Close the file
Close #1
'Resize the cells for readibilty
ws.Cells.EntireColumn.AutoFit
ParseFileX:
'Disable error handling
On Error GoTo 0
'Be a good memory citizen
Set ws = Nothing
Set wb = Nothing
Exit Sub
ParseFileZ:
MsgBox Err.Number & " - " & Err.Description, "Error occurred"
Resume ParseFileX
End Sub
Upvotes: 2