TrevorBrooks

Reputation: 3840

Parse Text File with Variable Fields in VB.NET

The format of a text file that I process has changed, so it's time to update the code that parses it. The old file had a fixed number of lines and fields per record, so parsing it by position was easy. That is no longer the case. In the sample below I added the spacing for readability; ~ marks a new line and * is the field separator:

~ENT*1*2J*34*111223333
~NM1*IL*1*SMITHJOHNA***N*123456789
~RMRIKH62XX/PAY/1234567/20150103**12345.67
~REFZZMEDPM/M/12345.67
~REF*LU*40/CSWI
~DTM*582****RD8*20150101-20150131

~ENT*2*2J*34*222334444
~NM1*IL*1*DOEJANES***N*234567891
~RMRIKH62XX/PAY/1234567/345678901**23456.78
~REF*LU*40/CSWI
~DTM*582****RD8*20141211-20141231

~ENT*3*2J*34*333445555
~NM1*IL*1*DOE*JOHN****N*3456789012
~RMRIKH62XX/PAY/200462975/20150103**45678.90
~REFZZMEDPM/M/3456.78
~REF*LU*40/CSWI
~DTM*582****RD8*20150101-20150131

~ENT*4*2J*34*444556666
~NM1*IL*1*SMITHJANED***N*456789012
~RMRIKH62XX/PAY/567890123/678901234**6789.01
~REFZZMEDPM/M/6789.01
~REF*LU*40/CSWI
~DTM*582****RD8*20150101-20150131

~ENT*5*2J*34*666778888
~NM1*IL*1*SMITHJONJ***N*8901234
~RMRIKH62XX/PAY/56789012/67890123**5678.90
~REFZZMEDPM/M/5678.90
~REF*LU*40/CSWI
~DTM*582****RD8*20150101-20150131

~ENT*6*2J*34*777889999
~NM1*IL*1*DOEBOBE***N*567890123
~RMRIKH62XX/PAY/34567890/45678901*5678.90
~REF*LU*40/CSWI
~DTM*582****RD8*20141210-20141231
~RMRIKH62XX/PAY/1234567890/2345678901**6789.01
~REFZZMEDPM/M/6789.01
~REF*LU*40/CSWI
~DTM*582****RD8*20150101-20150131

What is the best way to parse this data? Is there a better way than using StreamReader?

Upvotes: 0

Views: 892

Answers (2)

Jason Faulkner

Reputation: 6558

You can get this into a 2-D (jagged) array fairly easily:

' Dynamic structure to hold the data as we go.
Dim data As New List(Of String())

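' Break the text into lines at each "~" delimiter. The file starts with "~",
' so drop the empty entries that would otherwise result.
Dim lines = System.IO.File.ReadAllText("data.txt").Split(New Char() {"~"c}, System.StringSplitOptions.RemoveEmptyEntries)

' Process each line.
For Each line As String In lines
    ' Break down the components of each line (Trim removes any stray line breaks).
    data.Add(line.Trim().Split("*"c))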
Next

' Produce a jagged array (an array of String arrays). Not really needed, as you can just use data if you want.
Dim dataArray = data.ToArray()

Now just iterate through the 2-D structure and process the data accordingly.
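As one possible way to do that iteration, the sketch below groups the parsed lines into records. It assumes every record begins with an ENT line, as in the sample data from the question; the names RecordGrouping and GroupByEnt are hypothetical and only serve the illustration.

```vbnet
Imports System.Collections.Generic

Module RecordGrouping
    ' Hypothetical helper: starts a new record at every "ENT" line.
    ' Assumption (from the sample data): each record begins with ENT.
    Function GroupByEnt(data As IEnumerable(Of String())) As List(Of List(Of String()))
        Dim records As New List(Of List(Of String()))
        Dim current As List(Of String()) = Nothing

        For Each fields As String() In data
            ' Skip anything that has no fields at all.
            If fields.Length = 0 Then Continue For

            If fields(0) = "ENT" Then
                current = New List(Of String())
                records.Add(current)
            End If

            ' Lines that appear before the first ENT are ignored.
            If current IsNot Nothing Then current.Add(fields)
        Next

        Return records
    End Function
End Module
```

Calling GroupByEnt(data) returns one list per record, so a record can hold any number of lines, which is the situation the new file format creates.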

If you need to ensure each line always has at least a specific number of indexes (for example, some lines supply 5 fields, but you expect there to always be 8), you can adjust the data.Add call like so:

' Ensure there are always at least 8 indexes for each line.
' This appends 8 empty (String.Empty) trailing fields to every line before splitting,
' so indexes that a line of data omits still exist in the array.
data.Add((line & New String("*"c, 8)).Split("*"c))
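An alternative to padding every line is to guard the lookup instead. The helper below is a minimal sketch of that idea; FieldAccess and FieldAt are hypothetical names, not something from the .NET Framework.

```vbnet
Module FieldAccess
    ' Hypothetical helper: returns the field at the given index,
    ' or String.Empty when the line has fewer fields than expected.
    Function FieldAt(fields As String(), index As Integer) As String
        If fields IsNot Nothing AndAlso index >= 0 AndAlso index < fields.Length Then
            Return fields(index)
        End If
        Return String.Empty
    End Function
End Module
```

With this in place, FieldAt(fields, 7) yields an empty string for a line that only supplies 5 fields, and the stored arrays keep their original lengths.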

Upvotes: 1

Heinzi

Reputation: 172260

String.Split is your friend.

If the file is not too large, the simplest approach would be to:

  • Read the file contents into a string variable (File.ReadAllText).
  • Split the "lines" (lines = allText.Split("~"c)).
  • Loop through the lines. For each line:
    • Split the line into fields (fields = line.Split("*"c)).
    • Process the field values. You'll probably want a big Select Case statement on fields(0), the first field of the line, with each Case handling one kind of line.
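The steps above can be sketched roughly as follows. This is an illustration under assumptions, not a finished parser: "data.txt" is a placeholder path, the module name SegmentParser is hypothetical, and the Case labels are simply the line prefixes that appear in the question's sample, so they will likely need adjusting for the real file.

```vbnet
Imports System
Imports System.IO

Module SegmentParser
    Sub Main()
        ' "data.txt" is a placeholder path.
        Dim allText As String = File.ReadAllText("data.txt")

        For Each rawLine As String In allText.Split("~"c)
            ' Remove stray whitespace and skip empty entries
            ' (the text before the first "~" is empty).
            Dim line As String = rawLine.Trim()
            If line.Length = 0 Then Continue For

            Dim fields As String() = line.Split("*"c)

            Select Case fields(0)
                Case "ENT"
                    ' In the sample data, ENT marks the start of a new record.
                    Console.WriteLine("New record: " & String.Join(",", fields))
                Case "NM1", "RMR", "REF", "DTM"
                    Console.WriteLine("  " & fields(0) & " line with " &
                                      (fields.Length - 1).ToString() & " fields")
                Case Else
                    ' Unknown or unexpected line; decide whether to skip or log it.
                    Console.WriteLine("  Unrecognized: " & line)
            End Select
        Next
    End Sub
End Module
```

Keeping a Case Else branch means a line type that was not anticipated shows up in the output instead of being silently dropped.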

Upvotes: 1
