Brandon Bearden

Reputation: 850

Parsing text with random line breaks from a serial tool in C#

This is data from a serial tool over 6 runs. I have to be able to parse each line of data consistently.

Does anyone have any suggestions as to how I might fix this data so that I can ensure its sanity?

There is indeed a \r\n at the end of the first line. Sometimes the tool breaks the line where expected, and other times it breaks it at a random location. I cannot even be sure that the break will not eat into the important data, which is the hex after the packet number. If I could ensure the data was sane, I could just do something like value.Split(':')[1].Split(' ')[1] and be done with it. I just cannot think of any way to splice it back together. Also, to be clear, the entire dataset is in one string because of the way it has to be read from the tool. So I was just doing string[] values = message.Split('\n'); and then iterating with a foreach loop.

Packe
t 0: 1A064140084243440842555803EE (overlay)
Packet 1: 1A06414258404643440842585503EE
Packet 2: 1D0608054203EE
Packet 3: 1D0608427273747571505703EE
...


Packet 0: 1A064140084243440842555803EE (overlay)
Packet 1: 1A06414258404643440842585503EE
Packet 2: 1D0608054203EE
Packet 3: 1D0608427273747571505703EE
...


P
acket 0: 1A064140084243440842555803EE (overlay)
Packet 1: 1A06414258404643
Packet 2: 1D0608054203EE
Packet 3: 1D0608427273747571505703EE
...


Packet 0
: 1A064140084243440842555803EE (overlay)
Packet 1: 1A06414258404643440842585503EE
Packet 2: 1D0608054203EE
Packet 3: 1D0608427273747571505703EE
...


Pack
et 0: 1A064140084243440842555803EE (overlay)
Packet 1: 1A06414258404643440842585503EE
Packet 2: 1D0608054203EE
Packet 3: 1D0608427273747571505703EE
...


Packet 0: 1A064140084243440842555803EE (overlay)
Packet 1: 1A06414258404643440842585503EE
Packet 2: 1D0608054203EE
Packet 3: 1D0608427273747571505703EE
...

Upvotes: 0

Views: 245

Answers (1)

w0rd-driven

Reputation: 933

A +1 to what Offler mentioned:

  1. Strip variations of \r\n
  2. Use a Regex capture group on the "Packet #: " header. Everything after it, up to the next Packet entry, is what you want.
  3. Regex.Split on the grouping
  4. Clean data as necessary

Here's a rough scratch program to illustrate:

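        // Requires: using System.IO; using System.Text.RegularExpressions;
        var contents = File.ReadAllText("Test.txt");
        // Remove every line-break variant so that split lines are rejoined.
        var strippedContents = contents.Replace("\r\n", "").Replace("\r", "").Replace("\n", "");
        // The capture group makes Regex.Split keep each "Packet N: " header in the output.
        var regex = new Regex(@"(?<packet>Packet \d+: )");
        var matches = regex.Split(strippedContents);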

Test.txt just combines the data from your 1st and 3rd sections (the ones starting with "Packe\r\n" and "P\r\n"). I kept the data sequential, with no spaces between the sections.

Number 4 above isn't in the scratch program. In the resulting array, index 0 is an empty string (""), each "Packet #: " header sits at an odd index, and the data for that packet follows at the next even index. It's not elegant, but you're essentially telling Regex "match everything but this".
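For what it's worth, step 4 could be sketched roughly like this. It walks the split array in header/data pairs and keeps only the leading run of hex digits, so trailers such as " (overlay)" or "..." are dropped. PacketParser and Parse are hypothetical names made up for illustration, and the sketch assumes the string holds a single run, since a repeated packet number would overwrite the earlier entry. It also cannot recover data the tool actually dropped, like the short "Packet 1" line in your third run.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PacketParser
{
    // Hypothetical helper: maps packet number -> hex payload.
    public static Dictionary<int, string> Parse(string message)
    {
        // Steps 1-3: strip all line breaks, then split on the captured header.
        var stripped = message.Replace("\r", "").Replace("\n", "");
        var parts = Regex.Split(stripped, @"(Packet \d+: )");

        var packets = new Dictionary<int, string>();
        // parts[0] is whatever precedes the first header; headers sit at odd indices.
        for (int i = 1; i + 1 < parts.Length; i += 2)
        {
            int number = int.Parse(Regex.Match(parts[i], @"\d+").Value);
            // Step 4: keep only the leading hex digits of the data that follows.
            string hex = Regex.Match(parts[i + 1], @"^[0-9A-Fa-f]+").Value;
            packets[number] = hex; // assumption: one run, so numbers are unique
        }
        return packets;
    }

    public static void Main()
    {
        var sample = "Packe\r\nt 0: 1A064140084243440842555803EE (overlay)\r\n" +
                     "Packet 1: 1A06414258404643440842585503EE\r\n" +
                     "Packet 2: 1D0608054203EE\r\n";
        foreach (var pair in Parse(sample))
            Console.WriteLine(pair.Key + " -> " + pair.Value);
    }
}
```

If the string holds several runs back to back, a list of (number, hex) pairs would be a safer return type than a dictionary.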

That likely isn't a final solution but it should at least get you started.

Upvotes: 1
