Dominic K

Reputation: 7085

Parsing text file for hexadecimal content

I have this text file that contains approximately 22 000 lines, with each line looking like this:

12A4 (Text)

So each line is a 4-character hexadecimal value followed by text in parentheses. Sometimes the parentheses contain more than one value, separated by a comma: A34d (Text, Optional)

Is there any efficient way to search for the Hex and then return the first text in the parentheses? Would it be much more effective if I stored this data in SQLite?

Upvotes: 1

Views: 641

Answers (6)

Wildhorn

Reputation: 932

Use a StreamReader and call ReadLine in a loop. For each line, check whether the first characters equal the value you are searching for; if they do, extract the text with:

// ")" is included as a separator so that "12A4 (Text)" yields "Text", not "Text)"
string yourresult = thereadline.Split
                    (new string[]{" (", ",", ")"}, 
                     StringSplitOptions.RemoveEmptyEntries)[1];
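
A fuller version of that loop might look like the following sketch. FindFirstText is a hypothetical helper name, the StringReader stands in for a StreamReader over the real file, and the code assumes C# 9 or later (top-level statements) and that every matching line follows the "HEX (Text, ...)" format.

```csharp
using System;
using System.IO;

// Hypothetical helper: scans the reader line by line and returns the first
// text in parentheses for the line that starts with the given hex value.
static string FindFirstText(TextReader reader, string hex)
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        if (line.StartsWith(hex, StringComparison.OrdinalIgnoreCase))
        {
            // Splitting on " (", "," and ")" leaves the hex value at [0]
            // and the first text at [1].
            var parts = line.Split(new[] { " (", ",", ")" },
                                   StringSplitOptions.RemoveEmptyEntries);
            return parts.Length > 1 ? parts[1].Trim() : null;
        }
    }
    return null; // no line starts with the requested value
}

// Sample data; in practice this would be a StreamReader over the file.
using (var reader = new StringReader("12A4 (Text)\nA34d (Text, Optional)"))
{
    Console.WriteLine(FindFirstText(reader, "a34d"));
}
```

This reads the file again for every search, so it suits one-off lookups better than repeated ones.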

Upvotes: 1

NeverDie

Reputation: 21

There are elegant answers posted already, but since you requested regex, try this:

// hexData and textData are separate groups; Multiline makes ^ and $ match at every line.
var regex = @"^(?<hexData>.{4})\s(?<textData>.*)$";
var matches = Regex.Matches
              (textInput, regex, RegexOptions.Multiline);

Then iterate over the matches and read the hexData and textData groups to get whatever you want.
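
Reading the matches could look like this minimal sketch. It assumes a pattern in which hexData and textData are separate named groups, a tightened textData group that stops at the first comma or closing parenthesis, a made-up two-line textInput, and C# 9 or later for top-level statements. The names pattern and found are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Sample input; the real text would come from the file.
string textInput = "12A4 (Text)\nA34d (Text, Optional)";

// hexData: four hex digits. textData: everything after "(" up to the
// first ',' or ')'.
var pattern = @"^(?<hexData>[0-9A-Fa-f]{4})\s\((?<textData>[^,)]*)";

// Case-insensitive keys so that "a34d" finds "A34d".
var found = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
foreach (Match m in Regex.Matches(textInput, pattern, RegexOptions.Multiline))
{
    found[m.Groups["hexData"].Value] = m.Groups["textData"].Value.Trim();
}

Console.WriteLine(found["a34d"]);
```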

Upvotes: 2

Lasse Espeholt

Reputation: 17792

var lines = ...;

var item = (from line in lines
            where line.StartsWith("a34d", StringComparison.OrdinalIgnoreCase)
            select line).FirstOrDefault();

// item is null if no line matched; check for that before calling Split

var firstText = item.Split('(',',',')')[1];

This works as written. To strip leading and trailing whitespace from firstText, append .Trim() to the end.

For splitting a text into several lines, see my two answers to the question "How can I convert a string with newlines in it to separate lines?"

Upvotes: 1

Crispy

Reputation: 5637

Example using substring and split.

        string value = "A34d (Text, Optional)";

        string hex = value.Substring(0, 4);
        string text = value.Split('(')[1];

        if (text.Contains(','))
            text = text.Substring(0, text.IndexOf(','));
        else
            text = text.Substring(0, text.Length-1);

For searching use a Dictionary.
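
As a rough sketch of the dictionary approach: ParseLine is a hypothetical helper built on the substring-and-split idea above, the lines array stands in for the file contents (for example from File.ReadLines), and every line is assumed to be well-formed. The code uses top-level statements, so it needs C# 9 or later.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: turns "A34d (Text, Optional)" into ("A34d", "Text").
static KeyValuePair<string, string> ParseLine(string line)
{
    string hex = line.Substring(0, 4);
    string text = line.Split('(')[1];

    // Cut at the first ',' or ')' so only the first value remains.
    int end = text.IndexOfAny(new[] { ',', ')' });
    if (end >= 0)
        text = text.Substring(0, end);

    return new KeyValuePair<string, string>(hex, text.Trim());
}

// Stand-in for the lines of the real file.
string[] lines = { "12A4 (Text)", "A34d (Text, Optional)" };

// Build the lookup once; the comparer makes "a34d" and "A34d" equivalent.
var lookup = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
foreach (string line in lines)
{
    var entry = ParseLine(line);
    lookup[entry.Key] = entry.Value;
}

string result;
if (lookup.TryGetValue("a34d", out result))
    Console.WriteLine(result);
```

After the one-time load, each search is a single hash lookup instead of a scan over all lines.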

Upvotes: 5

OscarRyz

Reputation: 199324

That's probably < 2 MB of data.

I think you can:

  1. Read the whole file
  2. Split each line into a key (the hex number) and a value (the remainder); Chris Persichetti's answer is excellent for that
  3. Store each line in a dictionary (using the number as an int, not as a string)

    var d = new Dictionary<int, string>();
    d[int.Parse(key, System.Globalization.NumberStyles.HexNumber)] = value;
    
  4. Keep that dictionary in memory and then perform a very quick look up by the id

Upvotes: 3

Daren Thomas

Reputation: 70344

If you want to search for the Hex value more than once, you definitely want to store this in a lookup table of some sort.

This could be as simple as a Dictionary<string, string> that you populate with the contents of your file on startup:

  • read each line (StreamReader.ReadLine)
  • hexString = substring of first 4 characters in line
  • store the rest of the string

To find the first part, create a function that retrieves "A" from "(A, B, C, ...)"

If you can rule out commas "," in "A", you are in luck: Remove the parentheses, split on "," and return first substring.
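
Such a function could be sketched as follows. FirstPart is a hypothetical name, and the code assumes that "A" contains no comma and that the input is wrapped in a single pair of parentheses. It uses top-level statements (C# 9 or later).

```csharp
using System;

// Hypothetical helper: returns "A" from "(A, B, C, ...)".
// Assumes "A" itself contains no comma.
static string FirstPart(string parenthesized)
{
    // Drop surrounding whitespace and the enclosing parentheses.
    string inner = parenthesized.Trim().TrimStart('(').TrimEnd(')');

    // Everything before the first comma is the first value.
    return inner.Split(',')[0].Trim();
}

Console.WriteLine(FirstPart("(Text, Optional)"));
Console.WriteLine(FirstPart(" (Text)"));
```

Applied to the stored remainder of each line, this gives the value to return from the lookup.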

Upvotes: 1
