kevp

Reputation: 377

What is a way to extract data from this file type?

I have to extract time and coordinate (both Easting and Northing) data from a text file. I want to loop through it and populate three parallel arrays with all of the data it contains until I reach EOF. I have done this with XML, but this time it's a bit different: this is an SMI (SAMI) file, which is used for closed captions on video files. An example is below:

<SAMI>
<HEAD>
    <STYLE TYPE="Text/css">
    <!--
        P {margin-left: 29pt; margin-right: 29pt; font-size: 24pt; text-align: center; font-family: Tahoma; font-weight: bold; color: #FF0000; background-color: #000000;}
        .SUBTTL {Name: 'Subtitles'; Lang: en-US; SAMIType: CC;}
    -->
    </STYLE>
</HEAD>
<BODY>
    <SYNC START=0>
        <P CLASS=SUBTTL>E: 4444444 N: 4444444 Time: 13:42:07
    <SYNC START=10>
        <P CLASS=SUBTTL>E: 44444444 N: 3333330 Time: 13:42:08
    <SYNC START=1010>
        <P CLASS=SUBTTL>E: 33333333 N: 4444444 Time: 13:42:09
    <SYNC START=2010>
        <P CLASS=SUBTTL>E: 2222222 N: 3333333 Time: 13:42:10
</BODY>
</SAMI>

Thanks, Kevin

Upvotes: 1

Views: 311

Answers (1)

paparazzo

Reputation: 45096

Use the following pattern with the .NET Regex class:

@"^(?:\s+<P CLASS=SUBTTL>E:)\s+(\d+)\s+N:\s+(\d+)\s+Time:\s+(\d\d:\d\d:\d\d)" 

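The Easting, Northing, and time will be in Groups[1], Groups[2], and Groups[3] of the match. Because the pattern starts with ^, pass RegexOptions.Multiline when matching against the whole file so that ^ matches at the start of each line rather than only at the start of the input.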

I tested this.

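        // Requires: using System.Diagnostics; using System.Text.RegularExpressions;
        // Multiline lets ^ match at the start of every line, not just the start of the input.
        Regex r = new Regex(pat, RegexOptions.IgnoreCase | RegexOptions.Multiline);

        // Match the regular expression pattern against a text string.
        Match m = r.Match(input);
        // Match never returns null; check Success to see whether the pattern matched.
        if (m.Success)
        {
            Debug.WriteLine(m.Groups[1].Value); // Easting
            Debug.WriteLine(m.Groups[2].Value); // Northing
            Debug.WriteLine(m.Groups[3].Value); // Time
        }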
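
The snippet above reports only the first match. To fill three parallel collections for the whole file, as the question asks, something along these lines might work. It is a minimal sketch, not tested against a real SMI file: the class name SamiCoordinates and the method Extract are made up for illustration, the pattern is the one given above, and RegexOptions.Multiline is added on the assumption that the entire file is passed in as one string.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative and not part of any library.
public static class SamiCoordinates
{
    // Same pattern as in the answer. Multiline makes ^ match at each line start.
    private static readonly Regex Pattern = new Regex(
        @"^(?:\s+<P CLASS=SUBTTL>E:)\s+(\d+)\s+N:\s+(\d+)\s+Time:\s+(\d\d:\d\d:\d\d)",
        RegexOptions.IgnoreCase | RegexOptions.Multiline);

    // Appends one entry per caption line to each list, keeping the lists parallel.
    public static void Extract(
        string input,
        List<long> eastings,
        List<long> northings,
        List<string> times)
    {
        foreach (Match m in Pattern.Matches(input))
        {
            eastings.Add(long.Parse(m.Groups[1].Value));
            northings.Add(long.Parse(m.Groups[2].Value));
            times.Add(m.Groups[3].Value); // kept as text, e.g. "13:42:07"
        }
    }
}
```

The input string could come from File.ReadAllText on the SMI file. Lists are used instead of arrays because the number of caption lines is not known up front; call ToArray() on each list afterwards if arrays are required.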

Upvotes: 1
