Silentbob

Reputation: 3065

Extract a section of a line in a text file

I am trying to extract some specific data from a PDF. I have managed to extract the text from the PDF and place it into a .txt file, but the data in the text file is one long line. I need to extract a specific part of that line.

The part I need starts with 'UK' and ends with '- -'.

This is what I have tried so far:

    Using read = New StreamReader(fName)
        Dim line As String = read.ReadToEnd()
        If line.StartsWith(" UK") And line.Contains("- -") Then

        Else
            'do nothing
        End If
    End Using

StartsWith doesn't work because the line doesn't start with 'UK'. I can use line.Contains, which does find 'UK', but the line contains multiple instances of '- -'.

The section I need looks like the following:

UK (0.6085)* (£) 1.6435 -0.0062 0.8206 -0.0017 - -

I am using VB.NET in MS Visual Studio 2013.

Can anyone offer some help?

Upvotes: 0

Views: 789

Answers (3)

M. Ferry

Reputation: 147

Simple solution:

' The trailing * allows more text after the "- -" that closes the UK section.
If line Like "*UK*- -*" Then
    'do something
Else
    'do nothing
End If

Upvotes: 0

spajce

Reputation: 7092

How about StartsWith and EndsWith?

If src.StartsWith("UK") AndAlso src.EndsWith("- -") Then
    'True
End If
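StartsWith and EndsWith only answer yes or no, and only when the string is already trimmed down to the section. As a rough sketch of how the section could be cut out of the long line first, the following uses IndexOf and Substring. The helper name ExtractUkSection and the sample text around the UK section are made up for illustration; only the UK section itself comes from the question.

```vb
Imports System

Module ExtractDemo
    ' Hypothetical helper: returns the text from the first "UK" up to and
    ' including the next "- -", or Nothing if either marker is missing.
    Function ExtractUkSection(source As String) As String
        Dim startIndex As Integer = source.IndexOf("UK", StringComparison.Ordinal)
        If startIndex < 0 Then Return Nothing

        Const endMarker As String = "- -"
        ' Search for the terminator only after the position where "UK" was found.
        Dim endIndex As Integer = source.IndexOf(endMarker, startIndex, StringComparison.Ordinal)
        If endIndex < 0 Then Return Nothing

        Return source.Substring(startIndex, endIndex - startIndex + endMarker.Length)
    End Function

    Sub Main()
        ' Made-up sample line; the text before and after the UK section is an assumption.
        Dim sample As String = "Euro 1.2187 - - UK (0.6085)* (£) 1.6435 -0.0062 0.8206 -0.0017 - - US 1.0000 - -"
        Console.WriteLine(ExtractUkSection(sample))
    End Sub
End Module
```

Note that a plain IndexOf("UK") would also hit "UK" inside a longer word, so this assumes the first occurrence in the file is the one you want.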

Upvotes: 0

lb.

Reputation: 5776

Try using the Regex class:

' a is the text read from the file. The lazy .*? stops at the first "- -" after "UK".
Dim regex As New Regex("UK.*?-\s?-", RegexOptions.Singleline)
Dim match As Match = regex.Match(a)

If match.Success Then
    ' Do stuff
End If

Inside the If..Then, match.Value holds the extracted text. If the line contains several such sections, you can loop through all of them with Regex.Matches instead:

For Each m As Match In regex.Matches(a)
    ' m.Value
Next

Regular expressions are a great tool for text matching, extraction, and so on. Get used to using them if you do a fair bit of this. I've found RegexStudio to be quite handy for testing .NET Regex patterns on the fly before I use them in code.
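Putting the pieces together, here is a minimal, self-contained sketch of the regex approach. It uses a lazy quantifier so each match stops at the first "- -" after "UK", and adds \b word boundaries so "UK" inside a longer word is skipped. The helper name ExtractUkSections and the sample text surrounding the UK section are assumptions for illustration, not something taken from the question's file.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module RegexExtractDemo
    ' Hypothetical helper: returns every section that runs from the word "UK"
    ' to the nearest following "- -". The lazy ".*?" stops at the first terminator.
    Function ExtractUkSections(source As String) As List(Of String)
        Dim pattern As New Regex("\bUK\b.*?-\s?-", RegexOptions.Singleline)
        Dim sections As New List(Of String)
        For Each m As Match In pattern.Matches(source)
            sections.Add(m.Value)
        Next
        Return sections
    End Function

    Sub Main()
        ' Made-up sample line; only the UK section itself comes from the question.
        Dim sample As String = "Euro 1.2187 - - UK (0.6085)* (£) 1.6435 -0.0062 0.8206 -0.0017 - - US 1.0000 - -"
        For Each section As String In ExtractUkSections(sample)
            Console.WriteLine(section)
        Next
    End Sub
End Module
```

In your case the input would be the string you already get from ReadToEnd rather than a hard-coded sample.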

Upvotes: 1
