Reputation: 3065
I am trying to extract some specific data from a PDF. I have managed to extract the text from the PDF and write it to a text file, but the text ends up as one long line. I need to extract a specific part of that line:
the part that starts with 'UK' and ends with '- -'.
This is what I have been trying:
Using read = New StreamReader(fName)
Dim line As String = read.ReadToEnd
If line.StartsWith(" UK") And line.Contains("- -") Then
Else
'do nothing
End If
End Using
StartsWith doesn't work because the line doesn't start with 'UK'. I can use line.Contains, since it does find 'UK', but the line contains multiple instances of '- -'.
The section I need looks like the following:
UK (0.6085)* (£) 1.6435 -0.0062 0.8206 -0.0017 - -
I am using VB.NET in Microsoft Visual Studio 2013.
Can anyone offer some help?
Upvotes: 0
Views: 789
Reputation: 147
Simple solution:
If line Like "*UK*- -" Then
'do something
Else
'do nothing
End If
Upvotes: 0
Reputation: 7092
How about StartsWith and EndsWith?
If src.StartsWith("UK") AndAlso src.EndsWith("- -") Then
'True
End If
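If the row sits in the middle of one long line, as the question describes, StartsWith and EndsWith will both return False. A different technique that may help in that case is to locate the two markers with String.IndexOf and cut out the text between them. This is only a minimal sketch: ExtractBetween is a hypothetical helper name, and the input line is made up around the sample row from the question.

```vbnet
Imports System

Module IndexOfSketch
    ' Hypothetical helper: returns the text from the first startToken up to
    ' and including the next endToken, or Nothing if either one is missing.
    Function ExtractBetween(text As String, startToken As String, endToken As String) As String
        Dim startIdx As Integer = text.IndexOf(startToken, StringComparison.Ordinal)
        If startIdx < 0 Then Return Nothing

        ' Search for the end marker only after the start marker.
        Dim endIdx As Integer = text.IndexOf(endToken, startIdx + startToken.Length, StringComparison.Ordinal)
        If endIdx < 0 Then Return Nothing

        Return text.Substring(startIdx, endIdx - startIdx + endToken.Length)
    End Function

    Sub Main()
        ' Assumed input: the sample row with invented text before and after it.
        Dim line As String = "Euro 1.2000 - - UK (0.6085)* (£) 1.6435 -0.0062 0.8206 -0.0017 - - US 1.0000 - -"
        Console.WriteLine(ExtractBetween(line, "UK", "- -"))
    End Sub
End Module
```

Note that this takes the first "UK" it finds, so it assumes no earlier word in the text contains those two letters.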
Upvotes: 0
Reputation: 5776
Try using the Regex class:
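' Requires Imports System.Text.RegularExpressions at the top of the file.
' .*? is lazy, so the match ends at the first "- -" after "UK" rather than the last.
Dim regex As New Regex("UK.*?-\s?-\s?", RegexOptions.Singleline)
Dim match As Match = regex.Match(line) ' line is the text read from the file
If match.Success Then
' match.Value holds the matched text
End If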
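Inside the If...Then block, match.Value holds the matched text. You can also loop through the Match.Captures collection property; to find every occurrence in the text, Regex.Matches returns all of the matches.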
For Each c As Capture In match.Captures
' c.Value holds the captured text
Next
Regular expressions are a great tool for text matching and extraction. Get used to using them if you do a fair bit of this. I've found RegexStudio to be quite handy for testing .NET Regex patterns on the fly before I use them in code.
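To put the pieces together, here is a minimal, self-contained sketch of the regex approach. ExtractUkSegment is a hypothetical helper name, and the input line is made up around the sample row from the question. It uses a lazy quantifier on the assumption that you want the text up to the first '- -' after 'UK', not the last one in the file.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexSketch
    ' Hypothetical helper: returns the first span that starts at "UK" and
    ' runs to the nearest following "- -", or Nothing when there is none.
    Function ExtractUkSegment(text As String) As String
        ' Singleline lets "." match newlines too; ".*?" stops at the first "- -".
        Dim m As Match = Regex.Match(text, "UK.*?-\s?-", RegexOptions.Singleline)
        Return If(m.Success, m.Value, Nothing)
    End Function

    Sub Main()
        ' Assumed input: the sample row with invented text before and after it.
        Dim line As String = "Euro 1.2000 - - UK (0.6085)* (£) 1.6435 -0.0062 0.8206 -0.0017 - - US 1.0000 - -"
        Dim segment As String = ExtractUkSegment(line)
        If segment IsNot Nothing Then
            Console.WriteLine(segment)
        End If
    End Sub
End Module
```

In the real program you would pass in the string returned by ReadToEnd instead of the hard-coded line.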
Upvotes: 1