Reputation: 358
I'm trying to match a specific player's id in XML data (downloaded as a string), using a name selected in a listbox.
Private Sub Button2_Click(sender As Object, e As EventArgs) Handles Button2.Click
'website
Dim link As String = "https://s25-pt.ogame.gameforge.com/api/players.xml"
Dim html As String
'name selected on listbox
Dim jogador As String = ListBox1.Text
Dim pattern As String = "player id=""(.*?)"" name=""" & jogador & """"
webc1 = New WebClient
webc1.Headers.Add("user-agent", "Mozilla/5.0 (Windows; U; Windows NT 5.0; es-ES; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3")
html = webc1.DownloadString(link)
Dim match As Match = Regex.Match(html, pattern)
If match.Success Then
MsgBox(match.Groups(1).Value)
End If
End Sub
Instead of just the id, I also get a big chunk of the 'html' string.
I looked for answers on Google and tried other patterns, but I can't figure out how to solve this. Is there a way to improve my regex?
I know this is XML and that there is probably a more appropriate way to read it, but I find this way easier.
Upvotes: 1
Views: 715
Reputation: 1268
I just couldn't resist this, since running a regex against XML is just not a good idea.
Your link to the sample XML was kind enough to offer up a schema:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="players">
<xs:complexType>
<xs:sequence>
<xs:element name="player" maxOccurs="unbounded">
<xs:complexType>
<xs:attribute name="id" use="required" type="xs:integer"/>
<xs:attribute name="name" use="required" type="xs:string"/>
<xs:attribute name="status" use="optional">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="(a|[vIibo]+)"/>
</xs:restriction>
</xs:simpleType>
</xs:attribute>
<xs:attribute name="alliance" type="xs:string"/>
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="timestamp" type="xs:integer"/>
<xs:attribute name="serverId" type="xs:string"/>
</xs:complexType>
</xs:element>
</xs:schema>
This produces the following two classes (we don't care about the restriction in this case):
Imports System.Net
Imports System.IO
Imports System.Text
Imports System.Collections.Specialized
Imports System.ComponentModel ' DefaultValue attribute
Imports System.Xml.Schema ' XmlSchemaForm
Imports System.Xml.Serialization
Imports System.Diagnostics
Imports System.Collections.Generic
Imports System.Linq
<XmlType(AnonymousType:=True, TypeName:="players"), XmlRoot(ElementName:="players")>
Public Class PlayerList
<XmlElement("player", Form:=XmlSchemaForm.Unqualified, ElementName:="player")>
Public Property Players() As New List(Of Player)
<XmlAttribute(AttributeName:="timestamp"), DefaultValue(0)>
Public Property Timestamp() As Integer
<XmlAttribute(AttributeName:="serverId"), DefaultValue("")>
Public Property ServerId() As String
Public Function Find(PlayerName As String) As Player
Return Players.FirstOrDefault(Function(p) p.Name = PlayerName)
End Function
End Class
<XmlType(AnonymousType:=True, TypeName:="player"), XmlRoot("player")>
Public Class Player
<XmlAttribute(AttributeName:="id"), DefaultValue(0)>
Public Property Id() As Integer
<XmlAttribute(AttributeName:="name"), DefaultValue("")>
Public Property Name() As String
<XmlAttribute(AttributeName:="status"), DefaultValue("")>
Public Property Status() As String
<XmlAttribute(AttributeName:="alliance"), DefaultValue("")>
Public Property Alliance() As String
End Class
I've added a Find function to the PlayerList class for your button handler to call:
Private Sub Button2_Click(sender As Object, e As EventArgs) Handles Button2.Click
Dim Link As String = "https://s25-pt.ogame.gameforge.com/api/players.xml"
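Dim MyPlayers As PlayerList = Nothing
' Using disposes the WebClient even if the download or the deserialization throws
Using Client As New WebClient
Client.Headers.Add("user-agent", "Mozilla/5.0 (Windows; U; Windows NT 5.0; es-ES; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3")
' Deserialize returns Object, so cast explicitly (required with Option Strict On)
MyPlayers = DirectCast(Deserialize(Client.DownloadString(Link), GetType(PlayerList)), PlayerList)
End Using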
Dim MyPlayer As Player = MyPlayers.Find(ListBox1.Text)
If MyPlayer IsNot Nothing Then
Debug.Print("Player ID: {0}", MyPlayer.Id)
Debug.Print("Player Name: {0}", MyPlayer.Name)
Debug.Print("Player Status: {0}", MyPlayer.Status)
Debug.Print("Player Alliance: {0}", MyPlayer.Alliance)
Else
Debug.Print("Not Found")
End If
End Sub
Private Function Deserialize(XMLString As String, ObjectType As Type) As Object
Return New XmlSerializer(ObjectType).Deserialize(New MemoryStream(Encoding.UTF8.GetBytes(XMLString)))
End Function
Testing with Fantasma2, I get the following output:
Player ID: 100110
Player Name: Fantasma2
Player Status: vI
Player Alliance: 4762
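If only the id is needed, a lighter alternative to the serializer classes is LINQ to XML. This is a different technique from the one above, shown only as a minimal sketch; the module name, the FindPlayerId helper and the two-player sample string are all hypothetical, and the sample merely imitates the shape of players.xml.

```vbnet
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module LinqToXmlLookup
    ' Returns the id attribute of the first <player> whose name attribute
    ' equals playerName, or Nothing when no such player exists.
    Public Function FindPlayerId(xml As String, playerName As String) As String
        Dim doc As XDocument = XDocument.Parse(xml)
        Dim hit As XElement = doc.Root.Elements("player").FirstOrDefault(Function(p) p.Attribute("name")?.Value = playerName)
        Return hit?.Attribute("id")?.Value
    End Function

    Sub Main()
        ' Hypothetical sample data, not taken from the real feed.
        Dim xml As String = "<players><player id=""1"" name=""alpha""/><player id=""2"" name=""beta""/></players>"
        Console.WriteLine(FindPlayerId(xml, "beta"))
    End Sub
End Module
```

In the button handler, the downloaded string would take the place of the sample and ListBox1.Text would be passed as the name.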
Upvotes: 1
Reputation: 19299
If you try your regex on regex101 it works fine, e.g. when running in PCRE/PHP mode. However, .NET regexes work a little differently from other implementations.
So, I tried with this regex instead and got a proper match:
player id="(\d+)" name="sniper lord"
Giving me a result of 1000042 from your data.
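\d+ just means one or more digits. Your XML data indicates the player IDs are numeric only, so this 'tightens up' the regex: unlike .*?, which can also match quotes and whole attributes, \d+ cannot run past the end of the id value. This also uses sniper lord as a test value for jogador.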
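To see the difference between the two patterns side by side, here is a minimal sketch. The module name, the FirstGroup helper and the two-player sample string are hypothetical; the sample only assumes that several player elements sit on the same line, which is when the lazy wildcard can wander across them.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LazyVersusDigits
    ' Hypothetical two-player sample on a single line, shaped like players.xml.
    Public Const Sample As String = "<players><player id=""100001"" name=""alpha""/><player id=""100002"" name=""beta""/></players>"

    ' Returns the first capture group of the first match, or Nothing if there is none.
    Public Function FirstGroup(pattern As String) As String
        Dim m As Match = Regex.Match(Sample, pattern)
        Return If(m.Success, m.Groups(1).Value, Nothing)
    End Function

    Sub Main()
        ' Lazy wildcard: the match starts at the FIRST "player id=" and the
        ' group keeps growing until name="beta" finally follows it.
        Console.WriteLine(FirstGroup("player id=""(.*?)"" name=""beta"""))
        ' Digits only: the group cannot cross a quote, so only the id is captured.
        Console.WriteLine(FirstGroup("player id=""(\d+)"" name=""beta"""))
    End Sub
End Module
```

With this sample, the first pattern captures 100001" name="alpha"/><player id="100002 while the second captures just 100002.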
Perhaps you can also use String.Format to help with the slightly confusing run of double quotes:
Dim pattern As String = String.Format("player id=""{0}"" name=""{1}""", "(\d+)", jogador)
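One caveat with building the pattern from user input: a name containing regex metacharacters such as . or + would change the meaning of the pattern. A possible refinement, sketched here under the assumption that names should be matched literally, is to pass the name through Regex.Escape first. The module name, the BuildPattern helper and the sample data are hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module EscapedPattern
    ' Builds the pattern shown above, escaping the name so that characters
    ' such as "." or "+" are matched literally instead of as regex operators.
    Public Function BuildPattern(jogador As String) As String
        Return String.Format("player id=""{0}"" name=""{1}""", "(\d+)", Regex.Escape(jogador))
    End Function

    Sub Main()
        ' Hypothetical data: the name contains regex metacharacters.
        Dim xml As String = "<player id=""42"" name=""a.b+c""/>"
        Dim m As Match = Regex.Match(xml, BuildPattern("a.b+c"))
        If m.Success Then Console.WriteLine(m.Groups(1).Value)
    End Sub
End Module
```

Without the escaping, the pattern for a.b+c would treat + as a quantifier and fail to match the literal name in this sample.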
Upvotes: 1