Reputation: 51
I have a text file that contains this:
<Person>
<Prenom>Jack</Prenom>
<Nom>Jhon</Nom>
<Adresse>4 rue de la Mélandine</Adresse>
<Tél></Tél>
<Email>[email protected]</Email>
<PhotoPath>c:\Program Files\Zonedetec\Gestionnaire de tâche v2\Img\5295f1ea-372a-4f2f-8f32-c52e8a48cc0839105.png</PhotoPath>
<Age>19</Age>
<Id>4640434</Id>
</Person>
<Person>
<Prenom>Jean</Prenom>
<Nom>Delamar</Nom>
<Adresse>13 rue de la Mélandine</Adresse>
<Tél></Tél>
<Email>[email protected]</Email>
<PhotoPath>c:\Program Files\Zonedetec\Gestionnaire de tâche v2\Img\5295f1ea-372a-4f2f-8f32-c52e8a48cc0839105.png</PhotoPath>
<Age>19</Age>
<Id>4640434</Id>
</Person>
I would like to retrieve all the values between the tags. For example, I would like to retrieve, in a list, the values (here 2) between <Person> and </Person>.
How could I do this?
I tried this:
internal static void LoadPerson()
{
string data = File.ReadAllText(Main.PersonnePath);
Regex regex = new Regex("<Person>(.*)</Person>");
var v = regex.Match(data);
string s = v.Groups[1].ToString();
MessageBox.Show(s);
}
However, s comes back empty.
Can you help me? Thank you.
Upvotes: 2
Views: 709
Reputation: 23228
Since your file is in an XML format, you can use XmlSerializer to read it, which is less painful than parsing it manually.
Create a Person class first (or generate one using Edit -> Paste Special -> Paste XML as Classes in Visual Studio):
[Serializable]
public class Person
{
// Each property maps to the XML element of the same name.
public string Prenom { get; set; }
public string Nom { get; set; }
public string Adresse { get; set; }
// The generated class types this as object because the sample element is empty;
// string is easier to work with and deserializes <Tél></Tél> as an empty string.
public string Tél { get; set; }
public string Email { get; set; }
public string PhotoPath { get; set; }
public byte Age { get; set; }
public uint Id { get; set; }
}
Then update the structure of the file slightly (it must have a single root tag):
<?xml version="1.0" encoding="utf-8" ?>
<people>
<Person>
<Prenom>Jack</Prenom>
<Nom>Jhon</Nom>
<Adresse>4 rue de la Mélandine</Adresse>
<Tél></Tél>
<Email>[email protected]</Email>
<PhotoPath>c:\Program Files\Zonedetec\Gestionnaire de tâche v2\Img\5295f1ea-372a-4f2f-8f32-c52e8a48cc0839105.png</PhotoPath>
<Age>19</Age>
<Id>4640434</Id>
</Person>
<Person>
<Prenom>Jean</Prenom>
<Nom>Delamar</Nom>
<Adresse>13 rue de la Mélandine</Adresse>
<Tél></Tél>
<Email>[email protected]</Email>
<PhotoPath>c:\Program Files\Zonedetec\Gestionnaire de tâche v2\Img\5295f1ea-372a-4f2f-8f32-c52e8a48cc0839105.png</PhotoPath>
<Age>19</Age>
<Id>4640434</Id>
</Person>
</people>
And finally, parse it:
var mySerializer = new XmlSerializer(typeof(Person[]), new XmlRootAttribute("people"));
Person[] people;
using (var fileStream = new FileStream(Main.PersonnePath, FileMode.Open))
{
people = (Person[])mySerializer.Deserialize(fileStream);
}
Don't forget to add the using System.Xml.Serialization; directive (and using System.IO; for FileStream). After deserialization, the people array will contain all the values you need, and you can format them into whatever string you want. The best option here is to override the ToString() method of the Person class to get the required string representation of the object.
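As a rough illustration of the ToString() suggestion, here is a minimal sketch. It assumes a trimmed-down Person with only three of the fields, reads the XML from a string instead of Main.PersonnePath, and uses a hypothetical PersonLoader helper and output format; none of these come from the original post.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Trimmed-down Person for illustration only; the real class has more properties.
public class Person
{
    public string Prenom { get; set; }
    public string Nom { get; set; }
    public byte Age { get; set; }

    // Hypothetical display format; adjust to whatever representation you need.
    public override string ToString() => $"{Prenom} {Nom} ({Age})";
}

public static class PersonLoader
{
    // Deserializes a <people> document whose children are <Person> elements.
    public static Person[] Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(Person[]), new XmlRootAttribute("people"));
        using (var reader = new StringReader(xml))
        {
            return (Person[])serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        string xml =
            "<people>" +
            "<Person><Prenom>Jack</Prenom><Nom>Jhon</Nom><Age>19</Age></Person>" +
            "<Person><Prenom>Jean</Prenom><Nom>Delamar</Nom><Age>19</Age></Person>" +
            "</people>";

        foreach (Person person in Load(xml))
        {
            Console.WriteLine(person); // Console.WriteLine calls ToString() on the object
        }
    }
}
```

Elements that have no matching property are skipped by XmlSerializer by default, so the same sketch should also read the full file shown above.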
Upvotes: 3
Reputation: 2264
If you only need these values as plain text, you can use a regular expression, XmlSerializer, or LINQ to XML.
What you need to work out before choosing one approach over another is:
1) What do I need to do with this?
1.a) If you only need the plain text inside each tag and will not do any validation, calculation, or re-parsing, either of the following two methods is easy to use.
1.a.1) Using a regular expression:
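public List<string> GetValueByRegex(string input)
{
// [\s\S]*? lazily matches any character, including newlines,
// so each <Person>...</Person> block is captured separately.
string pattern = @"<Person>([\s\S]*?)</Person>";
var result = new List<string>();
// Regex.Matches yields only successful matches, so no Success check is needed;
// an input without any <Person> block gives an empty list rather than null.
foreach (Match match in Regex.Matches(input, pattern))
{
result.Add(match.Groups[1].Value);
}
return result;
}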
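For context on why the pattern in the question returned nothing: by default `.` does not match a newline, and each Person block spans several lines, so `<Person>(.*)</Person>` never finds a closing tag. Besides the `[\s\S]*?` form used above, RegexOptions.Singleline makes `.` match newlines too. The sketch below shows that variant; the PersonBlocks class, the Extract method, and the shortened sample input are hypothetical names and data chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PersonBlocks
{
    // Returns the text between each <Person> and </Person> pair.
    public static List<string> Extract(string input)
    {
        var result = new List<string>();
        // Singleline lets '.' match '\n'; the lazy '*?' stops at the first </Person>.
        var regex = new Regex("<Person>(.*?)</Person>", RegexOptions.Singleline);
        foreach (Match match in regex.Matches(input))
        {
            result.Add(match.Groups[1].Value.Trim());
        }
        return result;
    }

    public static void Main()
    {
        string input =
            "<Person>\n<Prenom>Jack</Prenom>\n</Person>\n" +
            "<Person>\n<Prenom>Jean</Prenom>\n</Person>";

        // The question's pattern: '.' stops at the newline right after <Person>.
        Console.WriteLine(Regex.IsMatch(input, "<Person>(.*)</Person>")); // False

        foreach (string block in Extract(input))
        {
            Console.WriteLine(block);
        }
    }
}
```

The lazy quantifier matters as much as the option: a greedy `.*` with Singleline would capture everything from the first <Person> to the last </Person> as a single match.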
1.a.2) Using XDocument to parse the XML string:
Important: XDocument requires the XML to have exactly one root tag. Since your XML has two root tags, I forced a single root with string interpolation:
$"<root>{input}</root>"
public List<string> GetValueByXmlParse(string input)
{
var result = new List<string>();
var ensureThereAreOnlyOneRootTag = $"<root>{input}</root>";
XDocument xmlDocument = XDocument.Parse(ensureThereAreOnlyOneRootTag);
foreach(var personXml in xmlDocument.Root.Elements("Person"))
{
result.Add(String.Concat(personXml.Nodes()));
}
return result;
}
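If the individual field values are wanted instead of the raw inner XML of each Person, the same wrapped document can be queried element by element. This is only a sketch under that assumption; the PersonFields class, the Read method, and the shortened sample input are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class PersonFields
{
    // Returns one dictionary per <Person>, mapping each child tag name to its text.
    // Note: ToDictionary throws if a Person contains the same tag twice.
    public static List<Dictionary<string, string>> Read(string input)
    {
        // Wrap the fragment in a single root so XDocument can parse it.
        XDocument doc = XDocument.Parse($"<root>{input}</root>");
        return doc.Root.Elements("Person")
            .Select(p => p.Elements().ToDictionary(e => e.Name.LocalName, e => e.Value))
            .ToList();
    }

    public static void Main()
    {
        string input =
            "<Person><Prenom>Jack</Prenom><Age>19</Age></Person>" +
            "<Person><Prenom>Jean</Prenom><Age>19</Age></Person>";

        foreach (var person in Read(input))
        {
            Console.WriteLine($"{person["Prenom"]} is {person["Age"]}");
        }
    }
}
```

A dictionary keeps the sketch independent of any Person class; once the values feed real logic, mapping them onto a typed object is usually the cleaner choice.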
1.b) If you are going to do anything with the data you extract from the XML, it is better to parse it into an object.
You can have Visual Studio generate a class by copying the XML and clicking Edit > Paste Special > Paste XML As Classes.
@PavelAnikhouski already shared a good example of that.
2) Do I really need good performance for this?
To answer that, I used a benchmark NuGet package to compare the options. This is the result:
| Method | Gen 0 | Allocated |
|---------------------- |---------:|----------:|
| GetValueByRegex | 1.2207 | 2688 B |
| GetValueByXmlParse | 115.6006 | 243536 B |
Gen 0 : GC Generation 0 collects per 1000 operations
Allocated : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
So the answer is: it depends on what you need to do with the result. I hope this helps you decide.
Best Regards
Upvotes: 4