Reputation: 1
I have a string:
string s = "<tr><td>abc</td><td>1</td><td>def</td></tr><tr><td>aaa</td><td>2</td><td>bbb</td></tr>";
Formatted for readability, it looks like this:
<tr>
<td>abc</td>
<td>1</td>
<td>def</td>
</tr>
<tr>
<td>aaa</td>
<td>2</td>
<td>bbb</td>
</tr>
Now I want to get the values "1" and "2". How do I do this? I tried converting the string to XML, but without success.
Upvotes: 0
Views: 1665
Reputation: 176
Good day Brom
This might not be the solution you were looking for, but it is one of several ways to do it.
I would use this regex to match all the tags:
(<\/[a-z]*>)+(<[a-z]*>)+|(<[a-z]*>)+(<\/[a-z]*>)+|(<[a-z]*>)+|(<\/[a-z]*>)+
Example:
string input = "<tr><td>abc</td><td>1</td><td>def</td></tr><tr><td>aaa</td><td>2</td><td>bbb</td></tr>";
string replacement = "#";
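// Verbatim string (@"...") so that the backslashes reach the regex engine; "\/" is not a valid C# escape.
string pattern = @"(<\/[a-z]*>)+(<[a-z]*>)+|(<[a-z]*>)+(<\/[a-z]*>)+|(<[a-z]*>)+|(<\/[a-z]*>)+";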
RegexOptions options = RegexOptions.IgnoreCase | RegexOptions.Compiled |
RegexOptions.Multiline;
Regex rgx = new Regex(pattern, options);
string result = rgx.Replace(input, replacement);
// result == "#abc#1#def#aaa#2#bbb#"
This regular expression matches the tags, either in runs or individually. You can then replace each match with a delimiter such as a pipe "|" or "#" and split on that. I hope this helps.
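A minimal sketch of the split step, assuming the tag-matching pattern above and a "#" delimiter that never occurs in the cell text. The class and method names are only illustrative:

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagSplitter
{
    // Same pattern as above: runs of opening and/or closing tags.
    private const string TagPattern =
        @"(<\/[a-z]*>)+(<[a-z]*>)+|(<[a-z]*>)+(<\/[a-z]*>)+|(<[a-z]*>)+|(<\/[a-z]*>)+";

    public static string[] SplitCells(string html)
    {
        // Turn every run of tags into a single '#', then split on it.
        string delimited = Regex.Replace(html, TagPattern, "#", RegexOptions.IgnoreCase);
        // RemoveEmptyEntries discards the empty strings before the first
        // and after the last delimiter.
        return delimited.Split(new[] { '#' }, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string input = "<tr><td>abc</td><td>1</td><td>def</td></tr><tr><td>aaa</td><td>2</td><td>bbb</td></tr>";
        string[] cells = SplitCells(input);
        Console.WriteLine(cells[1]); // second cell of the first row: 1
        Console.WriteLine(cells[4]); // second cell of the second row: 2
    }
}
```

Note that RemoveEmptyEntries would also drop a genuinely empty cell, which shifts the indexes, so the fixed positions 1 and 4 only hold for rows of three non-empty cells.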
Kind Regards
P.S. Regex explanation: the pipes act as "or" operators between these four alternatives:
(<\/[a-z]*>)+(<[a-z]*>)+ // Closing tag(s) that are followed by opening tag(s)
(<[a-z]*>)+(<\/[a-z]*>)+ // Opening tags followed by closing tags
(<[a-z]*>)+ // one or more opening tags
(<\/[a-z]*>)+ // one or more closing tags
Upvotes: 0
Reputation: 694
string s = "<tr><td>abc</td><td>1</td><td>def</td></tr><tr><td>aaa</td><td>2</td><td>bbb</td></tr>";
// Find the first run of digits, then walk through the remaining matches.
var match = System.Text.RegularExpressions.Regex.Match(s, @"\d+");
while (match.Success)
{
    MessageBox.Show(match.Value);
    // NextMatch resumes after the current match, so no manual Substring is needed
    // and the loop ends without showing an empty message box.
    match = match.NextMatch();
}
The regex matches every number in the string, and the while loop goes through all of them. Do whatever you want instead of MessageBox.Show and you're good to go.
Upvotes: 0
Reputation: 503
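// Needs: using System.Linq; using System.Text.RegularExpressions;
Regex regex = new Regex("<td>(.*?)<\\/td>");
var matches = regex.Matches("<tr><td>abc</td><td>1</td><td>def</td></tr><tr><td>aaa</td><td>2</td><td>bbb</td></tr>");
// Group 1 is the text between <td> and </td>: abc, 1, def, aaa, 2, bbb
var values = matches.Cast<Match>().Select(m => m.Groups[1].Value).ToList();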
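To keep only the numeric cells the question asks for, one option is to tighten the capture group from `.*?` to `\d+`. This is a sketch under the assumption that the wanted cells contain nothing but digits; `NumericCells` and `Extract` are illustrative names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class NumericCells
{
    public static List<string> Extract(string html)
    {
        // (\d+) only matches cells whose whole content is one run of digits.
        return Regex.Matches(html, @"<td>(\d+)</td>")
                    .Cast<Match>()
                    .Select(m => m.Groups[1].Value)
                    .ToList();
    }

    public static void Main()
    {
        string input = "<tr><td>abc</td><td>1</td><td>def</td></tr><tr><td>aaa</td><td>2</td><td>bbb</td></tr>";
        Console.WriteLine(string.Join(",", Extract(input))); // prints "1,2"
    }
}
```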
Upvotes: 0
Reputation: 1555
string s = "<tr><td>abc</td><td>1</td><td>def</td></tr><tr><td>aaa</td><td>2</td><td>bbb</td></tr>";
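// Strip the row tags and the closing cell tags, leaving "<td>" as the only separator.
s = s.Replace("<tr>","").Replace("</tr>","").Replace("</td>","");
string[] val = s.Split(new string[] { "<td>" }, StringSplitOptions.None);
// val[0] is the empty string before the first "<td>", so the cells start at index 1.
string one = val[2];
string two = val[5];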
I hope it will work for you.
Upvotes: 1
Reputation: 1222
You can use HTML Agility Pack to achieve this:
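// str holds the markup; needs the HtmlAgilityPack package and using System.Linq.
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(str); // LoadHtml parses markup from a string
IEnumerable<string> cells = doc.DocumentNode.Descendants("td").Select(td => td.InnerText);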
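If adding a library is not an option and the fragment is well-formed, a similar query can be sketched with LINQ to XML from the framework. The XML attempt mentioned in the question most likely failed because the string has two top-level `<tr>` elements, while XML requires a single root. The sketch below wraps the rows in a hypothetical `<table>` root; `XmlCells` and `SecondCells` are illustrative names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlCells
{
    public static List<string> SecondCells(string rows)
    {
        // Hypothetical <table> wrapper: an XML document needs exactly one root element.
        XElement table = XElement.Parse("<table>" + rows + "</table>");
        // ElementAt(1) is the second <td> of each row; it throws if a row has fewer than two cells.
        return table.Elements("tr")
                    .Select(tr => tr.Elements("td").ElementAt(1).Value)
                    .ToList();
    }

    public static void Main()
    {
        string s = "<tr><td>abc</td><td>1</td><td>def</td></tr><tr><td>aaa</td><td>2</td><td>bbb</td></tr>";
        Console.WriteLine(string.Join(",", SecondCells(s))); // prints "1,2"
    }
}
```

This only works while the markup is valid XML. Real-world HTML with unclosed tags or entities such as `&nbsp;` would make `XElement.Parse` throw, which is where a tolerant parser like HTML Agility Pack is the safer choice.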
Upvotes: 2