M_K

Reputation: 3455

Regex Matching, cascaded tags

Hi, I am trying to get results from the tags below. What I need is the first match in the tags, then the fifth match, then the ninth match, i.e. the first match and then every fourth match after it. I realize this isn't the best way to parse HTML, but I really only need it for this one case.

The regex I am using is

<td class="stat">(.*?)<\/td>

The code I am using is

private static ObservableCollection<Top> top = new ObservableCollection<Top>();

public void twit_topusers_DownloadStringCompleted(Object sender, DownloadStringCompletedEventArgs e)
    {
            // The downloaded page content
            string str = (string)e.Result;

            Regex r = new Regex("<td class=\"stat\">(.*?)</td>");
            // Find the first match in the string.
            Match m = r.Match(str);

            // Add every match to the collection
            while (m.Success)
            {
                testMatch = "";
                testMatch += System.Text.RegularExpressions.Regex.Unescape(m.Groups[0].ToString()).Trim();

                top.Add(new Top(testMatch));
                m = m.NextMatch();
            }

            listBox.ItemsSource = top;
    }

    }

The tags are

<td class="stat">14307149</td>//FIRST
<td class="stat">679761</td>
<td class="stat">3508</td>
<td class="stat">62 months ago</td>
<td class="stat">1430700</td>//FIFTH
<td class="stat">679761</td>
<td class="stat">3508</td>
<td class="stat">72 months ago</td>
<td class="stat">1430600</td>//NINTH
<td class="stat">679761</td>
<td class="stat">3508</td>
<td class="stat">82 months ago</td>

But the results I am getting are

Match 1 14307149

Match 2 679761

Match 3 3508

Match 4 62 months ago

Match 5 1430700

Match 6 679761

Match 7 3508

Match 8 72 months ago

Match 9 14307149

Match 10 679761

Match 11 3508

Match 12 62 months ago

The results I need are

Match 1 14307149

Match 2 1430700

Match 3 1430600

Can you help me with this?

Upvotes: 0

Views: 284

Answers (2)

Jake

Reputation: 4234

It doesn't look like you're checking for the row number at all. If you simply add a counter, then check if its mod of 4 is zero, you'd be good.

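int counter = 0;
while (m.Success)
{
        // Keep only matches 1, 5, 9, ... (every fourth one)
        if( counter % 4 == 0 )
        {
            testMatch = "";
            testMatch += System.Text.RegularExpressions.Regex.Unescape(m.Groups[0].ToString()).Trim();

            top.Add(new Top(testMatch));
        }

        // Advance on every iteration, otherwise the loop never ends
        m = m.NextMatch();
        counter++;
}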

Note: I am not a WP7 developer, so this code might be slightly off depending on the way WP7's coding system works.
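For reference, the counter idea can also be tried outside the download handler. The following is only a minimal sketch: the class name StatPicker and the method EveryNth are made up for illustration, and it reads Groups[1] (the captured text) instead of Groups[0] (the whole tag), on the assumption that only the cell contents are wanted.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original code.
public static class StatPicker
{
    // Returns the captured text of the first match and of every
    // `step`-th match after it (step = 4 gives matches 1, 5, 9, ...).
    public static List<string> EveryNth(string html, int step)
    {
        var results = new List<string>();
        var regex = new Regex("<td class=\"stat\">(.*?)</td>");
        int counter = 0;

        // NextMatch() runs on every iteration, so the loop always advances.
        for (Match m = regex.Match(html); m.Success; m = m.NextMatch())
        {
            if (counter % step == 0)
            {
                // Groups[1] is the text between the tags (assumption: the
                // tags themselves are not needed in the result).
                results.Add(m.Groups[1].Value.Trim());
            }
            counter++;
        }
        return results;
    }
}
```

Called as StatPicker.EveryNth(str, 4) on the tags from the question, this should give 14307149, 1430700 and 1430600, which could then be added to the ObservableCollection one by one.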

Upvotes: 2

thumbmunkeys

Reputation: 20764

Change it as follows to match only numbers:

     <td class="stat">(\d+)<\/td>

If I understand you correctly, you first have to split the string on "months ago" and then run the regex above against each piece of the split, taking the first match from each piece.
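A rough sketch of that split-then-match idea, assuming every row ends with a "... months ago" cell as in the sample tags. The names RowSplitter and FirstNumberPerRow are hypothetical and only serve the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original code.
public static class RowSplitter
{
    // Splits the markup on "months ago" (assumed to end each row) and
    // returns the first all-digit cell found in each piece.
    public static List<string> FirstNumberPerRow(string html)
    {
        var results = new List<string>();
        var regex = new Regex("<td class=\"stat\">(\\d+)</td>");

        string[] pieces = html.Split(
            new[] { "months ago" }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string piece in pieces)
        {
            // Only the first numeric cell of each piece is kept.
            Match m = regex.Match(piece);
            if (m.Success)
            {
                results.Add(m.Groups[1].Value);
            }
        }
        return results;
    }
}
```

This depends on the "months ago" text being present in every row, so a plain counter is likely the more robust choice if the page layout can vary.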

Upvotes: 0
