user3788761

Reputation: 55

How do I parse specific text from long content?

contents is a long string. In several places it contains this text:

<img src="http://rotter.net/forum/Images/locked_icon_general.gif" border=

public static void FilterLockedThreads(string contents)
        {
            //string middle = \"http://rotter.net/forum/Images/locked_icon_general.gif\"";
            string firstTag = "<img src=";
            string lastTag = "border=\"";
            int f = 0;
            int startPos = 0;
            while (true)
            {

                f = contents.IndexOf(firstTag, startPos);
                if (f == -1)
                {
                    break;
                }
                int g = contents.IndexOf(lastTag, f);
                startPos = g + lastTag.Length;
                string responser = contents.Substring(f + firstTag.Length, g - f - firstTag.Length);
                lockedThreads.Add(responser);
            }
        }

I want the List lockedThreads to contain only:

http://rotter.net/forum/Images/locked_icon_general.gif

In this case the text appears in 3 places in contents, so lockedThreads should contain 3 items, each holding the string: http://rotter.net/forum/Images/locked_icon_general.gif

The problem is that, the way the code is now, I get 3 items, but each one contains a very long string instead of only: http://rotter.net/forum/Images/locked_icon_general.gif

What is wrong with the code as it is now? I tried using a breakpoint but couldn't find the problem.

Upvotes: 0

Views: 54

Answers (1)

L.B

Reputation: 116168

Using HtmlAgilityPack

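// Requires the HtmlAgilityPack NuGet package and "using System.Linq;".
HtmlAgilityPack.HtmlWeb web = new HtmlAgilityPack.HtmlWeb();
var doc = web.Load(url); // url: address of the page to parse
// Note: SelectNodes returns null when no node matches the XPath.
var imgUrls = doc.DocumentNode.SelectNodes("//img[@border and @src]")
                    .Select(i => i.Attributes["src"].Value)
                    .ToList();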
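
If adding a library is not an option, the string-based loop from the question can probably be repaired. A likely cause of the very long strings is that firstTag matches any image on the page, while the search for border=" then runs forward to wherever that text next appears, which may be in a much later tag. The captured value would also include the surrounding quote characters, and a missing lastTag (g == -1) is not handled. Below is a minimal sketch, assuming every src value is wrapped in double quotes; the class name LockedThreadFilter and the method name ExtractImageSources are hypothetical and not taken from the question.

```csharp
using System;
using System.Collections.Generic;

public static class LockedThreadFilter
{
    // Hypothetical helper: returns the src value of every <img src="..."> tag.
    // Assumes the src attribute directly follows "<img " and uses double quotes.
    public static List<string> ExtractImageSources(string contents)
    {
        const string firstTag = "<img src=\""; // includes the opening quote
        const string lastTag = "\"";           // closing quote of the src value
        var results = new List<string>();
        int startPos = 0;

        while (true)
        {
            int f = contents.IndexOf(firstTag, startPos, StringComparison.Ordinal);
            if (f == -1)
            {
                break; // no more image tags
            }

            int valueStart = f + firstTag.Length;
            int g = contents.IndexOf(lastTag, valueStart, StringComparison.Ordinal);
            if (g == -1)
            {
                break; // unterminated attribute: stop instead of throwing
            }

            results.Add(contents.Substring(valueStart, g - valueStart));
            startPos = g + lastTag.Length;
        }

        return results;
    }
}
```

This returns the src of every image, so to keep only the locked-thread icons you would compare each result against http://rotter.net/forum/Images/locked_icon_general.gif before adding it to lockedThreads.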

Upvotes: 1
