Neeraj Verma

Reputation: 2314

Image extraction: "Uri is too long"

I am working on software that extracts images from a web page. I have created this function:

    public static void GetAllImages()
    {
        WebClient x = new WebClient();
        string source = x.DownloadString(@"http://www.bbc.com");

        var document = new HtmlWeb().Load(source);
        var urls = document.DocumentNode.Descendants("img")
                           .Select(e => e.GetAttributeValue("src", null))
                           .Where(s => !String.IsNullOrEmpty(s));

        document.Load(source);
    }

It fails with the error "Uri is too long".

I tried to use Uri.EscapeDataString, but I can't work out where it should go.

Any help would be appreciated.

Upvotes: 0

Views: 120

Answers (1)

spender

Reputation: 120450

HtmlWeb.Load takes a URL as its source and handles downloading the content itself. You don't need a separate WebClient for this; it's all taken care of.

What you are doing is downloading the content, then attempting to use the downloaded content (HTML) as a URL (probably under the assumption that Load means Parse).
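
If you do want to keep the download separate and parse a string you already hold, the parsing-only call in HtmlAgilityPack is HtmlDocument.LoadHtml. A minimal sketch, assuming the HtmlAgilityPack package is referenced; the ImageSrcParser class and GetImageSources method are names made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class ImageSrcParser
{
    // Hypothetical helper: parses markup that is already in memory.
    // No network access happens here.
    public static List<string> GetImageSources(string html)
    {
        var document = new HtmlDocument();

        // LoadHtml treats its argument as HTML text, not as a URL or file path.
        document.LoadHtml(html);

        return document.DocumentNode.Descendants("img")
            .Select(e => e.GetAttributeValue("src", null))
            .Where(s => !String.IsNullOrEmpty(s))
            .ToList();
    }
}
```

With this approach, the string returned by WebClient.DownloadString would be passed to GetImageSources rather than to HtmlWeb.Load.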

So remove

WebClient x = new WebClient();
string source = x.DownloadString(@"http://www.bbc.com");

then change the next line to

var document = new HtmlWeb().Load(@"http://www.bbc.com");

and you'll be good to go.
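
Put together, the corrected method might look like the sketch below. The pageUrl parameter, the ImageExtractor class, the ToAbsolute helper, and the step that resolves relative src values against the page address are my additions for illustration, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class ImageExtractor
{
    // Downloads and parses the page in one step, then collects <img> sources.
    public static List<Uri> GetAllImages(string pageUrl)
    {
        // HtmlWeb.Load takes the URL itself and performs the download.
        var document = new HtmlWeb().Load(pageUrl);

        var sources = document.DocumentNode.Descendants("img")
            .Select(e => e.GetAttributeValue("src", null))
            .Where(s => !String.IsNullOrEmpty(s));

        return ToAbsolute(new Uri(pageUrl), sources);
    }

    // Hypothetical helper: resolves each src against the page address and
    // silently skips values that cannot be turned into a URI.
    public static List<Uri> ToAbsolute(Uri baseUri, IEnumerable<string> sources)
    {
        var result = new List<Uri>();
        foreach (var src in sources)
        {
            Uri absolute;
            if (Uri.TryCreate(baseUri, src, out absolute))
            {
                result.Add(absolute);
            }
        }
        return result;
    }
}
```

Calling ImageExtractor.GetAllImages(@"http://www.bbc.com") would then return the image addresses as Uri objects, which can be handed to a WebClient if the image files themselves need downloading.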

Upvotes: 1
