DannyTree

Reputation: 1173

filtering externally loaded javascript in htmlunit

While using htmlunit to scrape a webpage, I occasionally see warnings like the following, which flood the console output:

Jul 24, 2011 5:12:59 PM com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter warning
WARNING: warning: message=[Calling eval() with anything other than a primitive string value 
will simply return the value. Is this what you intended?] sourceName=[http://ad.doubleclick.net/adj/N5762.morningstar.com/B5553006.25;sz=728x90;click0=http://ads.morningstar.com/RealMedia/ads/click_lx.ads/www.morningstar.com/quicktake/fund/L34/648978540/TopLeft/Morningstar/JPM_FRpt_728x90_Jul_3827448/Fund_Reports_728x90_content.html/656d5477595534723465554144664a2b?;ord=648978540?] line=[356] lineSource=[null] lineOffset=[0]

Is there a way that I can have htmlunit ignore javascript from

or even just

Likewise, is there a way to have htmlunit only interpret the javascript on a webpage containing a particular substring or matching a regex?

Upvotes: 4

Views: 836

Answers (1)

MrSmith42

Reputation: 10161

You might be able to remove the unwanted JavaScript by implementing your own ScriptPreProcessor. Your ScriptPreProcessor could detect the JavaScript you do not want to execute and then remove it from the page before it runs.

I have not tried it yet, but it might work.
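
A rough sketch of that idea, equally untested against a live page: the decision logic lives in a small helper that matches the script's source URL against a regex, and the HtmlUnit wiring is shown only in comments. The class name ScriptSourceFilter and the pattern "doubleclick\\.net" are made up for illustration, and the preProcess signature in the comment is the one I remember from HtmlUnit 2.x, so check it against the version you use. Returning an empty string rather than null for a blocked script is a cautious assumption on my part.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of HtmlUnit): decides which scripts to drop
// based on the URL they were loaded from.
final class ScriptSourceFilter {
    private final Pattern blocked;

    ScriptSourceFilter(String blockedRegex) {
        this.blocked = Pattern.compile(blockedRegex);
    }

    /** True if the script's source name contains a match for the blocked pattern. */
    boolean isBlocked(String sourceName) {
        return sourceName != null && blocked.matcher(sourceName).find();
    }

    /** Returns an empty script for blocked sources, otherwise the code unchanged. */
    String filter(String sourceCode, String sourceName) {
        return isBlocked(sourceName) ? "" : sourceCode;
    }
}

// Possible wiring into HtmlUnit (assumption: verify the signature for your version):
//
// final ScriptSourceFilter filter = new ScriptSourceFilter("doubleclick\\.net");
// webClient.setScriptPreProcessor(new ScriptPreProcessor() {
//     public String preProcess(HtmlPage page, String sourceCode, String sourceName,
//                              int lineNumber, HtmlElement element) {
//         return filter.filter(sourceCode, sourceName);
//     }
// });
```

For the second part of the question (only running scripts that match a pattern), the same helper could be inverted so that filter() returns the code only when the pattern matches and an empty string otherwise.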

Upvotes: 2
