Brandon Watson

Reputation: 1775

Microsoft Web Matrix

Pretty easy question, I hope: does anyone know of a tool that will effectively scrape sites built with Microsoft Web Matrix? I could write the code in Python, but it would take far longer than I want to dedicate to the task, mainly because of the really bad and ugly HTML that Web Matrix generates.

I have tried WebHarvy, Helium Scraper, and the Web Scraper plugin for Chrome. WebHarvy choked on the HTML and couldn't load subsequent pages. Helium Scraper was able to move from one details page to the next (it followed the Next links), but it did not extract the content from within the details pages. The Web Scraper plugin could not navigate links at all; its popup window just displayed an error page. My gut tells me this has to do with something specific to ASP.NET, but I could be wrong.

Any pointers or suggestions appreciated.

Upvotes: 1

Views: 107

Answers (1)

Knox

Reputation: 2919

You know there are two completely different versions of Microsoft Web Matrix, right? There's the one from 2003; I have no idea what its HTML looks like. Then there's the one from 2011 onward, which uses Razor .cshtml source files to produce its HTML. In the 2011+ one, you write the HTML by hand; there's no drag and drop, so it's unlikely you'll get consistent HTML from site to site.
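
If the site turns out to be the older, Web Forms style of ASP.NET, one possible explanation for scrapers failing on the Next links is that those links are `javascript:__doPostBack(...)` calls rather than plain URLs, so following them means re-posting the page's hidden form fields. That is only a guess about this particular site. Below is a rough Python sketch of the idea using just the standard library; the helper names (`HiddenFieldParser`, `build_postback_form`) are made up for illustration, and it assumes the page uses the usual `__VIEWSTATE`-style hidden inputs.

```python
import re
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


# Matches hrefs such as: javascript:__doPostBack('target','argument')
POSTBACK_RE = re.compile(r"__doPostBack\(\s*'([^']*)'\s*,\s*'([^']*)'\s*\)")


def build_postback_form(page_html, link_href):
    """Return the form fields to POST for a __doPostBack link.

    Returns None when link_href is an ordinary URL that can simply be fetched.
    """
    match = POSTBACK_RE.search(link_href)
    if match is None:
        return None
    parser = HiddenFieldParser()
    parser.feed(page_html)
    form = dict(parser.fields)  # carries __VIEWSTATE and friends, if present
    form["__EVENTTARGET"] = match.group(1)
    form["__EVENTARGUMENT"] = match.group(2)
    return form
```

The returned dict would then be sent as a form-encoded POST to the page's own URL (for example with `requests.post(url, data=form)` inside a session so cookies are kept), and the response parsed the same way for the next page. If the site is the 2011+ Razor kind, the links are more likely plain URLs and none of this should be needed.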

Upvotes: 1
