Reputation: 4577
I am transferring an old website to a new server and attempting some cleanup in the process.
What I am looking for is some script or free software that can:
a) show the paths through the website (following hyperlinks, etc), so I can see what links to what
and b) some software that can see which HTML files are orphans (not linked to) in the folder structure.
Any help with either or both of these would be greatly appreciated :)
Upvotes: 0
Views: 781
Reputation: 4577
Xenu's Link Sleuth (home.snafu.de/tilman/xenulink.html) provides link spidering and, with FTP access, orphan file checking.
Upvotes: 0
Reputation: 26
a) Depending on the complexity of your site and how dynamic the content is, you can download any spider, restrict it to your website, and check the results (Burp Suite contains a pretty good spider and is altogether a tool that everyone should know).
b) After the spider has done its work, check the access time of all the files in your website's directory. Any file with an access time older than the spider's execution time is probably an orphan.
(Both solutions will be less effective on a website that uses user input to refer to pages.)
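A minimal sketch of the access-time check in (b), in Python. The function name, the `suffixes` filter, and the example path are my own placeholders, not part of any tool. It also assumes the filesystem actually records access times: volumes mounted with `noatime` or `relatime` may not update them on every read, in which case this check is unreliable.

```python
import os


def files_not_accessed_since(root, since_epoch, suffixes=(".html", ".htm")):
    """Return files under root whose last access time is older than since_epoch.

    since_epoch is a Unix timestamp, e.g. time.time() recorded just before
    the spider was started.
    """
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            # st_atime is the last access time reported by the filesystem.
            if os.stat(path).st_atime < since_epoch:
                stale.append(path)
    return sorted(stale)
```

Record `time.time()` before starting the spider, let it finish, then call something like `files_not_accessed_since("/var/www/site", start_time)` (the path is hypothetical). Treat the result as a list of candidates to review, not as files that are safe to delete.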
Upvotes: 1
Reputation: 117487
a) show the paths through the website (following hyperlinks, etc), so I can see what links to what
So basically a crawler? You could whisk something together with an HTTP library, an HTML parser, and any brand of scripting language. I don't know of any off-the-shelf scripts, though.
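To give an idea of what "whisking something together" might look like, here is a rough sketch using only the Python standard library. All names (`LinkParser`, `extract_links`, `fetch_url`, `crawl`, `max_pages`) are mine, and it makes simplifying assumptions: it only follows `<a href>` links, stays on the starting host, and ignores robots.txt, redirects, and JavaScript-generated links.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html, base_url):
    """Return absolute URLs (without #fragments) linked from the given HTML."""
    parser = LinkParser()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, href)).url for href in parser.links]


def fetch_url(url):
    """Download a page and decode it leniently as UTF-8."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch_url, max_pages=500):
    """Breadth-first crawl of one host; returns {page_url: [linked_urls]}."""
    host = urlparse(start_url).netloc
    graph = {}
    queue = deque([start_url])
    while queue and len(graph) < max_pages:
        url = queue.popleft()
        if url in graph:
            continue
        try:
            links = extract_links(fetch_page(url), url)
        except (OSError, ValueError):
            links = []  # unreachable or unsupported URL: record it with no links
        graph[url] = links
        for link in links:
            # Only follow links that stay on the site being audited.
            if urlparse(link).netloc == host and link not in graph:
                queue.append(link)
    return graph
```

The returned dictionary is the "what links to what" map: each key is a crawled page and each value is the list of URLs it links to, including external ones that were not followed.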
and b) some software that can see which HTML files are orphans (not linked to) in the folder structure.
Does your site consist of plain HTML files, or is there some sort of server-side technology, such as PHP? If it is server-side, there is no way of automatically detecting said orphans, since the pages are generated as a function of the server-side application and aren't actual files, even though they may appear as such in a browser.
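For the plain-HTML case, a local check is feasible. The sketch below is only an illustration under my own assumptions: the function and parameter names are invented, `index.html` is assumed to be the entry point, and a file counts as "linked" if any other HTML file in the tree references it through an `href` or `src` attribute. It does not resolve directory links such as `/docs/` to an index file, and it does not check whether the linking page is itself reachable.

```python
import os
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class HrefCollector(HTMLParser):
    """Collects every href/src attribute value in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.hrefs.append(value)


def find_orphans(root, entry_points=("index.html",)):
    """Return HTML files under root that no other HTML file links to."""
    root = os.path.abspath(root)
    html_files = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith((".html", ".htm")):
                html_files.add(os.path.join(dirpath, name))

    # Entry points are reached directly by visitors, so never report them.
    linked = {os.path.join(root, entry) for entry in entry_points}
    for path in html_files:
        collector = HrefCollector()
        with open(path, encoding="utf-8", errors="replace") as fh:
            collector.feed(fh.read())
        for href in collector.hrefs:
            parts = urlparse(href)
            if parts.scheme or parts.netloc or not parts.path:
                continue  # external link, mailto:, or a pure #fragment
            rel = unquote(parts.path)
            # Site-absolute paths resolve from root, others from the file's folder.
            base = root if rel.startswith("/") else os.path.dirname(path)
            target = os.path.normpath(os.path.join(base, rel.lstrip("/")))
            if target != path:  # a page linking to itself does not count
                linked.add(target)
    return sorted(html_files - linked)
```

Run it against a local copy of the site's document root and review the output by hand, since pages reached only through forms, scripts, or external sites will show up as false positives.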
Upvotes: 1
Reputation: 57268
http://haveamint.com/ says it all: beautiful GUI, simple integration, lightweight, database storage, JavaScript tracking.
Have a mint (y)
Or you can just use Google Analytics, which is used by pretty much every site these days.
Upvotes: 1