Reputation: 279
My company has Google Search running on our sites, indexing all pages as far as I know. I've developed a document system that is also being indexed by Google. The pages in the system are dynamically generated, so URLs such as www.mysite.com/doc.aspx?id=234, www.mysite.com/doc.aspx?id=236, etc. are indexed. The problem is that some seemingly random pages (say, www.mysite.com/doc.aspx?id=235) are not indexed, for no reason I can see. Where do I look to resolve this? Any ideas?
Upvotes: 0
Views: 1313
Reputation: 3816
Here is a short and very simplified outline of how Google processes your site(s):
discovery -> crawling -> indexing -> ranking (->feedback)
discovery: the process of Google discovering the pages of your site(s). This can happen via links in HTML or via a sitemap.xml (and via URLs in on-page JavaScript, RSS or Atom feeds, ... basically any URL Google can find somewhere)
crawling: the process of google fetching the content of a discovered url (and pushing newly found URLs into the discovery queue)
indexing: storing the discovered and crawled content into their database and making it searchable
ranking: matching the indexed content with a user query and, if it is important enough, returning it as a visible SERP listing to the user.
feedback: based on the click/no-click behavior and data collected from other sources (presumed ISDN data and Google Toolbar, Chrome browser reports, ...), Google gathers feedback about user behavior on its SERP (and after the click).
So basically, even if you communicate all your URLs to Google (e.g. via sitemap.xml), Google will not necessarily crawl all of your URLs, index them, or rank them visibly.
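To make the sitemap.xml route concrete, here is a minimal sketch of generating one for the dynamically generated document pages. The function name `build_sitemap`, the `http://` scheme, and the list of ids are assumptions for illustration; a real system would pull the ids from its own database. As noted above, listing a URL only helps discovery and does not guarantee crawling or indexing.

```python
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(doc_ids, base_url="http://www.mysite.com/doc.aspx"):
    """Return sitemap XML (as a string) with one <url> entry per document id.

    base_url is an assumption; adjust the scheme and host to the real site.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for doc_id in doc_ids:
        url = ET.SubElement(urlset, "url")
        loc = ET.SubElement(url, "loc")
        loc.text = f"{base_url}?id={doc_id}"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical ids, including the page that is currently missing from the index.
sitemap_xml = build_sitemap([234, 235, 236])
print(sitemap_xml)
```

The resulting string would be saved as sitemap.xml and submitted to Google, for example through its webmaster tools.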
OK, so what are the low-hanging fruits to get more pages into the index (where they at least have a chance to rank for something)?
P.S.: just as a side note, the crawling step is optional. Even uncrawled URLs (e.g. ones blocked via robots.txt) can get indexed (and rank), but that's not very common.
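Since robots.txt decides whether a URL may be crawled at all, one quick diagnostic is to test the missing URLs against the site's rules. The sketch below uses Python's standard `urllib.robotparser`; the robots.txt content and the `/admin/` path are made up for illustration, and the real file would have to be fetched from the site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; replace with the site's real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""


def is_crawlable(url, robots_txt=ROBOTS_TXT, user_agent="Googlebot"):
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


doc_allowed = is_crawlable("http://www.mysite.com/doc.aspx?id=235")
admin_allowed = is_crawlable("http://www.mysite.com/admin/settings.aspx")
print(doc_allowed, admin_allowed)
```

If a document URL turns out to be disallowed, that would explain a missing crawl, although as mentioned above it does not strictly prevent indexing.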
Upvotes: 6
Reputation: 700302
Not all pages are indexed; the indexing engine simply deems some pages uninteresting. On our site about 80% of the pages are indexed, which is considered very good for that type of site; very few sites have a higher rate.
As Daniel mentioned, having links to the page is crucial, otherwise it won't be found at all. The page also has to contain some information that is unique to that page, and preferably a unique title, or it may be classified as a duplicate.
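A rough way to check the unique-title point is to group the generated pages by their `<title>` and look for titles that appear more than once. This is only a sketch: the helper names and the sample HTML are invented, and it says nothing about how Google itself detects duplicates.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html):
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def find_duplicate_titles(pages):
    """pages maps URL -> HTML. Returns {title: [urls]} for titles used more than once."""
    by_title = defaultdict(list)
    for url, html in pages.items():
        by_title[extract_title(html)].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


# Hypothetical page contents for illustration.
pages = {
    "/doc.aspx?id=234": "<html><head><title>Document</title></head><body>A</body></html>",
    "/doc.aspx?id=235": "<html><head><title>Document</title></head><body>B</body></html>",
    "/doc.aspx?id=236": "<html><head><title>Invoice policy</title></head><body>C</body></html>",
}
duplicates = find_duplicate_titles(pages)
print(duplicates)
```

Pages that share a generic title such as "Document" are the first candidates for getting a title derived from their actual content.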
Upvotes: 0
Reputation: 2921
I agree with Daniel. You need a page with a list of links, or a paginated page listing the links.
Also, dynamic URLs are bad for SEO; friendly URLs are the better way. Take a look at ISAPIRewrite or Routing.
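The idea behind friendly URLs can be sketched independently of the rewriting tool. The example below is written in Python purely for illustration (it is not ISAPIRewrite or Routing configuration), and the `/doc/<id>/<slug>` scheme, the function names, and the sample title are all assumptions: a public path like `/doc/235/invoice-policy-2010` is mapped back to the existing dynamic page.

```python
import re


def slugify(title):
    """Lowercase the title and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def friendly_url(doc_id, title):
    """Build the public, friendly path for a document (hypothetical scheme)."""
    return f"/doc/{doc_id}/{slugify(title)}"


# Matches /doc/<id> with an optional trailing slug.
FRIENDLY_PATTERN = re.compile(r"^/doc/(\d+)(?:/[a-z0-9-]*)?$")


def rewrite(path):
    """Map a friendly path back to the dynamic URL, or return None if it does not match."""
    match = FRIENDLY_PATTERN.match(path)
    if match is None:
        return None
    return f"/doc.aspx?id={match.group(1)}"


public_path = friendly_url(235, "Invoice Policy (2010)")
internal_path = rewrite(public_path)
print(public_path, "->", internal_path)
```

Only the numeric id is used for the lookup, so the slug can change with the title without breaking existing links.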
I hope this helps.
Upvotes: 0
Reputation: 174299
AFAIK, pages are not indexed if no other pages link to them. Maybe not a single page links to the non-indexed pages?
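One way to test this hypothesis is to collect the links on every page and list the pages that nothing links to. The following is a minimal sketch with made-up page contents; the regex-based `href` extraction is a simplification and a real check would crawl the live site with a proper HTML parser.

```python
import re

# Simplistic href extraction; good enough for a sketch, not for arbitrary HTML.
HREF_PATTERN = re.compile(r'href="([^"]+)"')


def find_unlinked(pages):
    """pages maps URL -> HTML. Returns the sorted URLs that no other page links to."""
    linked = set()
    for url, html in pages.items():
        for target in HREF_PATTERN.findall(html):
            if target != url:  # a page linking to itself does not count
                linked.add(target)
    return sorted(set(pages) - linked)


# Hypothetical site: the index page links to documents 234 and 236 only.
pages = {
    "/index.aspx": '<a href="/doc.aspx?id=234">A</a> <a href="/doc.aspx?id=236">C</a>',
    "/doc.aspx?id=234": '<a href="/index.aspx">Home</a>',
    "/doc.aspx?id=235": '<a href="/index.aspx">Home</a>',
    "/doc.aspx?id=236": '<a href="/index.aspx">Home</a>',
}
unlinked = find_unlinked(pages)
print(unlinked)
```

In this made-up example only `/doc.aspx?id=235` has no inbound link, which would match the symptom described in the question.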
Upvotes: 0