Reputation: 441
This may be a dumb question, but I really have no idea and I'm utterly curious! So please bear with me.
As far as I know, search engines just read the HTML and the text of a site. They usually ignore CSS, or at least part of it, and arguably they cannot read images. Or can they?
If they really cannot read those, or choose to ignore them, then how do they make a screenshot, which is a page presented exactly the way the CSS styles it, images included?
If they do not read CSS or images, and they are not like a human being who opens the page on his or her screen, how do they make the screenshot?
Thanks!
Upvotes: 0
Views: 786
Reputation: 2201
Are you referring to Google's new screenshot feature, or their old cache feature? Your question is talking about screenshots and doesn't mention the cache at all, but your comments on your question seem to imply that you're referring to the cache, not the screenshots.
In the case of the screenshots:
You are correct that search engines usually only read the HTML and text on a website, because that's all they need for indexing. But that doesn't mean they can't read the rest.
When they want to take a screenshot of a site, they do exactly what a normal browser does when a user visits it: download the HTML, the CSS, the images, and everything else, and render it all with a browser rendering engine such as WebKit.
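To make the "download everything" step concrete, here is a minimal Python sketch of how a screenshot service might work out which extra files to fetch before rendering. It is only an illustration, not how any particular search engine does it: the names `ResourceCollector` and `list_resources` are made up for this example, it only looks at stylesheets, images and scripts, and it stops short of the actual rendering, which would be done by a real browser engine.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collect the URLs a renderer would have to fetch to draw the page.

    Hypothetical helper for illustration; a real browser engine handles
    many more cases (fonts, CSS imports, iframes, and so on).
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "stylesheet" in rel:
            url = attrs.get("href")   # external CSS file
        elif tag in ("img", "script"):
            url = attrs.get("src")    # image or script file
        else:
            url = None
        if url:
            # Resolve relative URLs against the page's own address.
            self.resources.append(urljoin(self.base_url, url))


def list_resources(html, base_url):
    """Return absolute URLs of the stylesheets, images and scripts in html."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.resources
```

Called as `list_resources(page_html, "http://example.com/dir/index.html")`, this returns the absolute URLs in document order. A screenshot service would download each of them and hand the whole bundle to the rendering engine, which then paints the page to an image instead of to a screen.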
In the case of the cache:
The search engine usually just stores the HTML without (or before) parsing it. It sends the saved HTML to your browser, and your browser pulls everything else in the page (images, etc.) from the original website. The search engine isn't reading anything; it's just saving the page verbatim (well, with minor changes, namely URL rewriting) and giving it to your browser.
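The URL rewriting can be very light. As a rough sketch, and only one possible approach rather than what any specific search engine is known to do, a cache could insert a `<base>` tag into the saved HTML so that your browser resolves relative image and CSS links against the original site. The function name `add_base_tag` is invented for this example.

```python
import re
from html import escape


def add_base_tag(saved_html, original_url):
    """Return saved_html with a <base> tag pointing at the original site.

    Hypothetical helper: with the tag in place, the browser resolves
    relative links such as "logo.png" against original_url rather than
    against the cache server's address.
    """
    base = '<base href="%s">' % escape(original_url, quote=True)
    # Insert the tag right after the first opening <head ...> tag.
    new_html, count = re.subn(
        r"(<head\b[^>]*>)",
        lambda match: match.group(1) + base,
        saved_html,
        count=1,
        flags=re.IGNORECASE,
    )
    if count == 0:
        # No <head> found: fall back to putting the tag at the very top.
        new_html = base + saved_html
    return new_html
```

The rest of the saved page is passed through untouched, which matches the point above: the cache does not need to understand the images or the CSS, it only needs your browser to know where to fetch them.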
Upvotes: 1
Reputation: 15202
There are apps that take screenshots of pages as they would be displayed in a chosen browser.
Browsershots is an example of an online service that does this.
Here are some links and projects for webpage thumbnail generators:
Upvotes: 1
Reputation: 27323
Search engines don't use CSS and image content for indexing, but they can store them on their servers to make a cached version of the site.
In the case of Google, I think they store only text files: HTML, CSS, maybe JavaScript, but no images.
Upvotes: 0
Reputation: 2042
Maybe I'm not understanding your question, but...
You seem to be using "read an image" to mean loading the image data into the search engine. The search engine does do this (and the same goes for CSS). When people say search engines ignore images, they mean the engine doesn't treat them as meaningful, searchable data. In other words, if I make an image that has the word "Hello" on it, you and I "read" it in the sense that we see and understand that the image contains a word. A search engine typically will not attempt to do this. It will, however, "read" the image into its storage if it wants to be able to present it to a user later.
Upvotes: 0