Reputation: 7559
I'm interested in a method for extracting data from a web page, for example the headline, a cover image, and some text, the way Facebook does when you share a link on your wall.
I've thought about it. Yes, I can send an HTTP request to the page, fetch the whole document, and parse it afterwards. But how does Facebook do this successfully for every web page, given that websites are not all structured the same way?
What is the best algorithm for finding the headline, cover image, and some text from a given URL?
Upvotes: 0
Views: 711
Reputation: 1213
Check out the following script. It uses meta tags to collect data from a website: http://www.techumber.com/2012/11/exactly-facebook-like-url-parsing-using.html
Upvotes: 1
Reputation: 1355
There is no perfect solution to this. Facebook uses meta tags (set by the site's webmaster) to get a good result. If the tags are not present, the result is poor. If you are concerned with the practical side of the issue, a good start is to look at the meta tags used by Facebook and other social networks =)
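
To make the meta tag approach concrete, here is a minimal sketch in Python using only the standard library. It assumes the meta tags in question are the Open Graph ones (`og:title`, `og:image`, `og:description`) and falls back to the page's `<title>` when `og:title` is missing. The names `PreviewParser` and `extract_preview` are made up for this illustration, and this is not a claim about how Facebook's own crawler is implemented.

```python
from html.parser import HTMLParser


class PreviewParser(HTMLParser):
    """Collect Open Graph meta tags, plus <title> as a fallback (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.og = {}            # e.g. {"title": ..., "image": ...}
        self.title = ""         # text of the <title> element
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            prop = attrs.get("property") or ""
            content = attrs.get("content")
            # Keep the first occurrence of each og:* property.
            if prop.startswith("og:") and content:
                self.og.setdefault(prop[3:], content)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_preview(html):
    """Return headline, image URL and description found in an HTML string."""
    parser = PreviewParser()
    parser.feed(html)
    parser.close()
    return {
        # Assumption: prefer og:title, otherwise use the plain <title> text.
        "title": parser.og.get("title") or parser.title.strip() or None,
        "image": parser.og.get("image"),
        "description": parser.og.get("description"),
    }
```

You would download the page yourself (for example with `urllib.request.urlopen`) and pass the decoded HTML to `extract_preview`. When a page has no such tags, `image` and `description` come back as `None`, which is the "poor result" case; anything beyond that (picking the largest image, taking the first paragraph) is heuristic guesswork.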
Upvotes: 0