Finger twist

Reputation: 3786

Storing HTML snippets with Python

I'm scraping pages using Beautiful Soup, and I would like to save some HTML snippets offline and compare against them every time I scrape again, to check whether there has been any change to the page.

Aside from directly writing out an HTML file, what would be the best strategy (and which format) for saving a lot of HTML snippets offline for later comparison?

Thank you

Upvotes: 0

Views: 87

Answers (1)

Michael Lorton

Reputation: 44386

This is a classic use for a hash function. Algorithms like md5 and sha256 boil any amount of text down to a short, fixed-length digest (16 bytes for md5, 32 bytes for sha256). You can store just the hash of each snippet you parse, and then when you scrape a new copy, calculate its hash and compare the two hashes. If they differ, the snippet has changed.
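
As a rough sketch of that idea, something like the following could work, using Python's standard hashlib module. The function names snippet_hash and has_changed and the sample snippets are made up for illustration; they are not from the question.

```python
import hashlib


def snippet_hash(html: str) -> str:
    """Return the SHA-256 hex digest of an HTML snippet (hypothetical helper)."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def has_changed(stored_hash: str, new_html: str) -> bool:
    """Compare a freshly scraped snippet against a previously stored hash."""
    return snippet_hash(new_html) != stored_hash


# Illustrative data only: the hash saved on an earlier scrape.
old_snippet = '<div class="price">10</div>'
stored = snippet_hash(old_snippet)

# On a later scrape, hash the new snippet and compare.
unchanged = has_changed(stored, '<div class="price">10</div>')
changed = has_changed(stored, '<div class="price">12</div>')
```

Keep in mind that a hash only tells you that something changed, not what changed, and any difference in the text (even whitespace) produces a different hash. If you need to see the actual difference, you would still have to store the snippet itself.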

Upvotes: 2
