Reputation: 33
I am running a small personal Python service that makes requests through a paid rotating proxy with limited bandwidth and scrapes data from websites that have no API.
My question is: how can I reduce the bandwidth used when scraping these websites? Can I somehow get only the pure text or something like that?
I appreciate any help
Upvotes: 0
Views: 957
Reputation: 923
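Maybe you can try HTTP compression by adding an Accept-Encoding: gzip header to your requests (the server then marks a compressed response with Content-Encoding: gzip). If the proxy and the target website both support this, you should be able to reduce the bandwidth. Note that the requests library already sends Accept-Encoding: gzip, deflate by default and decompresses the response for you, so it is worth checking that your code does not override that header. You can check this question on how to add such a header in the requests library.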
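As a rough sketch of the idea, the snippet below asks for a gzip-compressed body and decompresses it only if the server actually compressed it. The request header that does this is Accept-Encoding. It uses only the standard library (urllib instead of requests) so that it is self-contained; the function names build_request, decode_body and fetch, and the proxy_url parameter, are made up for illustration and are not from the original post.

```python
import gzip
import urllib.request


def build_request(url):
    """Build a GET request that asks the server for a gzip-compressed body."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})


def decode_body(raw, content_encoding):
    """Decompress the body only if the server says it is gzip-encoded."""
    if (content_encoding or "").lower() == "gzip":
        return gzip.decompress(raw)
    return raw


def fetch(url, proxy_url=None):
    """Fetch url, optionally through a proxy (proxy_url is a placeholder).

    Returns the decoded body and the number of bytes read from the response,
    which is what counts against the proxy bandwidth for the body.
    """
    handlers = []
    if proxy_url:
        handlers.append(
            urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
        )
    opener = urllib.request.build_opener(*handlers)
    with opener.open(build_request(url), timeout=30) as resp:
        raw = resp.read()
        body = decode_body(raw, resp.headers.get("Content-Encoding"))
    return body, len(raw)
```

Calling fetch("https://example.com", proxy_url="http://user:pass@proxy.example:8000") with your own proxy address and comparing len(body) with the returned byte count should show how much the compression saves, assuming the site supports it. HTML is repetitive text, so it usually compresses well.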
Upvotes: 1