Reputation: 37075
I'm using wget to grab something from the web, but I don't want it to follow links in one portion of the page. I thought I could set up a proxy that strips out the parts of the page I don't want processed before returning it to wget, but I'm not sure how to accomplish that.
Is there a proxy that lets me easily modify the HTTP response in Python or Node.js?
Upvotes: 1
Views: 4152
Reputation: 11247
In Node.js I would fork node-http-proxy and customize the code to my needs.
Much simpler than writing an HTTP proxy from scratch, IMHO.
Upvotes: 0
Reputation: 1868
There are several ways you could achieve this goal. This should get you started (using Node.js). The following example fetches google.com and replaces every instance of "google" with "foobar".
// package.json file...
{
  "name": "proxy-example",
  "description": "a simple example of modifying response using a proxy",
  "version": "0.0.1",
  "dependencies": {
    "request": "1.9.5"
  }
}
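// server.js file...
var http = require("http")
var request = require("request")

var port = process.env.PORT || 8001

http.createServer(function(req, rsp){
  // Every incoming request is answered with a rewritten copy of google.com
  var options = { uri: "http://google.com" }
  request(options, function(err, response, body){
    // body is undefined when the upstream request fails
    if (err) {
      rsp.writeHead(502)
      return rsp.end("upstream request failed: " + err.message)
    }
    rsp.writeHead(200)
    rsp.end(body.replace(/google/g, "foobar"))
  })
}).listen(port)

console.log("listening on port " + port)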
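Note that the example above always fetches google.com, whatever the client asked for. To use it with wget, the server needs to fetch the URL the client actually requested. Here is a rough sketch of that using only Node's core http module. The names stripSection and createRewritingProxy and the marker strings are my own placeholders, not part of any library. It assumes plain HTTP only (HTTPS through a proxy uses CONNECT, which this does not handle) and UTF-8 HTML pages.

```javascript
// Sketch only: a tiny forward proxy built on Node's core http module.
// stripSection and createRewritingProxy are hypothetical helper names.
const http = require("http");

// Remove the first region running from startMarker through endMarker (inclusive).
// Returns the input unchanged if either marker is missing.
function stripSection(html, startMarker, endMarker) {
  const start = html.indexOf(startMarker);
  if (start === -1) return html;
  const end = html.indexOf(endMarker, start + startMarker.length);
  if (end === -1) return html;
  return html.slice(0, start) + html.slice(end + endMarker.length);
}

// Build (but do not start) a proxy server that passes HTML bodies through rewrite().
function createRewritingProxy(rewrite) {
  return http.createServer(function (clientReq, clientRes) {
    // A client configured to use an HTTP proxy sends the absolute URL
    // in the request line, so clientReq.url is the full target.
    let target;
    try {
      target = new URL(clientReq.url);
    } catch (e) {
      clientRes.writeHead(400);
      return clientRes.end("expected an absolute URL");
    }
    if (target.protocol !== "http:") {
      clientRes.writeHead(400);
      return clientRes.end("only plain http is supported");
    }

    // Ask upstream for an uncompressed body so it can be edited as text.
    const headers = Object.assign({}, clientReq.headers, {
      host: target.host,
      "accept-encoding": "identity",
    });

    const upstreamReq = http.request(
      target,
      { method: clientReq.method, headers: headers },
      function (upstreamRes) {
        const type = upstreamRes.headers["content-type"] || "";
        if (!type.startsWith("text/html")) {
          // Leave images, archives, etc. untouched.
          clientRes.writeHead(upstreamRes.statusCode, upstreamRes.headers);
          return upstreamRes.pipe(clientRes);
        }
        const chunks = [];
        upstreamRes.on("data", function (c) { chunks.push(c); });
        upstreamRes.on("end", function () {
          const body = rewrite(Buffer.concat(chunks).toString("utf8"));
          // The body length changed, so recompute the framing headers.
          const outHeaders = Object.assign({}, upstreamRes.headers);
          delete outHeaders["transfer-encoding"];
          outHeaders["content-length"] = Buffer.byteLength(body);
          clientRes.writeHead(upstreamRes.statusCode, outHeaders);
          clientRes.end(body);
        });
      }
    );

    upstreamReq.on("error", function (err) {
      if (clientRes.headersSent) return clientRes.destroy();
      clientRes.writeHead(502);
      clientRes.end("upstream request failed: " + err.message);
    });

    // Forward any request body (normally empty for wget's GET requests).
    clientReq.pipe(upstreamReq);
  });
}

// To run it (the markers are placeholders for whatever brackets the unwanted part):
// createRewritingProxy(function (html) {
//   return stripSection(html, "<!--skip-->", "<!--/skip-->");
// }).listen(8001);
```

With the server listening on port 8001, wget can be pointed at it through the http_proxy environment variable, e.g. `http_proxy=http://localhost:8001 wget -r http://example.com/`. Any links inside the stripped region never reach wget, so it has nothing to follow there.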
Upvotes: 6