Reputation: 51
I'm using this URL because I want to scrape HTML with axios and cheerio: https://www.mgewholesale.com/ecommerce/Bags%20and%20Cases.cfm?cat_id=876
I tested a GET request in Postman and it works fine, returning status 200.
My code also gets status 200, but response.data is empty.
Update: with the code below I get the actual response with the data object filled, but when I try to access response.data I get the error shown after the code:
const axios = require('axios');
const cheerio = require('cheerio');
const https = require('https');
let fs = require('fs');
const httpsAgent = new https.Agent({ keepAlive: true });
axios
  .get('https://www.mgewholesale.com/ecommerce/Bags%20and%20Cases.cfm', {
    httpsAgent,
    params: {
      cat_id: '876',
    },
    headers: {
      'Accept-Encoding': 'gzip, deflate, br',
    },
    //is the same as set the entire url
  })
  .then((res) => {
    console.log(res.data)
    //this triggers the error
    // let status = res.status;
    console.log(status);
    //Status 200
    console.log(response)
    //This brings the entire response with data object filled
  });
ERROR:
(node:9068) UnhandledPromiseRejectionWarning: Error: read ECONNRESET
at TLSWrap.onStreamRead (internal/stream_base_commons.js:205:27)
(node:9068) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 2)
(node:9068) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
I tried both the entire URL and the URL with its params passed separately, and both return empty data. If I try another URL such as https://www.google.com, I get the actual HTML.
Upvotes: 2
Views: 12767
Reputation: 386
The issue is that your query params are not getting added on correctly.
Remove + '.json' from the second argument of axios.get.
I'm surprised that this isn't throwing an error on its own, but apparently axios is just playing along and appending 0=[object+Object].json, turning your URL into: https://www.mgewholesale.com/ecommerce/Bags%20and%20Cases.cfm?0=[object+Object].json
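To see where the stray [object Object].json text comes from, here is a small sketch that makes no network request. It shows that concatenating the config object with '.json' turns it into a plain string, so the params never reach axios as an object. The second half uses Node's built-in URL class purely as an illustration of the query string that params: { cat_id: '876' } is meant to produce; it is not a claim about how axios builds URLs internally.

```javascript
// The config object from the question.
const config = { params: { cat_id: '876' } };

// Concatenating an object with a string coerces the object to a string,
// so the params are lost before axios ever receives them.
const broken = config + '.json';
console.log(broken);

// The query string the params option is meant to produce,
// built here with Node's URL class for illustration.
const url = new URL('https://www.mgewholesale.com/ecommerce/Bags%20and%20Cases.cfm');
for (const [key, value] of Object.entries(config.params)) {
  url.searchParams.set(key, value);
}
console.log(url.href);
```

Passing the config object on its own, without the concatenation, lets axios serialize params as intended.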
I can't add a comment to the other answer, but it is incorrect: you are properly using promise chaining (.then) after your call to .get().
Edit:
For this particular URL, it looks like you will need some additional headers, as well as the ability to keep the connection alive after the initial response:
const axios = require('axios');
const cheerio = require('cheerio');
const https = require('https');
let fs = require('fs');
// Reuse the connection instead of closing it after the first response
const httpsAgent = new https.Agent({ keepAlive: true });
axios
  .get('https://www.mgewholesale.com/ecommerce/Bags%20and%20Cases.cfm', {
    httpsAgent,
    // Same as appending ?cat_id=876 to the URL
    params: {
      cat_id: '876',
    },
    headers: {
      'Accept-Encoding': 'gzip, deflate, br',
    },
  })
  .then((res) => {
    let status = res.status;
    console.log(status);
    // This should now output the html content
    console.log(res.data);
  })
  .catch(err => console.error(err));
Edit 2:
Added the proper method for handling errors to the code above.
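For a failure like read ECONNRESET, it can help to log which stage of the request failed. The following is a minimal sketch, assuming the error has the shape described in the axios error-handling docs (err.response when the server replied, err.request when no reply arrived, err.code for the Node error code). describeRequestError is a hypothetical helper name, and the objects passed to it are hand-made stand-ins, not real axios errors.

```javascript
// Hypothetical helper: summarize an axios-style error for logging.
function describeRequestError(err) {
  if (err.response) {
    // The server answered, but with a status outside the 2xx range.
    return `server responded with status ${err.response.status}`;
  }
  if (err.request) {
    // The request was sent but no response came back (e.g. ECONNRESET).
    return `no response received (${err.code || 'unknown code'})`;
  }
  // Something failed before the request was sent.
  return `request setup failed: ${err.message}`;
}

// Hand-made stand-ins for illustration only.
const resetError = Object.assign(new Error('read ECONNRESET'), {
  code: 'ECONNRESET',
  request: {},
});
const notFoundError = Object.assign(new Error('Request failed'), {
  response: { status: 404 },
});

console.log(describeRequestError(resetError));
console.log(describeRequestError(notFoundError));
```

In the request chain it could be used as .catch(err => console.error(describeRequestError(err))).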
Edit 3:
Make sure the variables you are logging in your .then() block are all defined. Also, to get more helpful errors, add the .catch() to the end:
  .then((res) => {
    console.log(res.data);
    //this triggers the error
    let status = res.status;
    console.log(status);
    //Status 200
    console.log(res);
    //This brings the entire response with data object filled
  })
  .catch(err => console.error(err));
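Why undefined variables matter here: anything thrown inside a .then() callback rejects the promise, and without a .catch() Node reports it as an unhandled rejection. The sketch below uses a resolved promise as a stand-in for the axios call, so no request is made. fakeResponse and logResponse are made-up names for illustration.

```javascript
// Stand-in for an axios response; no network request is made.
const fakeResponse = { status: 200, data: '<html></html>' };

function logResponse(res) {
  console.log(res.data);
  // `status` is never declared, so this line throws a ReferenceError.
  console.log(status);
}

// The ReferenceError thrown inside .then() becomes a rejection,
// which the trailing .catch() receives instead of going unhandled.
const caught = Promise.resolve(fakeResponse)
  .then(logResponse)
  .catch((err) => {
    console.error(`${err.name}: ${err.message}`);
    return err;
  });
```

With the .catch() removed, the same mistake would surface as an UnhandledPromiseRejectionWarning like the one in the question.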
Upvotes: 4