Reputation: 701
To give you some background: many (if not all) websites load their images one by one, so if there are a lot of images and/or you have a slow computer, most of the images won't show up. This is avoidable for the most part; however, if you're running a script to extract image URLs, you don't need to see the images, you just want their URLs. My question is as follows:
Is it possible to trick a webpage into thinking an image is done loading so that it will start loading the next one?
Upvotes: 4
Views: 913
Reputation: 2212
Use a plugin called Lazy Load. What it does is load the whole webpage first and defer the images: an image is only loaded when the user scrolls to it.
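For reference, a minimal sketch of the idea. Modern browsers can also do this natively, with no plugin at all, via the loading attribute; the plugin call below assumes the classic jQuery Lazy Load plugin and its default data-original attribute, so check the docs for your version:
<!-- native lazy loading, no plugin needed -->
<img src="photo.jpg" loading="lazy" alt="photo">
<!-- with the jQuery Lazy Load plugin (assumed API) -->
<img class="lazy" data-original="photo.jpg" width="640" height="480">
<script>
$(function () {
    $("img.lazy").lazyload(); // copies data-original into src as the image scrolls into view
});
</script>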
Upvotes: 1
Reputation: 11
I am using this; it works as expected:
var imageLoading = function (n) {
    var image = document.images[n];
    var downloadingImage = new Image();
    // When the off-screen copy finishes downloading, swap it in
    // and only then move on to the next image.
    downloadingImage.onload = function () {
        image.src = this.src;
        console.log('Image ' + n + ' loaded');
        if (document.images[++n]) {
            imageLoading(n);
        }
    };
    downloadingImage.src = image.getAttribute("data-src");
};
document.addEventListener("DOMContentLoaded", function (event) {
    setTimeout(function () {
        imageLoading(0);
    }, 0);
});
And change every src attribute on your image elements to data-src.
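For example (a sketch of the expected markup; the URL is just a placeholder):
<img data-src="https://example.com/photo.jpg" alt="photo">
The script then copies each data-src into src one image at a time, so the browser only ever downloads a single image at once.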
Upvotes: 0
Reputation: 189
You want the "DOMContentLoaded" event (docs). It fires as soon as the document is fully parsed, but before everything has been loaded.
let addIfImage = (list, image) => image.src.match(/(http(s?):)([/|.|\w|\s|-])*\.(?:jpg|jpeg|gif|png|svg)/g) ?
    [image.src, ...list] :
    list;

let getSrcFromTags = (tag = 'img') => Array.from(document.getElementsByTagName(tag))
    .reduce(addIfImage, []);

// For example, log the collected URLs once the DOM is ready:
let doSomething = () => console.log(getSrcFromTags());

if (document.readyState === "loading") {
    document.addEventListener("DOMContentLoaded", doSomething);
} else { // `DOMContentLoaded` already fired
    doSomething();
}
Upvotes: 0
Reputation: 1288
If you just want to extract images once, you can use some tools like:
2) Software
If you want to run it multiple times, you can use the code above (https://stackoverflow.com/a/53245330/4674358) wrapped in an if condition:
if (document.readyState === "complete") {
    extractURL();
} else {
    // Add load or DOMContentLoaded event listeners here: for example,
    window.addEventListener("load", function () {
        extractURL();
    }, false);
    // or
    /*
    document.addEventListener("DOMContentLoaded", function () {
        extractURL();
    }, false);
    */
}

function extractURL() {
    // code mentioned above
}
Upvotes: 0
Reputation: 2979
Typically a browser will not wait for one image to be downloaded before requesting the next image. It will request all images simultaneously, as soon as it gets the src values of those images.
Are you sure that the images are indeed waiting for the previous image to download, and not just waiting for a specific time interval?
If you are sure that it depends on the download of the previous image, then what you can do is route all your requests through some proxy server / firewall and configure it to return an empty file with HTTP status 200 whenever an image is requested from that site.
That way the browser (or actually the website code) will assume that it has downloaded the image successfully.
how do I do that? – Jack Kasbrack
That's actually a very open-ended / opinion-based question. It will also depend on your OS, browser, system permissions, etc. Assuming you are using Windows and have sufficient permissions, you can try using Fiddler. It has an AutoResponder functionality that you can use.
(I've no affiliation with Fiddler / Telerik as such. I'm suggesting it only as an example and because I've used it in the past and know that it can be used for the aforementioned purpose. There will be many more products that provide similar functionality and you should use the product of your choice.)
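For the plain-HTTP case, here is a minimal sketch of the empty-response idea using Node's built-in http module; the port, regex, and Content-Type are illustrative, and HTTPS traffic would additionally need certificate interception (which is exactly what tools like Fiddler handle for you):
const http = require('http');

const IMAGE_RE = /\.(?:jpg|jpeg|gif|png|svg)(?:\?.*)?$/i;

// Point the browser's HTTP proxy at 127.0.0.1:8888. In forward-proxy mode
// the browser sends the absolute URL in the request line, so req.url parses.
http.createServer((clientReq, clientRes) => {
    const url = new URL(clientReq.url);

    if (IMAGE_RE.test(url.pathname)) {
        // Pretend the image downloaded fine: HTTP 200 with an empty body.
        clientRes.writeHead(200, { 'Content-Type': 'image/png', 'Content-Length': 0 });
        clientRes.end();
        return;
    }

    // Pass every other request through to the real server unchanged.
    const upstream = http.request({
        hostname: url.hostname,
        port: url.port || 80,
        path: url.pathname + url.search,
        method: clientReq.method,
        headers: clientReq.headers,
    }, (upstreamRes) => {
        clientRes.writeHead(upstreamRes.statusCode, upstreamRes.headers);
        upstreamRes.pipe(clientRes);
    });
    clientReq.pipe(upstream);
}).listen(8888, () => console.log('Proxy listening on http://127.0.0.1:8888'));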
Upvotes: 5
Reputation: 439
To extract all image URLs to a text file, maybe you could use something like this. If you execute this script inside any website, it will list the URLs of the images:
document.querySelectorAll('*[src]').forEach((item) => {
    const isImage = item.src.match(/(http(s?):)([/|.|\w|\s|-])*\.(?:jpg|jpeg|gif|png|svg)/g);
    if (isImage) console.log(item.src);
});
You could also use the same idea to read each element's computed style and pick up images from background-image URLs, like this:
document.querySelectorAll('*').forEach((item) => {
    const computedItem = getComputedStyle(item);
    Object.keys(computedItem).forEach((attr) => {
        const style = computedItem[attr];
        const image = style.match(/(http(s?):)([/|.|\w|\s|-])*\.(?:jpg|jpeg|gif|png|svg)/g);
        if (image) console.log(image[0]);
    });
});
So, at the end of the day, you could write a function like this, which will return an array of all the image URLs on the site:
function getImageURLS() {
    let images = [];
    document.querySelectorAll('*').forEach((item) => {
        const computedItem = getComputedStyle(item);
        Object.keys(computedItem).forEach((attr) => {
            const style = computedItem[attr];
            const image = style.match(/(http(s?):)([/|.|\w|\s|-])*\.(?:jpg|jpeg|gif|png|svg)/g);
            if (image) images.push(image[0]);
        });
    });
    document.querySelectorAll('*[src]').forEach((item) => {
        const isImage = item.src.match(/(http(s?):)([/|.|\w|\s|-])*\.(?:jpg|jpeg|gif|png|svg)/g);
        if (isImage) images.push(item.src);
    });
    return images;
}
It can probably be optimized but, well, you get the idea.
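If you then want the list in a text file, one option is to run it from the DevTools console and use the console's copy() helper (a DevTools utility, not standard JavaScript), then paste the result into a file:
// one URL per line, copied to the clipboard in Chrome/Firefox DevTools
copy(getImageURLS().join('\n'));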
Upvotes: 0