Oliver Hayman

Reputation: 39

Asynchronously write data to GCS inside of a Promise

I'm trying to find a way to write JSON data to a file in a Google Cloud Storage bucket, from inside a promise.

What I'm finding is that if I try to .push() the values to an array one by one and then return that array, it only gives me the first 3 results (whereas console.log shows everything).

And if I try to write the file inside the loop, I only get the last value from the array (each write overwrites the previous values rather than appending them).

So essentially my question is: is there any way to write a promise (or similar) that waits for all the looped-over values to be gathered up and, once that's done, passes them to a function that uploads them all to GCS?

Or is there a way to write these values to the .json file in GCS asynchronously, at the same time as the data is being scraped?

const axios = require('axios');
const { JSDOM } = require('jsdom');
const { Storage } = require('@google-cloud/storage');

const storage = new Storage();
const bucketName = 'my-bucket'; // placeholder
const filename = 'data.json';   // placeholder

const urls = [/* 20+ URLs go here... */];

// Build an array of request promises, one per URL
const promises = urls.map((url) => axios.get(url));

// Map through the array of promises and get the response results
axios.all(promises).then((results) => {
  results.map((res) => {
    try {
      // Scrape the data
      const $ = new JSDOM(res.data);
      const doc = $.window.document;
      const data = {};

      data.title = (doc.querySelector('head > title') !== null ? doc.querySelector('head > title').text : '');
      data.description = (doc.querySelector("meta[name='description']") !== null ? doc.querySelector("meta[name='description']").content : '');
      data.robots = (doc.querySelector("meta[name='robots']") !== null ? doc.querySelector("meta[name='robots']").content : '');

      const value = JSON.stringify(data) + '\n';

      // Tried array.push(value) here but doesn't return all the values?
      // Any way to return all the values and then bulk upload them to GCS outside of this code block?
      const file = storage.bucket(bucketName).file(filename);
      file.save(value, function(err) {
        if (!err) {
          // file written
        }
      });

    } catch(e) {
      console.log(e);
    }
  });
});

Sorry for the poor explanation. Essentially: I can't push all the values to an array and then upload that, and if I try to upload the values one by one I only get the last value from the loop.
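
To make the first idea concrete, the shape I'm after is roughly this (a sketch of the intent only, not working code; it reuses the promises / storage / bucketName / filename setup from the snippet above):

// Sketch: return each scraped value from the map, then do one bulk upload
axios.all(promises).then((results) => {
  const values = results.map((res) => {
    const doc = new JSDOM(res.data).window.document;
    return JSON.stringify({ title: doc.title }) + '\n';
  });

  // One save containing all the newline-delimited JSON values
  return storage.bucket(bucketName).file(filename).save(values.join(''));
});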

Note: I'm not trying to save the data to a .json file locally with fs.writeFile() and then upload that to GCS; I want to send the JSON data directly to GCS, without the step in between.
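
And for the second idea, I imagine streaming straight into the GCS object as each result is processed, along the lines of this (again just a sketch, using file.createWriteStream() from the same client):

// Sketch: open a write stream to the GCS object and write each value as it is processed
const file = storage.bucket(bucketName).file(filename);
const stream = file.createWriteStream();

stream.on('error', (err) => console.log(err));
stream.on('finish', () => console.log('all values written to GCS'));

axios.all(promises).then((results) => {
  results.forEach((res) => {
    const doc = new JSDOM(res.data).window.document;
    stream.write(JSON.stringify({ title: doc.title }) + '\n');
  });
  stream.end(); // finalises the upload once every value has been written
});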

Upvotes: 1

Views: 537

Answers (1)

Yevhenii

Reputation: 1703

If I understood correctly what you need, this should work:

axios.all(promises).then((results) => {
  const uploads = results.map((res) => {
    try {
      // Scrape the data
      const $ = new JSDOM(res.data);
      const doc = $.window.document;
      const data = {};

      data.title = (doc.querySelector('head > title') !== null ? doc.querySelector('head > title').text : '');
      data.description = (doc.querySelector("meta[name='description']") !== null ? doc.querySelector("meta[name='description']").content : '');
      data.robots = (doc.querySelector("meta[name='robots']") !== null ? doc.querySelector("meta[name='robots']").content : '');

      const value = JSON.stringify(data) + '\n';

      // Wrap the callback-style save in a promise so Promise.all can wait on it
      return new Promise((resolve, reject) => {
        const file = storage.bucket(bucketName).file(filename);
        file.save(value, function(err) {
          if (err) {
            reject(err);
          } else {
            resolve();
          }
        });
      });
    } catch(e) {
      // A failed scrape logs the error and returns undefined,
      // which Promise.all treats as an already-settled value
      console.log(e);
    }
  });
  return Promise.all(uploads);
});
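
One thing to watch: every iteration above saves to the same filename, so each upload overwrites the previous one and only the last result survives. Give each result its own object name, or join the values and do a single save. Also, in current versions of @google-cloud/storage, file.save() returns a promise when called without a callback, so the manual Promise wrapper isn't strictly needed. For example (the result-${i}.json naming is just an illustration):

axios.all(promises).then((results) => {
  const uploads = results.map((res, i) => {
    const doc = new JSDOM(res.data).window.document;
    const value = JSON.stringify({ title: doc.title }) + '\n';

    // With no callback, save() returns a promise; a distinct object
    // name per result stops the writes from overwriting each other
    return storage.bucket(bucketName).file(`result-${i}.json`).save(value);
  });
  return Promise.all(uploads);
});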

Upvotes: 3
