Tri Le Minh

Reputation: 11

Extract Inline images from Gmail Body

I am trying to extract all the images stored inline in the body of an email and save them to a Drive folder.

I am trying to use this code:

function GETGMEmails(){ 
   var label = GmailApp.getUserLabelByName ('WHOLESALE REP');
   var threads = label.getThreads();
   for(var i = threads.length - 1; i >= 0; i--){
    var messages = threads[i].getMessages();
    for (var j = 0; j < messages.length; j++){
      var message = messages[j];
       // extractDetails(message,folder)
       fetchInlineImage(message)
      }
    }
}

function fetchInlineImage(message) {
      var msg = message;
      console.log(message)
      var pattern = /<img.*src="([^"]*)"[^>]*>/;
      var matches = pattern.exec(msg.getBody());
      console.log(matches.length)
      if(matches) {
        var url = matches[1];
        
        var urlPattern = /^https*\:\/\/.*$/;
        // If this matches, that means this was copied and pasted from a browser and it's a
        // standard URL that we can urlFetch
        if(urlPattern.exec(url)) {
          // NO OP!
        } else {
          // Else this means the user copied and pasted from an OS clipboard and the image 
          // exists on Google's servers. The image can only be retrieved when logged in. Fortunately,
          // if we use URLFetchApp, this will act as a logged in user and be able to URLFetch the image.
          // We'll need to prepend a Gmail URL (subject to change)
          url = "https://mail.google.com/mail/u/0/" + url;         
        }
        
        // TODO - there is one more case that we're not covering - embedded images that newsletters use
        Logger.log("Fetching image from URL: " + url);
        
        var response = UrlFetchApp.fetch(url);
        Logger.log("Response status: " + Utilities.jsonStringify(response.getHeaders()));
          
        var blob = response.getBlob();
        Logger.log("Response blob: " + blob.getBytes().length);
        Drivefolder.createFile(blob).setName('ss.jpeg')
      }  
};

The emails look like this, with many images scattered throughout the body, and I want to extract every one of them:

[Image: screenshot of an example email]

Upvotes: 1

Views: 560

Answers (1)

Tedinoz

Reputation: 8044

You are trying to access inline images in a Gmail message.

When Gmail was first introduced, there was no way for a script to access inline images, which is presumably why workaround scripts like the one you are using were written.

In short, the script that you are using is redundant. Accessing inline images is now simple and straightforward: the key is to search the message's HTML body with a regular expression.
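As a rough illustration of the regex idea, here is a minimal sketch in plain JavaScript that pulls the `src` value out of every `<img>` tag in an HTML string. The helper name `extractImageSources` and the sample HTML are made up for this example; in Apps Script the input would be the string returned by `getBody()`. Note that a regex is only an approximation of HTML parsing, so unusual markup (single-quoted attributes, for instance) would be missed.

```javascript
// Hypothetical helper: collect the src value of every <img> tag in an HTML string.
function extractImageSources(html) {
  // [^>]*? skips any attributes that come before src;
  // the 'g' flag makes exec() resume after the previous match on each call.
  var pattern = /<img\b[^>]*?\ssrc="([^"]*)"[^>]*>/g;
  var sources = [];
  var match;
  while ((match = pattern.exec(html)) !== null) {
    sources.push(match[1]); // capture group 1 is the src value
  }
  return sources;
}

// Sample input, invented for illustration only.
var sampleHtml =
  '<p>Hi</p><img src="cid:ii_abc123" alt="a">' +
  '<img alt="b" src="https://example.com/b.png">';
var found = extractImageSources(sampleHtml);
```

Without the `g` flag, `exec()` would return the first match on every call, so the loop would never advance to the remaining images.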

The following script provides a basis for you to identify and log inline images.

  • Before running it, put headers in row 1 of the sheet named "Images".
    • A1="ID", B1="Subject", C1="Image"

function so75327302() {

  var label = GmailApp.getUserLabelByName('WHOLESALE REP');
  var threads = label.getThreads();
  // Sheet that receives the results; it must already exist with the headers described above.
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Images");

  for (var i = 0; i < threads.length; i++) {
    var messages = threads[i].getMessages();
    for (var m = 0; m < messages.length; m++) {
      var message = messages[m];
      // Logger.log("DEBUG: i:"+i+",m:"+m+", subject:"+message.getSubject()+", message ID:"+message.getId())

      var body = message.getBody();

      // Matches <img> tags whose first attribute is src; the 'g' flag lets exec() walk through every match.
      var regex1 = RegExp('<img src="([^"]*)"[^>]*>', 'g');
      var match;
      var images = [];

      while ((match = regex1.exec(body)) !== null) {
        Logger.log("message ID:" + message.getId() + ", Subject: " + message.getSubject() + " contains inline images. Found " + match[0]);
        images.push([match[0]]); // one-element row, as setValues() expects a 2D array
      }
      if (images.length > 0) {
        // Write the ID and subject once, then one image tag per row in column C.
        var nextRow = sheet.getLastRow() + 1;
        sheet.getRange(nextRow, 1).setValue(message.getId());
        sheet.getRange(nextRow, 2).setValue(message.getSubject());
        sheet.getRange(nextRow, 3, images.length).setValues(images);
      }
    }
  }
}
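The script above only logs the `<img>` tags. If the goal is to save the image files themselves to Drive, one possible approach is sketched below. It assumes that `GmailMessage.getAttachments()` is called with the `includeInlineImages` option, and that the inline images in your messages are actually embedded in the email rather than linked from an external URL. The helper name `saveInlineImages`, the file-naming scheme, and the `folder` argument are illustrative choices, not part of the original script.

```javascript
// Hypothetical helper: copy a message's inline images into a Drive folder.
// `message` is expected to be a GmailMessage and `folder` a Drive Folder.
function saveInlineImages(message, folder) {
  // Ask only for inline images, skipping regular file attachments.
  var inlineImages = message.getAttachments({
    includeInlineImages: true,
    includeAttachments: false
  });
  var savedNames = [];
  for (var k = 0; k < inlineImages.length; k++) {
    var blob = inlineImages[k].copyBlob();
    // Prefix with the message ID and index so files from different messages do not collide.
    var name = message.getId() + '_' + k + '_' + inlineImages[k].getName();
    folder.createFile(blob).setName(name);
    savedNames.push(name);
  }
  return savedNames;
}
```

Inside the inner loop of the script above, this could be called as `saveInlineImages(messages[m], DriveApp.getFolderById('FOLDER_ID'))`, where `FOLDER_ID` is a placeholder for the ID of your own folder.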

Upvotes: 2
