Chiperific

Reputation: 4686

DRY Strategy for looping over unknown levels of nested objects

My scenario is based on the Gmail API.

I've learned that email messages can have their message parts deeply or shallowly nested based upon varying factors, but mostly the presence of attachments.

I'm using the Google API Ruby Client gem, so I'm not working with JSON; I'm getting objects that carry the same information. I think the JSON representation makes my issue easier to understand, though.

A simple message JSON response looks like this (one parts array with 2 hashes inside it):

{
  "id": "175b418b1ff69896",
  "snippet": "COVID-19: Resources to help your business manage through uncertainty 20 Liters 500 PEOPLE FOUND YOU ON GOOGLE Here are the top search queries used to find you: 20 liters used by 146 people volunteer",
  "payload": {
    "parts": [
      {
        "mimeType": "text/plain",
        "body": {
          "data": "Hey, you found the body of the email! I want this!"
        }
      },
      {
        "mimeType": "text/html",
        "body": {
          "data": "<div>I actually don't want this</div>"
        }
      }
    ]
  }
}

The value I want is not that hard to get:

response.payload.parts.each do |part|
  @body_data = part.body.data if part.mime_type == 'text/plain'
end

But the JSON response of a more complex email message with attachments looks something like this (now parts is nested 3 levels deep):

{
  "id": "175aee26de8209d2",
  "snippet": "snippet text...",
  "payload": {
    "parts": [
      {
        "mimeType": "multipart/related",
        "parts": [
          {
            "mimeType": "multipart/alternative",
            "parts": [
              {
                "mimeType": "text/plain",
                "body": {
                  "data": "hey, you found me! This is what I want!!"
                }
              },
              {
                "mimeType": "text/html",
                "body": {
                  "data": "<div>I actually don't want this one.</div>"
                }
              }
            ]
          },
          {
            "mimeType": "image/jpeg"
          },
          {
            "mimeType": "image/png"
          },
          {
            "mimeType": "image/png"
          },
          {
            "mimeType": "image/jpeg"
          },
          {
            "mimeType": "image/png"
          },
          {
            "mimeType": "image/png"
          }
        ]
      },
      {
        "mimeType": "application/pdf"
      }
    ]
  }
}

And looking at a few other messages, the nesting of parts can vary from 1 to 5 levels (maybe more).

I need to loop over an unknown number of parts, then loop over an unknown number of nested parts, and then repeat this until I reach the bottom, hopefully finding the thing I want.

Here's my best attempt:

def trim_response(response)
  # remove headers I don't care about
  response.payload.headers.keep_if { |header| @valuable_headers.include? header.name }

  # remove parts I don't care about
  response.payload.parts.each do |part|
    # parts can be nested within parts, within parts, within...
    if part.mime_type == @valuable_mime_part && part.body.present?
      @body_data = part.body.data
      break
    elsif part.parts.present?
      # there are more layers down
      find_body(part)
    end
  end
end

def find_body(part)
  part.parts.each do |sub_part|
    if sub_part.mime_type == @valuable_mime_part && sub_part.body.present?
      @body_data = sub_part.body.data
      break
    elsif sub_part.parts.present?
      # there are more layers down
      ######### THIS FEELS BAD!!! ###########
      find_body(sub_part)
    end
  end
end

Yep, there's a method calling itself. I know, that's why I'm here.

This does work, I've tested it on a few dozen messages, but... there has to be a better, DRY-er way to do this.

How do I recursively loop and then move down a level and loop again in a DRY fashion when I don't know how deep the nesting goes?

Upvotes: 0

Views: 139

Answers (2)

Cary Swoveland

Reputation: 110725

You can compute the desired result using recursion.

def find_it(h, top_key, k1, k2, k3)
  return nil unless h.key?(top_key)
  recurse(h[top_key], k1, k2, k3)
end

def recurse(h, k1, k2, k3)
  return nil unless h.key?(k1)
  h[k1].each do |g|
    v = g.dig(k2, k3) || recurse(g, k1, k2, k3)
    return v unless v.nil?
  end
  nil
end

See Hash#dig.
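As a small illustration of why `g.dig(k2, k3) || recurse(...)` works: `Hash#dig` returns `nil` instead of raising when an intermediate key is missing, so the `||` falls through to the recursive call. The hashes below are made-up stand-ins for message parts, not real API output.

```ruby
# Hypothetical parts, shaped like the symbol-keyed hashes used in this answer.
part_with_body = { mimeType: "text/plain", body: { data: "hello" } }
part_without_body = { mimeType: "image/png" }

# dig walks the keys in order and stops with nil at the first missing one.
found = part_with_body.dig(:body, :data)
missing = part_without_body.dig(:body, :data)

puts found.inspect
puts missing.inspect
```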

Let h1 and h2 equal the two hashes given in the example (see note 1). Then:

find_it(h1, :payload, :parts, :body, :data)
  #=> "Hey, you found the body of the email! I want this!"
find_it(h2, :payload, :parts, :body, :data)
  #=> "hey, you found me! This is what I want!!"

1. The hash h[:payload][:parts].last #=> { "mimeType": "application/pdf" } appears to contain hidden characters that are causing a problem. I therefore removed that hash from h2.
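The question works with objects rather than hashes and wants the text/plain part specifically, so the same recursive idea can be sketched against objects. This is only a sketch under assumptions: `Part` and `Body` are hypothetical Struct stand-ins that respond to `mime_type`, `body`, and `parts` like the objects in the question's code; they are not the gem's classes, and `find_body_data` is a name made up for this example.

```ruby
# Hypothetical stand-ins for the client's message part objects.
Part = Struct.new(:mime_type, :body, :parts, keyword_init: true)
Body = Struct.new(:data, keyword_init: true)

# Depth-first search: returns the data of the first part with the wanted
# MIME type, or nil when no such part exists at any depth.
def find_body_data(part, wanted_mime_type)
  if part.mime_type == wanted_mime_type && part.body&.data
    return part.body.data
  end

  (part.parts || []).each do |sub_part|
    found = find_body_data(sub_part, wanted_mime_type)
    return found if found
  end
  nil
end

plain = Part.new(mime_type: "text/plain", body: Body.new(data: "plain text body"))
html = Part.new(mime_type: "text/html", body: Body.new(data: "<div>html body</div>"))
alternative = Part.new(mime_type: "multipart/alternative", parts: [plain, html])
related = Part.new(mime_type: "multipart/related",
                   parts: [Part.new(mime_type: "image/png"), alternative])
payload = Part.new(parts: [related, Part.new(mime_type: "application/pdf")])

result = find_body_data(payload, "text/plain")
puts result
```

Because the method handles the starting node the same way as every nested part, one method covers both the top-level loop and the nested loop, and the early `return` stops the search as soon as a match is found.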

Upvotes: 1

Zain Arshad

Reputation: 1907

No need to go through all this pain. Just keep descending into the first element of each parts array until you reach a part that has no parts key of its own. At that point, the parts variable holds the innermost part.

Code:

response = {"id" => "175aee26de8209d2","snippet" => "snippet text...","payload" => {"parts" => [{"mimeType" => "multipart/related","parts" => [{"mimeType" => "multipart/alternative","parts" => [{"mimeType" => "text/plain","body" => {"data" => "hey, you found me! This is what I want!!"}},{"mimeType" => "text/html","body" => {"data" => "<div>I actually don't want this one.</div>"}}]},{"mimeType" => "image/jpeg"}]},{"mimeType" => "application/pdf"}]}}
parts   = response["payload"]
parts   = (parts["parts"].first || parts["parts"]) while parts["parts"]
data    = parts["body"]["data"]
puts data

Output:

hey, you found me! This is what I want!!
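Note that the loop above only ever follows the first element of each parts array, so it relies on the text/plain part sitting on that first branch. As a hedged alternative that still avoids recursion, the sketch below walks every branch with an explicit stack. The method name `find_plain_text` and the sample `message` hash are made up for illustration; the hash puts an image first so that a first-branch descent would not reach the text.

```ruby
# Iterative depth-first search over string-keyed hashes.
def find_plain_text(payload)
  stack = [payload]
  until stack.empty?
    part = stack.pop
    if part["mimeType"] == "text/plain"
      data = part.dig("body", "data")
      return data if data
    end
    # Push children in reverse so the first child is examined first.
    stack.concat(Array(part["parts"]).reverse)
  end
  nil
end

# Made-up message: the first top-level part is an image with no nested parts.
message = {
  "payload" => {
    "parts" => [
      { "mimeType" => "image/png" },
      { "mimeType" => "multipart/alternative",
        "parts" => [
          { "mimeType" => "text/html", "body" => { "data" => "<div>html</div>" } },
          { "mimeType" => "text/plain", "body" => { "data" => "plain text" } }
        ] }
    ]
  }
}

plain_text = find_plain_text(message["payload"])
puts plain_text
```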

Upvotes: 1
