Reputation: 4686
My scenario is based on Gmail API.
I've learned that email messages can have their message parts deeply or shallowly nested based upon varying factors, but mostly the presence of attachments.
I'm using the Google API Ruby Client gem, so I'm working with objects rather than JSON. They carry the same information, but I think the JSON representation makes my issue easier to understand.
A simple message JSON response looks like this (one parts array with 2 hashes inside it):
{
  "id": "175b418b1ff69896",
  "snippet": "COVID-19: Resources to help your business manage through uncertainty 20 Liters 500 PEOPLE FOUND YOU ON GOOGLE Here are the top search queries used to find you: 20 liters used by 146 people volunteer",
  "payload": {
    "parts": [
      {
        "mimeType": "text/plain",
        "body": {
          "data": "Hey, you found the body of the email! I want this!"
        }
      },
      {
        "mimeType": "text/html",
        "body": {
          "data": "<div>I actually don't want this</div>"
        }
      }
    ]
  }
}
The value I want is not that hard to get:
response.payload.parts.each do |part|
  @body_data = part.body.data if part.mime_type == 'text/plain'
end
BUT the JSON response of a more complex email message with attachments looks something like this (now parts nests itself 3 levels deep):
{
  "id": "175aee26de8209d2",
  "snippet": "snippet text...",
  "payload": {
    "parts": [
      {
        "mimeType": "multipart/related",
        "parts": [
          {
            "mimeType": "multipart/alternative",
            "parts": [
              {
                "mimeType": "text/plain",
                "body": {
                  "data": "hey, you found me! This is what I want!!"
                }
              },
              {
                "mimeType": "text/html",
                "body": {
                  "data": "<div>I actually don't want this one.</div>"
                }
              }
            ]
          },
          {
            "mimeType": "image/jpeg"
          },
          {
            "mimeType": "image/png"
          },
          {
            "mimeType": "image/png"
          },
          {
            "mimeType": "image/jpeg"
          },
          {
            "mimeType": "image/png"
          },
          {
            "mimeType": "image/png"
          }
        ]
      },
      {
        "mimeType": "application/pdf"
      }
    ]
  }
}
And looking at a few other messages, the object can vary from 1 to 5 levels (maybe more) of parts.
I need to loop over an unknown number of parts, then loop over an unknown number of nested parts, and then repeat this until I reach the bottom, hopefully finding the thing I want.
Here's my best attempt:
def trim_response(response)
  # remove headers I don't care about
  response.payload.headers.keep_if { |header| @valuable_headers.include? header.name }
  # remove parts I don't care about
  response.payload.parts.each do |part|
    # parts can be nested within parts, within parts, within...
    if part.mime_type == @valuable_mime_part && part.body.present?
      @body_data = part.body.data
      break
    elsif part.parts.present?
      # there are more layers down
      find_body(part)
    end
  end
end

def find_body(part)
  part.parts.each do |sub_part|
    if sub_part.mime_type == @valuable_mime_part && sub_part.body.present?
      @body_data = sub_part.body.data
      break
    elsif sub_part.parts.present?
      # there are more layers down
      ######### THIS FEELS BAD!!! ###########
      find_body(sub_part)
    end
  end
end
Yep, there's a method calling itself. I know, that's why I'm here.
This does work, I've tested it on a few dozen messages, but... there has to be a better, DRY-er way to do this.
Upvotes: 0
Views: 139
Reputation: 110725
You can compute the desired result using recursion.
def find_it(h, top_key, k1, k2, k3)
  return nil unless h.key?(top_key)
  recurse(h[top_key], k1, k2, k3)
end

def recurse(h, k1, k2, k3)
  return nil unless h.key?(k1)
  h[k1].each do |g|
    v = g.dig(k2, k3) || recurse(g, k1, k2, k3)
    return v unless v.nil?
  end
  nil
end
See Hash#dig.
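As a small illustration of why dig fits here (the two hashes below are made-up stand-ins, not taken from the question): dig returns nil as soon as a key along the path is missing, so a container part that has no :body simply falls through to the recursive call.

```ruby
# Hypothetical miniature parts, with symbol keys as in the hashes above.
LEAF      = { mimeType: "text/plain", body: { data: "hello" } }
CONTAINER = { mimeType: "multipart/alternative", parts: [LEAF] }

# The leaf has both keys, so dig returns the nested value.
LEAF_DATA = LEAF.dig(:body, :data)

# The container has no :body key, so dig stops there and returns nil
# instead of raising NoMethodError on nil.
CONTAINER_DATA = CONTAINER.dig(:body, :data)

puts LEAF_DATA.inspect
puts CONTAINER_DATA.inspect
```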
Let h1 and h2 equal the two hashes given in the example (see note 1 below). Then:
find_it(h1, :payload, :parts, :body, :data)
#=> "Hey, you found the body of the email! I want this!"
find_it(h2, :payload, :parts, :body, :data)
#=> "hey, you found me! This is what I want!!"
1. The hash h2[:payload][:parts].last #=> { "mimeType": "application/pdf" } appears to contain hidden characters that are causing a problem. I therefore removed that hash from h2.
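Since the question works with objects rather than hashes, the same recursion might be sketched as a single method. This is only a sketch: Body and Part are hypothetical Struct stand-ins for the gem's objects, modelling just the mime_type, body.data and parts attributes used in the question, and it assumes the payload responds to the same methods as its parts. Unlike the question's code, it also selects on the MIME type, not just on the presence of body data.

```ruby
# Hypothetical stand-ins for the gem's message-part objects.
Body = Struct.new(:data)
Part = Struct.new(:mime_type, :body, :parts)

# Depth-first search: return the body data of the first part whose MIME type
# matches, or nil when no such part exists at or below `part`.
def find_body_data(part, mime_type = "text/plain")
  return part.body.data if part.mime_type == mime_type && part.body&.data

  (part.parts || []).each do |sub_part|
    found = find_body_data(sub_part, mime_type)
    return found if found
  end
  nil
end

# Made-up sample shaped like the nested message in the question.
SAMPLE_PAYLOAD = Part.new("multipart/mixed", nil, [
  Part.new("multipart/related", nil, [
    Part.new("multipart/alternative", nil, [
      Part.new("text/plain", Body.new("plain text body"), nil),
      Part.new("text/html", Body.new("<div>html body</div>"), nil)
    ]),
    Part.new("image/jpeg", nil, nil)
  ]),
  Part.new("application/pdf", nil, nil)
])

puts find_body_data(SAMPLE_PAYLOAD)
```

Under those assumptions the call site would be find_body_data(response.payload), with no instance variable and no second method.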
Upvotes: 1
Reputation: 1907
No need to go through all this pain. Just keep diving into the first element of each parts array until you reach a hash that has no parts key anymore. At that point, the parts variable holds the innermost part.
Code:
response = {"id" => "175aee26de8209d2","snippet" => "snippet text...","payload" => {"parts" => [{"mimeType" => "multipart/related","parts" => [{"mimeType" => "multipart/alternative","parts" => [{"mimeType" => "text/plain","body" => {"data" => "hey, you found me! This is what I want!!"}},{"mimeType" => "text/html","body" => {"data" => "<div>I actually don't want this one.</div>"}}]},{"mimeType" => "image/jpeg"}]},{"mimeType" => "application/pdf"}]}}
parts = response["payload"]
parts = (parts["parts"].first || parts["parts"]) while parts["parts"]
data = parts["body"]["data"]
puts data
Output:
hey, you found me! This is what I want!!
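The dive above reaches the text because, in this sample, the text/plain part sits on the first-child path. For messages where that might not hold (an assumption about other inputs, not something shown in the question), a loop with an explicit stack that checks mimeType could look like the sketch below. The method name first_body_data and the sample hash are made up for illustration.

```ruby
# Iterative depth-first search over string-keyed hashes shaped like the
# example above. Returns the data of the first part (in document order)
# whose "mimeType" matches, or nil if there is none.
def first_body_data(payload, mime_type = "text/plain")
  stack = [payload]
  until stack.empty?
    part = stack.pop
    data = part.dig("body", "data")
    return data if part["mimeType"] == mime_type && data

    # Push children in reverse so the first child is popped next.
    stack.concat((part["parts"] || []).reverse)
  end
  nil
end

# Here the first top-level child is an attachment, so following only the
# first element of each "parts" array would never reach the text.
ATTACHMENT_FIRST = {
  "parts" => [
    { "mimeType" => "application/pdf" },
    { "mimeType" => "multipart/alternative",
      "parts" => [
        { "mimeType" => "text/html",  "body" => { "data" => "<div>html</div>" } },
        { "mimeType" => "text/plain", "body" => { "data" => "plain text" } }
      ] }
  ]
}

puts first_body_data(ATTACHMENT_FIRST)
```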
Upvotes: 1