Reputation: 155
Can anyone help me extract the author's name from this script tag using Nokogiri?
So far I can get to the script tag using:
parsed_page.xpath("//script[@type ='application/ld+json']")
I am trying to get the name "Kevin McCart" from the "author" field:
<script type="application/ld+json">{"@context":"https:\/\/schema.org","@type":"NewsArticle","headline":"OLYMPICS: Check out schedule","url":"https:\/\/www.website.ie\/sport\/winter-olympics-check-out-jack-gowers-schedule-4237649","mainEntityOfPage":{"@type":"WebPage","@id":"https:\/\/www.southernstar.ie\/sport\/winter-olympics-check-out-jack-gowers-schedule-4237649"},"dateCreated":"2022-02-04T12:00:40+00:00","datePublished":"2022-02-04T12:00:40+00:00","dateModified":"2022-02-02T15:08:29+00:00","thumbnailUrl":"https:\/\/images.website.ie\/uploads\/2022\/01\/24153939\/Jack-Gower-cropped.jpg","image":{"@type":"ImageObject","url":"https:\/\/images.website.ie\/uploads\/2022\/01\/24153939\/Jack-Gower-cropped.jpg","width":700,"height":370},"articleSection":"Sport","keywords":"Jack Go","author":[{"@type":"Person","name":"Kevin McCart"}],"publisher":{"@type":"Organization","name":"The Southern Star","logo":{"@type":"ImageObject"}}}</script>
Any help would be really appreciated.
Thanks,
Gerard
Upvotes: 1
Views: 162
Reputation: 17528
Once you get to the script tag, Nokogiri's work is done, and it's time to parse the JSON.
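require 'json'

# Parse the JSON-LD text of the <script> node and return the first author's name.
def get_kevin(script_element)
  data = JSON.parse(script_element.text)
  # "author" is an array of objects here; fetch raises KeyError if a key is missing.
  data.fetch("author").first.fetch("name")
end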
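If the pages you scrape are less uniform than the sample, a more defensive variant may help. The sketch below is only an illustration: the helper name author_names is hypothetical, and it assumes that "author" may show up either as a single object or as an array of objects (schema.org permits both), or be absent entirely. It takes the script's text as a plain string, so it needs nothing beyond the json standard library.

```ruby
require 'json'

# Hypothetical helper (not part of Nokogiri or the json library):
# returns every author name found in one JSON-LD string, or [] if none.
def author_names(json_ld_text)
  data = JSON.parse(json_ld_text)
  return [] unless data.is_a?(Hash)

  authors = data["author"]
  # Assumption: "author" may be a single object or an array of objects.
  authors = [authors] unless authors.is_a?(Array)

  authors.select { |a| a.is_a?(Hash) }.map { |a| a["name"] }.compact
rescue JSON::ParserError
  # Malformed JSON in the script tag: treat it as "no authors found".
  []
end

sample = '{"@type":"NewsArticle","author":[{"@type":"Person","name":"Kevin McCart"}]}'
puts author_names(sample).first
```

With Nokogiri you would call this once per matched node, for example parsed_page.xpath("//script[@type='application/ld+json']").flat_map { |node| author_names(node.text) }, which also copes with pages that contain more than one ld+json script.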
Upvotes: 1