Reputation: 37
I have to parse a given XML File that looks like this:
<country id='cid-cia-Ashmore-and-Cartier-Islands'
continent='Asia'
name='Ashmore and Cartier Islands'
datacode='AT'
total_area='5'
government='territory of Australia administered by the Australian Ministry for the Environment'>
<coasts>Indian Ocean</coasts>
</country>
<country id='cid-cia-Azerbaijan'
continent='Asia'
name='Azerbaijan'
datacode='AJ'
total_area='86600'
population='7676953'
population_growth='0.78'
infant_mortality='74.5'
inflation='85'
gdp_total='11500'
indep_date='30 08 1991'
government='republic'
capital='Baku'>
<ethnicgroups name='Russian'>2.5</ethnicgroups>
<ethnicgroups name='Armenian'>2.3</ethnicgroups>
<ethnicgroups name='Azeri'>90</ethnicgroups>
<ethnicgroups name='Dagestani Peoples'>3.2</ethnicgroups>
<religions name='Muslim'>93.4</religions>
<religions name='Armenian Orthodox'>2.3</religions>
<religions name='Russian Orthodox'>2.5</religions>
<languages name='Russian'>3</languages>
<languages name='Armenian'>2</languages>
<languages name='Azeri'>89</languages>
<borders country='cid-cia-Armenia'>787</borders>
<borders country='cid-cia-Georgia'>322</borders>
<borders country='cid-cia-Iran'>611</borders>
<borders country='cid-cia-Russia'>284</borders>
<borders country='cid-cia-Turkey'>9</borders>
<coasts>Caspian Sea</coasts>
</country>
<country id='cid-cia-Bahrain'
continent='Asia'
name='Bahrain'
datacode='BA'
total_area='620'
population='590042'
population_growth='2.27'
infant_mortality='17.1'
inflation='3'
gdp_total='7300'
indep_date='15 08 1971'
government='traditional monarchy'
capital='Manama'>
<ethnicgroups name='Arab'>10</ethnicgroups>
<ethnicgroups name='Asian'>13</ethnicgroups>
<ethnicgroups name='Bahraini'>63</ethnicgroups>
<ethnicgroups name='Iranian'>8</ethnicgroups>
<religions name='Sunni Muslim'>25</religions>
<religions name='Shia Muslim'>75</religions>
<coasts>Persian Gulf</coasts>
</country>
I need to parse this XML and grab the name and inflation values ONLY if the country actually has an inflation value associated with it.
I have a Rubular setup with my progress here: http://rubular.com/r/L7pbX2mm1J. It returns two matches, which is fine, but look closely at the first one: the country is Ashmore and Cartier Islands, and the XML for that country has no inflation attribute. The regex just keeps going down past that element until it finds an inflation value, and only then closes the match.
I'm wondering if there is a way I can have some sort of conditional operation that checks if there is an inflation key at all, and if so, grab the name value and inflation value...
Thanks in advance!
Upvotes: 0
Views: 130
Reputation: 89566
You can indeed use Nokogiri. An example:
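require 'nokogiri'

# Parse the local file; open-uri is only needed when reading from a URL.
doc = Nokogiri::XML(File.read('./country.xml'))

# Print the name and inflation attributes of every country that has an inflation attribute.
doc.xpath('//country[@inflation]/@name | //country/@inflation').each do |res|
  puts res
end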
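If you want each name paired with its own inflation value rather than a flat list of attributes, one option is to select the country elements themselves and read both attributes from each. The following is only a sketch: it uses REXML (bundled with Ruby) instead of Nokogiri so that it runs without extra gems, the sample data is shortened, and the `<countries>` root element is an assumption, since the question does not show the top of the file. The XPath `//country[@inflation]` should carry over to Nokogiri unchanged.

```ruby
require 'rexml/document'

# Shortened, hypothetical sample wrapped in an assumed <countries> root element.
xml = <<~XML
  <countries>
    <country id='cid-cia-Ashmore-and-Cartier-Islands' name='Ashmore and Cartier Islands' total_area='5'/>
    <country id='cid-cia-Azerbaijan' name='Azerbaijan' inflation='85'/>
    <country id='cid-cia-Bahrain' name='Bahrain' inflation='3'/>
  </countries>
XML

doc = REXML::Document.new(xml)

# Keep only <country> elements that carry an inflation attribute,
# then read name and inflation from the same element.
pairs = REXML::XPath.match(doc, '//country[@inflation]').map do |country|
  [country.attributes['name'], country.attributes['inflation']]
end

pairs.each { |name, inflation| puts "#{name}: #{inflation}" }
```

Countries without an inflation attribute never enter the result, so no separate conditional is needed.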
If you "need" to use a regex, this one should do the job:
<country [^>]*? name='(?<name>[^']+)'[^>]*? inflation='(?<inflation>[^']+)'
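As a rough illustration of why this stays inside one element: `[^>]` cannot cross the `>` that closes the opening `<country ...>` tag, so a country without an inflation attribute simply fails to match instead of borrowing the value from a later country. The sketch below makes one change that is an assumption on my part: the literal spaces are replaced with `\s`, so attributes separated by newlines (as in the pasted sample) are matched too. The sample data is shortened and the constant name is hypothetical.

```ruby
# Shortened sample; attributes are separated by newlines as in the question.
xml = <<~XML
  <country id='cid-cia-Ashmore-and-Cartier-Islands'
  continent='Asia'
  name='Ashmore and Cartier Islands'
  total_area='5'>
  <coasts>Indian Ocean</coasts>
  </country>
  <country id='cid-cia-Azerbaijan'
  continent='Asia'
  name='Azerbaijan'
  inflation='85'
  capital='Baku'>
  <coasts>Caspian Sea</coasts>
  </country>
XML

# Same pattern as above, with \s instead of a literal space (assumption).
# It still expects name to appear before inflation inside the tag.
COUNTRY_WITH_INFLATION =
  /<country\s[^>]*?\sname='(?<name>[^']+)'[^>]*?\sinflation='(?<inflation>[^']+)'/

# With capture groups, String#scan returns one array of captures per match.
results = xml.scan(COUNTRY_WITH_INFLATION)

results.each { |name, inflation| puts "#{name}: #{inflation}" }
```

Note that this relies on the attribute order in the file and on values never containing `>`; an XML parser avoids both assumptions.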
Upvotes: 2