Henri

Reputation: 1761

NodeJS XML file parsing

I would like to store the URLs contained in the <loc> tags of an XML file in an array. I don't know how to get started with extracting the links.

My NodeJS code

const fs = require("fs");
const xml2js = require('xml2js');
const util = require('util');

const parser = new xml2js.Parser();

fs.readFile('example.xml', (err, data) => {
    parser.parseString(data, (err, result) => {
        console.log((util.inspect(result, false, null)));
    });
});

Inputs: XML File example

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://www.mywebsite.fr/001.html</loc>
<lastmod>2020-10-24</lastmod>
<changefreq>daily</changefreq>
<priority>0.5</priority>
</url>
<url>
<loc>https://www.mywebsite.fr/002.html</loc>
<lastmod>2020-10-24</lastmod>
<changefreq>daily</changefreq>
<priority>0.5</priority>
</url>
<url>
<loc>https://www.mywebsite.fr/003.html</loc>
<lastmod>2020-10-24</lastmod>
<changefreq>daily</changefreq>
<priority>0.5</priority>
</url>
</urlset>

Expected Outputs

result = 
[
  'https://www.mywebsite.fr/001.html',
  'https://www.mywebsite.fr/002.html',
  'https://www.mywebsite.fr/003.html'
]

Upvotes: 2

Views: 422

Answers (1)

Jack Fleeting

Reputation: 24930

Using the sample xml in your question, try something like this:

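const xpath = require('xpath');
const { DOMParser } = require('xmldom');

// Replace the placeholder with the XML from the question
const urls = `[your xml above]`;

// Parse the XML string into a DOM document
const target = new DOMParser().parseFromString(urls, 'text/xml');

// local-name() matches <loc> regardless of the sitemap's default namespace
const items = xpath.select('//*[local-name()="loc"]/text()', target);

// Collect the text content of each matched node
const result = items.map((node) => node.nodeValue);
console.log(result);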

Output:

[
  'https://www.mywebsite.fr/001.html',
  'https://www.mywebsite.fr/002.html',
  'https://www.mywebsite.fr/003.html'
]
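
Since the question already uses xml2js, the same array can probably be built from the parsed object directly, without adding xpath. This is only a minimal sketch, assuming xml2js's default options (where every child element is wrapped in an array, so each `loc` arrives as a one-element array). `extractLocs` and `readSitemapUrls` are hypothetical helper names chosen for illustration, not part of any library:

```javascript
// Pure helper: collect <loc> values from an object shaped like the
// default xml2js output for a sitemap: { urlset: { url: [ { loc: [..] } ] } }.
// Assumption: default xml2js options, so every child element is an array.
function extractLocs(parsed) {
  const entries = (parsed && parsed.urlset && parsed.urlset.url) || [];
  // Each entry.loc is an array of strings; flatMap flattens them into one list.
  return entries.flatMap((entry) => entry.loc || []);
}

// Hypothetical wrapper: read the file, parse it, and pass the URLs to a callback.
function readSitemapUrls(path, callback) {
  const fs = require('fs');
  const xml2js = require('xml2js'); // loaded here so extractLocs has no dependencies

  fs.readFile(path, (readErr, data) => {
    if (readErr) return callback(readErr);
    new xml2js.Parser().parseString(data, (parseErr, parsed) => {
      if (parseErr) return callback(parseErr);
      callback(null, extractLocs(parsed));
    });
  });
}
```

It would be called as `readSitemapUrls('example.xml', (err, urls) => console.log(urls));`. Unlike the original snippet, both `err` values are checked before the result is used.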

Upvotes: 1
