user70192

Reputation: 14214

Reading XML file in Node.js

I'm learning how to use Node. At this time, I have an XML file that looks like this:

sitemap.xml

<?xml version="1.0" encoding="utf-8"?>

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"   xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
  <url>
    <loc>http://www.example.com</loc>
    <lastmod>2015-10-01</lastmod>
    <changefreq>monthly</changefreq>
  </url>

  <url>
    <loc>http://www.example.com/about</loc>
    <lastmod>2015-10-01</lastmod>
    <changefreq>never</changefreq>
  </url>

  <url>
    <loc>http://www.example.com/articles/tips-and-tricks</loc>
    <lastmod>2015-10-01</lastmod>
    <changefreq>never</changefreq>
    <article:title>Tips and Tricks</article:title>
    <article:description>Learn some of the tips-and-tricks of the trade</article:description>
  </url>
</urlset>

I am trying to load this XML in my Node app. Once it is loaded, I want only the url elements that contain <article:...> elements. I'm stuck, though. Right now I'm using xml2js as follows:

var parser = new xml2js.Parser();
fs.readFile(__dirname + '/../public/sitemap.xml', function(err, data) {
    if (!err) {
        console.log(JSON.stringify(data));
    }
});

When the console.log statement is executed, I just see a bunch of numbers in the console window. Something like this:

{"type":"Buffer","data":[60,63,120, ...]}

What am I missing?

Upvotes: 58

Views: 160648

Answers (11)

Cleardd

Reputation: 31

I like to use xml-js

var fs = require('fs');
var convert = require('xml-js');
var xml =
'<?xml version="1.0" encoding="utf-8"?>' +
'<note importance="high" logged="true">' +
'    <title>Happy</title>' +
'    <todo>Work</todo>' +
'    <todo>Play</todo>' +
'</note>';
fs.writeFileSync(`./file.xml`,xml);
var result1 = convert.xml2js(fs.readFileSync(`./file.xml`).toString());
console.log(JSON.stringify(result1,null,4));
/*
{
    "declaration": {
        "attributes": {
            "version": "1.0",
            "encoding": "utf-8"
        }
    },
    "elements": [
        {
            "type": "element",
            "name": "note",
            "attributes": {
                "importance": "high",
                "logged": "true"
            },
            "elements": [
                {
                    "type": "element",
                    "name": "title",
                    "elements": [
                        {
                            "type": "text",
                            "text": "Happy"
                        }
                    ]
                },
                {
                    "type": "element",
                    "name": "todo",
                    "elements": [
                        {
                            "type": "text",
                            "text": "Work"
                        }
                    ]
                },
                {
                    "type": "element",
                    "name": "todo",
                    "elements": [
                        {
                            "type": "text",
                            "text": "Play"
                        }
                    ]
                }
            ]
        }
    ]
}

*/

Upvotes: 0

Reaper

Reputation: 412

Install xml2js using: npm install xml2js --save.

const xml2js = require('xml2js');
const fs = require('fs');
const parser = new xml2js.Parser({ attrkey: "ATTR" });

// this example reads the file synchronously
// you can read it asynchronously also
let xml_string = fs.readFileSync("data.xml", "utf8");

parser.parseString(xml_string, function(error, result) {
  if (error === null) {
    console.log(result);
  } else {
    console.log(error);
  }
});
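
To get from the parsed result to only the url entries that carry article: elements (what the question asks for), a filter over the result object may be enough. This is a minimal sketch, assuming xml2js's default output shape, in which every child element is stored as an array under its tag name. The `parsed` object below is hand-written to mimic that shape for the question's sitemap rather than produced by the parser, and `hasArticleElements` is a hypothetical helper name.

```javascript
// Hand-written stand-in for the `result` passed to the parseString callback
// (assumed default xml2js shape: each child element is an array).
const parsed = {
  urlset: {
    url: [
      { loc: ['http://www.example.com'], lastmod: ['2015-10-01'], changefreq: ['monthly'] },
      { loc: ['http://www.example.com/about'], lastmod: ['2015-10-01'], changefreq: ['never'] },
      {
        loc: ['http://www.example.com/articles/tips-and-tricks'],
        lastmod: ['2015-10-01'],
        changefreq: ['never'],
        'article:title': ['Tips and Tricks'],
        'article:description': ['Learn some of the tips-and-tricks of the trade']
      }
    ]
  }
};

// Hypothetical helper: true when a url entry has any key starting with "article:".
function hasArticleElements(url) {
  return Object.keys(url).some((key) => key.startsWith('article:'));
}

// Keep only the url entries that contain article: elements.
const articleUrls = parsed.urlset.url.filter(hasArticleElements);
console.log(articleUrls.map((url) => url.loc[0]));
```

In a real run, the same filter would go inside the parseString callback and use `result` in place of `parsed`.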

Upvotes: 4

mewc

Reputation: 1447

For an express server:

  app.get('/api/rss/', (_request: Request, response: Response) => {
    const rssFile = fs.readFileSync(__dirname + '/rssFeeds/guardian.xml', { encoding: 'utf8' })

    console.log('FILE', rssFile)

    response.set('Content-Type', 'text/xml')
    response.send(rssFile)
  })
  • Take request
  • Read File
  • Set xml header
  • Return file

Upvotes: 4

chrisbyte

Reputation: 1633

@Sandburg mentioned xml-js in a comment and it worked best for me (several years after this question was asked). The others I tried were xml2json, which required a Windows SDK that I did not want to deal with, and xml2js, which did not provide an easy enough out-of-the-box way to search through attributes.

I had to pull out a specific attribute three nodes deep in an XML file, and xml-js did it with ease.

https://www.npmjs.com/package/xml-js

With the following example file stats.xml

<stats>
  <runs>
    <latest date="2019-12-12" success="100" fail="2" />
    <latest date="2019-12-11" success="99" fail="3" />
    <latest date="2019-12-10" success="102" fail="0" />
    <latest date="2019-12-09" success="102" fail="0" />
  </runs>
</stats>

I used xml-js to find the element /stats/runs/latest with attribute @date='2019-12-12' like so

const convert = require('xml-js');
const fs = require('fs');

// read file
const xmlFile = fs.readFileSync('stats.xml', 'utf8');

// parse xml file as a json object
const jsonData = JSON.parse(convert.xml2json(xmlFile, {compact: true, spaces: 2}));

const targetNode = 

    // element '/stats/runs/latest'
    jsonData.stats.runs.latest

    .find(x => 

        // attribute '@date'
        x._attributes.date === '2019-12-12'
    );

// targetNode has the 'latest' node we want
// now output the 'fail' attribute from that node
console.log(targetNode._attributes.fail);  // outputs: 2

Upvotes: 15

Nate

Reputation: 553

fs.readFile takes an optional second parameter: the encoding. If you omit it, the callback receives a Buffer object instead of a string.

https://nodejs.org/api/fs.html#fs_fs_readfile_filename_options_callback

If you know the encoding just use:

fs.readFile(__dirname + '/../public/sitemap.xml', 'utf8', function(err, data) {
    if (!err) {
        console.log(data);
    }
});

Upvotes: 5

Atul Kr Dey

Reputation: 160

You can try express-xml-bodyparser:

npm install express-xml-bodyparser --save

On the client side:

 $scope.getResp = function(){
     var posting = $http({
           method: 'POST',
           dataType: 'XML',
           url: '/getResp/'+$scope.user.BindData,//other bind variable
           data: $scope.project.XmlData,//xmlData passed by user
           headers: {
              "Content-Type" :'application/xml'
            },
           processData: true
           });
       posting.success(function(response){
       $scope.resp1 =  response;
       });
   };

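On the server side:

var xmlparser = require('express-xml-bodyparser');
app.use(xmlparser());
app.post('/getResp/:BindData', function(req, res, next){
  var tid = req.params.BindData; // route parameter sent by the client
  var reqs = req.rawBody;        // XML request body as received
  console.log('Your XML ' + reqs);
  res.sendStatus(200);           // respond so the client request does not hang
});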

Upvotes: 2

Daphoque

Reputation: 4678

You can also use a regex before parsing to remove the elements that don't match your condition:

var parser = new xml2js.Parser();
fs.readFile(__dirname + '/../public/sitemap.xml', "utf8",function(err, data) {
    // handle err...

    var re = new RegExp("<url>(?:(?!<article)[\\s\\S])*</url>", "gmi")
    data = data.replace(re, ""); // remove node not containing article node
    console.log(data);
    //... parse data ...
});

Example:

   var str = "<data><url><hello>abc</hello><moto>abc</moto></url><url><hello>bcd</hello></url><url><hello>efd</hello><moto>poi</moto></url></data>";
   var re = new RegExp("<url>(?:(?!<moto>)[\\s\\S])*</url>", "gmi")
   str = str.replace(re, "")

   // "<data><url><hello>abc</hello><moto>abc</moto></url><url><hello>efd</hello><moto>poi</moto></url></data>"
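
The same idea can be wrapped in a function and tried on a cut-down version of the question's sitemap. This is only a sketch: `removeUrlsWithout` is a hypothetical helper name, and the approach assumes <url> elements carry no attributes and are never nested, which a real XML parser would not need to assume.

```javascript
// Hypothetical helper: strip every <url>...</url> block that does not contain `marker`.
function removeUrlsWithout(xml, marker) {
  // Escape regex metacharacters so the marker is matched literally.
  const escaped = marker.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Match <url>, then any characters as long as the marker does not start there, then </url>.
  const re = new RegExp('<url>(?:(?!' + escaped + ')[\\s\\S])*</url>', 'gi');
  return xml.replace(re, '');
}

const sitemap =
  '<urlset>' +
  '<url><loc>http://www.example.com</loc></url>' +
  '<url><loc>http://www.example.com/articles/tips-and-tricks</loc>' +
  '<article:title>Tips and Tricks</article:title></url>' +
  '</urlset>';

const filtered = removeUrlsWithout(sitemap, '<article:');
console.log(filtered);
```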

Upvotes: 1

Sajith Mantharath

Reputation: 2627

Use xml2json:

https://www.npmjs.com/package/xml2json

var fs = require('fs');
var parser = require('xml2json');

fs.readFile('./data.xml', function(err, data) {
    if (err) {
        return console.error(err);
    }
    // toJson returns a JSON string by default
    var json = parser.toJson(data);
    console.log("to json ->", json);
});

Upvotes: 50

KuN

Reputation: 1211

Coming late to this thread, but one simple tip: if you plan to use the parsed data in JS or save it as a JSON file, be sure to set explicitArray to false. The output will be more JS-friendly.

So it will look like this:
let parser = new xml2js.Parser({ explicitArray: false });

Ref: https://github.com/Leonidas-from-XIV/node-xml2js
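
The difference shows up when reading values back out. As a rough sketch, assuming the xml2js behaviour that explicitArray: false stores a lone child as a plain value and only repeated children as an array, a small helper can normalise both cases. The two object literals are hand-written stand-ins for parser output, and `toArray` is a hypothetical helper name.

```javascript
// Hand-written stand-ins for parser output (not produced by xml2js here).
// Default (explicitArray: true): every child is wrapped in an array.
const withArrays = { urlset: { url: [{ loc: ['http://www.example.com'] }] } };
// explicitArray: false: a lone child is stored as a plain value.
const withoutArrays = { urlset: { url: { loc: 'http://www.example.com' } } };

// Hypothetical helper: always hand back an array, whichever shape came in.
function toArray(value) {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

// The same access code now works for both shapes.
const locA = toArray(withArrays.urlset.url).map((u) => toArray(u.loc)[0]);
const locB = toArray(withoutArrays.urlset.url).map((u) => toArray(u.loc)[0]);
console.log(locA, locB);
```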

Upvotes: 0

Chad Campbell

Reputation: 937

To read an XML file in Node, I like the xml2js package. It lets me work with the XML as a plain JavaScript object.

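var fs = require('fs');
var xml2js = require('xml2js');
var parser = new xml2js.Parser();
// Read the file as a UTF-8 string so the parser gets text rather than a Buffer
var fileData = fs.readFileSync(__dirname + '/../public/sitemap.xml', 'utf8');
parser.parseString(fileData, function (err, result) {
  var json = JSON.stringify(result);
});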

Upvotes: 0

Quentin

Reputation: 944301

From the documentation.

The callback is passed two arguments (err, data), where data is the contents of the file.

If no encoding is specified, then the raw buffer is returned.

If options is a string, then it specifies the encoding. Example:

fs.readFile('/etc/passwd', 'utf8', callback);

You didn't specify an encoding, so you get the raw buffer.
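
A small sketch of the difference, using only Node's built-in modules. It uses fs.readFileSync instead of fs.readFile to keep the example linear (the encoding argument behaves the same way), and the temp file name is made up for the demonstration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical temp file so the sketch does not depend on an existing sitemap.xml.
const file = path.join(os.tmpdir(), 'readfile-encoding-demo.xml');
fs.writeFileSync(file, '<?xml version="1.0" encoding="utf-8"?><urlset></urlset>');

// No encoding: a Buffer of raw bytes (60 is '<', 63 is '?', 120 is 'x').
const raw = fs.readFileSync(file);

// With an encoding: a decoded string.
const text = fs.readFileSync(file, 'utf8');

// A Buffer you already have can also be decoded after the fact.
const decoded = raw.toString('utf8');

console.log(Buffer.isBuffer(raw), typeof text, decoded === text);
```

Either the string from the second call or the decoded Buffer can then be handed to an XML parser.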

Upvotes: 18
