Reputation: 165
i have a xml file that includes rootNode and child node with attributes that handle values.
i am using the R language to work on the xml file.
what i need is to display the result of the employees that are in the department IT
how to display the ID or the name of the employees that are in the IT department?
i used this code:
print(getNodeSet(rootnode,"//EMPLOYEE/DEPT[@DEPT='IT']"))
where rootnode is the variable that handle the value : RECORDS
IT DID NOT WORK
<RECORDS>
<EMPLOYEE>
<ID>1</ID>
<NAME>Rick</NAME>
<SALARY>623.3</SALARY>
<STARTDATE>1/1/2012</STARTDATE>
<DEPT>IT</DEPT>
</EMPLOYEE>
<EMPLOYEE>
<ID>2</ID>
<NAME>Dan</NAME>
<SALARY>515.2</SALARY>
<STARTDATE>9/23/2013</STARTDATE>
<DEPT>Operations</DEPT>
</EMPLOYEE>
<EMPLOYEE>
<ID>3</ID>
<NAME>Michelle</NAME>
<SALARY>611</SALARY>
<STARTDATE>11/15/2014</STARTDATE>
<DEPT>IT</DEPT>
</EMPLOYEE>
<EMPLOYEE>
<ID>4</ID>
<NAME>Ryan</NAME>
<SALARY>729</SALARY>
<STARTDATE>5/11/2014</STARTDATE>
<DEPT>HR</DEPT>
</EMPLOYEE>
<EMPLOYEE>
<ID>5</ID>
<NAME>Gary</NAME>
<SALARY>843.25</SALARY>
<STARTDATE>3/27/2015</STARTDATE>
<DEPT>Finance</DEPT>
</EMPLOYEE>
<EMPLOYEE>
<ID>6</ID>
<NAME>Nina</NAME>
<SALARY>578</SALARY>
<STARTDATE>5/21/2013</STARTDATE>
<DEPT>IT</DEPT>
</EMPLOYEE>
<EMPLOYEE>
<ID>7</ID>
<NAME>Simon</NAME>
<SALARY>632.8</SALARY>
<STARTDATE>7/30/2013</STARTDATE>
<DEPT>Operations</DEPT>
</EMPLOYEE>
<EMPLOYEE>
<ID>8</ID>
<NAME>Guru</NAME>
<SALARY>722.5</SALARY>
<STARTDATE>6/17/2014</STARTDATE>
<DEPT>Finance</DEPT>
</EMPLOYEE>
</RECORDS>
Upvotes: 2
Views: 484
Reputation: 11955
Seems you need to modify getNodeSet
as below.
getNodeSet(xml_data, "//EMPLOYEE[DEPT='IT']/NAME")
In case you want to have more than one column in the Output:
library(XML)
library(dplyr)
#sample data
xml_data <- xmlParse("<RECORDS>
<EMPLOYEE><ID>1</ID><NAME>Rick</NAME><SALARY>623.3</SALARY><DEPT>IT</DEPT></EMPLOYEE>
<EMPLOYEE><ID>2</ID><NAME>Dan</NAME><SALARY>515.2</SALARY><DEPT>Operations</DEPT></EMPLOYEE>
<EMPLOYEE><ID>3</ID><NAME>Michelle</NAME><SALARY>611</SALARY><DEPT>IT</DEPT></EMPLOYEE>
</RECORDS>")
df <- xmlToDataFrame(nodes=getNodeSet(xml_data, "//EMPLOYEE[DEPT='IT']")) %>%
select(NAME, SALARY)
df
Output is:
NAME SALARY
1 Rick 623.3
2 Michelle 611
(Edit - modified code to have more than one column in the output)
Upvotes: 1