GCGM

Reputation: 1073

Generate XML document in R

In the project I am working on, I need to automate the creation of an XML document based on user input. Using the user input to modify the XML document is fine, but I am new to creating XML documents from scratch in R.

I am wondering whether an XML document like the one below can be generated in R using the XML or xml2 packages. So far I have explored the newXMLDoc, xml_new_document and xml_new_root functions, but I am not familiar with the syntax needed to create such an XML file (which should be saved to a local path once finished).

<session>
  <modelVersion>1.0.0</modelVersion>
  <products>
    <product>
      <refNo>1</refNo>
      <uri>S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip</uri>
      <productReaderPlugin>class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn</productReaderPlugin>
    </product>
    <product>
      <refNo>2</refNo>
      <uri>S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip</uri>
      <productReaderPlugin>class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn</productReaderPlugin>
    </product>
  </products>
  <views/>
</session>

Upvotes: 9

Views: 4686

Answers (3)

josep maria porrà

Reputation: 1388

The xml2 package (CRAN) provides an alternative solution within the Hadleyverse.

library(xml2)
library(tidyverse)

df <- data.frame(number = c(1, 2),
  uri = c('S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip', 
    'S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip'),
  plugin = c('class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn', 
    'class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn'),
  stringsAsFactors = FALSE)

First, create the XML document that will hold the whole structure, with one products node per row of df:

doc <- xml_new_root("session") 
xml_add_child(doc, "modelVersion", "1.0.0")  
xml_add_child(doc, "products") 
xml_add_child(doc, "products") 
xml_add_child(doc, "views")
doc
#> {xml_document}
#> <session>
#> [1] <modelVersion>1.0.0</modelVersion>
#> [2] <products/>
#> [3] <products/>
#> [4] <views/>

Next, add the components to each products node. No loop is required because xml_add_child is vectorized over the node set.

products_nodes <- xml_find_all(doc, "//products")
xml_add_child(products_nodes, "refNo", df$number)
xml_add_child(products_nodes, "uri", df$uri)
xml_add_child(products_nodes, "productReaderPlugin", df$plugin)

Finally, save the XML tree to a file and display its content:

write_xml(doc, file = "output.xml", options = c("format", "no_declaration"))
cat(paste0(readLines("output.xml"), collapse = "\n"))

This is the content of the "output.xml" file:

<session>
  <modelVersion>1.0.0</modelVersion>
  <products>
    <refNo>1</refNo>
    <uri>S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip</uri>
    <productReaderPlugin>class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn</productReaderPlugin>
  </products>
  <products>
    <refNo>2</refNo>
    <uri>S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip</uri>
    <productReaderPlugin>class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn</productReaderPlugin>
  </products>
  <views/>
</session>

Created on 2021-05-06 by the reprex package (v0.3.0)
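
Note that the output above repeats <products> once per row, whereas the XML in the question nests <product> elements inside a single <products> element. Below is a minimal sketch of one way to get the nested layout with xml2, assuming the same df columns (number, uri, plugin). The explicit loop, the variable names and the tempfile() output path are illustrative choices, not part of the original answer.

```r
library(xml2)

# Same data as above (column names are those used in this answer)
df <- data.frame(
  number = c(1, 2),
  uri = c("S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip",
          "S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip"),
  plugin = c("class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn",
             "class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn"),
  stringsAsFactors = FALSE
)

# Skeleton: a single <products> wrapper instead of one per row
doc <- xml_new_root("session")
xml_add_child(doc, "modelVersion", "1.0.0")
xml_add_child(doc, "products")
xml_add_child(doc, "views")

# Look the wrapper up explicitly, then append one <product> per row
products <- xml_find_first(doc, "./products")
for (i in seq_len(nrow(df))) {
  xml_add_child(products, "product")
  product <- xml_children(products)[[i]]  # the node that was just appended
  xml_add_child(product, "refNo", as.character(df$number[i]))
  xml_add_child(product, "uri", df$uri[i])
  xml_add_child(product, "productReaderPlugin", df$plugin[i])
}

# Illustrative output path; replace with the local path you need
out_file <- tempfile(fileext = ".xml")
write_xml(doc, file = out_file, options = c("format", "no_declaration"))
```

The loop trades the vectorized call for readability; for a handful of products the difference should not matter.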

Upvotes: 9

Amit Kohli

Reputation: 2950

Might be easily solvable without any of those packages. If your structure is fairly static, I would use glue (https://github.com/tidyverse/glue) and then just cat() the file out. Something like this:



## I guess your data looks like this?
df <- data.frame(number = c(1,2),
                 uri = c("S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip",
                         "S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip"),
                 plugin = c("class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn",
                            "class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn"))
df

## build a function that outputs every block in xml format
thingieBuilder <- function(number, uri, plugin){
  glue::glue("<product>
           <refNo>{number}</refNo>
           <uri>{uri}</uri>
           <productReaderPlugin>{plugin}</productReaderPlugin>
           </product>")
}

## now run that for each entry in your df, unlist it, and make it a sausage, separated by newlines
library(magrittr)  # provides the %>% pipe used below
xmlProducts <- df %>% purrr::pmap(thingieBuilder) %>% unlist %>% paste(collapse = "\n")

## Now stick on top and bottom, and cat it to a file!
glue::glue("<session>
  <modelVersion>1.0.0</modelVersion>
  <products>\n",
           xmlProducts,
           "\n</products>
             <views/>
           </session>") %>% 
  cat(file = "boom.xml")
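
One caveat with pasting strings together: nothing escapes characters that are special in XML, so a value containing & or < would produce a file that does not parse. Here is a minimal sketch of a helper that could be applied to each value before it goes into glue. The name escape_xml is hypothetical (it does not come from glue or any package), and it only covers element text, not attribute values.

```r
# Hypothetical helper: escape characters that are special in XML element text
escape_xml <- function(x) {
  x <- gsub("&", "&amp;", x, fixed = TRUE)  # must run first, or it would re-escape the entities below
  x <- gsub("<", "&lt;", x, fixed = TRUE)
  x <- gsub(">", "&gt;", x, fixed = TRUE)
  x
}

escape_xml("a<b & c")
#> [1] "a&lt;b &amp; c"
```

With that in place, thingieBuilder could use {escape_xml(uri)} instead of {uri}, since glue evaluates arbitrary R expressions inside the braces.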

Upvotes: 1

Parfait

Reputation: 107652

Consider building the XML with DOM methods from one of the libraries already mentioned, such as XML, which avoids concatenating or interpolating strings:

library(XML)

# DATA
df <- data.frame(refNo = c(1, 2),
                 uri = c('S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip', 
                         'S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip'),
                 plugin = c('class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn', 
                            'class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn')
                )

# CREATE XML FILE
doc = newXMLDoc()
root = newXMLNode("session", doc = doc)

# WRITE XML NODES AND DATA
mvNode = newXMLNode("modelVersion", "1.0.0", parent = root)

for (i in seq_len(nrow(df))) {
  prodNode = newXMLNode("products", parent = root)

  # APPEND TO PRODUCT NODE
  newXMLNode("refNo", df$refNo[i], parent = prodNode)
  newXMLNode("uri", df$uri[i], parent = prodNode)
  newXMLNode("productReaderPlugin", df$plugin[i], parent = prodNode)
}

vwNode = newXMLNode("views", parent = root)

# OUTPUT XML CONTENT TO CONSOLE
print(doc)

# OUTPUT XML CONTENT TO FILE
saveXML(doc, file="Output.xml")

Output

<?xml version="1.0"?>
<session>
  <modelVersion>1.0.0</modelVersion>
  <products>
    <refNo>1</refNo>
    <uri>S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip</uri>
    <productReaderPlugin>class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn</productReaderPlugin>
  </products>
  <products>
    <refNo>2</refNo>
    <uri>S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip</uri>
    <productReaderPlugin>class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn</productReaderPlugin>
  </products>
  <views/>
</session>
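
The output above has one <products> element per row. If the layout from the question is needed (a single <products> element wrapping several <product> elements), a small variation of the same loop should do it. This is a sketch under that assumption; productsNode and the tempfile() path are illustrative names, not part of the original answer.

```r
library(XML)

# Same data as above
df <- data.frame(refNo = c(1, 2),
                 uri = c("S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip",
                         "S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip"),
                 plugin = c("class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn",
                            "class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn"),
                 stringsAsFactors = FALSE)

doc <- newXMLDoc()
root <- newXMLNode("session", doc = doc)
newXMLNode("modelVersion", "1.0.0", parent = root)

# Create the single wrapper once, outside the loop
productsNode <- newXMLNode("products", parent = root)

for (i in seq_len(nrow(df))) {
  # One <product> per row, attached to the shared wrapper
  prodNode <- newXMLNode("product", parent = productsNode)
  newXMLNode("refNo", df$refNo[i], parent = prodNode)
  newXMLNode("uri", df$uri[i], parent = prodNode)
  newXMLNode("productReaderPlugin", df$plugin[i], parent = prodNode)
}

newXMLNode("views", parent = root)

# Illustrative output path; replace with the local path you need
out_file <- tempfile(fileext = ".xml")
saveXML(doc, file = out_file)
```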

Upvotes: 9
