Tomas Greif

Reputation: 22661

Add ID of dataset when merging datasets

I'm importing and merging .tcx files (GPS and fitness data in XML format) for further analysis:

library(XML)  # provides xmlParse, getNodeSet and xmlToList

files <- dir(pattern = "\\.tcx")
ldf   <- lapply(files, 
      function(x) plyr::ldply(
            getNodeSet(xmlParse(x), "//ns:Trackpoint", "ns"), 
            function(node) as.data.frame(xmlToList(node))))
mydf  <- plyr::rbind.fill(ldf)
mydf  <- setNames(mydf, c('time', 'lat', 'long', 'alt', 'heartrate'))

This works well, but I need to add one column that identifies the source file. A counter would do, but I would prefer to have the file name in the added column. How do I add this column?

Upvotes: 0

Views: 99

Answers (3)

WAF

Reputation: 1151

Assuming that ID is a vector containing the ID (here, your file name) with one entry per row of mydf, you can do:

  mydf[,'ID'] <- ID
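
One way to build such an ID vector, sketched under the assumption that you still have the list of per-file data frames (ldf) and the matching file names (files) from the question. The data below is made up, and base rbind stands in for plyr::rbind.fill so the snippet runs without extra packages:

```r
# Stand-in for the per-file data frames produced by lapply(); values are made up.
files <- c("run1.tcx", "run2.tcx")
ldf <- list(
  data.frame(time = c("t1", "t2", "t3"), heartrate = c(120, 125, 130)),
  data.frame(time = c("t1", "t2"), heartrate = c(110, 115))
)

# Repeat each file name once per row of its data frame.
ID <- rep(files, times = vapply(ldf, nrow, integer(1)))

# Combine the pieces in the same order, then attach the ID column.
mydf <- do.call(rbind, ldf)
mydf[, "ID"] <- ID
```

This relies on the combined data frame keeping the rows in the same order as the list, so the ID vector lines up row by row.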

Upvotes: 1

agstudy

Reputation: 121608

It is hard to give a solution without a reproducible example, and I am not sure about the desired output. One idea is to change this part of your ldply call:

    as.data.frame(xmlToList)

to something like:

    function(y) data.frame(ID = x, as.data.frame(xmlToList(y)))

Here x is the file name supplied by the outer lapply. This adds an ID column holding the file name to each data.frame.
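
A minimal sketch of why this works: the inner function is defined inside function(x), so it can see x. To keep it runnable without the XML package, fake_nodes() below is a hypothetical stand-in that returns lists shaped roughly like what xmlToList might give for each Trackpoint. The field names and values are assumptions, and base rbind replaces the plyr calls:

```r
# Hypothetical stand-in for one parsed file: each element mimics what
# xmlToList() might return for a single Trackpoint node (made-up values).
fake_nodes <- function(file) {
  list(
    list(Time = "2014-01-01T10:00:00Z", AltitudeMeters = "250"),
    list(Time = "2014-01-01T10:00:05Z", AltitudeMeters = "251")
  )
}

files <- c("run1.tcx", "run2.tcx")

ldf <- lapply(files, function(x) {
  rows <- lapply(fake_nodes(x), function(y) {
    # x (the file name) is visible here because this function
    # is defined inside the outer function(x).
    data.frame(ID = x,
               as.data.frame(y, stringsAsFactors = FALSE),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
})

mydf <- do.call(rbind, ldf)
```

In the real pipeline, fake_nodes(x) would be the getNodeSet(xmlParse(x), ...) call and y would be passed through xmlToList first.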

Upvotes: 3

zelite

Reputation: 1501

Not a full answer, but a starting point.

For a data frame, you can add an ID column like this:

data <- data.frame(x=rnorm(100), y=rnorm(100))

data$ID <- "id"

The new ID column is then filled with "id" in every row.

So I would try adding such a column inside the function(x) passed to lapply.
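
Roughly, that could look like the sketch below. read_track() is a hypothetical placeholder for whatever parses a single file (in the question, the ldply/getNodeSet step), and it returns made-up data here so the example runs on its own:

```r
# Hypothetical reader standing in for the XML parsing step; returns made-up data.
read_track <- function(file) {
  data.frame(lat = c(50.1, 50.2), long = c(14.4, 14.5))
}

files <- c("run1.tcx", "run2.tcx")

ldf <- lapply(files, function(x) {
  d <- read_track(x)
  d$ID <- x  # a single value is recycled to every row of d
  d
})

# Combine the per-file data frames; each row keeps its file name.
mydf <- do.call(rbind, ldf)
```

Note that assigning a single value this way fails on a data frame with zero rows, so a file without any trackpoints would need separate handling.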

Upvotes: 0
