Leehbi

Reputation: 779

Concatenate char vector with | separator

I have a data structure containing character vectors (see below). It's a bit messy, as it came from a JSON source.

I need to concatenate it into one big string, with the lat/long pairs separated by | and the lat and long values within each pair separated by a comma, with the names removed.

i.e. "53.193418,-2.881248|53.1905138631287,-2.89043889005541|etc.."

I've tried

piped.data<-unname(paste(b, sep="|", collapse=","))

This gets me as far as joining the values with commas and removing the names.

I just need to put the pipe between the individual pairs.

Any ideas?

dput(b)

structure(c("53.193418", "-2.881248", "53.1905138631287", "-2.89043889005541", 
"53.186744", "-2.890165", "53.189836", "-2.893896", "53.1884117", 
"-2.88802", "53.1902965", "-2.8919373", "53.1940384", "-2.8972299", 
"53.1934748", "-2.8814698", "53.1894004", "-2.8886692", "53.1916771", 
"-2.8846099"), .Names = c("location.coordinate.latitude", "location.coordinate.longitude", 
"location.coordinate.latitude", "location.coordinate.longitude", 
"location.coordinate.latitude", "location.coordinate.longitude", 
"location.coordinate.latitude", "location.coordinate.longitude", 
"location.coordinate.latitude", "location.coordinate.longitude", 
"location.coordinate.latitude", "location.coordinate.longitude", 
"location.coordinate.latitude", "location.coordinate.longitude", 
"location.coordinate.latitude", "location.coordinate.longitude", 
"location.coordinate.latitude", "location.coordinate.longitude", 
"location.coordinate.latitude", "location.coordinate.longitude"
))

Upvotes: 9

Views: 2392

Answers (6)

David Arenburg

Reputation: 92300

Another option would be

paste(tapply(b, gl(length(b)/2, 2), toString), collapse = "|")
# [1] "53.193418, -2.881248|53.1905138631287, -2.89043889005541|53.186744, -2.890165|53.189836, 
#     -2.893896|53.1884117, -2.88802|53.1902965, -2.8919373|53.1940384, -2.8972299|53.1934748, 
#     -2.8814698|53.1894004, -2.8886692|53.1916771, -2.8846099"

If you don't want the space after the comma, do

paste(tapply(b, gl(length(b)/2, 2), paste, collapse = ","), collapse = "|")
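To make the grouping step easier to follow, here is a minimal sketch on a made-up four-element vector (`x` is a hypothetical stand-in for `b`, and `grp` and `pairs` are illustrative names). `gl(n, 2)` builds a factor with each level repeated twice, so `tapply` sees one group per lat/long pair:

```r
# Hypothetical toy vector standing in for b: lat, long, lat, long
x <- c("53.1", "-2.8", "53.2", "-2.9")

# gl(2, 2) is a factor with values 1 1 2 2: one level per pair
grp <- gl(length(x) / 2, 2)

# Paste the two members of each group together with a comma
pairs <- tapply(x, grp, paste, collapse = ",")

# Join the per-pair strings with a pipe
result <- paste(pairs, collapse = "|")
result
# [1] "53.1,-2.8|53.2,-2.9"
```

This assumes the vector has an even length and strictly alternates latitude and longitude.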

Edit: @akrun and @SvenHohenstein were able to vectorize their solutions, so here are some benchmarks for illustration (with b repeated 1,000 times):

b <- rep(b, 1e3)

library(microbenchmark)
library(data.table)  # needed for as.data.table() in akrun3

microbenchmark(
 SH = paste(paste(b[c(TRUE, FALSE)], b[c(FALSE, TRUE)], sep = ","), collapse = "|"),
 akrun1 = paste(c(rbind(b,rep(c(',','|'), length.out = length(b))))[-length(b)*2], collapse = ""),
 akrun2 = paste(vapply(split(b,cumsum(grepl('latitude',names(b)))), paste, collapse=",", character(1L)), collapse="|"),
 akrun3 = as.data.table(matrix(b, ncol=2, byrow=TRUE))[, paste(V1, V2, sep=',',collapse="|")],
 AM = paste(apply(matrix(b, ncol = 2, byrow = TRUE), 1, paste, collapse = ","),  collapse = "|"),
 DA = paste(tapply(b, gl(length(b)/2, 2), paste, collapse = ","), collapse = "|"),
 BA = do.call(paste, c(data.frame(matrix(b, ncol=2, byrow=TRUE)), list(sep=",", collapse="|")))
)

#  Unit: milliseconds
#  expr       min        lq      mean    median        uq        max neval
#    SH  6.207338  6.275886  6.633830  6.472943  6.915140  10.556983   100
#akrun1  8.738792  8.790045  9.301718  9.049665  9.611671  11.899290   100
#akrun2 40.676819 42.329860 45.361688 43.887247 46.427638 109.963421   100
#akrun3  4.648384  4.831599  5.019834  4.901934  5.217579   5.798325   100
#    AM 38.322320 40.905073 43.108411 42.457375 44.875023  56.236726   100
#    DA 47.102466 49.679579 52.092028 51.237417 53.694154  68.123738   100
#    BA  5.227204  5.366769  6.147758  5.494207  5.806313  55.938247   100

Upvotes: 5

akrun

Reputation: 887951

You could try

 paste(sapply(split(b,cumsum(grepl('latitude',names(b)))),
             toString),collapse="|")

If you don't need the space after the comma

 paste(sapply(split(b,cumsum(grepl('latitude',names(b)))),
                             paste, collapse=","), collapse="|")

Or use vapply, which would be a bit faster

 paste(vapply(split(b,cumsum(grepl('latitude',names(b)))),
             paste, collapse=",", character(1L)), collapse="|")
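As a rough illustration of what the `split`/`cumsum` grouping does, here is a sketch on a made-up named vector (`x`, with shortened hypothetical names, stands in for `b`). Each name containing "latitude" bumps the running count, so every latitude and the longitude that follows it share a group number:

```r
# Hypothetical toy vector standing in for b, with shortened names
x <- c(latitude = "53.1", longitude = "-2.8",
       latitude = "53.2", longitude = "-2.9")

# TRUE at every latitude; the cumulative sum gives 1 1 2 2
grp <- cumsum(grepl("latitude", names(x)))

# One list element per pair
groups <- split(x, grp)

# vapply declares that each group yields exactly one string
pairs <- vapply(groups, paste, character(1L), collapse = ",")

result <- paste(pairs, collapse = "|")
result
# [1] "53.1,-2.8|53.2,-2.9"
```

Unlike the position-based approaches, this relies on the names, so it assumes every pair starts with an element whose name contains "latitude".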

Or

  paste(c(rbind(b,rep(c(',','|'),length.out=length(b))))[
                            -length(b)*2],collapse="")
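The `rbind` version is terse, so here is a step-by-step sketch on a made-up vector (`x` stands in for `b`; `seps`, `interleaved` and `trimmed` are illustrative names). It interleaves the values with alternating separators and then drops the trailing pipe:

```r
# Hypothetical toy vector standing in for b
x <- c("53.1", "-2.8", "53.2", "-2.9")

# Alternating separators, one per value: "," "|" "," "|"
seps <- rep(c(",", "|"), length.out = length(x))

# rbind() stacks values over separators; c() flattens column by column,
# giving value, separator, value, separator, ...
interleaved <- c(rbind(x, seps))

# Drop the final element, which is a dangling "|"
trimmed <- interleaved[-length(interleaved)]

result <- paste(trimmed, collapse = "")
result
# [1] "53.1,-2.8|53.2,-2.9"
```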

or

  library(data.table)
  as.data.table(matrix(b, ncol=2, byrow=TRUE))[,
                  paste(V1, V2, sep=',',collapse="|")]

Upvotes: 4

baptiste

Reputation: 77124

Another option is to reshape the vector as a data.frame,

do.call(paste, c(data.frame(matrix(b, ncol=2, byrow=TRUE)), 
        list(sep=",", collapse="|")))
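For readers unfamiliar with `do.call`, here is a small sketch of what that call expands to, using a made-up vector (`x` stands in for `b`; `cols`, `via_do_call` and `spelled_out` are illustrative names). The data frame's columns become the vectors passed to `paste`, and `sep` and `collapse` are appended as named arguments:

```r
# Hypothetical toy vector standing in for b
x <- c("53.1", "-2.8", "53.2", "-2.9")

# One row per pair; the columns get the default names X1 (lat) and X2 (long)
cols <- data.frame(matrix(x, ncol = 2, byrow = TRUE))

# do.call() passes each list element to paste() as a separate argument
via_do_call <- do.call(paste, c(cols, list(sep = ",", collapse = "|")))

# The same call written out by hand
spelled_out <- paste(cols$X1, cols$X2, sep = ",", collapse = "|")

via_do_call
# [1] "53.1,-2.8|53.2,-2.9"
```

Because `paste` applies `sep` across its arguments before `collapse`, both separators are handled in a single call.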

Upvotes: 6

Sven Hohenstein

Reputation: 81753

You can use logical indexing and vector recycling:

paste(paste(b[c(TRUE, FALSE)], b[c(FALSE, TRUE)], sep = ","), collapse = "|")
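To show the recycling explicitly, here is a minimal sketch on a made-up vector (`x` stands in for `b`; `lat`, `lon` and `pairs` are illustrative names). The length-2 logical index is repeated along the whole vector, so `c(TRUE, FALSE)` keeps the odd positions and `c(FALSE, TRUE)` the even ones:

```r
# Hypothetical toy vector standing in for b: lat, long, lat, long
x <- c("53.1", "-2.8", "53.2", "-2.9")

lat <- x[c(TRUE, FALSE)]   # index recycled to TRUE FALSE TRUE FALSE
lon <- x[c(FALSE, TRUE)]   # index recycled to FALSE TRUE FALSE TRUE

# Element-wise paste: one "lat,long" string per pair
pairs <- paste(lat, lon, sep = ",")

# Join the pairs into a single string
result <- paste(pairs, collapse = "|")
result
# [1] "53.1,-2.8|53.2,-2.9"
```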

Upvotes: 4

A5C1D2H2I1M1N2O1R2T1

Reputation: 193687

I would convert your "b" to a 2-column matrix and paste with that:

apply(matrix(b, ncol = 2, byrow = TRUE), 1, paste, collapse = "|")
#  [1] "53.193418|-2.881248"              "53.1905138631287|-2.89043889005541"
#  [3] "53.186744|-2.890165"              "53.189836|-2.893896"               
#  [5] "53.1884117|-2.88802"              "53.1902965|-2.8919373"             
#  [7] "53.1940384|-2.8972299"            "53.1934748|-2.8814698"             
#  [9] "53.1894004|-2.8886692"            "53.1916771|-2.8846099" 

Edit

I guess I misread your question.

If it's a single long string you want, first separated by a comma, and then by a pipe, you'll need paste twice:

paste(apply(matrix(b, ncol = 2, byrow = TRUE), 1, paste, collapse = ","), 
      collapse = "|")
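The reason two calls are needed comes down to how `sep` and `collapse` differ. A sketch on a made-up vector (`x` stands in for `b`; `attempt`, `m` and `pairs` are illustrative names): `sep` goes between corresponding elements of several vectors, while `collapse` joins the elements of the single resulting vector.

```r
# Hypothetical toy vector standing in for b
x <- c("53.1", "-2.8", "53.2", "-2.9")

# With only one vector there is nothing for sep to separate,
# so only collapse has any effect
attempt <- paste(x, sep = "|", collapse = ",")
attempt
# [1] "53.1,-2.8,53.2,-2.9"

# One row per lat/long pair
m <- matrix(x, ncol = 2, byrow = TRUE)

# sep joins the two columns element by element
pairs <- paste(m[, 1], m[, 2], sep = ",")
pairs
# [1] "53.1,-2.8" "53.2,-2.9"

# collapse then joins those strings into one
result <- paste(pairs, collapse = "|")
result
# [1] "53.1,-2.8|53.2,-2.9"
```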

Upvotes: 9

javlacalle

Reputation: 1049

You may do:

tmp <- apply(matrix(b, ncol = 2, byrow = TRUE), MARGIN = 1,  FUN = paste, collapse = ",")
paste(tmp, collapse = "|")
# [1] "53.193418,-2.881248|53.1905138631287,-2.89043889005541|53.186744,-2.890165|53.189836,-2.893896|53.1884117,-2.88802|53.1902965,-2.8919373|53.1940384,-2.8972299|53.1934748,-2.8814698|53.1894004,-2.8886692|53.1916771,-2.8846099"

Upvotes: 2
