Reputation: 4555
As the title states, I am trying to use gsub with a vector for both the "pattern" and the "replacement". Currently, I have code that looks like this:
names(x1) <- gsub("2110027599", "Inv1", names(x1)) #x1 is a data frame
names(x1) <- gsub("2110025622", "Inv2", names(x1))
names(x1) <- gsub("2110028045", "Inv3", names(x1))
names(x1) <- gsub("2110034716", "Inv4", names(x1))
names(x1) <- gsub("2110069349", "Inv5", names(x1))
names(x1) <- gsub("2110023264", "Inv6", names(x1))
What I hope to do is something like this:
a <- c("2110027599","2110025622","2110028045","2110034716", "2110069349", "2110023264")
b <- c("Inv1","Inv2","Inv3","Inv4","Inv5","Inv6")
names(x1) <- gsub(a,b,names(x1))
I'm guessing there is an apply function somewhere that can do this, but I am not very sure which one to use!
EDIT: names(x1) looks like this (There are many more columns, but I'm leaving them out):
> names(x1)
[1] "2110023264A.Ms.Amp" "2110023264A.Ms.Vol" "2110023264A.Ms.Watt" "2110023264A1.Ms.Amp"
[5] "2110023264A2.Ms.Amp" "2110023264A3.Ms.Amp" "2110023264A4.Ms.Amp" "2110023264A5.Ms.Amp"
[9] "2110023264B.Ms.Amp" "2110023264B.Ms.Vol" "2110023264B.Ms.Watt" "2110023264B1.Ms.Amp"
[13] "2110023264Error" "2110023264E-Total" "2110023264GridMs.Hz" "2110023264GridMs.PhV.phsA"
[17] "2110023264GridMs.PhV.phsB" "2110023264GridMs.PhV.phsC" "2110023264GridMs.TotPFPrc" "2110023264Inv.TmpLimStt"
[21] "2110023264InvCtl.Stt" "2110023264Mode" "2110023264Mt.TotOpTmh" "2110023264Mt.TotTmh"
[25] "2110023264Op.EvtCntUsr" "2110023264Op.EvtNo" "2110023264Op.GriSwStt" "2110023264Op.TmsRmg"
[29] "2110023264Pac" "2110023264PlntCtl.Stt" "2110023264Serial Number" "2110025622A.Ms.Amp"
[33] "2110025622A.Ms.Vol" "2110025622A.Ms.Watt" "2110025622A1.Ms.Amp" "2110025622A2.Ms.Amp"
[37] "2110025622A3.Ms.Amp" "2110025622A4.Ms.Amp" "2110025622A5.Ms.Amp" "2110025622B.Ms.Amp"
[41] "2110025622B.Ms.Vol" "2110025622B.Ms.Watt" "2110025622B1.Ms.Amp" "2110025622Error"
[45] "2110025622E-Total" "2110025622GridMs.Hz" "2110025622GridMs.PhV.phsA" "2110025622GridMs.PhV.phsB"
What I hope to get is this:
> names(x1)
[1] "Inv6A.Ms.Amp" "Inv6A.Ms.Vol" "Inv6A.Ms.Watt" "Inv6A1.Ms.Amp" "Inv6A2.Ms.Amp"
[6] "Inv6A3.Ms.Amp" "Inv6A4.Ms.Amp" "Inv6A5.Ms.Amp" "Inv6B.Ms.Amp" "Inv6B.Ms.Vol"
[11] "Inv6B.Ms.Watt" "Inv6B1.Ms.Amp" "Inv6Error" "Inv6E-Total" "Inv6GridMs.Hz"
[16] "Inv6GridMs.PhV.phsA" "Inv6GridMs.PhV.phsB" "Inv6GridMs.PhV.phsC" "Inv6GridMs.TotPFPrc" "Inv6Inv.TmpLimStt"
[21] "Inv6InvCtl.Stt" "Inv6Mode" "Inv6Mt.TotOpTmh" "Inv6Mt.TotTmh" "Inv6Op.EvtCntUsr"
[26] "Inv6Op.EvtNo" "Inv6Op.GriSwStt" "Inv6Op.TmsRmg" "Inv6Pac" "Inv6PlntCtl.Stt"
[31] "Inv6Serial Number" "Inv2A.Ms.Amp" "Inv2A.Ms.Vol" "Inv2A.Ms.Watt" "Inv2A1.Ms.Amp"
[36] "Inv2A2.Ms.Amp" "Inv2A3.Ms.Amp" "Inv2A4.Ms.Amp" "Inv2A5.Ms.Amp" "Inv2B.Ms.Amp"
[41] "Inv2B.Ms.Vol" "Inv2B.Ms.Watt" "Inv2B1.Ms.Amp" "Inv2Error" "Inv2E-Total"
[46] "Inv2GridMs.Hz" "Inv2GridMs.PhV.phsA" "Inv2GridMs.PhV.phsB"
Upvotes: 52
Views: 35250
Reputation: 3883
From the stringr documentation of str_replace_all: "If you want to apply multiple patterns and replacements to the same string, pass a named version to pattern."
Thus using a, b, and names(x1) from above
stringr::str_replace_all(names(x1), setNames(b, a))
EDIT
stringr::str_replace_all calls stringi::stri_replace_all_regex, which can be used directly and is about seven times faster by median time in the benchmark below.
x <- names(x1)
pattern <- a
replace <- b
microbenchmark::microbenchmark(
str = stringr::str_replace_all(x, setNames(replace, pattern)),
stri = stringi::stri_replace_all_regex(x, pattern, replace, vectorize_all = FALSE)
)
Unit: microseconds
expr min lq mean median uq max neval cld
str 1022.1 1070.45 1286.547 1175.55 1309 2526.8 100 b
stri 145.2 150.45 190.124 160.55 178 457.9 100 a
Upvotes: 34
Reputation: 534
I needed to do something similar but had to use base R. As long as your vectors are the same length, I think this will work:
for (i in seq_along(a)){
names(x1) <- gsub(a[i], b[i], names(x1))
}
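The loop above can be wrapped in a small function so it is reusable. This is only a sketch: multi_gsub is a hypothetical helper name (not a base R function), the sample names are a shortened version of the ones in the question, and fixed = TRUE is my assumption that the patterns are literal strings rather than regular expressions.

```r
# Hypothetical helper: apply each pattern/replacement pair in turn
# to the whole character vector. Extra arguments are passed on to gsub().
multi_gsub <- function(pattern, replacement, x, ...) {
  stopifnot(length(pattern) == length(replacement))
  for (i in seq_along(pattern)) {
    x <- gsub(pattern[i], replacement[i], x, ...)
  }
  x
}

# Shortened sample data, assumed for illustration
a <- c("2110027599", "2110025622", "2110023264")
b <- c("Inv1", "Inv2", "Inv6")
nms <- c("2110023264A.Ms.Amp", "2110025622Error", "2110027599Pac")

# fixed = TRUE treats each pattern as a literal string, not a regex
result <- multi_gsub(a, b, nms, fixed = TRUE)
print(result)
```

Because the replacements are applied one after another, a later pattern can match text produced by an earlier replacement. That is not an issue for these IDs, but it is worth checking for other data.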
Upvotes: 5
Reputation: 110062
Lots of solutions already; here is one more:
The qdap package:
library(qdap)
names(x1) <- mgsub(a,b,names(x1))
Upvotes: 33
Reputation: 193687
If we can make another assumption, the following should work. The assumption this time is that you are really interested in substituting the first 10 characters from each value in names(x1).
Here, I've stored names(x1) as a character vector named "X1". The solution essentially uses substr to separate the values in X1 into 2 parts, match to figure out the correct replacement option, and paste to put everything back together.
a <- c("2110027599", "2110025622", "2110028045",
"2110034716", "2110069349", "2110023264")
b <- c("Inv1","Inv2","Inv3","Inv4","Inv5","Inv6")
X1pre <- substr(X1, 1, 10)
X1post <- substr(X1, 11, max(nchar(X1)))
paste0(b[match(X1pre, a)], X1post)
# [1] "Inv6A.Ms.Amp" "Inv6A.Ms.Vol" "Inv6A.Ms.Watt"
# [4] "Inv6A1.Ms.Amp" "Inv6A2.Ms.Amp" "Inv6A3.Ms.Amp"
# [7] "Inv6A4.Ms.Amp" "Inv6A5.Ms.Amp" "Inv6B.Ms.Amp"
# [10] "Inv6B.Ms.Vol" "Inv6B.Ms.Watt" "Inv6B1.Ms.Amp"
# [13] "Inv6Error" "Inv6E-Total" "Inv6GridMs.Hz"
# [16] "Inv6GridMs.PhV.phsA" "Inv6GridMs.PhV.phsB" "Inv6GridMs.PhV.phsC"
# [19] "Inv6GridMs.TotPFPrc" "Inv6Inv.TmpLimStt" "Inv6InvCtl.Stt"
# [22] "Inv6Mode" "Inv6Mt.TotOpTmh" "Inv6Mt.TotTmh"
# [25] "Inv6Op.EvtCntUsr" "Inv6Op.EvtNo" "Inv6Op.GriSwStt"
# [28] "Inv6Op.TmsRmg" "Inv6Pac" "Inv6PlntCtl.Stt"
# [31] "Inv6Serial Number" "Inv2A.Ms.Amp" "Inv2A.Ms.Vol"
# [34] "Inv2A.Ms.Watt" "Inv2A1.Ms.Amp" "Inv2A2.Ms.Amp"
# [37] "Inv2A3.Ms.Amp" "Inv2A4.Ms.Amp" "Inv2A5.Ms.Amp"
# [40] "Inv2B.Ms.Amp" "Inv2B.Ms.Vol" "Inv2B.Ms.Watt"
# [43] "Inv2B1.Ms.Amp" "Inv2Error" "Inv2E-Total"
# [46] "Inv2GridMs.Hz" "Inv2GridMs.PhV.phsA" "Inv2GridMs.PhV.phsB"
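One thing to watch with the substr/match/paste approach: match returns NA for any prefix that is not in a, and paste0 turns that into the literal text "NA". A minimal sketch of one way to leave such names untouched, assuming a shortened version of the data and an extra, made-up column name "Timestamp" that has no ID prefix:

```r
# Shortened sample data; "Timestamp" is a made-up name with no ID prefix
a <- c("2110027599", "2110025622", "2110023264")
b <- c("Inv1", "Inv2", "Inv6")
X1 <- c("2110023264A.Ms.Amp", "2110025622Error", "Timestamp")

prefix <- substr(X1, 1, 10)   # first 10 characters
suffix <- substring(X1, 11)   # everything from the 11th character on
idx <- match(prefix, a)       # NA where the prefix is not a known ID

# Keep the original name when there is no match
renamed <- ifelse(is.na(idx), X1, paste0(b[idx], suffix))
print(renamed)
```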
If we can assume that names(x1) is in the same order as the pattern and replacement and that it is basically a one-for-one replacement, you might be able to get away with just sapply.
Here's an example of that particular situation:
Imagine names(x1) looks something like this:
X1 <- paste0("A2", a, sequence(length(a)))
X1
# [1] "A221100275991" "A221100256222" "A221100280453"
# [4] "A221100347164" "A221100693495" "A221100232646"
Here are our pattern and replacement vectors:
a <- c("2110027599", "2110025622", "2110028045",
"2110034716", "2110069349", "2110023264")
b <- c("Inv1","Inv2","Inv3","Inv4","Inv5","Inv6")
This is how we might use sapply if these assumptions are valid.
sapply(seq_along(a), function(x) gsub(a[x], b[x], X1[x]))
# [1] "A2Inv11" "A2Inv22" "A2Inv33" "A2Inv44" "A2Inv55" "A2Inv66"
Upvotes: 11
Reputation: 121177
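Try mapply. Note that it pairs the i-th pattern and replacement with the i-th name (recycling the shorter vectors), so it assumes names(x1) lines up with a and b.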
names(x1) <- mapply(gsub, a, b, names(x1), USE.NAMES = FALSE)
Or, even easier, str_replace from stringr, which is vectorised over string, pattern and replacement in the same element-wise way.
library(stringr)
names(x1) <- str_replace(names(x1), a, b)
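To see what the element-wise pairing means in practice, here is a small base R sketch. The two-element vectors are made up for illustration; the point is only that mapply applies a[i] to the i-th name and never tries the other patterns.

```r
a <- c("2110027599", "2110025622")
b <- c("Inv1", "Inv2")

# Names in the same order as the patterns, and in the opposite order
aligned  <- c("2110027599Pac", "2110025622Pac")
shuffled <- c("2110025622Pac", "2110027599Pac")

# Each call is gsub(a[i], b[i], x[i])
res_aligned  <- mapply(gsub, a, b, aligned,  USE.NAMES = FALSE)
res_shuffled <- mapply(gsub, a, b, shuffled, USE.NAMES = FALSE)

print(res_aligned)   # both names are renamed
print(res_shuffled)  # nothing is replaced: pattern i does not occur in name i
```

If the names are not in the same order as the patterns, a loop over the patterns that rewrites the whole vector each time is the safer choice.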
Upvotes: 3
Reputation: 60000
Somehow names<- and match seem much more appropriate here...
names( x1 ) <- b[ match( names( x1 ) , a ) ]
But I am making the assumption that the elements of vector a are the actual names of your data.frame.
If a really is a pattern found within each of the names of x1 then this grepl approach with names<- could be useful...
new <- sapply( a , grepl , x = names( x1 ) )
names( x1 ) <- b[ apply( new , 1 , which.max ) ]
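A caveat with the grepl approach, shown as a sketch on made-up data: which.max returns 1 for a row where nothing matched, so a name containing none of the patterns would silently receive b[1]. The variant below returns NA for such rows instead. The sample names (including "Timestamp") and fixed = TRUE are my assumptions, not part of the original answer.

```r
a <- c("2110027599", "2110025622", "2110023264")
b <- c("Inv1", "Inv2", "Inv6")
nms <- c("2110023264A.Ms.Amp", "2110025622Error", "Timestamp")

# Logical matrix: one row per name, one column per pattern
hits <- sapply(a, grepl, x = nms, fixed = TRUE)

# Original indexing: the all-FALSE row for "Timestamp" yields 1
idx_whichmax <- unname(apply(hits, 1, which.max))

# Safer variant: NA when no pattern matches the name
idx_safe <- unname(apply(hits, 1, function(row) {
  if (any(row)) which(row)[1] else NA_integer_
}))

labels <- b[idx_safe]
print(idx_whichmax)
print(labels)
```

Note that, as in the answer above, the result is the replacement label only; the rest of each name is dropped.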
Upvotes: 2