user3108348

Reputation: 1

Cut string model formula in R

I am struggling to cut a character model formula after a specific value. This is the vector I am trying to cut:

bla
#[1] "pseudoy ~ x1 + x2 + x3 + x4 + x5 + x6 + (1 | clusterid)"

str(bla)
# chr "pseudoy ~ x1 + x2 + x3 + x4 + x5 + x6 + (1 | clusterid)"

The desired result should look like this:

bla2
#[1] "pseudoy ~ x1 + x2 + x3 + x4 + x5 + x6"

This is what I have tried:

bla2 <- gsub("+ (1 | clusterid)", "", bla)

But unfortunately this is not working :(

I would appreciate any help. Thanks!!

Upvotes: 0

Views: 123

Answers (4)

Zheyuan Li

Reputation: 73325

You have a formula rather than just an ordinary string, and R has dedicated tools for working with formulas:

f <- as.formula("pseudoy ~ x1 + x2 + x3 + x4 + x5 + x6 + (1 | clusterid)")
# pseudoy ~ x1 + x2 + x3 + x4 + x5 + x6 + (1 | clusterid)

g <- terms.formula(f)

modelterms <- attr(g, "term.labels")
#[1] "x1"            "x2"            "x3"            "x4"           
#[5] "x5"            "x6"            "1 | clusterid"

retain <- modelterms[!grepl("|", modelterms, fixed = TRUE)]
#[1] "x1" "x2" "x3" "x4" "x5" "x6"

reformulate(retain, f[[2]])
# pseudoy ~ x1 + x2 + x3 + x4 + x5 + x6

This answer assumes you want a solution flexible enough to drop every model term that involves the conditional operator |, without prior knowledge of the formula's content or the order in which its terms are specified.
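The steps above can be wrapped into a small helper that takes the formula as a string and returns a string, which is the form the question asks for. This is only a sketch: the name drop_bar_terms is hypothetical, and it assumes at least one term without | remains, since reformulate() fails on an empty set of term labels.

```r
# Hypothetical helper, not part of the original answer.
# Assumes `x` is a single string holding a two-sided formula.
drop_bar_terms <- function(x) {
  f <- as.formula(x)
  labels <- attr(terms(f), "term.labels")
  # Keep only the terms that do not contain a literal "|"
  keep <- labels[!grepl("|", labels, fixed = TRUE)]
  g <- reformulate(keep, response = f[[2]])
  # A large width.cutoff keeps deparse() from splitting long formulas
  paste(deparse(g, width.cutoff = 500L), collapse = " ")
}

bla <- "pseudoy ~ x1 + x2 + x3 + x4 + x5 + x6 + (1 | clusterid)"
bla2 <- drop_bar_terms(bla)
bla2
#[1] "pseudoy ~ x1 + x2 + x3 + x4 + x5 + x6"
```

Going through terms() rather than string matching means the helper does not depend on where the | term sits in the formula or how it is spaced.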

Upvotes: 1

Sandipan Dey

Reputation: 23099

With stringr:

library(stringr)
bla <- "pseudoy ~ x1 + x2 + x3 + x4 + x5 + x6 + (1 | clusterid)"
# Escape | as well; unescaped it is regex alternation and the pattern only matches by accident
bla2 <- str_match(bla, "(.*) \\+ \\(1 \\| clusterid\\)")[, 2]
bla2
#[1] "pseudoy ~ x1 + x2 + x3 + x4 + x5 + x6"

Upvotes: 0

joel.wilson

Reputation: 8413

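You need to pass fixed = TRUE to gsub(). Without it, the pattern is interpreted as a regular expression, in which +, (, ) and | are metacharacters, so it does not match the literal text.

gsub(" + (1 | clusterid)", "", bla, fixed = TRUE)
#[1] "pseudoy ~ x1 + x2 + x3 + x4 + x5 + x6"

With fixed = TRUE, the pattern is a string to be matched as is.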
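For comparison, the same literal match can be written as a regular expression by escaping each metacharacter by hand. The sketch below redefines bla so it runs on its own; the variable names with_fixed and with_regex are illustrative only.

```r
bla <- "pseudoy ~ x1 + x2 + x3 + x4 + x5 + x6 + (1 | clusterid)"

# Literal matching: no escaping needed
with_fixed <- gsub(" + (1 | clusterid)", "", bla, fixed = TRUE)

# Regex matching: +, (, ) and | must each be escaped
with_regex <- gsub(" \\+ \\(1 \\| clusterid\\)", "", bla)

identical(with_fixed, with_regex)
#[1] TRUE
```

When the text to remove is known exactly, fixed = TRUE is the less error-prone of the two, because there is nothing to escape.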

Upvotes: 0

akrun

Reputation: 887173

We can use sub(). The + is a regex metacharacter, so it needs to be escaped. Here we match one or more whitespace characters (\\s+), followed by a literal + (\\+), followed by one or more whitespace characters (\\s+), followed by an opening parenthesis (\\() and everything after it (.*), and replace the match with an empty string ("").

sub("\\s+\\+\\s+\\(.*", "", bla)
#[1] "pseudoy ~ x1 + x2 + x3 + x4 + x5 + x6"
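One caveat worth noting: because the pattern ends in .*, it removes everything from the first " + (" onward, which is fine when the parenthesised term comes last, as in the question. If other terms may follow it, a narrower pattern that stops at the closing parenthesis is one possible alternative. The sketch below uses a made-up formula string, and the names mixed, cut_tail and cut_term are illustrative only.

```r
# Made-up example in which a term follows the parenthesised one
mixed <- "y ~ x1 + (1 | g) + x2"

# The pattern from the answer also drops the trailing "+ x2"
cut_tail <- sub("\\s+\\+\\s+\\(.*", "", mixed)
cut_tail
#[1] "y ~ x1"

# Narrower pattern: only a "+ ( ... | ... )" group, with no nested parentheses
cut_term <- gsub("\\s+\\+\\s+\\([^()]*\\|[^()]*\\)", "", mixed)
cut_term
#[1] "y ~ x1 + x2"
```

The narrower pattern assumes the | term is preceded by a + and contains no nested parentheses; it would not handle a term that appears first on the right-hand side.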

Upvotes: 0
