Reputation: 1372

Extract first element from string

I want to extract the first element inside each pair of square brackets in a string of the form "[a,b], [c,d]". See the example below:

x <- "[\"Multi & Flexi-Cap\", \"Multi & Flexi Cap Fund\"], [\"Large-Cap\", \"Large Cap Fund\"], [\"Large & Mid-Cap\", \"Large & Mid Cap Fund\"], [\"Mid-Cap\", \"Mid Cap Fund\"], [\"Small-Cap\", \"Small Cap Fund\"], [\"ELSS\", \"ELSS (Tax Savings)\"], [\"Dividend Yield\", \"Dividend Yield\"], [\"Equity - Sectoral\", \"Sectoral/Thematic\"], [\"Contra\", \"Contra Fund\"], [\"Focused Fund\", \"Focused Fund\"], [\"Value\", \"Value Fund\"], [\"RGESS\", \"RGESS\"], [\"Equity - Other\", \"Equity - Other\"]"

Desired output for the first 3 bracket groups:

c("Multi & Flexi-Cap", "Large-Cap", "Large & Mid-Cap")

Upvotes: 2

Answers (3)

akrun

Reputation: 887048

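We could use reticulate. The string is valid Python syntax for a tuple of lists, so it can be evaluated in Python and the first element of each list picked out in R: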

library(reticulate)
py_run_string(paste0('x = ', x))
sapply(py$x, `[`, 1)

-output

[1] "Multi & Flexi-Cap" "Large-Cap"         "Large & Mid-Cap"   "Mid-Cap"           "Small-Cap"         "ELSS"              "Dividend Yield"   
 [8] "Equity - Sectoral" "Contra"            "Focused Fund"      "Value"             "RGESS"             "Equity - Other"

data

x <- "[\"Multi & Flexi-Cap\", \"Multi & Flexi Cap Fund\"], [\"Large-Cap\", \"Large Cap Fund\"], [\"Large & Mid-Cap\", \"Large & Mid Cap Fund\"], [\"Mid-Cap\", \"Mid Cap Fund\"], [\"Small-Cap\", \"Small Cap Fund\"], [\"ELSS\", \"ELSS (Tax Savings)\"], [\"Dividend Yield\", \"Dividend Yield\"], [\"Equity - Sectoral\", \"Sectoral/Thematic\"], [\"Contra\", \"Contra Fund\"], [\"Focused Fund\", \"Focused Fund\"], [\"Value\", \"Value Fund\"], [\"RGESS\", \"RGESS\"], [\"Equity - Other\", \"Equity - Other\"]"

Upvotes: 1

Peter

Reputation: 12699

Using stringr you could try:

x <- "[\"Multi & Flexi-Cap\", \"Multi & Flexi Cap Fund\"], [\"Large-Cap\", \"Large Cap Fund\"], [\"Large & Mid-Cap\", \"Large & Mid Cap Fund\"], [\"Mid-Cap\", \"Mid Cap Fund\"], [\"Small-Cap\", \"Small Cap Fund\"], [\"ELSS\", \"ELSS (Tax Savings)\"], [\"Dividend Yield\", \"Dividend Yield\"], [\"Equity - Sectoral\", \"Sectoral/Thematic\"], [\"Contra\", \"Contra Fund\"], [\"Focused Fund\", \"Focused Fund\"], [\"Value\", \"Value Fund\"], [\"RGESS\", \"RGESS\"], [\"Equity - Other\", \"Equity - Other\"]"

library(stringr)

# The lookbehind anchors on the opening `["`; then match letters, spaces, "&" and "-"
unlist(str_extract_all(x, "(?<=\\[\")[A-Za-z &-]*"))
#>  [1] "Multi & Flexi-Cap" "Large-Cap"         "Large & Mid-Cap"  
#>  [4] "Mid-Cap"           "Small-Cap"         "ELSS"             
#>  [7] "Dividend Yield"    "Equity - Sectoral" "Contra"           
#> [10] "Focused Fund"      "Value"             "RGESS"            
#> [13] "Equity - Other"

Created on 2021-08-27 by the reprex package (v2.0.0)

Upvotes: 3

anymous.asker

Reputation: 1259

You can use a regular expression:

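# Split into one piece per bracket group, then keep the text between the first pair of quotes
gsub("\\[\"([^\"]+)\".*", "\\1", strsplit(x, "\\], ")[[1]])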
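
If only the first three groups are needed, as in the question, one possible base R sketch uses regmatches() with a Perl-style lookbehind. It assumes every group opens with `["` and that the first element contains no embedded double quotes. The name `first_elems` is introduced here for illustration, and `x` is shortened to four groups to keep the example compact.

```r
x <- "[\"Multi & Flexi-Cap\", \"Multi & Flexi Cap Fund\"], [\"Large-Cap\", \"Large Cap Fund\"], [\"Large & Mid-Cap\", \"Large & Mid Cap Fund\"], [\"Mid-Cap\", \"Mid Cap Fund\"]"

# Match the text that follows each `["` up to (not including) the next double quote
first_elems <- regmatches(x, gregexpr("(?<=\\[\")[^\"]+", x, perl = TRUE))[[1]]

# Keep only the first three groups
head(first_elems, 3)
#> [1] "Multi & Flexi-Cap" "Large-Cap"         "Large & Mid-Cap"
```

Because the pattern stops at the closing quote rather than at a fixed character class, it should also handle first elements containing digits or other punctuation, as long as they contain no double quote.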

Upvotes: 1
