Reputation: 59
This is actually a series of questions about referring to values by their character names in R. I will add more bullets if I recall other related questions that seem interesting. To keep things simple, I use some made-up examples to explain my questions. Hope this helps:
Suppose I am building a set of datasets in a for loop and want to output a series of objects whose names are stored in a vector called name_list = c("a", "b", "c", "d", "e", "f").
In the loop we would like to define them as
for(i in 1:4){
a <- data[data$Year == 2010,]
b <- unique(data$Name)
c <- summarise(group_by(data,Year,Name), avg = mean(quantity))
...
f <- left_join(data, data1, by = c("Year", "Name"))
}
Is there any function that allows me to use function(name_list[1]) through function(name_list[6]) to replace a through f in the for loop? The same question applies when trying to create columns from column names stored as strings in some tables/data frames inside a chunk of code. (The as.name and noquote functions work when just referencing the vector/dataset but don't work when assigning values to the target variable. If possible, could anyone share why this happens?)
When we extract information from SQL or other data sources, we might get several values separated by commas or some other delimiter in a single variable. How can we test whether a certain value is among the values separated by commas? See the example below:
1567 %in% c(1567,1456,123)
TRUE
a <- "c(1567,1456,123)"
noquote(a)
c(1567,1456,123)
1567 %in% noquote(a)
FALSE
1567 %in% list(noquote(a))
FALSE
b <- "1567,1456,123"
noquote(b)
1567,1456,123
1567 %in% noquote(strsplit(b,","))
FALSE
1567 %in% list(noquote(strsplit(b,",")))
FALSE
I kind of get why %in% doesn't work here: it seems R is taking 1567,1456,123 as one element. So I used strsplit to separate them, but that still doesn't work. Is there any way to get R to treat the string as a command?
Upvotes: 1
Views: 138
Reputation: 13903
If all you need to do is convert comma-separated lists like "1567,1456,123" into R vectors like c(1567, 1456, 123), you definitely do not need to wrap them in c(...) and try to evaluate them directly as vectors. You should just use strsplit to split the data:
data_str <- "1567,1456,123"
# strsplit returns a list; [[1]] extracts the single character vector
data_vec <- as.integer(strsplit(data_str, ",")[[1]])
stopifnot(1567 %in% data_vec)
Note that strsplit returns a list, because it can also split character vectors of length greater than one:
stopifnot(
all.equal(
list(c("a", "b"), c("x", "y")),
strsplit(c("a,b", "x,y"), ",")) == TRUE)
which makes it useful for operating on columns of SQL output:
| id | concatenated_field |
|----|--------------------|
| 1 | 5362,395,9000,7 |
| 2 | 319,75624,63 |
(etc.)
d <- data.frame(
id = c(1, 2),
concatenated_field = c("5362,395,9000,7", "319,75624,63"))
d$split_field <- strsplit(d$concatenated_field, ",")
sapply(d, class)
# id concatenated_field split_field
# "numeric" "character" "list"
d$split_field[[1]]
# [1] "5362" "395" "9000" "7"
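To connect this back to the original membership question, here is a minimal sketch of testing, row by row, whether a value is among the split values. The helper name contains_value and the has_395 column are made up for illustration, and the split pieces are converted to numbers first so that %in% compares like with like:

```r
# Rebuild the example data frame so this block runs on its own.
d <- data.frame(
  id = c(1, 2),
  concatenated_field = c("5362,395,9000,7", "319,75624,63"),
  stringsAsFactors = FALSE)

# Split each row's string and convert the pieces to numbers.
d$split_field <- lapply(
  strsplit(d$concatenated_field, ",", fixed = TRUE),
  as.numeric)

# Hypothetical helper: one TRUE/FALSE per row, TRUE when that row's
# vector of values contains `value`.
contains_value <- function(split_col, value) {
  vapply(split_col, function(x) value %in% x, logical(1))
}

d$has_395 <- contains_value(d$split_field, 395)
print(d$has_395)
```

vapply is used rather than sapply so the result is guaranteed to be a logical vector with one entry per row.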
Alternatively, if you're reading in one big stream of comma-separated data, you can use scan:
data_vec <- scan(
what = 0, # arcane way to say "expect numeric input"
sep = ",",
text = "1,2,3,4,5,6,7,8,9,10")
stopifnot(all.equal(data_vec, 1:10) == TRUE)
scan is more heavy-duty than strsplit and can handle more complicated inputs as well, such as data with quoted fields:
weird_data <- scan(what="", sep=",", text='marvin,ruby,"joe,joseph",dean')
print(weird_data)
# [1] "marvin" "ruby" "joe,joseph" "dean"
If you are really really sure you need to be able to accept and evaluate R code passed as an input (this can be VERY DANGEROUS since it means you will be executing arbitrary unverified R code), you can use
r_code_string <- 'list(c("a", "b"), c("x", "y"))'
stopifnot(
all.equal(
list(c("a", "b"), c("x", "y")),
# text = is required; otherwise parse() treats the string as a file name
eval(parse(text = r_code_string))) == TRUE)
parse converts raw text into an unevaluated "expression", which is a representation of R code in the form of a special R object. Pass the string through the text argument, because parse treats its first argument as a file. eval then passes the expression to the interpreter for execution.
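As a rough illustration of the two steps, and of one way to reduce the risk mentioned above, the sketch below only evaluates a string if it looks like a plain c(...) call of integers. The helper names is_plain_int_vector and safe_eval_int_vector are hypothetical, and the regular expression is an assumption about what "safe" input looks like, not a general-purpose sandbox:

```r
code_string <- "c(1567, 1456, 123)"

# Step 1: parse() turns text into an unevaluated expression object.
expr <- parse(text = code_string)
print(class(expr))

# Hypothetical guard: accept only strings of the form c(<int>, <int>, ...).
is_plain_int_vector <- function(s) {
  pattern <- "^c\\([[:space:]]*[0-9]+([[:space:]]*,[[:space:]]*[0-9]+)*[[:space:]]*\\)$"
  grepl(pattern, s)
}

# Step 2: eval() runs the expression, but only after the guard passes.
safe_eval_int_vector <- function(s) {
  stopifnot(is_plain_int_vector(s))
  eval(parse(text = s))
}

values <- safe_eval_int_vector(code_string)
print(1567 %in% values)
```

Anything that does not match the pattern, such as a string containing a function call other than c, is rejected before it reaches eval.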
As for noquote, it doesn't do what you think it does. It doesn't actually modify the string; it just adds a flag (the "noquote" class) to the variable so that it prints without quotation marks. You can emulate this behavior with print(..., quote = FALSE).
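A small sketch may make this concrete. It first checks that noquote leaves the underlying string unchanged, then shows two possible ways to handle the assign-by-name case from the first part of the question: a named list (usually the easier option) and base R's assign and get. The toupper call is only a placeholder for the real per-name computation:

```r
s <- "1567,1456,123"
nq <- noquote(s)

# noquote only attaches a class; the underlying string is untouched.
print(class(nq))
print(identical(unclass(nq), s))

# Option 1: store results in a named list keyed by the strings.
name_list <- c("a", "b", "c")
results <- list()
for (nm in name_list) {
  results[[nm]] <- toupper(nm)  # placeholder for the real computation
}
print(results[["b"]])

# Option 2: create a variable whose name is held in a string.
assign(name_list[1], 42)
print(get(name_list[1]))
```

The named list keeps all the outputs in one object that can be looped over later, whereas assign scatters separate variables through the environment.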
Upvotes: 2