Nirshad Nijam

Reputation: 182

How to get the average from an array-like string in R?

I have this string variable.

x <- "[2,3,3,5]"

I want to get the average of the numbers in it. How can I achieve this in R?

Upvotes: 3

Views: 90

Answers (6)

user438383

Reputation: 6206

Split the string on the commas, strip the brackets, convert to numeric, and then take the mean:

library(stringr)
library(dplyr)

str_split(x, ",")[[1]] %>% 
    str_remove_all("\\[|\\]") %>% 
    as.numeric %>% 
    mean
[1] 3.25

Upvotes: 2

Carl Witthoft

Reputation: 21532

[This is posted as an answer rather than a comment because I'm demonstrating the values that the incorrect answers return.] At least two of the posted answers are incorrect, or at least correct only in the infamous "Microsoft Answer" way. Consider:

x <- "[2,37,1, -45]"
Rgames> mean(as.numeric(strsplit(x, '\\D')[[1]]), na.rm = TRUE)
[1] 21.25
Rgames> mean(as.numeric(str_extract_all(x, "[0-9]")[[1]]))
[1] 3.666667
Rgames> mean(c(2,37,1,-45))
[1] -1.25

If you want to extract both positive and negative integers, not to mention integers with more than one digit, or floats such as x <- "[4.8,-65]", you will need considerably better regex-fu.

As such, Ricardo Semião's answer is highly preferable:

Rgames> stringr::str_replace_all(x, c("\\[" = "c\\(", "\\]" = "\\)")) %>% parse(text = .) %>% eval() %>% mean()
[1] -1.25
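
For completeness, here is a rough sketch of what the "better regex-fu" mentioned above might look like in base R. The pattern is my own assumption, not something taken from the other answers: it allows an optional minus sign, one or more digits, and an optional decimal part. It does not attempt to handle scientific notation such as 1e5.

```r
x <- "[4.8,-65]"

# Assumed pattern: optional "-", digits, optional ".digits"
pattern <- "-?[0-9]+(\\.[0-9]+)?"

# gregexpr() locates every match; regmatches() pulls the matched text out
nums <- as.numeric(regmatches(x, gregexpr(pattern, x))[[1]])

mean(nums)
#> [1] -30.1
```

Unlike splitting on '\\D' or extracting single digits, this keeps the sign and the decimal point attached to each number.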

Upvotes: 1

ThomasIsCoding

Reputation: 102201

We can turn the [] into () and prefix a c to make a valid expression string in R, and then eval it, e.g.,

> mean(eval(str2lang(paste0("c", chartr("[]", "()", x)))))
[1] 3.25

Or use scan + substr:

> mean(scan(text = substr(x, 2, nchar(x) - 1), sep = ",", quiet = TRUE))
[1] 3.25

Or, similarly, we can try py_eval to parse the string as a Python expression:

> library(reticulate)

> mean(py_eval(x))
[1] 3.25

Upvotes: 1

jay.sf

Reputation: 73242

This looks like JSON, so it can be parsed as such:

mean(jsonlite::fromJSON(x))
# [1] 3.25

Data:

x <- "[2,3,3,5]"

Upvotes: 4

Ricardo Semião

Reputation: 4456

You can also change the "[" into "c(" and the "]" into ")", and then ask R to interpret the string as an expression with parse and eval:

library(magrittr)  # provides the %>% pipe used below

stringr::str_replace_all(x, c("\\[" = "c\\(", "\\]" = "\\)")) %>% parse(text = .) %>% eval() %>% mean()

Upvotes: 0

Allan Cameron

Reputation: 174238

In base R:

mean(as.numeric(strsplit(x, '\\D')[[1]]), na.rm = TRUE)
#> [1] 3.25
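
One caveat: splitting on '\\D' treats every non-digit as a separator, including minus signs and decimal points, so this only works for non-negative integers. A minimal base R sketch that strips just the brackets and splits on commas might look like the following. The input here is a made-up example with a negative number, and the variable names are only illustrative.

```r
x <- "[2,37,1, -45]"

# Remove only the square brackets, leaving signs and decimal points intact
inner <- gsub("\\[|\\]", "", x)

# Split on commas; as.numeric() tolerates the stray leading space in " -45"
vals <- as.numeric(strsplit(inner, ",")[[1]])

mean(vals)
#> [1] -1.25
```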

Upvotes: 5
