Reputation: 569
In a data table, all the cells are numeric, and what I want to do is replace each number with a string like this:
Numbers in [0,2]: replace them with the string "Bad"
Numbers in [3,4]: replace them with the string "Good"
Numbers > 4 : replace them with the string "Excellent"
Here's an example of my original table called "data.active":
My attempt to do that is this:
x <- c("churches","resorts","beaches","parks","Theatres",.....)
for(i in x){
data.active$i <- as.character(data.active$i)
data.active$i[data.active$i <= 2] <- "Bad"
data.active$i[data.active$i >2 && data.active$i <=4] <- "Good"
data.active$i[data.active$i >4] <- "Excellent"
}
But it doesn't work. Is there any other way to do this?
EDIT
Here's the link to my dataset GoogleReviews_Dataset and here's how I got the table in the image above:
library(FactoMineR)
library(factoextra)
data<-read.csv2(file.choose())
data.active <- data[1:10, 4:8]
Upvotes: 1
Views: 2631
Reputation: 15065
You can use the tidyverse's mutate-across combination to condition on the ranges:
library(tidyverse)
df <- tibble(
  x = 1:5,
  y = c(1L, 2L, 2L, 2L, 3L),
  z = c(1L, 3L, 3L, 3L, 2L),
  a = c(1L, 5L, 6L, 4L, 8L),
  b = c(1L, 3L, 4L, 7L, 1L)
)
df %>% mutate(
  across(
    .cols = everything(),
    .fns = ~ case_when(
      .x <= 2 ~ 'Bad',
      (.x > 2) & (.x <= 4) ~ 'Good',
      (.x > 4) ~ 'Excellent',
      TRUE ~ as.character(.x)  # fallback for anything not matched above (e.g. NA)
    )
  )
)
The .x above represents the column currently being processed (across accepts a purrr-style lambda). The lower bound of the 'Good' range is written as .x > 2 so that a value of 3 is labelled 'Good', as the question asks. This results in
# A tibble: 5 x 5
  x         y     z     a         b
  <chr>     <chr> <chr> <chr>     <chr>
1 Bad       Bad   Bad   Bad       Bad
2 Bad       Bad   Good  Excellent Good
3 Good      Bad   Good  Excellent Good
4 Good      Bad   Good  Good      Excellent
5 Excellent Good  Bad   Excellent Bad
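To make the purrr-style shorthand less mysterious: the formula `~ ...` is just a compact way to write an anonymous function whose first argument is `.x`. The sketch below assumes dplyr is installed and follows the ranges from the question; `label_rating` is a hypothetical helper name, not something from the original answer.

```r
library(dplyr)

# Hypothetical helper: the same rules written as an ordinary named function.
label_rating <- function(v) {
  case_when(
    v <= 2 ~ "Bad",
    v <= 4 ~ "Good",       # only reached when v > 2
    v > 4  ~ "Excellent"
  )
}

df <- tibble(x = 1:5, a = c(1L, 5L, 6L, 4L, 8L))

# Lambda form: .x stands for each selected column in turn.
with_lambda <- df %>% mutate(across(everything(), ~ label_rating(.x)))

# Plain function form: across() calls label_rating on each column.
with_function <- df %>% mutate(across(everything(), label_rating))

identical(with_lambda, with_function)
```

Pulling the rules into a named function can be handy if the same labelling is needed in more than one place.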
To change only selected columns, pass a selection to the .cols parameter of across:
df %>% mutate(
  across(
    .cols = c('a', 'x', 'b'),
    .fns = ~ case_when(
      .x <= 2 ~ 'Bad',
      (.x > 2) & (.x <= 4) ~ 'Good',
      (.x > 4) ~ 'Excellent',
      TRUE ~ as.character(.x)  # fallback for anything not matched above (e.g. NA)
    )
  )
)
This yields
# A tibble: 5 x 5
  x             y     z a         b
  <chr>     <int> <int> <chr>     <chr>
1 Bad           1     1 Bad       Bad
2 Bad           2     3 Excellent Good
3 Good          2     3 Excellent Good
4 Good          2     3 Good      Excellent
5 Excellent     3     2 Excellent Bad
Upvotes: 2
Reputation: 13125
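# cut() bins each column into (-Inf,2], (2,4] and (4,Inf].
# It returns a factor; wrap the call in as.character() if plain strings are needed.
cols <- c('x', 'y', 'z')
df[cols] <- lapply(df[cols], function(col)
  cut(col, breaks = c(-Inf, 2, 4, Inf), labels = c('Bad', 'Good', 'Excellent')))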
Data
df<-structure(list(x = 1:5, y = c(1L, 2L, 2L, 2L, 3L), z = c(1L,3L, 3L, 3L, 2L),
a = c(1L, 5L, 6L, 4L, 8L),b = c(1L, 3L, 4L, 7L, 1L)),
class = "data.frame", row.names = c(NA, -5L))
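For completeness, the loop from the question can also be made to work in base R. Three things appear to trip it up: `data.active$i` looks for a column literally named "i" (use `[[i]]` instead), `&&` is not vectorized (use `&`), and converting to character before comparing turns the comparisons into string comparisons. A minimal sketch, assuming numeric columns; the small `ratings` data frame is made up for illustration, and values between 2 and 3 are treated as "Good", which the question leaves unspecified.

```r
# Made-up data for illustration; column names borrowed from the question.
ratings <- data.frame(churches = c(0, 2.5, 5), parks = c(3, 4.2, 1))

cols <- c("churches", "parks")
for (i in cols) {
  v <- ratings[[i]]                      # [[i]] uses the value of i; $i would look for a column named "i"
  out <- rep(NA_character_, length(v))   # missing values stay NA
  out[v <= 2] <- "Bad"
  out[v > 2 & v <= 4] <- "Good"          # & is vectorized, && is not
  out[v > 4] <- "Excellent"
  ratings[[i]] <- out                    # replace the column only after all comparisons are done
}
ratings
```

Comparing against the untouched numeric vector `v` and assigning the result at the end avoids comparing values that have already been overwritten with labels.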
Upvotes: 1