Recology

Reputation: 137

Partial grouping inside a dataframe in R

For statistical analysis, I would like to regroup some rows of a data frame based on their values.

What I have:

number latitude
30 57
12 59
01 68
12 66
101 55
47 61
05 60
288 67

For example, I would like to regroup every latitude of 66 or above (66, 67 and 68) into a single category, 66+. The desired output would look like this:

number latitude new
30 57 57
12 59 59
01 68 66+
12 66 66+
101 55 55
47 61 61
05 60 60
288 67 66+

I do not want to use a loop with if statements because I feel that it is not really R-friendly. I would also like to keep the original column so that I can try different groupings later on.

Thank you very much.

Upvotes: 0

Views: 43

Answers (3)

akrun

Reputation: 887098

We can use

df1$new <- df1$latitude
df1$new[df1$latitude >= 66] <- "66+"

or with ifelse

df1$new <- with(df1, ifelse(latitude >= 66, "66+", latitude))

-output

> df1
  number latitude new
1     30       57  57
2     12       59  59
3      1       68 66+
4     12       66 66+
5    101       55  55
6     47       61  61
7      5       60  60
8    288       67 66+

Also, as @Mael commented, the 'new' column becomes character in the approaches above. If we want to preserve the numeric type, we can use pmin instead, which caps the values at 66:

library(dplyr)
df1 %>%
    mutate(new = pmin(latitude, 66))
   number latitude new
1     30       57  57
2     12       59  59
3      1       68  66
4     12       66  66
5    101       55  55
6     47       61  61
7      5       60  60
8    288       67  66
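
To make the type difference concrete, here is a minimal base R sketch. It assumes the example data from the question, with `number` stored as numeric; the names `new_label` and `new_capped` are only illustrative.

```r
# Example data, assumed to match the question
df1 <- data.frame(number = c(30, 12, 1, 12, 101, 47, 5, 288),
                  latitude = c(57, 59, 68, 66, 55, 61, 60, 67))

# Label version: mixing "66+" with numbers coerces the result to character
new_label <- ifelse(df1$latitude >= 66, "66+", df1$latitude)
class(new_label)

# Capped version: pmin() keeps the result numeric
new_capped <- pmin(df1$latitude, 66)
class(new_capped)
```

The character version is convenient for tables and plots with discrete groups, while the numeric version can still be used in arithmetic or models.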

Upvotes: 2

Quinten

Reputation: 41265

Another option, using mutate and ifelse:

library(dplyr)
df %>%
  mutate(new = ifelse(latitude >= 66, "66+", latitude))

Output:

  number latitude new
1     30       57  57
2     12       59  59
3     01       68 66+
4     12       66 66+
5    101       55  55
6     47       61  61
7     05       60  60
8    288       67 66+

Data

df <- data.frame(number = c("30","12","01","12","101","47","05","288"),
                 latitude = c(57,59,68,66,55,61,60,67))
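
Since the question mentions trying different groupings later on, the same idea could be wrapped in a small function. This is only a sketch: `group_above` and the column names `new_66` and `new_61` are hypothetical, not part of any package.

```r
# Hypothetical helper: label every value at or above `cutoff` as "<cutoff>+",
# and keep the other values as character strings
group_above <- function(x, cutoff) {
  ifelse(x >= cutoff, paste0(cutoff, "+"), as.character(x))
}

df <- data.frame(number = c("30", "12", "01", "12", "101", "47", "05", "288"),
                 latitude = c(57, 59, 68, 66, 55, 61, 60, 67))

# The original latitude column stays untouched; each cutoff gets its own column
df$new_66 <- group_above(df$latitude, 66)
df$new_61 <- group_above(df$latitude, 61)
```

Keeping each grouping in its own column makes it easy to compare cutoffs side by side.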

Upvotes: 1

Anurag N. Sharma

Reputation: 372

library(tidyverse)

tribble(~"number",  ~"latitude",
        30, 57,
        12, 59,
        01, 68,
        12, 66,
        101,55,
        47, 61,
        05, 60,
        288,67) %>% 
  dplyr::mutate(
    new = if_else(latitude >= 66,
                  "66+",
                  as.character(latitude)))

Upvotes: 1
