Anna Bokun

Reputation: 55

Generate new variable based on conditions

I have a data frame, df, with state and year information. My goal is to generate a new variable, state_year, so that Alabama in 1982 is assigned 1, Alabama in 1983 is assigned 2, Alabama in 1984 is assigned 3, and so on.

When I try the following, I get "TRUE" for the right cases, but I want it to say "1" (and then subsequently 2 for AL in 1983, etc).

test <- df %>%
    mutate(state_year = statefip == 1 & year == 1982)


Upvotes: 0

Views: 38

Answers (2)

Ronak Shah

Reputation: 389045

Within each state, you can convert year to a factor and then to an integer, which numbers that state's years 1, 2, 3, ... in sorted order.

library(dplyr)
df %>%
  group_by(state) %>%
  mutate(state_year = as.integer(factor(year)))
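To see what the factor-to-integer step does, here is a minimal sketch on made-up data (the toy df below is an assumption, not the asker's data). It uses base R's ave() as a stand-in for group_by() plus mutate(), so it runs without any packages.

```r
# Toy data: hypothetical values, not the asker's df
df <- data.frame(
  state = c("Alabama", "Alabama", "Alabama", "Alaska", "Alaska"),
  year  = c(1982, 1983, 1984, 1982, 1983)
)

# factor() sorts the distinct years into levels;
# as.integer() then returns each year's position among those levels
as.integer(factor(c(1982, 1983, 1984)))

# ave() applies the function separately within each state,
# which mirrors group_by(state) followed by mutate()
df$state_year <- ave(df$year, df$state,
                     FUN = function(y) as.integer(factor(y)))
df
```

Note that the numbering counts distinct years, not calendar distance: a state observed only in 1982 and 1984 would get 1 and 2, not 1 and 3.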

If we want a unique number for each state-year combination, we can paste state and year together, then convert the result to a factor and then to an integer.

df %>%
  mutate(state_year = paste0(state, year), 
         state_year = as.integer(factor(state_year, levels = unique(state_year))))

Upvotes: 1

akrun

Reputation: 887221

We could group by 'state' and get the ids by applying rleid to 'statefip' and 'year' (assuming the rows are already ordered by those columns)

library(data.table)
setDT(df)[, state_year := rleid(statefip, year), state]

Or with dplyr

library(dplyr)
library(stringr)
df %>%
    mutate(state_year = str_c(statefip, year)) %>%
    group_by(state) %>%
    mutate(state_year = match(state_year, unique(state_year)))
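The match(x, unique(x)) idiom numbers values in order of first appearance, whereas as.integer(factor(x)) numbers them in sorted order. A small sketch on a made-up, deliberately unsorted vector (the values are an assumption for illustration) shows the difference:

```r
# Hypothetical, deliberately unsorted years
years <- c(1984, 1982, 1983, 1982)

# Ids follow the order in which each value first appears
by_first_seen <- match(years, unique(years))

# Ids follow the sorted order of the distinct values
by_sorted <- as.integer(factor(years))

by_first_seen
by_sorted
```

When the rows are already sorted by year within each state, the two numberings agree, which is why the ordering assumption matters here.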

Upvotes: 1
