Change multiple values in a dataframe based on two other values

Question

Would anyone mind lending some knowledge? I am trying to build a new data frame from the values in the data frame below.

id   value
ant    10
cat    4
cat    6
dog    5
dog    3
dog    2
fly    9

Going through the ids in order, I want to build a data frame that looks like the following.

Every time a new id appears, it gets its own column. The max value is 10, so there should be 10 rows.
The first id is ant, so every row of the ant column should be 0.
The next column is cat. It has 2 values: the first 4 rows must be 0, followed by 6 rows of 1.
Same logic for dog: the first 5 rows are 0, the next 3 rows are 1, and the last 2 rows are 0.
fly has only 9 rows of 0, so the last row should contain NA.
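
As a minimal sketch of this rule for a single column (dog), assuming the 0/1 values simply alternate from one run to the next, base R's rep() can build the column directly. The names dog_runs, dog_bits and dog_col are made up for illustration.

```r
# Run lengths for dog, taken from the value column: 5, 3, 2
dog_runs <- c(5, 3, 2)

# Alternate 0, 1, 0, ... with one entry per run
dog_bits <- rep(c(0, 1), length.out = length(dog_runs))

# Repeat each entry as many times as its run length
dog_col <- rep(dog_bits, times = dog_runs)
dog_col
```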

It should look like this:

ant  cat  dog  fly
0    0    0    0
0    0    0    0
0    0    0    0
0    0    0    0
0    1    0    0
0    1    1    0
0    1    1    0
0    1    1    0
0    1    0    0
0    1    0    NA

I know how to do this the long way:

newdf <- data.frame(matrix(2, ncol = length(unique(df[,"id"])) , nrow = 10))
newdf$X1[1:10] <- 0
newdf$X2[1:4] <- 0
newdf$X2[5:10] <- 1
...

Is there a more efficient way to do this? My actual data will have roughly 50 rows, which is why I am looking for something better than filling each column by hand.

Ronak Shah · Accepted Answer

Here's a tidyverse answer:

library(dplyr)
library(tidyr)

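df %>%
  group_by(id) %>%
  # alternate 0, 1, 0, ... across the rows of each id
  mutate(val = rep(c(0, 1), length.out = n())) %>%
  # repeat each row `value` times (this drops the value column)
  uncount(value) %>%
  # number the rows within each id
  mutate(row = row_number()) %>%
  # pad every id to 10 rows; the added rows get NA in val
  complete(row = 1:10) %>%
  # one column per id
  pivot_wider(names_from = id, values_from = val) %>%
  select(-row)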

#     ant   cat   dog   fly
#   <dbl> <dbl> <dbl> <dbl>
# 1     0     0     0     0
# 2     0     0     0     0
# 3     0     0     0     0
# 4     0     0     0     0
# 5     0     1     0     0
# 6     0     1     1     0
# 7     0     1     1     0
# 8     0     1     1     0
# 9     0     1     0     0
#10     0     1     0    NA

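For each id, we assign alternating 0, 1 values to its rows and use uncount to repeat each row as many times as its value says. complete then pads every id to 10 rows, filling the missing ones with NA, and pivot_wider reshapes the data to wide format so that each id gets its own column.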
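
For comparison, here is a rough base R sketch of the same idea, not part of the original answer. It assumes the number of rows is the largest entry in value (10 here, as stated in the question) and that no id's runs add up to more than that; longer columns would be silently truncated. The names n_rows, runs, cols and newdf are illustrative.

```r
df <- data.frame(
  id = c("ant", "cat", "cat", "dog", "dog", "dog", "fly"),
  value = c(10, 4, 6, 5, 3, 2, 9)
)

# Assumption: the output has as many rows as the largest value
n_rows <- max(df$value)

# Split the run lengths by id, keeping ids in order of first appearance
runs <- split(df$value, factor(df$id, levels = unique(df$id)))

cols <- lapply(runs, function(v) {
  # Alternate 0, 1, 0, ... per run, then expand each run to its length
  bits <- rep(rep(c(0, 1), length.out = length(v)), times = v)
  # Extending the length pads short columns with NA
  length(bits) <- n_rows
  bits
})

newdf <- as.data.frame(cols)
newdf
```

Unlike the pipeline above, this version derives the row count from the data instead of hard-coding 1:10.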

data

df <- structure(list(id = c("ant", "cat", "cat", "dog", "dog", "dog", 
"fly"), value = c(10, 4, 6, 5, 3, 2, 9)), row.names = c(NA, -7L
), class = "data.frame")
