Adding dummy rows to your data in R

Question

I am working with R. I have a data set that looks like this.

structure(
  list(
    category = c(
      "1",
      "1",
      "1",
      "1",
      "1",
      "1",
      "1",
      "1",
      "2",
      "2",
      "2",
      "2"
    ),
    Cue = c(
      "Kind",
      "love",
      "acount",
      "bright",
      "smart",
      "land",
      "flag",
      "affect",
      "place",
      "street",
      "sun",
      "cold"
    ),
    categoryProduced = c(
      "1",
      "1",
      "1",
      "1",
      "1",
      "1",
      "1",
      "1",
      "1",
      "1",
      "1",
      "1"
    ),
    mean = c(3, 1, 1, 1, 2, 4, 4,
             4, 4, 2, 4, 2)
  ),
  row.names = c(NA, -12L),
  groups = structure(
    list(
      category = c(
        "1",
        "1",
        "1",
        "1",
        "1",
        "1",
        "1",
        "1",
        "2",
        "2",
        "2",
        "2"
      ),
      Cue = c(
        "Kind",
        "love",
        "acount",
        "bright",
        "smart",
        "land",
        "flag",
        "affect",
        "place",
        "street",
        "sun",
        "cold"
      ),
      .rows = structure(
        list(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
             9L, 10L, 11L, 12L),
        ptype = integer(0),
        class = c("vctrs_list_of",
                  "vctrs_vctr", "list")
      )
    ),
    row.names = c(NA, 12L),
    class = c("tbl_df",
              "tbl", "data.frame"),
    .drop = TRUE
  ),
  class = c("grouped_df",
            "tbl_df", "tbl", "data.frame")
)

I need to reach 20 observations for each category. Right now, I have 8 observations in the first category and 4 in the second, so one needs 12 more rows and the other needs 16.

I need the resulting data to look like this:

   category Cue    categoryProduced mean
1  1        Kind   1                3
2  1        love   1                1
3  1        acount 1                1
4  1        bright 1                1
5  1        smart  1                2
6  1        land   1                4
7  1        flag   1                4
8  1        affect 1                4
9  1        NA     0                0
10 1        NA     0                0
11 1        NA     0                0
12 2        place  1                4
13 2        street 1                2
14 2        sun    1                4
15 2        cold   1                2
16 2        NA     0                0
17 2        NA     0                0

And so on, until I have 20 observations for each category. What is written under Cue could be anything; what I really need is a 0 under categoryProduced and mean.

Any help with this would be great. Thanks.

Ronak Shah · Accepted Answer

With the help of tidyr::complete you can do this:

library(dplyr)
library(tidyr)

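df %>%
  group_by(category) %>%
  # number the existing rows within each category
  mutate(count = row_number()) %>%
  # add the missing row numbers up to 20; categoryProduced is a character
  # column, so its fill value is the string "0"
  complete(count = 1:20, fill = list(categoryProduced = "0", mean = 0)) %>%
  ungroup()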

# A tibble: 40 x 5
#   category count Cue    categoryProduced  mean
#   <chr>    <int> <chr>  <chr>            <dbl>
# 1 1            1 Kind   1                    3
# 2 1            2 love   1                    1
# 3 1            3 acount 1                    1
# 4 1            4 bright 1                    1
# 5 1            5 smart  1                    2
# 6 1            6 land   1                    4
# 7 1            7 flag   1                    4
# 8 1            8 affect 1                    4
# 9 1            9 NA     0                    0
#10 1           10 NA     0                    0
# … with 30 more rows

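The count column holds the row number within each category, from 1 to 20. complete() adds a row for every row number that is missing, leaving Cue as NA and filling categoryProduced and mean with 0. If you don't need count in the final output, drop it with select(-count).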
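
To check that the padding does what is asked, here is a minimal, self-contained sketch that drops count and confirms each category ends up with 20 rows. The small df below is a made-up, shortened stand-in for the question's data (3 rows in category 1, 2 rows in category 2), and padded is just an illustrative name; neither comes from the original post.

```r
library(dplyr)
library(tidyr)

# Hypothetical, shortened stand-in for the question's data
df <- tibble(
  category         = c("1", "1", "1", "2", "2"),
  Cue              = c("Kind", "love", "acount", "place", "street"),
  categoryProduced = c("1", "1", "1", "1", "1"),
  mean             = c(3, 1, 1, 4, 2)
)

padded <- df %>%
  group_by(category) %>%
  mutate(count = row_number()) %>%          # row number within each category
  complete(count = 1:20,                    # add the missing row numbers
           fill = list(categoryProduced = "0", mean = 0)) %>%
  ungroup() %>%
  select(-count)                            # helper column no longer needed

# Rows per category after padding
padded %>% count(category)
```

The added rows have NA in Cue, which matches the question's statement that Cue can hold anything. If you want a different target size, change 1:20 accordingly.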
