Mel

Reputation: 510

Creating a data frame with repeated rows

I want to create a data frame with rows that repeat.

Here is my original dataset:

> mtcars_columns_a
  variables_interest data_set data_set_and_variables_interest      mean
1                mpg   mtcars                      mtcars$mpg  20.09062
2               disp   mtcars                     mtcars$disp 230.72188
3                 hp   mtcars                       mtcars$hp 146.68750

Here is my desired dataset:

> mtcars_columns_b
  variables_interest data_set data_set_and_variables_interest      mean
1                mpg   mtcars                      mtcars$mpg  20.09062
2                mpg   mtcars                      mtcars$mpg  20.09062
3               disp   mtcars                     mtcars$disp 230.72188
4               disp   mtcars                     mtcars$disp 230.72188
5                 hp   mtcars                       mtcars$hp 146.68750
6                 hp   mtcars                       mtcars$hp 146.68750

I know how to do this the long way manually, but that is time-consuming and rigid. Is there a quicker way that is more automated and flexible?



Here is the code I used to create the dataset:

# mtcars data

## displays data
mtcars

## 3 row data set

### lists columns of interest
# ---- NOTE: REQUIRES MANUAL INPUT
# ---- NOTE: lists variables of interest
mtcars_columns_a <- 
  data.frame(
    c(
      "mpg",
      "disp",
      "hp"
    )
  )
# ---- NOTE: REQUIRES MANUAL INPUT
# ---- NOTE: adds colnames
names(mtcars_columns_a)[names(mtcars_columns_a) == 'c..mpg....disp....hp..'] <- 'variables_interest'

### adds data set info
mtcars_columns_a$data_set <- 
  c("mtcars")

### creates data_set_and_variables_interest column
mtcars_columns_a$data_set_and_variables_interest <- 
  paste(mtcars_columns_a$data_set,mtcars_columns_a$variables_interest,sep = "$")

### creates mean column
mtcars_columns_a$mean <-
  c(
    mean(mtcars$mpg),
    mean(mtcars$disp),
    mean(mtcars$hp)
  )

## 6 row data set, the long way

### lists columns of interest
# ---- NOTE: REQUIRES MANUAL INPUT
# ---- NOTE: lists variables of interest
mtcars_columns_b <- 
  data.frame(
    c(
      "mpg",
      "mpg",
      "disp",
      "disp",
      "hp",
      "hp"
    )
  )
# ---- NOTE: REQUIRES MANUAL INPUT
# ---- NOTE: adds colnames
names(mtcars_columns_b)[names(mtcars_columns_b) == 'c..mpg....mpg....disp....disp....hp....hp..'] <- 'variables_interest'

### adds data set info
mtcars_columns_b$data_set <- 
  c("mtcars")

### creates data_set_and_variables_interest column
mtcars_columns_b$data_set_and_variables_interest <- 
  paste(mtcars_columns_b$data_set,mtcars_columns_b$variables_interest,sep = "$")

### creates mean column
mtcars_columns_b$mean <-
  c(
    mean(mtcars$mpg),
    mean(mtcars$mpg),
    mean(mtcars$disp),
    mean(mtcars$disp),
    mean(mtcars$hp),
    mean(mtcars$hp)
  )

Upvotes: 0

Views: 68

Answers (4)

akrun

Reputation: 886938

Another option is tidyr::uncount(), which repeats each row a given number of times:

library(dplyr)
library(tidyr)
mtcars_columns_a %>%
   uncount(2)
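
If the number of copies should differ per row, uncount() can also take a column of counts. This is only a minimal sketch: the n_copies column and the "copy" id column are hypothetical names made up for illustration, not part of the original data.

```r
library(tidyr)

# Hypothetical input: one row per variable plus a made-up count column.
df <- data.frame(
  variables_interest = c("mpg", "disp", "hp"),
  n_copies = c(2, 1, 3)
)

# Repeat each row n_copies times; the count column is dropped by default,
# and .id adds a running index of the copies within each original row.
out <- uncount(df, n_copies, .id = "copy")
out
```

With a constant such as uncount(2), as above, every row is simply repeated the same number of times.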

Upvotes: 3

ThomasIsCoding

Reputation: 101024

You can use rep() to repeat the row indices and then subset with them, like below:

mtcars_columns_a[rep(seq_len(nrow(mtcars_columns_a)), each = 2), ]
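
For reference, here is a self-contained base R sketch of the same idea. It rebuilds the three-row table in one data.frame() call (an assumed shortcut, not the asker's original code) and uses a hypothetical n_copies variable so the repeat count is easy to change.

```r
# Rebuild the three-row summary from the question (base R only).
vars <- c("mpg", "disp", "hp")
mtcars_columns_a <- data.frame(
  variables_interest = vars,
  data_set = "mtcars",
  data_set_and_variables_interest = paste("mtcars", vars, sep = "$"),
  mean = unname(sapply(mtcars[vars], mean))
)

# Repeat each row index n_copies times, then index the data frame with it.
n_copies <- 2  # hypothetical name, change as needed
idx <- rep(seq_len(nrow(mtcars_columns_a)), each = n_copies)
mtcars_columns_b <- mtcars_columns_a[idx, ]

# Duplicated indices give row names such as "1", "1.1"; reset them to 1..6.
rownames(mtcars_columns_b) <- NULL
mtcars_columns_b
```

Using each = keeps the copies of a row next to each other; times = would instead repeat the whole block of rows.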

Upvotes: 2

bcarlsen

Reputation: 1441

The order of records in a data.frame object is usually not meaningful, so you could just do:

rbind(mtcars_columns_a, mtcars_columns_a)

If you need it to be in the order you showed, this is also simple:

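mtcars_columns_b <- rbind(mtcars_columns_a, mtcars_columns_a)
# order rows by each variable's position in the original table (mpg, disp, hp)
mtcars_columns_b[order(match(mtcars_columns_b$variables_interest, mtcars_columns_a$variables_interest)), ]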

Upvotes: 2

Peter

Reputation: 12699

Based on your expected output, is this the sort of thing you were after?

The required variables are selected with select(), reshaped to long format with pivot_longer(), and the mean of each is calculated with summarise() after group_by().

The data is duplicated with bind_rows(), and the additional variables (not really sure if these are necessary) are added with mutate().

You can edit variable names using the dplyr::rename function.

library(dplyr)
library(tidyr)


df <- 
  mtcars %>% 
  select(mpg, disp, hp) %>% 
  pivot_longer(everything()) %>% 
  group_by(name) %>% 
  summarise(mean = mean(value))

df1 <- 
  bind_rows(df, df) %>% 
  arrange(name) %>% 
  mutate(dataset = "mtcars",
         variable = paste(dataset, name, sep = "$"))

df1
#> # A tibble: 6 x 4
#>   name   mean dataset variable   
#>   <chr> <dbl> <chr>   <chr>      
#> 1 disp  231.  mtcars  mtcars$disp
#> 2 disp  231.  mtcars  mtcars$disp
#> 3 hp    147.  mtcars  mtcars$hp  
#> 4 hp    147.  mtcars  mtcars$hp  
#> 5 mpg    20.1 mtcars  mtcars$mpg 
#> 6 mpg    20.1 mtcars  mtcars$mpg
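
As a rough sketch of the renaming step mentioned above, the following maps the column names to the ones used in the question. The df1 built here is a plain data.frame stand-in for the tibble above, and df1_renamed is a hypothetical name.

```r
library(dplyr)

# Stand-in for df1 above, computed directly from mtcars.
df1 <- data.frame(
  name = rep(c("disp", "hp", "mpg"), each = 2),
  mean = rep(c(mean(mtcars$disp), mean(mtcars$hp), mean(mtcars$mpg)), each = 2),
  dataset = "mtcars"
)
df1$variable <- paste(df1$dataset, df1$name, sep = "$")

# rename() takes new_name = old_name pairs; select() then reorders the columns.
df1_renamed <- df1 %>%
  rename(
    variables_interest = name,
    data_set = dataset,
    data_set_and_variables_interest = variable
  ) %>%
  select(variables_interest, data_set, data_set_and_variables_interest, mean)

df1_renamed
```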

Created on 2021-04-06 by the reprex package (v1.0.0)

Upvotes: 2
