Reputation: 61
Here is a sample of my dataframe:
df3 <- data.frame(
  Frame = c(219388, 219389, 219390, 211387, 211388, 211389),
  Time = c("2020-06-05 13:26:39", "2020-06-05 13:26:39", "2020-06-05 13:26:39",
           "2020-06-05 13:26:39", "2020-06-05 13:26:39", "2020-06-05 13:26:39"),
  task = c("hop", "hop", "hop", "vj", "vj", "vj"),
  limb = c("L", "L", "L", "R", "R", "R"),
  trial = c("trial1", "trial1", "trial1", "trial2", "trial2", "trial2")
)
I want to add NA rows to the Frame and Time columns at specific positions (the number of NA rows to add will vary in my real dataset). The task, limb, and trial columns need to carry on through the added rows (i.e. hop, L, trial1 continue even on the NA rows). My expected output looks like this:
> df3
 Frame                Time task limb  trial
219388 2020-06-05 13:26:39  hop    L trial1
219389 2020-06-05 13:26:39  hop    L trial1
219390 2020-06-05 13:26:39  hop    L trial1
    NA                  NA  hop    L trial1
    NA                  NA  hop    L trial1
    NA                  NA  hop    L trial1
211387 2020-06-05 13:26:39   vj    R trial2
211388 2020-06-05 13:26:39   vj    R trial2
211389 2020-06-05 13:26:39   vj    R trial2
    NA                  NA   vj    R trial2
    NA                  NA   vj    R trial2
I've tried insertRows from the berryFunctions package, but it sets the whole row to NA, and I need the task, limb, and trial columns to continue.
insertRows(df3, r=c(3:5), new=NA, rcurrent=FALSE)
Any help or suggestions would be much appreciated, thank you!
Upvotes: 1
Views: 1227
Reputation: 887941
We could use group_split on the 'task' to 'trial' columns to split the data into a list of data.frames, then loop over the list with map2: slice the first row, convert 'Frame' and 'Time' to NA, expand the rows with uncount using the replication values passed to map2, and bind the result to the original chunk with bind_rows. Because we are using map2_dfr, the list is row-bound into a single data.frame.
library(dplyr) #1.0.0
library(purrr)
library(tidyr)
df3 %>%
  group_split(across(task:trial)) %>%
  map2_dfr(c(3, 2), ~
    slice(.x, 1) %>%
      mutate(across(Frame:Time, ~NA)) %>%
      uncount(.y) %>%
      bind_rows(.x, .))
# A tibble: 11 x 5
#    Frame Time                task  limb  trial
#    <dbl> <chr>               <chr> <chr> <chr>
# 1 219388 2020-06-05 13:26:39 hop   L     trial1
# 2 219389 2020-06-05 13:26:39 hop   L     trial1
# 3 219390 2020-06-05 13:26:39 hop   L     trial1
# 4     NA <NA>                hop   L     trial1
# 5     NA <NA>                hop   L     trial1
# 6     NA <NA>                hop   L     trial1
# 7 211387 2020-06-05 13:26:39 vj    R     trial2
# 8 211388 2020-06-05 13:26:39 vj    R     trial2
# 9 211389 2020-06-05 13:26:39 vj    R     trial2
#10     NA <NA>                vj    R     trial2
#11     NA <NA>                vj    R     trial2
group_split is similar to base R split, except that it has an option to keep or drop the grouping variables in the resulting list of data.frames (and it does not name the list elements). The idea is to split the data into a list of data.frame chunks in which the grouping columns have the same values. This splits the dataset automatically, without manually specifying the rows at which the NA rows should be added.
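For comparison, the same split-and-pad idea can be sketched in base R. This is only an illustration, not part of the answer above: the lookup vector n_pad and the other helper names are hypothetical, and the sketch assumes R >= 4.0, where data.frame keeps character columns as character. Keying the counts by group name avoids relying on the order of the list.

```r
# Sample data from the question.
df3 <- data.frame(
  Frame = c(219388, 219389, 219390, 211387, 211388, 211389),
  Time  = rep("2020-06-05 13:26:39", 6),
  task  = c("hop", "hop", "hop", "vj", "vj", "vj"),
  limb  = c("L", "L", "L", "R", "R", "R"),
  trial = c("trial1", "trial1", "trial1", "trial2", "trial2", "trial2")
)

# Hypothetical lookup: how many NA rows to append to each group.
# The names follow interaction()'s default "task.limb.trial" pattern.
n_pad <- c("hop.L.trial1" = 3, "vj.R.trial2" = 2)

# Split into one data.frame per task/limb/trial combination.
# Unlike group_split, base split names the list elements.
groups <- interaction(df3$task, df3$limb, df3$trial, drop = TRUE)
chunks <- split(df3, groups)

padded <- lapply(names(chunks), function(key) {
  chunk <- chunks[[key]]
  n <- n_pad[[key]]
  # Repeat the first row n times so task, limb and trial carry over,
  # then blank out the two measurement columns.
  filler <- chunk[rep(1, n), ]
  filler$Frame <- rep(NA_real_, n)
  filler$Time <- rep(NA_character_, n)
  rbind(chunk, filler)
})

# Row-bind the padded chunks and reset the row names.
result <- do.call(rbind, padded)
rownames(result) <- NULL
```

With the sample data, result has 11 rows: three NA rows after the hop block and two after the vj block.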
Also, if the number of NA rows to be added is constant, another option is group_by with summarise (in dplyr 1.0.0, summarise can return more than one row per group).
df3 %>%
  group_by(across(task:trial)) %>%
  summarise(across(everything(), ~ c(., rep(NA, 3))))
# A tibble: 12 x 5
# Groups: task, limb, trial [2]
#   task  limb  trial   Frame Time
#   <chr> <chr> <chr>   <dbl> <chr>
# 1 hop   L     trial1 219388 2020-06-05 13:26:39
# 2 hop   L     trial1 219389 2020-06-05 13:26:39
# 3 hop   L     trial1 219390 2020-06-05 13:26:39
# 4 hop   L     trial1     NA <NA>
# 5 hop   L     trial1     NA <NA>
# 6 hop   L     trial1     NA <NA>
# 7 vj    R     trial2 211387 2020-06-05 13:26:39
# 8 vj    R     trial2 211388 2020-06-05 13:26:39
# 9 vj    R     trial2 211389 2020-06-05 13:26:39
#10 vj    R     trial2     NA <NA>
#11 vj    R     trial2     NA <NA>
#12 vj    R     trial2     NA <NA>
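A note for newer dplyr versions: from dplyr 1.1.0 onwards, returning more than one row per group from summarise is deprecated in favour of reframe. A minimal sketch of the constant-padding variant with reframe, assuming dplyr >= 1.1.0 is installed (the result name padded is just for illustration):

```r
library(dplyr)  # assumes dplyr >= 1.1.0, which provides reframe()

# Sample data from the question.
df3 <- data.frame(
  Frame = c(219388, 219389, 219390, 211387, 211388, 211389),
  Time  = rep("2020-06-05 13:26:39", 6),
  task  = c("hop", "hop", "hop", "vj", "vj", "vj"),
  limb  = c("L", "L", "L", "R", "R", "R"),
  trial = c("trial1", "trial1", "trial1", "trial2", "trial2", "trial2")
)

# Append three NA values to every non-grouping column within each group.
# reframe() allows any number of rows per group and returns an
# ungrouped tibble.
padded <- df3 %>%
  group_by(task, limb, trial) %>%
  reframe(across(everything(), ~ c(.x, rep(NA, 3))))
```

As with the summarise version, this only fits the case where every group gets the same number of NA rows.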
Also, with berryFunctions, after creating the NA rows using insertRows, fill the columns of interest with fill from tidyr:
library(berryFunctions)
insertRows(df3, r = 4:6, new = NA, rcurrent = FALSE) %>%
  insertRows(., r = 10) %>%
  fill(task:trial)
#    Frame                Time task limb  trial
#1  219388 2020-06-05 13:26:39  hop    L trial1
#2  219389 2020-06-05 13:26:39  hop    L trial1
#3  219390 2020-06-05 13:26:39  hop    L trial1
#4      NA                <NA>  hop    L trial1
#5      NA                <NA>  hop    L trial1
#6      NA                <NA>  hop    L trial1
#7  211387 2020-06-05 13:26:39   vj    R trial2
#8  211388 2020-06-05 13:26:39   vj    R trial2
#9  211389 2020-06-05 13:26:39   vj    R trial2
#10     NA                <NA>   vj    R trial2
#11     NA                <NA>   vj    R trial2
Upvotes: 1