Reputation: 795
I have a dataframe like so:
df<- data.frame(date= c(rep("10-29-16", 3), rep("11-14-16", 2),
"12-29-16","10-2-17","9-2-17"),
loc= c(rep("A", 3), rep("B", 2),"A","PlotA","PlotB"),
obs_network= c(rep("NA", 3), rep("NA", 2),"NA","PlotA","PlotB"))
For obs_network values which are NA, I want to give them a name for each unique date and loc combo. I would like the unique groups to be assigned a unique number and the prefix "pseudoplot" for this naming scheme. So the output would look like this:
output<- data.frame(date= c(rep("10-29-16", 3), rep("11-14-16", 2),
"12-29-16","10-2-17","9-2-17"),
loc= c(rep("A", 3), rep("B", 2),"A","PlotA","PlotB"),
obs_network= c(rep("pseudoplot_1", 3),rep("pseudoplot_2", 2),"pseudoplot_3","PlotA","PlotB"))
I have tried the following without success and I cannot identify my error. Using the code below all the levels read "pseudoplot1". I would greatly appreciate it if someone explained why my code is not working in addition to providing a solution.
output<-
df %>%
group_by(date, loc)%>%
mutate(obs_network=ifelse(is.na(obs_network),
paste0("pseudoplot", "_", match(loc, unique (loc))),
obs_network))
Upvotes: 1
Views: 805
Reputation: 23574
This is what I came up with. It assumes that 1) date is a Date object, and 2) loc and obs_network are character vectors. The sample data I used (defined at the end of this answer as mydf) looks like this:
date loc obs_network
1 2016-10-29 A <NA>
2 2016-10-29 A <NA>
3 2016-10-29 A <NA>
4 2016-11-14 B <NA>
5 2016-11-14 B <NA>
6 2016-12-29 A <NA>
7 2017-10-02 PlotA PlotA
8 2017-09-02 PlotB PlotB
9 2017-10-10 A <NA>
10 2017-10-10 B <NA>
I used two things. First, I took the differences between consecutive dates. Second, I fed those differences to cumsum() to build a running group number that increases whenever the date changes from one row to the next. By pasting that group number together with loc, I created unique group labels.
library(dplyr)

mydf %>%
  mutate(obs_network = if_else(is.na(obs_network),
                               # run number (increments whenever the date changes) followed by loc
                               paste0("pseudoplot_", cumsum(c(TRUE, abs(diff(date)) > 0)), loc),
                               obs_network))
# date loc obs_network
#1 2016-10-29 A pseudoplot_1A
#2 2016-10-29 A pseudoplot_1A
#3 2016-10-29 A pseudoplot_1A
#4 2016-11-14 B pseudoplot_2B
#5 2016-11-14 B pseudoplot_2B
#6 2016-12-29 A pseudoplot_3A
#7 2017-10-02 PlotA PlotA
#8 2017-09-02 PlotB PlotB
#9 2017-10-10 A pseudoplot_6A
#10 2017-10-10 B pseudoplot_6B
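To make the numbering step easier to follow, here is a minimal base R sketch of the diff() plus cumsum() idea on its own. The vector name dates and the helper names starts_run and run_id are only for illustration; the dates are the same ten values as in the sample data.

```r
# Illustrative standalone vector with the same dates as the sample data.
dates <- as.Date(c("2016-10-29", "2016-10-29", "2016-10-29",
                   "2016-11-14", "2016-11-14", "2016-12-29",
                   "2017-10-02", "2017-09-02", "2017-10-10", "2017-10-10"))

# TRUE wherever the date differs from the previous row.
# The first row always starts a new run.
starts_run <- c(TRUE, as.numeric(diff(dates)) != 0)

# A running count of run starts gives one id per run of equal dates.
run_id <- cumsum(starts_run)
run_id
# 1 1 1 2 2 3 4 5 6 6
```

Two things follow from this: rows with the same date have to be adjacent for them to share an id, and the counter also advances on rows that already have an obs_network, which is why the last two rows above end up with 6 rather than 4.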
mydf <- structure(list(date = structure(c(17103, 17103, 17103, 17119,
17119, 17164, 17441, 17411, 17449, 17449), class = "Date"), loc = c("A",
"A", "A", "B", "B", "A", "PlotA", "PlotB", "A", "B"), obs_network = c(NA,
NA, NA, NA, NA, NA, "PlotA", "PlotB", NA, NA)), .Names = c("date",
"loc", "obs_network"), row.names = c(NA, -10L), class = "data.frame")
Upvotes: 1
Reputation: 2050
A few notes:
You have included "NA" (in quotes) in your data frame, so these are text (actually factors, given the stringsAsFactors = TRUE default that data.frame() had before R 4.0) and not actual NA values. I recommend changing your original data frame:
df <- tibble(date= c(rep("10-29-16", 3),
rep("11-14-16", 2),"12-29-16","10-2-17","9-2-17"),
loc= c(rep("A", 3), rep("B", 2), "A", "PlotA", "PlotB"),
obs_network= c(rep(NA, 6), "PlotA", "PlotB"))
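As a quick check of the difference between the string "NA" and a real missing value, here is a small base R sketch. The vector x is just an illustrative stand-in for the obs_network column.

```r
is.na("NA")   # FALSE: a two-character string
is.na(NA)     # TRUE: a real missing value

# Illustrative vector standing in for obs_network.
x <- c("NA", "NA", "PlotA")

# Turn the placeholder strings into real missing values.
x[x == "NA"] <- NA
is.na(x)
# TRUE TRUE FALSE
```

This is why is.na(obs_network) never matches anything when the column was built from "NA" strings.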
There are going to be issues mixing factors (what you were creating in your data frame) with character vectors or integers in ifelse. I've changed the dataset to a tibble so that everything is a character vector, and I am using if_else.
Lastly, don't use group_by for this; simply keep everything flat:
df %>%
mutate(obs_network = if_else(is.na(obs_network),
paste0("pseudoplot", "_", match(paste0(date,loc), unique(paste0(date,loc)))),
obs_network))
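As for why the grouped attempt gives the same number everywhere: after grouping by date and loc, each group contains only one distinct loc, so match(loc, unique(loc)) can only ever return 1. Below is a minimal base R sketch of that, using split() to imitate the grouping (an assumption made for illustration, not what dplyr does internally). The names key, grouped_idx and flat_idx are made up for the example, and paste() with a space separator is used for the key instead of paste0().

```r
# Data from the question, with real NA values instead of the string "NA".
df <- data.frame(
  date = c(rep("10-29-16", 3), rep("11-14-16", 2), "12-29-16", "10-2-17", "9-2-17"),
  loc = c(rep("A", 3), rep("B", 2), "A", "PlotA", "PlotB"),
  obs_network = c(rep(NA, 6), "PlotA", "PlotB"),
  stringsAsFactors = FALSE
)

# One key per date/loc combination.
key <- paste(df$date, df$loc)

# Grouped: within each date/loc group there is a single distinct loc,
# so match() against the group's own unique values is always 1.
grouped_idx <- unlist(
  lapply(split(df$loc, key), function(g) match(g, unique(g))),
  use.names = FALSE
)
grouped_idx
# 1 1 1 1 1 1 1 1

# Flat: matching the key against all unique keys numbers the
# combinations in order of first appearance.
flat_idx <- match(key, unique(key))
flat_idx
# 1 1 1 2 2 3 4 5

# Fill only the missing entries and keep existing names as they are.
df$obs_network <- ifelse(is.na(df$obs_network),
                         paste0("pseudoplot_", flat_idx),
                         df$obs_network)
df$obs_network
```

With the question's data this yields pseudoplot_1 to pseudoplot_3 for the six missing rows and leaves PlotA and PlotB untouched.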
Upvotes: 0