Reputation: 5
I use an API call to LimeSurvey to get data into a Shiny R app I'm working on. I then manipulate the dataframe so that I have only the responses given by a certain individual over time. The dataframe can look like this:
Appetite <- c("No","Yes","No","No","No","No","No","No","No")
Dental.Health <- c("No","Yes","No","No","No","No","Yes","Yes","No")
Dry.mouth <- c("No","Yes","Yes","Yes","Yes","No","Yes","Yes","No")
Mouth.opening <- c("No","No","Yes","Yes","Yes","No","Yes","Yes","No")
Pain.elsewhere <- c("No","Yes","No","No","No","No","No","No","No")
Sleeping <- c("No","No","No","No","No","Yes","No","No","No")
Sore.mouth <- c("No","No","Yes","Yes","No","No","No","No","No")
Swallowing <- c("No","No","No","No","Yes","No","No","No","No")
Cancer.treatment <- c("No","No","Yes","Yes","No","Yes","No","No","No")
Support.for.my.family <- c("No","No","Yes","Yes","No","No","No","No","No")
Fear.of.cancer.coming.back <- c("No","No","Yes","Yes","No","No","Yes","No","No")
Intimacy <- c("Yes","No","No","No","No","No","No","No","No")
Dentist <- c("No","Yes","No","No","No","No","No","No","No")
Dietician <- c("No","No","Yes","Yes","No","No","No","No","No")
Date.submitted <- c("2002-07-25 00:00:00",
"2002-09-05 00:00:00",
"2003-01-09 00:00:00",
"2003-01-09 00:00:00",
"2003-07-17 00:00:00",
"2003-11-06 00:00:00",
"2004-12-17 00:00:00",
"2005-06-03 00:00:00",
"2005-12-17 00:00:00")
theDataFrame <- data.frame( Date.submitted,
Appetite,
Dental.Health,
Dry.mouth,
Mouth.opening,
Pain.elsewhere,
Sleeping,
Sore.mouth,
Swallowing,
Cancer.treatment,
Support.for.my.family,
Fear.of.cancer.coming.back,
Intimacy,
Dentist,
Dietician)
To be clear, this dataframe could contain more (or fewer) observations of more (or fewer) variables than the example above.
My goal is to make a dynamic histogram that looks like the following:
library(dplyr)
library(ggplot2)
library(tidyr)
# Example data: ten days of random Yes/No answers to six questions
df <- data.frame(timeline = Sys.Date() - 1:10,
                 q3 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q4 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q5 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q6 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q7 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q8 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 stringsAsFactors = FALSE) %>%
  # Recode Yes/No as 1/0 so the answers can be stacked as bar heights
  mutate(q3 = ifelse(q3 == "Yes", 1, 0),
         q4 = ifelse(q4 == "Yes", 1, 0),
         q5 = ifelse(q5 == "Yes", 1, 0),
         q6 = ifelse(q6 == "Yes", 1, 0),
         q7 = ifelse(q7 == "Yes", 1, 0),
         q8 = ifelse(q8 == "Yes", 1, 0)
  ) %>%
  # Reshape to long format: one row per (timeline, question) pair
  gather(key = question, value = value, q3, q4, q5, q6, q7, q8)
g <- ggplot(df, aes(x = timeline, y = value, fill = question)) +
geom_bar(stat = "identity")
g
I think I will need to use library(lubridate) for the timeline, as the entire dataframe is plain text. I deal with the '.' in the column names like this:
myColNames <- colnames(theDataFrame)
myNames <- myColNames
myNames <- gsub("^X\\.\\.", "", myNames)
myNames <- gsub("\\.", " ", myNames)
names(theDataFrame) <- myNames # items in myChoices get "labels" from myNames
But the most challenging aspect is getting this to work dynamically. The datasets will only contain Date.submitted and some number (x) of additional columns whose values are only "Yes" or "No".
I hope I've given enough information (this is my first question on Stack Exchange!)
Upvotes: 0
Views: 418
Reputation: 1721
You could also use dplyr::mutate_all and purrr::map.
Note: I used stringsAsFactors = F in theDataFrame
theDataFrame <- data.frame( Date.submitted,
Appetite,
Dental.Health,
Dry.mouth,
Mouth.opening,
Pain.elsewhere,
Sleeping,
Sore.mouth,
Swallowing,
Cancer.treatment,
Support.for.my.family,
Fear.of.cancer.coming.back,
Intimacy,
Dentist,
Dietician, stringsAsFactors = F)
-Create a function to do the conversion you want, for instance:
ConvertYesNo<- function(x){
if(x=="Yes") y <- as.integer(1)
else if (x=="No") y <- as.integer(0)
else y <- x
return(y)
}
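-Use it with mutate_all, which applies it to every column, or pick the columns you want using mutate_at, and map the function over each column as follows. Note that map_chr returns character vectors, so the converted values are stored as the strings "0" and "1":
library(purrr)
# funs() is deprecated in recent dplyr versions; a formula does the same job
theDataFramex <- theDataFrame %>%
  mutate_all(~ map_chr(., ConvertYesNo))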
> head(theDataFramex,3 )
Date.submitted Appetite Dental.Health Dry.mouth Mouth.opening Pain.elsewhere Sleeping
1 2002-07-25 00:00:00 0 0 0 0 0 0
2 2002-09-05 00:00:00 1 1 1 0 1 0
3 2003-01-09 00:00:00 0 0 1 1 0 0
Sore.mouth Swallowing Cancer.treatment Support.for.my.family Fear.of.cancer.coming.back
1 0 0 0 0 0
2 0 0 0 0 0
3 1 0 1 1 1
Intimacy Dentist Dietician
1 1 0 0
2 0 1 0
3 0 0 1
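To connect the converted columns back to the stacked bar chart in the question, here is a minimal sketch of one possible pipeline. It assumes every column other than Date.submitted holds only "Yes" or "No", and that the timestamps always follow the "YYYY-MM-DD HH:MM:SS" pattern of the example. The small data frame `responses` is a hypothetical stand-in for theDataFrame, and tidyr::pivot_longer and dplyr::across are used here in place of gather and mutate_all, so a reasonably recent dplyr and tidyr are assumed.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical miniature of theDataFrame: one date column plus
# any number of Yes/No columns
responses <- data.frame(
  Date.submitted = c("2002-07-25 00:00:00",
                     "2002-09-05 00:00:00",
                     "2003-01-09 00:00:00"),
  Appetite       = c("No", "Yes", "No"),
  Dry.mouth      = c("No", "Yes", "Yes"),
  stringsAsFactors = FALSE
)

long <- responses %>%
  # Parse the text timestamps; only the date part is kept
  mutate(Date.submitted = as.Date(Date.submitted, format = "%Y-%m-%d")) %>%
  # Convert every other column, whatever it is called, to 0/1 integers
  mutate(across(-Date.submitted, ~ as.integer(.x == "Yes"))) %>%
  # Reshape to one row per (date, question) pair
  pivot_longer(-Date.submitted, names_to = "question", values_to = "value")

# Stacked bars: one segment per question answered "Yes" on that date
g <- ggplot(long, aes(x = Date.submitted, y = value, fill = question)) +
  geom_col()
g
```

Because the columns are selected by exclusion (-Date.submitted) rather than by name, the same code should work unchanged when the survey returns more or fewer questions.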
Upvotes: 0
Reputation: 887048
We can update it using base R
theDataFrame[-1] <- +(theDataFrame[-1]=="Yes")
Or with lapply when the dataset is big
theDataFrame[-1] <- lapply(theDataFrame[-1], function(x) as.integer(x=="Yes"))
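A quick way to check what this conversion yields, sketched on a hypothetical two-question data frame called `demo`. The date parsing with as.POSIXct is an addition, not part of the lines above, and it assumes the timestamp format seen in the question's example data.

```r
# Hypothetical stand-in for theDataFrame
demo <- data.frame(
  Date.submitted = c("2002-07-25 00:00:00", "2002-09-05 00:00:00"),
  Appetite       = c("No", "Yes"),
  Sleeping       = c("Yes", "Yes"),
  stringsAsFactors = FALSE
)

# Column 1 is the date, so [-1] selects every Yes/No column,
# however many there are
demo[-1] <- lapply(demo[-1], function(x) as.integer(x == "Yes"))

# Parse the text timestamps (format assumed from the example data)
demo$Date.submitted <- as.POSIXct(demo$Date.submitted,
                                  format = "%Y-%m-%d %H:%M:%S",
                                  tz = "UTC")

# Inspect the resulting column types
str(demo)

# Number of "Yes" answers per submission
yes_counts <- rowSums(demo[-1])
yes_counts
```

Selecting by position keeps the code independent of the question names, but it relies on the date column always being first.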
Upvotes: 1