Reputation: 240
In my Knitted document, I am trying to print a dataframe's column. Just to aid the visualization, I would like to change the output color to change based on the value of another column. I have a simple example as follows.
date_vector <- rep(NA, 10)
type_vector <- rep(NA, 10)
types <- c("A", "B")
CDate <- Sys.Date()
date_vector[1] <- as.character(CDate)
type_vector[1] <- sample(types, size = 1)
for (i in 2:10) {
CDate <- as.Date(CDate) + rexp(n = 1, rate = 1/5)
date_vector[i] <- as.character(CDate)
type_vector[i] <- sample(types, size = 1)
}
test_df <- data.frame(Date=date_vector, Type=type_vector)
When I print test_df$Date
, I see the following
date_vector
[1] "2016-01-06" "2016-01-07" "2016-01-22" "2016-01-28" "2016-01-29" "2016-02-01" "2016-02-04"
[8] "2016-02-12" "2016-02-13" "2016-02-15"
Instead, would like to see the following
Since the type of the entries was as follows
type_vector
[1] "A" "A" "B" "B" "A" "A" "B" "A" "B" "A"
So blue represents dates with type A
and green represents dates with type B
.
Upvotes: 4
Views: 364
Reputation: 14957
This answer is more general than the question. The question asks for a way to color one column of a data frame depending on another column. This answer addresses the more general case of highlighting elements in a vector depending on a second, logical vector indicating which elements to highlight.
In principle, this is quite trivial: Print a vector, highlighting the elements indicated by another, logical vector. Highlighting x
can be as simple as wrapping it in \\textcolor{blue}{x}
or \\emph{x}
.
In practice, it's not that simple … print(x)
does a lot of useful things: It arranges the data nicely in columns, adds quotes around character data, wraps the output to respect getOption("width)
, adds the index of the first element to each line of output, and so on. The problem is, we cannot use print
to print the highlighted data because print
escapes the backslashes in \\textcolor
. The standard solution to this issue is to use cat
instead of print
. However, cat
does not apply any of the nice formattings listed above.
So the challenge is to write a function that reproduces some/the required features of print
. This is a quite involved task, so I limit myself to the following main features:
<= getOption("width")
.quote
is not set).printIndex = TRUE
).digits
).Plus, these two highlighting-features:
x
indicated by condition
in a "highlighting pattern"Note that this function lacks important features of print
like handling missing values. Besides, it converts the input x
to character (via as.character
). The result of this might be different than with print
because the S3 methods (print.*
) corresponding to the input class are not used at all.
printHighlighted <- function(x, condition = rep(FALSE, length(x)), highlight = "\\emph{%s}", printIndex = TRUE, width = getOption("width"), digits = getOption("digits"), quote = NULL) {
stopifnot(length(x) == length(condition))
stopifnot(missing(digits) || (!missing(digits) && is.numeric(x))) # Raise error when input is non-numeric but "digits" supplied.
if (missing(quote)) {
if (is.numeric(x) || is.logical(x)) {
quote <- FALSE
} else {
quote <- TRUE
}
}
nquotes <- 0
if (!printIndex) {
currentLineIndex <- ""
}
if (is.numeric(x)) {
x <- round(x, digits = digits)
}
fitsInLine <- function(x, elementsCurrentLine, currentLineIndex, nquotes, width) {
return(sum(nchar(x[elementsCurrentLine])) + # total width of elements in current line
nchar(currentLineIndex) + # width of the index of the first element (if shown)
sum(elementsCurrentLine) - 1 + # width of spaces between elements
nquotes <= # width of quotes added around elements
width)
}
x <- as.character(x)
elementsCurrentLine <- rep(FALSE, times = length(x))
for (i in seq_along(x)) {
if (!any(elementsCurrentLine) && printIndex) { # this is a new line AND show index
currentLineIndex <- sprintf("[%s] ", i)
}
elementsCurrentLine[i] <- TRUE # Add element i to current line. Each line holds at least one element. Therefore, if i is the first element of this line, add it regardless of line width. If there already are elements in the line, the previous loop iteration checked that this element will fit.
if (i < length(x)) { # not the last element
# check whether next element will fit in this line
elementsCurrentLineTest <- elementsCurrentLine
elementsCurrentLineTest[i + 1] <- TRUE
if (quote) {
nquotes <- sum(elementsCurrentLineTest) * 2
}
if (fitsInLine(x, elementsCurrentLineTest, currentLineIndex, nquotes, width)) {
next # Next element will fit; do not print yet.
}
}
# Next element won't fit in current line. Print and start a new line.
# print
toPrint <- x[elementsCurrentLine]
toMarkup <- condition[elementsCurrentLine]
toPrint[toMarkup] <- sprintf(fmt = highlight, toPrint[toMarkup]) # add highlighting
if (quote) {
toPrint <- sprintf('"%s"', toPrint)
}
cat(currentLineIndex)
cat(toPrint)
cat("\n")
# clear line
elementsCurrentLine <- rep(FALSE, times = length(x))
}
}
To use this function with knitr
, the chunk option results = "asis"
must be used because otherwise the output is wrapped in a verbatim
environment where the markup responsible for the highlighting is displayed instead of used.
Finally, to reproduce the look of normal chunks, wrap the whole chunk in
\begin{knitrout}
\definecolor{shadecolor}{rgb}{0.969, 0.969, 0.969}\color{fgcolor}
\begin{kframe}
\begin{alltt}
<<your-chunk>>=
printHighlighted(...)
@
\end{alltt}
\end{kframe}
\end{knitrout}
To save some space, the example assumes that the function definition of printHighlighted
is available in a file printHighlighted.R
.
\documentclass{article}
\begin{document}
Some text ....
\begin{knitrout}\definecolor{shadecolor}{rgb}{0.969, 0.969, 0.969}\color{fgcolor}\begin{kframe}\begin{alltt}
<<results = "asis", echo = FALSE>>=
source("printHighlighted.R")
data <- seq(from = as.Date("2015-01-15"), by = "day", length.out = 100)
cond <- rep(FALSE, 100)
cond[c(3, 55)] <- TRUE
printHighlighted(x = data, condition = cond, highlight = "\\textcolor{blue}{%s}", width = 60)
@
\end{alltt}\end{kframe}\end{knitrout}
Some text ....
\end{document}
This turned out to be quite lenghty ... if someone thinks this was an overkill for such a simple question, I'd love to see shorter solutions.
Upvotes: 1