library(tidyverse)
df0 <- data.frame(col1 = c(5, 2), col2 = c(6, 4))
df1 <- data.frame(col1 = c(5, 2),
                  col2 = c(6, 4),
                  col3 = ifelse(apply(df0[, 1:2], 1, sum) > 10 &
                                  df0[, 2] > 5,
                                "True",
                                "False"))
df2 <- as_tibble(df1)
I've got my data frame df1 above. I've basically "copied" it as a tibble df2. Let's mimic an analysis for this df1 data frame and df2 tibble.
identical(df1[[2]], df1[, 2])
# [1] TRUE
identical(df2[[2]], df2[, 2])
# [1] FALSE
Since df1 and df2 are essentially the "same", why do I get the TRUE/FALSE dichotomy in my code block above? What is the tibble() property that has changed?
The same question asked another way: what is the difference between [[X]] and [, X] when applied in base R, and also when used in the tidyverse?
Upvotes: 3
Since a data frame is a list of columns (and all lists are vectors), we can think of this in terms of list subsetting. Take for instance:
L <- list(A = c(1, 2), B = c(1, 4))
L[[2]]
This extracts the second element of the list. Extrapolate this to:
df1[[2]]
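We get the same output as df1[, 2] because, for a base data frame, [ drops a single selected column to a plain vector by default (drop = TRUE); hence identical(df1[[2]], df1[, 2]) returns TRUE.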
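To make the base R behaviour concrete, here is a minimal sketch. It rebuilds a two-column version of df1 so that it runs on its own; the names as_vector and as_frame are only illustrative.

```r
# Base R only; df1 is rebuilt here so the snippet stands alone.
df1 <- data.frame(col1 = c(5, 2), col2 = c(6, 4))

as_vector <- df1[, 2]                # default drop = TRUE: plain numeric vector
as_frame  <- df1[, 2, drop = FALSE]  # drop = FALSE: one-column data frame

is.data.frame(as_vector)        # FALSE
is.data.frame(as_frame)         # TRUE
identical(df1[[2]], as_vector)  # TRUE
identical(df1[[2]], as_frame)   # FALSE
```

With drop = FALSE the base data frame behaves the way a tibble does by default, which is where the FALSE in the question comes from.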
The second part has to do with tibble structure, i.e.:
typeof(as_tibble(df1)[[2]])
# [1] "double"
typeof(as_tibble(df1)[, 2])
# [1] "list"
The second is a list (a one-column tibble, since [ on a tibble does not drop to a vector by default) while the first is a plain double vector, hence identical returns FALSE.
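If the bare vector is what you want from a tibble, [[ and $ both return it, and [ accepts an explicit drop = TRUE. The sketch below assumes only the tibble package; it rebuilds a small df2, and col_tbl and col_vec are illustrative names.

```r
library(tibble)

# Rebuilt here so the snippet stands alone.
df2 <- tibble(col1 = c(5, 2), col2 = c(6, 4))

col_tbl <- df2[, 2]               # still a one-column tibble
col_vec <- df2[, 2, drop = TRUE]  # explicit drop gives the bare vector

is_tibble(col_tbl)             # TRUE
identical(df2[[2]], col_vec)   # TRUE
identical(df2[[2]], df2$col2)  # TRUE
identical(df2[[2]], col_tbl)   # FALSE
```

So the data are the same in both objects; only the default of the drop argument in [ differs between a base data frame and a tibble.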
Objects of class tbl_df have (from the docs):
A class attribute of c("tbl_df", "tbl", "data.frame").
A base type of "list", where each element of the list has the same NROW().
A names attribute that is a character vector the same length as the underlying list.
A row.names attribute, included for compatibility with the base data.frame class. This attribute is only consulted to query the number of rows, any row names that might be stored there are ignored by most tibble methods.
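The properties quoted above can be checked directly. This is a small sketch on a rebuilt df2, assuming only the tibble package is installed:

```r
library(tibble)

# Rebuilt here so the snippet stands alone.
df2 <- tibble(col1 = c(5, 2), col2 = c(6, 4))

class(df2)                     # "tbl_df" "tbl" "data.frame"
typeof(df2)                    # "list"
names(df2)                     # "col1" "col2"
vapply(df2, NROW, integer(1))  # every column has the same NROW (2 here)
```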
Upvotes: 4