J. Mini

Reputation: 1610

What can a data frame do that a tibble cannot?

Fans of the Tidyverse regularly give several advantages of using tibbles rather than data frames. Most of them seem designed to protect the user from making mistakes. For example, unlike data frames, tibbles:

I'm steadily becoming convinced to replace all of my data frames with tibbles. What are the primary disadvantages of doing so? More specifically, what can a data frame do that a tibble cannot?

Preemptively, I would like to make it clear that I am not asking about data.table or any big-picture objections to the Tidyverse. I am strictly asking about tibbles and data frames.

Upvotes: 9

Views: 5591

Answers (2)

TarJae

Reputation: 79164

Learned here: https://cran.r-project.org/web/packages/tibble/vignettes/tibble.html

There are three key differences between tibbles and data frames:

  • printing
  • subsetting
  • recycling rules

Tibbles:

  • Never change an input’s type (i.e., no more stringsAsFactors = FALSE!)
  • Never adjust the names of variables
  • Evaluate arguments lazily and sequentially
  • Never use row.names()
  • Only recycle vectors of length 1
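
A minimal sketch of a few of these points, assuming the tibble package is installed. The column names and values are made up for illustration:

```r
library(tibble)

# data.frame() rewrites non-syntactic names by default; tibble() keeps them.
df_names <- names(data.frame(`my var` = 1:2))
tb_names <- names(tibble(`my var` = 1:2))

# tibble() evaluates its arguments sequentially, so y can refer to x.
tb <- tibble(x = 1:3, y = x * 2)

# data.frame() recycles a length-2 vector to fill 4 rows;
# tibble() only recycles length-1 vectors and raises an error instead.
df <- data.frame(a = 1:4, b = 1:2)
res <- try(tibble(a = 1:4, b = 1:2), silent = TRUE)
recycling_failed <- inherits(res, "try-error")

# Row names are dropped on conversion unless they are moved into a column.
tb_cars <- as_tibble(mtcars, rownames = "model")
```

The silent recycling in data.frame() is the kind of thing a data frame can do that a tibble refuses to do.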

Printing a large data frame outputs as many rows as the max.print option allows (see getOption("max.print")), then cuts the output off at whatever row that limit happens to fall on.

A tibble prints only its first ten rows and as many columns as fit on screen. The column types and the dimensions of the data set are shown as well.
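
A short sketch of the printing difference, assuming the tibble package is installed. The variable names are illustrative only:

```r
library(tibble)

tb <- as_tibble(mtcars)

# Default tibble printing: the first 10 rows plus a note on what was omitted.
default_out <- capture.output(print(tb))

# Ask for every row and every column explicitly.
full_out <- capture.output(print(tb, n = Inf, width = Inf))

# The option that limits how much of a base data frame gets printed.
getOption("max.print")
```

So the full output is still available from a tibble; it just has to be requested.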

Upvotes: 0

Waldi

Reputation: 41240

From "the trouble with tibbles", you can read:

there isn’t really any trouble with tibbles

However,

Some older packages don’t work with tibbles because of their alternative subsetting method. They expect tib[,1] to return a vector, when in fact it will now return another tibble.

This is what @Henrik pointed out in comments.

As an example, the length function won't return the same result:

library(tibble)
tibblecars <- as_tibble(mtcars)
tibblecars[,"cyl"]
#> # A tibble: 32 x 1
#>      cyl
#>    <dbl>
#>  1     6
#>  2     6
#>  3     4
#>  4     6
#>  5     8
#>  6     6
#>  7     8
#>  8     4
#>  9     4
#> 10     6
#> # ... with 22 more rows
length(tibblecars[,"cyl"])
#> [1] 1
mtcars[,"cyl"]
#>  [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
length(mtcars[,"cyl"])
#> [1] 32
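
If a plain vector is what you need from a tibble, here is a sketch of three ways to get one, assuming the same tibblecars object as above:

```r
library(tibble)

tibblecars <- as_tibble(mtcars)

# Each of these returns a numeric vector rather than a one-column tibble.
v1 <- tibblecars[["cyl"]]
v2 <- tibblecars$cyl
v3 <- tibblecars[, "cyl", drop = TRUE]

length(v1)
```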

Another example:

Invariants for subsetting and subassignment explains where the behaviour of tibbles differs from that of data frames.
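
One such difference, sketched under the assumption that the tibble package is installed, is partial matching of column names with `$`:

```r
library(tibble)

tibblecars <- as_tibble(mtcars)

# Base data frames partially match column names: "cy" resolves to "cyl".
df_partial <- mtcars$cy

# Tibbles do not: an unknown name returns NULL and raises a warning,
# which is suppressed here to keep the output clean.
tb_partial <- suppressWarnings(tibblecars$cy)
```

Code that relies on abbreviated column names therefore works on a data frame but not on a tibble.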

These limitations being known, the solution given by Hadley in interacting with legacy code is:

A handful of functions don’t work with tibbles because they expect df[, 1] to return a vector, not a data frame. If you encounter one of these functions, use as.data.frame() to turn a tibble back to a data frame:
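
A minimal sketch of that conversion, where legacy_first_col_mean is a hypothetical stand-in for an older function that expects df[, 1] to be a vector:

```r
library(tibble)

tibblecars <- as_tibble(mtcars)

# Hypothetical legacy function: assumes df[, 1] drops to a vector.
legacy_first_col_mean <- function(df) mean(df[, 1])

# Convert back to a base data frame before handing it to the legacy code.
plain <- as.data.frame(tibblecars)
result <- legacy_first_col_mean(plain)
```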

Upvotes: 3
