alko989

Reputation: 7908

Strange behaviour for data.frames without column names

There is an unexpected behaviour for data.frames without column names. The following works as expected:

df <- data.frame(a = 1:5, b = 5:9)
df + 1
##   a  b
## 1 2  6
## 2 3  7
## 3 4  8
## 4 5  9
## 5 6 10

but if we remove the column names then the behaviour is strange:

names(df) <- NULL
df + 1
## data frame with 0 columns and 0 rows

The same happens if the names are removed with unname or setNames. Any idea why this happens, and is it (for some reason) expected behaviour?

Edit: So it is documented that nameless data.frames have unsupported results (thanks @neilfws, @Suren), but I am also interested in why this happens. I am trying to find the actual C (?) code that makes this simple example break.
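For anyone who wants to look at the code behind df + 1: arithmetic on a data.frame dispatches to the S3 group method Ops.data.frame, which is written in R rather than C and can be retrieved with getS3method. The second half of the sketch below is only an illustration. add_one_by_names is a hypothetical helper, and the idea that a loop driven by column names is what empties the result is an assumption that should be checked against the printed method source for your R version.

```r
# Retrieve the S3 group method that handles +, -, *, ... on data.frames.
# Printing ops_method shows its R source.
ops_method <- utils::getS3method("Ops", "data.frame")
is.function(ops_method)

# Hypothetical helper (not part of base R): a loop that iterates over the
# column names never runs when the names are NULL, so nothing is computed.
add_one_by_names <- function(x) {
  cols <- unclass(x)
  result <- list()
  for (j in seq_along(names(cols))) {
    result[[j]] <- cols[[j]] + 1
  }
  result
}

named_cols <- list(a = 1:3, b = 4:6)
n_named <- length(add_one_by_names(named_cols))
n_unnamed <- length(add_one_by_names(unname(named_cols)))
```

With names present the helper returns one element per column; with the names removed, seq_along(NULL) is empty and the result stays an empty list.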

Upvotes: 11

Views: 481

Answers (2)

Thomas Guillerme

Reputation: 1857

I think this ultimately comes from the fact that R considers the data.frame object as a list with specific attributes:

## A list with no attributes
list_no_attr1 <- list(c(1,2,3), c(3,2,1))

## The attributes and class of the list
attributes(list_no_attr1)
#> NULL
class(list_no_attr1)
#> "list"

We can then manually add all the data.frame attributes without changing the structure of the list:

## Adding the "names" attribute to the list
list2 <- list_no_attr1
attr(list2, "names") <- c("A", "B")

## The attributes and class of the list
attributes(list2)
#> $names
#> [1] "A" "B"
class(list2)
#> "list"

## Adding the "row.names" attributes
list3 <- list2
attr(list3, "row.names") <- c("1", "2", "3")

## The attributes and class of the list
attributes(list3)
#> $names
#> [1] "A" "B"
#> $row.names
#> [1] "1" "2" "3"

class(list3)
#> "list"

This is still a list. Once we change the class of the object to "data.frame", R will use the S3 methods for data.frame for print and all other associated functions:

## Adding a data.frame class attribute
list_data_frame <- list3
attr(list_data_frame, "class") <- "data.frame"

## The attributes and class of the list
attributes(list_data_frame)
#> $names
#> [1] "A" "B"
#> $row.names
#> [1] "1" "2" "3"
#> $class
#> [1] "data.frame"

class(list_data_frame)
#> "data.frame"

This will now print as a proper data.frame. Note that it works exactly the same the other way around: we can transform a data.frame back into a list by removing the class attribute.

## The dataframe
data_frame <- data.frame(A = c(1, 2, 3), B = c(3, 2, 1))
## The attributes and class of the data.frame
attributes(data_frame)
#> $names
#> [1] "A" "B"
#> $class
#> [1] "data.frame"
#> $row.names
#> [1] 1 2 3

class(data_frame)
#> "data.frame"

## "Converting" into a list
attr(data_frame, "class") <- NULL

attributes(data_frame)
#> $names
#> [1] "A" "B"
#> $row.names
#> [1] 1 2 3

class(data_frame)
#> "list"
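The same relationship can be checked without editing attributes by hand. This is a small sketch using only base R functions (is.list, typeof, unclass); the variable names are just for illustration.

```r
# A data.frame is stored as a list; the class attribute drives dispatch.
data_frame <- data.frame(A = c(1, 2, 3), B = c(3, 2, 1))

df_is_list <- is.list(data_frame)   # the data.frame already counts as a list
df_type <- typeof(data_frame)       # underlying storage type

# unclass() drops the class attribute and keeps names and row.names.
plain_list <- unclass(data_frame)
plain_class <- class(plain_list)
plain_names <- names(plain_list)
```

unclass() is the usual shortcut for attr(x, "class") <- NULL when only a temporary plain-list view is needed.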

Of course it only works if the elements in the list are of the same length:

## Creating an unequal list with data.frame attributes
wrong_list <- list(c(1,2,3), c(3,2,1,0))
attr(wrong_list, "names") <- c("A", "B")
attr(wrong_list, "row.names") <- c("1", "2", "3")
attr(wrong_list, "class") <- "data.frame"

wrong_list
#>   A B
#> 1 1 3
#> 2 2 2
#> 3 3 1
#> Warning message:
#> In format.data.frame(x, digits = digits, na.encode = FALSE) :
#>   corrupt data frame: columns will be truncated or padded with NAs
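A quick way to catch this before setting the class is to compare the element lengths. The sketch below uses base R's lengths(); has_equal_lengths is a hypothetical helper name, not an existing function.

```r
# Hypothetical helper: TRUE when every element of the list has the same length.
has_equal_lengths <- function(x) {
  length(unique(lengths(x))) <= 1
}

good_list <- list(A = c(1, 2, 3), B = c(3, 2, 1))
bad_list <- list(A = c(1, 2, 3), B = c(3, 2, 1, 0))

good_ok <- has_equal_lengths(good_list)
bad_ok <- has_equal_lengths(bad_list)
```

Here good_ok is TRUE and bad_ok is FALSE, so the second list should not be given the "data.frame" class as it stands.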

It also breaks when the names and row.names attributes are omitted, as mentioned in the other comments and answers to this question:

## A list coerced into a data.frame without the right attributes
wrong_list <- list(c(1,2,3), c(3,2,1))
attr(wrong_list, "class") <- "data.frame"
wrong_list
#> NULL
#> <0 rows> (or 0-length row.names)
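For comparison, all three attributes can be supplied in a single call with structure(), which avoids the half-built state shown above. This is a sketch of the same manual construction, not a replacement for data.frame(); manual_df is just an illustrative name.

```r
# Build a data.frame by hand: a named list plus row.names and class.
manual_df <- structure(
  list(A = c(1, 2, 3), B = c(3, 2, 1)),
  row.names = c("1", "2", "3"),
  class = "data.frame"
)

manual_is_df <- is.data.frame(manual_df)
manual_dim <- dim(manual_df)     # rows, columns
manual_names <- names(manual_df)
```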

Upvotes: 1

kangaroo_cliff

Reputation: 6222

In the documentation for data.frame, it says:

The column names should be non-empty, and attempts to use empty names will have unsupported results.

So it is to be expected that the outcome may not be the desired one if the column names are empty.
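If the names really have to go, one cautious option is to skip the data.frame operators and work on the underlying list of columns instead. This is a minimal sketch using base R's unclass() and lapply(); it returns a plain list, not a data.frame.

```r
df <- data.frame(a = 1:5, b = 5:9)
names(df) <- NULL

# Apply the operation column by column on the plain list of columns.
cols_plus_one <- lapply(unclass(df), function(col) col + 1)

first_col <- cols_plus_one[[1]]
second_col <- cols_plus_one[[2]]
```

The simpler route is to keep the column names while doing arithmetic and drop them only afterwards, if at all.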

Upvotes: 7
