George Sotiropoulos

Reputation: 2133

Weird case with data tables in R, column names are mixed

I created a variable called mc_split_device inside the data.table mc_with_devices. However, if I type mc_with_devices$mc_split, I get the values of the column mc_split_device, even though I never created a variable named mc_split.


Upvotes: 3

Views: 168

Answers (3)

akrun

Reputation: 887118

According to ?Extract

name - A literal character string or a name (possibly backtick quoted). For extraction, this is normally (see under ‘Environments’) partially matched to the names of the object.

and exact

exact - Controls possible partial matching of [[ when extracting by a character vector (for most objects, but see under ‘Environments’). The default is no partial matching. Value NA allows partial matching but issues a warning when it occurs. Value FALSE allows partial matching without any warning.

So, when we do

mtcars$m
#[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3
#[27] 26.0 30.4 15.8 19.7 15.0 21.4

mtcars$d
#NULL

mtcars$d returns NULL because more than one column name starts with 'd' (disp and drat), so the partial match is ambiguous:

 names(mtcars)
 #[1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear" "carb"

With enough characters to be unambiguous, the partial match resolves to the 'disp' column:

mtcars$di
#[1] 160.0 160.0 108.0 258.0 360.0 225.0 360.0 146.7 140.8 167.6 167.6 275.8 275.8 275.8 472.0 460.0 440.0  78.7  75.7  71.1 120.1
#[22] 318.0 304.0 350.0 400.0  79.0 120.3  95.1 351.0 145.0 301.0 121.0
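The ?Extract help page points to pmatch for the matching rule. As a rough sketch, assuming `$` resolves names with the same unique-prefix rule, calling pmatch directly on the column names reproduces the three cases above (the variable names are only illustrative):

```r
# pmatch() returns the index of a unique partial match, or NA when
# the prefix is ambiguous or matches nothing.
cols <- names(mtcars)

idx_m  <- pmatch("m",  cols)  # only "mpg" starts with "m"
idx_d  <- pmatch("d",  cols)  # "disp" and "drat" both start with "d"
idx_di <- pmatch("di", cols)  # only "disp" starts with "di"

print(cols[idx_m])
print(is.na(idx_d))
print(cols[idx_di])
```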

Upvotes: 0

Ronak Shah

Reputation: 388982

It matches the name of the column partially. From ?Extract

names : For extraction, this is normally (see under ‘Environments’) partially matched to the names of the object.

Character indices can in some circumstances be partially matched (see pmatch) to the names or dimnames of the object being subsetted

Thus the default behaviour is to use partial matching only when extracting from recursive objects (except environments) by $.

Hence, when you do

mtcars$m

You get

#[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4 10.4
#[17] 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7 15.0 21.4

which is the same as mtcars$mpg.

This can sometimes be confusing. If you want to be notified whenever such partial matching happens, you can turn on the warning with

options(warnPartialMatchDollar = TRUE)
mtcars$m
# [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4 10.4
#[17] 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7 15.0 21.4

Warning message: In $.data.frame(mtcars, m) : Partial match of 'm' to 'mpg' in data frame
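If you would rather handle the partial match in code than read it in the console, one possible approach is to capture the warning with tryCatch. This is only a sketch: the names old, msg and exact are illustrative, and the option is restored afterwards so the rest of the session is unaffected. It also shows that [[ with its default exact matching returns NULL instead of guessing.

```r
# Turn the warning on temporarily and remember the previous setting.
old <- options(warnPartialMatchDollar = TRUE)

# Capture the warning message instead of letting it print.
msg <- tryCatch(
  {
    mtcars$m
    NA_character_  # reached only if no warning was raised
  },
  warning = function(w) conditionMessage(w)
)

# Restore the previous option value.
options(old)

# [[ uses exact matching by default, so an abbreviated name gives NULL.
exact <- mtcars[["m"]]

print(msg)
print(is.null(exact))
```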

Upvotes: 5

g_t_m

Reputation: 714

See Hadley Wickham's Advanced R:

$ is a shorthand operator, where x$y is equivalent to x[["y", exact = FALSE]]. It’s often used to access variables in a data frame, as in mtcars$cyl or diamonds$carat.

So exact = FALSE is the reason why $mc_split works even though no column has that exact name.
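A small sketch of that equivalence, using a plain data frame with made-up values in place of the asker's table (the data and the column n are assumptions for illustration only):

```r
# Hypothetical stand-in for the asker's table.
df <- data.frame(
  mc_split_device = c("mobile", "desktop"),
  n = 1:2
)

via_dollar  <- df$mc_split                     # partial match via $
via_partial <- df[["mc_split", exact = FALSE]] # the same lookup spelled out
via_exact   <- df[["mc_split"]]                # default exact = TRUE

print(identical(via_dollar, via_partial))
print(is.null(via_exact))
```

Using [[ with the full column name avoids relying on partial matching at all.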

As an aside, I don't believe mc_with_devices[,.(mc_split)] will work, because inside [ the column has to be given by its full name. Quoting the full name will work:

mc_with_devices[,"mc_split_resp"]

Upvotes: 7
