CodingMatters

Reputation: 1431

Does R automatically remove '.' from data.frame column names?

Does d1$patient refer to the same column as d1$patient....age?

I'm guessing there is a simple concept that causes this behaviour. Is it reliable and predictable? I.e., if I name a data.frame column a...b, can I always reference it by $a?

The example provided in the source doesn't explain what I'm seeing in R.

from http://biostat.mc.vanderbilt.edu/wiki/pub/Main/SvetlanaEdenRFiles/regExprTalk.pdf

d1 = data.frame(id...of....patient = c(1, 2), patient....age = c(3, 4))
d1$patient....age
#[1] 3 4
d1$patient
#[1] 3 4
d1$age
#NULL
d1$id...of....patient
#[1] 1 2
d1$id
#[1] 1 2
d1$id...of
#[1] 1 2
names(id)
#NULL
names(d1)
#[1] "id...of....patient" "patient....age"

Upvotes: 0

Views: 49

Answers (1)

David Robinson

Reputation: 78590

R's $ operator does partial matching: it accepts any unambiguous prefix of a column name as referring to that column. It has nothing to do with the dots in the name.

For example, try:

d1$id...of....patient
# [1] 1 2
d1$id...of....pati
# [1] 1 2
d1$id...o
# [1] 1 2

This is true whether or not there are dots in the column name: mtcars$disp, mtcars$dis, and mtcars$di all return the disp column of the mtcars dataset. (However, mtcars$d returns NULL, as both the disp and drat columns start with d).
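If relying on prefixes feels too fragile, here is a minimal sketch of two ways to avoid it, reusing the d1 data frame from the question. It assumes base R only: [[ matches names exactly unless you pass exact = FALSE, and the warnPartialMatchDollar option makes $ warn when it falls back to a partial match. Treat it as an illustration of the behaviour rather than a recommendation for every codebase.

```r
# Same data frame as in the question.
d1 <- data.frame(id...of....patient = c(1, 2), patient....age = c(3, 4))

# `$` accepts an unambiguous prefix of the column name.
d1$patient                      # matches patient....age

# `[[` matches exactly by default, so a bare prefix is not found.
d1[["patient"]]                 # NULL
d1[["patient....age"]]          # the full name works

# Partial matching with `[[` has to be requested explicitly.
d1[["patient", exact = FALSE]]

# Optionally ask R to warn whenever `$` uses a partial match.
options(warnPartialMatchDollar = TRUE)
d1$patient                      # still returns the column, now with a warning
```

Using [[ with the full name in scripts means a later column such as patient.id cannot silently change what d1$patient returns.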

Upvotes: 4
