Reputation: 1430
Removing data.table column labels/attributes from imported data (SAS)
My data.table DT is an import from a SAS file. Not all columns have labels, and some have two attributes (label and format.sas). I can't share my data, and since it's imported I can't replicate it, but here is a partial structure of DT:
> str(DT)
Classes ‘data.table’ and 'data.frame': 96293709 obs. of 150 variables:
$ Col1 : chr "Y" "N" "N" "N" ...
..- attr(*, "label")= chr "some label, description goes on and on"
$ Col2 : chr "N" "N" "N" "Y" ...
..- attr(*, "label")= chr "some label 2, description goes on and on"
$ Col3 : Date, format: "1994-08-07" "1994-08-07" "1994-08-07" "1994-08-07" ...
$ Col4 : chr "M" "M" "M" "M" ...
..- attr(*, "label")= chr "some label 3, description goes on and on"
..- attr(*, "format.sas")= chr "$"
$ Col5 : num 1e+07 1e+07 1e+07 1e+07 1e+07 ...
..- attr(*, "label")= chr "some label 4, description goes on and on"
$ Col6 : Date, format: "2000-01-01" "2005-03-10" "2013-06-01" "2015-06-01" ...
I'm trying to remove all attributes, because when I use certain columns to create new ones, these attributes are inherited by the new column, which is very annoying and undesired (it prevents me from merging with another data.table that lacks the labels). I thought the only way to prevent that was to remove the attributes (labels) from the original data DT.
I tried
> setattr(DT, "label", NULL)
> setattr(DT, "format.sas", NULL)
I get no error, but nothing happens: when I check the structure afterwards, it is the same as before, and the labels/attributes have not been removed. What am I doing wrong here? I know I have to use setattr somehow, as I don't want DT to be copied (it's rather large).
Upvotes: 3
Views: 2014
Reputation: 93843
I think the attributes are stored on each column, not on the data.table as a whole. Compare attributes(DT) with lapply(DT, attributes) to see if this is the case. Here's an example which I think replicates what you're trying to do:
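library(data.table)
# Build a small table and attach per-column attributes, mimicking the SAS import
DT <- data.table(a=1:3,b=2:4)
attr(DT$a, "label") <- "a label"
attr(DT$b, "label") <- "a label"
attr(DT$b, "sas format") <- "ddmmyy10."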
str(DT)
#Classes ‘data.table’ and 'data.frame': 3 obs. of 2 variables:
# $ a: atomic 1 2 3
# ..- attr(*, "label")= chr "a label"
# $ b: atomic 2 3 4
# ..- attr(*, "label")= chr "a label"
# ..- attr(*, "sas format")= chr "ddmmyy10."
# - attr(*, ".internal.selfref")=<externalptr>
# Drop each attribute from every column; setattr() modifies the column by reference
DT[, names(DT) := lapply(.SD, setattr, "label", NULL)]
DT[, names(DT) := lapply(.SD, setattr, "sas format", NULL)]
str(DT)
#Classes ‘data.table’ and 'data.frame': 3 obs. of 2 variables:
# $ a: int 1 2 3
# $ b: int 2 3 4
# - attr(*, ".internal.selfref")=<externalptr>
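For a table as large as the one in the question, a plain loop over the columns may be a lighter alternative, since it calls setattr on each column directly and skips the `:=` reassignment. This is only a sketch under assumptions: the toy table and its labels are made up, and the attribute names "label" and "format.sas" are taken from the str() output in the question. Only those two attributes are dropped, so the Date column should keep its class.

```r
library(data.table)

# Toy stand-in for the SAS import; column names and labels are made up
DT <- data.table(
  Col1 = c("Y", "N"),
  Col3 = as.Date(c("1994-08-07", "2000-01-01"))
)
setattr(DT$Col1, "label", "some label")
setattr(DT$Col1, "format.sas", "$")

# Remove only the SAS-related attributes, column by column, by reference
for (col in names(DT)) {
  for (att in c("label", "format.sas")) {
    setattr(DT[[col]], att, NULL)
  }
}

# Inspect what is left on each column
lapply(DT, attributes)
```

If other SAS attributes show up in lapply(DT, attributes), they can be added to the inner vector in the same way.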
Upvotes: 3