Camilo

Reputation: 201

Error using a function in data.table in R

I have a function that works when I call it directly in R but not when I call it inside a data.table. Does anyone know why this happens?

The function finds, for a given year, the closest year in a vector of candidate years. It takes two arguments: a year (a column in the data.table) and the vector of candidate years to search. Example code is below.

When I call the function inside the data.table, I get this warning:

Warning message:
In YearsArray - YearI :
  longer object length is not a multiple of shorter object length

library(data.table)

DAT <- data.table(Yr = 1950:1960)
ArrayYearsB <- c(1950, 1955, 1960)

#---start---pair-years function----#
YearPairing <- function(YearI, YearsArray)
{
  YearB <- abs(YearsArray - YearI)
  YearA <- min(YearB)
  YearA <- grep(paste0("^", YearA, "$"), YearB)
  YearA <- YearsArray[YearA][1]
  return(YearA)
}
#---end---pair-years function----#

# Called inside data.table: produces the warning
DAT[, YearB := YearPairing(Yr, ArrayYearsB)]

# Called directly with a single year: works
YearPairing(1950, ArrayYearsB)

Upvotes: 1

Views: 195

Answers (1)

lmo

Reputation: 38500

For this particular problem, you can use a rolling join with the roll argument, as follows.

data.table(Yr=ArrayYearsB)[DAT, roll="nearest", .(Yr=i.Yr, that=x.Yr), on="Yr"]
      Yr that
 1: 1950 1950
 2: 1951 1950
 3: 1952 1950
 4: 1953 1955
 5: 1954 1955
 6: 1955 1955
 7: 1956 1955
 8: 1957 1955
 9: 1958 1960
10: 1959 1960
11: 1960 1960

Here, the vector is converted into a one-column data.table whose column has the same name as the variable in DAT (Yr). DAT is then passed as i, so the join with on="Yr" returns one row for every row of DAT, like a left join onto DAT. Setting roll="nearest" matches each year in DAT to the nearest value in the vector. In j, the i. prefix refers to columns of DAT and the x. prefix to columns of the lookup table, so i.Yr is the original year and x.Yr is the matched year.
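To make the effect of roll more concrete, here is a small sketch that compares three settings on the same lookup vector. The table names lookup and query and the result names are made up for this illustration and are not from the question.

```r
library(data.table)

# Illustrative tables (names are assumptions, not from the question)
lookup <- data.table(Yr = c(1950, 1955, 1960))
query  <- data.table(Yr = c(1952, 1953, 1958))

# roll = TRUE: roll the previous lookup value forward
locf <- lookup[query, on = "Yr", roll = TRUE, .(Yr = i.Yr, match = x.Yr)]

# roll = -Inf: roll the next lookup value backward
nocb <- lookup[query, on = "Yr", roll = -Inf, .(Yr = i.Yr, match = x.Yr)]

# roll = "nearest": take whichever lookup value is closest
near <- lookup[query, on = "Yr", roll = "nearest", .(Yr = i.Yr, match = x.Yr)]

locf$match  # 1950 1950 1955
nocb$match  # 1955 1955 1960
near$match  # 1950 1955 1960
```

Only roll="nearest" looks in both directions, which is what the question asks for.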

To assign back to the main table:

DAT[, that := data.table(Yr = ArrayYearsB)[DAT, on=.(Yr), roll="nearest", x.Yr]]
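As for why the original call warns: inside j, Yr is the whole column (11 values), so YearsArray - YearI subtracts a length-11 vector from a length-3 vector and R recycles the shorter one. The function only handles one year at a time. If you would rather keep a function-based approach, one option is to apply a single-year helper to each element. The sketch below assumes a hypothetical helper called nearest_year, which is a simplified stand-in for YearPairing and not code from the question.

```r
library(data.table)

DAT <- data.table(Yr = 1950:1960)
ArrayYearsB <- c(1950, 1955, 1960)

# Hypothetical helper (not from the question): nearest candidate for ONE year.
# which.min() returns the first index on ties, so the earlier year wins a tie.
nearest_year <- function(year, candidates) {
  candidates[which.min(abs(candidates - year))]
}

# Apply the helper to each element of the column instead of the whole column
DAT[, YearB := vapply(Yr, nearest_year, numeric(1), candidates = ArrayYearsB)]
```

This calls the helper once per row, so the rolling join above is likely the better choice for large tables.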

Upvotes: 3
