Reputation: 201
I have a function that works when I call it directly in R, but it does not work when I use it inside a data.table assignment. Does anyone know why this happens?
The function maps a given year to the closest year in an array. It takes two arguments: a year (a column in the data.table) and the array of candidate years from which the closest one is chosen. Example code is below.
I get the warning:
"Warning message: In YearsArray - YearI : longer object length is not a multiple of shorter object length"
library (data.table)
DAT<-data.table(Yr=1950:1960)
ArrayYearsB<- c(1950, 1955, 1960)
#---start---pair-years function----#
YearPairing <- function (YearI,YearsArray)
{
YearB=c(abs(YearsArray-YearI))
YearA=min(YearB)
YearA=grep(paste0("^",YearA,"$"),YearB)
YearA= YearsArray[YearA][1]
return(YearA)
}
#---end---pair-years function----#
DAT[,YearB:=YearPairing(Yr,ArrayYearsB)]
YearPairing(1950,ArrayYearsB)
Upvotes: 1
Views: 195
Reputation: 38500
For this particular problem, you can use the roll argument as follows.
data.table(Yr=ArrayYearsB)[DAT, roll="nearest", .(Yr=i.Yr, that=x.Yr), on="Yr"]
Yr that
1: 1950 1950
2: 1951 1950
3: 1952 1950
4: 1953 1955
5: 1954 1955
6: 1955 1955
7: 1956 1955
8: 1957 1955
9: 1958 1960
10: 1959 1960
11: 1960 1960
Here, the vector is converted into a data.table whose column has the same name as the year variable in DAT, and DAT is then joined to it with on="Yr". Because DAT is supplied in i, every row of DAT is kept, as in a left join. The roll argument is set to "nearest", which matches each year in DAT to the nearest value in the vector. The result is passed to j, where the i. and x. prefixes select the columns: i.Yr is the year from DAT and x.Yr is the matched year from the vector.
To assign back to the main table:
DAT[, that := data.table(Yr = ArrayYearsB)[DAT, on=.(Yr), roll="nearest", x.Yr]]
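As for why the original call warns: inside DAT[, ...] the column Yr reaches the function as a whole vector of length 11, so YearsArray - YearI recycles a length-3 vector against a length-11 one, and the function returns a single value. If you would rather keep a helper function than use a rolling join, here is a minimal sketch that applies it one year at a time. NearestYear is a hypothetical name introduced for this illustration, and it uses which.min in place of the grep lookup.

```r
library(data.table)

DAT <- data.table(Yr = 1950:1960)
ArrayYearsB <- c(1950, 1955, 1960)

# Hypothetical scalar helper: nearest element of YearsArray to ONE year.
# which.min() returns the first minimum, so a tie goes to the earlier year.
NearestYear <- function(YearI, YearsArray) {
  YearsArray[which.min(abs(YearsArray - YearI))]
}

# Call the helper once per element of Yr instead of once on the whole column.
DAT[, YearB := vapply(Yr, NearestYear, numeric(1), YearsArray = ArrayYearsB)]
```

For this data, YearB should agree with the column produced by the rolling join above. On large tables the join is likely to be faster, because this version calls the helper once per row.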
Upvotes: 3