Reputation: 1057
I have a distance matrix in which each row is an individual and each column is a facility. Each cell holds the distance from the individual to the facility.
> head(ODMatrix, 5)
toFacility1 toFacility2 toFacility3 toFacility4 toFacility5 toFacility6 toFacility7 toFacility8 toFacility9 toFacility10
1: 4154.229 1835.176 5228.835 8093.985 7813.0557 2396.326 4055.081 4199.636 6790.750 4206.637
2: 4075.044 4848.875 3403.399 2575.370 501.4027 1072.520 1860.508 3188.388 2639.671 6118.273
3: 5660.299 3767.281 7249.469 4276.207 1917.6547 1288.333 3956.757 4511.083 1576.480 4940.198
4: 6853.425 1385.334 8696.045 7012.102 3201.9396 1708.367 4052.216 5352.751 5315.842 3218.540
5: 6746.253 1735.916 8397.047 5014.986 4820.9541 1681.347 3728.737 5334.818 6826.545 2085.071
Some of the facilities are stations and the others are poll stations. For each individual, I want to know which of the two minimum distances is shorter. Facilities 1, 2, and 3 are stations, so station_col_numbers <- c(1,2,3). All other facilities are poll stations.
For example, for the first row, the nearest station is Facility2 (1835.176 m) and the nearest poll station is Facility6 (2396.326 m). What I actually want to know is which one is closer. In this case, since 1835.176 < 2396.326, the station is closer, so the dummy variable for this row is 0.
analyse <- function(row_I){
row_I_withoutStation <- row_I[ , -station_col_numbers, with=F]
row_I_ToStation <- row_I[ , station_col_numbers, with=F]
toStation_min <- min(row_I_ToStation)
toPollStation_min <- min(col_I_withoutStation)
if (toStation_min >= toPollStation_min){
return(1)
}else{
return(0)
}
}
However, when I use apply(), it fails.
Dummy <- apply(ODMatrix, 1, analyse)
Error in row_I[, -station_col_numbers, with = F] :
incorrect number of dimensions
Is this a misuse of apply()? How can I solve it?
Upvotes: 1
Views: 91
Reputation: 83215
In base R you can create an integer (0/1) vector indicating whether a polling station is closest with:
ODMatrix$poll.closest <- +(apply(ODMatrix[,1:3], 1, min) > apply(ODMatrix[,4:10], 1, min))
which gives:
> ODMatrix
toFacility1 toFacility2 toFacility3 toFacility4 toFacility5 toFacility6 toFacility7 toFacility8 toFacility9 toFacility10 poll.closest
1: 4154.229 1835.176 5228.835 8093.985 7813.0557 2396.326 4055.081 4199.636 6790.750 4206.637 0
2: 4075.044 4848.875 3403.399 2575.370 501.4027 1072.520 1860.508 3188.388 2639.671 6118.273 1
3: 5660.299 3767.281 7249.469 4276.207 1917.6547 1288.333 3956.757 4511.083 1576.480 4940.198 1
4: 6853.425 1385.334 8696.045 7012.102 3201.9396 1708.367 4052.216 5352.751 5315.842 3218.540 0
5: 6746.253 1735.916 8397.047 5014.986 4820.9541 1681.347 3728.737 5334.818 6826.545 2085.071 1
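If you would rather not hard-code the column ranges, the same idea can be written around station_col_numbers from the question. The following is only a minimal sketch: it rebuilds the five sample rows as a plain data.frame so that it runs on its own (the original object appears to be a data.table), and the helper name poll_closest is hypothetical.

```r
# Sample rows copied from the question. Assumption: a plain data.frame is used
# here for self-containment; as.matrix() below also accepts a data.table.
ODMatrix <- data.frame(
  toFacility1  = c(4154.229, 4075.044, 5660.299, 6853.425, 6746.253),
  toFacility2  = c(1835.176, 4848.875, 3767.281, 1385.334, 1735.916),
  toFacility3  = c(5228.835, 3403.399, 7249.469, 8696.045, 8397.047),
  toFacility4  = c(8093.985, 2575.370, 4276.207, 7012.102, 5014.986),
  toFacility5  = c(7813.0557, 501.4027, 1917.6547, 3201.9396, 4820.9541),
  toFacility6  = c(2396.326, 1072.520, 1288.333, 1708.367, 1681.347),
  toFacility7  = c(4055.081, 1860.508, 3956.757, 4052.216, 3728.737),
  toFacility8  = c(4199.636, 3188.388, 4511.083, 5352.751, 5334.818),
  toFacility9  = c(6790.750, 2639.671, 1576.480, 5315.842, 6826.545),
  toFacility10 = c(4206.637, 6118.273, 4940.198, 3218.540, 2085.071)
)
station_col_numbers <- c(1, 2, 3)

# Hypothetical helper: returns 1 where the nearest poll station is strictly
# closer than the nearest station, and 0 otherwise.
poll_closest <- function(d, station_cols) {
  m <- as.matrix(d)
  to_station <- apply(m[, station_cols, drop = FALSE], 1, min)
  to_poll    <- apply(m[, -station_cols, drop = FALSE], 1, min)
  as.integer(to_station > to_poll)
}

result <- poll_closest(ODMatrix, station_col_numbers)
print(result)
```

For the five sample rows this prints 0 1 1 0 1, matching the poll.closest column above. Note that it uses a strict >, as in the one-liner, whereas the function in the question uses >=; the two only differ when both minimum distances are exactly equal.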
With data.table you could do:
stations <- names(ODMatrix)[1:3]
pollstations <- names(ODMatrix)[4:10]
ODMatrix[, idx:=.I
][, dist.station := min(.SD), idx, .SDcols=stations
][, dist.poll := min(.SD), idx, .SDcols=pollstations
][, poll.closest := +(dist.station > dist.poll)
][, c("idx","dist.station","dist.poll"):=NULL]
to get the same result. Alternatively, you could also use:
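ODMatrix[, poll.closest := +(pmin(toFacility1,toFacility2,toFacility3) >
pmin(toFacility4,toFacility5,toFacility6,toFacility7,toFacility8,toFacility9,toFacility10))]
As pmin() is vectorised and already works row by row across the columns, no by grouping is needed here; the unary + converts the logical result to 0/1.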
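Since pmin() takes the element-wise minimum across its arguments, the facility columns do not have to be typed out one by one. Here is a rough sketch in base R that uses do.call() to pass the columns; it works on three of the sample rows and six of the columns to keep it short, and the names od and station_cols are hypothetical.

```r
# Three sample rows and the first six columns from the question (for brevity).
od <- data.frame(
  toFacility1 = c(4154.229, 4075.044, 5660.299),
  toFacility2 = c(1835.176, 4848.875, 3767.281),
  toFacility3 = c(5228.835, 3403.399, 7249.469),
  toFacility4 = c(8093.985, 2575.370, 4276.207),
  toFacility5 = c(7813.0557, 501.4027, 1917.6547),
  toFacility6 = c(2396.326, 1072.520, 1288.333)
)
station_cols <- 1:3

# do.call() hands each selected column to pmin() as a separate argument.
# Note: od[station_cols] selects columns on a data.frame; on a data.table
# the same syntax would select rows, so this indexing assumes a data.frame.
dist_station <- do.call(pmin, od[station_cols])
dist_poll    <- do.call(pmin, od[-station_cols])

poll_closest <- as.integer(dist_station > dist_poll)
print(poll_closest)
```

Because only the first six columns are used, the poll-station minimum here is taken over Facility4 to Facility6 only; with the full table you would pass all poll-station columns.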
Upvotes: 1
Reputation: 5152
Your function has some typos and errors. Here is a modified version:
analyse <- function(row_I){ # row_I = ODMatrix[1,]
  col_I_withoutStation <- row_I[-station_col_numbers]
  col_I_ToStation <- row_I[station_col_numbers]
  toStation_min <- min(col_I_ToStation)
  toPollStation_min <- min(col_I_withoutStation)
  # cat(toStation_min, toPollStation_min)
  if (toStation_min >= toPollStation_min){
    return(1)
  }else{
    return(0)
  }
}
apply(ODMatrix, 1, analyse)
You will get
[1] 0 1 1 0 1
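As for why the original version failed: apply() converts its input to a matrix and passes each row to the function as a plain vector without dimensions, so two-dimensional indexing such as row_I[, -station_col_numbers, with = F] raises the "incorrect number of dimensions" error. A small sketch that shows this with two of the sample rows (the names m, row_has_dim and err are hypothetical):

```r
# Two sample rows and the first four columns from the question, as a matrix.
m <- rbind(c(4154.229, 1835.176, 5228.835, 8093.985),
           c(4075.044, 4848.875, 3403.399, 2575.370))

# Inside apply(), each row arrives as a dimensionless numeric vector.
row_has_dim <- apply(m, 1, function(row_I) !is.null(dim(row_I)))
print(row_has_dim)

# Two-dimensional indexing on such a vector fails; capture the message.
err <- tryCatch(m[1, ][, -1], error = function(e) conditionMessage(e))
print(err)
```

That is why the corrected function indexes with a single subscript, row_I[station_col_numbers], rather than with a leading comma.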
Upvotes: 1