Reputation: 61
I have two data files that look like this:
Model Data
long lat count
96.25 18.75 4
78.75 21.25 3
86.75 23.25 7
91.25 33.75 10
Observation Data
long lat count
96.75 25.75 10
86.75 23.25 7
78.75 21.25 11
95.25 30.25 5
I'm trying to subtract the counts for matching lat/long combinations (model data minus observation data), so that the point at long 78.75, lat 21.25 would give a difference of -8. Any lat/long point without a match in the other file should be treated as having a count of 0 there, so its count is simply subtracted by or from 0.
I tried an if statement like this to match points for subtraction:
if (modeldata$long == obsdata$long & modeldata$lat == obsdata$lat) {
obsdata$difference <- modeldata$count - obsdata$count
}
However, this just subtracts rows in order, not by matching points, unless matching points happen to fall within the same row.
I also get these warnings:
Warning messages:
1: In modeldata$long == obsdata$long : longer object length is not a multiple of shorter object length
2: In modeldata$lat == obsdata$lat : longer object length is not a multiple of shorter object length
3: In if (modeldata$long == obsdata$long & modeldata$lat == : the condition has length > 1 and only the first element will be used
Any help would be greatly appreciated!
Upvotes: 1
Views: 372
Reputation: 887691
Here is an option with dplyr, where mdl and obs are the model and observation data frames:
library(dplyr)
left_join(mdl, obs, by = c("long", "lat")) %>%
transmute(long, lat, count = count.x - replace(count.y, is.na(count.y), 0))
# long lat count
#1 96.25 18.75 4
#2 78.75 21.25 -8
#3 86.75 23.25 0
#4 91.25 33.75 10
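For context, here is a minimal base R sketch of why the if attempt in the question misbehaves: == compares the two columns position by position (recycling the shorter one), and if expects a single TRUE/FALSE, not a vector. Looking up each model point in the observation data with match() avoids both problems. The names key_mdl, key_obs and idx are illustrative only, and building keys with paste() assumes the coordinates print identically in both data sets.

```r
mdl <- read.table(text = "long lat count
96.25 18.75 4
78.75 21.25 3
86.75 23.25 7
91.25 33.75 10", header = TRUE)

obs <- read.table(text = "long lat count
96.75 25.75 10
86.75 23.25 7
78.75 21.25 11
95.25 30.25 5", header = TRUE)

# `==` works row against row: row 1 vs row 1, row 2 vs row 2, and so on.
# No matching point shares a row position here, so every element is FALSE.
same_row <- mdl$long == obs$long & mdl$lat == obs$lat

# Build one key per point and find each model point's row in obs
# (NA where the point has no counterpart).
key_mdl <- paste(mdl$long, mdl$lat)
key_obs <- paste(obs$long, obs$lat)
idx <- match(key_mdl, key_obs)

# Unmatched model points are treated as having an observed count of 0.
obs_count <- ifelse(is.na(idx), 0, obs$count[idx])
mdl$difference <- mdl$count - obs_count
mdl
```

This gives the same differences as the join above, just without any extra packages.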
Upvotes: 2
Reputation: 70653
You can merge on coordinates, replace NA with 0, and subtract.
mdl <- read.table(text = "long lat count
96.25 18.75 4
78.75 21.25 3
86.75 23.25 7
91.25 33.75 10", header = TRUE)
obs <- read.table(text = "long lat count
96.75 25.75 10
86.75 23.25 7
78.75 21.25 11
95.25 30.25 5", header = TRUE)
xy <- merge(mdl, obs, by = c("long", "lat"), all.x = TRUE)
xy[is.na(xy)] <- 0
xy$diff <- xy$count.x - xy$count.y
xy
long lat count.x count.y diff
1 78.75 21.25 3 11 -8
2 86.75 23.25 7 7 0
3 91.25 33.75 10 0 10
4 96.25 18.75 4 0 4
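Note that all.x = TRUE keeps only the model points. If observation-only points should also appear (their count subtracted from 0, as the question describes), a full outer join is one way to do it. This is a sketch under that reading of the question; the .mdl/.obs suffixes and the name full are my own choices.

```r
mdl <- read.table(text = "long lat count
96.25 18.75 4
78.75 21.25 3
86.75 23.25 7
91.25 33.75 10", header = TRUE)

obs <- read.table(text = "long lat count
96.75 25.75 10
86.75 23.25 7
78.75 21.25 11
95.25 30.25 5", header = TRUE)

# all = TRUE keeps unmatched rows from both sides (full outer join).
full <- merge(mdl, obs, by = c("long", "lat"), all = TRUE,
              suffixes = c(".mdl", ".obs"))

# Fill only the count columns, so a missing count becomes 0
# while the coordinate columns are left untouched.
full$count.mdl[is.na(full$count.mdl)] <- 0
full$count.obs[is.na(full$count.obs)] <- 0

full$diff <- full$count.mdl - full$count.obs
full
```

With the sample data this yields six rows: the four model points plus the two observation-only points, whose differences are negative.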
Upvotes: 3
Reputation: 26258
You can do this using a data.table join & update:
library(data.table)
## reading your supplied data
dt_model <- fread(
'long lat count
96.25 18.75 4
78.75 21.25 3
86.75 23.25 7
91.25 33.75 10'
)
dt_obs <- fread(
"long lat count
96.75 25.75 10
86.75 23.25 7
78.75 21.25 11
95.25 30.25 5"
)
## fread() already returns data.tables; setDT() is only needed
## if your data are plain data.frames
setDT(dt_model)
setDT(dt_obs)
## this join & update will update the `dt_model`.
dt_model[
dt_obs
, on = c("long", "lat")
, count := count - i.count
]
dt_model
# long lat count
# 1: 96.25 18.75 4
# 2: 78.75 21.25 -8
# 3: 86.75 23.25 0
# 4: 91.25 33.75 10
Note the obvious caveat: joining on coordinates (floats/decimals) may not always give the right answer, because two values that print the same are not necessarily exactly equal.
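One way to sidestep the floating-point issue, sketched here in base R, is to join on integer keys derived from the coordinates. The helper to_key() is hypothetical, and digits = 2 assumes the grid has at most two decimal places, as in the sample data. The toy values 0.1 + 0.2 and 0.3 are used only because they are a well-known pair that print alike but are not equal.

```r
# Hypothetical helper: scale to a fixed number of decimals and round,
# so coordinates become exact integers that are safe to compare.
to_key <- function(x, digits = 2) as.integer(round(x * 10^digits))

# Two points that are meant to be the same location.
a <- data.frame(long = 0.1 + 0.2, lat = 21.25, count = 3)
b <- data.frame(long = 0.3,       lat = 21.25, count = 11)

# Exact comparison of the doubles fails.
exact_equal <- a$long == b$long

# Add integer key columns to both tables and join on those instead.
a$long_key <- to_key(a$long); a$lat_key <- to_key(a$lat)
b$long_key <- to_key(b$long); b$lat_key <- to_key(b$lat)

m <- merge(a, b, by = c("long_key", "lat_key"),
           suffixes = c(".mdl", ".obs"))
m$diff <- m$count.mdl - m$count.obs
m
```

The same key columns could be used in the on = argument of the data.table join above instead of long and lat.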
Upvotes: 2