Reputation: 11
R programming
I'm a new R programmer trying to write a script that calculates windchill using three different equations, chosen according to wind speed. I was trying to use an apply function to do this.
The ingested data is a data frame (read from a csv file) where each row is a different weather observation, including station name, station id, time (year, month, day, hour), then the meteorological data: temp (DRY_BULB_TEMP) and wind speed (WIND_SPEED). Only temp and wind speed are required to calculate the windchill.
The algorithm I used is as follows:
The problem is with my apply call. I want only one result per row of the data frame: the calculated windchill for that row. Instead, I get a calculated windchill for each column (all the same value) for each row. Which apply function should I use? Or should I subset the data further so that it contains only the temp and wind speed columns? And since I can't get a plain list of windchills, the final part of the program does not work either: the results cannot be combined with the subsets using cbind.
Here's the program I have so far:
data <- read.csv("HfxTestWind.csv")
data_1 <- subset(data, WIND_SPEED >= 5)
data_2 <- subset(data, WIND_SPEED > 1 & WIND_SPEED < 5)
data_3 <- subset(data, WIND_SPEED < 1)
windchill_1 <- function(temp1, wind1){
  temp1 <- data_1$DRY_BULB_TEMP
  wind1 <- data_1$WIND_SPEED
  result_1 <- 13.12 + 0.6215 * temp1 - 11.37 * (wind1 ^ (0.16)) + 0.3965*
    temp1 * (wind1 ^ (0.16))
  return(result_1)
}
windchill_2 <- function(temp2, wind2){
  temp2 <- data_2$DRY_BULB_TEMP
  wind2 <- data_2$WIND_SPEED
  result_2 <- temp2 + ((-1.59 + 0.1345 * (temp2) / 5 * wind2))
  return(result_2)
}
WIND_CHILL_1 <- lapply(data_1, windchill_1)
data_comp_1 <- cbind(data_1, WIND_CHILL_1)
WIND_CHILL_2 <- lapply(data_2, windchill_2)
data_comp_2 <- cbind(data_2, WIND_CHILL_2)
WIND_CHILL_3 <- data_3$DRY_BULB_TEMP
data_comp_3 <- cbind(data_3, WIND_CHILL_3)
Upvotes: 1
Views: 319
Reputation: 6784
R provides many ways of achieving the same result. My take on this one might be something like
windchill <- function(temp, wind, lower=1, upper=5){
  ifelse(wind < lower, temp,
    ifelse(wind < upper, temp + ((-1.59 + 0.1345 * (temp) / 5 * wind)),
      13.12 + 0.6215*temp - 11.37*(wind^(0.16)) + 0.3965*temp*(wind^(0.16)) ) )
}
data$WIND_CHILL <- windchill(data$DRY_BULB_TEMP, data$WIND_SPEED)
head(data)
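To check that this gives exactly one value per row, here is a minimal sketch on a made-up three-row data frame. The name `obs` and its values are illustrative assumptions, not taken from the asker's file; there is one row per wind-speed regime.

```r
# Hypothetical sample data: one row per wind-speed regime
obs <- data.frame(DRY_BULB_TEMP = c(-5, -5, -5),
                  WIND_SPEED    = c(0.5, 3, 20))

# Same function as above, repeated so the sketch runs on its own
windchill <- function(temp, wind, lower = 1, upper = 5) {
  ifelse(wind < lower, temp,
    ifelse(wind < upper, temp + ((-1.59 + 0.1345 * (temp) / 5 * wind)),
      13.12 + 0.6215 * temp - 11.37 * (wind^0.16) + 0.3965 * temp * (wind^0.16)))
}

obs$WIND_CHILL <- windchill(obs$DRY_BULB_TEMP, obs$WIND_SPEED)

# One numeric value per row
length(obs$WIND_CHILL) == nrow(obs)
# The calm row (wind < lower) keeps its dry-bulb temperature unchanged
obs$WIND_CHILL[1] == obs$DRY_BULB_TEMP[1]
```

Because `ifelse` works element-wise on whole vectors, no apply call or subsetting is needed, and the result can be assigned straight to a new column.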
Upvotes: 0
Reputation: 132746
You don't need a loop (the *apply functions are loops) if you write a vectorized function. You should also study the topic of variable scoping in R.
#some data
set.seed(42)
DF <- data.frame(DRY_BULB_TEMP = runif(100, -10, 30),
                 WIND_SPEED = runif(100, 0, 10))
windchill_1 <- function(temp, wind){
  #note how I use the function's arguments inside the function
  result <- 13.12 + 0.6215 * temp - 11.37 * (wind ^ (0.16)) + 0.3965*
    temp * (wind ^ (0.16))
  return(result)
}
windchill_2 <- function(temp, wind){
  result <- temp + ((-1.59 + 0.1345 * (temp) / 5 * wind))
  return(result)
}
windchill <- function (temp, V) {
  #you could use nested ifelse here, but this is more efficient:
  #start from the plain temperature, then overwrite the rows in each wind regime
  wc <- temp
  wc[V >= 5] <- windchill_1(temp[V >= 5], V[V >= 5])
  wc[V > 1 & V < 5] <- windchill_2(temp[V > 1 & V < 5], V[V > 1 & V < 5])
  wc
}
DF <- within(DF, wchill <- windchill(DRY_BULB_TEMP, WIND_SPEED))
head(DF)
#  DRY_BULB_TEMP WIND_SPEED      wchill
#1     26.592242   6.262453 28.53904759
#2     27.483017   2.171577 27.49844851
#3      1.445581   2.165673 -0.06020394
#4     23.217905   3.889450 24.05710651
#5     15.669821   9.424557 15.47513150
#6     10.763838   9.626080  9.60641633
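On the scoping point, here is a small sketch of what most likely went wrong in the question's code. It uses a made-up two-row data frame; `demo` and `bad_chill` are hypothetical names used only for illustration. A data frame is a list of columns, so `lapply()` calls the function once per column. Because the function body replaces its arguments with whole columns of a global object, every call returns the same full vector.

```r
# Hypothetical two-row data frame with an extra non-weather column
demo <- data.frame(STATION_ID    = c(1, 2),
                   DRY_BULB_TEMP = c(-10, 0),
                   WIND_SPEED    = c(10, 20))

# Mimics the question's pattern: the arguments are ignored and
# overwritten with columns of the global data frame
bad_chill <- function(temp, wind) {
  temp <- demo$DRY_BULB_TEMP
  wind <- demo$WIND_SPEED
  13.12 + 0.6215 * temp - 11.37 * wind^0.16 + 0.3965 * temp * wind^0.16
}

# lapply() over a data frame makes one call per column, not per row
res <- lapply(demo, bad_chill)

length(res) == ncol(demo)        # one list element per column
identical(res[[1]], res[[2]])    # every element is the same full vector
```

Calling the function directly on the two columns, as in `windchill(DRY_BULB_TEMP, WIND_SPEED)` above, avoids both problems.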
Upvotes: 1