Reputation: 11
I am trying to create this function in R:
get_mpg(): given the name of a car, the type of speed, and the data frame of cars, this function returns the corresponding value of fuel-consumption (i.e. miles-per-gallon).
This is all the data I have:
car_names <- c("mazda3", "civic", "focus", "prius",
               "a6quattro", "tacoma", "camaro", "challenger")
speed <- c("city", "hwy")
mpg <- c(30, 41, 31, 41, 29, 40, 53, 46, 18,
         28, 17, 21, 16, 24, 14, 23)
cars <- data.frame(car = car_names, speed = speed, mpg = mpg)
The function I've written is:
get_mpg <- function(car_names, speed, frame)
{
  subset_mpg <- subset(frame,
                       cars == car_names, speed == speed)
  return(as.numeric(subset_mpg[, 3]))
}
However, when I run
get_mpg("a6quattro", "hwy", cars)
I get 29 16, whereas I should be getting just 28.
Can someone please help me out and correct the code?
Upvotes: 1
Views: 83
Reputation: 93938
A couple of issues here:
1.) You don't have a row with "a6quattro" and "hwy" at all, so you should expect no data to be returned.
2.) Using subset causes problems here: inside subset, speed == speed compares frame$speed == frame$speed rather than frame$speed == speed, because the column name masks the function argument. For this very reason, subset is not recommended for non-interactive code.
3.) You need to combine your conditions with & instead of separating them with a comma in subset anyway. After a comma, the second expression is taken as the select argument (which columns to keep), not as a row filter.
4.) Instead, use something like:
get_mpg2 <- function(car_names, speed, frame) {
  frame[frame$car %in% car_names & frame$speed == speed, "mpg"]
}
get_mpg2("a6quattro", "city", cars)
#[1] 29 16
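To make the masking in point 2 concrete, here is a minimal sketch using the data frame from the question. The helper names masked, renamed, and speed_type are made up for this illustration and are not part of the original code.

```r
# Data frame built exactly as posted in the question.
car_names <- c("mazda3", "civic", "focus", "prius",
               "a6quattro", "tacoma", "camaro", "challenger")
speed <- c("city", "hwy")
mpg <- c(30, 41, 31, 41, 29, 40, 53, 46, 18,
         28, 17, 21, 16, 24, 14, 23)
cars <- data.frame(car = car_names, speed = speed, mpg = mpg)

# Hypothetical helper: the argument is named like the column, so inside
# subset() both sides of == resolve to frame$speed and every row matches.
masked <- function(speed, frame) {
  subset(frame, speed == speed)
}

# Hypothetical helper: a distinct argument name plus "[" indexing compares
# the column against the value that was actually passed in.
renamed <- function(speed_type, frame) {
  frame[frame$speed == speed_type, ]
}

nrow(masked("hwy", cars))   # 16, the filter did nothing
nrow(renamed("hwy", cars))  # 8, only the "hwy" rows
```

Giving the argument a name that differs from the column is enough to avoid the clash, which is why "[" with explicit frame$... references tends to be safer inside functions.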
Upvotes: 4
Reputation: 486
The problem is probably that your data frame is not exactly what you are expecting it to be. Here's what your data frame looks like:
> car_names <- c("mazda3", "civic", "focus", "prius", "a6quattro", "tacoma", "camaro", "challenger")
> speed <- c("city", "hwy")
> mpg <- c(30, 41, 31, 41, 29, 40, 53, 46, 18, 28, 17, 21, 16, 24, 14, 23)
> carsx <- data.frame(car = car_names, speed = speed, mpg = mpg)
> carsx
          car speed mpg
1      mazda3  city  30
2       civic   hwy  41
3       focus  city  31
4       prius   hwy  41
5   a6quattro  city  29
6      tacoma   hwy  40
7      camaro  city  53
8  challenger   hwy  46
9      mazda3  city  18
10      civic   hwy  28
11      focus  city  17
12      prius   hwy  21
13  a6quattro  city  16
14     tacoma   hwy  24
15     camaro  city  14
16 challenger   hwy  23
As you can see, no car gets both a "city" and a "hwy" value. For instance, mazda3 gets two instances of "city" and civic gets two instances of "hwy". The car in question, a6quattro, has two instances of "city" and no "hwy", which is why you cannot get 28 back for it.
If you subset using "[" as shown below, you get the right answer (which is nothing, since the data frame has no matching row).
#You can replace "a6quattro" and "hwy" with arguments passed to a function
carsx[carsx$car == "a6quattro" & carsx$speed == "hwy", "mpg"]
numeric(0)
Once you correct your data frame, subsetting with "[" as above should return the value you expect.
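One way the data frame might be corrected is sketched below. It assumes the mpg vector is ordered as (city, hwy) pairs, one pair per car. That ordering is an assumption, although it is consistent with the 28 the question expects for a6quattro on the highway. The names cars_fixed, get_mpg3, car_name, and speed_type are made up for this example.

```r
car_names <- c("mazda3", "civic", "focus", "prius",
               "a6quattro", "tacoma", "camaro", "challenger")
speed <- c("city", "hwy")
mpg <- c(30, 41, 31, 41, 29, 40, 53, 46, 18,
         28, 17, 21, 16, 24, 14, 23)

# Assumption: mpg holds a (city, hwy) pair for each car, in car_names order.
# Repeat each car name once per speed type so the rows line up with mpg.
cars_fixed <- data.frame(
  car   = rep(car_names, each = length(speed)),
  speed = rep(speed, times = length(car_names)),
  mpg   = mpg
)

# Lookup with "[" and argument names that do not clash with column names.
get_mpg3 <- function(car_name, speed_type, frame) {
  frame[frame$car == car_name & frame$speed == speed_type, "mpg"]
}

get_mpg3("a6quattro", "hwy", cars_fixed)
#[1] 28
```

If the values were actually recorded in a different order, the rep() calls would need to change accordingly.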
Upvotes: 2