Reputation: 4406
I am trying to extract data from a data frame for analysis.
heightweight <- function(person, health) {
  ## Read in data
  data <- read.csv("heightweight.csv", header = TRUE,
                   colClasses = "character")
  ## Check that the outcomes are valid
  measure = c("height", "weight")
  if(health %in% measure == FALSE){
    stop("Valid inputs are height and weight")
  }
  ## Truncate the data matrix to only what columns are needed
  data <- data[c(1, 5, 7)]
  ## Rename columns
  names(data)[1] <- "Name"
  names(data)[2] <- "Height"
  names(data)[3] <- "Weight"
  ## Convert numeric columns to numeric
  data[, 2] <- as.numeric(data[, 3])
  data[, 3] <- as.numeric(data[, 4])
  ## Convert NAs to 0 after coercion
  data[is.na(data)] <- 0
  ## Check that the name is valid
  name <- data[, 1]
  name <- unique(name)
  if(person %in% name == FALSE){
    stop("Invalid person")
  }
  ## Return person with lowest height or weight
  list <- data[data$name == person & data[health],]
  outcomes <- list[, health]
  minumum <- which.min(outcomes)
  ## Min Rate
  minimum[rowNum, ]$name
}
The problem occurs at this line:
list <- data[data$name == person & data[health],]
That is, when I run heightweight("Bob", "weight"), I get the following message:
Error in matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = nr, :
length of 'dimnames' [2] not equal to array extent
I have Googled this message and checked out some threads here but can't determine what the problem is.
Upvotes: 0
Views: 159
Reputation: 46
Unless I'm missing something, if you only need the lowest weight or height for a given name, the last three lines of code are a bit redundant.
Here's a simple way to get the minimum health measurement for a given person:
min(data[data$name==person, "height"])
The first part, before the comma, acts as a row index: it selects only the rows of data that correspond to that person. The second part, after the comma, selects only the desired variable (column). Once you have selected the desired data, you take the minimum of that subset.
An example to illustrate the result:
data<-data.frame(name=as.character(c(rep("carlos",2),rep("marta",3),rep("johny",2),"sara")))
set.seed(1)
data$height <- rnorm(8,68,3)
data$weight <- rnorm(8,160,10)
The corresponding data frame:
    name   height   weight
1 carlos 66.12064 165.7578
2 carlos 68.55093 156.9461
3  marta 65.49311 175.1178
4  marta 72.78584 163.8984
5  marta 68.98852 153.7876
6  johny 65.53859 137.8530
7  johny 69.46229 171.2493
8   sara 70.21497 159.5507
Let's say we want the minimum weight for marta:
person <- "marta"
health <- "weight"
The minimum "weight" for "marta" is,
min(data[data$name==person,health])
which gives the desired result:
[1] 153.7876
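The same lookup can be wrapped in a function that keeps the input checks from the question. This is only a minimal sketch, not the asker's original code: the name min_measure, the toy data frame, and the choice to pass the data frame as an argument are assumptions for illustration. It expects lowercase column names name, height, and weight, as in the example above. Note that `$` is case sensitive, so a column renamed to "Name" would not be found by data$name.

```r
# Hypothetical helper: minimum of one health measure for one person.
# Assumes `data` has columns `name`, `height`, and `weight`.
min_measure <- function(data, person, health) {
  # Validate the requested measure
  if (!(health %in% c("height", "weight"))) {
    stop("Valid inputs are height and weight")
  }
  # Validate the person
  if (!(person %in% data$name)) {
    stop("Invalid person")
  }
  # Logical row index picks the person; `health` picks the column
  min(data[data$name == person, health])
}

# Toy data with made-up values, for illustration only
toy <- data.frame(
  name   = c("carlos", "carlos", "marta", "marta", "marta"),
  height = c(66, 68, 65, 72, 69),
  weight = c(165, 157, 175, 164, 154)
)

min_measure(toy, "marta", "weight")
```

Passing the data frame in as an argument, rather than reading the CSV inside the function, makes the function easy to try on small test data.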
Upvotes: 3
Reputation: 13304
Here is the simplified analogue of your function:
heightweight <- function(person,health) {
  data.set <- data.frame(names=rep(letters[1:5],each=3),height=171:185,weight=seq(95,81,by=-1))
  d1 <- data.set[data.set$names == person,]
  d2 <- d1[d1[,health]==min(d1[,health]),]
  d2[,c('names',health)]
}
The first line produces a sample data set. The second line selects all records for a given person. The third line finds the record (or records, if there are ties) with the minimum value of health, and the last line returns its names and health columns.
heightweight('b','height')
#   names height
# 4     b    174
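If several records tie for the minimum, the comparison against min() returns all of them. A possible variant uses which.min(), which the question also reached for, and returns only the first such row. This is a sketch under assumptions: the function name lowest_record and the extra data.set argument are hypothetical and not part of the answer above.

```r
# Hypothetical variant: return the single row with the smallest value of `health`.
lowest_record <- function(data.set, person, health) {
  # All records for the given person
  d1 <- data.set[data.set$names == person, ]
  # which.min() gives the position of the first minimum within d1
  d1[which.min(d1[[health]]), c("names", health)]
}

# Same sample data as in the answer above
data.set <- data.frame(names  = rep(letters[1:5], each = 3),
                       height = 171:185,
                       weight = seq(95, 81, by = -1))

lowest_record(data.set, "b", "weight")
```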
Upvotes: 0