Reputation: 1
In the following CSV file:
Species, Age
australian, 2.6
australian, 2.3
brown, 2.3
brown, 2.3
brown, 3.4
brown, 3.4
dalmatian, 5.1
dalmatian, 4.4
dalmatian, 4.4
dalmatian, 4.1
dalmatian, 4.2
dalmatian, 4.7
dalmatian, 5.5
I am attempting to calculate the mean age for each pelican species, but R is displaying an error about unequal lengths.
df <- read.csv('c:/Users/Michelle/Downloads/pelican.csv')
tapply(df$Species, df$Age, mean)
Error in tapply(df$Species, df$Age, mean) : arguments must have same length
I assumed the tapply function would output each pelican species with the mean age of each. Unfortunately, the director at the University of Florida is insisting I use base R functions.
Edit 1:
str(df)
'data.frame': 13 obs. of 2 variables:
 $ Species: chr "australian" "australian" "brown" "brown" ...
 $ Age    : num 2.6 2.3 2.3 2.3 3.4 3.4 5.1 4.4 4.4 4.1 ...
dput(df)
structure(list(Species = c("australian", "australian", "brown",
"brown", "brown", "brown", "dalmatian", "dalmatian", "dalmatian",
"dalmatian", "dalmatian", "dalmatian", "dalmatian"), Age = c(2.6,
2.3, 2.3, 2.3, 3.4, 3.4, 5.1, 4.4, 4.4, 4.1, 4.2, 4.7, 5.5)),
class = "data.frame", row.names = c(NA, -13L))
Thank you Pedro for the help.
Thank you for any help you can provide.
M.
Upvotes: 0
Views: 1022
Reputation: 869
Welcome Michelle! The tapply() function works with two main objects (both need to be vectors), called X and INDEX. The error message is telling you that X and INDEX do not have the same length.
The example below reproduces the same error that you are facing. Note that the X object has 4 elements, but INDEX has only 2.
tapply(X = c(5, 6, 7, 8), INDEX = c(1, 2), mean)
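As a small sketch of the same idea, the snippet below captures the error message with tryCatch() and then repeats the call with an INDEX of matching length. The variable names msg and ok, and the grouping c(1, 1, 2, 2), are made up for illustration only.

```r
# X has 4 elements but INDEX has only 2, so tapply() stops with an error.
# tryCatch() lets us keep the message instead of halting the script.
msg <- tryCatch(
  tapply(X = c(5, 6, 7, 8), INDEX = c(1, 2), FUN = mean),
  error = function(e) conditionMessage(e)
)
print(msg)

# Same X, now with one INDEX value per element of X (illustrative grouping)
ok <- tapply(X = c(5, 6, 7, 8), INDEX = c(1, 1, 2, 2), FUN = mean)
print(ok)
```

In the second call, the elements 5 and 6 fall in group 1 and the elements 7 and 8 in group 2, so each group gets its own mean.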
This means that, to fix your error, the first and second objects that you pass to tapply() need to have the same length. In your example, these two objects are df$Species and df$Age. You can check whether they have the same length by comparing the results of length(df$Species) and length(df$Age). If the two numbers are equal, the vectors have the same length; if not, they have different lengths.
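The length check described above might look like the sketch below. It rebuilds df from the dput() output posted in the question. The last lines are an assumption-based illustration of one way lengths can differ: the lowercase name age is a hypothetical typo, not something taken from the question.

```r
# Data frame rebuilt from the dput() output in the question
df <- data.frame(
  Species = c("australian", "australian", "brown", "brown", "brown", "brown",
              "dalmatian", "dalmatian", "dalmatian", "dalmatian", "dalmatian",
              "dalmatian", "dalmatian"),
  Age = c(2.6, 2.3, 2.3, 2.3, 3.4, 3.4, 5.1, 4.4, 4.4, 4.1, 4.2, 4.7, 5.5)
)

# Compare the lengths of the two vectors passed to tapply()
n_species <- length(df$Species)
n_age <- length(df$Age)
same_length <- n_species == n_age
print(c(n_species, n_age))
print(same_length)

# Hypothetical typo: a column that does not exist comes back as NULL,
# which has length 0 and would not match the other vector
n_missing <- length(df$age)
print(n_missing)
```

If the two lengths differ in your session, printing names(df) is a quick way to see which columns read.csv() actually created.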
What is probably going wrong in your code is that the read.csv() function is not reading your CSV file correctly. Maybe df was turned into a list rather than a data.frame. We cannot give you better help than this, because we do not know what the df object is or how it is structured in your R session.
You could give us this information by copying and pasting the output of the str(df) command, or of dput(df). Either command would give us enough information to point out exactly what you need to do. So, next time you post a question, it is a good idea to include this information.
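Anyway, when I copy the CSV data that you posted and run tapply() on it, everything works fine. Note that the call below passes data$Age as X (the values to average) and data$Species as INDEX (the groups), which is the reverse of the order in your code. So, again, your df object is probably not structured as you expected, probably because of some problem in the read.csv() call.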
text <- "
Species, Age
australian, 2.6
australian, 2.3
brown, 2.3
brown, 2.3
brown, 3.4
brown, 3.4
dalmatian, 5.1
dalmatian, 4.4
dalmatian, 4.4
dalmatian, 4.1
dalmatian, 4.2
dalmatian, 4.7
dalmatian, 5.5"
data <- readr::read_csv(text)
tapply(data$Age, data$Species, mean)
Result:
australian      brown  dalmatian
  2.450000   2.850000   4.628571
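Since the question asks for base R only, here is a hedged sketch of the same calculation without readr. It assumes the CSV content is exactly what was posted in the question; the names csv_text, means and agg are made up for this example. The aggregate() call is an optional alternative, not something the question requires.

```r
# CSV content copied from the question
csv_text <- "Species, Age
australian, 2.6
australian, 2.3
brown, 2.3
brown, 2.3
brown, 3.4
brown, 3.4
dalmatian, 5.1
dalmatian, 4.4
dalmatian, 4.4
dalmatian, 4.1
dalmatian, 4.2
dalmatian, 4.7
dalmatian, 5.5"

# read.csv() can parse literal text through its `text` argument
df <- read.csv(text = csv_text)

# X = the values to average, INDEX = the grouping variable
means <- tapply(df$Age, df$Species, mean)
print(means)

# aggregate() computes the same group means and returns a data frame
agg <- aggregate(Age ~ Species, data = df, FUN = mean)
print(agg)
```

With a file on disk, the read.csv() line would take the file path instead of text = csv_text, and the tapply() call stays the same.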
Upvotes: 0