Reputation: 21
I'm trying to run a Kruskal-Wallis test in R (testing whether the number of attacks by fish differs among dominance status groups; there are five groups, coded 1-5), but I seem to have some problems with the factor. I import the dataset from Excel, and R doesn't recognize the dominance status as a factor (it returns FALSE for is.factor(Dominance_status)). When I import the dataset as a text file instead, R doesn't recognize the first row as column names and names the columns V1 and V2.
I would be very thankful if somebody could please help me with this problem!
Attack_data
Indvid Dominance_status Attacks
<chr> <dbl> <dbl>
1 a1 3 0
2 a2 3 0
3 a3 4 0
# ... with 22 more rows
is.factor(Dominance_status)
[1] FALSE
Upvotes: 0
Views: 684
Reputation: 872
Because Dominance_status is coded numerically, most read* functions will guess that the column is numeric rather than a factor.
After reading in the data, you can simply convert the column with as.factor() so that R treats Dominance_status as a factor.
df <- data.frame(stringsAsFactors = FALSE,
                 Indvid = c("a1", "a2", "a3"),
                 Dominance_status = c(3, 3, 4),
                 Attacks = c(0, 0, 0)
)
is.factor(df$Dominance_status)
#> [1] FALSE
df$Dominance_status <- as.factor(df$Dominance_status)
is.factor(df$Dominance_status)
#> [1] TRUE
Created on 2019-02-20 by the reprex package (v0.2.0).
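One detail of the conversion that may matter with a 1-5 scale: as.factor() only creates levels for the values that actually occur in the column. The sketch below uses made-up values (not the asker's data) to compare it with factor() and an explicit levels argument, which keeps all five groups as levels even if one of them has no fish.

```r
# Hypothetical status codes for illustration only; group 2 is absent
status <- c(3, 3, 4, 1, 5)

f_auto <- as.factor(status)             # levels = the values that occur
f_full <- factor(status, levels = 1:5)  # levels fixed to the full 1-5 scale

levels(f_auto)
levels(f_full)
nlevels(f_full)
```

Whether the empty level should be kept depends on what you do next; for summary tables it is often convenient, since the missing group then shows up with a count of zero.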
Alternatively, you could use colClasses to specify that Dominance_status is a factor when you read in the data in the first place. Here is how you would do this using read.csv:
read.csv(filename, colClasses = c(Dominance_status = "factor"))
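For the text-file import mentioned in the question, column names such as V1 and V2 usually mean that the first row was read as data; read.table() needs header = TRUE to treat it as column names (read.csv() sets this by default). Here is a minimal sketch that combines header = TRUE with colClasses and then runs the test. The file contents are invented for illustration, and the tab separator is an assumption about how the file was exported.

```r
# Write a small, made-up data set to a temporary tab-separated file
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "Indvid\tDominance_status\tAttacks",
  "a1\t3\t0",
  "a2\t3\t2",
  "a3\t4\t1",
  "a4\t4\t5",
  "a5\t1\t7",
  "a6\t1\t4"
), tmp)

# header = TRUE: the first row holds the column names
# colClasses: read Dominance_status as a factor, let R guess the rest
attack_data <- read.table(tmp, header = TRUE, sep = "\t",
                          colClasses = c(Dominance_status = "factor"))

is.factor(attack_data$Dominance_status)
names(attack_data)

# Kruskal-Wallis test of attacks across the dominance groups
result <- kruskal.test(Attacks ~ Dominance_status, data = attack_data)
result$p.value

unlink(tmp)  # remove the temporary file
```

With your own file, replace tmp with the path to the file and adjust sep to match its delimiter.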
Upvotes: 2