Anna

Reputation: 21

R doesn't recognize factor

I'm going to do a Kruskal-Wallis test in R (testing whether attacks by the fish differ between dominance statuses; there are five groups, coded 1-5), but it seems I have some problems with the factor. I import the dataset from Excel. R doesn't recognize the dominance status as a factor (is.factor(Dominance_status) returns FALSE). When I import the dataset as a text file, R doesn't recognize the first row as column names but instead names the columns V1 and V2.

I would be very thankful if somebody could please help me with this problem!

Attack_data
  Indvid Dominance_status Attacks
  <chr>             <dbl>   <dbl>
1 a1                    3       0
2 a2                    3       0
3 a3                    4       0
# ... with 22 more rows

is.factor(Dominance_status)
[1] FALSE

Upvotes: 0

Views: 684

Answers (1)

Emily Kothe

Reputation: 872

Because Dominance_status is coded numerically, most read* functions will guess that it is numeric rather than a factor.

After reading in the data, you can simply convert the column with as.factor() to force R to treat Dominance_status as a factor.

df <- data.frame(
  Indvid = c("a1", "a2", "a3"),
  Dominance_status = c(3, 3, 4),
  Attacks = c(0, 0, 0),
  stringsAsFactors = FALSE
)

is.factor(df$Dominance_status)
#> [1] FALSE

df$Dominance_status <- as.factor(df$Dominance_status)

is.factor(df$Dominance_status)
#> [1] TRUE

Created on 2019-02-20 by the reprex package (v0.2.0).
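Once the column is a factor, the test itself can be run along these lines. This is only a sketch: the data frame attack_data and its values are made up for illustration and are not your measurements. It uses factor() with explicit levels instead of as.factor() so that all five dominance groups are kept as levels even if one of them happens to be absent from the data.

```r
# Hypothetical data for illustration only; not the asker's measurements.
attack_data <- data.frame(
  Indvid = paste0("a", 1:10),
  Dominance_status = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
  Attacks = c(0, 1, 2, 1, 3, 4, 5, 4, 7, 8),
  stringsAsFactors = FALSE
)

# Explicit levels keep all five dominance groups as factor levels.
attack_data$Dominance_status <- factor(attack_data$Dominance_status, levels = 1:5)

is.factor(attack_data$Dominance_status)
levels(attack_data$Dominance_status)

# Kruskal-Wallis rank sum test of Attacks across the dominance groups.
result <- kruskal.test(Attacks ~ Dominance_status, data = attack_data)
print(result)
```

The formula interface puts the response on the left and the grouping factor on the right.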

Alternatively, you could use colClasses to specify that Dominance_status is a factor when you read in the data in the first place. Here is how you would do this using read.csv:

read.csv(filename, colClasses = c(Dominance_status = "factor"))
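For the text-file import, the V1/V2 column names usually mean the file was read without header = TRUE. The sketch below assumes a tab-separated file; the temporary file and its three rows are hypothetical stand-ins for your data, so adjust sep to whatever your file actually uses. It reads the same file once without and once with header = TRUE, and sets the factor class at read time.

```r
# Write a small tab-separated file to a temporary path (hypothetical contents).
path <- tempfile(fileext = ".txt")
writeLines(c(
  "Indvid\tDominance_status\tAttacks",
  "a1\t3\t0",
  "a2\t3\t0",
  "a3\t4\t0"
), path)

# Without header = TRUE, the first row is treated as data and the
# columns get the default names V1, V2, V3.
no_header <- read.table(path, sep = "\t")
names(no_header)

# With header = TRUE, the first row supplies the column names, and
# colClasses reads Dominance_status directly as a factor.
with_header <- read.table(
  path,
  header = TRUE,
  sep = "\t",
  colClasses = c(Dominance_status = "factor")
)
names(with_header)
is.factor(with_header$Dominance_status)
```

Note that read.csv uses header = TRUE by default, while read.table does not unless the first row has one fewer field than the rest.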

Upvotes: 2
