Paige Kemp

Reputation: 97

How to drop a factor from a contrast in R?

In my original data set I have five factors; however, I want to drop one of them and retain the remaining four to do forward contrast coding. I am unable to subset the data to remove the factor that I do not want.

Here is what the contrast matrix looks like right now:

           Factor_2   Factor_3   Factor_4   Factor_5
Factor_1         0          0          0          0
Factor_2         1          0          0          0
Factor_3         0          1          0          0
Factor_4         0          0          1          0
Factor_5         0          0          0          1

And this is what I tried:

all_data %>% filter(FactorNumber == c("Factor_2", "Factor_3", "Factor_4", "Factor_5"))

And this is what I want:

           Factor_3   Factor_4   Factor_5
Factor_2         0          0          0
Factor_3         1          0          0
Factor_4         0          1          0
Factor_5         0          0          1

Any suggestions would be extremely helpful.

Upvotes: 0

Views: 292

Answers (4)

Ronak Shah

Reputation: 389047

Do you want to drop specific rows and columns by name? You can do:

delete_col <- 'Factor_2'
delete_row <- 'Factor_1'

df[setdiff(rownames(df), delete_row), setdiff(colnames(df), delete_col)]

#         Factor_3 Factor_4 Factor_5
#Factor_2        0        0        0
#Factor_3        1        0        0
#Factor_4        0        1        0
#Factor_5        0        0        1

Upvotes: 0

jared_mamrot

Reputation: 26650

I believe @AllanCameron is correct: contr.treatment(n = 5, contrasts = 4)[-1, -1] gives you:

  3 4 5
2 0 0 0
3 1 0 0
4 0 1 0
5 0 0 1

To use this for e.g. linear modelling, you could use:

contr <- contr.treatment(n = 5, contrasts = 4)[-1, -1]
# lm() takes contrasts as a named list; cond must be a four-level factor
lm(y ~ cond, data = all_data, contrasts = list(cond = contr))
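
A minimal sketch of how that call might look end to end, assuming a four-level factor named cond and a numeric response y. The simulated all_data below is made up purely for illustration and is not the asker's data:

```r
# Hypothetical data: four remaining levels, ten observations each
set.seed(1)
all_data <- data.frame(
  cond = factor(rep(paste0("Factor_", 2:5), each = 10)),
  y    = rnorm(40)
)

# Drop the first row and first column of the 5-level treatment contrasts
contr <- contr.treatment(n = 5)[-1, -1]

# The result has the same values as the 4-level treatment contrasts
all.equal(unname(contr), unname(contr.treatment(4)))

# Pass the matrix as a named list keyed by the factor's name
fit <- lm(y ~ cond, data = all_data, contrasts = list(cond = contr))
coef(fit)
```

With this coding the intercept is the mean of y for Factor_2, the first remaining level, and each other coefficient is that level's difference from Factor_2.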

Upvotes: 1

SteveM

Reputation: 2301

df
         Factor_2 Factor_3 Factor_4 Factor_5
Factor_1        0        0        0        0
Factor_2        1        0        0        0
Factor_3        0        1        0        0
Factor_4        0        0        1        0
Factor_5        0        0        0        1
df2 <- df[-1, ]
df2
         Factor_2 Factor_3 Factor_4 Factor_5
Factor_2        1        0        0        0
Factor_3        0        1        0        0
Factor_4        0        0        1        0
Factor_5        0        0        0        1
df3 <- df2[ , -1]
df3
         Factor_3 Factor_4 Factor_5
Factor_2        0        0        0
Factor_3        1        0        0
Factor_4        0        1        0
Factor_5        0        0        1
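
The two steps above can also be written as a single index call. A small sketch, rebuilding the same df so that it runs on its own; the names trimmed, one_col and as_vector are only illustrative:

```r
# Same 5 x 4 data frame as shown above
df <- data.frame(
  Factor_2 = c(0L, 1L, 0L, 0L, 0L),
  Factor_3 = c(0L, 0L, 1L, 0L, 0L),
  Factor_4 = c(0L, 0L, 0L, 1L, 0L),
  Factor_5 = c(0L, 0L, 0L, 0L, 1L),
  row.names = paste0("Factor_", 1:5)
)

# Negative indices drop the first row and the first column in one step
trimmed <- df[-1, -1]
trimmed

# If only one column is left, drop = FALSE keeps the result a data frame
one_col   <- df[-1, "Factor_5", drop = FALSE]
as_vector <- df[-1, "Factor_5"]  # without it, a plain vector comes back
```

Dropping by position is short, but it depends on the row and column order, so dropping by name may be safer if the layout can change.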

Upvotes: 1

Duck

Reputation: 39605

Maybe reshaping can help if you want to set a filter. Otherwise, the advice from @AllanCameron is pretty useful. Here is the code using tidyverse functions:

library(tidyverse)
#Code
df %>% rownames_to_column('id') %>%
  pivot_longer(-id) %>%
  filter(id %in% c("Factor_2", "Factor_3", "Factor_4", "Factor_5") &
           name %in% c("Factor_2", "Factor_3", "Factor_4", "Factor_5")) %>%
  pivot_wider(names_from = name, values_from = value) %>%
  column_to_rownames('id')

Output:

         Factor_2 Factor_3 Factor_4 Factor_5
Factor_2        1        0        0        0
Factor_3        0        1        0        0
Factor_4        0        0        1        0
Factor_5        0        0        0        1

Some data used:

#Data
df <- structure(list(Factor_2 = c(0L, 1L, 0L, 0L, 0L), Factor_3 = c(0L, 
0L, 1L, 0L, 0L), Factor_4 = c(0L, 0L, 0L, 1L, 0L), Factor_5 = c(0L, 
0L, 0L, 0L, 1L)), class = "data.frame", row.names = c("Factor_1", 
"Factor_2", "Factor_3", "Factor_4", "Factor_5"))
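
A side note on why the filter in the question may not have behaved as expected: == compares element by element and recycles the shorter vector, while %in% tests membership. A small base R sketch, where the vector x is made up to stand in for a column such as all_data$FactorNumber:

```r
# Hypothetical column values (length 8, so recycling gives no warning)
x <- c("Factor_1", "Factor_2", "Factor_3", "Factor_4",
       "Factor_5", "Factor_5", "Factor_4", "Factor_3")
keep <- c("Factor_2", "Factor_3", "Factor_4", "Factor_5")

# == lines x up against keep, keep, ... position by position
by_equals <- x == keep
sum(by_equals)

# %in% asks, for each element of x, whether it occurs anywhere in keep
by_in <- x %in% keep
sum(by_in)
```

Here == matches only one element, because only one position happens to line up with the recycled keep vector, whereas %in% keeps all seven values that are not Factor_1.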

Upvotes: 1
