Icewaffle

Reputation: 19

How to do linear regression using column names as dependent variables

Suppose I have the following data set:

d <- data.frame(1:31, 31:1)
names(d) <- c("cats", "dogs")

I want to do a linear regression with the columns as the dependent variable and the values as the independent variable. If I instead had two columns, one named "Animals" containing 31 rows with the value "Cat" and 31 rows with the value "Dog", and one named "values" containing 62 rows with the values 1:31 followed by 31:1, I think I could use

lm(Animals ~ values, data=df)

But is there a way to do this using just the column names as the first part of the formula?

Any help is much appreciated.

Upvotes: 0

Views: 265

Answers (2)

George Savva

Reputation: 5336

If you only have two columns, a two-sample t-test that assumes equal variances is exactly the same as a linear regression on the group label (the effects, p-values, etc. will be identical):

t.test(d$cats, d$dogs, var.equal=TRUE)
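As a rough check of that equivalence, here is a minimal sketch that compares the p-value from the t-test with the one from lm. The long data frame is built by hand here, and the names long, tt, fit and p_lm are only illustrative, not from the question:

```r
# Same data as in the question
d <- data.frame(cats = 1:31, dogs = 31:1)

# Two-sample t-test assuming equal variances
tt <- t.test(d$cats, d$dogs, var.equal = TRUE)

# Long format built by hand: one value column, one group column
long <- data.frame(
  value   = c(d$cats, d$dogs),
  animals = rep(c("cats", "dogs"), each = nrow(d))
)
fit <- lm(value ~ animals, data = long)

# p-value of the group coefficient ("cats" is the reference level)
p_lm <- summary(fit)$coefficients["animalsdogs", "Pr(>|t|)"]

# Compare the two p-values up to numerical tolerance
all.equal(unname(tt$p.value), unname(p_lm))
```

The sign of the t statistic can differ between the two, depending on which group is treated as the reference, but the p-value should match.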

If you do need to reshape a more complex dataset, @akrun's answer is fine. If you don't want to use the tidyverse, the base R reshape function does the same thing:

d2 <- reshape(data=d, varying=list(1:2), 
        direction="long", 
        times = names(d), 
        timevar="animals",
        v.names="value")

lm( value ~ animals, data=d2)
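For a simple case like this one, where every column is a group, base R's stack() might be a shorter route than reshape. This is a different function from the one above, shown only as a sketch; d3 and the renamed columns are illustrative choices:

```r
# Same data as in the question
d <- data.frame(cats = 1:31, dogs = 31:1)

# stack() returns two columns: `values` and `ind` (a factor of the column names)
d3 <- stack(d)
names(d3) <- c("value", "animals")

# Column names now act as the predictor
fit <- lm(value ~ animals, data = d3)
coef(fit)
```

Unlike reshape, stack() does not add id or row-name bookkeeping, so it is probably only convenient when no other columns need to be carried along.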

Upvotes: 2

akrun

Reputation: 887223

We can convert the data to long format and then fit the model with lm:

library(tidyr)
library(dplyr)
d %>%
   pivot_longer(everything(), names_to = 'Animals', values_to = 'values') %>%
   {lm(values~ Animals, data = .)}

Upvotes: 3
