R two regressions from one table

Question

I am trying to plot two different regression lines (from the model salary = beta0 + beta1*D3 + beta2*spending + beta3*(spending*D3) + w) in one scatter plot by dividing my data into two subsets, as shown in the following code:

salary = data$salary
spending = data$spending
D1 = data$North
D2 = data$South
D3 = data$West

subsetWest = subset(data, D3 == 1)
subsetRest = subset(data, D3 == 0)

abab = lm(salary ~ 1 + spending + 1*spending, data=subsetWest) #red line
caca = lm(salary ~ 0 + spending + 0*spending, data=subsetRest) #blue line


plot(spending,salary)

points(subsetWest$spending, subsetWest$salary, pch=25, col = "red")
points(subsetRest$spending, subsetRest$salary, pch=10, col = "blue")

abline(abab, col = "red")
abline(caca, col = "blue")

This is a sample of my data table:

[image: sample of the data table]

And this is the plot I get when running the code:

[plot produced by the code](https://i.sstatic.net/It8ai.png)

My problem is that the intercept for my second regression is wrong, in fact I do not even get an intercept when looking at the summary, unlike with the first regression.

Does anybody see where my problem is or does anybody know an alternative way of plotting the two regression lines?

Help would be much appreciated. Thank you very much!

This is the whole table:

structure(list(salary = c(39166L, 40526L, 40650L, 53600L, 58940L, 
53220L, 61356L, 54340L, 51706L, 49000L, 48548L, 54340L, 60336L, 
53050L, 54720L, 43380L, 43948L, 41632L, 36190L, 41878L, 45288L, 
49248L, 54372L, 67980L, 46764L, 41254L, 45590L, 43140L, 44160L, 
44500L, 41880L, 43600L, 45868L, 36886L, 39076L, 40920L, 42838L, 
50320L, 44964L, 41938L, 54448L, 51784L, 45288L, 49280L, 44682L, 
51220L, 52030L, 51576L, 58264L, 51690L), spending = c(6692L, 
6228L, 7108L, 9284L, 9338L, 9776L, 11420L, 11072L, 8336L, 7094L, 
6318L, 7242L, 7564L, 8494L, 7964L, 7136L, 6310L, 6118L, 5934L, 
6570L, 7828L, 9034L, 8698L, 10040L, 7188L, 5642L, 6732L, 5840L, 
5960L, 7462L, 5706L, 5066L, 5458L, 4610L, 5284L, 6248L, 5504L, 
6858L, 7894L, 5018L, 10880L, 8084L, 6804L, 5658L, 4594L, 5864L, 
7410L, 8246L, 7216L, 7532L), North = c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), South = c(0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L), West = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L)), class = "data.frame", row.names = c(NA, 
-50L))

Robert Long · Accepted Answer

> My problem is that the intercept for my second regression is wrong, in fact I do not even get an intercept when looking at the summary, unlike with the first regression.

That is because your second model specifies no intercept, since you use ... ~ 0 + ...
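
To make the effect of `0 +` concrete, here is a minimal sketch. It uses only the first six West rows of the data posted in the question (rows 39 to 44); the object names `west_excerpt`, `with_int` and `no_int` are made up for illustration.

```r
# Rows 39-44 of the question's data (the first six West observations)
west_excerpt <- data.frame(
  salary   = c(44964, 41938, 54448, 51784, 45288, 49280),
  spending = c(7894, 5018, 10880, 8084, 6804, 5658)
)

# Default formula: the intercept is included implicitly
with_int <- lm(salary ~ spending, data = west_excerpt)

# "0 +" removes the intercept, so the line is forced through the origin
no_int <- lm(salary ~ 0 + spending, data = west_excerpt)

names(coef(with_int))  # "(Intercept)" "spending"
names(coef(no_int))    # "spending"
```

In a formula, `0` and `1` are not numbers to multiply by; they switch the intercept off and on.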

Also, your first model doesn't make sense because it includes spending twice. The second entry for spending will be ignored by lm.
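
One possible way to get both lines, sketched under the assumption that the goal is the interaction model from the question, is to fit `salary ~ spending * West` once, or equivalently to fit `salary ~ spending` separately on each subset. The example uses an excerpt of the posted data (rows 1 to 6 and 39 to 44) so that it runs on its own; `dat`, `fit`, `fit_rest` and `fit_west` are illustrative names, not from the question.

```r
# Excerpt of the question's data: rows 1-6 (West = 0) and rows 39-44 (West = 1)
dat <- data.frame(
  salary   = c(39166, 40526, 40650, 53600, 58940, 53220,
               44964, 41938, 54448, 51784, 45288, 49280),
  spending = c(6692, 6228, 7108, 9284, 9338, 9776,
               7894, 5018, 10880, 8084, 6804, 5658),
  West     = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)
)

# In a formula, spending * West expands to spending + West + spending:West;
# the intercept is included by default
fit <- lm(salary ~ spending * West, data = dat)
b <- coef(fit)

# Intercept and slope implied for each group by the single model
rest_line <- c(b["(Intercept)"], b["spending"])
west_line <- c(b["(Intercept)"] + b["West"],
               b["spending"] + b["spending:West"])

# The same two lines from separate fits on the subsets
fit_rest <- lm(salary ~ spending, data = subset(dat, West == 0))
fit_west <- lm(salary ~ spending, data = subset(dat, West == 1))

# Scatter plot with one colour per group and both fitted lines
plot(dat$spending, dat$salary,
     col = ifelse(dat$West == 1, "red", "blue"), pch = 19,
     xlab = "spending", ylab = "salary")
abline(fit_west, col = "red")
abline(fit_rest, col = "blue")
```

Because the interaction model gives each group its own intercept and slope, its implied lines coincide with the two separate fits; only the standard errors differ, since the single model pools the residual variance.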
