Reputation: 1522
I have this data:
Resistance CO_part_l H2_part_l C2H2_part_l rH T_amb
1 7.334982 44.59499 2.33e+19 6.95e+17 36 25
2 7.192182 44.59499 2.33e+19 6.95e+17 36 25
3 7.548556 44.59499 2.33e+19 6.95e+17 36 25
4 7.287561 44.59499 2.33e+19 6.95e+17 36 25
5 5.476464 44.59499 2.33e+19 6.95e+17 36 25
6 5.433722 44.59499 2.33e+19 6.95e+17 36 25
and I want to fit this model:
m4 <- lm(Resistance ~ CO_part_l + H2_part_l + C2H2_part_l + rH + T_amb, data = df)
then predict the values via
pred_df <- data.frame(R_pred = predict(m4, df), CO_part_l = df$CO_part_l)
and plot it finally:
library(ggplot2)
library(latex2exp)  # provides TeX()
ggplot(df, aes(x = exp(CO_part_l), y = exp(Resistance))) +
  geom_point(color = "blue", size = 3, alpha = 0.4) +
  geom_line(color = 'red', data = pred_df, aes(x = exp(CO_part_l), y = exp(R_pred)), alpha = 0.5, size = 1.15) +
  theme_bw() + xlab(TeX("CO / [part/l]")) + ylab(TeX("R / $ \\Omega $ ")) + labs(title = "CO")
and I don't understand why the fitted line looks like pieces of linear functions connected to each other.
Note: Resistance and CO_part_l are log-transformed in the dataset because the relationship is logarithmic, and I have to do that in advance in order to center the data. That's why I exponentiate them in the plot.
You can find the entire data here https://workupload.com/file/WuwqNeyKnAk
I used the dput output, so I hope you can read it in.
Upvotes: 1
Views: 46
Reputation: 173813
If you want a single smooth line through the plot, you can hold the covariates steady (at their means, for example) while changing only the variable plotted on your x axis. In your case, the code to produce the prediction set might look something like this:
pred_df <- do.call(rbind, lapply(seq(40, 45.2, 0.1), function(x)
within(as.data.frame(t(colMeans(df)[3:6])), CO_part_l <- x)
))
Now pred_df is a data frame of all your regressors held at their means, apart from CO_part_l, which is varied evenly throughout its range. We can use this to see how the output variable changes according to a change in CO_part_l when all else is equal:
pred_df$R_pred <- predict(m4, newdata = pred_df)
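To make it concrete why this line comes out smooth, here is a minimal sketch on synthetic data. The numbers and the simulated relationship are made up for illustration and are not the asker's data; only the column names are borrowed. It checks that predicting on such a grid is the same as a constant offset (intercept plus each coefficient times its covariate mean) plus the CO_part_l slope times x:

```r
# Synthetic stand-in for the real data (all values are assumptions)
set.seed(1)
n <- 200
df <- data.frame(
  CO_part_l   = runif(n, 40, 45),
  H2_part_l   = rnorm(n, 45, 1),
  C2H2_part_l = rnorm(n, 41, 1),
  rH          = rnorm(n, 36, 5),
  T_amb       = rnorm(n, 25, 2)
)
df$Resistance <- 30 - 0.5 * df$CO_part_l - 0.05 * df$H2_part_l +
  0.02 * df$rH + rnorm(n, sd = 0.1)

m4 <- lm(Resistance ~ CO_part_l + H2_part_l + C2H2_part_l + rH + T_amb,
         data = df)

# Grid over CO_part_l with the other regressors fixed at their means
others <- c("H2_part_l", "C2H2_part_l", "rH", "T_amb")
pred_df <- data.frame(CO_part_l = seq(40, 45, 0.1))
for (v in others) pred_df[[v]] <- mean(df[[v]])
pred_df$R_pred <- predict(m4, newdata = pred_df)

# The same prediction by hand: constant offset + slope * CO_part_l
b <- coef(m4)
offset <- b["(Intercept)"] + sum(b[others] * colMeans(df[others]))
by_hand <- unname(offset + b["CO_part_l"] * pred_df$CO_part_l)

all.equal(unname(pred_df$R_pred), by_hand)
```

Because only CO_part_l changes along the grid, the prediction is a single straight line in the log-transformed variables, which becomes a smooth curve once both axes are exponentiated.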
And that means your plot will look like this:
ggplot(df, aes(x = exp(CO_part_l), y = exp(Resistance))) +
  geom_point(color = "blue", size = 3, alpha = 0.4) +
  geom_line(color = 'red', data = pred_df,
            aes(x = exp(CO_part_l), y = exp(R_pred)),
            alpha = 0.5, size = 1.15) +
  theme_bw() + xlab(TeX("CO / [part/l]")) +
  ylab(TeX("R / $ \\Omega $ ")) +
  labs(title = "CO")
This probably looks more convincing on a log scale. You could also simply not exponentiate your y axis, but I'm not sure of the physical relevance of the numbers, so I'll just add a log scale here:
ggplot(df, aes(x = exp(CO_part_l), y = exp(Resistance))) +
  geom_point(color = "blue", size = 3, alpha = 0.4) +
  geom_line(color = 'red', data = pred_df,
            aes(x = exp(CO_part_l), y = exp(R_pred)),
            alpha = 0.5, size = 1.15) +
  theme_bw() +
  xlab(TeX("CO / [part/l]")) +
  ylab(TeX("R / $ \\Omega $ ")) +
  labs(title = "CO") +
  scale_y_log10()
And of course, also putting the x axis on a log scale with scale_x_log10 would give a straight line, though not quite as nice a plot:
ggplot(df, aes(x = exp(CO_part_l), y = exp(Resistance))) +
  geom_point(color = "blue", size = 3, alpha = 0.4) +
  geom_line(color = 'red', data = pred_df,
            aes(x = exp(CO_part_l), y = exp(R_pred)),
            alpha = 0.5, size = 1.15) +
  theme_bw() +
  xlab(TeX("CO / [part/l]")) +
  ylab(TeX("R / $ \\Omega $ ")) +
  labs(title = "CO") +
  scale_y_log10() +
  scale_x_log10()
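As for why the original line looked like connected linear pieces, the following is a rough sketch on synthetic data (a hypothetical two-regressor model, not the asker's data). When you predict on the original rows, the other covariates change from one point to the next, so each segment of the line joins predictions made under different conditions. On a grid with the covariate held fixed, the slope between neighbouring points is constant:

```r
# Synthetic example: rH takes a few distinct values across observations
set.seed(42)
n <- 100
df <- data.frame(CO_part_l = runif(n, 40, 45),
                 rH = sample(c(30, 50, 70), n, replace = TRUE))
df$Resistance <- 30 - 0.5 * df$CO_part_l + 0.03 * df$rH +
  rnorm(n, sd = 0.05)

fit <- lm(Resistance ~ CO_part_l + rH, data = df)

# Predicting on the original rows: rH jumps from point to point
ord <- order(df$CO_part_l)
jagged <- predict(fit, df)[ord]

# Predicting on a grid with rH fixed at its mean
grid <- data.frame(CO_part_l = seq(40, 45, 0.1), rH = mean(df$rH))
smooth <- predict(fit, newdata = grid)

# Slopes between neighbouring points: widely varying vs. constant
jagged_slopes <- diff(jagged) / diff(df$CO_part_l[ord])
smooth_slopes <- diff(smooth) / diff(grid$CO_part_l)
range(jagged_slopes)
range(smooth_slopes)
```

The first range should be wide and the second should collapse to essentially a single value, which is the fitted CO_part_l coefficient.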
Upvotes: 3