Reputation: 943
I am attempting to build a Partial Least Squares Path Model using 'plspm'. After reading through the tutorial and formatting my data I am getting hung up on an error:
"Error in if (w.dif < tol || itermax == iter) break : missing value where TRUE/FALSE needed".
I assume that this error results from missing values in some of the latent variables (e.g. Soil_Displaced has a lot of NAs because it was only measured in a subset of the replicates in the experiment). Is there a way to get around this error and work with variables that have a lot of missing values? I am attaching my code and dataset here; the dataset can also be found in this Dropbox file: https://www.dropbox.com/sh/51x08p4yf5qlbp5/-al2pwdCol
This is my code for now:
library(plspm)
# inner model matrix (a 1 means the column variable affects the row variable)
warming = c(0,0,0,0,0,0)
Treatment = c(0,0,0,0,0,0)
Soil_Displaced = c(1,1,0,0,0,0)
Mass_Lost_10mm = c(1,1,0,0,0,0)
Mass_Lost_01mm = c(1,1,0,0,0,0)
Daily_CO2 = c(1,1,0,1,0,0)
Path_inner = rbind(warming, Treatment, Soil_Displaced, Mass_Lost_10mm, Mass_Lost_01mm, Daily_CO2)
innerplot(Path_inner)
# outer model: column indices in data.2011 of the indicators for each latent variable
Path_outter = list(3, 4:5, 6, 7, 8, 9)
# modes: "A" designates every block as reflective
Path_modes = rep("A", 6)
# run it: plspm(Data, inner matrix, outer list, modes)
Path_pls = plspm(data.2011, Path_inner, Path_outter, Path_modes)
Any input on this issue would be helpful. Thanks!
Upvotes: 1
Views: 2062
Reputation: 2425
plspm has limited support for missing values; to use it, you have to set the scaling to numeric.
For your example the code looks as follows:
example_scaling = list(c("NUM"),
                       c("NUM", "NUM"),
                       c("NUM"),
                       c("NUM"),
                       c("NUM"),
                       c("NUM"))
Path_pls = plspm(data.2011, Path_inner, Path_outter, Path_modes, scaling = example_scaling)
But here is the limitation:
If your dataset contains even one observation where all indicators of a latent variable are missing, this won't work.
First case: for example, the latent variable "Treatment" has 2 indicators; if only one of them is NA in an observation, it works fine.
Second case: if there is even one observation where both indicators are NA, it won't work.
Since you are measuring the other 5 latent variables with just one indicator each, and you say your data contains lots of missing values, the second case likely applies.
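To see which case applies before calling plspm, you could count, per latent variable, the observations where every indicator is NA. This is only a sketch in base R: `toy` is a made-up stand-in for data.2011 (its column names are hypothetical), and `blocks` mirrors the outer model list from the question.

```r
# Hypothetical stand-in for data.2011: columns 3..9 hold the indicators
# referenced by the outer model list(3, 4:5, 6, 7, 8, 9).
toy <- data.frame(
  id   = 1:4,
  plot = c("a", "b", "c", "d"),
  x3 = c(1, 2, NA, 4),
  x4 = c(1, NA, NA, 4),
  x5 = c(2, 3, NA, 5),
  x6 = c(1, 2, 3, 4),
  x7 = c(NA, 2, 3, 4),
  x8 = c(1, 2, 3, 4),
  x9 = c(1, 2, 3, NA)
)
blocks <- list(3, 4:5, 6, 7, 8, 9)

# For each block, flag rows where every indicator in the block is NA.
all_missing <- sapply(blocks, function(cols) {
  rowSums(!is.na(toy[, cols, drop = FALSE])) == 0
})
colnames(all_missing) <- c("warming", "Treatment", "Soil_Displaced",
                           "Mass_Lost_10mm", "Mass_Lost_01mm", "Daily_CO2")

# Number of fully missing observations per latent variable
n_all_missing <- colSums(all_missing)
print(n_all_missing)

# Keep only observations where no block is entirely missing
complete_blocks <- toy[rowSums(all_missing) == 0, ]
```

Any nonzero count in `n_all_missing` corresponds to the second case. With single-indicator blocks, every NA in that column counts, which is why dropping those rows (as `complete_blocks` does) can discard a lot of data.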
Upvotes: 2
Reputation: 943
plspm will not work with missing values, so I had to interpolate some of the missing values from known observations. Once this was done, the code above worked great!
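The post does not say how the interpolation was done. One possible approach, sketched here with base R's approx(), is linear interpolation along the row order. The helper name `fill_linear` and the example vector are hypothetical, and treating row order as a meaningful axis (e.g. sampling dates) is an assumption that may not hold for this experiment.

```r
# Hypothetical helper: linearly interpolate NAs in a numeric vector,
# using the position in the vector as the x-axis (an assumption).
fill_linear <- function(x) {
  known <- !is.na(x)
  if (sum(known) < 2) return(x)  # not enough known points to interpolate
  # rule = 2 carries the nearest known value into leading/trailing gaps
  approx(x = which(known), y = x[known], xout = seq_along(x), rule = 2)$y
}

# Made-up example values, not taken from data.2011
soil <- c(1.0, NA, 3.0, NA, NA, 6.0, NA)
filled <- fill_linear(soil)
print(filled)
```

To apply it to one indicator, something like `data.2011[[6]] <- fill_linear(data.2011[[6]])` would work, assuming column 6 is numeric.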
Upvotes: 0