Reputation: 1117
I have two data sets:
http://www.filedropper.com/dataa_1 ## DataA
http://www.filedropper.com/datab ## DataB
DataA has 42 rows and 8 columns, and DataB has 42 rows and 6 columns. We want to run CCA and sPLS on these two data sets in R. My question is this: in DataB, each group of eleven rows has identical values. Will this affect the results or cause a discrepancy in either the CCA or the sPLS?
Upvotes: 0
Views: 179
Reputation: 3429
From a look at block B (DataB), the variables appear to be discrete.
Using such variables in PLS or CCA is not a (technical) problem, but it does pose statistical "challenges": a bootstrap or jackknife may be required to take the statistical interpretation of the results any further.
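To make the bootstrap idea concrete, here is a minimal sketch in plain Python (not R, and not the asker's data). It resamples rows with replacement and reports a percentile interval for a statistic computed on paired columns. The Pearson correlation is only a stand-in for whatever CCA or sPLS quantity you actually care about, and the data, the function names `pearson` and `bootstrap_ci`, and the defaults are all assumptions made up for illustration.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if one is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / (sxx * syy) ** 0.5


def bootstrap_ci(x, y, stat=pearson, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for stat(x, y), resampling rows jointly."""
    rng = random.Random(seed)
    n = len(x)
    values = []
    for _ in range(n_boot):
        # Draw n row indices with replacement; x and y stay paired.
        idx = [rng.randrange(n) for _ in range(n)]
        v = stat([x[i] for i in idx], [y[i] for i in idx])
        if v == v:  # skip NaN from degenerate resamples
            values.append(v)
    values.sort()
    lo = values[int((alpha / 2) * len(values))]
    hi = values[min(len(values) - 1, int((1 - alpha / 2) * len(values)))]
    return lo, hi


# Synthetic stand-in data: y is discrete and constant within blocks of rows,
# x is a noisy continuous variable related to y.
data_rng = random.Random(1)
y = [level for level in (0, 1, 2, 3) for _ in range(10)]
x = [level + data_rng.gauss(0, 1) for level in y]

lo, hi = bootstrap_ci(x, y)
print("observed:", pearson(x, y), "95% interval:", (lo, hi))
```

With a variable that is constant within blocks of rows, it may be more appropriate to resample whole blocks rather than single rows; that choice depends on how the data were collected, which the thread does not say.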
You should also ask yourself whether this "discrete" representation is accurate for your data. Treating the values as numbers may be wrong if the original variables are categorical, in which case you should use dummy variables.
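For the dummy-variable route, a minimal sketch of the recoding in plain Python follows. The function name `dummy_code`, the `drop_first` option, and the example codes are hypothetical and only illustrate the idea; in R you would normally let a modelling helper build the indicator columns for you.

```python
def dummy_code(values, drop_first=True):
    """Turn one categorical column into 0/1 indicator columns.

    Returns the levels that received a column and the indicator rows.
    Dropping the first level avoids a set of columns that always sums to 1,
    which would be perfectly collinear once the data are centred.
    """
    levels = sorted(set(values))
    kept = levels[1:] if drop_first else levels
    rows = [[1 if v == level else 0 for level in kept] for v in values]
    return kept, rows


# Made-up category codes: the numbers are labels, not quantities.
codes = [1, 1, 2, 2, 3, 3]
kept, rows = dummy_code(codes)
for code, row in zip(codes, rows):
    print(code, row)
```

Each original column then contributes (number of levels - 1) indicator columns to block B, so the block gets wider; with only 42 rows, that is worth keeping in mind before running CCA.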
Upvotes: 1