Survey design related issues in R

Question

I joined five datasets using the full_join() function from the dplyr package. The first dataset had 6,165 rows and the second had 5,827 rows. The final joined dataset has 33,503 rows. I used the following code to join them:

    n2 <- full_join(n96, n01)
    n3 <- full_join(n2, n06)
    n4 <- full_join(n3, n11)
    nf <- full_join(n4, n16)
    View(nf)

The final dataset looks like this:

 v000    v005     age  v021  v022  v023    v024    resi  region    v102 education pregnant  v445    v501    v717  wealth occupation marital  wgtv   BMI obov 
                        
1 NP3   412612 6 [40-~   101    51     0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 0 [no ed~ 0 [no o~  2285 1 [mar~ 4 [agr~ 1 [poo~ 2 [cleric~ 1 [mar~ 0.413  22.8 0    
2 NP3   412612 3 [25-~   101    51     0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 0 [no ed~ 0 [no o~  2159 1 [mar~ 4 [agr~ 1 [poo~ 2 [cleric~ 1 [mar~ 0.413  21.6 0    
3 NP3   412612 4 [30-~   101    51     0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 0 [no ed~ 0 [no o~  2167 1 [mar~ 4 [agr~ 3 [mid~ 2 [cleric~ 1 [mar~ 0.413  21.7 0    
4 NP3   412612 5 [35-~   101    51     0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 0 [no ed~ 0 [no o~  2039 1 [mar~ 4 [agr~ 4 [ric~ 2 [cleric~ 1 [mar~ 0.413  20.4 0    
5 NP3   412612 2 [20-~   101    51     0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 1 [prima~ 0 [no o~  2163 1 [mar~ 4 [agr~ 3 [mid~ 2 [cleric~ 1 [mar~ 0.413  21.6 0    
6 NP3   412612 5 [35-~   101    51     0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 0 [no ed~ 0 [no o~  3785 1 [mar~ 4 [agr~ 2 [poo~ 2 [cleric~ 1 [mar~ 0.413  37.8 2    
# ... with 6 more variables: over, age1, working_status, education1, year, stra

As it is a complex survey dataset, I set up a survey design:

svs<-svydesign(id=nf$v021, strata=nf$stra, nest=TRUE, weights=nf$wgtv, data=nf)

It works. During analysis, I ran into object-related errors. To fix them, I used the following code:

svs1 <- 
  update(
    svs, 
    one=1, 
    edu = factor( education, levels = c(0, 1, 2, 3), labels = 
                    c("no edu", "primary", "secondary", "higher") ),
    
    wealth =factor( wealth, levels = c(1, 2, 3, 4, 5) , labels = 
                      c("poorest", "poorer", "middle", "richer", "richest")),
    marital = factor( marital, levels = c(0, 1) , labels = 
                        c( "never married", "married")),
    occu = factor( occu, levels = c(0, 1, 2, 3) , labels =
                           c( "not working" , "professional/technical/manageral/clerial/sale/services" , "agricultural", "skilled/unskilled manual") ),
    age1 = factor(age1, levels = c(1, 2, 3), labels =
                   c( "early" , "mid", "late") ),
    obov= factor(obov, levels = c(0, 1, 2), labels= 
                      c("normal", "overweight", "obese")),
    
    over= factor(over, levels = c(0, 1), labels= 
                   c("normal", "overweight/obese")),
    
    working_status= factor (working_status, levels = c(0, 1), labels = c("not working", "working")),
    education1= factor (education1, levels = c(0, 1, 2), labels= 
                          c("no education", "primary", "secondary/secondry+")),
    resi= factor (resi, levels= c(0,1), labels= c("urban", "rural"))
  )

Now I get the following error:

Error in `[<-.data.frame`(`*tmp*`, , newnames[j], value = c(3L, 3L, 3L,  : 
  replacement has 12674 rows, data has 33503

Could you please suggest how I can fix this error?


Answers (1)
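
One possible cause, offered as a guess rather than a confirmed diagnosis: update() on a survey design evaluates each expression inside the design's data first and falls back to the calling environment for any name it cannot find there. The printed columns of nf include occupation but not occu, so if a leftover vector named occu with 12,674 elements exists in the workspace, factor(occu, ...) would pick that up and fail with exactly this kind of row-count mismatch. The sketch below reproduces the behaviour on toy data; nf_toy, its values, and the stray occu vector are all hypothetical stand-ins, and the labels are shortened for readability.

```r
library(survey)

# Hypothetical toy data standing in for nf (12 rows, 4 PSUs in 2 strata)
nf_toy <- data.frame(
  v021       = rep(1:4, each = 3),   # cluster (PSU) id
  stra       = rep(1:2, each = 6),   # stratum
  wgtv       = rep(c(0.5, 1.5), 6),  # sampling weight
  occupation = rep(0:3, 3)           # coded occupation
)

# Formula interface: variables are looked up by name inside `data`
des <- svydesign(ids = ~v021, strata = ~stra, weights = ~wgtv,
                 nest = TRUE, data = nf_toy)

# Assumed leftover object in the workspace with a different length
occu <- c(0, 1, 2, 3, 0)

# `occu` is not a column of the design, so the workspace vector is used
# and the assignment fails with a "replacement has ... rows" error
res <- try(update(des, occu = factor(occu, levels = 0:3)), silent = TRUE)
inherits(res, "try-error")

# Referring to the column that actually exists in the data works
des2 <- update(
  des,
  occu = factor(occupation, levels = c(0, 1, 2, 3),
                labels = c("not working", "professional", "agricultural", "manual"))
)
table(des2$variables$occu)
```

If this is what is happening, checking names(nf) against every variable used inside update(), and running exists("occu") and length(occu) in the console, should show which name is being taken from the workspace instead of the data. Writing the design with formulas (ids = ~v021 and so on) rather than nf$v021 also keeps every lookup tied to the data argument.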
