Joost Maxen

Reputation: 179

Labels applied in R do not save when writing as a Stata file

I added variable labels (and, for some variables, value labels) in R using the apply_labels function from 'expss'. When I save the data with 'write.dta' and open the file in Stata (or read the newly saved file back into R), the labels do not appear.

I suspect it has something to do with this line in the write.dta documentation:

If the "var.labels" attribute contains a character vector with a string label for each variable then this is written as the variable labels. Otherwise the variable names are repeated as variable labels.

This is exactly what is happening: the variable names are repeated as variable labels. When I check with attr(df$variable, "label") before writing the data with write.dta, the labels are there.

I also get the warning message:

"In write.dta [...] abbreviating variable names".

I am not sure whether this is related to the problem.

A reproducible example of the code used to add the variable, apply the labels, and write the data:

library(expss)
library(dplyr)
library(foreign)

df <- data.frame(country = rep(c("NL", "DE", "FR", "AT"), 2),
                 year = rep(c(2012,2014), 4),
                 LS_medianpovgap60_disp_wa = c(0.448257605781815, 0.468249874784546, 0.473270740126805, 0.483814288478694, 0.486781335455043, 0.49246341926957, 0.51121872756711, 0.556027028656306))

df <- apply_labels(df,
                   country = "Country",
                   year = "Year",
                   LS_medianpovgap60_disp_wa = "Median shortfall from the poverty thresholds using 60% of the median income, disposable income only households with working age (LIS and SILC average)")

write.dta(df, "df_labelled.dta")

Upvotes: 2

Views: 521

Answers (1)

xilliam

Reputation: 2259

For Stata versions > 7, write.dta attempts to abbreviate a variable's label if the label attribute is longer than 31 characters.

You may get a better result by using the haven package for the writing and reading steps of your code.

haven::write_dta(df, "df_labelled.dta")
temp <- haven::read_dta("df_labelled.dta")
temp

Edit: The comments below point out that Stata limits the length of a variable label to 80 characters, so any R-based workaround is subject to this constraint.
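If you would rather stay with foreign::write.dta, the documentation passage quoted in the question suggests another route: collect the per-column labels into a "var.labels" attribute on the data frame itself before writing. The following is only a minimal sketch under a few assumptions. The per-column "label" attributes are set by hand here as a stand-in for what apply_labels produces, the helper name collect_var_labels is hypothetical, and labels are cut to 80 characters to respect the Stata limit mentioned above.

```r
library(foreign)

df <- data.frame(country = rep(c("NL", "DE", "FR", "AT"), 2),
                 year = rep(c(2012, 2014), 4),
                 LS_medianpovgap60_disp_wa = seq(0.44, 0.56, length.out = 8))

# Stand-in for expss::apply_labels(): plain "label" attributes on each column
attr(df$country, "label") <- "Country"
attr(df$year, "label") <- "Year"
attr(df$LS_medianpovgap60_disp_wa, "label") <- paste(
  "Median shortfall from the poverty thresholds using 60% of the median income,",
  "disposable income only households with working age (LIS and SILC average)")

# Hypothetical helper: one label per column, falling back to the column name,
# truncated to max_len characters (80 is the Stata limit noted in the edit above)
collect_var_labels <- function(data, max_len = 80) {
  labs <- vapply(names(data), function(nm) {
    lab <- attr(data[[nm]], "label", exact = TRUE)
    if (is.null(lab)) nm else as.character(lab)[1]
  }, character(1))
  substr(unname(labs), 1, max_len)
}

# write.dta looks for this attribute on the data frame, not on the columns
attr(df, "var.labels") <- collect_var_labels(df)

path <- tempfile(fileext = ".dta")
write.dta(df, path)

# Read the file back and inspect the stored variable labels
back <- read.dta(path)
attr(back, "var.labels")
```

Note that the long label is truncated rather than rejected, so you may prefer to shorten it by hand to something that still reads well within 80 characters.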

Upvotes: 1
