ebay

Reputation: 143

failed to parse using R package haven read_sas

I'm importing a SAS data file (.sas7bdat) together with its formats (called "labels" in R), which are stored in a .sas7bcat catalog file, using read_sas from the haven package. Importing the data file on its own works just fine. However, when I try to import it with the formats file using the following code, I get this error message:

pri <- read_sas (path = "Datasets/pri.sas7bdat",
                 path.cat = "Datasets/formats.sas7bcat")

Error in df_parse_sas_file(spec_data, spec_cat, encoding = encoding, catalog_encoding = catalog_encoding,  : 
  Failed to parse .../formats.sas7bcat: Invalid file, or file has unsupported features.

I don't have any idea where the problem lies. The SAS file has almost 250 variables, and they all have both labels and formats (in SAS terminology).

I have read everything related to this question on Stack Overflow. I believe the problem probably lies with the formats file itself, but I can't figure out what it is.

I'm using the latest version of haven (2.3.1) with R 4.0.3 (2020-10-10) and RStudio 1.3.1093 on Windows 10.

This link has the SAS code for the formats

This link has the .sas7bcat library

Upvotes: 0

Views: 6146

Answers (1)

barboulotte

Reputation: 405

sessionInfo()
#> R version 4.0.4 (2021-02-15)
#> Platform: i386-w64-mingw32/i386 (32-bit)
#> Running under: Windows 10 x64 (build 17763)
#> 
#> other attached packages:
#> [1] haven_2.3.1

I created a dataset named class in SAS as follows:

data temp.class;
  set sashelp.class;
  if sex = "M" then sexnum = 1; else sexnum = 2;
  format sexnum sexfmt.;
run;

Note that your syntax doesn't work for me, because read_sas in haven 2.3.1 rejects path and path.cat as unused arguments:

haven::read_sas (path = "d:/temp/class.sas7bdat", 
                 path.cat = "d:/temp/formats.sas7bcat")
#> Error in read_sas(path = "d:/temp/class.sas7bdat", path.cat = "d:/temp/formats.sas7bcat") : 
#>   unused arguments (path = "d:/temp/class.sas7bdat", path.cat = "d:/temp/formats.sas7bcat")

With the data_file and catalog_file arguments instead, I get the same error message as you:

haven::read_sas (data_file = "d:/temp/class.sas7bdat", 
                 catalog_file = "d:/temp/formats.sas7bcat")
#> Error in df_parse_sas_file(spec_data, spec_cat, encoding = encoding, catalog_encoding = catalog_encoding,  : 
#>   Failed to parse D:/temp/formats.sas7bcat: Invalid file, or file has unsupported features.

or, if the message language is set to fr:

haven::read_sas (data_file = "d:/temp/class.sas7bdat", 
                 catalog_file = "d:/temp/formats.sas7bcat")
#> Error in df_parse_sas_file(spec_data, spec_cat, encoding = encoding, catalog_encoding = catalog_encoding,  : 
#>   Failed to parse D:/temp/formats.sas7bcat: Unable to allocate memory.

The cause is the format rekof, which is defined without any values. Once it is removed, the import works:

haven::read_sas (data_file = "d:/temp/class.sas7bdat", 
                 catalog_file = "d:/temp/formats.sas7bcat")
#> # A tibble: 19 x 6
#>    Name    Sex     Age Height Weight     sexnum
#>    <chr>   <chr> <dbl>  <dbl>  <dbl>  <dbl+lbl>
#>  1 Alfred  M        14   69    112.  1 [Male]  
#>  2 Alice   F        13   56.5   84   2 [Female]
#>  3 Barbara F        13   65.3   98   2 [Female]
#>  4 Carol   F        14   62.8  102.  2 [Female]
#>  5 Henry   M        14   63.5  102.  1 [Male]  
#>  6 James   M        12   57.3   83   1 [Male]  
#>  7 Jane    F        12   59.8   84.5 2 [Female]
#>  8 Janet   F        15   62.5  112.  2 [Female]
#>  9 Jeffrey M        13   62.5   84   1 [Male]  
#> 10 John    M        12   59     99.5 1 [Male]  
#> 11 Joyce   F        11   51.3   50.5 2 [Female]
#> 12 Judy    F        14   64.3   90   2 [Female]
#> 13 Louise  F        12   56.3   77   2 [Female]
#> 14 Mary    F        15   66.5  112   2 [Female]
#> 15 Philip  M        16   72    150   1 [Male]  
#> 16 Robert  M        12   64.8  128   1 [Male]  
#> 17 Ronald  M        15   67    133   1 [Male]  
#> 18 Thomas  M        11   57.5   85   1 [Male]  
#> 19 William M        15   66.5  112   1 [Male]  

To remove the format rekof, you can either rebuild the catalog without it:

  • delete the formats catalog
  • comment out the line, i.e. /* value rekof ; */
  • regenerate the formats catalog (run the proc format again)

or use the following SAS code:

proc catalog catalog=lcoc.formats; 
  delete rekof (et=format);
run;
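
If the catalog cannot be fixed on the SAS side right away, one possible stopgap on the R side is to fall back to importing the data without formats. This is only a sketch under assumptions: read_sas_with_fallback is a hypothetical helper name, not part of haven, and it relies on read_sas accepting data_file and catalog_file as in the calls above.

```r
# Hypothetical helper (not part of haven): try to read the data together with
# the formats catalog and, if that call fails, read the data file alone.
# `reader` defaults to haven::read_sas; it is a parameter only so that the
# fallback logic can be tried out with another function.
read_sas_with_fallback <- function(data_file, catalog_file,
                                   reader = haven::read_sas) {
  tryCatch(
    reader(data_file = data_file, catalog_file = catalog_file),
    error = function(e) {
      warning("Could not read the catalog, importing without formats: ",
              conditionMessage(e), call. = FALSE)
      reader(data_file = data_file)
    }
  )
}

# Usage, with the paths from this answer:
# class <- read_sas_with_fallback("d:/temp/class.sas7bdat",
#                                 "d:/temp/formats.sas7bcat")
```

With the fallback, the variables come back without value labels, so this only keeps the import running. The empty format still has to be removed from the catalog to get the labels.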

Minimal example producing the error

In SAS

libname temp "d:/temp";
option fmtsearch=(temp);

proc format lib=temp;
  value test;
run;

data temp.class;
  set sashelp.class;
run;

In R

haven::read_sas (data_file = "d:/temp/class.sas7bdat", 
                 catalog_file = "d:/temp/formats.sas7bcat")
#> Error in df_parse_sas_file(spec_data, spec_cat, encoding = encoding, catalog_encoding = catalog_encoding,  : 
#>   Failed to parse D:/temp/formats.sas7bcat: Unable to allocate memory.

Regards,

Upvotes: 1
