ATU

Reputation: 111

Unable to run install.packages('arrow') to read a Parquet file (read_parquet). Is there another way, or a different library, to read Parquet files?

I'm very new to R, and even to bash. I'm trying to read a Parquet file from my local disk using the read_parquet function, but that requires installing the arrow library with install.packages('arrow'), which takes forever (it appears to be stuck at the installation step) on my Ubuntu WSL. I have tried everything else I could think of.

install.packages('arrow')  # takes forever to install (appears to hang)
library(arrow)
df <- read_parquet("Financial_Sample.parquet")

Could someone please help me find another function or library to read a Parquet file? Any lead would be appreciated!

Upvotes: 5

Views: 2756

Answers (3)

bjek30d10

Reputation: 166

The fastest way I found to install arrow on remote servers:

Sys.setenv(NOT_CRAN = "true")
install.packages("arrow")

This is the method recommended by the warning messages I got when a standard CRAN install also hung (seemingly forever) for me.
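
For what it's worth, here is a minimal sketch of the same install wrapped in a helper that skips the work when arrow is already present. The function name `install_arrow_with_binaries` and its `repos` default are my own assumptions, not part of arrow. As far as I understand the arrow installation docs, `NOT_CRAN = "true"` lets the installer fetch a prebuilt C++ library instead of compiling it from source, and `ARROW_R_DEV = "true"` makes the install output more verbose, which may help tell a slow source build from a real hang.

```r
# Hypothetical helper (not part of arrow): set the environment variables,
# then install arrow only if it is not already available.
install_arrow_with_binaries <- function(repos = "https://cloud.r-project.org") {
  Sys.setenv(
    NOT_CRAN = "true",    # from the answer above
    ARROW_R_DEV = "true"  # assumption: more verbose install messages
  )
  if (!requireNamespace("arrow", quietly = TRUE)) {
    install.packages("arrow", repos = repos)
  }
  # Report whether arrow can be loaded after the attempt.
  invisible(requireNamespace("arrow", quietly = TRUE))
}
```

Calling `install_arrow_with_binaries()` once in a fresh R session should be enough; the environment variables only need to be set before `install.packages()` runs.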

Upvotes: 2

Gorka

Reputation: 2071

To be able to use read_parquet, I had to install arrow with:

Sys.setenv(LIBARROW_MINIMAL = "false")
install.packages("arrow")

which installs arrow with the following capabilities:

dataset    TRUE
parquet    TRUE
json       TRUE
s3         TRUE
utf8proc   TRUE
re2        TRUE
snappy     TRUE
gzip       TRUE
brotli     TRUE
zstd       TRUE
lz4        TRUE
lz4_frame  TRUE
lzo       FALSE
bz2        TRUE
jemalloc   TRUE
mimalloc   TRUE
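
If you want to check this from code before calling read_parquet, something along these lines should work. It is only a sketch: `can_read_parquet` is a hypothetical helper name of mine, and it assumes `arrow_info()` returns a list whose `capabilities` element is the named logical vector printed above.

```r
# Hypothetical helper: TRUE only when arrow is installed and its build
# reports Parquet support in arrow_info()$capabilities.
can_read_parquet <- function() {
  if (!requireNamespace("arrow", quietly = TRUE)) {
    return(FALSE)
  }
  caps <- arrow::arrow_info()$capabilities
  isTRUE(unname(caps["parquet"]))
}

parquet_file <- "Financial_Sample.parquet"  # file name from the question

if (can_read_parquet() && file.exists(parquet_file)) {
  df <- arrow::read_parquet(parquet_file)
  str(df)
} else {
  message("arrow lacks Parquet support, is not installed, or the file is missing.")
}
```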

Upvotes: 2

darked89

Reputation: 529

Check what your installed arrow build supports:

library(arrow)
arrow_info()
Arrow package version: 6.0.1

Capabilities:
               
dataset    TRUE
parquet    TRUE
json       TRUE
<snip>

The default install of arrow in R does not provide a bunch of compression algorithms (or rather, any of these):

snappy    FALSE
gzip      FALSE
brotli    FALSE
zstd      FALSE
lz4       FALSE
lz4_frame FALSE
lzo       FALSE
bz2       FALSE
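
To see which of these codecs your own build is missing without reading the whole table, a small sketch like the following may help. `missing_codecs` is a hypothetical helper name; it relies on `arrow::codec_is_available()`, and when arrow is not installed at all it simply reports every codec as missing.

```r
# Hypothetical helper: names of compression codecs the installed arrow
# build cannot use. Returns all of them if arrow is not installed.
missing_codecs <- function(codecs = c("snappy", "gzip", "brotli",
                                      "zstd", "lz4", "bz2")) {
  if (!requireNamespace("arrow", quietly = TRUE)) {
    return(codecs)
  }
  available <- vapply(codecs, arrow::codec_is_available, logical(1))
  names(available)[!available]
}

missing <- missing_codecs()
if (length(missing) > 0) {
  message("Codecs not available: ", paste(missing, collapse = ", "))
} else {
  message("All checked codecs are available.")
}
```

A Parquet file compressed with a codec in the missing list would probably fail to read, so this is worth running before blaming the file itself.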

Upvotes: 0
