Reputation: 111
I'm very new to R, and to bash as well. I'm trying to read a Parquet file from my local machine with the read_parquet function, which requires installing the arrow library: install.packages('arrow'). The installation takes forever (it appears to hang at the installation step) on my Ubuntu WSL. I have tried everything else.
install.packages('arrow') #Taking forever to install
library(arrow)
df <- read_parquet("Financial_Sample.parquet")
Could someone please help me find another function or library to read a Parquet file? Any lead would be appreciated!
Upvotes: 5
Views: 2756
Reputation: 166
The fastest way I found to install on remote servers:
Sys.setenv(NOT_CRAN = "true")
install.packages("arrow")
This is the method recommended by the warning messages I got when a standard CRAN install hung (seemingly forever).
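A minimal sketch of the same idea, wrapped so that the install only runs when arrow is not already available. The helper name install_arrow_if_missing and its installer parameter are hypothetical, added here for illustration; only Sys.setenv(NOT_CRAN = "true") and install.packages("arrow") come from the answer itself.

```r
# Hypothetical helper: install arrow only when it cannot already be loaded.
# `installer` is a parameter so the install step can be swapped out or skipped.
install_arrow_if_missing <- function(installer = utils::install.packages) {
  # Same setting as in the answer; it must be set before install.packages()
  Sys.setenv(NOT_CRAN = "true")
  if (!requireNamespace("arrow", quietly = TRUE)) {
    installer("arrow")
  }
  # TRUE if arrow can be loaded afterwards, FALSE otherwise
  invisible(requireNamespace("arrow", quietly = TRUE))
}

# Usage, with the file name from the question:
# if (install_arrow_if_missing()) {
#   df <- arrow::read_parquet("Financial_Sample.parquet")
# }
```

Returning a logical rather than stopping on failure lets the caller decide what to do if the install did not succeed.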
Upvotes: 2
Reputation: 2071
To be able to use read_parquet, I had to install arrow with:
Sys.setenv(LIBARROW_MINIMAL = "false")
install.packages("arrow")
which installs arrow with the following capabilities:
dataset TRUE
parquet TRUE
json TRUE
s3 TRUE
utf8proc TRUE
re2 TRUE
snappy TRUE
gzip TRUE
brotli TRUE
zstd TRUE
lz4 TRUE
lz4_frame TRUE
lzo FALSE
bz2 TRUE
jemalloc TRUE
mimalloc TRUE
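To confirm that a build really has the features you need, one option is to compare the capability list against a set of wanted names. This is a sketch only: missing_capabilities is a hypothetical helper, and it assumes arrow_info() returns a list whose capabilities element is a named logical vector like the listing above.

```r
# Hypothetical helper: names of wanted capabilities that are absent or FALSE.
# `caps` is a named logical vector such as arrow::arrow_info()$capabilities.
missing_capabilities <- function(caps, wanted) {
  present <- names(caps)[!is.na(caps) & caps]
  setdiff(wanted, present)
}

# Usage against the installed build (only runs if arrow can be loaded):
if (requireNamespace("arrow", quietly = TRUE)) {
  caps <- arrow::arrow_info()$capabilities
  print(missing_capabilities(caps, c("parquet", "snappy", "gzip")))
}
```

An empty result means every requested capability is enabled.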
Upvotes: 2
Reputation: 529
Check what your installed build supports:
library(arrow)
arrow_info()
Arrow package version: 6.0.1
Capabilities:
dataset TRUE
parquet TRUE
json TRUE
<snip>
The default install of arrow in R does not provide a number of compression algorithms (or rather, any of these):
snappy FALSE
gzip FALSE
brotli FALSE
zstd FALSE
lz4 FALSE
lz4_frame FALSE
lzo FALSE
bz2 FALSE
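Since a Parquet file written with a codec your build lacks cannot be read, it can help to check the codec before reading. The sketch below uses arrow's codec_is_available() for that; read_parquet_checked is a hypothetical wrapper, and the default of "snappy" is an assumption about how the file was written, not something stated in the question.

```r
# Hypothetical wrapper: fail early with a clear message instead of a
# low-level read error when the needed compression codec is missing.
read_parquet_checked <- function(path, codec = "snappy") {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  if (!requireNamespace("arrow", quietly = TRUE)) {
    stop("The arrow package is not installed")
  }
  # `codec` is an assumption about the file; adjust it to match your data
  if (!arrow::codec_is_available(codec)) {
    stop("This arrow build lacks the '", codec,
         "' codec; reinstall arrow with more features enabled")
  }
  arrow::read_parquet(path)
}

# Usage, with the file name from the question:
# df <- read_parquet_checked("Financial_Sample.parquet")
```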
Upvotes: 0