ankit

Reputation: 347

R: How to read a parquet file from a datastore in R in Azure ML Notebook

I use the following Python code to read a Parquet file from a datastore:

from azureml.core import Dataset, Datastore, Workspace

subscription_id = 'xyz'
resource_group = 'abc'
workspace_name = 'pqr'

workspace = Workspace(subscription_id, resource_group, workspace_name)
datastore = Datastore.get(workspace, 'workspaceblobstore')

tabular_dataset_3 = Dataset.Tabular.from_parquet_files(path=(datastore,'/UI/09-17-2022_125003_UTC/userdata1.parquet'))

df = tabular_dataset_3.to_pandas_dataframe()

I have checked here but could not find any documentation on reading a Parquet file from a datastore in R.

I am using an Azure ML notebook with the R kernel. Can anyone help me write the equivalent R code?

Any help would be appreciated.

Upvotes: 0

Views: 966

Answers (1)

Kyle M

Reputation: 11

I'm completely new to R, but I've been able to read Parquet files from our storage account using the AzureStor R package together with the Apache arrow package:

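# AzureStor pulls in AzureRMR as a dependency, which provides get_azure_token()
install.packages("AzureStor")
install.packages("arrow")

library(AzureStor)
library(arrow)

# Authenticate against Azure Storage with a service principal (placeholders must be filled in)
token <- AzureRMR::get_azure_token("https://storage.azure.com", tenant="<tenant-id>", app="<client-id>", password="<client-secret>")
# Connect to the storage account endpoint and open the container
ad_endp_tok2 <- storage_endpoint("https://<mystorageaccount>.dfs.core.windows.net", token=token)
container <- storage_container(ad_endp_tok2, "<mycontainername>")
# dest=NULL downloads the file into memory as a raw vector instead of writing it to disk
rawdata <- storage_download(container, src="<somefile>.parquet", dest=NULL)
# read_parquet() accepts the raw vector and returns a data frame
parq_df <- read_parquet(rawdata)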

Upvotes: 1
