Reputation: 347
I used the following Python code to read a Parquet file from a datastore:
from azureml.core import Dataset, Datastore, Workspace
# Placeholder identifiers for the Azure ML workspace
subscription_id = 'xyz'
resource_group = 'abc'
workspace_name = 'pqr'
workspace = Workspace(subscription_id, resource_group, workspace_name)
# Look up the default blob datastore registered with the workspace
datastore = Datastore.get(workspace, 'workspaceblobstore')
# Build a tabular dataset from the Parquet file and load it into pandas
tabular_dataset_3 = Dataset.Tabular.from_parquet_files(path=(datastore, '/UI/09-17-2022_125003_UTC/userdata1.parquet'))
df = tabular_dataset_3.to_pandas_dataframe()
I have checked here but could not find any documentation on reading a Parquet file from a datastore.
I am using an Azure ML notebook with the R kernel. Can anyone please help me write the equivalent R code?
Any help would be appreciated.
Upvotes: 0
Views: 966
Reputation: 11
I'm completely new to R, but I've been able to read Parquet files in our storage account using the AzureStor R package together with the Apache "arrow" package:
install.packages("AzureStor")
install.packages("arrow")
library(AzureStor)
library(arrow)
# Authenticate against Azure Storage with a service principal
token <- AzureRMR::get_azure_token("https://storage.azure.com", tenant="<tenant-id>", app="<client-id>", password="<client-secret>")
# Connect to the storage account and open the container
ad_endp_tok2 <- storage_endpoint("https://<mystorageaccount>.dfs.core.windows.net", token=token)
container <- storage_container(ad_endp_tok2, "<mycontainername>")
# dest=NULL downloads the file into memory as a raw vector instead of to disk
rawdata <- storage_download(container, src="<somefile>.parquet", dest=NULL)
# read_parquet() accepts the raw vector directly
parq_df <- read_parquet(rawdata)
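If the file is large, holding it in memory as a raw vector may not be ideal. Here is a minimal sketch of a variant that downloads to a temporary file first and reads it from there. read_blob_parquet is a hypothetical helper name of my own, not part of AzureStor or arrow, and it assumes both packages are installed and that container was created as in the code above.

```r
# Hypothetical helper (not part of AzureStor or arrow): download one blob to a
# temporary file, then read that file with arrow::read_parquet().
read_blob_parquet <- function(container, blob_path) {
  # tempfile() only generates a path; storage_download() creates the file
  tmp <- tempfile(fileext = ".parquet")
  AzureStor::storage_download(container, src = blob_path, dest = tmp)
  arrow::read_parquet(tmp)
}

# Usage, with `container` created as in the code above. The path is the one
# from the question, assuming that container is the one behind the datastore:
# df <- read_blob_parquet(container, "UI/09-17-2022_125003_UTC/userdata1.parquet")
```

The temporary file lives in the R session's temp directory, which R removes when the session ends.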
Upvotes: 1