Reputation: 179
I'm using Azure Databricks to create a simple batch job that copies data from a Databricks file system to another location.
In a notebook cell, I ran this command:
val df = spark.read.text(s"abfss://$fileSystemName@$storageAccountName.dfs.core.windows.net/fec78263-b86d-4531-ad9d-3139bf3aea31.txt")
where the source file name is fec78263-b86d-4531-ad9d-3139bf3aea31.txt.
But when I run the cell, I receive this error message:
HEAD https://bassamsacc01.dfs.core.windows.net/bassamdatabricksfs01/fec78263-b86d-4531-ad9d-3139bf3aea31.txt?timeout=90
StatusCode=403
StatusDescription=This request is not authorized to perform this operation using this permission.
ErrorCode=
ErrorMessage=
at shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:134)
at shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.services.AbfsClient.getPathProperties(AbfsClient.java:353)
at shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.getFileStatus(AzureBlobFileSystemStore.java:498)
at shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:405)
at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:1439)
at org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:47)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:386)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:366)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:355)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:355)
at org.apache.spark.sql.DataFrameReader.text(DataFrameReader.scala:927)
at org.apache.spark.sql.DataFrameReader.text(DataFrameReader.scala:893)
at lineb130a1e5c98d4e8d87dcdb2af6c5332443.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3220206233807239:1)
at lineb130a1e5c98d4e8d87dcdb2af6c5332443.$read$$iw$$iw$$iw$$iw$$iw.<init>(command-3220206233807239:46)
at lineb130a1e5c98d4e8d87dcdb2af6c5332443.$read$$iw$$iw$$iw$$iw.<init>(command-3220206233807239:48)
at lineb130a1e5c98d4e8d87dcdb2af6c5332443.$read$$iw$$iw$$iw.<init>(command-3220206233807239:50)
at lineb130a1e5c98d4e8d87dcdb2af6c5332443.$read$$iw$$iw.<init>(command-3220206233807239:52)
at lineb130a1e5c98d4e8d87dcdb2af6c5332443.$read$$iw.<init>(command-3220206233807239:54)
at lineb130a1e5c98d4e8d87dcdb2af6c5332443.$read.<init>(command-3220206233807239:56)
at lineb130a1e5c98d4e8d87dcdb2af6c5332443.$read$.<init>(command-3220206233807239:60)
at lineb130a1e5c98d4e8d87dcdb2af6c5332443.$read$.<clinit>(command-3220206233807239)
at lineb130a1e5c98d4e8d87dcdb2af6c5332443.$eval$.$print$lzycompute(<notebook>:7)
at lineb130a1e5c98d4e8d87dcdb2af6c5332443.$eval$.$print(<notebook>:6)
at lineb130a1e5c98d4e8d87dcdb2af6c5332443.$eval.$print(<notebook>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:745)
at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1021)
at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:574)
at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:41)
at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:37)
at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:600)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:570)
at com.databricks.backend.daemon.driver.DriverILoop.execute(DriverILoop.scala:215)
at com.databricks.backend.daemon.driver.ScalaDriverLocal.$anonfun$repl$1(ScalaDriverLocal.scala:202)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.backend.daemon.driver.DriverLocal$TrapExitInternal$.trapExit(DriverLocal.scala:714)
at com.databricks.backend.daemon.driver.DriverLocal$TrapExit$.apply(DriverLocal.scala:667)
at com.databricks.backend.daemon.driver.ScalaDriverLocal.repl(ScalaDriverLocal.scala:202)
at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$10(DriverLocal.scala:396)
at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:238)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:233)
at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:230)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:49)
at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:275)
at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:268)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:49)
at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:373)
at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:653)
at scala.util.Try$.apply(Try.scala:213)
at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:645)
at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:486)
at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:598)
at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:391)
at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:337)
at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:219)
at java.lang.Thread.run(Thread.java:748)
At first glance, it looks like an authorization issue when accessing the file system hosted in the Azure storage account, but I don't know how to set up the appropriate credentials.
If this question helps you, please up-vote it. Thanks in advance.
Upvotes: 1
Views: 3061
Reputation: 42043
The error message indicates that you have not granted the correct role to your service principal at the Data Lake Storage Gen2 scope.
To fix the issue, navigate to the storage account in the portal -> Access control (IAM)
-> Add role assignment, and assign your service principal a role such as Storage Blob Data Contributor.
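If you are authenticating from the notebook with the service principal directly (rather than through a mount point), the session configuration generally looks like the sketch below. The application ID, tenant ID, and secret scope/key names are placeholders you would replace with your own values:

// A minimal sketch, assuming direct OAuth access to ADLS Gen2 with a service principal.
// appId, tenantId, and the secret scope/key names below are placeholders.
val appId = "<application-id>"
val tenantId = "<tenant-id>"
val secret = dbutils.secrets.get(scope = "<secret-scope>", key = "<secret-key>")
val suffix = s"$storageAccountName.dfs.core.windows.net"

spark.conf.set(s"fs.azure.account.auth.type.$suffix", "OAuth")
spark.conf.set(s"fs.azure.account.oauth.provider.type.$suffix",
  "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(s"fs.azure.account.oauth2.client.id.$suffix", appId)
spark.conf.set(s"fs.azure.account.oauth2.client.secret.$suffix", secret)
spark.conf.set(s"fs.azure.account.oauth2.client.endpoint.$suffix",
  s"https://login.microsoftonline.com/$tenantId/oauth2/token")

Once the role assignment has propagated (this can take a few minutes), the original spark.read.text call should be authorized.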
For more details, refer to this doc: Create and grant permissions to service principal.
Upvotes: 1