Reputation: 93
I'm working on a project that ingests files, stores them in ADLS, and records their location and some metadata in CosmosDB. To curate this data, I am writing a .NET console application that locates, and optionally deletes, the files in ADLS which are no longer referenced by the CosmosDB records.
This console application uses user credentials rather than service credentials because it will be used by the devops team and we don't want to give them access to delete files they do not have permission to delete.
The essential code for handling the files was based on the Microsoft documentation page.
When running with my own credentials against the developer ADLS, I can list the files fine. I have write access to the files that I should be deleting, and write and execute access to the parent folder. Additionally, I can delete the files in the Azure portal and using the Microsoft Azure Storage Explorer.
However, when I use the following lines of code:
// helper function taken from MS documentation
var tokenCache = GetTokenCache(Path.Combine(MY_DOCUMENTS, "my.tokencache"));
string adlsFqdn = "<myadlsaccount>.azuredatalakestore.net";
// helper function taken from same documentation
var adlCreds = GetCreds_User_Popup(adlsTenant, ADL_TOKEN_AUDIENCE,
    adlsClientId, tokenCache);
var adlsClient = AdlsClient.CreateClient(adlsFqdn, adlCreds);
IEnumerable<string> unwantedPaths = FindUnwantedFilePaths(adlsClient);
// ...
adlsClient.Delete("<samplepath>");
Then the call to AdlsClient.Delete fails with the following message:
Operation: DELETE failed with HttpStatus:BadRequest Error: Uexpected error in JSON parsing.
Last encountered exception thrown after 1 tries. [Uexpected error in JSON parsing]
[ServerRequestId:]
I can see that it's easily possible that I might be doing something slightly wrong, but as I can delete some of these files from ADLS using other tools, it looks like it isn't a problem with my account. (Although I am the owner of the ADLS component). I had a look around, and didn't see anyone with a problem like this.
Does anyone have any clue what I am doing wrong? Is it impossible to delete files programmatically with a user account if you're not the file owner? Failing that, having someone correctly interpret what the error means would be helpful. My guess is that there is a subtle issue around rights, but I can't quite see what exactly I need to do.
Finally, I can delete these files in other ways, but that isn't the point - I need this tool to be able to do this for the good of the project I am working on.
Upvotes: 0
Views: 300
Reputation: 11
One scenario that causes the "unexpected error in JSON parsing" is a file path that does not start with a leading slash, so make sure you are sending the path in the /{path} format and not just {path}.
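As a minimal sketch of that advice, a small helper could normalize each path before it is passed to the client. The class and method names here (AdlsPathHelper, EnsureLeadingSlash) are hypothetical and are not part of Microsoft.Azure.DataLake.Store; this only illustrates the /{path} format and is not confirmed to resolve the asker's case.

```csharp
using System;

// Hypothetical helper, not part of the ADLS SDK.
public static class AdlsPathHelper
{
    // Returns the path with exactly one leading slash:
    // "folder/file.txt" and "//folder/file.txt" both become "/folder/file.txt".
    public static string EnsureLeadingSlash(string path)
    {
        if (string.IsNullOrWhiteSpace(path))
        {
            throw new ArgumentException("Path must not be empty.", nameof(path));
        }

        // Strip any existing leading slashes, then add a single one back.
        return "/" + path.TrimStart('/');
    }
}
```

With the client from the question, the call would then look like adlsClient.Delete(AdlsPathHelper.EnsureLeadingSlash(path)).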
Upvotes: 1
Reputation: 93
I found a way to delete the files by avoiding the wrapper library Microsoft.Azure.DataLake.Store and instead using the REST API exposing the HDFS interface, as documented here, with the addition of an authentication header containing a correctly set up bearer token.
The code to do this is:
var request = new HttpRequestMessage(HttpMethod.Delete,
    "https://<accountname>.azuredatalakestore.net/webhdfs/v1/<path>?op=DELETE&recursive=false");
// The same adlCreds as in the question. This adds the correct authentication
// token to the request
await adlCreds.ProcessHttpRequestAsync(request, CancellationToken.None);
var client = new HttpClient();
var response = await client.SendAsync(request);
string responseContent = response.Content != null
    ? await response.Content.ReadAsStringAsync()
    : string.Empty;
// response content is JSON equivalent to {"boolean":true}
// if the call to delete succeeded.
What I don't understand is why the library produces a bad request when the straightforward implementation works. But at least I can delete the files I need to now. This isn't a way of bypassing the security: I should be able to delete the files I am trying to delete, and now I can.
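To act on the response rather than just read it, the body can be checked for the {"boolean":true} shape mentioned in the comments above. This is only a sketch: the class and method names (WebHdfsResponse, IsDeleteSuccess) are hypothetical, and it assumes System.Text.Json is available (built into .NET Core 3.0 and later, or via the NuGet package on older frameworks).

```csharp
using System.Text.Json;

// Hypothetical helper for interpreting the WebHDFS DELETE response body.
public static class WebHdfsResponse
{
    // Returns true only when the body is a JSON object whose
    // "boolean" property is the literal true, e.g. {"boolean":true}.
    public static bool IsDeleteSuccess(string responseContent)
    {
        if (string.IsNullOrWhiteSpace(responseContent))
        {
            return false;
        }

        try
        {
            using (JsonDocument doc = JsonDocument.Parse(responseContent))
            {
                JsonElement root = doc.RootElement;
                return root.ValueKind == JsonValueKind.Object
                    && root.TryGetProperty("boolean", out JsonElement value)
                    && value.ValueKind == JsonValueKind.True;
            }
        }
        catch (JsonException)
        {
            // The body was not valid JSON, so treat it as a failed delete.
            return false;
        }
    }
}
```

After the snippet above, this would be called as WebHdfsResponse.IsDeleteSuccess(responseContent), ideally alongside a check of response.IsSuccessStatusCode.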
Upvotes: 0