Reputation: 514
Question: Say a user uploads highly confidential information. This is placed in a third party storage server. This third party bucket uses different authentication systems to the web application. What is the best practice for ensuring only the user or an admin staff member can access the file url?
More Context: A Django web application is running on Google App Engine Flexible. Google Storage is used to serve static and media files through Django. The highly confidential information is passports, legal contracts etc.
Static files are served in a fairly insecure way. The /static/
bucket is public, and files are served through django's static files system. This works because
For media files however, we need user specific permissions, if user A uploads an image, then user A can view it, staff can view it, but user B & unauthenticated users cannot under any circumstances view it. This includes if they have the url.
My preferred system would be, that GCP storage could use the same django authentication server, and so when a browser requested ...google.storage..../media/user_1/verification/passport.png
, we could check what permissions this user had, compare it against the uploaded user ID, and decide whether to show a 403 or the actual file.
What is the industry standard / best practice solution for this issue?
Do I make both buckets only accessible to the application, using a service account, and ensure internally that the links are only shared if the correct user is viewing the page? (anyone for static, and {user or staff} for media?)
My questions, specifically (regarding web application security):
Upvotes: 6
Views: 808
Reputation: 514
Following on from Nahuel Varela's answer:
My system now consists of 4 buckets:
Both the static buckets are public, and the media buckets are only accessible to the app engine service account created within the project. (The settings are different for dev / test)
I'm using the django-storages[google]
with @elnygrens modification. I modified this to remove the url
method for Media (so that we create signed URLS) but keep it in for static (so that we access the public URL of the static files).
The authentication of each file access is done in Django, and if the user passes the test (is_staff or id matches file id), then they're given access to the file for a given amount of time (currently 1 hour), this access refreshes when the page loads etc.
Follow up question: What is the best practice for this time limit, I've heard people use anywhere from 15mins to 24 hours?
Upvotes: 1
Reputation: 1040
Of course, do not use the same Public bucket to store private/public stuff. Public with Public, Private with Private.
For example, if you authenticate your user using any Django methods, then the user will be authenticated to Django, but for Cloud Storage it will be an stranger. Also, even a user authorized on GCP may not be authorized to a bucket on Cloud Storage.
The important thing here is that the one that communicates back and forth with Cloud Storage is not the User, its Django. It could achieve this by using the python SDK of Cloud Storage, which takes the credentials of the service account that is being used on the instance to authenticate any request to Cloud Storage. So, the service account that is running the VM (because you are in Flexible) is the one that should be authorized to Cloud Storage.
Regarding your first question at the top of the post, Cloud Storage lets you create signed urls. This urls allow anyone on the internet to upload/download files from Cloud Storage by just holding the url. So you only need to authorize the user on Django to obtain the signed url and that's it. He does not need to be "authorized" on Cloud Storage(because the url already does it)
Taken from the docs linked before:
When should you use a signed URL?
In some scenarios, you might not want to require your users to have a Google account in order to access Cloud Storage, but you still want to control access using your application-specific logic. The typical way to address this use case is to provide a signed URL to a user, which gives the user read, write, or delete access to that resource for a limited time. Anyone who knows the URL can access the resource until the URL expires. You specify the expiration time in the query string to be signed.
Upvotes: 5