S3 Background
Amazon Web Services (AWS) has a service called S3, which until recently was the storage backing for Dropbox. It is incredibly fast and affordable, and combined with another AWS service called CloudFront, stored content can be distributed and served closer to the users requesting it. The AWS-SDK is a wonderful library that makes interacting with S3 almost trivial. Alternatively, requests can be made directly against the S3 REST API. Now that we have some background, let’s talk about a few different ways we can safely let our users pipe their data up to our buckets. The goal is to save you precious bandwidth and to skip the middleman (your server!). My weapon of choice for examples will be JavaScript.
The three ways I will discuss today are:
- AWS-SDK getSignedUrl()
- Browser-based uploads using POST (Signature Version 4)
- AWS Cognito
The Grit
Some background on permissions for the above methods
Many of these methods integrate tightly with IAM (Identity and Access Management). Access to the API is governed by policies: chunks of JSON that contain a set of rules and permissions. Policies can be attached directly to groups or users, and sometimes to the resources themselves, such as when you want to let the public request objects from your bucket.
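As a concrete example, a minimal policy allowing its bearer to upload objects to a single bucket might look like the following (the bucket name is a placeholder):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::my-upload-bucket/*"
    }
  ]
}
```

Attach a policy like this to the user or role whose credentials will do the signing in the examples below.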
AWS-SDK getSignedUrl()
The easiest possible way to make this happen is to use the AWS-SDK to create a signed URL which lets the user perform some action against the bucket. These URLs expire after a lifetime you choose (and can be reused until then, so keep expirations short). Initialize your SDK with credentials that are attached to a policy which can run putObject.
Server Side
Client Side
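On the client, the browser simply PUTs the file straight to the signed URL. A sketch, assuming `signedUrl` came back from a (hypothetical) signing endpoint on your server:

```javascript
// Upload a File/Blob (or any body) directly to S3 via the pre-signed URL,
// bypassing your own server for the actual bytes.
async function uploadFile(file, signedUrl) {
  const res = await fetch(signedUrl, {
    method: 'PUT',
    body: file,
  });
  if (!res.ok) throw new Error(`Upload failed: ${res.status}`);
  return res;
}
```

Typically you would wire this to a file input's change event and pass `input.files[0]` as `file`.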
Pros
- Incredibly easy to create a URL and return it to your user.
Cons
- No control over things like max file size upload allowed.
- A server is required. You must have an SDK initialized with elevated credentials to vet the request and sign off on it.
Browser-Based Uploads Using POST (Signature Version 4)
This one is a little trickier, but it allows an entire policy to be attached to the upload. This tutorial by Amazon is a good summary.
First you create a policy which is a JSON object that has things like the expiration, the file size allowed, the content type, etc.
Then you base64-encode it and sign it with a key derived from your AWS Secret Access Key.
Lastly you create a form and include the signed policy among other fields as hidden inputs.
IMPORTANT: Make sure the Access Key ID signing the policy has access to set ACLs (permissions) on the bucket objects.
Pros
- Incredibly fine grained control over uploads via policies.
Cons
- Signing is a little complex.
- Revoking access to upload for a specific user becomes more difficult.
AWS Cognito
Honestly, this one deserves its own blog post entirely, and I think it will eventually get one.
AWS Cognito is a recently released service that has two main components. User Pools and Identity Pools.
User Pools are just that: collections of users. How do they log in? The simplest way possible is to create a user and set its password in the AWS console. Then in your code you can use the AWS-SDK to call a method named adminInitiateAuth(), passing the username and password. If all goes well you get a handful of items back which you can use for all sorts of things, like a JWT session token which you can validate when routing.
Identity Pools
Here is where things get interesting. Identity Pools give you real, live AWS credentials to hand to your authenticated and unauthenticated users. How do Identity Pools authenticate a user? It is important to note that Identity Pools don’t have to have anything to do with User Pools (they can, however, and I will tie them in shortly). They have a list of built-in identity “providers” like Facebook or Google. When your user logs in with Facebook, you can then kick off a process where Cognito Federated Identities will verify that the token from the provider is legit, and then provide you with credentials.
Pros
- Tight control over access.
- Can be completely serverless.
Cons
- Can be tricky to tie into your own back end (hopefully I will turn this into a blog post).