Import from AWS S3 Bucket
Import samples from an S3 bucket directly into the Universal Data Tool!
Overview
If you have an S3 Bucket full of data to label, we'll need to do a few things to make that data accessible to the UDT.
Configure the bucket so the files can be loaded
Create a user (or use an existing user) that can access the bucket
Import the data from S3 into the UDT
It should only take a couple of minutes. Let's do it!
1. Configure Bucket
First we need to make sure our files can be loaded from the web. To do this, we need to add a CORS policy in our Bucket Permissions. We can do this in the browser from the AWS Buckets page.
Paste in a CORS configuration that lets web pages load your files.
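The exact rules to paste are not reproduced on this page, so here is a minimal sketch of what a read-only CORS configuration could look like. It assumes the JSON format accepted by the S3 console's CORS editor. The helper name `build_cors_rules` and the allowed origin are illustrative assumptions, not values taken from the UDT itself.

```python
import json


def build_cors_rules(allowed_origins):
    """Build a read-only CORS rule list in the JSON shape used by S3.

    `allowed_origins` is a list of origins allowed to load files,
    e.g. ["https://udt.dev"], or ["*"] to allow any site (assumption:
    pick whichever matches where you run the UDT).
    """
    return [
        {
            "AllowedHeaders": ["*"],
            # Only read access is needed to display samples.
            "AllowedMethods": ["GET", "HEAD"],
            "AllowedOrigins": list(allowed_origins),
            "ExposeHeaders": [],
            # How long (in seconds) browsers may cache the preflight result.
            "MaxAgeSeconds": 3000,
        }
    ]


if __name__ == "__main__":
    # Print JSON that can be copied into the bucket's CORS editor.
    print(json.dumps(build_cors_rules(["https://udt.dev"]), indent=2))
```

Restricting `AllowedOrigins` to the site you actually label from is tighter than `"*"`, at the cost of having to update it if you run the UDT somewhere else.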
After you save that, you can pick the directories you'd like to make accessible by clicking on the directory, then clicking Actions > Make Public.
If your entire bucket is public, you can skip this step.
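To check that a file really is reachable, you can open its URL in a browser. The sketch below builds the virtual-hosted-style URL for an object. The bucket name, region, and key are hypothetical placeholders, and `build_public_url` is just an illustrative helper.

```python
from urllib.parse import quote


def build_public_url(bucket, region, key):
    """Return the virtual-hosted-style HTTPS URL for an S3 object.

    The key is percent-encoded, but "/" is kept so that directory-like
    prefixes stay readable.
    """
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


if __name__ == "__main__":
    # Hypothetical bucket, region, and key, used for illustration only.
    print(build_public_url("my-labeling-bucket", "us-east-1", "images/cat 1.jpg"))
    # prints https://my-labeling-bucket.s3.us-east-1.amazonaws.com/images/cat%201.jpg
```

If the URL returns an "Access Denied" error, the object (or its directory) has most likely not been made public yet.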
2. Add IAM Credentials
Next, we need keys that allow the UDT to browse the files in the S3 bucket. We can get these by copying our AWS Access Key ID and Secret Access Key.
It's a good idea to limit the permissions of the user you're getting the access keys from. That way, the key can only be used for its intended purpose!
Navigate to the IAM service and select (or create) a user. The user must have permissions to access S3 buckets. Then click Create Access Key to create your keys!
One simple but dangerous way to grant these permissions is to attach the AmazonS3FullAccess managed policy to the user. Fine-grained permissions are more secure!
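As a rough idea of what fine-grained permissions could look like, the sketch below builds a read-only IAM policy document limited to a single bucket. The bucket name and the helper `build_read_only_policy` are hypothetical, and the set of actions the UDT actually needs is an assumption: listing buckets, listing a bucket's contents, and reading objects. Uploading from the UDT would likely also need `s3:PutObject`.

```python
import json


def build_read_only_policy(bucket_name):
    """Build an IAM policy document granting read-only access to one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Lets the user see bucket names when browsing.
                "Sid": "ListAllBuckets",
                "Effect": "Allow",
                "Action": ["s3:ListAllMyBuckets"],
                "Resource": "*",
            },
            {
                # Lets the user list the files inside this one bucket.
                "Sid": "ListBucketContents",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arn,
            },
            {
                # Lets the user download objects from this one bucket.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "my-labeling-bucket" is a placeholder; substitute your own bucket name.
    print(json.dumps(build_read_only_policy("my-labeling-bucket"), indent=2))
```

The printed JSON can be pasted into the IAM console as an inline policy for the user whose access keys you create.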
3. Add Keys to the Universal Data Tool
Navigate to udt.dev or open the UDT. Click "Add Authentication" and paste your keys.
4. Browse Buckets and Import
You can now create a new UDT Dataset and navigate to Samples > Import from S3 (which will be enabled). You'll be able to select from all the buckets accessible to this user.
Bonus: Import via Uploading to S3
You can also upload to S3 directly from the UDT. After doing so, your files will automatically be added to your UDT dataset.