One of the most powerful aspects of AWS is its Identity and Access Management (IAM) service. The obvious source of its power is that it controls who can do what with every resource inside your AWS account. The less obvious source is how configurable it is. You can encode permissions so finely grained that a Lambda Function could, for example, be given just enough permission to read one attribute from one record belonging to the current user in a DynamoDB Table. The downside, however, is that IAM policies are very hard to implement correctly. To achieve the aforementioned DynamoDB example, the policy might look like:
{
  "Effect": "Allow",
  // Only allow reading single records, as opposed to querying for many records
  "Action": "dynamodb:GetItem",
  // Only allow access to the StatRecords DynamoDB Table in the us-west-2 region
  "Resource": "arn:aws:dynamodb:us-west-2:012345678901:table/StatRecords",
  "Condition": {
    "ForAllValues:StringEquals": {
      // Only allow reading the stat_value attribute of records
      "dynamodb:Attributes": [
        "stat_value"
      ],
      // Only allow reading records where the partition key is
      // the user id from 'Login with Amazon ID'
      "dynamodb:LeadingKeys": [
        "${www.amazon.com:user_id}"
      ]
    },
    "StringEquals": {
      // Only allow reading specific attributes instead of all attributes
      "dynamodb:Select": "SPECIFIC_ATTRIBUTES"
    }
  }
}
Whew! Oftentimes when you read code you aren't familiar with, you can follow along and figure out how it works.
But if you're new to IAM you likely have questions like:
"What is the Resource?"
"What does ForAllValues:StringEquals mean?"
"Where does ${www.amazon.com:user_id} come from?"
IAM truly is very complex!
In this guide we'll take a look at the basics of IAM policies, just enough to understand best practices, and then look at some of the tools available to help us validate that our permissions follow best practices to secure our resources.
Now that we've seen a complex policy example, let's look at a different example:
{
  "Effect": "Allow",
  "Action": "s3:*",
  "Resource": "*"
}
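Here we see the three common properties of an IAM policy:
Effect: whether the policy Allows or Denies access to resources
Action: which service operations the policy applies to
Resource: which resources those operations may be performed on
These are just the three most common properties of an IAM policy. If you want all the nitty-gritty details you can read the full IAM policy specification in the AWS documentation.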
In plain English, the above policy grants permission to perform any Amazon Simple Storage Service (S3) action on any resource in this AWS account. S3 stores files in buckets, and you can find examples where these permissions are granted so that a Lambda Function or EC2 instance can upload files to or download files from a bucket.
While this policy looks very simple, it opens up all kinds of potential issues.
This policy is a great example of an overly broad permission set that can lead to data manipulation and/or exfiltration, both highly concerning security issues.
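To make the breadth of `s3:*` concrete, here is a minimal sketch of wildcard matching on action names. The helper `iam_match` is a hypothetical name invented for this illustration, and it only approximates the matching step; real IAM evaluation also considers conditions, explicit denies, and other policies.

```python
import re


def iam_match(pattern: str, value: str) -> bool:
    """Approximate IAM wildcard matching for action names.

    '*' matches any run of characters and '?' matches exactly one.
    Case is ignored because IAM action names are case-insensitive.
    """
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, re.IGNORECASE | re.DOTALL) is not None


broad_action = "s3:*"
requested_actions = ["s3:GetObject", "s3:PutObject", "s3:DeleteBucket", "dynamodb:GetItem"]

for requested in requested_actions:
    # Every S3 action matches, including destructive ones like DeleteBucket.
    print(requested, iam_match(broad_action, requested))
```

A function that only needs to upload files would be served by `s3:PutObject` alone, which matches none of the other actions in this list.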
What can we do about this? We can train everyone who writes policies to follow all applicable best practices and scope permissions so they pose less risk, but this is very difficult to achieve across an entire organization of developers. Another approach, which is easier to apply at scale, is to automatically check whether IAM best practices are being followed.
One of the best practices for web application development is to provision resources, including IAM policies, using Infrastructure-as-Code (IaC). We won't dive into all the concepts behind IaC here, but it's important to know that IaC gives us a consistent mechanism to review IAM policies by analyzing those written inside IaC templates.
IaC templates are written in a declarative syntax that is easy for computers to analyze. This has given rise to tools that evaluate templates for best practices. At stack.new, we use Stelligent's cfn_nag. It evaluates templates against best practices, including looking for issues like overly broad permissions granted via * actions and resources.
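The kind of check such a tool performs can be sketched in a few lines. This is not cfn_nag's implementation; the template fragment, the resource name `ExampleFunction`, and the function names `iter_statements` and `audit` are all assumptions made for illustration, and the template is represented as an already-parsed Python dictionary.

```python
def iter_statements(node, path=""):
    """Yield (path, statement) for every policy statement found in the template."""
    if isinstance(node, dict):
        if "Effect" in node and "Action" in node:
            yield path, node
        for key, child in node.items():
            yield from iter_statements(child, f"{path}/{key}")
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from iter_statements(child, f"{path}/{index}")


def as_list(value):
    """IAM allows a single string or a list for Action and Resource."""
    return value if isinstance(value, list) else [value]


def audit(template):
    """Return (path, property, value) for each wildcard in an Allow statement."""
    findings = []
    for path, statement in iter_statements(template):
        if statement["Effect"] != "Allow":
            continue
        for action in as_list(statement["Action"]):
            if isinstance(action, str) and "*" in action:
                findings.append((path, "Action", action))
        for resource in as_list(statement.get("Resource", [])):
            if resource == "*":
                findings.append((path, "Resource", resource))
    return findings


# Hypothetical template fragment reusing the broad S3 policy from earlier.
template = {
    "Resources": {
        "ExampleFunction": {
            "Type": "AWS::Serverless::Function",
            "Properties": {
                "Policies": [
                    {"Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}]}
                ]
            },
        }
    }
}

for finding in audit(template):
    print(finding)
```

Because the template is plain data, the check needs no knowledge of what the application does, which is what makes this approach practical across many teams.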
Audit results for the aws-samples/happy-path backend template
Here we see some problematic audit results from an AWS example showing how to build an API that manages state park information. Let's take a look at the first two issues.
The lone FAILURE result is due to the unscoped action in the permission policy on line 182. This gives the ProcessDynamoDBStream function the ability to perform any AWS IoT action, such as creating an Over-the-Air (OTA) update to IoT devices using the iot:CreateOTAUpdate action. If we were malicious, we might be able to send an update to all connected IoT devices that causes them to malfunction, or to send their data somewhere other than the secure application it was intended to reach.
The first WARNING exacerbates the prior FAILURE. The WARNING concerns the use of * in the Resource property on line 183. If the Resource property only allowed actions on an IoT Topic resource, like arn:aws:iot:us-east-1:012345678901:topic/MyTopic, it would block us from calling the iot:CreateOTAUpdate action, because the resource for a CreateOTAUpdate call has a different ARN format and would not match. Because the Resource is *, it matches an AWS IoT OTA Update ARN, and the function is allowed to create an OTA update.
We now know there is an IAM policy that should be scoped better. It's time to dive into the code to figure out how to fix this!
We can take a look at the code for this function in streams/ddb/app.js. We see it creates an iotdata object to make requests to the IoT service on line 20.
We then see the iotdata.publish() action invoked on line 59.
This is the only AWS SDK action invoked in this Lambda Function. That means we can fix the FAILURE result by updating the Action in the policy on line 182 of our template to iot:Publish.
Looking at the Actions, resources, and condition keys for AWS IoT page (most AWS services have a documentation page like this), we can find the Publish action to see what kind of resource is allowed for scoping the permission. We see the Publish action allows an IoT Topic ARN to be specified.
The Topic ARN format, shown below, has four variables we need to supply.
arn:${Partition}:iot:${Region}:${Account}:topic/${TopicName}
The first three, Partition, Region, and Account, are specific to where we deploy our app and can be substituted in by AWS CloudFormation, which we'll see in a moment. The fourth, TopicName, is the name of the topic we are publishing to. If we go back to our Function code and trace the logic, we'll see the Topic we publish events to is based on the Place ID of state parks in the app's DynamoDB Table. This means the Topic names do not form a small, fixed set we can encode into the policy, so we should simply put a wildcard * in for the name.
We can now update the Resource property of our policy on line 183 of our template to be !Sub arn:${AWS::Partition}:iot:${AWS::Region}:${AWS::AccountId}:topic/*. The !Sub syntax asks CloudFormation to substitute in the pseudo-parameters for the AWS partition, region, and account we are deploying into.
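As a rough illustration of what the substitution produces, the sketch below replaces each ${Name} placeholder from a lookup table. The function `sub` and the example values are assumptions for this illustration; CloudFormation's real !Sub also handles resource references and escaped literals, which are ignored here.

```python
import re


def sub(template: str, values: dict) -> str:
    """Replace each ${Name} placeholder with values[Name]."""
    return re.sub(r"\$\{([^}]+)\}", lambda match: values[match.group(1)], template)


# Example values only; CloudFormation supplies the real ones at deploy time.
pseudo_parameters = {
    "AWS::Partition": "aws",
    "AWS::Region": "us-east-1",
    "AWS::AccountId": "012345678901",
}

resource = sub(
    "arn:${AWS::Partition}:iot:${AWS::Region}:${AWS::AccountId}:topic/*",
    pseudo_parameters,
)
print(resource)  # arn:aws:iot:us-east-1:012345678901:topic/*
```

The trailing * survives untouched, so the deployed policy covers every topic in that account and region but nothing outside the topic/ namespace.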
With this update we are now doubly protected against malicious actions being invoked by our Function: actions other than iot:Publish are no longer allowed, and even if they were, they would target a resource ARN that does not match the policy's Resource specification.
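The two layers of protection can be checked with a small sketch. The helpers `iam_match` and `allows`, the topic name `place-123`, the update ID `MyUpdate`, and the assumed OTA update ARN shape are all made up for illustration. Real IAM evaluation involves more than one statement, plus conditions and explicit denies.

```python
import re


def iam_match(pattern: str, value: str, ignore_case: bool = False) -> bool:
    """Approximate IAM wildcard matching: '*' is any run, '?' is one character."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    flags = re.DOTALL | (re.IGNORECASE if ignore_case else 0)
    return re.fullmatch(regex, value, flags) is not None


def allows(statement: dict, action: str, resource: str) -> bool:
    """True only if the statement is an Allow and both action and resource match."""
    return (
        statement["Effect"] == "Allow"
        and iam_match(statement["Action"], action, ignore_case=True)
        and iam_match(statement["Resource"], resource)
    )


topic_pattern = "arn:aws:iot:us-east-1:012345678901:topic/*"
before = {"Effect": "Allow", "Action": "iot:*", "Resource": "*"}
after = {"Effect": "Allow", "Action": "iot:Publish", "Resource": topic_pattern}
# Hypothetical half-fix: resource scoped, action still a wildcard.
resource_only = {"Effect": "Allow", "Action": "iot:*", "Resource": topic_pattern}

topic_arn = "arn:aws:iot:us-east-1:012345678901:topic/place-123"
ota_arn = "arn:aws:iot:us-east-1:012345678901:otaupdate/MyUpdate"  # assumed ARN shape

print(allows(before, "iot:CreateOTAUpdate", ota_arn))         # True
print(allows(after, "iot:CreateOTAUpdate", ota_arn))          # False
print(allows(resource_only, "iot:CreateOTAUpdate", ota_arn))  # False
print(allows(after, "iot:Publish", topic_arn))                # True
```

The third case shows the second layer at work: even with a wildcard action, the scoped Resource alone rejects the OTA update request.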
We hope that through stack.new folks can learn about the architecture of applications that, to this point, have been shared as hard-to-follow CloudFormation templates, and further learn best practices from the output of auditing tools like cfn_nag. We'd love to hear what you think or if you learned anything! Give us a shout @stackeryio with your thoughts!