Building a VPC with AWS CloudFormation

One of the advantages of Amazon Web Services is the ability to quickly create complex infrastructures for development and testing, and then, when you’re done, to tear the infrastructure down.  The simplest way to reproducibly provision infrastructure is through the use of CloudFormation templates.  These templates allow you to describe your infrastructure in JSON or YAML, and AWS will then provision it for you.

Amazon provides detailed documentation for CloudFormation.  Unfortunately the documentation is very strong on the reference side, but not so good at worked examples.  There are also lots of examples on blogs around the Internet, but these tend to be of the variety that state “Here’s our template to do X”, again with little or no explanation of how the templates work.

I’ve been doing a lot of work with distributed systems recently and I wanted to be able to create a Virtual Private Cloud containing a number of servers where I could install the software I was experimenting with.  An AWS VPC was the perfect solution but has a larger number of moving parts than you might think, especially if you want to limit the exposure your instances have to the Internet.  This blog post describes the CloudFormation template I created to spin up a VPC in a single availability zone.  Mainly for my own benefit, the rest of the post explains how the template works; hopefully it will be of use to other people as well.

VPC Template

I’ve chosen to write my template in YAML.  This is a relatively new feature of CloudFormation; previously templates had to be written in JSON.  YAML has a number of advantages, including the ability to have inline comments and (I think) a cleaner syntax.  I’m also seeing YAML being used in many other projects so it seemed to be a useful thing to learn.

All CloudFormation templates have the following structure:

---
AWSTemplateFormatVersion: "2010-09-09"
Description: My VPC Template

Resources:
    ...

The three hyphens on the first line are part of the YAML specification, indicating the start of a document.  It used to be that the AWSTemplateFormatVersion section was mandatory, but the latest documentation says that it’s now optional.  I tend to include it anyway.  The Description section is also optional, but I would say that it’s best practice to include it.

Before we get to the Resources section I should point out that you can do many complex things to make your templates very flexible.  A lot of this flexibility is driven by the Parameters and Mappings sections.  To keep this template as simple as possible I’ve chosen not to use these two features; I may do a follow-up post showing how to use them for deployment-time customisation.  One other section I’ve not included, but should mention, is the Outputs section.  This lets you declare values that AWS reports once it has deployed a stack from your template.  You can use it to get values relating to the created resources, such as public IP addresses.
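To give a flavour of what those sections look like, here is a minimal sketch of a template that takes the VPC address range as a parameter and reports the ID of the VPC as an output.  It is not part of my template: the parameter name VpcCidr and the output name VpcId are made up for illustration.

```yaml
---
AWSTemplateFormatVersion: "2010-09-09"
Description: Sketch of Parameters and Outputs (illustration only)

Parameters:
    VpcCidr:                        # hypothetical parameter name
        Type: String
        Default: 10.1.0.0/16
        Description: Address range for the VPC in CIDR notation

Resources:
    VPC:
        Type: AWS::EC2::VPC
        Properties:
            CidrBlock:
                Ref: VpcCidr        # value supplied at deployment time

Outputs:
    VpcId:                          # hypothetical output name
        Description: ID of the VPC that was created
        Value:
            Ref: VPC
```

The Default means the stack can still be deployed without supplying any values, which keeps the behaviour the same as hard coding the range.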

The Resources section is where you specify the items that you want AWS to create when deploying a stack based on the template.  I’ll now go through each of these in turn, explaining the parameters I’ve chosen.  The full template is available from GitHub.

VPC

VPC:
    Type: AWS::EC2::VPC
    Properties:
        CidrBlock: 10.1.0.0/16
        EnableDnsSupport: true
        EnableDnsHostnames: true
        InstanceTenancy: default
        Tags:
          - Key: Name
            Value: Cloudformation Test VPC

All resources have the same basic structure: the logical ID of the resource, a Type, and then a Properties section.  In this example the logical ID of the resource is VPC and the type is AWS::EC2::VPC.  The properties I’m setting are:

  • CidrBlock: The address range for the VPC.  You need to make sure your VPC has enough IP address space to carve out all the subnets you need.  I’m using an RFC1918 range.  AWS will accept other ranges too, but the block has to be between a /16 and a /28 in size.
  • EnableDnsSupport: If set to true the AWS DNS server resolves hostnames for instances in the VPC.
  • EnableDnsHostnames: If set to true, instances with a public IP address are also allocated public DNS hostnames.  You need to have EnableDnsSupport set to true as well for this to work.
  • InstanceTenancy: You can have your instances run on dedicated hardware assigned to only you if this is set to dedicated.  Understandably setting this to dedicated costs more!

You can optionally set tags on resources to make it easier to manage your AWS account.  Setting the Name tag makes resources identifiable in the AWS web interface.

Internet Gateway

An Internet Gateway is an Amazon managed device that allows resources in your VPC to connect to the Internet.  As I want to connect to my VPC over the Internet and I want instances in my VPC to be able to download from the Internet I need to create an Internet Gateway.

InternetGateway:
    Type: AWS::EC2::InternetGateway
    Properties:
        Tags:
          - Key: Name
            Value: Internet Gateway

You only create a single Internet Gateway per VPC, even if your VPC spans multiple availability zones.  Amazon take care of making the Internet Gateway highly available.  As you can see the Internet Gateway doesn’t need any extra properties.

Creating an Internet Gateway is a two stage operation.  First, as above, you declare the gateway, then you need to attach it to your VPC:

AttachGateway:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
        VpcId:
            Ref: VPC
        InternetGatewayId:
            Ref: InternetGateway

The AttachGateway resource has two properties, both of which are references to other resources in your CloudFormation template.  Here we are referencing the VPC and the InternetGateway that we have already declared in the template.
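As an aside, YAML templates also accept a short form of Ref, written as a YAML tag.  The sketch below should be equivalent to the AttachGateway declaration above; I use the long form throughout this post, so treat this purely as an alternative spelling.

```yaml
AttachGateway:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
        VpcId: !Ref VPC                            # same as "Ref: VPC" on its own line
        InternetGatewayId: !Ref InternetGateway    # same as "Ref: InternetGateway"
```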

Note that these types of attachment resources (we’ll see more shortly) don’t have a Tags property.  Adding a Tags section will cause your template to fail.

Bastion Host Subnet

My VPC design has three subnets: one for the bastion host that will allow SSH access to my VPC from the Internet, one for the NAT Gateway that will allow my instances access to the Internet, and one for my worker instances that shouldn’t be reachable directly from the Internet.

We’ll start by creating the subnet for the bastion host:

BastionHostSubnet:
    Type: AWS::EC2::Subnet
    Properties:
        VpcId:
            Ref: VPC
        CidrBlock: 10.1.1.0/28
        MapPublicIpOnLaunch: true
        Tags:
          - Key: Name
            Value: Bastion Host Subnet

As well as the Type, we need to define the following properties for the subnet:

  • VpcId: This is a reference to the VPC which will contain the subnet.
  • CidrBlock: The IP address range for the subnet in CIDR notation.  Note that AWS reserves 5 addresses in every subnet, so the /28 I’ve chosen (16 addresses) leaves me with 11 usable addresses.
  • MapPublicIpOnLaunch: With this set to true, instances launched into the subnet will be allocated a public IP address by default.  This means that any instances in this subnet will be reachable from the Internet, subject to Security Groups and Network ACLs.
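As a worked example of the address arithmetic: a /28 leaves 32 - 28 = 4 host bits, which gives 2^4 = 16 addresses, and AWS reserves the first four and the last one.  The comments in the sketch below show how that plays out for this subnet; the resource is the same subnet as above (tags omitted), only annotated.

```yaml
BastionHostSubnet:
    Type: AWS::EC2::Subnet
    Properties:
        VpcId:
            Ref: VPC
        # 10.1.1.0/28 covers 10.1.1.0 to 10.1.1.15 (16 addresses):
        #   10.1.1.0          network address (reserved)
        #   10.1.1.1          VPC router (reserved by AWS)
        #   10.1.1.2          reserved by AWS (DNS)
        #   10.1.1.3          reserved by AWS for future use
        #   10.1.1.4 - .14    usable by instances (11 addresses)
        #   10.1.1.15         broadcast address (reserved)
        CidrBlock: 10.1.1.0/28
        MapPublicIpOnLaunch: true
```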

Once the subnet has been declared we need to configure routing.  By default a VPC is created with a main route table which allows instances to send traffic to each other even if they are in different subnets.  However, we want instances on this subnet to be able to communicate across the Internet so we need to create a subnet specific route table that will route Internet traffic via the Internet Gateway we declared previously.

It’s a three step process to declare and configure the subnet route table.  Step one is declaration of the route table:

BastionHostSubnetRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
        VpcId:
            Ref: VPC
        Tags:
          - Key: Name
            Value: Bastion Host Subnet Route Table

The route table is a very simple object: all it contains is a Type, a reference to the VPC and a tag giving it a name.

Step two is to declare the route entry that will send Internet bound traffic to our Internet Gateway:

BastionHostInternetRoute:
    Type: AWS::EC2::Route
    DependsOn: AttachGateway
    Properties:
        DestinationCidrBlock: 0.0.0.0/0
        GatewayId:
            Ref: InternetGateway
        RouteTableId:
            Ref: BastionHostSubnetRouteTable

This section introduces a new item: DependsOn, which instructs AWS not to create this resource until another resource has been created.  The Ref to InternetGateway already makes the route wait for the gateway itself, but a route that points at an Internet Gateway can only be created once the gateway has been attached to the VPC, so the route depends on the AttachGateway resource.  This is how we ensure that resources are created in the correct order.  The DestinationCidrBlock describes which traffic we want this route to apply to.  A value of 0.0.0.0/0 means all traffic.  It’s important to note that routes operate on a most specific match first basis and 0.0.0.0/0 is the least specific of all routes.  This means that the local route for 10.1.0.0/16, which every route table in the VPC contains, will match first, ensuring that traffic does not leak out of the VPC.
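To make the most specific match rule concrete, here is a sketch of the routes the bastion host subnet ends up with.  This is not CloudFormation, just an illustration written as YAML; the key names and the example destination addresses are made up.

```yaml
# Illustration only, not part of the template.
EffectiveRoutes:
  - Destination: 10.1.0.0/16     # local route, added by AWS to every route table in the VPC
    Target: local
  - Destination: 0.0.0.0/0       # the Internet route declared above
    Target: InternetGateway

# Example lookups (hypothetical destination addresses):
#   10.1.2.15     matches both routes; 10.1.0.0/16 is more specific, so traffic stays in the VPC
#   198.51.100.7  matches only 0.0.0.0/0, so traffic goes to the Internet Gateway
```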

The GatewayId reference specifies where traffic matching this route should be sent; in this case it’s the Internet Gateway we previously declared.  The RouteTableId reference connects this route to the route table.

Finally, step three is to associate the route table with the subnet:

BastionHostSubnetRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
        RouteTableId:
            Ref: BastionHostSubnetRouteTable
        SubnetId:
            Ref: BastionHostSubnet

The route table association connects the route table (and route) we’ve just declared with the subnet we declared earlier.  With this in place any instances created within the bastion host subnet should be able to route traffic to and from the Internet (security groups and network ACLs permitting).

Bastion Host Security Group

Before we deploy a bastion host we need to declare a security group.  A security group acts as a firewall around the instance, and inbound traffic is blocked unless a rule allows it, so the security group needs to describe what traffic to let in and out of the instance.

BastionHostSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
        GroupDescription: Allow SSH to Bastion Host
        VpcId:
            Ref: VPC
        SecurityGroupIngress:
          - IpProtocol: tcp
            FromPort: '22'
            ToPort: '22'
            CidrIp: 0.0.0.0/0
        SecurityGroupEgress:
          - IpProtocol: -1
            CidrIp: 0.0.0.0/0
        Tags:
          - Key: Name
            Value: Bastion Host Security Group

As you can see a security group has a number of properties you can configure:

  • GroupDescription: A free text field that you can use to describe what the security group allows.
  • VpcId: The VPC where we will be using the security group.
  • SecurityGroupIngress: This property describes the traffic we should allow through the security group to the instance.
    • IpProtocol: This should be pretty obvious, we’re interested in TCP traffic.
    • FromPort and ToPort: Combined these describe a range of ports to which traffic should be allowed.
    • CidrIp: The range of IP addresses from which we should allow traffic that matches the IpProtocol and FromPort/ToPort range.
  • SecurityGroupEgress: This property describes traffic we should allow from the instance into the VPC and beyond.
    • IpProtocol: Same as before, but this time we are using the value -1, which means all protocols on all ports.
    • CidrIp: The range of IP addresses to which we should allow traffic that matches the IpProtocol setting (all traffic in this case).
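Allowing SSH from 0.0.0.0/0 is convenient for a throwaway environment, but the CidrIp property can just as easily narrow access to a single address.  The sketch below is a variation on the security group above (tags omitted); the address 203.0.113.10 is a placeholder from a documentation range, not a real host, so substitute your own.

```yaml
BastionHostSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
        GroupDescription: Allow SSH to Bastion Host from one address
        VpcId:
            Ref: VPC
        SecurityGroupIngress:
          - IpProtocol: tcp
            FromPort: '22'
            ToPort: '22'
            CidrIp: 203.0.113.10/32    # hypothetical admin address; /32 means exactly one host
        SecurityGroupEgress:
          - IpProtocol: -1
            CidrIp: 0.0.0.0/0
```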

Bastion Host Launch Configuration and Autoscaling

At this stage we could just create our bastion host instance and start using it.  However, it’s better to expend the extra effort and create a launch configuration and autoscaling group.  By launching our bastion host instance from within an autoscaling group we benefit from the feature that AWS will automatically launch a replacement should our instance die for any reason.

BastionHostLaunchConfig:
    Type: "AWS::AutoScaling::LaunchConfiguration"
    Properties:
        AssociatePublicIpAddress: true
        ImageId: ami-9398d3e0 # Amazon Linux in eu-west-1
        InstanceMonitoring: false
        InstanceType: t2.micro
        KeyName: TestStack
        PlacementTenancy: default
        SecurityGroups:
          - Ref: BastionHostSecurityGroup

Launch configurations have a large number of properties that you can configure; we’re only using a small subset here:

  • AssociatePublicIpAddress: Make sure that the instance has a public IP address when it launches.  We don’t strictly need this as we configured the subnet to have this feature.
  • ImageId: The AMI to use to create the instance.  Note that I am using the Amazon Linux AMI in the eu-west-1 region.  If you’re in a different region, want to use a different operating system, or Amazon have released an updated version of their Linux, then you’ll need to change this value.
  • InstanceMonitoring: Setting to true enables detailed monitoring for your instance.  This costs extra so I don’t use it in throwaway environments.
  • InstanceType: The type (size) of instance you want.  I’m just using the small and cheap t2.micro.
  • KeyName: The SSH key pair to use to access this instance.  You need to have previously created the key pair, and your key pair will probably be called something different.
  • PlacementTenancy: Same as for the VPC, we don’t want dedicated hardware for our instance.
  • SecurityGroups: A reference to the security group we declared previously.
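Since the AMI and the key pair are the two values most likely to differ between accounts and regions, they are natural candidates for the Parameters section I mentioned earlier.  The following is only a sketch of how that might look; the parameter names BastionImageId and BastionKeyName are made up, and the defaults are simply the values used above.

```yaml
Parameters:
    BastionImageId:                       # hypothetical parameter name
        Type: AWS::EC2::Image::Id         # AWS checks that the value is a valid AMI ID
        Default: ami-9398d3e0             # Amazon Linux in eu-west-1
    BastionKeyName:                       # hypothetical parameter name
        Type: AWS::EC2::KeyPair::KeyName  # AWS checks that the key pair exists
        Default: TestStack

Resources:
    BastionHostLaunchConfig:
        Type: AWS::AutoScaling::LaunchConfiguration
        Properties:
            AssociatePublicIpAddress: true
            ImageId:
                Ref: BastionImageId
            InstanceMonitoring: false
            InstanceType: t2.micro
            KeyName:
                Ref: BastionKeyName
            PlacementTenancy: default
            SecurityGroups:
              - Ref: BastionHostSecurityGroup
```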

After we’ve declared our launch configuration we need to create the autoscaling group:

BastionHostScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
        LaunchConfigurationName:
            Ref: BastionHostLaunchConfig
        MinSize: '1'
        MaxSize: '1'
        VPCZoneIdentifier:
          - Ref: BastionHostSubnet
        Tags:
          - Key: Name
            Value: Bastion Host
            PropagateAtLaunch: true

The autoscaling group references the launch configuration we declared using the LaunchConfigurationName reference.  The other properties are:

  • MinSize and MaxSize: By setting these both to one the autoscaling group will create a single instance and replace it if it fails.
  • VPCZoneIdentifier: A list of references to subnets into which the instances will be launched.  If we had multiple subnets spread across multiple availability zones we could reference them here to create a highly available system.
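For illustration, this is roughly what the multiple subnet case could look like.  It is a sketch, not part of my template: BastionHostSubnetB and its address range are made up, and the second subnet would also need its own route table association before instances in it could reach the Internet.

```yaml
BastionHostSubnetB:                       # hypothetical second subnet
    Type: AWS::EC2::Subnet
    Properties:
        VpcId:
            Ref: VPC
        CidrBlock: 10.1.1.32/28           # assumed to be an unused range in the VPC
        AvailabilityZone:
            Fn::Select:
              - 1                         # second entry in the region's list of zones
              - Fn::GetAZs: ""            # empty string means the stack's own region
        MapPublicIpOnLaunch: true

BastionHostScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
        LaunchConfigurationName:
            Ref: BastionHostLaunchConfig
        MinSize: '1'
        MaxSize: '1'
        VPCZoneIdentifier:
          - Ref: BastionHostSubnet
          - Ref: BastionHostSubnetB       # the group may now launch into either subnet
```

Note that the original BastionHostSubnet doesn’t pin an availability zone, so to guarantee the two subnets land in different zones you would want to set AvailabilityZone on both.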

NAT Gateway, Subnet, and Elastic IP

The bastion host is configured to allow SSH traffic into our VPC from the Internet.  However, we also want our worker instances to be able to access the Internet, mainly so that they can download software updates.  To do this we need to create a NAT gateway, a public subnet to host it, and an elastic IP address.

NatGatewaySubnet:
    Type: AWS::EC2::Subnet
    Properties:
        VpcId:
            Ref: VPC
        CidrBlock: 10.1.1.16/28
        MapPublicIpOnLaunch: true
        Tags:
          - Key: Name
            Value: NAT Gateway Host Subnet

NatGatewaySubnetRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
        VpcId:
            Ref: VPC
        Tags:
          - Key: Name
            Value: NAT Gateway Subnet Route Table

NatGatewayInternetRoute:
    Type: AWS::EC2::Route
    DependsOn: AttachGateway # wait until the gateway is attached to the VPC
    Properties:
        DestinationCidrBlock: 0.0.0.0/0
        GatewayId:
            Ref: InternetGateway
        RouteTableId:
            Ref: NatGatewaySubnetRouteTable

NatGatewaySubnetRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
        RouteTableId:
            Ref: NatGatewaySubnetRouteTable
        SubnetId:
            Ref: NatGatewaySubnet

We declare the subnet for the NAT gateway in exactly the same way we did for the bastion host subnet.  The only differences are in the logical IDs for the resources we declare and the subnet IP address range we are using.

With the subnet declared we can declare the elastic IP address that we’ll assign to the NAT gateway:

NatGatewayEIP:
    Type: AWS::EC2::EIP
    Properties:
        Domain: vpc

The Domain property needs to be set to vpc as we are working within a VPC.

Finally we can create the NAT gateway:

NatGateway:
    Type: AWS::EC2::NatGateway
    DependsOn: AttachGateway
    Properties:
        AllocationId:
            Fn::GetAtt:
              - NatGatewayEIP
              - AllocationId
        SubnetId:
            Ref: NatGatewaySubnet

The AllocationId property is interesting: Fn::GetAtt is an intrinsic function.  Basically what it does is get the AllocationId attribute of the NatGatewayEIP resource that we declared previously.  The SubnetId property is a reference to the subnet where the NAT gateway should be deployed.
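Like Ref, Fn::GetAtt has a short form in YAML templates, where the resource and attribute are joined with a dot.  The sketch below should be equivalent to the NatGateway declaration above; it is shown only as an alternative spelling.

```yaml
NatGateway:
    Type: AWS::EC2::NatGateway
    DependsOn: AttachGateway
    Properties:
        AllocationId: !GetAtt NatGatewayEIP.AllocationId   # resource.attribute
        SubnetId: !Ref NatGatewaySubnet
```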

Private Subnet

The declarations for the private subnet are much the same as the previous subnets.

PrivateSubnetA:
    Type: AWS::EC2::Subnet
    Properties:
        VpcId:
            Ref: VPC
        CidrBlock: 10.1.2.0/24
        MapPublicIpOnLaunch: false
        Tags:
          - Key: Name
            Value: Private Subnet A

Note that MapPublicIpOnLaunch is set to false so that instances in this subnet don’t get a public IP address.

PrivateSubnetARouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
        VpcId:
            Ref: VPC
        Tags:
          - Key: Name
            Value: Private Subnet A Route Table

PrivateSubnetANatInternetRoute:
    Type: AWS::EC2::Route
    DependsOn: NatGateway
    Properties:
        DestinationCidrBlock: 0.0.0.0/0
        NatGatewayId:
            Ref: NatGateway
        RouteTableId:
            Ref: PrivateSubnetARouteTable

The Internet route for the private subnet is a little different.  We declare the NatGatewayId property as a reference to the NAT gateway we declared previously.

PrivateSubnetARouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
        RouteTableId:
            Ref: PrivateSubnetARouteTable
        SubnetId:
            Ref: PrivateSubnetA

Private Subnet Security Group

The private subnet security group is pretty much the same as the bastion host security group:

PrivateSubnetASecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
        GroupDescription: Allow SSH from Bastion Host
        VpcId:
            Ref: VPC
        SecurityGroupIngress:
          - IpProtocol: tcp
            FromPort: '22'
            ToPort: '22'
            SourceSecurityGroupId:
                Fn::GetAtt:
                  - BastionHostSecurityGroup
                  - GroupId
        SecurityGroupEgress:
          - IpProtocol: -1
            CidrIp: 0.0.0.0/0
        Tags:
          - Key: Name
            Value: Private Subnet A Security Group

The difference is that in the SecurityGroupIngress property we don’t use the CidrIp property.  Instead we use SourceSecurityGroupId to link this security group to the bastion host security group, allowing SSH traffic only from instances in that group.

Private Subnet Launch Configuration and Autoscaling Group

The final part of the puzzle is the launch configuration and autoscaling group for launching worker instances into the private subnet.

PrivateSubnetALaunchConfig:
    Type: "AWS::AutoScaling::LaunchConfiguration"
    Properties:
        AssociatePublicIpAddress: false
        ImageId: ami-9398d3e0 # Amazon Linux in eu-west-1
        InstanceMonitoring: false
        InstanceType: t2.micro
        KeyName: TestStack
        PlacementTenancy: default
        SecurityGroups:
          - Ref: PrivateSubnetASecurityGroup

PrivateSubnetAScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
        LaunchConfigurationName:
            Ref: PrivateSubnetALaunchConfig
        MinSize: '1'
        MaxSize: '1'
        VPCZoneIdentifier:
          - Ref: PrivateSubnetA
        Tags:
          - Key: Name
            Value: Worker Host
            PropagateAtLaunch: true

These are pretty much identical to the bastion host launch configuration and autoscaling group.  The changes are that the AssociatePublicIpAddress is set to false so that instances don’t get public IP addresses (they use the NAT gateway to access the Internet) and the logical IDs for various resources point to those in the private subnet.

Network Access Control Lists

If you’re familiar with AWS VPCs you’ll have noticed that I am not declaring any network access control lists.  Because no ACLs are declared, our subnets will be associated with the VPC’s default ACL, which allows all inbound and outbound traffic.  In a production deployment you might want to use ACLs to limit traffic flows between subnets.  I’ll leave that as an exercise for the reader.
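As a starting point for that exercise, here is an untested sketch of an ACL for the private subnet.  All of the logical IDs are made up, and the rules are only one possible policy: SSH in from the bastion host subnet, return traffic in on the high ports, and everything out.  Unlike security groups, network ACLs are stateless, which is why the return traffic needs its own inbound rule.

```yaml
PrivateSubnetANetworkAcl:                 # hypothetical logical IDs throughout
    Type: AWS::EC2::NetworkAcl
    Properties:
        VpcId:
            Ref: VPC

PrivateSubnetAAclSshIn:                   # inbound SSH from the bastion host subnet only
    Type: AWS::EC2::NetworkAclEntry
    Properties:
        NetworkAclId:
            Ref: PrivateSubnetANetworkAcl
        RuleNumber: 100
        Protocol: 6                       # 6 = TCP
        RuleAction: allow
        Egress: false
        CidrBlock: 10.1.1.0/28
        PortRange:
            From: 22
            To: 22

PrivateSubnetAAclReturnIn:                # replies to connections the workers open via NAT
    Type: AWS::EC2::NetworkAclEntry
    Properties:
        NetworkAclId:
            Ref: PrivateSubnetANetworkAcl
        RuleNumber: 110
        Protocol: 6
        RuleAction: allow
        Egress: false
        CidrBlock: 0.0.0.0/0
        PortRange:
            From: 1024
            To: 65535

PrivateSubnetAAclAllOut:                  # outbound: allow everything
    Type: AWS::EC2::NetworkAclEntry
    Properties:
        NetworkAclId:
            Ref: PrivateSubnetANetworkAcl
        RuleNumber: 100
        Protocol: -1                      # -1 = all protocols
        RuleAction: allow
        Egress: true
        CidrBlock: 0.0.0.0/0

PrivateSubnetAAclAssociation:             # replaces the default ACL on this subnet
    Type: AWS::EC2::SubnetNetworkAclAssociation
    Properties:
        SubnetId:
            Ref: PrivateSubnetA
        NetworkAclId:
            Ref: PrivateSubnetANetworkAcl
```

Anything not matched by a rule is denied, so a policy like this blocks, for example, inbound UDP from other subnets; adjust the rules to whatever your workers actually need.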

Final Thoughts

Well that turned out far longer than I thought (just over 2800 words!).

CloudFormation templates have a huge amount of flexibility, so I’m sure that there are different, and probably better, ways to achieve what I’ve created here.  If you’ve got any tips please feel free to leave a comment.

The full template is available from my GitHub.
