3. July 2017 22:25
by Aaron Medacco
15 Comments

Scheduling Automated AMI Backups of Your EC2 Instances

When considering disaster recovery options for systems or applications running on Amazon Web Services, a frequent solution is to use AMIs to restore instances to a known acceptable state in the event of failure or catastrophe. If your team has decided on this approach, you'll want to automate the creation and maintenance of these AMIs to prevent mistakes or somebody "forgetting" to do the task. In this post, I'll walk through how to set this up in AWS within a matter of minutes using Amazon's serverless compute offering, Lambda.

Automated AMI Backups

In a departure from many of my other posts involving Lambda, the following steps use the AWS CLI, so I will assume that you've already installed and configured it on your machine. Relying on the CLI should keep this post relevant for longer, since the browser-based console changes frequently.

I have also built some flexibility into how these backups are maintained, which I encourage you to configure to your liking. The options are the number of backups to retain for each EC2 instance, a customizable tag you assign to the EC2 instances you'd like to back up, and whether to also delete the snapshots of AMIs that are deregistered once they exit the backup window. I have included default values, but I still encourage you to read the options before implementing this solution.

Let's get started.

Creating an IAM policy for access permissions:

  1. Create a file named iam-policy.json with the following contents and save it in your working directory:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Stmt1499061014000",
                "Effect": "Allow",
                "Action": [
                    "ec2:CreateImage",
                    "ec2:CreateTags",
                    "ec2:DeleteSnapshot",
                    "ec2:DeregisterImage",
                    "ec2:DescribeImages",
                    "ec2:DescribeInstances"
                ],
                "Resource": [
                    "*"
                ]
            }
        ]
    }
  2. In your command prompt or terminal window, invoke the following command:
    aws iam create-policy --policy-name ami-backup-policy --policy-document file://iam-policy.json
  3. You'll receive output with details of the policy you've just created. Write down the ARN value as you will need it later.

Creating the IAM role for the Lambda function:

  1. Create a file named role-trust-policy.json with the following contents and save it in your working directory:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": "lambda.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
        }
      ]
    }
  2. In your command prompt or terminal window, invoke the following command:
    aws iam create-role --role-name ami-backup-role --assume-role-policy-document file://role-trust-policy.json
  3. You'll receive output with details of the role you've just created. Be sure to write down the role ARN value provided. You'll need it later.
  4. Run the following command to attach the policy to the role. Substitute ARN with the policy ARN you wrote down when you created the policy:
    aws iam attach-role-policy --policy-arn ARN --role-name ami-backup-role

Creating the Lambda function:

  1. Create a file named index.js with the following contents and save it in your working directory:
    Note: The following file is the code that manages your AMI backups. It has several configurable options, each described in a comment in the code.
    var AWS = require("aws-sdk");
    var ec2 = new AWS.EC2();
    
    var numBackupsToRetain = 2; // The Number Of AMI Backups You Wish To Retain For Each EC2 Instance.
    var instancesToBackupTagName = "BackupAMI"; // Tag Key Attached To Instances You Want AMI Backups Of. Tag Value Should Be Set To "Yes".
    var imageBackupTagName = "ScheduledAMIBackup"; // Tag Key Attached To AMIs Created By This Process. This Process Will Set Tag Value To "True".
    var imageBackupInstanceIdentifierTagName = "ScheduledAMIInstanceId"; // Tag Key Attached To AMIs Created By This Process. This Process Will Set Tag Value To The Instance ID.
    var deleteSnaphots = true; // True if you want to delete snapshots during cleanup. False if you want to only delete AMI, and leave snapshots intact.
    
    exports.handler = function(event, context) {
        var describeInstancesParams = {
            DryRun: false,
            Filters: [{
                Name: "tag:" + instancesToBackupTagName,
                Values: ["Yes"]
            }]
        };
        ec2.describeInstances(describeInstancesParams, function(err, data) {
            if (err) {
                console.log("Failure retrieving instances.");
                console.log(err, err.stack); 
            }
            else {
                for (var i = 0; i < data.Reservations.length; i++) {
                    for (var j = 0; j < data.Reservations[i].Instances.length; j++) {
                        var instanceId = data.Reservations[i].Instances[j].InstanceId;
                        createImage(instanceId);
                    }
                }
            }
        });
        cleanupOldBackups();
    };
    
    var createImage = function(instanceId) {
        console.log("Found Instance: " + instanceId);
        var createImageParams = {
            InstanceId: instanceId,
            Name: "AMI Scheduled Backup I(" + instanceId + ") T(" + new Date().getTime() + ")",
            Description: "AMI Scheduled Backup for Instance (" + instanceId + ")",
            NoReboot: true,
            DryRun: false
        };
        ec2.createImage(createImageParams, function(err, data) {
            if (err) {
                console.log("Failure creating image request for Instance: " + instanceId);
                console.log(err, err.stack);
            }
            else {
                var imageId = data.ImageId;
                console.log("Success creating image request for Instance: " + instanceId + ". Image: " + imageId);
                var createTagsParams = {
                    Resources: [imageId],
                    Tags: [{
                        Key: "Name",
                        Value: "AMI Backup I(" + instanceId + ")"
                    },
                    {
                        Key: imageBackupTagName,
                        Value: "True"
                    },
                    {
                        Key: imageBackupInstanceIdentifierTagName,
                        Value: instanceId
                    }]
                };
                ec2.createTags(createTagsParams, function(err, data) {
                    if (err) {
                        console.log("Failure tagging Image: " + imageId);
                        console.log(err, err.stack);
                    }
                    else {
                        console.log("Success tagging Image: " + imageId);
                    }
                });
            }
        });
    };
    
    var cleanupOldBackups = function() {
        var describeImagesParams = {
            DryRun: false,
            Filters: [{
                Name: "tag:" + imageBackupTagName,
                Values: ["True"]
            }]
        };
        ec2.describeImages(describeImagesParams, function(err, data) {
            if (err) {
                console.log("Failure retrieving images for deletion.");
                console.log(err, err.stack); 
            }
            else {
                var images = data.Images;
                var instanceDictionary = {};
                var instances = [];
                for (var i = 0; i < images.length; i++) {
                    var currentImage = images[i];
                    for (var j = 0; j < currentImage.Tags.length; j++) {
                        var currentTag = currentImage.Tags[j];
                        if (currentTag.Key === imageBackupInstanceIdentifierTagName) {
                            var instanceId = currentTag.Value;
                            if (instanceDictionary[instanceId] === null || instanceDictionary[instanceId] === undefined) {
                                instanceDictionary[instanceId] = [];
                                instances.push(instanceId);
                            }
                            instanceDictionary[instanceId].push({
                                ImageId: currentImage.ImageId,
                                CreationDate: currentImage.CreationDate,
                                BlockDeviceMappings: currentImage.BlockDeviceMappings
                            });
                            break;
                        }
                    }
                }
                for (var t = 0; t < instances.length; t++) {
                    var imageInstanceId = instances[t];
                    var instanceImages = instanceDictionary[imageInstanceId];
                    if (instanceImages.length > numBackupsToRetain) {
                        instanceImages.sort(function (a, b) {
                           return new Date(b.CreationDate) - new Date(a.CreationDate); 
                        });
                        for (var k = numBackupsToRetain; k < instanceImages.length; k++) {
                            var imageId = instanceImages[k].ImageId;
                            var creationDate = instanceImages[k].CreationDate;
                            var blockDeviceMappings = instanceImages[k].BlockDeviceMappings;
                            deregisterImage(imageId, creationDate, blockDeviceMappings);
                        }   
                    }
                    else {
                        console.log("AMI Backup Cleanup not required for Instance: " + imageInstanceId + ". Not enough backups in window yet.");
                    }
                }
            }
        });
    };
    
    var deregisterImage = function(imageId, creationDate, blockDeviceMappings) {
        console.log("Found Image: " + imageId + ". Creation Date: " + creationDate);
        var deregisterImageParams = {
            DryRun: false,
            ImageId: imageId
        };
        console.log("Deregistering Image: " + imageId + ". Creation Date: " + creationDate);
        ec2.deregisterImage(deregisterImageParams, function(err, data) {
           if (err) {
               console.log("Failure deregistering image.");
               console.log(err, err.stack);
           } 
           else {
               console.log("Success deregistering image.");
               if (deleteSnaphots) {
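                   for (var p = 0; p < blockDeviceMappings.length; p++) {
                       // Instance store (ephemeral) mappings have no Ebs property, so guard before reading SnapshotId.
                       var ebs = blockDeviceMappings[p].Ebs;
                       var snapshotId = ebs ? ebs.SnapshotId : null;
                       if (snapshotId) {
                           deleteSnapshot(snapshotId);
                       }
                   }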
               }
           }
        });
    };
    
    var deleteSnapshot = function(snapshotId) {
        var deleteSnapshotParams = {
            DryRun: false,
            SnapshotId: snapshotId
        };
        ec2.deleteSnapshot(deleteSnapshotParams, function(err, data) {
            if (err) {
                console.log("Failure deleting snapshot. Snapshot: " + snapshotId + ".");
                console.log(err, err.stack);
            }
            else {
                console.log("Success deleting snapshot. Snapshot: " + snapshotId + ".");
            }
        });
    };
  2. Compress this file into a zip archive named index.zip.
  3. In your command prompt or terminal window, invoke the following command. Substitute ARN with the role ARN you wrote down when you created the role:
    aws lambda create-function --function-name ami-backup-function --runtime nodejs6.10 --handler index.handler --role ARN --zip-file fileb://index.zip --timeout 30
  4. You'll receive output details about the Lambda function you've just created. Write down the Function ARN value for later use.
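
If you'd like to check the retention behavior before deploying, the selection step inside cleanupOldBackups can be pulled out into a small pure function. This is only a sketch: selectImagesToDeregister is a hypothetical helper that does not exist in index.js, and the image IDs and dates are made up.

```javascript
// Hypothetical helper (not part of index.js): given the images recorded for one
// instance, return the ones that fall outside the retention window.
function selectImagesToDeregister(instanceImages, numBackupsToRetain) {
    // Copy before sorting so the caller's array is left untouched.
    var newestFirst = instanceImages.slice().sort(function (a, b) {
        return new Date(b.CreationDate) - new Date(a.CreationDate);
    });
    // The first numBackupsToRetain entries are kept; everything after has expired.
    return newestFirst.slice(numBackupsToRetain);
}

// Made-up sample data, deliberately out of order.
var sampleImages = [
    { ImageId: "ami-aaa", CreationDate: "2017-07-01T00:00:00.000Z" },
    { ImageId: "ami-ccc", CreationDate: "2017-07-03T00:00:00.000Z" },
    { ImageId: "ami-bbb", CreationDate: "2017-07-02T00:00:00.000Z" }
];
var expired = selectImagesToDeregister(sampleImages, 2);
console.log(expired.map(function (image) { return image.ImageId; }));
```

With a retention count of 2, only the oldest of the three sample images is selected for removal.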

Scheduling the Lambda function:

  1. In your command prompt or terminal window, invoke the following command:
    Note: Feel free to adjust the schedule expression for your own use.
    aws events put-rule --name ami-backup-event-rule --schedule-expression "rate(1 day)"
  2. You'll get the Rule ARN value back as output. Write this down for later.
  3. Run the following command. Substitute ARN for the Rule ARN you just wrote down:
    aws lambda add-permission --function-name ami-backup-function --statement-id LambdaPermission --action "lambda:InvokeFunction" --principal events.amazonaws.com --source-arn ARN
  4. Run the following command. Substitute ARN for the Function ARN of the Lambda function you wrote down:
    aws events put-targets --rule ami-backup-event-rule --targets "Id"="1","Arn"="ARN"

Remember, you must assign the appropriate tag to each EC2 instance you want windowed AMI backups for. Leave a comment if you run into any issue using this solution.
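
If you have many instances to tag, one option is to script it. The sketch below only builds the parameter object for an EC2 createTags call. buildBackupTagParams is a hypothetical helper, the instance ID is a placeholder, and the tag key and value assume you kept the defaults from index.js.

```javascript
// Hypothetical helper: build createTags parameters that mark instances for backup.
// "BackupAMI" / "Yes" match the default values used in index.js.
function buildBackupTagParams(instanceIds, tagKey) {
    return {
        Resources: instanceIds,
        Tags: [{ Key: tagKey || "BackupAMI", Value: "Yes" }]
    };
}

// The instance ID below is a placeholder, not a real resource.
var tagParams = buildBackupTagParams(["i-0123456789abcdef0"]);
console.log(JSON.stringify(tagParams, null, 2));

// With the aws-sdk package available, the object could then be passed along:
// new AWS.EC2().createTags(tagParams, function (err) { if (err) console.log(err); });
```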

Cheers!

Comments (15)

Thanks for this awesome write-up. I followed your directions exactly and am now getting nightly AMI backups. I have 40+ EC2 instances that I'm now backing up, but it's hard to tell what they are by their instance ID.

                Tags: [{
                    Key: "Name",
                    Value: "AMI Backup I(" + instanceId + ")"
                },

Is it possible to grab a tagged value like the EC2Hostname instead of instanceId?

Such that I can name the AMI more like this.

                Tags: [{
                    Key: "Name",
                    Value: "AMI Backup I(" + TagName-Value + ")"
                },

Hi Eric,

Absolutely! You'll need to modify the code a bit in order to modify the naming convention I used.

Specifically, you might want to expand the createImage(instanceId) function signature to include another parameter that is passed in for naming purposes: createImage(instanceId, nameValue).

On line 27 of the Lambda function, I call this function so right before that call, I'd create another variable that's equal to the tag name / value you want included in the AMI name. I'm pulling the instance ID from the object returned by AWS as output from the describeInstances() call.

You can see the structure of this object by visiting: docs.aws.amazon.com/.../describe-instances.html which contains a lot more information about the instance than instance ID or tag names. From there, you shouldn't need to change anything else once you update the Lambda function with the new code.
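
As a rough sketch of the naming part, something like the following could build the AMI name once you have the tag value. buildImageName is a hypothetical helper, not something in the original function. I'm also assuming that AMI names are limited to 3-128 characters made up of letters, digits, spaces, and ( ) [ ] . / - ' @ _ so anything else in the tag value is replaced with a dash. Check that against the EC2 documentation before relying on it.

```javascript
// Hypothetical helper: build an AMI name that includes a tag value.
// Assumption: characters outside the allowed set are replaced with "-",
// and the result is capped at 128 characters.
function buildImageName(instanceId, nameValue, timestamp) {
    var safeName = String(nameValue || "unnamed").replace(/[^A-Za-z0-9()\[\] .\/\-'@_]/g, "-");
    // The timestamp goes before the name so truncation never removes it.
    var name = "AMI Scheduled Backup I(" + instanceId + ") T(" + timestamp + ") N(" + safeName + ")";
    return name.slice(0, 128);
}

// Example with a made-up instance ID and tag value.
console.log(buildImageName("i-123", "web:prod#1", 1499120700000));
```

Inside createImage, the Name parameter could then be set from this helper, passing new Date().getTime() as the timestamp.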

Thanks for reading.

Hello,

I'm testing your script and it works perfectly. I only have 2 issues:

1. At the moment I do not know why, but I get 3 AMIs from one instance within a 2-minute period. The trigger is set to "rate(1 day)". I'm now testing with a dedicated trigger time.

2. Because there is a limit on the number of snapshots that can be created at the same time, I would run into a problem if there are more than 20 volumes. Is it possible to spread out the AMI creation with an additional tag on the instance that indicates the time when the Lambda function should create the AMI?

Best regards

René

Hi, Rene.

1) I included some logging in the Lambda code to output details of what's going on as it executes AMI creation. From that output, can you tell whether the extra AMIs are the result of the Lambda function being executed multiple times, or whether the function is executing once but still creating multiple AMIs for a single instance?

2) That would be possible, but would increase the complexity of the function a bit. You might consider creating a support ticket with AWS support and ask for an increase to the concurrent snapshot creation limit. If I'm not mistaken, that is a soft limit that they can increase for you provided you give them the details. It sounds like you have enough volumes where you may need this.

Alternatively, you could create two copies of the function in this post that occur at separate times. You might split the instances you want to backup into groups, where the instances in each group have a unique tag. You could then alter the instancesToBackupTagName variable at the top of the script to reflect a group's unique tag. That way you're able to stagger the backup process so it doesn't back up everything at once since it sounds like you have enough instances for this to be a problem.
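
A variation on the same idea, if you'd rather not maintain two copies of the code: keep one function and let each scheduled rule say which group to back up. This is a sketch under assumptions. It presumes each rule's target is configured with a constant JSON input such as {"backupTagName": "BackupAMIGroupA"}, and both that field name and resolveBackupTagName are hypothetical.

```javascript
// Default tag key, matching instancesToBackupTagName in index.js.
var defaultTagName = "BackupAMI";

// Hypothetical helper: pick the tag key from the incoming event, falling back
// to the default when the event carries no usable backupTagName field.
function resolveBackupTagName(event) {
    if (event && typeof event.backupTagName === "string" && event.backupTagName.length > 0) {
        return event.backupTagName;
    }
    return defaultTagName;
}

console.log(resolveBackupTagName({ backupTagName: "BackupAMIGroupA" }));
console.log(resolveBackupTagName({}));
```

The handler would then use the returned value in place of instancesToBackupTagName when building the describeInstances filter.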

Thanks for reading.

Hello Aaron,

When I run the last step, I get the following error:

PS C:\> aws events put-targets --rule ami-backup-event-rule --targets "Id"="1","Arn"="arn:aws:lambda:sa-east-1:956684404850:function:ami-backup-function"

Error parsing parameter '--targets': Expected: '=', received: 'EOF' for input:
Id
  ^

I already checked the string, and there is no end-of-line character.

docs.aws.amazon.com/.../put-targets.html

According to the above reference the syntax is correct, can you help me?

Best Regards,
Witallo

Hi Witallo,

I am unclear whether the issue you're experiencing is related to the OS or the prompt you are running. The commands in this post were run on a Windows instance. I've had trouble with single quotes when running the AWS CLI on Windows, which is why these commands use double quotes.

Regardless, you might consider supplying a JSON object / file to specify the parameter values for the last piece of this instead of entering them directly. For instance, if you run:

aws events put-targets --generate-cli-skeleton

...you should get a sample JSON object to submit. You could then create a JSON file, fill in the object by referencing the documentation you linked, and run:

aws events put-targets --cli-input-json file://yourfile.json
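
As a sketch of what that file might contain, the short Node script below prints a JSON object with the rule name and a single target. buildPutTargetsInput is a hypothetical helper, and the ARN is a placeholder where REGION and ACCOUNT_ID stand in for your own values.

```javascript
// Hypothetical helper: build the input object for "aws events put-targets".
function buildPutTargetsInput(ruleName, functionArn) {
    return {
        Rule: ruleName,
        Targets: [{ Id: "1", Arn: functionArn }]
    };
}

// Placeholder ARN: replace REGION and ACCOUNT_ID with your own values.
var input = buildPutTargetsInput(
    "ami-backup-event-rule",
    "arn:aws:lambda:REGION:ACCOUNT_ID:function:ami-backup-function"
);
console.log(JSON.stringify(input, null, 2));
```

Saving the printed output as yourfile.json sidesteps the shell's quoting rules entirely.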

Thanks for reading.

Aaron

Hi Aaron,
Thanks for the details. When I ran it, it worked perfectly, but I am getting snapshots of only the root volume; I don't see the other volumes being backed up. For example, if I have C:\ and D:\ drives on my Windows EC2 instance, I only see a snapshot for C:\ and none for D:\. Is this what it is supposed to do? Doesn't an AMI backup include all the volumes?

Regards,
Arnab

Arnab,

If you were to start a new instance using the AMI created by this process, you should notice that it will default to creating as many EBS volumes as you had when it was taken. I believe this is the default behavior, but you should test this to make sure!

In this implementation, I did not specify the BlockDeviceMappings parameter, so whatever is default will occur. If you would like to dictate how volumes are associated with new instances made from the AMI, you could modify the parameters I am sending to createImage for EC2 (in the Lambda function) to include BlockDeviceMappings:

docs.aws.amazon.com/.../EC2.html

Again, remember to test whatever backups you are taking, whether from this solution or otherwise! Do not rely on any assumption of mine or your own. Test.

Thanks for reading.

Aaron

Hi,

I tried to change the existing schedule expression from 1 to 2, as shown below, and got an error. Is there a specific change I have to make in the JSON script?

# aws events put-rule --name ami-backup-event-rule --schedule-expression "rate(2 day)" --region ap-southeast-1

An error occurred (ValidationException) when calling the PutRule operation: Parameter ScheduleExpression is not valid.

Hello Hemant,

Simple change. Use rate(2 days) instead of rate(2 day).
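
If you generate these expressions in a script, a small helper can take care of the singular and plural forms. This is just a sketch. rateExpression is a hypothetical function, and it assumes the only units accepted are minute, hour, and day.

```javascript
// Hypothetical helper: build a rate() schedule expression, using the singular
// unit for a value of 1 and the plural unit for anything larger.
function rateExpression(value, unit) {
    var units = ["minute", "hour", "day"];
    if (typeof value !== "number" || value < 1 || Math.floor(value) !== value) {
        throw new Error("value must be a positive whole number");
    }
    if (units.indexOf(unit) === -1) {
        throw new Error("unit must be one of: " + units.join(", "));
    }
    return "rate(" + value + " " + unit + (value === 1 ? "" : "s") + ")";
}

console.log(rateExpression(1, "day"));
console.log(rateExpression(2, "day"));
```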

Thanks for reading.

Aaron

Thanks Aaron,

I am trying to get the AMI named with both the instance ID and the instance name. I followed your suggestion of createImage(instanceId, nameValue) and made these changes in my script. However, I am not getting any backups with the instance name. If needed, I can provide the complete script code.

            for (var i = 0; i < data.Reservations.length; i++) {
                for (var j = 0; j < data.Reservations[i].Instances.length; j++) {
                    var instanceId = data.Reservations[i].Instances[j].InstanceId;
                    var nameValue = data.Reservations[i].Tags[j].Name;
                    createImage(instanceId, nameValue);
                }
            }
        }
    });
    cleanupOldBackups();
};

var createImage = function(instanceId, nameValue) {
    console.log("Found Instance: " + instanceId);
    var createImageParams = {
        InstanceId: instanceId,
        Name: "AMI Scheduled Backup I(" + instanceId + ") T(" + new Date().getTime() + ") (" + nameValue + ")",
        Description: "AMI Scheduled Backup for Instance (" + instanceId + ") (" + nameValue + ")",
        NoReboot: true,
        DryRun: false

Hi Hemant,

In your code, the problem is probably in the line where you try to get the name tag:
var nameValue= data.Reservations[i].Tags[j].Name;
This will not work because tags belong to each instance rather than to the reservation, and Tags[j] does not point to the name tag. The following approach worked for me:

var instanceName = data.Reservations[i].Instances[j].Tags.find(tag => tag.Key == "Name").Value;
This will search for the Tag where tag.Key equals "Name" and then take the value of that tag.
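
One caveat: if an instance has no "Name" tag, find returns undefined and reading .Value throws. A more defensive lookup might look like the sketch below. getTagValue is a hypothetical helper, and the sample instances are made up to mirror the shape returned by describeInstances.

```javascript
// Hypothetical helper: return the value of the tag with the given key,
// or the fallback when the instance has no such tag (or no tags at all).
function getTagValue(instance, key, fallback) {
    var tags = instance.Tags || [];
    for (var i = 0; i < tags.length; i++) {
        if (tags[i].Key === key) {
            return tags[i].Value;
        }
    }
    return fallback;
}

// Made-up instances shaped like entries of Reservations[i].Instances.
var named = { InstanceId: "i-111", Tags: [{ Key: "BackupAMI", Value: "Yes" }, { Key: "Name", Value: "web-01" }] };
var unnamed = { InstanceId: "i-222", Tags: [{ Key: "BackupAMI", Value: "Yes" }] };

console.log(getTagValue(named, "Name", named.InstanceId));
console.log(getTagValue(unnamed, "Name", unnamed.InstanceId));
```

Falling back to the instance ID keeps the AMI name meaningful even for untagged instances.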

HTH
Tom

Hi Aaron,

Is it possible to get only the AMI backup and exclude the snapshot backups? If yes, which part of the script needs to be changed?

Thanks
Hemant

Aaron -

I found this solution last week, and with the additional comments above I have been able to put a variation of this in place. It works much better than what I was doing before. However I have one question, and I'm honestly not sure how to go about it.

I need the name of the volume populated in the "Name" tag on the snapshots created with the AMI.

Any suggestions?

Got an error when running the last command:
aws events put-targets --rule ami-backup-event-rule --targets "Id"="1","Arn"="ARN"

Error parsing parameter '--targets': Expected: '=', received: 'EOF' for input: ID

How to solve this one?

Copyright © 2016-2017 Aaron Medacco