Sunday, November 5, 2017

Perform Root Volume Swap for EC2 via the AWS console

I've recently been using the Swap root volume approach for creating a persistent Spot Instance, as described here (Approach 2). Typically it takes 2-5 minutes for my Spot Instance to be fulfilled and the Swap to complete. However, some days, the process never finishes (or at least I get impatient after waiting 20 minutes to an hour!).

To be clear, the Instance is created, but the Swap never happens: I can ssh into the server but my persistent files are not there. I also can see this by going to my AWS console and noting that "spotter" (my persistent storage) has no attachment information:

[Screenshot: EC2 console volume list in which the "spotter" volume shows no attachment information]
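
The same check can also be made from the command line. The following is a minimal sketch, assuming the AWS CLI is configured and that the volume carries a Name tag of "spotter" in us-west-2, as in the configuration shown further down; the function name `volume_attachments` is hypothetical and is not part of ec2-spotter.

```shell
# Hypothetical helper (not part of ec2-spotter): print the ID, state and
# attachment records of every volume whose Name tag matches.
# An empty "Attachments" list means the volume is not attached to any instance.
volume_attachments() {
    vol_name=$1   # value of the volume's Name tag, e.g. spotter
    region=$2     # region that holds the volume, e.g. us-west-2
    aws ec2 describe-volumes \
        --region "$region" \
        --filters "Name=tag:Name,Values=$vol_name" \
        --query 'Volumes[*].{Id:VolumeId,State:State,Attachments:Attachments}' \
        --output json
}

# Example call (requires AWS credentials):
#   volume_attachments spotter us-west-2
```

If the swap has completed, the attachment list should name the Spot Instance; if it is empty, the volume was never attached.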

Since the swap script I'm using never reports any errors, it's hard to see what's failing. Based on my screenshot, can I use the AWS EC2 Management Console to perform the swap "manually", and if so, how?

And, if it helps @Vorsprung,

I initiate the process by running the following script:

    # The config file was created in ondemand_to_spot.sh
    export config_file=my.conf
    cd "$(dirname ${BASH_SOURCE[0]})"

    . ../$config_file || exit -1

    export request_id=`../ec2spotter-launch $config_file`
    echo Spot request ID: $request_id

    echo Waiting for spot request to be fulfilled...
    aws ec2 wait spot-instance-request-fulfilled --spot-instance-request-ids $request_id

    export instance_id=`aws ec2 describe-spot-instance-requests --spot-instance-request-ids $request_id --query="SpotInstanceRequests[*].InstanceId" --output="text"`

    echo Waiting for spot instance to start up...
    aws ec2 wait instance-running --instance-ids $instance_id

    echo Spot instance ID: $instance_id

    echo 'Please allow the root volume swap script a few minutes to finish.'
    if [ "x$ec2spotter_elastic_ip" = "x" ]
    then
        # Non elastic IP
        export ip=`aws ec2 describe-instances --instance-ids $instance_id --filter Name=instance-state-name,Values=running --query "Reservations[*].Instances[*].PublicIpAddress" --output=text`
    else
        # Elastic IP
        export ip=`aws ec2 describe-addresses --allocation-ids $ec2spotter_elastic_ip --output text --query 'Addresses[0].PublicIp'`
    fi

    export name=fast-ai
    if [ "$ec2spotter_key_name" = "aws-key-$name" ]
    then
        function aws-ssh-spot {
            ssh -i ~/.ssh/aws-key-$name.pem ubuntu@$ip
        }
        function aws-terminate-spot {
            aws ec2 terminate-instances --instance-ids $instance_id
        }
        echo Jupyter Notebook -- $ip:8888
    fi

where my.conf is:

    # Name of root volume.
    ec2spotter_volume_name=spotter
    # Location (zone) of root volume. If not the same as ec2spotter_launch_zone,
    # a copy will be created in ec2spotter_launch_zone.
    # Can be left blank, if the same as ec2spotter_launch_zone
    ec2spotter_volume_zone=us-west-2b

    ec2spotter_launch_zone=us-west-2b
    ec2spotter_key_name=aws-key-fast-ai
    ec2spotter_instance_type=p2.xlarge
    # Some instance types require a subnet to be specified:
    ec2spotter_subnet=subnet-c9cba8af

    ec2spotter_bid_price=0.55

    # uncomment and update the value if you want an Elastic IP
    # ec2spotter_elastic_ip=eipalloc-64d5890a

    # Security group
    ec2spotter_security_group=sg-2be79356

    # The AMI to be used as the pre-boot environment. This is NOT your target system installation.
    # Do Not Modify this unless you have a need for a different Kernel version from what's supplied.
    # ami-6edd3078 is ubuntu-xenial-16.04-amd64-server-20170113
    ec2spotter_preboot_image_id=ami-bc508adc

and the ec2spotter-launch script is:

    #!/bin/bash

    # "Phase 1" this is the user-facing script for launching a new spot istance

    if [ "$1" = "" ]; then echo "USER ERROR: please specify a configuration file"; exit -1; fi

    cd $(dirname $0)

    . $1 || exit -1

    # New instance:
    # Desired launch zone
    LAUNCH_ZONE=$ec2spotter_launch_zone
    # Region is LAUNCH_ZONE minus the last character
    LAUNCH_REGION=$(echo $LAUNCH_ZONE | sed -e 's/.$//')
    PUB_KEY=$ec2spotter_key_name

    # Existing Volume:
    # If no volume zone
    if [ "$ec2spotter_volume_zone" = "" ]
    then # Use instance zone
        ec2spotter_volume_zone=$LAUNCH_ZONE
    fi

    # Name of volume (find it by name later)
    ROOT_VOL_NAME=$ec2spotter_volume_name
    # zone of volume (needed if different than instance zone)
    ROOT_ZONE=$ec2spotter_volume_zone
    # Region is Zone minus the last character
    ROOT_REGION=$(echo $ROOT_ZONE | sed -e 's/.$//')

    #echo "ROOT_VOL_NAME=${ROOT_VOL_NAME}; ROOT_ZONE=${ROOT_ZONE}; ROOT_REGION=${ROOT_REGION}; "
    #echo "LAUNCH_ZONE=${LAUNCH_ZONE}; LAUNCH_REGION=${LAUNCH_REGION}; PUB_KEY=${PUB_KEY}"

    AWS_ACCESS_KEY=`aws configure get aws_access_key_id`
    AWS_SECRET_KEY=`aws configure get aws_secret_access_key`

    aws ec2 describe-volumes \
        --filters Name=tag-key,Values="Name" Name=tag-value,Values="$ROOT_VOL_NAME" \
        --region ${ROOT_REGION} --output=json > volumes.tmp || exit -1

    ROOT_VOL=$(jq -r '.Volumes[0].VolumeId' volumes.tmp)
    ROOT_TYPE=$(jq -r '.Volumes[0].VolumeType' volumes.tmp)

    #echo "ROOT_TYPE=$ROOT_TYPE; ROOT_VOL=$ROOT_VOL";
    if [ "$ROOT_VOL_NAME" = "" ]
    then
      echo "root volume lacks a Name tag";
      exit -1;
    fi

    cat >user-data.tmp <<EOF
    #!/bin/sh
    echo AWSAccessKeyId=$AWS_ACCESS_KEY > /root/.aws.creds
    echo AWSSecretKey=$AWS_SECRET_KEY >> /root/.aws.creds

    apt-get update
    apt-get install -y jq
    apt-get install -y python-pip python-setuptools
    apt-get install -y git

    pip install awscli

    cd /root
    git clone --depth=1 https://github.com/slavivanov/ec2-spotter.git
    echo Got spotter scripts from github.

    cd ec2-spotter

    echo Swapping root volume
    ./ec2spotter-remount-root --force 1 --vol_name ${ROOT_VOL_NAME} --vol_region ${ROOT_REGION} --elastic_ip $ec2spotter_elastic_ip
    EOF

    userData=$(base64 user-data.tmp | tr -d '\n');

    cat >specs.tmp <<EOF
    {
      "ImageId" : "$ec2spotter_preboot_image_id",
      "InstanceType": "$ec2spotter_instance_type",
      "KeyName" : "$PUB_KEY",
      "EbsOptimized": true,
      "Placement": {
        "AvailabilityZone": "$LAUNCH_ZONE"
      },
      "BlockDeviceMappings": [
        {
          "DeviceName": "/dev/sda1",
          "Ebs": {
            "DeleteOnTermination": true,
            "VolumeType": "gp2",
            "VolumeSize": 128
          }
        }
      ],
      "NetworkInterfaces": [
        {
          "DeviceIndex": 0,
          "SubnetId": "${ec2spotter_subnet}",
          "Groups": [ "${ec2spotter_security_group}" ],
          "AssociatePublicIpAddress": true
        }
      ],
      "UserData" : "${userData}"
    }
    EOF

    SPOT_REQUEST_ID=$(aws ec2 request-spot-instances --launch-specification file://specs.tmp --spot-price $ec2spotter_bid_price --output="text" --query="SpotInstanceRequests[*].SpotInstanceRequestId" --region ${LAUNCH_REGION})
    echo $SPOT_REQUEST_ID
    # Clean up
    rm user-data.tmp
    rm specs.tmp
    rm volumes.tmp

1 Answer

This is not an exact answer, but it may help you find a way to debug the issue. As I understand it, this is the part of the ec2spotter-launch script responsible for the volume swap:

    ...
    cat >specs.tmp <<EOF
    {
      "ImageId" : "$ec2spotter_preboot_image_id",
      ...
      "UserData" : "${userData}"
    }
    EOF

    SPOT_REQUEST_ID=$(aws ec2 request-spot-instances --launch-specification file://specs.tmp --spot-price $ec2spotter_bid_price --output="text" --query="SpotInstanceRequests[*].SpotInstanceRequestId" --region ${LAUNCH_REGION})

The specs.tmp file is used as the instance launch specification: --launch-specification file://specs.tmp.

The "UserData" inside the launch specification is a script that is also generated in ec2spotter-launch:

    cat >user-data.tmp <<EOF
    #!/bin/sh
    echo AWSAccessKeyId=$AWS_ACCESS_KEY > /root/.aws.creds
    echo AWSSecretKey=$AWS_SECRET_KEY >> /root/.aws.creds

    apt-get update
    ...

    cd /root
    git clone --depth=1 https://github.com/slavivanov/ec2-spotter.git
    echo Got spotter scripts from github.

    cd ec2-spotter

    echo Swapping root volume
    ./ec2spotter-remount-root --force 1 --vol_name ${ROOT_VOL_NAME} --vol_region ${ROOT_REGION} --elastic_ip $ec2spotter_elastic_ip
    EOF

The actual work of swapping the root volume is performed by the ec2spotter-remount-root script, which is downloaded from GitHub.
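
Because the heredoc delimiter is unquoted, the shell expands the variables when ec2spotter-launch runs, so the script the instance receives already contains concrete values. Since user-data.tmp is deleted at the end of the launch script, it can help to look at what was actually encoded. The following is a minimal sketch of the same encoding step and its inverse; the function names `encode_user_data` and `decode_user_data` are hypothetical, and `base64 -d` assumes GNU coreutils or a compatible `base64`.

```shell
# Hypothetical helpers (not part of ec2-spotter).

# Encode a script file the same way ec2spotter-launch does:
# base64 output with the line wrapping removed.
encode_user_data() {
    base64 "$1" | tr -d '\n'
}

# Decode an encoded user-data string back into the script text,
# so the expanded values can be inspected.
decode_user_data() {
    printf '%s' "$1" | base64 -d
}

# Example:
#   encoded=$(encode_user_data user-data.tmp)
#   decode_user_data "$encoded"
```

Decoding the string shows, for example, which values ended up after --vol_name, --vol_region and --elastic_ip on the ec2spotter-remount-root line.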

That script contains many echo statements, so if you find where its output goes, you should be able to see what went wrong. When the issue occurs, ssh to the instance and check the log file. The open question is which file to check (and whether the script output is being logged to a file at all).

Here is what I suggest trying:

  1. Check standard logs under /var/log generated when the instance is starting (cloud-init.log, syslog, etc) to see if you can find the ec2spotter-remount-root output

  2. Try enabling logging yourself; something similar is discussed here

I would try modifying the user-data.tmp part of ec2spotter-launch this way:

    #!/bin/bash
    set -x
    exec > >(tee /var/log/user-data.log|logger -t user-data ) 2>&1
    echo AWSAccessKeyId=$AWS_ACCESS_KEY > /root/.aws.creds
    ...
    echo Swapping root volume
    ./ec2spotter-remount-root --force 1 --vol_name ${ROOT_VOL_NAME} --vol_region ${ROOT_REGION} --elastic_ip $ec2spotter_elastic_ip
    EOF

Here I've changed the first three lines to enable logging to /var/log/user-data.log.

  3. If 1 and 2 don't work, I would try asking the script author on GitHub. As there are lots of echo statements in the script, the author should know where to look for that output.
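
For steps 1 and 2, the search can be scripted once you are on the instance. The following is a minimal sketch that greps a set of candidate log files for the two messages echoed by the user-data script. The function name `find_swap_output` is hypothetical, and the default file list is an assumption: I am not certain which of these files, if any, receives the output on the pre-boot AMI.

```shell
# Hypothetical helper (not part of ec2-spotter): search log files for the
# markers that the generated user-data script echoes.
find_swap_output() {
    # With no arguments, fall back to an assumed list of likely log files.
    if [ "$#" -eq 0 ]; then
        set -- /var/log/cloud-init-output.log /var/log/cloud-init.log \
               /var/log/syslog /var/log/user-data.log
    fi
    for f in "$@"; do
        # Skip files that do not exist or cannot be read.
        [ -r "$f" ] || continue
        # The extra /dev/null argument makes grep print the file name.
        grep -n -E 'Swapping root volume|Got spotter scripts from github' \
            "$f" /dev/null || true
    done
}

# Example (run on the instance, as root if the logs require it):
#   find_swap_output
#   find_swap_output /var/log/user-data.log
```

Each match is printed as file:line:text, so the surrounding lines of the file that matches can then be read for the output of ec2spotter-remount-root.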

Hope that helps. Also, you don't need to wait for the issue to appear to try this out; look for the script output on successful runs too. Or, if you can make a few test runs, do that and make sure you can find the log with the script output.
