I've recently been using the Swap root volume approach for creating a persistent Spot Instance, as described here (Approach 2). Typically it takes 2-5 minutes for my Spot Instance to be fulfilled and the Swap to complete. However, some days, the process never finishes (or at least I get impatient after waiting 20 minutes to an hour!).
To be clear, the Instance is created, but the Swap never happens: I can ssh into the server but my persistent files are not there. I also can see this by going to my AWS console and noting that "spotter" (my persistent storage) has no attachment information:
The swapping script I'm using never reports any errors, so it's hard to see what's failing. Based on my screenshot, can I use the AWS EC2 Management Console to perform the swap "manually"? If so, how would I do that?
And, if it helps @Vorsprung,
I initiate the process by running the following script:
    # The config file was created in ondemand_to_spot.sh
    export config_file=my.conf

    cd "$(dirname ${BASH_SOURCE[0]})"

    . ../$config_file || exit -1

    export request_id=`../ec2spotter-launch $config_file`
    echo Spot request ID: $request_id

    echo Waiting for spot request to be fulfilled...
    aws ec2 wait spot-instance-request-fulfilled --spot-instance-request-ids $request_id

    export instance_id=`aws ec2 describe-spot-instance-requests --spot-instance-request-ids $request_id --query="SpotInstanceRequests[*].InstanceId" --output="text"`

    echo Waiting for spot instance to start up...
    aws ec2 wait instance-running --instance-ids $instance_id

    echo Spot instance ID: $instance_id

    echo 'Please allow the root volume swap script a few minutes to finish.'
    if [ "x$ec2spotter_elastic_ip" = "x" ]
    then
        # Non elastic IP
        export ip=`aws ec2 describe-instances --instance-ids $instance_id --filter Name=instance-state-name,Values=running --query "Reservations[*].Instances[*].PublicIpAddress" --output=text`
    else
        # Elastic IP
        export ip=`aws ec2 describe-addresses --allocation-ids $ec2spotter_elastic_ip --output text --query 'Addresses[0].PublicIp'`
    fi

    export name=fast-ai
    if [ "$ec2spotter_key_name" = "aws-key-$name" ]
    then
        function aws-ssh-spot {
            ssh -i ~/.ssh/aws-key-$name.pem ubuntu@$ip
        }
        function aws-terminate-spot {
            aws ec2 terminate-instances --instance-ids $instance_id
        }
        echo Jupyter Notebook -- $ip:8888
    fi
where my.conf is:
    # Name of root volume.
    ec2spotter_volume_name=spotter
    # Location (zone) of root volume. If not the same as ec2spotter_launch_zone,
    # a copy will be created in ec2spotter_launch_zone.
    # Can be left blank, if the same as ec2spotter_launch_zone
    ec2spotter_volume_zone=us-west-2b

    ec2spotter_launch_zone=us-west-2b
    ec2spotter_key_name=aws-key-fast-ai
    ec2spotter_instance_type=p2.xlarge
    # Some instance types require a subnet to be specified:
    ec2spotter_subnet=subnet-c9cba8af

    ec2spotter_bid_price=0.55
    # uncomment and update the value if you want an Elastic IP
    # ec2spotter_elastic_ip=eipalloc-64d5890a

    # Security group
    ec2spotter_security_group=sg-2be79356

    # The AMI to be used as the pre-boot environment. This is NOT your target system installation.
    # Do Not Modify this unless you have a need for a different Kernel version from what's supplied.
    # ami-6edd3078 is ubuntu-xenial-16.04-amd64-server-20170113
    ec2spotter_preboot_image_id=ami-bc508adc
and the ec2spotter-launch script is:
    #!/bin/bash

    # "Phase 1" this is the user-facing script for launching a new spot istance

    if [ "$1" = "" ]; then echo "USER ERROR: please specify a configuration file"; exit -1; fi

    cd $(dirname $0)

    . $1 || exit -1

    # New instance:
    # Desired launch zone
    LAUNCH_ZONE=$ec2spotter_launch_zone
    # Region is LAUNCH_ZONE minus the last character
    LAUNCH_REGION=$(echo $LAUNCH_ZONE | sed -e 's/.$//')

    PUB_KEY=$ec2spotter_key_name

    # Existing Volume:
    # If no volume zone
    if [ "$ec2spotter_volume_zone" = "" ]
    then
        # Use instance zone
        ec2spotter_volume_zone=$LAUNCH_ZONE
    fi

    # Name of volume (find it by name later)
    ROOT_VOL_NAME=$ec2spotter_volume_name
    # zone of volume (needed if different than instance zone)
    ROOT_ZONE=$ec2spotter_volume_zone
    # Region is Zone minus the last character
    ROOT_REGION=$(echo $ROOT_ZONE | sed -e 's/.$//')

    #echo "ROOT_VOL_NAME=${ROOT_VOL_NAME}; ROOT_ZONE=${ROOT_ZONE}; ROOT_REGION=${ROOT_REGION}; "
    #echo "LAUNCH_ZONE=${LAUNCH_ZONE}; LAUNCH_REGION=${LAUNCH_REGION}; PUB_KEY=${PUB_KEY}"

    AWS_ACCESS_KEY=`aws configure get aws_access_key_id`
    AWS_SECRET_KEY=`aws configure get aws_secret_access_key`

    aws ec2 describe-volumes \
        --filters Name=tag-key,Values="Name" Name=tag-value,Values="$ROOT_VOL_NAME" \
        --region ${ROOT_REGION} --output=json > volumes.tmp || exit -1

    ROOT_VOL=$(jq -r '.Volumes[0].VolumeId' volumes.tmp)
    ROOT_TYPE=$(jq -r '.Volumes[0].VolumeType' volumes.tmp)

    #echo "ROOT_TYPE=$ROOT_TYPE; ROOT_VOL=$ROOT_VOL";

    if [ "$ROOT_VOL_NAME" = "" ]
    then
        echo "root volume lacks a Name tag";
        exit -1;
    fi

    cat >user-data.tmp <<EOF
    #!/bin/sh
    echo AWSAccessKeyId=$AWS_ACCESS_KEY > /root/.aws.creds
    echo AWSSecretKey=$AWS_SECRET_KEY >> /root/.aws.creds
    apt-get update
    apt-get install -y jq
    apt-get install -y python-pip python-setuptools
    apt-get install -y git
    pip install awscli
    cd /root
    git clone --depth=1 https://github.com/slavivanov/ec2-spotter.git
    echo Got spotter scripts from github.
    cd ec2-spotter
    echo Swapping root volume
    ./ec2spotter-remount-root --force 1 --vol_name ${ROOT_VOL_NAME} --vol_region ${ROOT_REGION} --elastic_ip $ec2spotter_elastic_ip
    EOF

    userData=$(base64 user-data.tmp | tr -d '\n');

    cat >specs.tmp <<EOF
    {
      "ImageId" : "$ec2spotter_preboot_image_id",
      "InstanceType": "$ec2spotter_instance_type",
      "KeyName" : "$PUB_KEY",
      "EbsOptimized": true,
      "Placement": {
        "AvailabilityZone": "$LAUNCH_ZONE"
      },
      "BlockDeviceMappings": [
        {
          "DeviceName": "/dev/sda1",
          "Ebs": {
            "DeleteOnTermination": true,
            "VolumeType": "gp2",
            "VolumeSize": 128
          }
        }
      ],
      "NetworkInterfaces": [
        {
          "DeviceIndex": 0,
          "SubnetId": "${ec2spotter_subnet}",
          "Groups": [ "${ec2spotter_security_group}" ],
          "AssociatePublicIpAddress": true
        }
      ],
      "UserData" : "${userData}"
    }
    EOF

    SPOT_REQUEST_ID=$(aws ec2 request-spot-instances --launch-specification file://specs.tmp --spot-price $ec2spotter_bid_price --output="text" --query="SpotInstanceRequests[*].SpotInstanceRequestId" --region ${LAUNCH_REGION})

    echo $SPOT_REQUEST_ID

    # Clean up
    rm user-data.tmp
    rm specs.tmp
    rm volumes.tmp
1 Answer
This is not an exact answer, but it may help you find a way to debug the issue. As I understand it, this is the part of the ec2spotter-launch script that is responsible for the volume swap:
    ...
    cat >specs.tmp <<EOF
    {
      "ImageId" : "$ec2spotter_preboot_image_id",
      ...
      "UserData" : "${userData}"
    }
    EOF

    SPOT_REQUEST_ID=$(aws ec2 request-spot-instances --launch-specification file://specs.tmp --spot-price $ec2spotter_bid_price --output="text" --query="SpotInstanceRequests[*].SpotInstanceRequestId" --region ${LAUNCH_REGION})
The specs.tmp file is used as the instance launch specification: --launch-specification file://specs.tmp.
The "UserData" inside the launch specification is a script that is also generated in ec2spotter-launch:
    cat >user-data.tmp <<EOF
    #!/bin/sh
    echo AWSAccessKeyId=$AWS_ACCESS_KEY > /root/.aws.creds
    echo AWSSecretKey=$AWS_SECRET_KEY >> /root/.aws.creds
    apt-get update
    ...
    cd /root
    git clone --depth=1 https://github.com/slavivanov/ec2-spotter.git
    echo Got spotter scripts from github.
    cd ec2-spotter
    echo Swapping root volume
    ./ec2spotter-remount-root --force 1 --vol_name ${ROOT_VOL_NAME} --vol_region ${ROOT_REGION} --elastic_ip $ec2spotter_elastic_ip
    EOF
The actual work of swapping the root volume is performed by the ec2spotter-remount-root script, which is downloaded from GitHub.
There are many echo statements in that script, so I think that if you find where the output goes, you'll be able to understand what went wrong. When the issue occurs, you can ssh to the instance and check the log file. The question is which file to check (and whether the script output is being logged to a file at all).
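Before reading logs, it may help to confirm from the command line that a given run really is one where the volume was never attached, since this is the same information the console shows. The following is only a sketch: `volume_attachment_state` is a hypothetical helper name, and the volume name and region in the usage comment are taken from the my.conf in the question.

```shell
# Hypothetical helper: print the attachment state of the first volume
# whose Name tag matches. Runs wherever the AWS CLI is configured.
volume_attachment_state() {
    vol_name=$1
    region=$2
    aws ec2 describe-volumes \
        --filters "Name=tag:Name,Values=$vol_name" \
        --region "$region" \
        --query 'Volumes[0].Attachments[0].State' \
        --output text
}

# Usage (values assumed from my.conf):
#   volume_attachment_state spotter us-west-2
```

A state such as attached or attaching suggests the swap script got at least as far as the attach call. With no attachment at all, the text output is typically None.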
Here is what I suggest trying:
1. Check the standard logs under /var/log that are generated when the instance starts (cloud-init.log, syslog, etc.) to see if you can find the ec2spotter-remount-root output.
2. Try to enable logging yourself; something similar is discussed here. I would try modifying the user-data.tmp part in ec2spotter-launch this way:
    #!/bin/bash
    set -x
    exec > >(tee /var/log/user-data.log|logger -t user-data ) 2>&1
    echo AWSAccessKeyId=$AWS_ACCESS_KEY > /root/.aws.creds
    ...
    echo Swapping root volume
    ./ec2spotter-remount-root --force 1 --vol_name ${ROOT_VOL_NAME} --vol_region ${ROOT_REGION} --elastic_ip $ec2spotter_elastic_ip
    EOF
Here I've changed the first three lines to enable logging into /var/log/user-data.log.
3. If 1 and 2 don't work, I would try asking the script author on GitHub. As there are lots of echo statements in the script, the author should know where to look for that output.
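For the first suggestion, a small search helper can save some manual digging once you are on the instance. This is a minimal sketch: `find_swap_output` is a hypothetical name, the file list assumes a stock Ubuntu image with cloud-init (plus the user-data.log from the second suggestion), and the default pattern is built from the echo strings in the user-data script above.

```shell
# Hypothetical helper: grep the usual boot logs for the swap script's output.
# $1 = log directory (default /var/log), $2 = extended regex to look for.
find_swap_output() {
    log_dir=$1
    pattern=$2
    [ -n "$log_dir" ] || log_dir=/var/log
    [ -n "$pattern" ] || pattern='Swapping root volume|Got spotter scripts|ec2spotter'
    for f in cloud-init-output.log cloud-init.log syslog user-data.log; do
        if [ -r "$log_dir/$f" ]; then
            # -H prints the file name, -n the line number of each match
            grep -EHn "$pattern" "$log_dir/$f" || true
        fi
    done
    return 0
}

# Usage on the instance (reading /var/log may require root):
#   find_swap_output
#   find_swap_output /var/log 'remount'
```

If nothing is printed for any of the files, that is itself a hint that the user-data script's output is not being captured, which is the case the second suggestion addresses.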
Hope that helps. Also, you don't need to wait for the issue to appear to try this out; look for the script output on successful runs too. Or, if you are able to make a few test runs, do that and make sure you can find the log with the script output.
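The logging redirect itself can also be tried on a local machine before spending money on test instances. The sketch below generates a tiny stand-in script with the same exec/tee line and checks that both stdout and stderr land in the log. The file names are hypothetical, the logger pipe is left out so it runs on systems without syslog, and the commands inside are placeholders, not the real swap.

```shell
#!/bin/sh
# Local dry run of the logging redirect (no AWS involved).
workdir=$(mktemp -d)
LOG_FILE="$workdir/user-data.log"

# Unquoted heredoc, as in ec2spotter-launch: $LOG_FILE is expanded now,
# while the stand-in script is being generated.
cat > "$workdir/demo-user-data.sh" <<EOF
#!/bin/bash
set -x
exec > >(tee "$LOG_FILE") 2>&1
echo Swapping root volume
ls /nonexistent-path-for-demo || true
EOF

bash "$workdir/demo-user-data.sh"
sleep 1   # tee runs in the background; give it a moment to finish writing
echo "--- contents of $LOG_FILE ---"
cat "$LOG_FILE"
```

The log should contain the echo output, the set -x trace lines, and the error message from ls. Process substitution is a bash feature, which is why the shebang changes from /bin/sh to /bin/bash.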