
Downloading an Amazon EC2 AMI to local drive

I'm posting an addition to Jiaqi Zhang's very precise post http://weaponshot.wordpress.com/2012/04/08/downloading-an-ami-to-local/ on downloading an EBS-backed Amazon EC2 AMI, extending it so you can also BOOT the downloaded instance.

I keep the same numbers as the original post, for your reference.

1. Choose an existing EBS-backed AMI that you want to download, launch an instance from it if one is not already running, and check which filesystem it uses by invoking "df -T" in the instance. This guide assumes ext4; take note of the type reported, because you will need it again in the mount and partition steps below (if your AMI uses ext3, for example, substitute that everywhere).
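
For example, the check might look roughly like this (device name and sizes here are only illustrative and will differ on your instance; the Type column is what matters):

df -T
Filesystem    Type  1K-blocks    Used Available Use% Mounted on
/dev/xvda1    ext4   10321208 1089300   8707556  12% /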

3.1 Use “su” to change to root.

3.3 Download your credentials to your computer, namely pk-XXX.pem and cert-XXX.pem. These can be found in the X.509 certificates tab of the credentials panel, reached from the upper right of the console screen (just click your name).

3.4 Copy them to your AMI instance using "scp -i <identity_file.pem> <pk-XXX.pem> <cert-XXX.pem> ec2-user@your_ami:~/directory". Here identity_file.pem is the key file you downloaded when you created the instance or the key pair.

3.5 Log in to the instance and invoke "ec2-bundle-vol -k <pk-XXX.pem> -c <cert-XXX.pem> -u <user_id>". The two .pem files are the ones you just copied in the previous step. The user_id is the numeric AWS account ID you can find in your "account activity" page (upper right, under your name), with the dashes.
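
As a concrete, illustrative invocation (the account ID below is made up; use your own):

ec2-bundle-vol -k pk-XXX.pem -c cert-XXX.pem -u 1234-5678-9012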

(If you don’t have the ec2-ami-tools pre-installed, see instructions in the original post)

4. Bundling an image means compressing it and splitting it into a set of part files, which you will find in the /tmp directory.
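
After bundling finishes, /tmp should contain something like the following (the number of parts depends on the image size):

ls /tmp
image  image.manifest.xml  image.part.00  image.part.01  ...  image.part.NN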

4.1 To upload it to S3: create a bucket in your S3 console panel and name it something like "mybuck". Don't use capital letters, spaces, dashes or underscores.

4.2 In your AMI instance, "cd /tmp" and invoke "ec2-upload-bundle -b <mybuck> -m <manifest_file> -a <access_key> -s <secret_key>". Here the manifest_file is the XML file automatically generated when you invoked the bundle command; it should be in the /tmp directory together with the image.part.XX files. You can find the access_key and secret_key in your credentials panel under the "Access Keys" tab. The secret key is hidden by default, so click "Show" to make it visible: copy and paste both of them, and you are there.
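
Again an illustrative session (both keys below are made up; substitute your own):

cd /tmp
ec2-upload-bundle -b mybuck -m image.manifest.xml -a AKIAEXAMPLEEXAMPLE -s ExampleSecretKeyExampleSecretKeyExample1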

Now that all the image files are uploaded, let’s download them to our local machine.

Open VMware and start an Ubuntu machine; you can download Ubuntu freely from the official website. If you use VMware 3.1.4, don't download Ubuntu 12.x, because VMware Tools, which you need in order to share folders between your real machine and your VM, is supported only on 10.x. Choose 32-bit or 64-bit according to the EC2 machine you are importing. This is very important!

Once you have downloaded the right .iso, create a new virtual machine in VMware using the .iso as the installation disk.

When your Ubuntu machine is running, VMware should install VMware Tools for you automatically (just look in the Virtual Machine menu and wait). Check with "ls -l /mnt/hgfs" that you can see your shared folder. If not, install VMware Tools manually (it's an option in the Virtual Machine menu).

6.1 Now install the Amazon EC2 toolkit on your local machine with the command "apt-get install ec2-ami-tools" (and also ec2-api-tools if you want to control EC2 VMs from there).

To run these commands, you first have to copy the two .pem files from your computer into your home directory, using the shared folder /mnt/hgfs.

Now create a directory (e.g., in your home dir), cd into it, and invoke "ec2-download-bundle -b <mybuck> -a <access_key> -s <secret_key> -k <pk-XXX.pem>": this will download the bundled image files from your S3 bucket into that directory.

In this same directory, invoke "ec2-unbundle -k <pk-XXX.pem> -m <image.manifest.xml>". You should then get back the 10GB file named "image" (the file size depends on the type of AMI; "small" ones get 10GB).
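
Putting the last two steps together, an illustrative session on the local Ubuntu VM (bucket and key file names as above):

mkdir ~/ec2-image && cd ~/ec2-image
ec2-download-bundle -b mybuck -a <access_key> -s <secret_key> -k pk-XXX.pem
ec2-unbundle -k pk-XXX.pem -m image.manifest.xml
ls -lh image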

Now there are two possibilities:

-> If you only want to mount your image, not boot it, you can simply install qemu with "apt-get install qemu" and invoke "qemu-img convert -f raw -O vmdk image /tmp/ec2-image.vmdk". Then move this .vmdk file into the shared folder, halt the VM, attach the new vmdk to your VM, and start the VM again. Do a "df" to check whether your boot disk is /dev/sda or /dev/sdb, then mount the new disk with "mkdir /mnt/yourdisc" and "mount -t ext4 /dev/sdX /mnt/yourdisc", where X is b if your boot disk is a, or vice versa. (The whole path is condensed into a sketch below.)

Then "cd /mnt/yourdisc" … and there is all your stuff!
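
Condensed, and assuming the new disk shows up as /dev/sdb (check with df on your own VM first), the mount-only path is:

apt-get install qemu
qemu-img convert -f raw -O vmdk image /tmp/ec2-image.vmdk
# move ec2-image.vmdk out via /mnt/hgfs, halt, attach it in VMware, boot again
df                                     # confirm which device is the new disk
mkdir /mnt/yourdisc
mount -t ext4 /dev/sdb /mnt/yourdisc   # whole disk: the raw image is a bare filesystem, no partition table
cd /mnt/yourdisc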

— *** —

-> But if you want to boot your instance, there are some more steps. FYI, one post says it's so hard that it's not worth it, but another one shows a way to do it, even if incomplete, and after a lot of hours I managed to do it.

You just obtained your 10GB image file, right?

Now shut down your VM, go to the hard drive settings panel in VMware, and create a secondary disk to attach to your instance. Let's select an 11GB SCSI drive: you should make it about 10% bigger than your original disk (we assume the image is 10GB).

7.3 Boot up your VM again and run "fdisk -l" to check that you see /dev/sdb (assuming /dev/sda is your primary disk), with no valid partition table because it's not formatted yet.

Before copying your data onto it from the image file, install gparted ("apt-get install gparted") and run it. In gparted, choose Device / Create Partition Table with the standard (msdos) label, and then create an ext4 partition leaving 1MB free at the beginning. It would probably be even better to create a 9GB partition plus 1GB of swap space. Untick "round to cylinders", or gparted will not leave that 1MB. Also create the 990MB swap partition. Commit, and then right-click the primary partition and set the "boot" flag. (A command-line equivalent is sketched below.)
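
For reference, roughly the same layout can be produced with parted; this is only a sketch under the 11GB-disk assumption, and gparted remains the friendlier option (the ext4 flag here only sets the partition type: dd will write the real filesystem over partition 1 anyway):

apt-get install parted
parted -s /dev/sdb mklabel msdos
parted -s /dev/sdb mkpart primary ext4 1MiB 10GiB
parted -s /dev/sdb mkpart primary linux-swap 10GiB 100%
parted -s /dev/sdb set 1 boot on
mkswap /dev/sdb2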

7.5 Now, as root, invoke "dd if=image of=/dev/sdb1". It can take anywhere from a few minutes to a whole night, depending on your hardware; I have a MacBook Pro with an SSD, so it took only 6 minutes. This command copies the data from the unbundled raw image onto the partition you just created.
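
A bigger block size usually speeds dd up considerably, so you may prefer:

dd if=image of=/dev/sdb1 bs=1M
sync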

7.6 Now invoke "mkdir /mnt/ec2" and then "mount -t ext4 /dev/sdb1 /mnt/ec2": in the /mnt/ec2 directory you will see all the files of your AMI. Up to here it's very similar to the mount-only path above, but we left 1MB of space at the beginning to make the disk bootable.

Before making the disk bootable with GRUB, I replaced the kernel in the /boot directory, taking it from an Ubuntu 10.04 64-bit install (because my Amazon Linux AMI was 64-bit).

I just saved the old /mnt/ec2/boot directory and created a new one by copying over the whole /boot directory taken from Ubuntu 10.04.
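
In commands, that was roughly (the .old suffix is my arbitrary choice):

mv /mnt/ec2/boot /mnt/ec2/boot.old   # keep the original AMI kernel around
cp -a /boot /mnt/ec2/boot            # copy the whole Ubuntu 10.04 /boot over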

Then I created a /mnt/ec2/boot/grub/menu.lst file with this content:

default=0
timeout=0
hiddenmenu 
title EC2 with kernel 2.6.32 from Ubuntu10.04
root (hd0,0)
kernel /boot/vmlinuz-2.6.32-38-generic root=/dev/sda1
initrd /boot/initrd.img-2.6.32-38-generic

The kernel and initrd lines must of course match the names of the vmlinuz* and initrd* files in your boot directory.

Pay attention: inside this menu we use hd0 and sda1 because when it boots, this disk will be the first (and only) one. Right now, though, this disk is the second one, which is why the following commands use hd1 and sdb.

Now invoke "grub-install --root-directory=/mnt/ec2 /dev/sdb" (double-check with df that your new hard disk really is /dev/sdb, or you could have trouble booting from the primary disk again!).

Then invoke "grub --device-map=/dev/null" and, at the grub> prompt, type the commands shown after each grub> below (the other lines are the output, which I leave in for your reference).

You may have to change the numbers in the geometry command: run "fdisk -l -u" and look at your cylinders, heads, and sectors. Heads and sectors will be 63 and 255, but the cylinder count can change if you chose a different size for your disk.

grub> device (hd1) /dev/sdb
grub> geometry (hd1) 1435 63 255
drive 0x81: C/H/S = 1435/63/255, The number of sectors = 23053275, /dev/sdb
   Partition num: 0,  Filesystem type is ext2fs, partition type 0x83
grub> root (hd1,0)
grub> setup (hd1)
 Checking if "/boot/grub/stage1" exists... yes
 Checking if "/boot/grub/stage2" exists... yes
 Checking if "/boot/grub/e2fs_stage1_5" exists... yes
 Running "embed /boot/grub/e2fs_stage1_5 (hd1)"...  17 sectors are embedded.
succeeded
 Running "install /boot/grub/stage1 (hd1) (hd1)1+17 p (hd1,0)/boot/grub/stage2
/boot/grub/menu.lst"... succeeded
Done.

Now the disk is ready to be booted, but we still have to:

- copy /lib/modules from your Ubuntu machine, or you won't be able to run some commands that need the right modules compiled for your new kernel (a one-line sketch follows).
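
Assuming the image is still mounted on /mnt/ec2, the copy is just:

cp -a /lib/modules/$(uname -r) /mnt/ec2/lib/modules/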

After doing this, a known bug in evbug.ko will spam your console, so you have to:

cd /lib/modules/$(uname -r)/kernel/drivers/input

mv evbug.ko evbug.ko.disabled

and also add a line to /etc/modprobe.d/blacklist.conf: "blacklist evbug".

Also rename /usr/bin/cloud-init and /usr/bin/cloud-init-cfg to something else, or at boot you'll have to wait 3 or 4 minutes through a lot of errors from these programs trying to reach internal Amazon addresses.
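
For example, assuming you do this while the image is still mounted on /mnt/ec2 (the .disabled suffix is arbitrary):

mv /mnt/ec2/usr/bin/cloud-init /mnt/ec2/usr/bin/cloud-init.disabled
mv /mnt/ec2/usr/bin/cloud-init-cfg /mnt/ec2/usr/bin/cloud-init-cfg.disabled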

Last but not least, open /etc/shadow on your Ubuntu 64-bit machine, copy your encrypted password hash, and paste it into the root entry of /mnt/ec2/etc/shadow, or you will not be able to log in as root. (Your ec2-user password should still work, though, and so should "sudo su".)
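
One way to do the copy, assuming the image is still mounted ("youruser" is a placeholder for whichever local account's password you know; the hash is the second colon-separated field of the entry):

grep '^youruser:' /etc/shadow   # print the shadow line for an account whose password you know
vi /mnt/ec2/etc/shadow          # replace the second field of the root entry with that hash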

Now halt the VM and create a fresh, empty machine (I chose a CentOS 64-bit profile, but it should also work with an Ubuntu one, if it makes any difference: it's only an empty machine, so it should be the same). Tell VMware to use an existing disk, browse into the other Ubuntu 64 machine, find the secondary disk, and choose the default option (to make a copy of that disk). At the end of the copy, cross your fingers and run the machine.

The console is really hard to use, so you'll want to log in via SSH. But you'll see that SSH is not working, because its host keys are normally generated by cloud-init, which is no longer running.

It should be possible to generate the keys with ssh-keygen, but I didn't know how to use it. I tried reinstalling with yum, but after the remove I couldn't install it back, so I downloaded the OpenSSL and OpenSSH sources from their .org sites, compiled and installed them following the instructions, and SSH is working perfectly now.
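
For what it's worth, regenerating the host keys by hand should just be a matter of something like the following (this is my untested guess at the step I skipped, and the key types depend on your OpenSSH version):

ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key -N ''
ssh-keygen -t dsa -f /etc/ssh/ssh_host_dsa_key -N ''
/etc/init.d/sshd restart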

Please cite my blog if you want to repost or share this article (or part of it). Thanks!


12 Comments

  1. Todd says:

    Hi Emanuele,

    Thanks for taking the time to document these steps. I just tried them and ran into a couple of things that were confusing so I thought I would leave you some comments. I encountered Amazon’s 10gb image limitation for the AMI I was trying to virtualize so I unfortunately could not proceed past 3.5. Your steps were extremely helpful for the parts I could do.

    I’m keeping your same numbering sequence.

    1. I think what you mean here is to make a note of what file system the ‘df -T’ reports so that you can specify the proper filesystem type in the mount command at steps 6.1 and 7.6, and in the partition creation at step 7.4. In other words, ‘df -T’ does not =have= to give ext4. In my case, the AMI’s file system was ext3.

    3.4 When I downloaded the X.509 certificates, I only got a cert-XXX.pem file. At step 3.4 you also need to copy the identity_file.pem to the AMI instance. Step 3.5 requires the -k parameter to be included, and I didn’t think I needed the -k parameter since I didn’t have any pk-XXX.pem files, so I think the scp line should read:

    scp -i identity_file.pem identity_file.pem pk-XXX.pem cert-XXX.pem ec2-user@your_ami:~/directory

    with the comment: “Note: the repetition of identity_file.pem is NOT a typo. You need to copy it to your AMI instance as well as any other X.509 certificates.”

    I don’t know what needs to happen if someone has pk-XXX.pem files also.

    3.5 Based on the problem I ran into at step 3.4, the invoke line I had to type was:

    ec2-bundle-vol -k identity_file.pem -c cert-XXX.pem -u user_id

    That is, replace “-k pk-XXX.pem” with “-k identity_file.pem”.

    6.1 This step may also need to have pkXXX.pem replaced with identity_file.pem. There are two places to change it in this step.

    Kind regards,
    Todd

  2. Todd says:

    Is there no way to turn on a preview for replies left here so people can make sure they look OK before committing the post?

  3. Thanks Todd! I've been away for a while. Next time, just paste your comment into a text editor and check that everything is there, then cut it again from the text editor and paste it into the WordPress comment box. That's the way to preview.

    • Todd says:

      But I did paste into a text editor. The problem is that a text editor does not render comments in the same way as WordPress. This is why a preview in WordPress is needed.

  4. Pie says:

    Hi! I've followed your tutorial and I have a question: can I simply download via FTP the "image" file generated by ec2-bundle-vol (the 10GB one), without having to upload the files to S3, download them, and unbundle them?

  5. Great info, thanks for sharing.

    I did it in the past using another method.

    In case someone needs it.

    Use VMware or VirtualBox to spin up a machine with the same settings you use on EC2.

    For example, I used it to virtualize an old server running RedHat 4.8 into EC2, but you could use it to go the opposite way.

    I spun up an Oracle 4.9 on EC2, removed the need for the .pem, and changed sshd_config to allow connecting as root,

    changed the root password,

    and created a file on my old server called exclude.txt.

    Example exclude.txt:

    /boot
    /proc
    /sys
    /tmp
    /dev
    /var/lock
    /etc/fstab
    /etc/mdadm.conf
    /etc/mtab
    /etc/resolv.conf
    /etc/conf.d/net
    /etc/network/interfaces
    /etc/networks
    /etc/sysconfig/network*
    /etc/sysconfig/hwconf
    /etc/sysconfig/ip6tables-config
    /etc/sysconfig/kernel
    /etc/hostname
    /etc/HOSTNAME
    /etc/hosts
    /etc/modprobe*
    /etc/modules
    /etc/udev
    /net
    /home/rasteri
    /var/spool
    /lib/modules
    /etc/rc.conf

    sudo rsync -e 'ssh -p 22' -azPx --delete-after --exclude-from="/exclude.txt" / root@ec2-54-235-230-128.compute-1.amazonaws.com:/

    When it finished, I rebooted the instance and the old server started up on EC2.

    You could also do it the other way around:

    EC2 to VMware or VirtualBox.

    See you…..

  6. How to enable root SSH login, no .pem required:

    # $OpenBSD: sshd_config,v 1.69 2004/05/23 23:59:53 dtucker Exp $

    # This is the sshd server system-wide configuration file. See
    # sshd_config(5) for more information.

    # This sshd was compiled with PATH=/usr/local/bin:/bin:/usr/bin

    # The strategy used for options in the default sshd_config shipped with
    # OpenSSH is to specify options with their default value where
    # possible, but leave them commented. Uncommented options change a
    # default value.

    Port 22
    Port 554
    Protocol 2
    #ListenAddress 0.0.0.0
    #ListenAddress ::

    # HostKey for protocol version 1
    #HostKey /etc/ssh/ssh_host_key
    # HostKeys for protocol version 2
    #HostKey /etc/ssh/ssh_host_rsa_key
    #HostKey /etc/ssh/ssh_host_dsa_key

    # Lifetime and size of ephemeral version 1 server key
    #KeyRegenerationInterval 1h
    #ServerKeyBits 768

    # Logging
    #obsoletes QuietMode and FascistLogging
    #SyslogFacility AUTH
    SyslogFacility AUTHPRIV
    #LogLevel INFO

    # Authentication:

    #LoginGraceTime 2m
    PermitRootLogin yes
    #StrictModes yes
    #MaxAuthTries 6

    #RSAAuthentication yes
    #PubkeyAuthentication yes
    #AuthorizedKeysFile .ssh/authorized_keys

    # For this to work you will also need host keys in /etc/ssh/ssh_known_hosts
    RhostsRSAAuthentication yes
    # similar for protocol version 2
    #HostbasedAuthentication no
    # Change to yes if you don't trust ~/.ssh/known_hosts for
    # RhostsRSAAuthentication and HostbasedAuthentication
    IgnoreUserKnownHosts yes
    # Don't read the user's ~/.rhosts and ~/.shosts files
    IgnoreRhosts no

    # To disable tunneled clear text passwords, change to no here!
    #PasswordAuthentication yes
    #PermitEmptyPasswords no
    PasswordAuthentication yes

    # Change to no to disable s/key passwords
    #ChallengeResponseAuthentication yes
    ChallengeResponseAuthentication yes

    # Kerberos options
    #KerberosAuthentication no
    #KerberosOrLocalPasswd yes
    #KerberosTicketCleanup yes
    #KerberosGetAFSToken no

    # GSSAPI options
    #GSSAPIAuthentication no
    GSSAPIAuthentication yes
    #GSSAPICleanupCredentials yes
    GSSAPICleanupCredentials yes

    # Set this to 'yes' to enable PAM authentication, account processing,
    # and session processing. If this is enabled, PAM authentication will
    # be allowed through the ChallengeResponseAuthentication mechanism.
    # Depending on your PAM configuration, this may bypass the setting of
    # PasswordAuthentication, PermitEmptyPasswords, and
    # "PermitRootLogin without-password". If you just want the PAM account and
    # session checks to run without PAM authentication, then enable this but set
    # ChallengeResponseAuthentication=no
    #UsePAM no
    UsePAM yes

    #AllowTcpForwarding yes
    #GatewayPorts no
    #X11Forwarding no
    X11Forwarding yes
    #X11DisplayOffset 10
    #X11UseLocalhost yes
    #PrintMotd yes
    #PrintLastLog yes
    #TCPKeepAlive yes
    #UseLogin no
    #UsePrivilegeSeparation yes
    #PermitUserEnvironment no
    #Compression yes
    #ClientAliveInterval 0
    #ClientAliveCountMax 3
    #UseDNS yes
    #PidFile /var/run/sshd.pid
    #MaxStartups 10
    #ShowPatchLevel no

    # no default banner path
    #Banner /some/path

    # override default of no subsystems
    Subsystem sftp /usr/libexec/openssh/sftp-server

  7. […] Original: http://preda.wordpress.com/2012/08/29/downloading-an-amazon-ec2-ami-to-local-drive/ […]

  8. Vaibhav says:

    Hi experts,
    I am getting an error while running ec2-unbundle; the error is "padding check failed".
    I am using the command below:
    ec2-unbundle -k testing.pem -m test.ami.manifest.se (testing.pem is the one I used in the whole earlier process, so it is correct)

    Please let me know if any solution for this.

    thanks.
    Vaibhav

  9. sagi says:

    I could download the bundles

    Downloading manifest Kernel-image.manifest.xml from Kernel-image to /volume1/unbundle/Kernel-image.manifest.xml …
    Downloading part Kernel-image.part.00 to /volume1/unbundle/Kernel-image.part.00 …
    Downloaded Kernel-image.part.00 from Kernel-image
    Downloading part Kernel-image.part.01 to /volume1/unbundle/Kernel-image.part.01 …
    Downloaded Kernel-image.part.01 from Kernel-image
    Downloading part Kernel-image.part.02 to /volume1/unbundle/Kernel-image.part.02 …
    Downloaded Kernel-image.part.02 from Kernel-image
    Downloading part Kernel-image.part.03 to /volume1/unbundle/Kernel-image.part.03 …

    but I cannot unbundle the images and am getting the errors below:

    Execution failed, pipeline: image-unbundle-pipeline, stage: untar.
    ERROR: invalid digest, expected 6252f8e8d67db209a7bb9bdeb059ddd64b18de5f received
