How to install a GUI on Amazon AWS EC2 or EMR with the Amazon AMI

Tim Ryan · Sep 23, 2017

I have a need to run an application that requires a GUI interface to start and configure. I also need to be able to run this application on Amazon's EC2 service and EMR service. The EMR requirement means it has to run on Amazon's Linux AMI.

After extensive searching I've been unable to find any ready-made solutions, in particular any that meet the requirement to run on Amazon's AMI. The closest match and most often referenced solution is here. Unfortunately it was developed on a RHEL6 instance, which differs enough from Amazon's AMI that the solution does not work.

I'm posting my solution below. Hopefully it will save some others from the many hours of experimentation it took to come up with the right recipe.

Answer

Tim Ryan · Sep 23, 2017

Here is my solution to get a GUI running on Amazon's AMI. I used this post as a starting point, but had to make many changes to get it working on Amazon's AMI. I also added additional info to make this work in a reasonably automated way so an individual who needs to bring up this environment more than once could do it without too much hassle.

Note: I include a lot of commentary in this post. I apologize in advance, but I thought it might be helpful to someone needing to make modifications if they could understand why I made the various choices along the way.

The scripts included below install some files along the way. See Step 4 for a list of the files and the directory structure used by these scripts.

Step 1. Install the Desktop

After performing a 'yum update', most solutions include a line like

sudo yum groupinstall -y "Desktop"

This deceptively simple step requires significantly more effort on the Amazon AMI. This group is not configured in the Amazon AMI (AAMI from here on out). The AAMI has Amazon's own repositories installed and enabled by default. The epel repo is also installed, but it is disabled by default. After enabling epel I found the Desktop group but it was not populated with packages. I also found Xfce (another desktop alternative) which was populated. Eventually I decided to install Xfce rather than Desktop. Even that was not straightforward, but it eventually led to the solution.

Here it's worth noting that the first thing I tried was to install the centos repository and install the Desktop group from there. Initially this seemed promising. The group was fully populated with packages. However, after some effort I eventually decided there were simply too many version conflicts between the dependencies and packages that were already installed on the AAMI.

This led me to choose Xfce from the epel repo. Since the epel repo was already installed on AAMI I figured there would be better dependency version coordination with the Amazon repos. This was generally true. Many dependencies were found either in the epel repo or the Amazon repos. For the ones that weren't, I was able to find them in the centos repo, and in most cases those were leaf dependencies. So most of the trouble came from the few dependencies in the centos repo that had sub-dependencies which conflicted with the Amazon or epel repos. In the end a few hacks were required to bypass the dependency conflicts. I tried to minimize those as much as possible. Here is the script for installing Xfce:

installXfce.sh

#!/bin/bash

# echo each command
set -x

# assumes RSRC_DIR and IS_EMR set by parent script
YUM_RSRC_DIR=$RSRC_DIR/yum

sudo yum -y update

# Most info I've found on installing a GUI on AWS suggests to install using
#> sudo yum groupinstall -y "Desktop"
# This group is not available by default on the Amazon Linux AMI.  The group
# is listed if the epel repo is enabled, but it is empty.  I tried installing
# the centos repo, which does have support for this group, but it simply ended
# up having too many dependency version conflicts with packages already installed
# by the Amazon repos.
#
# I found the path of least resistance to be installing the group Xfce from
# the epel repo. The epel repo is already included in amazon image, just not enabled.
# So I'm guessing there was at least some consideration by Amazon to align
# the dependency versions of this repo with the Amazon repos.
#
# My general approach to this problem was to start with the last command:
#> sudo yum groupinstall -y Xfce
# which will generate a list of missing dependencies.  The script below
# essentially works backwards through that list to eliminate all the
# missing dependencies.
#
# In general, many of the dependencies required by Xfce are found in either
# the epel repo or the Amazon repos.  Most of the remaining dependencies can be
# found in the centos repo, and either don't have any further dependencies, or if they
# do those dependencies are satisfied with the centos repo with no collisions
# in the epel or amazon repo.  Then there are a couple of oddball dependencies
# to clean up.

# if yum-config-manager is not found then install yum-utils
#> sudo yum install yum-utils
sudo yum-config-manager --enable epel

# install centos repo
# place the repo config @  /etc/yum.repos.d/centos.repo
sudo cp $YUM_RSRC_DIR/yum.repos.d/centos.repo /etc/yum.repos.d/

# The config centos.repo specifies the key with a URL.  If for some reason the key
# must be in a local file, it can be found here: https://www.centos.org/keys/RPM-GPG-KEY-CentOS-6
# It can be installed to the right location in one step:
#> wget -O /etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-6 https://www.centos.org/keys/RPM-GPG-KEY-CentOS-6
# Note, a key file must also be installed in the system key ring.  The docs are a bit confusing
# on this, I found that I needed to run both gpg AND then followed by rpm, eg:
#> sudo gpg --import /etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-6
#> sudo rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-6

# I found there are a lot of version conflicts between the centos, Amazon and epel repos.
# So I did not enable the centos repo generally.  Instead I used the --enablerepo switch
# to enable it explicitly for each yum command that required it.  This only works for yum.  If
# rpm must be used, then yum-config-manager must be used to enable/disable repos as a
# separate step.
#
# Another problem I ran into was yum installing the 32-bit (*.i686) package rather than
# the 64-bit (*.x86_64) version of the package.  I never figured out why.  So I had
# to specify the *.x86_64 package explicitly.  The search tools (eg. 'whatprovides')
# did not list the 64 bit package either even though a manual search through the
# package showed the 64 bit components were present.
#
# Sometimes it is difficult to determine which package must be installed to satisfy
# a particular dependency.  'whatprovides' is a very useful tool for this
#> yum --enablerepo centos whatprovides libgdk_pixbuf-2.0.so.0
#> rpm -q --whatprovides libgdk_pixbuf

sudo yum --enablerepo centos install -y gdk-pixbuf2.x86_64
sudo yum --enablerepo centos install -y gtk2.x86_64
sudo yum --enablerepo centos install -y libnotify.x86_64
sudo yum --enablerepo centos install -y gnome-icon-theme
sudo yum --enablerepo centos install -y redhat-menus
sudo yum --enablerepo centos install -y gstreamer-plugins-base.x86_64

# problem when we get to libvte, installing libvte requires expat, which conflicts with amazon lib
# the centos package version was older and did not install right lib version
# but … the expat dependency was coming from a dependency on python-libs.
# the easiest workaround was to install python using the amazon repo, that in turn
# installs a version of python libs that is compatible with the version of libexpat on the system.

sudo yum install -y python
sudo yum --enablerepo centos install -y vte.x86_64

sudo yum --enablerepo centos install -y libical.x86_64
sudo yum --enablerepo centos install -y gnome-keyring.x86_64

# another sticky point, xfdesktop requires desktop-backgrounds-basic, but 'whatprovides' does not
# provide any packages for this query (not sure why).  It turns out this is provided by the centos
# repo, installing 'desktop-backgrounds-basic' will try to install the package redhat-logos, but
# unfortunately this is obsoleted by Amazon's generic-logos package.
# The only way I could find to get around this was to erase the generic-logos package.
# This doesn't seem too risky since this is just images for the desktop and menus.
#
sudo yum erase -y generic-logos

# Amazon repo must be disabled to prevent interference with the install
# of redhat-logos
sudo yum --disablerepo amzn-main --enablerepo centos install -y redhat-logos

# next problem is a dependency on dbus.  The dependency comes from dbus-x11 in 
# centos repo.  It requires dbus version 1.2.24, the amazon image already has
# version 1.6.12 installed.  Since the dbus-x11 is only used by the GUI package,
# easiest way around this is to install dbus-x11 with no dependency checks.
# So it will use the newer version of dbus (should be OK).  The main thing that could be a problem
# here is if it skips some other dependency.  When doing this manually, it's possible to run the install until
# the only error left is the dbus dependency.  It's a bit risky running in a script since it's basically assuming
# all the dependencies are already in place.
yumdownloader --enablerepo centos dbus-x11.x86_64
sudo rpm -ivh --nodeps dbus-x11-1.2.24-8.el6_6.x86_64.rpm
rm dbus-x11-1.2.24-8.el6_6.x86_64.rpm

sudo yum install -y xfdesktop.x86_64

# We need the version of poppler-glib from centos repo, but it is found in several repos.
# Disable the other repos for this step.
# On EMR systems a newer version of poppler is already installed.  So move up 1 level
# in dependency chain and force install of tumbler.

if [ $IS_EMR -eq 1 ]
then
    yumdownloader --enablerepo centos tumbler.x86_64
    sudo rpm -ivh --nodeps tumbler-0.1.21-1.el6.x86_64.rpm
else
    sudo yum --disablerepo amzn-main --disablerepo amzn-updates --disablerepo epel --enablerepo centos install -y poppler-glib
fi


sudo yum install  --enablerepo centos -y polkit-gnome.x86_64
sudo yum install  --enablerepo centos  -y control-center-filesystem.x86_64

sudo yum groupinstall -y Xfce
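
The dbus-x11 and tumbler steps in the script above hardcode the RPM file names, so they will break if the mirror serves a different build. Below is a minimal sketch of one way around that. It assumes the --destdir option of yumdownloader (from yum-utils), and the helper names find_single_rpm and install_nodeps are hypothetical, not part of the original script.

```shell
#!/bin/bash
# Sketch: install one package with --nodeps without hardcoding the RPM file name.
# Helper names are hypothetical; "centos" is the repo id from centos.repo.

# Print the path of the single .rpm file in a directory; fail if there is not exactly one.
find_single_rpm() {
    local dir="$1"
    local files=("$dir"/*.rpm)
    # With no match the glob stays literal, so also test that the file exists.
    if [ "${#files[@]}" -ne 1 ] || [ ! -f "${files[0]}" ]; then
        echo "expected exactly one rpm in $dir" >&2
        return 1
    fi
    echo "${files[0]}"
}

# Download a package from the centos repo into a scratch directory and
# install it with dependency checks disabled.
install_nodeps() {
    local pkg="$1" tmp rpm_file
    tmp=$(mktemp -d) || return 1
    yumdownloader --enablerepo centos --destdir "$tmp" "$pkg" || { rm -rf "$tmp"; return 1; }
    rpm_file=$(find_single_rpm "$tmp") || { rm -rf "$tmp"; return 1; }
    sudo rpm -ivh --nodeps "$rpm_file"
    local status=$?
    rm -rf "$tmp"
    return $status
}
```

With these defined, the dbus step could be written as 'install_nodeps dbus-x11.x86_64'. The same caveat applies as in the script: --nodeps assumes every other dependency is already in place.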

Here are the contents for the centos repository config file:

centos.repo

[centos]
name=CentOS mirror
baseurl=http://repo1.ash.innoscale.net/centos/6/os/x86_64/
failovermethod=priority
enabled=0
gpgcheck=1
gpgkey=https://www.centos.org/keys/RPM-GPG-KEY-CentOS-6

If all you needed was a recipe to get a desktop package installed on the Amazon AMI, then you're done. The rest of this post covers how to configure VNC to access the desktop via an SSH tunnel, and how to package all of this so that the instance can be easily started from a script.

Step 2. Install and Configure VNC

Below is my top-level script for installing the GUI. After configuring a few variables, the first thing it does is call the script from step 1 above. This script has some extra baggage since I've built it to work on a regular EC2 instance or on EMR, and as root or as ec2-user. The essential steps are

  1. install libXfont
  2. install tigervnc-server
  3. install the VNC server config file
  4. create a .vnc directory in the user home directory
  5. install the xstartup file in the .vnc directory
  6. install a dummy passwd file in the .vnc directory
  7. start the VNC server

A few key points to note:

This assumes you will access the VNC server through an SSH tunnel. In the end this seemed like the easiest and most reliably secure way to go. Since you probably already have a port open for SSH in your security group, you won't have to make any changes to it. Also, the encryption config for VNC clients and servers is not straightforward, and it seemed easy to make a mistake and leave your communications unencrypted. The settings for this are in the vncservers file. The -localhost switch tells VNC to accept only local connections. The '-nolisten tcp' switch tells the associated X server modules to also not accept connections from the network. Finally, the '-SecurityTypes None' switch allows you to open your VNC session without typing a password. Since the only way into the machine is through SSH, the additional password check seems redundant.

The xstartup file determines what will start when your VNC session is initiated the first time. I've noticed many posts on this subject skip this point. If you don't tell it to start the Xfce desktop, you will just get a blank window when you start VNC. The config I have here is very simple.

Even though I mentioned above that the VNC server is configured to not prompt for a password, it still requires a passwd file in the .vnc directory in order to start. The first time you run the script it will fail when it tries to start the server. Log in to the machine via ssh and run 'vncpasswd'. It will create a passwd file in the .vnc directory that you can save and reuse as part of these scripts during install. Note, I've read that VNC does not do anything sophisticated to protect the passwd file, so I would not recommend using a password that you use for other, more important accounts.

installGui.sh

#!/bin/bash

# echo each command
set -x

BIN_DIR="${BASH_SOURCE%/*}"
ROOT_DIR=$(dirname $BIN_DIR)
RSRC_DIR=$ROOT_DIR/rsrc
VNC_DIR=$RSRC_DIR/vnc

# Install user config files into ec2-user home directory
# if it is available.  In practice, this should always
# be true

if [ -d "/home/ec2-user" ]
then
   USER_ACCT=ec2-user
else
   USER_ACCT=hadoop
fi

HOME_DIR="/home"

# Use existence of hadoop home directory as proxy to determine if
# this is an EMR system.  Can be used later to differentiate
# steps on EC2 system vs EMR.
if [ -d "/home/hadoop" ]
then
    IS_EMR=1
else
    IS_EMR=0
fi


# execute Xfce desktop install
. "$BIN_DIR/installXfce.sh"

# now roughly follow the following from step 3: https://devopscube.com/setup-gui-for-amazon-ec2-linux/

sudo yum install -y pixman pixman-devel libXfont

sudo yum -y install tigervnc-server


# install the user account configuration file.
# This setup assumes the user will always connect to the VNC server
# through an SSH tunnel.  This is generally more secure, easier to
# configure and easier to get correct than trying to allow direct
# connections via TCP.
# Therefore, config VNC server to only accept local connections, and
# no password required.
sudo cp $VNC_DIR/vncservers-$USER_ACCT /etc/sysconfig/vncservers

# install the user account, vnc config files

sudo mkdir -p $HOME_DIR/$USER_ACCT/.vnc
sudo chown $USER_ACCT:$USER_ACCT $HOME_DIR/$USER_ACCT/.vnc

# need xstartup file to tell vncserver to start the window manager
sudo cp $VNC_DIR/xstartup $HOME_DIR/$USER_ACCT/.vnc/
sudo chown $USER_ACCT:$USER_ACCT $HOME_DIR/$USER_ACCT/.vnc/xstartup

# Even though the VNC server is config'd to not require a passwd, the
# server still looks for the passwd file when it starts the session.
# It will fail if the passwd file is not found.
# The first time these scripts are run, the final step will fail.
# Then manually run
#> vncpasswd
# It will create the file ~/.vnc/passwd.  Then save this file to persistent
# storage so that it can be installed to the user account during
# server initialization.

sudo cp $VNC_DIR/passwd $HOME_DIR/$USER_ACCT/.vnc/
sudo chown $USER_ACCT:$USER_ACCT $HOME_DIR/$USER_ACCT/.vnc/passwd

# This script will be running as root if called from the EC2 launch
# command.  VNC server needs to be started as the user that
# you will connect to the server as (eg. ec2-user, hadoop, etc.)
sudo su -c "sudo service vncserver start" -s /bin/sh $USER_ACCT

# how to stop vncserver
# vncserver -kill :1

# On the remote client
# 1. start the ssh tunnel
#> ssh -i ~/.ssh/<YOUR_KEY_FILE>.pem -L 5901:localhost:5901 -N ec2-user@<YOUR_SERVER_PUBLIC_IP>
#    for debugging connection use -vvv switch
# 2. connect to the vnc server using client on the remote machine.  When
#    prompted for the IP address, use 'localhost:5901'
#    This connects to port 5901 on your local machine, which is where the ssh
#    tunnel is listening.
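
Since the server refuses to start without a passwd file and shows a blank window without xstartup, it can help to check the .vnc directory just before the 'service vncserver start' step. This is a minimal sketch of such a check. The function name check_vnc_dir is hypothetical, and the execute-bit test rests on the assumption that the server runs xstartup directly.

```shell
#!/bin/bash
# Sketch: report what is missing from a user's .vnc directory before the
# VNC server is started. check_vnc_dir is a hypothetical helper name.
check_vnc_dir() {
    local vnc_dir="$1" problems=0
    if [ ! -d "$vnc_dir" ]; then
        echo "missing directory: $vnc_dir"
        return 1
    fi
    # Without xstartup the session comes up as a blank window.
    if [ ! -f "$vnc_dir/xstartup" ]; then
        echo "missing file: $vnc_dir/xstartup"
        problems=1
    elif [ ! -x "$vnc_dir/xstartup" ]; then
        # Assumption: the server runs xstartup directly, so it needs the execute bit.
        echo "not executable: $vnc_dir/xstartup"
        problems=1
    fi
    # The server looks for passwd even with -SecurityTypes None.
    if [ ! -f "$vnc_dir/passwd" ]; then
        echo "missing file: $vnc_dir/passwd (run vncpasswd once to create it)"
        problems=1
    fi
    return $problems
}
```

In the script it could be called as 'check_vnc_dir $HOME_DIR/$USER_ACCT/.vnc || exit 1' so that a missing file shows up in the log as a clear message rather than as a failed service start.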

vncservers

# The VNCSERVERS variable is a list of display:user pairs.
#
# Uncomment the lines below to start a VNC server on display :2
# as my 'myusername' (adjust this to your own).  You will also
# need to set a VNC password; run 'man vncpasswd' to see how
# to do that.  
#
# DO NOT RUN THIS SERVICE if your local area network is
# untrusted!  For a secure way of using VNC, see this URL:
# http://kbase.redhat.com/faq/docs/DOC-7028

# Use "-nolisten tcp" to prevent X connections to your VNC server via TCP.

# Use "-localhost" to prevent remote VNC clients connecting except when
# doing so through a secure tunnel.  See the "-via" option in the
# `man vncviewer' manual page.

# Use "-SecurityTypes None" to allow session login without a password.
# This should only be used in combination with "-localhost"
# Note: VNC server still looks for the passwd file in ~/.vnc directory
# when the session starts regardless of whether the user is
# required to enter a passwd.

# VNCSERVERS="2:myusername"
# VNCSERVERARGS[2]="-geometry 800x600 -nolisten tcp -localhost"
VNCSERVERS="1:ec2-user"
VNCSERVERARGS[1]="-geometry 1280x1024 -nolisten tcp -localhost -SecurityTypes None"
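
The install script copies a file named vncservers-$USER_ACCT, but only the ec2-user variant is shown above. As a rough sketch, the two active lines could be generated for any account instead of being kept as one file per user. The function name render_vncservers is hypothetical, and the switches are copied from the file above.

```shell
#!/bin/bash
# Sketch: print an /etc/sysconfig/vncservers body for a given account and display.
# render_vncservers is a hypothetical helper; the switches are the ones used above.
render_vncservers() {
    local user="$1" display="${2:-1}" geometry="${3:-1280x1024}"
    printf 'VNCSERVERS="%s:%s"\n' "$display" "$user"
    printf 'VNCSERVERARGS[%s]="-geometry %s -nolisten tcp -localhost -SecurityTypes None"\n' \
        "$display" "$geometry"
}
```

For example, 'render_vncservers hadoop > vncservers-hadoop' would produce the variant the script looks for when it falls back to the hadoop account.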

xstartup

#!/bin/sh

unset SESSION_MANAGER
unset DBUS_SESSION_BUS_ADDRESS
# exec /etc/X11/xinit/xinitrc
/usr/share/vte/termcap/xterm &
/usr/bin/startxfce4 &

Step 3. Connect to Your Instance

Once you've got the VNC server running on EC2 you can try connecting to it. First open an SSH tunnel to your instance. 5901 is the port where the VNC server listens for display 1 from the vncservers file. It will listen for display 2 on port 5902, etc. This command creates a tunnel from port 5901 on your local machine to port 5901 on the instance.

ssh -i ~/.ssh/<YOUR_KEY_FILE>.pem -L 5901:localhost:5901 -N ec2-user@<YOUR_SERVER_PUBLIC_IP>

Now open your preferred VNC client. Where it prompts for the IP address of the server enter:

localhost:5901
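
The display-to-port rule described above (display 1 on 5901, display 2 on 5902) can be captured in a small helper, which is handy if you run more than one display. This is only a sketch. The function names vnc_port and tunnel_cmd are hypothetical, and the login name ec2-user is taken from the command above.

```shell
#!/bin/bash
# Sketch: derive the VNC port from the display number and print the matching
# ssh tunnel command. Function names are hypothetical.

# VNC display N listens on TCP port 5900 + N.
vnc_port() {
    echo $((5900 + $1))
}

# Print (do not run) the tunnel command for a display, key file, and host.
tunnel_cmd() {
    local display="$1" key="$2" host="$3" port
    port=$(vnc_port "$display")
    echo "ssh -i $key -L $port:localhost:$port -N ec2-user@$host"
}
```

The VNC client address follows the same rule: for display 2 you would enter localhost:5902.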

If nothing happens at all, then either there was a problem starting the VNC server, there is a connectivity problem preventing the client from reaching the server, or possibly there is a problem in the vncservers config file.

If a window comes up, but it is just blank then check that the Xfce install completed successfully and that the xstartup file is installed.
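
To tell a server that never started apart from a tunnel problem, one option is to check on the instance whether anything is listening on the VNC port. The sketch below parses the output of 'ss -ltn'. It assumes the ss tool is installed and prints the local address in the fourth column, and the function name port_listening is hypothetical.

```shell
#!/bin/bash
# Sketch: read `ss -ltn` output on stdin and succeed if some socket is
# listening on the given TCP port. port_listening is a hypothetical helper.
port_listening() {
    local port="$1"
    # Field 4 of `ss -ltn` is "Local Address:Port"; match the trailing ":port".
    awk -v p=":$port" '
        $1 == "LISTEN" && substr($4, length($4) - length(p) + 1) == p { found = 1 }
        END { exit found ? 0 : 1 }
    '
}
```

On the instance this would be used as 'ss -ltn | port_listening 5901 && echo "VNC is listening"'. If the port is listening but the client still cannot connect, the tunnel is the more likely culprit.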

Step 4. Simplify

If you just need to do this once then sftp'ing the scripts over to your instance and running manually is fine. Otherwise you're going to want to automate this as much as possible to make it faster and less error prone when you do need to fire up an instance with a GUI.

The first step to automating is to create an EFS volume containing the scripts and config files that can be mounted when the instance is started. Amazon has plenty of info on creating a network file system. There are a couple of points to pay attention to when creating the volume. If you don't want your volume to be open to the world you may want to create a custom security group for it. I created a security group for my EFS volume (call it NFS_Mount) that only allows inbound TCP traffic on port 2049 coming from one of my other security groups (call it MasterVNC). Then when you create an instance, make sure to associate the MasterVNC security group with that instance. Otherwise the EFS volume won't allow your instance to connect to it.

Now mount the EFS volume:

sudo mkdir /mnt/YOUR_MOUNT_POINT_DIR
sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 fs-YOUR_EFS_ID.efs.us-east-1.amazonaws.com:/ /mnt/YOUR_MOUNT_POINT_DIR
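
The mount command above hardcodes the us-east-1 region and fails if run a second time on an already mounted directory. A minimal sketch of a reusable version follows. The function names efs_dns_name and mount_efs are hypothetical, the mount options are the ones shown above, and the mountpoint utility is assumed to be available.

```shell
#!/bin/bash
# Sketch: build the EFS DNS name from a file system id and region, and mount
# it only if the mount point is not already mounted. Names are hypothetical.

# EFS mount targets resolve as <file-system-id>.efs.<region>.amazonaws.com.
efs_dns_name() {
    local fs_id="$1" region="${2:-us-east-1}"
    echo "$fs_id.efs.$region.amazonaws.com"
}

mount_efs() {
    local fs_id="$1" mount_point="$2" region="${3:-us-east-1}"
    local opts="nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2"
    sudo mkdir -p "$mount_point"
    # mountpoint -q succeeds when something is already mounted there.
    if mountpoint -q "$mount_point"; then
        return 0
    fi
    sudo mount -t nfs4 -o "$opts" "$(efs_dns_name "$fs_id" "$region"):/" "$mount_point"
}
```

It would be called as 'mount_efs fs-YOUR_EFS_ID /mnt/YOUR_MOUNT_POINT_DIR', with the region as an optional third argument.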

Now populate /mnt/YOUR_MOUNT_POINT_DIR with the 6 files mentioned in steps 1 and 2 using the following directory structure. Recall that you must create the passwd file the first time using the command 'vncpasswd'. It will create the file at ~/.vnc/passwd.

/mnt/YOUR_MOUNT_POINT_DIR/bin/installGui.sh
/mnt/YOUR_MOUNT_POINT_DIR/bin/installXfce.sh
/mnt/YOUR_MOUNT_POINT_DIR/rsrc/vnc/vncservers-ec2-user
/mnt/YOUR_MOUNT_POINT_DIR/rsrc/vnc/xstartup
/mnt/YOUR_MOUNT_POINT_DIR/rsrc/vnc/passwd
/mnt/YOUR_MOUNT_POINT_DIR/rsrc/yum/yum.repos.d/centos.repo
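
A missing file on the volume only shows up as a failure partway through the install, so it may be worth checking the layout first. The sketch below lists which of the six files from the listing above are absent. The function name missing_gui_files is hypothetical.

```shell
#!/bin/bash
# Sketch: list which of the six expected files are missing under a root
# directory. missing_gui_files is a hypothetical helper name.
missing_gui_files() {
    local root="$1" rel status=0
    for rel in \
        bin/installGui.sh \
        bin/installXfce.sh \
        rsrc/vnc/vncservers-ec2-user \
        rsrc/vnc/xstartup \
        rsrc/vnc/passwd \
        rsrc/yum/yum.repos.d/centos.repo
    do
        if [ ! -f "$root/$rel" ]; then
            echo "$rel"
            status=1
        fi
    done
    return $status
}
```

Running 'missing_gui_files /mnt/YOUR_MOUNT_POINT_DIR' prints nothing and returns 0 when the volume is fully populated.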

At this point, setting up an instance with a GUI should be pretty easy. Create your instance as you normally would (make sure to include the MasterVNC security group), ssh to the instance, mount the EFS volume, and run the installGui.sh script.

Step 5. Automate

You can take things a step further and launch your instance in 1 step using the AWS CLI tools on your local machine. To do this you will need to mount the EFS volume and run the installGui.sh script using arguments to the AWS CLI commands. This just requires creating a top level script and passing it to the CLI command.

Of course there are a couple complications. EC2 and EMR use different switches and mechanisms to attach the script. And furthermore, on EMR I only want the GUI to be installed on the master node (not the core or task nodes).

Launching an EC2 instance requires embedding the script in the command with the --user-data switch. This is done easily by specifying the absolute path to the script file on your local machine.

aws ec2 run-instances --user-data file:///PATH_TO_YOUR_SCRIPT/top.sh  ... other options

The EMR launch does not support embedding scripts from a local file. Instead you can specify an S3 URI in the bootstrap actions.

aws emr create-cluster --bootstrap-actions '[{"Path":"s3://YOUR_BUCKET/YOUR_DIR/top.sh","Name":"Custom action"}]' ... other options

Finally, you'll see in top.sh below that most of the script is a function to determine if the machine is a basic EC2 instance or an EMR master. If not for that the script could be 3 lines. You may wonder why I didn't just use the built-in 'run-if' bootstrap action rather than writing my own function. The built-in 'run-if' script has a bug and does not properly run scripts located in S3.

Debugging things once you put them in the init sequence can be a challenge. One thing that can help is the log file: /var/log/cloud-init-output.log. This captures all the console output from the scripts run during bootstrap initialization.
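
Because the scripts run with 'set -x', that log gets long. As a rough sketch, a small filter can pull out the lines most likely to matter. The function name scan_boot_log is hypothetical, and the keyword list is only a guess at useful patterns, not an exhaustive set.

```shell
#!/bin/bash
# Sketch: print numbered lines from a bootstrap log that look like failures.
# scan_boot_log is a hypothetical helper and the pattern list is only a guess
# at useful keywords, not an exhaustive set.
scan_boot_log() {
    local log="${1:-/var/log/cloud-init-output.log}"
    grep -nEi 'error|failed|no such file|command not found' "$log"
}
```

The line numbers in the output make it easy to jump to the surrounding context in the full log.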

top.sh

#!/bin/bash

# note: conditional bootstrap function run-if has a bug, workaround ...
# this function adapted from https://forums.aws.amazon.com/thread.jspa?threadID=222418
# Determine if we are running on the master node.
# 1 - running on master, or non-EMR node
# 0 - running on a task or core node

check_if_master_or_non_emr() {
    python - <<'__SCRIPT__'
import sys
import json

instance_file = "/mnt/var/lib/info/instance.json"

try:
    with open(instance_file) as f:
        props = json.load(f)
    is_master_or_non_emr = props.get('isMaster', False)

except IOError as ex:
    is_master_or_non_emr = True   # file will not exist when testing on a non-emr machine

if is_master_or_non_emr:
    sys.exit(1)
else:
    sys.exit(0)
__SCRIPT__
}

check_if_master_or_non_emr
IS_MASTER_OR_NON_EMR=$?

# If this machine is part of EMR cluster, then ONLY install on the MASTER node

if [ $IS_MASTER_OR_NON_EMR -eq 1 ]
then
    sudo mkdir /mnt/YOUR_MOUNT_POINT_DIR

    sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 fs-YOUR_EFS_ID.efs.us-east-1.amazonaws.com:/ /mnt/YOUR_MOUNT_POINT_DIR

    . /mnt/YOUR_MOUNT_POINT_DIR/bin/installGui.sh
fi

exit 0
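
The Python helper in top.sh signals "master or non-EMR" with exit status 1, which is the reverse of the usual shell convention. If Python is not a requirement, the same decision can be sketched in plain shell with status 0 meaning "install". The function name should_install_gui is hypothetical, and the pattern assumes instance.json stores the flag as a JSON boolean.

```shell
#!/bin/bash
# Sketch: decide whether to install the GUI, using the usual shell convention
# that exit status 0 means "yes". should_install_gui is a hypothetical name.
should_install_gui() {
    local instance_file="${1:-/mnt/var/lib/info/instance.json}"
    # No instance.json: treat the machine as a plain EC2 instance.
    [ -f "$instance_file" ] || return 0
    # Assumption: the file stores the flag as a JSON boolean, "isMaster": true.
    grep -Eq '"isMaster"[[:space:]]*:[[:space:]]*true' "$instance_file"
}
```

The mount and install steps would then be wrapped in 'if should_install_gui; then ... fi'. Unlike the Python version this does not parse the JSON, so it is only as reliable as the assumed formatting of the file.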