Configuring Trackpoint on the Lenovo Thinkpad

This is more or less just a post for myself. I always end up dumping a couple of hours into this problem whenever I get a new machine for work -- surprise! I work for Docker now -- and tonight especially I really could have used this post instead of wasting that time researching the problem all over again. I choose Thinkpads when I have a choice, because the popular alternative is stupid.

Anyway, I use Linux Mint with Cinnamon, and LM18 is the current version. It's based on Ubuntu 16.04. I've chosen a P50 and upgraded the RAM to 64GB. Everything works out of the box, including the weird dual graphics situation going on under the hood. However, I want a super sensitive Trackpoint. The settings live in the sensitivity, speed, and inertia files under something like /sys/devices/platform/i8042/serio1/serio2/. I like to keep mine at about 255, 230, and 4, respectively (255 is the maximum, btw).
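For reference, here's how I write those values by hand for the current session (the exact serio path may differ on your machine):

echo 255 | sudo tee /sys/devices/platform/i8042/serio1/serio2/sensitivity
echo 230 | sudo tee /sys/devices/platform/i8042/serio1/serio2/speed
echo 4 | sudo tee /sys/devices/platform/i8042/serio1/serio2/inertia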

Now, dumping my preferences into those files like that works, but only for the current session: When I reboot the machine, they're reset to their defaults. So I'm using systemd's tmpfiles mechanism to write the values into these files on boot. I've got an /etc/tmpfiles.d/tpoint.conf file with the following contents:

w /sys/devices/platform/i8042/serio1/serio2/speed - - - - 230
w /sys/devices/platform/i8042/serio1/serio2/sensitivity - - - - 255
w /sys/devices/platform/i8042/serio1/serio2/inertia - - - - 4
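You don't have to reboot to test it; you can apply the file immediately with:

sudo systemd-tmpfiles --create /etc/tmpfiles.d/tpoint.conf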

Now, this works great. However, after I resume (or thaw) from a suspend (or hibernate), the sysfs location of these config files changes. So what I've done is add a simple shell script at /etc/pm/sleep.d/trackpoint-fix which contains the following:

#!/bin/bash
# set sensitivity/speed of trackpoint on resume
case "${1}" in
	suspend|hibernate)
		# suspending to RAM; nothing to do
		;;
	resume|thaw)
		# resume from suspend: find the new location of the config files
		newdir=$(find /sys/devices/platform/ -name sensitivity | sed -e "s/sensitivity//")
		# this script runs as root, so we can write directly
		echo 230 > "${newdir}speed"
		echo 255 > "${newdir}sensitivity"
		echo 4 > "${newdir}inertia"
		;;
esac

This is pretty self-explanatory. I dig around for the new location of these configuration files and then dump my favorite values into them. That's all there is to it.

Dockerized IPython / Anaconda for Machine Learning

Hey! You might have seen my recent post about having Dockerized some software called GraphLab Create (together with IPython) for a machine learning course I was taking. As it happens, I've found that image so useful for other, generic ML work that I've pared it down to its IPython/Anaconda bundle only. So I'd like to introduce the super-simple but super-useful Dockerized IPython / Anaconda project!

This repo includes a couple of useful scripts: one for building the image (build.sh) and one for running the resultant image as a container (run.sh). Just run the build script and then the run script, optionally providing a directory to mount into the container for data files, and you're all set! (There's a quick sketch of the full flow after the snippet below.) Note: Either the directory you specify or, by default, your current working directory will be mounted into the container as a volume at /data. Also, your IPython Notebook may include import statements which reference functions inside files in your new /data volume directory. This means you will need to change any path references to include /data, and specifically add the /data directory to your import path by adding this to the top of your IPython Notebook:

import sys
sys.path.insert(0, '/data')

## this relative dir won't work:
# data_dir = 'foo/dataset.1'
## so we just add /data to the front:
data_dir = '/data/foo/dataset.1' 
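And here's the full build-and-run flow in one place (a sketch; the directory argument to run.sh is optional, and ~/ml-data is just an example):

./build.sh
./run.sh ~/ml-data   # ~/ml-data gets mounted at /data (defaults to the current directory)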

Finally, I try to keep the repo readable, but take a look at my earlier post linked above if you want a breakdown of what's going on in there. Have fun!

Let's Encrypt: Nginx-Proxy Docker Companion

I've been using the fairly popular nginx-proxy reverse proxy for Docker containers, created by Jason Wilder. It's a slick, super-simple way to put many containers on a single host when they all need to share the HTTP/HTTPS ports. I am also a huge fan of the Let's Encrypt project: Free SSL certificates, as long as you can prove that you operate the domain. This is really how it should work: In my book, forcing people to pay for SSL certificates is shitty and exploitative, especially considering how incredibly important encryption is these days. (And incidentally, the behavior around self-signed certificates in browsers is stupid and broken.)

So anyways, I played with the Let's Encrypt stuff late last year when they went into public beta. Best Christmas present ever! The idea is pretty simple: You tell your Let's Encrypt client (probably best to use certbot) that you want a certificate, and it requests one from the Let's Encrypt CA. That causes Let's Encrypt to make an HTTP request to your domain for a specific resource that certbot creates. That resource lives at something like http://www.example.com/.well-known/acme-challenge/g89SrgM4UAJGHiukm3GqQ3xMjTnpN-kZDYb27u4aTRW, and it's just a regular file on disk with contents that look something like DTe7mGGhLlML7Vlh4dyNTu97OiIrIIs7xd5O0Fpmlq8.TaRs2K47il2D0K9RjmOKOx7Neuu91FdEpLp2Wo4FcNI. As long as that resource matches what Let's Encrypt is looking for, you get a free SSL certificate! And they've built some magic into certbot so that it automagically installs that cert into your webserver if it's a common one (e.g. Apache or Nginx).
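Incidentally, you can watch this handshake from the outside: While a challenge is pending, fetching the resource yourself (using the example URL from above) returns the token contents:

curl http://www.example.com/.well-known/acme-challenge/g89SrgM4UAJGHiukm3GqQ3xMjTnpN-kZDYb27u4aTRW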

I wanted to use it on my Dockerized web frontends, which use nginx-proxy. I had spotted a couple of issues on nginx-proxy's Github page which mentioned Let's Encrypt, but I hadn't yet tried to get it working with my nginx-proxy container, and it did not look like a trivial task. Not having the spare minutes to get Let's Encrypt working in my infrastructure, I put it on the back burner and made a mental note to check in every once in a while. Well, I completely forgot about it until I got an email recently reminding me to renew one of my SSL certs. And I'm not paying $8.99 for something that should be free, so I knew it was time to check back in on Let's Encrypt support around the nginx-proxy project.

Enter letsencrypt-nginx-proxy-companion. This is a Docker container that runs alongside your nginx-proxy container, sharing its volumes and watching for containers spinning up with the LETSENCRYPT_HOST and LETSENCRYPT_EMAIL environment variables set. The idea is that you start your nginx-proxy container, then start up this companion container, and then start up your other containers that need Let's Encrypt certificates. The companion will request new Let's Encrypt certificates for containers that do not have current certificates and which also have those LETSENCRYPT_* environment variables set.

So here are my notes for getting this going. I ended up adding /usr/share/nginx/html as a data volume in my nginx-proxy container, and making a couple of the volumes rw instead of ro. Thus, my nginx-proxy run command looks something like this:

docker run -d \
    --name="nginx-proxy" \
    --restart="always" \
    -p 80:80 \
    -p 443:443 \
    -v "/var/docker/nginx-proxy/htpasswd:/etc/nginx/htpasswd" \
    -v "/var/docker/nginx-proxy/vhost.d:/etc/nginx/vhost.d" \
    -v "/var/docker/nginx-proxy/certs:/etc/nginx/certs" \
    -v "/var/run/docker.sock:/tmp/docker.sock" \
    -v "/usr/share/nginx/html" \
    jwilder/nginx-proxy

And the command for the brand new letsencrypt-nginx-proxy-companion container looks like this:

docker run -d \
    --name="letsencrypt-nginx-proxy-companion" \
    --restart="always" \
    -v "/var/run/docker.sock:/var/run/docker.sock:ro" \
    --volumes-from "nginx-proxy" \
    jrcs/letsencrypt-nginx-proxy-companion

And finally, your individual containers will follow this pattern (note the environment variables mentioned above):

docker run -d \
    --name="example.com" \
    --restart="always" \
    -e "VIRTUAL_HOST=example.com,www.example.com" \
    -e "VIRTUAL_PORT=2368" \
    -e "LETSENCRYPT_HOST=example.com,www.example.com" \
    -e "LETSENCRYPT_EMAIL=contact@example.com" \
    -v /var/docker/example.com/ghost:/var/lib/ghost \
    ghost

And there you have it! Once you get your containers up -- in that order: nginx-proxy, letsencrypt-nginx-proxy-companion, and your web container -- give Let's Encrypt a minute to phone home and get a call back, and you'll have a free SSL certificate and related miscellany in /var/docker/nginx-proxy/certs on your host. Also, a note on why I created a data volume in the nginx-proxy container at /usr/share/nginx/html instead of a bind-mount volume: The Let's Encrypt client in the companion container will continually update the authorization data there, and I don't want to worry later about cleaning up a huge, sprawling directory full of those files, some containing valid information and many others invalid. Of course, do whatever you want, as long as it's right for you. Good luck!
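P.S. If everything worked, a peek inside that certs directory on the host should show your new cert material; the exact filenames below are an assumption from my own setup:

ls /var/docker/nginx-proxy/certs
# expect a key/cert pair per domain, e.g. example.com.crt and example.com.key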

Private Docker v2 Registry Upgrade Notes

Recently, I was offering some help on Stack Overflow to someone asking about deleting images from a private Docker registry. Here I mean a v2 registry, which is part of the Docker Distribution project. I should advise anyone reading this that no one ever means the v1 registry anymore: That project is dead, even though it still occupies the registry:latest image tag on Docker Hub (you want to pull registry:2 at least). The v1 registry is an old Python project; v2 is written in Go. Anyways, the v2 registry launched without delete capabilities in its API, and my initial assumption was that this was still the case, but I took the opportunity to research the latest information.

As it turns out, the latest versions of registry (later than v2.4, I think) do have delete functionality. While the main Docker project ("docker-engine") has excellent documentation, the Distribution project's documentation has historically lagged behind: It's not bad, but it's not great, either, and it is not very clear on how to use the new delete API functionality. But the functionality is there, along with an interesting garbage collection mechanism. That's a topic for another day, but it's the reason I wanted to upgrade to version 2.4 of the registry. I was using version 2.1 or something.
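In short (and as far as I can tell from the API docs), you delete a manifest by digest, not by tag, and deletes must first be enabled in the registry's config.yml (storage: delete: enabled: true). A sketch, where registry.example.com and myimage are placeholders:

# grab the manifest's content digest for a tag
curl -sI -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
    https://registry.example.com/v2/myimage/manifests/mytag | grep Docker-Content-Digest
# then delete the manifest by that digest
curl -X DELETE https://registry.example.com/v2/myimage/manifests/sha256:<digest-from-above>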

Cue an upgrade, and a couple of problems. First, in the config.yml file, in the cache section under the storage section, the layerinfo setting has been deprecated and renamed to blobdescriptor. That isn't a breaking change yet, but it will be soon, so rename it now while you have the chance.
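That corner of config.yml ends up looking something like this (assuming the common in-memory cache):

storage:
  cache:
    blobdescriptor: inmemory  # formerly: layerinfo: inmemory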

Finally, if you're backing your registry with S3 like a sane human being, the required permissions have changed, and the change is not documented anywhere. Zing! I couldn't push when I fired up my new registry container; I kept getting "Retrying in X seconds" messages when pushing individual layers. I killed and deleted the container, then started up a new one with the level setting under the log section of my config.yml set to debug. This yielded the key to the issue (notice the "s3aws: AccessDenied" message and 403 status code):

"err.code":"unknown","err.detail":"s3aws: AccessDenied: Access Denied\n\tstatus code: 403, request id: 11E0123C033B0DB5","err.message":"unknown error"

Here's what the new S3 policy needs to look like (beware copying from the documentation linked above: There is an errant comma in the documented policy):

 "Statement": [
      {
        "Effect": "Allow",
        "Action": [
          "s3:ListBucket",
          "s3:GetBucketLocation",
          "s3:ListBucketMultipartUploads"
        ],
        "Resource": "arn:aws:s3:::mybucket"
      },
      {
        "Effect": "Allow",
        "Action": [
          "s3:PutObject",
          "s3:GetObject",
          "s3:DeleteObject",
          "s3:ListMultipartUploadParts",
          "s3:AbortMultipartUpload"
        ],
        "Resource": "arn:aws:s3:::mybucket/*"
      }
]

Just for the record, this new policy adds the s3:GetBucketLocation and s3:ListBucketMultipartUploads actions on your particular bucket, and the s3:ListMultipartUploadParts and s3:AbortMultipartUpload actions on your bucket's contents.

Dockerized IPython and GraphLab Create for Machine Learning

So here's a fun surprise: I love machine learning just like everyone else! I have played around with a few ML concepts in a couple of weekend projects, but nothing serious. Earlier this year I came across a Coursera ML course which seemed like a great place to start a more formal education about the subject. Sweet! I'm a self-taught programmer, but taking university CS courses as a non-degree-seeking student helped fill the gaps in my knowledge. I see the value of formal education.

To prepare for the class, they want you to set up a Python 2.7 environment (Anaconda), IPython, and GraphLab Create. IPython is an interactive environment for programming languages; it originally targeted Python, obviously, but I guess they do all kinds of languages now. One of the cool features is built-in support for data visualizations. We're specifically concerned with the IPython "Notebook" feature set. We are also using GraphLab Create, a commercial product spawned out of a project from CMU and released by Dato. The CEO of Dato is one of the primary instructors of the course.

Real quick, GraphLab Create is a commercial product, as I've mentioned. Dato offers a free "student" license for this product, which is what we will be using for the course. This blog post assumes that you have already signed up for one of these educational licenses, and you have already been given your license key (which looks, for example, like ABCD-0123-EF45-6789-9876-54FE-3210-DCBA). You will need both this license key and the email address you signed up with in order to continue.

Now, all this software is great (really, you'll see what I mean when you start using it), but it seems like quite a lot of stuff that I'd rather not have installed directly on my filesystem if I can help it. If you have spoken to me at all in the last two and a half years, you know that I am a big fan of Docker. Probably 80% of this blog is about Docker. And so it should come as no surprise that I've got compartmentalization on my mind: I install and use just about everything Dockerized. These tools are phenomenally useful for this illuminating ML course, but I want them containerized.

I've taken the installation instructions for Anaconda Python and GraphLab Create and put them into a Dockerfile, which you'll have a chance to look at a little further down. Before I get to that, I want to point out that if you look closely at the install instructions for GraphLab Create, you'll see a mention of getting your Nvidia GPU to work with the software in order to speed things along. For machine learning especially, a GPU workhorse can make a computational difference measured in days or weeks.

CPUs are fine for most projects and will probably work just fine for this course, but I had seen questions raised about CUDA processing specifically with respect to Docker. I had heard that CUDA was not available to Docker containers because of the difficulty of making the Nvidia drivers available to the containerized process. Well, I took it as an opportunity for research, and it just so happens Nvidia has very recently released an application called nvidia-docker, specifically for making the GPU available to Docker containers! You can follow that link for all kinds of interesting information, but suffice it to say nvidia-docker is a drop-in replacement for the docker executable which you use with images that you want to be CUDA-capable. They also offer similar functionality in a daemon plugin called nvidia-docker-plugin. You can read about the differences in the nvidia-docker documentation.

A quick note about nvidia-docker: I ran into a problem with the .deb I had installed per their instructions, because I am using the latest version of Docker (1.11), and as of this writing, they haven't released an updated .deb with the working code. That meant I had to compile my own nvidia-docker binary. It's super easy (and doubly so for anyone with the technical wherewithal to be taking a machine learning course): Just git clone https://github.com/NVIDIA/nvidia-docker and then cd nvidia-docker && make and then sudo make install and you've got a working binary!
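In block form, for easy copy-pasting:

git clone https://github.com/NVIDIA/nvidia-docker
cd nvidia-docker && make
sudo make install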

Also, I ran into import errors during the ML course (specifically with matplotlib). I ended up solving this by installing python-qt4 inside the Docker container right off the bat.

So finally, here's what my Dockerfile looks like:

FROM ubuntu:14.04
MAINTAINER curtisz <software@curtisz.com>

# get stuff
RUN apt-get update -y && \
	apt-get install -y \
		curl \
		python-qt4 && \
	rm -rf /var/cache/apt/archives/*

# get more stuff in one layer so unionfs doesn't store the 400mb file in its layers
WORKDIR /tmp
RUN curl -o /tmp/Anaconda2-4.0.0-Linux-x86_64.sh http://repo.continuum.io/archive/Anaconda2-4.0.0-Linux-x86_64.sh && \
	chmod +x ./Anaconda2-4.0.0-Linux-x86_64.sh && \
	./Anaconda2-4.0.0-Linux-x86_64.sh -b && \
	rm ./Anaconda2-4.0.0-Linux-x86_64.sh
# make the anaconda stuff available
ENV PATH=${PATH}:/root/anaconda2/bin

## anaconda
RUN conda create -n dato-env python=2.7 anaconda
# (use JSON format to force interpretation by /bin/bash)
RUN ["/bin/bash", "-c", ". activate dato-env"]
RUN conda update pip

## install graphlab create with creds provided in --build-arg in 'docker build' command:
ARG USER_EMAIL
ARG USER_KEY
RUN pip install --upgrade --no-cache-dir https://get.dato.com/GraphLab-Create/1.9/${USER_EMAIL}/${USER_KEY}/GraphLab-Create-License.tar.gz

## install ipython and ipython notebook
RUN conda install ipython-notebook

## upgrade GraphLab Create with GPU Acceleration
RUN pip install --upgrade --no-cache-dir http://static.dato.com/files/graphlab-create-gpu/graphlab-create-1.9.gpu.tar.gz

CMD jupyter notebook

I ended up having to update this Dockerfile when Dato released GraphLab Create version 1.9. It was as easy as changing "1.8.5" to "1.9" in the Dockerfile. Everything else was the same. Keep this in mind if you find that you need to install a newer version of GraphLab Create. Now, you'll build the image with this command, making sure to replace the email and license key with your own details:

docker build -t=graphlab --build-arg "USER_EMAIL=genius@example.edu" --build-arg "USER_KEY=ABCD-0123-EF45-6789-9876-54FE-3210-DCBA" .

The build will take a few minutes. It downloads a few hundred megabytes of stuff. When you're done with that, you can launch IPython with the following command:

nvidia-docker run -d --name=graphlab -v "`pwd`/data:/data" --net=host graphlab:latest

Voila! You've got this whole operation running and you can access your IPython notebook by going to http://localhost:8888/ in your browser!

Please note: When it comes time to use GraphLab Create, you will be able to browse its UI normally, because we have specified --net=host in the docker run command, which shares the host's network stack with the container. The reason we do it this way is that GraphLab Create uses tcp/0 to set its server port. If you remember, that means the system chooses a random high port number, which prevents us from targeting a specific port with an EXPOSE Dockerfile directive (or a -p port assignment in the docker run command). Exposing the host's network stack to the container could have security implications if you run an untrusted application in that container. The applications we're using for this course are fine; it's just something you should be aware of.

Finally, I've been asked to include a tiny Docker crash course in case it's new to you. So our particular run command also mounts the ./data/ directory into the container at /data! This means you can download notebooks and datasets for the course and put them in that directory, and they'll be accessible in the container under the /data directory. For example, you would use sf = graphlab.SFrame('/data/people-example.csv') to load the sample data. In your terminal, you can use docker logs graphlab to see the container's logs, but don't forget you can swap out -d with -it in your docker run command if you want to create an interactive session for the container so that you can see the output in your terminal. You can also drop into a shell on the running container with docker exec -it graphlab /bin/bash and poke around if you need to. Killing the container happens with docker stop graphlab and deleting the container happens with docker rm graphlab. The Docker documentation is generally well-written, concise, and accurate. The source is also very approachable, as is the Docker community itself! Don't be afraid to drop by #docker on Freenode IRC if you need help!
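To recap those commands in one place:

docker logs graphlab                  # view the container's logs
docker exec -it graphlab /bin/bash    # drop into a shell in the running container
docker stop graphlab                  # stop the container
docker rm graphlab                    # delete the container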

For your convenience, I have created a Github repository with the Dockerfile and related scripts, as well as the sample starter data provided by Coursera.

Good luck!