Recent blog entries for glyph

Sorry I Unfollowed You

Since Alex Gaynor wrote his seminal thinkpiece on the subject, “I Hope Twitter Goes Away”, I’ve been wrestling to define my relationship to this often problematic product.

On the one hand, Twitter has provided me with delightful interactions with human beings who I would not otherwise have had the opportunity to meet or interact with. If you are the sort of person who likes following people, four suggestions I’d make on that front are Melissa 🔔, Gary Bernhardt, Eevee and Matt Blaze, all of whom have blogs but none of whom I would have discovered without Twitter.

Twitter has also allowed me to reach a larger audience with my writing than I otherwise would have been able to. Lots of people click on links to this blog from Twitter either from following me directly or from a retweet. (Thank you, retweeters, one and all.)

On the other hand, the effect of using Twitter on my productivity is like having a constant, low-grade headache. While Twitter has never been a particularly bad distraction as measured by hours spent on it (I keep metrics on that, and it’s rarely even in the top 10), I feel like consulting Twitter is something I do when I am stuck, or having to think about something hard. “I’ll just check Twitter” is an easy way to “take a break” right at the moment that I ought to be thinking harder, eliminating distractions, mustering my will to focus.

This has been particularly stark for me as I’ve been trying to get some real writing done over the last couple of weeks and have been consistently drawing a blank. Given that I have a deadline coming up on Wednesday and another next Monday, something had to give.

Or, as Joss Whedon put it, when he quit Twitter:

If I’m going to start writing again, I have to go to the quiet place, and this is the least quiet place I’ve ever been in my life.

I’m an introvert, and using Twitter is more like being at a gigantic, awkward party all the time than any other online space I’ve ever been in.

There’s an irony here. Mostly what people like that I put on Twitter (and yes, I’ve checked) are announcements that link to other things, accomplishments in other areas, like a blog post, or a feature in Twisted, but using Twitter itself is inimical to completing those things.

I’m loath to abandon the positive aspects of Twitter. Some people also use Twitter as a replacement for RSS, and I don’t want to break the way they choose to pay attention to the stuff that I do. And a few of my friends communicate exclusively through direct messages.

The really “good” thing about Twitter is discovery. It enables you to discover people, content, and, eugh, “brands” that appeal to you. I have discovered things that I enjoy many times. The fundamental problem I am facing, which is a little bit hard to admit to oneself, is that I have discovered enough. I have enough games to play, enough books and articles to read, enough podcasts to listen to, enough movies to watch, enough code to write, enough open source libraries to investigate, that I will be busy for years based on what I already know.

For me, using Twitter’s timeline at this point to “discover” more things is like being at a delicious buffet, being so full I’m nauseous, and stuffing my pockets with shrimp “just in case” I’m hungry “when I get home” - and then, of course, not going home.

Even disregarding my desire to produce useful content, if I just want to enjoy consuming content more deeply, I have to take the time to engage with it properly.

So here’s what I’m doing:

  1. I am turning on the “anyone can direct message me” feature. We’ll see how that goes; I may have to turn it off again later. As always, I’d prefer you send email (or text me, if it’s time-critical).
  2. I am unfollowing literally everyone, and will not follow people in the future. Checking my timeline was the main information junk-food I want to avoid.
  3. Since my timeline, rather than mentions and replies, was my main source of distraction, I’ll continue paying attention to mentions and replies (at least for now; I’ll have to see if that becomes a problem in the absence of a timeline).
  4. In order to avoid producing such information junk-food myself, I’m going to try to directly tweet less, and put more things into brief blog posts so I have enough room to express them. I won’t say “not at all”, but most of the things that I put on Twitter would really be better as longer, more thoughtful articles.

Please note that there’s nothing prescriptive here. I’m outlining what I’m doing in the hopes that others might recognize similar problems with themselves - if everyone used Twitter this way, there would hardly be a point to the site.

Also, if I’ve unfollowed you, that doesn’t mean I’m not interested in what you have to say. I already have a way of keeping in touch with people’s more fully-formed ideas: I use Blogtrottr to deliver relevant blog articles to my email. If I previously followed you and you think I might not be reading your blog already (in most cases I believe I already am), please feel free to drop me a line with an RSS link.

Syndicated 2015-06-09 00:41:00 from Deciphering Glyph

Separate your Fakes and your Inspectors

When you are writing unit tests, you will commonly need to write duplicate implementations of your dependencies to test against systems which do external communication or otherwise manipulate state that you can’t inspect. In other words, test fakes. However, a “test fake” is just one half of the component that you’re building: you’re also generally building a test inspector.

As an example, let’s consider the case of this record-writing interface that we may need to interact with.

class RecordWriter(object):
    def write_record(self, record):
        "..."

    def close(self):
        "..."

This is a pretty simple interface; it can write out a record, and it can be closed.

Faking it out is similarly easy:

class FakeRecordWriter(object):
    def write_record(self, record):
        pass
    def close(self):
        pass

But this fake record writer isn’t very useful. It’s a simple stub; if our application writes any interesting records out, we won’t know about it. If it closes the record writer, we won’t know.

The conventional way to correct this problem, of course, is to start tracking some state, so we can assert about it:

class FakeRecordWriter(object):
    def __init__(self):
        self.records = []
        self.closed = False

    def write_record(self, record):
        if self.closed:
            raise IOError("cannot write; writer is closed")
        self.records.append(record)

    def close(self):
        if self.closed:
            raise IOError("cannot close; writer is closed")
        self.closed = True

This is a very common pattern in test code. However, it’s an antipattern.

We have exposed 2 additional, apparently public attributes to application code: .records and .closed. Our original RecordWriter interface didn’t have either of those. Since these attributes are public, someone working on the application code could easily, inadvertently access them. Although it’s unlikely that an application author would think that they could read records from a record writer by accessing .records, it’s plausible that they might add a check of .closed before calling .close(), to make sure they won’t get an exception. Such a mistake might happen because their IDE auto-suggested the completion, for example.
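To make that hazard concrete, here is a minimal sketch. RealRecordWriter and finish_writing are hypothetical names invented for this illustration; the fake is the stateful one shown above, repeated so the example stands alone.

```python
class RealRecordWriter(object):
    """Hypothetical production writer: it keeps its state private."""
    def __init__(self):
        self._closed = False

    def write_record(self, record):
        pass

    def close(self):
        if self._closed:
            raise IOError("cannot close; writer is closed")
        self._closed = True


class FakeRecordWriter(object):
    """The stateful fake, with its two extra public attributes."""
    def __init__(self):
        self.records = []
        self.closed = False

    def write_record(self, record):
        if self.closed:
            raise IOError("cannot write; writer is closed")
        self.records.append(record)

    def close(self):
        if self.closed:
            raise IOError("cannot close; writer is closed")
        self.closed = True


def finish_writing(writer):
    """Hypothetical application code that leans on a fake-only attribute."""
    if not writer.closed:  # .closed is not part of the RecordWriter interface
        writer.close()


# Against the fake, the mistake goes unnoticed.
finish_writing(FakeRecordWriter())

# Against the real writer, the same code fails.
try:
    finish_writing(RealRecordWriter())
except AttributeError as error:
    leaked_attribute_error = error
```

The tests pass, and the failure only shows up once the application is wired to the real implementation.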

The resolution for this antipattern is to have a separate “fake” object, exposing only the public attributes that are also on the object being faked, and an “inspector” object, which exposes only the functionality useful to the test.

class WriterState(object):
    def __init__(self):
        self.records = []
        self.closed = False

    def raise_if_closed(self):
        if self.closed:
            raise ValueError("already closed")


class _FakeRecordWriter(object):
    def __init__(self, writer_state):
        self._state = writer_state

    def write_record(self, record):
        self._state.raise_if_closed()
        self._state.records.append(record)

    def close(self):
        self._state.raise_if_closed()
        self._state.closed = True


def create_fake_writer():
    state = WriterState()
    return state, _FakeRecordWriter(state)

In this refactored example, we now have a top-level entry point of create_fake_writer, which always creates a pair of WriterState and thing-which-is-like-a-RecordWriter. The type of _FakeRecordWriter can now be private, because it’s no longer interesting on its own; it exposes nothing beyond the methods it’s trying to fake.

Whenever you’re writing test fakes, consider writing them like this, to ensure that you can hand application code the application-facing half of your fake, and test code the test-facing half of the fake, and not get them mixed up.
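As a sketch of how a test might use the two halves: export_records below is a hypothetical piece of application code, and the test class is only one possible shape for such a test. The fake and its state object are repeated from above so the example runs on its own.

```python
import unittest


class WriterState(object):
    """Test-facing half: everything a test may want to inspect."""
    def __init__(self):
        self.records = []
        self.closed = False

    def raise_if_closed(self):
        if self.closed:
            raise ValueError("already closed")


class _FakeRecordWriter(object):
    """Application-facing half: only write_record() and close()."""
    def __init__(self, writer_state):
        self._state = writer_state

    def write_record(self, record):
        self._state.raise_if_closed()
        self._state.records.append(record)

    def close(self):
        self._state.raise_if_closed()
        self._state.closed = True


def create_fake_writer():
    state = WriterState()
    return state, _FakeRecordWriter(state)


def export_records(writer, records):
    """Hypothetical application code; it only ever sees the writer."""
    for record in records:
        writer.write_record(record)
    writer.close()


class ExportRecordsTests(unittest.TestCase):
    def test_writes_everything_then_closes(self):
        state, writer = create_fake_writer()
        export_records(writer, ["first", "second"])
        # Assertions go through the inspector, never through the fake.
        self.assertEqual(state.records, ["first", "second"])
        self.assertTrue(state.closed)

    def test_fake_exposes_no_public_state(self):
        state, writer = create_fake_writer()
        self.assertFalse(hasattr(writer, "records"))
        self.assertFalse(hasattr(writer, "closed"))
```

The application function never receives the WriterState, so it has no way to start depending on it by accident.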

Syndicated 2015-05-09 06:52:00 from Deciphering Glyph

Not Funny

What?

Today’s “joke” from the PSF about PyCon Havana was not funny, and, speaking as a PSF Fellow, I do not endorse it.

What’s Not Funny?

Honestly I’m not sure where I could find a punch-line in this. I just don’t see much there.

But if I look for something that’s supposed to be “funny”, here’s what I see:

  1. Cuba is a backward country without sufficient technology to host a technical conference, and it is absurd and therefore “funny” that we could hold PyCon there.
  2. We are talking about PyCon US; despite the recent thaw in relations, decades of hostility that have torn families apart make it “funny” that US citizens would go to Cuba for a conference.

These things aren’t funny.

Some Non-Reasons I’m Writing This

A common objection when someone speaks up about a subject like this is that it’s “just a joke”. That anyone speaking up and saying that offensive things aren’t funny somehow dislikes the very concept of humor. I don’t know why people think that, but I guess I need to make it clear: I am not an enemy of joy. That is not why I’m saying something.

I’m also not Cuban, I have no Cuban relatives, and until this incident I didn’t even know I had friends of Cuban extraction, so I am not personally insulted by this. That means another common objection will crop up: some will ask if I’m just looking for an excuse to get offended, to write about taking offense and get attention for it.

So let me assure you that, personally, this is not the kind of attention that I want. I really didn’t want to write this post. It’s awkward. I really don’t want to be having these types of conversations. I want to get attention for the software I write, not for my opinions about tacky blog posts.

Why, Then?

I might not know many Cuban Python programmers personally, but I’d love to meet some. I’d love to meet anyone who cares about programming. Meeting diverse people from all over the world and working with them on code has been one of the great joys of my life. I love the fact that the Python community facilitates that and tries hard to reach out to people and to make them feel welcome.

I am writing this because I know that, somewhere out there, there’s a Cuban programmer, or a kid who will grow up to be one, who might see that blog post, and think that the Python community, or the software industry, thinks that they’re a throw-away punch line. I want them to know that I don’t think they’re a punch line. I want them to know that the Python community doesn’t think they’re a punch line. I want them to know that they are not a punch line, and I want them to pursue their interest in programming exactly as far as it takes them, not to be pushed away.

These people are real, they are listening, and if you tell me to just “lighten up” you are saying that your enjoyment of a joke is more important than their membership in our community.

It’s Not Just Me

The PSF is paying attention. The chairman of the PSF has acknowledged the problematic nature of the “joke”. Several of my friends in the Python community spoke up before I did (here, here, here, here, here, here, here), and I am very grateful for their taking the community to task and keeping us true to ideals of inclusiveness and empathy.

That doesn’t excuse the public statement, made using official channels, which was in very poor taste. I am also very disappointed in certain people within the PSF1 who seem intent on doubling down on this mistake rather than trying to do something to correct it.


  1. names withheld to avoid a pile-on, but you know who you are and you should be ashamed. 

Syndicated 2015-04-02 05:47:00 from Deciphering Glyph

Headcanon

My Castle headcanon1 has always been that, when they finally catch up with Mal (oh, and they definitely do catch up with him; the idea that no faction within the Alliance would seek revenge for what he’s done is laughable) they decide that they can, in fact, “make people better”, and he is no exception. After the service he has done in exposing the corruption and cover-ups behind Miranda, they can’t just dispose of him, so they want to rehabilitate him and make him a productive, contributing member of Alliance society.

They can’t simply re-format his brain directly, of course. It wouldn’t be compatible with his personality, and his underlying connectome would simply reject the overlaid neural matrix; it would degrade over time, and he would have to return for treatments far too regularly for it to be practical.

The most fitting neural re-programming they can give him, of course, would be to have him gradually acclimate to becoming a lawman. So “Richard Castle” begins as an anti-authoritarian man-child and acquiesces, bit by bit, to the necessity of becoming an agent of the enforcement of order.

My favorite thing about the current season is that, while it is already obvious that my interpretation is correct, this season has given Mal a glimmer of hope. Clearly the reprogramming isn’t working, and aspects of his real life are coming through.

They really can’t take the sky from him.

Syndicated 2015-03-22 08:16:00 from Deciphering Glyph

Deploying Python Applications with Docker - A Suggestion

Deploying Python applications is much trickier than it should be.

Docker can simplify this, but even with Docker, there are a lot of nuances around how you package your Python application, how you build it, how you pull in your Python and non-Python dependencies, and how you structure your images.

I would like to share with you a strategy that I have developed for deploying Python apps that deals with a number of these issues. I don’t want to claim that this is the only way to deploy Python apps, or even a particularly right way; in the rapidly evolving containerization ecosystem, new techniques pop up every day, and everyone’s application is different. However, I humbly submit that this process is a good default.

Rather than equivocate further about its abstract goodness, here are some properties of the following container construction idiom:

  1. It reduces build times from a naive “sudo python setup.py install” by using Python wheels to cache repeatably built binary artifacts.
  2. It reduces container size by separating build containers from run containers.
  3. It is independent of other tooling, and should work fine with whatever configuration management or container orchestration system you want to use.
  4. It uses existing Python tooling of pip and virtualenv, and therefore doesn’t depend heavily on Docker. A lot of the same concepts apply if you have to build or deploy the same Python code into a non-containerized environment. You can also incrementally migrate towards containerization: if your deploy environment is not containerized, you can still build and test your wheels within a container and get the advantages of containerization there, as long as your base image matches the non-containerized environment you’re deploying to. This means you can quickly upgrade your build and test environments without having to upgrade the host environment on finicky continuous integration hosts, such as Jenkins or Buildbot.

To test these instructions, I used Docker 1.5.0 (via boot2docker, but hopefully that is an irrelevant detail). I also used an Ubuntu 14.04 base image (as you can see in the docker files) but hopefully the concepts should translate to other base images as well.

In order to show how to deploy a sample application, we’ll need a sample application to deploy; to keep it simple, here’s some “hello world” sample code using Klein:

# deployme/__init__.py
from klein import run, route

@route('/')
def home(request):
    request.setHeader("content-type", "text/plain")
    return 'Hello, world!'

def main():
    run("", 8081)

And an accompanying setup.py:

from setuptools import setup, find_packages

setup (
    name             = "DeployMe",
    version          = "0.1",
    description      = "Example application to be deployed.",
    packages         = find_packages(),
    install_requires = ["twisted>=15.0.0",
                        "klein>=15.0.0",
                        "treq>=15.0.0",
                        "service_identity>=14.0.0"],
    entry_points     = {'console_scripts':
                        ['run-the-app = deployme:main']}
)

Generating certificates is a bit tedious for a simple example like this one, but in a real-life application we are likely to face the deployment issue of native dependencies. To demonstrate how to deal with that issue, this setup.py depends on the service_identity module, which pulls in cryptography (which depends on OpenSSL) and its dependency cffi (which depends on libffi).

To get started telling Docker what to do, we’ll need a base image that we can use for both build and run images, to ensure that certain things match up; particularly the native libraries that are used to build against. This also speeds up subsequent builds, by giving a nice common point for caching.

In this base image, we’ll set up:

  1. a Python runtime (PyPy)
  2. the C libraries we need (the libffi6 and openssl ubuntu packages)
  3. a virtual environment in which to do our building and packaging

# base.docker
FROM ubuntu:trusty

RUN echo "deb http://ppa.launchpad.net/pypy/ppa/ubuntu trusty main" > \
    /etc/apt/sources.list.d/pypy-ppa.list

RUN apt-key adv --keyserver keyserver.ubuntu.com \
                --recv-keys 2862D0785AFACD8C65B23DB0251104D968854915
RUN apt-get update

RUN apt-get install -qyy \
    -o APT::Install-Recommends=false -o APT::Install-Suggests=false \
    python-virtualenv pypy libffi6 openssl

RUN virtualenv -p /usr/bin/pypy /appenv
RUN . /appenv/bin/activate; pip install pip==6.0.8

The apt options APT::Install-Recommends and APT::Install-Suggests are just there to prevent python-virtualenv from pulling in a whole C development toolchain with it; we’ll get to that stuff in the build container. In the run container, which is also based on this base container, we will just use virtualenv and pip for putting the already-built artifacts into the right place. Ubuntu expects that these are purely development tools, which is why it recommends installation of python development tools as well.

You might wonder “why bother with a virtualenv if I’m already in a container”? This is belt-and-suspenders isolation, but you can never have too much isolation.

It’s true that in many cases, perhaps even most, simply installing stuff into the system Python with pip works fine; however, for more elaborate applications, you may end up wanting to invoke a tool provided by your base container that is implemented in Python, but which requires dependencies managed by the host. By putting things into a virtualenv regardless, we keep the things set up by the base image’s package system tidily separated from the things our application is building, which means that there should be no unforeseen interactions, regardless of how complex the application’s usage of Python might be.

Next we need to build the base image, which is accomplished easily enough with a docker command like:

$ docker build -t deployme-base -f base.docker .;

Next, we need a container for building our application and its Python dependencies. The dockerfile for that is as follows:

# build.docker
FROM deployme-base

RUN apt-get install -qy libffi-dev libssl-dev pypy-dev
RUN . /appenv/bin/activate; \
    pip install wheel

ENV WHEELHOUSE=/wheelhouse
ENV PIP_WHEEL_DIR=/wheelhouse
ENV PIP_FIND_LINKS=/wheelhouse

VOLUME /wheelhouse
VOLUME /application

ENTRYPOINT . /appenv/bin/activate; \
           cd /application; \
           pip wheel .

Breaking this down, we first have it pulling from the base image we just built. Then, we install the development libraries and headers for each of the C-level dependencies we have to work with, as well as PyPy’s development toolchain itself. Then, to get ready to build some wheels, we install the wheel package into the virtualenv we set up in the base image. Note that the wheel package is only necessary for building wheels; the functionality to install them is built in to pip.

Note that we then have two volumes: /wheelhouse, where the wheel output should go, and /application, where the application’s distribution (i.e. the directory containing setup.py) should go.

The entrypoint for this image is simply running “pip wheel” with the appropriate virtualenv activated. It runs against whatever is in the /application volume, so we could potentially build wheels for multiple different applications. In this example, I’m using pip wheel . which builds the current directory, but you may have a requirements.txt which pins all your dependencies, in which case you might want to use pip wheel -r requirements.txt instead.

At this point, we need to build the builder image, which can be accomplished with:

$ docker build -t deployme-builder -f build.docker .;

This builds a deployme-builder that we can use to build the wheels for the application. Since this is a prerequisite step for building the application container itself, you can go ahead and do that now. In order to do so, we must tell the builder to use the current directory as the application being built (the volume at /application) and to put the wheels into a wheelhouse directory (one called wheelhouse will do):

$ mkdir -p wheelhouse;
$ docker run --rm \
         -v "$(pwd)":/application \
         -v "$(pwd)"/wheelhouse:/wheelhouse \
         deployme-builder;

After running this, if you look in the wheelhouse directory, you should see a bunch of wheels built there, including one for the application being built:

$ ls wheelhouse
DeployMe-0.1-py2-none-any.whl
Twisted-15.0.0-pp27-none-linux_x86_64.whl
Werkzeug-0.10.1-py2-none-any.whl
cffi-0.9.0-py2-none-any.whl
# ...
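The names in that listing encode what each wheel was built for, which is the reason the build and run images have to share a base. As a rough sketch, assuming the filenames follow the layout described in PEP 427, they can be taken apart like this; parse_wheel_filename and is_pure_python are hypothetical helpers written for this example, not part of pip or wheel.

```python
from collections import namedtuple

WheelName = namedtuple(
    "WheelName", "distribution version build python_tag abi_tag platform_tag"
)


def parse_wheel_filename(filename):
    """Split a wheel filename into the fields laid out in PEP 427."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: %r" % (filename,))
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        # The optional build tag is absent.
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel filename: %r" % (filename,))
    return WheelName(distribution, version, build,
                     python_tag, abi_tag, platform_tag)


def is_pure_python(filename):
    """Treat "none-any" wheels as portable; the rest are platform-specific."""
    wheel = parse_wheel_filename(filename)
    return wheel.abi_tag == "none" and wheel.platform_tag == "any"
```

By this reading, the Twisted wheel above is tied to PyPy 2.7 on 64-bit Linux (pp27, linux_x86_64), while the DeployMe wheel is tagged none-any and should install on any Python 2 interpreter.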

At last, time to build the application container itself. The setup for that is very short, since most of the work has already been done for us in the production of the wheels:

# run.docker
FROM deployme-base

ADD wheelhouse /wheelhouse
RUN . /appenv/bin/activate; \
    pip install --no-index -f wheelhouse DeployMe

EXPOSE 8081

ENTRYPOINT . /appenv/bin/activate; \
           run-the-app

During build, this dockerfile pulls from our shared base image, then adds the wheelhouse we just produced as a directory at /wheelhouse. The only shell command that needs to run in order to get the wheels installed is pip install TheApplicationYouJustBuilt, with two options: --no-index to tell pip “don’t bother downloading anything from PyPI, everything you need should be right here”, and -f wheelhouse, which tells it where “here” is.

The entrypoint for this one activates the virtualenv and invokes run-the-app, the setuptools entrypoint defined above in setup.py, which should be on the $PATH once that virtualenv is activated.
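For readers unfamiliar with console_scripts, the string "deployme:main" means "import the module deployme and call its attribute main". The script that setuptools actually generates differs in detail, so treat the following as an approximation of the idea only; resolve_entry_point is a hypothetical helper written for this example.

```python
import importlib


def resolve_entry_point(spec):
    """Turn a "module:attribute" string into the object it names."""
    module_name, separator, attribute = spec.partition(":")
    if not separator or not attribute:
        raise ValueError("expected 'module:attribute', got %r" % (spec,))
    target = importlib.import_module(module_name)
    # Walk dotted attributes such as "SomeClass.method".
    for part in attribute.split("."):
        target = getattr(target, part)
    return target
```

Inside the activated virtualenv, resolve_entry_point("deployme:main")() would then do approximately what typing run-the-app does.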

The application build is very simple, just

$ docker build -t deployme-run -f run.docker .;

to build the docker file.

Similarly, running the application is just like any other docker container:

$ docker run --rm -it -p 8081:8081 deployme-run

You can then hit port 8081 on your docker host to load the application.

The command-line for docker run here is just an example; for instance, I’m passing --rm just so that, if you run this example, it won’t clutter up your container list. Your environment will have its own way to call docker run and to get your VOLUMEs and EXPOSEd ports mapped, and discussing how to orchestrate your containers is out of scope for this post; you can pretty much run it however you like. Everything the image needs is built in at this point.

To review:

  1. have a common base container that contains all your non-Python (C libraries and utilities) dependencies. Avoid installing development tools here.
  2. use a virtualenv even though you’re in a container to avoid any surprises from the host Python environment
  3. have a “build” container that just makes the virtualenv and puts wheel and pip into it, and runs pip wheel
  4. run the build container with your application code in a volume as input and a wheelhouse volume as output
  5. create an application container by starting from the same base image and, once again not installing any dev tools, pip install all the wheels that you just built, turning off access to PyPI for that installation so it goes quickly and deterministically based on the wheels you’ve built.
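
The steps reviewed above can be strung together in a small driver script. This is only a sketch: docker_commands and build_everything are hypothetical names, and it assumes base.docker, build.docker, run.docker and setup.py all sit in the project directory, with the image names used earlier in this post.

```python
import os
import subprocess


def docker_commands(project_dir):
    """Return the docker invocations from this post, in order, as argv lists."""
    wheelhouse = os.path.join(project_dir, "wheelhouse")
    return [
        ["docker", "build", "-t", "deployme-base", "-f", "base.docker", "."],
        ["docker", "build", "-t", "deployme-builder", "-f", "build.docker", "."],
        # Build the wheels: source goes in, wheelhouse comes out.
        ["docker", "run", "--rm",
         "-v", project_dir + ":/application",
         "-v", wheelhouse + ":/wheelhouse",
         "deployme-builder"],
        ["docker", "build", "-t", "deployme-run", "-f", "run.docker", "."],
    ]


def build_everything(project_dir):
    """Run each step in turn, stopping at the first failure."""
    project_dir = os.path.abspath(project_dir)
    wheelhouse = os.path.join(project_dir, "wheelhouse")
    if not os.path.isdir(wheelhouse):
        os.makedirs(wheelhouse)
    for command in docker_commands(project_dir):
        # check_call raises CalledProcessError on a non-zero exit status.
        subprocess.check_call(command, cwd=project_dir)
```

Calling build_everything(".") from the project directory should leave a deployme-run image ready for docker run.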

While this sample application uses Twisted, it’s quite possible to apply this same process to just about any Python application you want to run inside Docker.

I’ve put a sample project up on GitHub which contains all the files referenced here, as well as “build” and “run” shell scripts that combine the necessary docker commandlines to go through the full process to build and run this sample app. While it defaults to the PyPy runtime (as most networked Python apps generally should these days, since performance is so much better than CPython), if you have an application with a hard CPython dependency, I’ve also made a branch and pull request on that project for CPython, and you can look at the relatively minor patch required to get it working for CPython as well.

Now that you have a container with an application in it that you might want to deploy, my previous write-up on a quick way to securely push stuff to a production service might be of interest.

(Once again, thanks to my employer, Rackspace, for sponsoring the time for me to write this post. Thanks also to Shawn Ashlee and Jesse Spears for helping me refine these ideas and listening to me rant about them. However, that expression of gratitude should not be taken as any kind of endorsement from any of these parties as to my technical suggestions or opinions here, as they are entirely my own.)

Syndicated 2015-03-06 22:58:00 from Deciphering Glyph

According To...?

I believe that web browsers must start including the ultimate issuer in an always-visible user interface element.

You are viewing this website at glyph.twistedmatrix.com. Hopefully securely.

We trust that the math in the cryptographic operations protects our data from prying eyes. However, trusting that the math says the content is authentic and secure is useless unless you know who your computer is talking to. The HTTPS/TLS system identifies your interlocutor by their domain name.

In other words, you trust that these words come from me because glyph.twistedmatrix.com is reasonably associated with me. If the lock on your web browser’s title bar was next to the name stuff-glyph-says.stealing-your-credit-card.example.com, presumably you might be more skeptical that the content was legitimate.

But... the cryptographic primitives require a trust root - somebody that you “already trust” - meaning someone that your browser already knows about at the time it makes the request - to tell you that this site is indeed glyph.twistedmatrix.com. So you read these words as if they’re the world according to Glyph, but according to whom is it according to me?

If you click on some obscure buttons (in Safari and Firefox you click on the little lock; in Chrome you click on the lock, then “Connection”) you should see that my identity as glyph.twistedmatrix.com has been verified by “StartCom Class 1 Primary Intermediate Server CA” who was in turn verified by “StartCom Certification Authority”.
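
The same check can be made outside the browser. The sketch below assumes Python 3’s standard ssl module; issuer_fields and fetch_issuer are hypothetical helper names. Note that getpeercert() describes the server’s own certificate, so the issuer it reports is the immediate signer (typically an intermediate), not necessarily the root at the top of the chain.

```python
import socket
import ssl


def issuer_fields(certificate):
    """Flatten the nested "issuer" tuples from getpeercert() into a dict."""
    fields = {}
    for relative_name in certificate.get("issuer", ()):
        for key, value in relative_name:
            fields[key] = value
    return fields


def fetch_issuer(hostname, port=443):
    """Connect over TLS and report who signed the server's certificate."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=10) as connection:
        with context.wrap_socket(connection,
                                 server_hostname=hostname) as tls:
            return issuer_fields(tls.getpeercert())
```

A call such as fetch_issuer("glyph.twistedmatrix.com") returns a dictionary whose "commonName" and "organizationName" entries, when the certificate includes them, name the issuer for that one connection only.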

But if you do this, it only tells you about this one time. You could click on a link, and the issuer might change. It might be different for just one script on the page, and there’s basically no way to find out. There are more than 50 different organizations which could tell your browser to trust that this content is from me, several of whom have already been compromised. If you’re concerned about government surveillance, this list includes the governments of Hong Kong, Japan, France, the Netherlands, Turkey, as well as many multinational corporations vulnerable to secret warrants from the USA.

Sometimes it’s perfectly valid to trust these issuers. If I’m visiting a website describing some social services provided to French citizens, it would of course be reasonable for that to be trusted according to the government of France. But if you’re reading an article on my website about secure communications technology, probably it shouldn’t be glyph.twistedmatrix.com brought to you by the China Internet Network Information Center.

Information security is all about the user having some expectation and then a suite of technology ensuring that that expectation is correctly met. If the user’s expectation of the system’s behavior is incorrect, then all the technological marvels in the world making sure that behavior is faithfully executed will not help their actual security at all. Without knowing the issuer though, it’s not clear to me what the user’s expectation is supposed to be about the lock icon.

The security authority system suffers from being a market for silver bullets. Secure websites are effectively resellers of the security offered to them by their certificate issuers; however, the customers are practically unable to even see the trade mark - the issuer name - of the certificate authority ultimately responsible for the integrity and confidentiality of their communications, so they have no information at all. The website itself also has next to no information because the certificate authority themselves are under no regulatory obligation to disclose or verify their security practices.

Without seeing the issuer, there’s no way for “issuer reputation” to be a selling point, which means there’s no market motivation for issuers to do a really good job securing their infrastructure. There’s no way for average users to notice if they are the victims of a targeted surveillance attack.

So please, browser vendors, consider making this information available to the general public so we can all start making informed decisions about who to trust.

Syndicated 2015-02-13 20:52:00 from Deciphering Glyph

Security as Stencil

Image Credit: Horia Varlan

On the Internet, it’s important to secure all of your communications.

There are a number of applications which purport to give you “secure chat”, “secure email”, or “secure phone calls”.

The problem with these applications is that they advertise their presence. Since “insecure chat”, “insecure email” and “insecure phone calls” all have a particular, detectable signature, an interested observer may easily detect your supposedly “secure” communication. Not only that, but the places that you go to obtain them are suspicious in their own right. In order to visit Whisper Systems, you have to be looking for “secure” communications.

This allows the adversary to use “security” technologies such as encryption as a sort of stencil, to outline and highlight the communication that they really want to be capturing. In the case of the NSA, this dumps anyone who would like to have a serious private conversation with a friend into the same bucket, from the perspective of the authorities, as a conspiracy of psychopaths trying to commit mass murder.

The Snowden documents already demonstrate that the NSA does exactly this; if you send a normal email, they will probably lose interest and ignore it after a little while, whereas if you send a “secure” email, they will store it forever and keep trying to crack it to see what you’re hiding.

If you’re running a supposedly innocuous online service or writing a supposedly harmless application, the hassle associated with setting up TLS certificates and encryption keys may seem like a pointless distraction. It isn’t.

For one thing, if you have anywhere that user-created content enters your service, you don’t know what they are going to be using it to communicate. Maybe you’re just writing an online game but users will use your game for something as personal as courtship. Can we agree that the state security services shouldn’t be involved in that? Even if you were specifically writing an app for dating, you might not anticipate that the police will use it to show up and arrest your users so that they will be savagely beaten in jail.

The technology problems that “secure” services are working on are all important. But we can’t simply develop a good “secure” technology, consider it a niche product, and leave it at that. Those of us who are software development professionals need to build security into every product, because users expect it. Users expect it because we are, in a million implicit ways, telling them that they have it. If we put a “share with your friend!” button into a user interface, that’s a claim: we’re claiming that the data the user indicates is being shared only with their friend. Would we want to put in a button that says “share with your friend, and with us, and with the state security apparatus, and with any criminal who can break in and steal our database!”? Obviously not. So let’s stop making the “share with your friend!” button actually do that.

Those of us who understand the importance of security and are in the business of creating secure software must, therefore, take on the Sisyphean task of not only creating good security, but of competing with the insecure software on its own turf, so that people actually use it. “Slightly worse to use than your regular email program, but secure” is not good enough. (Not to mention the fact that existing security solutions are more than “slightly” worse to use). Secure stuff has to be as good as or better than its insecure competitors.

I know that this is a monumental undertaking. I have personally tried and failed to do something like this more than once. As the Rabbi Tarfon put it, though:

It is not incumbent upon you to complete the work, but neither are you at liberty to desist from it.

Syndicated 2015-01-25 00:16:00 from Deciphering Glyph

The Glyph

As you may have seen me around the Internet, I am typically represented by an obscure symbol.

The “Glyph” Glyph

I have been asked literally hundreds of times about the meaning of that symbol, and I’ve always been cryptic in response because I felt that a full explanation was too much work. Since the symbol is something I invented, the invention is fairly intricate, and it takes some time to explain, describing it in person requires a degree of sustained narcissism that I’m not really comfortable with.

You all keep asking, though, and I really do appreciate the interest, so thanks to those of you who have asked over and over again: here it is. This is what the glyph means.

Ulterior Motive

I do have one other reason that I’m choosing to publish this particular tidbit now. Over the course of my life I have spent a lot of time imagining things, doing world-building for games that I have yet to make or books that I have yet to write. While I have published fairly voluminously at this point on technical topics (more than once on actual paper), as well as spoken about them at conferences, I haven’t made many of my fictional ideas public.

There are a variety of reasons for this (not the least of which is that I have been gainfully employed to write about technology and nobody has ever wanted to do that for fiction) but I think the root cause is that I’m afraid that these ideas will be poorly received. I’m afraid that I’ll be judged according to the standards for the things that I’m now an expert professional at – software development – for something that I am a rank amateur at – writing fiction. So this problem is only going to get worse as I get better at the former and keep not getting practice at the latter by not publishing.

In other words, I’m trying to break free of my little hater.

So this represents the first – that I recall, at least – public sharing of any of the Divunal source material, since the Twisted Reality Demo Server was online 16 years ago. It’s definitely incomplete. Some of it will probably be bad; I know. I ask for your forbearance, and with it, hopefully I will publish more of it and thereby get better at it.

Backstory

I have been working on the same video game, off and on, for more or less my entire life. I am an extremely distractible person, so it hasn’t seen that much progress - at least not directly - in the last decade or so. I’m also relentlessly, almost pathologically committed to long-term execution of every dumb idea I’ve ever had, so any minute now I’m going to finish up with this event-driven networking thing and get back to the game. I’ll try to avoid spoilers, in case I’m lucky enough for any of you to ever actually play this thing.

The symbol comes from early iterations of that game, right about the time that it was making the transition from Zork fan-fiction to something more original.

Literally translated from the in-game language, the symbol is simply an ideogram that means “person”, but its structure is considerably more nuanced than that simple description implies.

The world where Divunal takes place, Divuthan, was populated by a civilization that has had digital computers for tens of thousands of years, so their population had effectively co-evolved with automatic computing. They no longer had a concept of static, written language on anything like paper or books. Ubiquitous availability of programmable smart matter meant that the language itself was three dimensional and interactive. Almost any nuance of meaning which we would use body language or tone of voice to convey could be expressed in varying how letters were proportioned relative to each other, what angle they were presented at, and so on.

Literally every Divuthan person’s name is some variation of this ideogram.

So a static ideogram like the one I use would ambiguously reference a person, but additional information would be conveyed by diacritical marks consisting of other words, by the relative proportions of sizes, colors, and adornments of various parts of the symbol, indicating which person it was referencing.

However, the game itself is of the post-apocalyptic variety, albeit one of the more hopeful entries in that genre, since restoring life to the world is one of the player’s goals. One of the things that leads to the player’s entrance into the world is a catastrophe that has mysteriously caused most of the inhabitants to disappear and disabled or destroyed almost all of their technology.

Within the context of the culture that created the “glyph” symbol in the game world, it wasn’t really intended to be displayed in the form that you see it. The player would first see such a symbol after entering a ruined, uninhabited residential structure. A symbol like this, referring to a person, would typically have adornments and modifications indicating a specific person, and it would generally be animated in some way.

The display technology used by the Divuthan civilization was all retained-mode, because I imagined that a highly advanced display technology would minimize power cost when not in use (much like e-paper avoids the power drain of constantly updating the screen). When functioning normally, this was an irrelevant technical detail, of course; the displays displayed what you wanted them to display. But after a catastrophe that has disrupted network connectivity and ruined a lot of computers, this detail is important because many of the displays were still showing static snapshots of a language intended to use motion and interactivity as ways to convey information.

As the player wandered through the environment, they would find some systems that were still active, and my intent was (or “is”, I suppose, since I do still hold out hope that I’ll eventually actually make some version of this...) that the player would come to see the static, dysfunctional environment around them as melancholy, and set about restoring function to as many of these devices as possible in order to bring the environment back to life. Some of this would be represented quite concretely, as time-travel puzzles later in the game actually allowed the players to mitigate aspects of the catastrophe that broke everything in the first place, thereby “resurrecting” NPCs by preventing their disappearance or death.

Coen

COEN

Coen refers to the self, the physical body, the notion of “personhood” abstractly. The minified / independent version is an ideogram for just the head, but the full version as it is presented in the “glyph” ideogram is a human body: the crook at the top is the head (facing right); the line through the middle represents the arms, and the line going down represents the legs and feet.

This is the least ambiguous and nuanced of all the symbols. The one nuance is that if used in its full form with no accompanying ideograms, it means “corpse”, since a body which can’t do anything isn’t really a person any more.

Kset

KSET

This is the trickiest ideogram to pronounce. The “ks” is meant to be voiced as a “click-hiss” noise, the “e” has a flat tone like a square wave from a synthesizer, and the “t” is very clipped. It is intended to reference the power-on sound that some of the earliest (remember: 10s of thousands of years before the main story, so it’s not like individuals have a memory of the way these things sounded) digital computers in Divuthan society made.

Honestly though if you try to do this properly it ends up sounding a lot like the English word “cassette”, which I assure you is fitting but completely unintentional.

Kset refers to algorithms and computer programs, but more generally, thought and the life of the mind.

This is a reference to the “Ee” spell power rune in the 80s Amiga game Dungeon Master, although sadly I can’t find any online record of how the manual described it. It is an object poised on a sharp edge, ready to roll either left or right - in other words, a symbolic representation of a physical representation of the algorithmic concept of a decision point, or the computational concept of a branch, or a jump instruction.

Edec

EDEC

Edec refers to connectedness. It is an ideogram reflecting a social graph, with the individual below and their many connections above them. It’s the general term for “social relationship” but it’s also the general term for “network protocol”. When Divuthan kids form relationships, they often begin by configuring a specific protocol for their communication.

This is how boundary-setting within friendships and work environments (and, incidentally, flirting) works; they use meta-protocol messages to request expanded or specialized interactions for use within the context of their dedicated social-communication channels.

Unlike most of these other ideograms, its pronunciation is not etymologically derived from an onomatopoeia, but rather from an acronym identifying one of the first social-communication protocols (long since obsoleted).

Zenk

ZENK

“Zenk” is the ideogram for creation. It implies physical, concrete creations but denotes all types of creation, including intellectual products.

The ideogram represents the Divuthan version of an anvil, which, due to certain quirks of Divuthan materials science that are beyond the scope of this post, doubles for the generic idea of a “work surface”. So you could also think of it as a desk with two curved legs. This is the only ideogram which represents something still physically present in modern, pre-catastrophe Divuthan society. In fact, workshop surfaces are often stylized to look like a Zenk radical, as are work-oriented computer terminals (which are basically an iPad-like device the size of a dinner table).

The pronunciation, “Zenk”, is an onomatopoeia, most closely resembled in English by “clank”; the sound of a hammer striking an anvil.

Lesh

LESH

“Lesh” is the ideogram for communication. It refers to all kinds of communication - written words, telephony, video - but it implies persistence.

The bottom line represents a sheet of paper (or a mark on that sheet of paper), and the diagonal line represents an ink brush making a mark on that paper.

This predates the current co-evolutionary technological environment, because, appropriately for a society featured in a text-based adventure game, the dominant cultural groups within this civilization developed a shared obsession for written communication and symbolic manipulation before they had access to devices which could digitally represent all of it.

All Together Now

There is an overarching philosophical concept of “person-ness” that this glyph embodies in Divuthan culture: although individuals vary, the things that make up a person are being (the body, coen), thinking (the mind, kset), belonging (the network, edec), making (tools, zenk) and communicating (paper and pen, lesh).

In summary, if a Divuthan were to see my little unadorned avatar icon next to something I have posted on twitter, or my blog, the overall impression that it would elicit would be something along the lines of:

“I’m just this guy, you know?”

And To Answer Your Second Question

No, I don’t know how it’s pronounced. It’s been 18 years or so and I’m still working that bit out.

Syndicated 2015-01-12 09:00:00 from Deciphering Glyph

Docker Dev to Prod in Just A Few Easy Steps

It seems that docker is all the rage these days. Docker has popularized a powerful paradigm for repeatable, isolated deployments of pretty much any application you can run on Linux. There are numerous highly sophisticated orchestration systems which can leverage Docker to deploy applications at massive scale. At the other end of the spectrum, there are quick ways to get started with automated deployment or orchestrated multi-container development environments.

When you're just getting started, this dazzling array of tools can be as bewildering as it is impressive.

A big part of the promise of docker is that you can build your app in a standard format on any computer, anywhere, and then run it. As docker.com puts it:

“... run the same app, unchanged, on laptops, data center VMs, and any cloud ...”

So when I started approaching docker, my first thought was: before I mess around with any of this deployment automation stuff, how do I just get an arbitrary docker container that I've built and tested on my laptop shipped into the cloud?

There are a few documented options that I came across, but they all had drawbacks, and didn't really make the ideal tradeoff for just starting out:

  1. I could push my image up to the public registry and then pull it down. While this works for me on open source projects, it doesn't really generalize.
  2. I could run my own registry on a server, and push it there. I can either run it plain-text and risk the unfortunate security implications that implies, deal with the administrative hassle of running my own certificate authority and propagating trust out to my deployment node, or spend money on a real TLS certificate. Since I'm just starting out, I don't want to deal with any of these hassles right away.
  3. I could re-run the build on every host where I intend to run the application. This is easy and repeatable, but unfortunately it means that I'm missing part of that great promise of docker - I'm running potentially subtly different images in development, test, and production.

I think I have figured out a fourth option that is super fast to get started with, as well as being reasonably secure.

What I have done is:

  1. run a local registry
  2. build an image locally - testing it until it works as desired
  3. push the image to that registry
  4. use SSH port forwarding to "pull" that image onto a cloud server, from my laptop

Before running the registry, you should set aside a persistent location for the registry's storage. Since I'm using boot2docker, I stuck this in my home directory, like so:

me@laptop$ mkdir -p ~/Documents/Docker/Registry

To run the registry, you need to do this:

me@laptop$ docker pull registry
...
Status: Image is up to date for registry:latest
me@laptop$ docker run --name registry --rm=true -p 5000:5000 \
    -e GUNICORN_OPTS=[--preload] \
    -e STORAGE_PATH=/registry \
    -v "$HOME/Documents/Docker/Registry:/registry" \
    registry

To briefly explain each of these arguments: --name is just there so I can quickly identify this as my registry container in docker ps and the like; --rm=true is there so that I don't create detritus from subsequent runs of this container; -p 5000:5000 exposes the registry to the docker host; -e GUNICORN_OPTS=[--preload] is a workaround for a small bug; -e STORAGE_PATH=/registry tells the registry to look in /registry for its images; and the -v option points /registry at the directory we created above.

It's important to understand that this registry container only needs to be running for the duration of the commands below. Spin it up, push and pull your images, and then you can just shut it down.
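One way to make that "spin it up, then shut it down" lifecycle explicit is to wrap it in a small shell function. This is only a sketch: with_registry is a hypothetical name, and it swaps the foreground --rm=true invocation shown above for a detached container that is removed with docker rm -f once the wrapped command finishes.

```shell
# Hypothetical helper: run one command while a throwaway local registry is up.
# Usage: with_registry COMMAND [ARGS...]
with_registry() {
    # Start the registry detached, with the same options as above (minus
    # --rm, because the container is removed explicitly further down).
    docker run -d --name registry -p 5000:5000 \
        -e 'GUNICORN_OPTS=[--preload]' \
        -e STORAGE_PATH=/registry \
        -v "$HOME/Documents/Docker/Registry:/registry" \
        registry >/dev/null || return 1

    # Run whatever was passed in (for example, a push) and keep its status.
    "$@"
    rc=$?

    # Tear the registry down whether or not the command succeeded.
    docker rm -f registry >/dev/null
    return "$rc"
}
```

With that defined, something like with_registry docker push localhost.localdomain:5000/mydockerapp would start the registry, push, and clean up afterwards.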

Next, you want to build your image, tagging it with the registry's address, localhost.localdomain:5000.

me@laptop$ cd MyDockerApp
me@laptop$ docker build -t localhost.localdomain:5000/mydockerapp .

Assuming the image builds without incident, the next step is to send the image to your registry.

me@laptop$ docker push localhost.localdomain:5000/mydockerapp

Once that has completed, it's time to "pull" the image on your cloud machine, which (again, if you're using boot2docker, like me) can be done like so:

me@laptop$ ssh -t -R 127.0.0.1:5000:"$(boot2docker ip 2>/dev/null)":5000 \
    mycloudserver.example.com \
    'docker pull localhost.localdomain:5000/mydockerapp'

If you're on Linux and simply running Docker on a local host, then you don't need the "boot2docker" command:

me@laptop$ ssh -t -R 127.0.0.1:5000:127.0.0.1:5000 \
    mycloudserver.example.com \
    'docker pull localhost.localdomain:5000/mydockerapp'
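The two variants above differ only in the address that the SSH tunnel forwards to, so they can be folded into one helper that checks whether boot2docker is available. The function names registry_host and remote_pull are made up for this sketch, and it assumes that an installed boot2docker is also the one running your registry.

```shell
# Hypothetical helper: print the address where the local registry listens.
registry_host() {
    if command -v boot2docker >/dev/null 2>&1; then
        # boot2docker runs Docker inside a VM, so forward to the VM's address.
        boot2docker ip 2>/dev/null
    else
        # Docker runs directly on this machine.
        echo 127.0.0.1
    fi
}

# Hypothetical helper: pull an image onto a server through a reverse tunnel.
# Usage: remote_pull SERVER IMAGE
remote_pull() {
    server=$1
    image=$2
    ssh -t -R "127.0.0.1:5000:$(registry_host):5000" "$server" \
        "docker pull localhost.localdomain:5000/$image"
}
```

For example, remote_pull mycloudserver.example.com mydockerapp should behave like whichever of the two commands above applies to your setup.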

Finally, you can now run this image on your cloud server. You will of course need to decide on appropriate configuration options for your applications such as -p, -v, and -e:

me@laptop$ ssh mycloudserver.example.com \
    'docker run -d --restart=always --name=mydockerapp \
        -p ... -v ... -e ... \
        localhost.localdomain:5000/mydockerapp'

To avoid network round trips, you can even run the previous two steps as a single command:

me@laptop$ ssh -t -R 127.0.0.1:5000:"$(boot2docker ip 2>/dev/null)":5000 \
    mycloudserver.example.com \
    'docker pull localhost.localdomain:5000/mydockerapp && \
     docker run -d --restart=always --name=mydockerapp \
        -p ... -v ... -e ... \
        localhost.localdomain:5000/mydockerapp'

I would not recommend setting up any intense production workloads this way; those orchestration tools I mentioned at the beginning of this article exist for a reason, and if you need to manage a cluster of servers you should probably take the time to learn how to set up and manage one of them.

However, as far as I know, there's also nothing wrong with putting your application into production this way. If you have a simple single-container application, then this is a reasonably robust way to run it: the docker daemon will take care of restarting it if your machine crashes, and running this command again (with a docker rm -f mydockerapp before docker run) will re-deploy it in a clean, reproducible way.
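As a rough sketch of that re-deploy, the remote command can be assembled by a small function and handed to the same ssh invocation used earlier. redeploy_command is a hypothetical name, and the options you pass through stand in for whatever -p, -v, and -e settings your application needs; options containing spaces would need extra quoting that this sketch does not attempt.

```shell
# Hypothetical helper: print the remote command for a clean re-deploy.
# Usage: redeploy_command NAME [docker run options...]
redeploy_command() {
    name=$1
    shift
    image="localhost.localdomain:5000/$name"
    # "docker rm -f" may report an error when no container by that name
    # exists yet (the very first deploy), so it is joined to "docker run"
    # with ";" rather than "&&" and its exit status is ignored.
    printf '%s\n' "docker pull $image && { docker rm -f $name; docker run -d --restart=always --name=$name $* $image; }"
}
```

The printed string can then replace the quoted remote command in the combined ssh example above, for instance "$(redeploy_command mydockerapp -p 80:8080)", where the port mapping is only a placeholder.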

So if you're getting started exploring docker and you're not sure how to get a couple of apps up and running just to give it a spin, hopefully this can set you on your way quickly!

(Many thanks to my employer, Rackspace, for sponsoring the time for me to write this post. Thanks also to Jean-Paul Calderone, Alex Gaynor, and Julian Berman for their thoughtful review. Any errors are surely my own.)

Syndicated 2014-12-09 02:14:00 from Deciphering Glyph

Public or Private?

If I am creating a new feature in library code, I have two choices with the implementation details: I can make them public - that is, exposed to application code - or I can make them private - that is, for use only within the library.

https://www.flickr.com/photos/skyrim/6518329775/

If I make them public, then the structure of my library is very clear to its clients. Testing is easy, because the public structures may be manipulated and replaced as the tests dictate. Inspection is easy, because all the data is exposed for clients to manipulate. Developers are happy when they can manipulate and test things easily. If I select "public" as the general rule, then developers using my library will be happy, because they'll be able to inspect and test everything quite easily whether I specifically designed in support for that or not.

However, now that they're public, I have to support them in their current form into the foreseeable future. Since I tend to maintain the libraries I work on, and maintenance necessarily means change, a large public API surface means a lot of ongoing changes to exposed functionality, which means a constant stream of deprecation warnings and deprecated feature removals. Without private implementation details, there's no axis on which I can change my software without deprecating older versions. Developers hate keeping up with deprecation warnings or having their applications break when a new version of a library comes out, so if I adopt "public" as the general rule, developers will be unhappy.

https://www.flickr.com/photos/lisasaunders18/14061397865

If I make them private, then the structure of my library is a lot easier to understand by developers, because the API surface is much smaller, and exposes only the minimum necessary to accomplish their tasks. Because the implementation details are private, when I maintain the library, I can add functionality "for free" and make internal changes without requiring any additional work from developers. Developers like it when you don't waste their time with trivia and make it easier to get to what they want right away, and they love getting stuff "for free", so if I adopt "private" as the general rule, developers will be happy.

However, now that they're private, there's potentially no way to access that functionality for unforeseen use-cases, and testing and inspection may be difficult unless the functionality in question was designed with an above-average level of care. Since most functionality is, by definition, designed with an average level of care, that means that there will inevitably be gaps in these meta-level tools until the functionality has already been in use for a while, which means that developers will need to report bugs and wait for new releases. Developers don't like waiting for the next release cycle to get access to functionality that they need to get work done right now, so if I adopt "private" as the general rule, developers will be unhappy.

Hmm.

Syndicated 2014-11-28 23:25:00 from Deciphering Glyph
