benad is currently certified at Apprentice level.

Name: Benoit Nadeau
Member since: 2002-04-03 17:23:09
Last Login: 2014-04-22 17:48:00


Homepage: http://benad.me/

Notes:

I am Benoit Nadeau, jr. eng. in Software Engineering,
living and working in Montreal, Canada.


Recent blog entries by benad


OpenSSL: My Heart is Bleeding

After a week, I think I can comfortably explain what happened with this "heartbleed" OpenSSL bug. Now, everybody makes mistakes. Especially programmers. Especially me. But at least my errors didn't create a major security hole in 20% of the Internet. Let's review some basic tenets of Software Engineering:

  1. All code (beyond a minimal size and complexity) has bugs. Less code and complexity (and functionality) means fewer bugs.
  2. Software should be made resilient against errors. If it can't recover, it should at least halt (crash).
  3. Software should be designed for humans, both its code and its user interface.

Out of hubris, excess and arrogance, the OpenSSL developers managed to do the opposite of all of these tenets. To quote Theo de Raadt:

OpenSSL is not developed by a responsible team.

Why? Let's do some investigation.

First, Robin Seggelmann had the idea of adding a completely unnecessary "heartbeat" feature to TLS. Looking at the protocol design alone, the simple fact that the size of the payload exists in two different places (TLS itself and Heartbeat) is pretty bad and begs for a security hole. Anyway, that's tenet one.

Still, Seggelmann went ahead and submitted working code a year later, on December 31st at 11:59 PM, the best possible time for a code review. Of course, the code was filled with non-descriptive variable names that hid the error in plain sight during an ineffective code review, but given the poor quality of the rest of the OpenSSL code, this was apparently deemed acceptable. That's tenet three.
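To make the flaw concrete, here is a minimal C sketch of it; the names and structures are illustrative simplifications, not the actual OpenSSL source. The response is built from a payload length that the sender itself claims, without checking that claim against what was actually received:

    /* Simplified sketch of the Heartbleed flaw; names and types are
     * illustrative, not taken from the actual OpenSSL source. */
    #include <stdlib.h>
    #include <string.h>

    struct heartbeat {
        const unsigned char *payload; /* payload bytes actually received */
        size_t claimed_length;        /* length field inside the message,
                                         fully controlled by the sender */
        size_t record_length;         /* how many bytes really arrived */
    };

    unsigned char *build_response(const struct heartbeat *hb)
    {
        /* The fix amounted to a bounds check like this, which was missing:
         * if (hb->claimed_length > hb->record_length) return NULL; */
        unsigned char *resp = malloc(hb->claimed_length);
        if (resp == NULL)
            return NULL;

        /* The bug: trusting claimed_length. If the sender claims 64 KB but
         * actually sent 1 byte, this copies up to ~64 KB of adjacent heap
         * memory (private keys, session cookies, ...) into the response
         * that gets echoed right back to the attacker. */
        memcpy(resp, hb->payload, hb->claimed_length);
        return resp;
    }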

At this point, you may ask: "Shouldn't most modern malloc implementations minimally protect software against buffer overflows and overreads?" If you did, you are correct. But then, years ago, OpenSSL implemented its own memory allocation scheme. If you try to revert it back to plain malloc, OpenSSL doesn't work anymore, because its code has bugs that depend on memory blocks being recycled in LIFO fashion. That's tenet two.
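Here is a toy sketch of why that matters; the allocator below is hypothetical, but it recycles blocks LIFO as described above. A use-after-free bug keeps reading valid-looking data from the recycled block instead of crashing, so the bug stays hidden until someone swaps in plain malloc:

    /* Toy LIFO freelist allocator; all names are hypothetical. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define POOL_BLOCKS 16
    #define BLOCK_SIZE  64

    static void *freelist[POOL_BLOCKS];
    static int top = 0;

    static void *pool_alloc(void) {
        if (top > 0)
            return freelist[--top]; /* hand back most recently freed block */
        return malloc(BLOCK_SIZE);
    }

    static void pool_free(void *p) {
        if (top < POOL_BLOCKS)
            freelist[top++] = p;    /* keep the block (and its contents) around */
        else
            free(p);
    }

    int main(void) {
        char *a = pool_alloc();
        strcpy(a, "secret session key");
        pool_free(a);

        /* With the LIFO pool, the next allocation returns the very same
         * block, so stale reads through 'a' still "work" and the latent
         * use-after-free goes unnoticed. With plain malloc(), the same
         * bug could crash or return garbage, which is why reverting
         * OpenSSL to malloc made it stop working. */
        char *b = pool_alloc();
        printf("recycled block still holds: %s\n", b);
        free(b);
        return 0;
    }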

The result is bad, and very, very real. In Canada, nearly a thousand Social Insurance Numbers were leaked. And that doesn't begin to count, or even imagine, how many private keys and how much private information leaked this way over the past two years.

By the way, this kind of mess has been my experience with cryptographic software in general. The usability problem with cryptography isn't just for end users; it extends to the code itself. Using single-letter variables may be acceptable in a mathematical context where each variable is described at length, but meaningless one-letter names without comments in code are not. While I don't mind such "math code" much in data compression, in security it makes the code less likely to be secure. Basically, everybody thinks that being smart is sufficient for writing good code, so of course they would be offended if a software engineer recommended writing the code from their specs instead of letting them do it themselves. No wonder the worst code always comes from university "careerists".

Personally, I'd stop using OpenSSL as soon as possible.

Syndicated 2014-04-15 01:29:07 from Benad's Blog

A Week with Android

Given my reputation as a "Mac Guy", one would expect me to be ignorant of other platforms. It's actually quite the opposite: at work, I am an experienced Windows and Linux developer. I've only recently started programming on iOS, so I thought it would be a good idea to also learn to develop on the Android platform. While not strictly necessary for software development, I bought an Android phone to get used to its environment and idioms. To get the best possible experience of the platform, I bought a Nexus 5, which I received by mail a little over a week ago. Already, I've observed a few differences with iOS beyond the typical "iOS versus Android" feature comparisons.

I quickly noticed how much "Android versus iOS" parallels "PC versus Mac" in terms of the software platform. Android is highly customizable, though it's unclear whether that was done for the users or to appease the manufacturers and carriers. Had I not gotten a phone directly from Google, I suspect that buying a typical non-Google, bundled-with-a-plan Android phone would have given me something filled with crapware I couldn't uninstall. Like a PC. That, and given how much havoc rogue Apps can wreak, I immediately installed anti-virus, firewall and backup software. Again, like a PC.

Then, there's the screen. It's spectacular. At 445 dpi, it may be the first screen I've ever used where I can't distinguish its pixels at any distance. Cramming a full 1080p screen into 5 inches is amazing. The colours are great too. Still, the screen is physically too large for casual one-handed use. It almost feels like a mini-tablet rather than a phone. Also, its large screen size has an impact on battery life.

Speaking of battery life: when the screen is off, battery life is spectacular. Sure, comparing any new device against the aging battery of a 30-month-old iPhone is unfair. But this Nexus 5 can be casually used for days before fully draining its battery. Oh, and induction charging is really nice too, compared to fighting with those asymmetrical micro-USB cables… Of course, battery life depends on well-behaved Apps. As an example, a runaway Songza App leaked a CPU spinlock, keeping the CPU fully powered for hours. Even in this extreme scenario, the battery life was still comparable to my iPhone's.

Speaking of music Apps, audio on the Nexus 5 sucks. Using the exact same high-quality headphones, I can tell that the headphone jack (or whatever else is in the audio chain) is significantly worse than the iPhone's. There are many Apps that add equalizers and bass boosters, but even then it still doesn't sound as good. Also, the volume controls on my headsets don't work. Well, for now all of my music is in the Apple ecosystem, so I don't mind using my SIM-card-less iPhone as an iPod.

I'll compare iOS and Android development once I'm more experienced in both. For now, I'll keep getting used to the platform, from both a user and a developer perspective.

Syndicated 2014-04-03 00:25:41 from Benad's Blog

Going Root

So, I've just changed the host for my web site. Not because I was unhappy with the excellent Fused, but because, well, I've outgrown shared hosting. If my web site is meant to project some kind of professional image, then at the very least I should own the entire software stack, and not be restricted to whatever versions of Apache or PHP the host decides to rent me.

A decade ago, having a web site meant choosing between shared hosting and "bare metal", but since then VPSes (virtual private servers) have become a viable option that sits in between. For the past few years I was put off VPSes because of the cost, using Amazon EC2 as a reference. Considering that I would have to maintain a complete server, paying $50 a month for the privilege of doing so was just too much for me.

Luckily, VPS pricing has become quite competitive, especially for a web site as simple as mine. So I tried out RamNode, and after a few weeks of use I trusted it enough to move my web site to it, saving a hundred dollars per year in the process.

Migrating to a Debian 7 environment was a bit of a challenge. My background, from my workplace experience, is mostly with Red Hat environments, so it took a little while to adapt to Debian. I really like how lean and fast a well-configured Linux server can be, and having automated security updates on a locked-down stable Debian environment is comforting. I still don't host any dynamic content on my site, but if I ever do, it will be my code, not some third-party PHP script that would become another attack vector.

In the end, nobody will notice, but I really like the idea that I now fully own "my web site", from the IP address to the domain name, down to the versions of the web server and operating system, short of the hardware. In a world where most people's online presence exists entirely through social networks that treat your content at their whim, owning a web site is the closest thing to complete free speech one can have. That, and it's a cool thing to add to my résumé.

Syndicated 2014-03-30 20:06:56 from Benad's Blog

The Syncing Problem

Conflicts Everywhere

For the past few months I've been using a simple iOS app to track the episodes I've seen in the TV shows I follow. One of its advertised features is its ability to synchronize itself across all of your iOS devices, in my case my iPhone and iPad. Well, that feature barely works. Series that I removed from my watch list would show up again the next time I launched the app. Or I would mark an episode as watched, and within a few seconds it would show up again.

I can't blame the developer: Apple promised that iOS Core Data integration with iCloud would make synchronization seamless, and yet until iOS 7 it would regularly and silently corrupt your data. Worse, its over-simplistic API hides the countless pitfalls of its use, and I suspect the developer of my simple TV show tracker fell into every single one of them.

Synchronizing stuff across devices isn't a particularly new problem. Personally, I've never experienced synchronizing a PDA with a dock and some proprietary software, but I have experienced a bit of the nightmare of synchronizing my old Motorola RAZR phone with iSync and various online address book services. It wasn't just that the various programs interpreted the address book structure differently, causing countless duplicated entries, but also that if I dared modify my address book from multiple devices before a sync, it would surely create conflicts that I would have to fix everywhere.

Nowadays, synchronization issues still loom over every "cloud-based" service, be it for note-taking like Evernote or seamless file synchronization like Dropbox and countless others. In every case, synchronization conflicts are delegated to an "exceptional scenario" for which the usability is horrendous: if you're lucky, elements are renamed; if not, data gets deleted. In almost every case, you'll have to dig up previous versions of your data, while keeping older versions of your data is considered a "premium" feature.

This is frustrating. Data synchronization across electronic devices is not new, and it is now commonplace. Even with a single client device, the server could use a highly scalable "NoSQL" database that can make the storage service out of sync with itself. Imagine how surprised I was when I learned that the database developer is responsible for handling document conflicts in CouchDB, and that those conflicts "naturally" happen if the database is replicated across multiple machines! The more distributed the data, be it on client-side devices or on the server side, the more synchronization conflicts will inevitably happen.

And yet, for several years, programmers have been using tools every day that solved this problem...

DVCS

Going back to the example of the TV show tracker, one of the problems with most synchronization systems is that they synchronize documents rather than the user actions themselves. If I marked a TV show episode as "watched", then, unless for some reason I marked it as "unwatched" on another device, the show tracker should never come up with the action of "unwatching" an episode.

Similarly, if on one device I mark an episode as "watched" and then revert it back to "unwatched", yet on another device that is unaware of those actions I completely remove the TV series, then the synchronization should prioritize the latter. Put another way, conflicts are often resolved by taking into account the state of the data on which each action was made.

In addition to storing the data model, each device should also track the sequence of actions done on it. Synchronization would value the user actions first, and then resolve those actions back into the data model. In effect, the devices should be tracking divergent user action histories.
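As a sketch of the idea, in C and with entirely hypothetical names, each device would append entries like these to a local action log, and synchronization would merge the logs before rebuilding the document state:

    /* Hypothetical sketch: log user actions, not resulting document state. */
    #include <time.h>

    enum action_type { MARK_WATCHED, MARK_UNWATCHED, REMOVE_SERIES };

    struct action {
        enum action_type type;
        int series_id;
        int episode_id;   /* ignored for REMOVE_SERIES */
        time_t when;      /* a per-device logical clock would be more robust */
        char device[16];  /* which device performed the action */
    };

    /* On sync, merge both devices' logs and resolve conflicts by rule;
     * for instance, removing a whole series trumps any episode-level
     * action made on it in ignorance of the removal. */
    int takes_priority(const struct action *a, const struct action *b)
    {
        return a->type == REMOVE_SERIES && a->series_id == b->series_id;
    }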

Yeah, that's what we call "Distributed Version Control Systems". Programmers have been using DVCS like git, Mercurial and many others for over a decade now. If the solution has been a part of our programming tools for so long, why are we still having problems with automated synchronization in our software?

It's a Plain Text World

Sadly, DVCS are made almost solely for plain-text programming source code. Not only that, but marking user actions as "commits" is carefully done manually, and so is resolving synchronization conflicts. Sure, they can support binary files, but those won't be merged like text files.

As for how DVCS treat a group of files, they went to the complete opposite of older version control systems. In the past, each file had its own set of versions. This is an issue with source code, since files usually have interdependencies that the version control system isn't aware of. Now, with DVCS, a "version" is the state of all the files in the "repository", so changing two different files on two different devices requires user intervention, as this could cause a conflict from the point of view of those implicit interdependencies. Sure, you can make each file its own repository with clever use of "sub-repositories", but the overhead of doing so in most DVCS makes this impractical.

For individual text files, DVCS (and most VCS in general) treat them as a sequence of text lines. A line of text is considered atomic, and the deltas computed between two text files will tend to favour large chunks of sequential line modifications over individual changes. While this may work fine for most programming languages, a few cases cause issues. As an example, in C it is common practice to indent lines with space characters to make the code more readable, and to increase the indentation for each level of nesting. Changing the code a little bit can affect the nesting level of some code block and, by good practice, its indentation level. The result, from a DVCS perspective, is that the entire code block changed, because every line had space characters prepended, even though the semantics of that block are identical.
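For example (an illustrative C fragment), wrapping an existing block in a new condition reindents every line of it, so a line-based diff reports the whole block as changed even though its behaviour is identical:

    /* Before: */
    int total = 0;
    for (int i = 0; i < n; i++)
        total += values[i];

    /* After wrapping the same code in a condition, every line gains one
     * level of indentation, so a line-based diff sees all of these lines
     * as changed even though the summing logic is untouched: */
    if (feature_enabled) {
        int total = 0;
        for (int i = 0; i < n; i++)
            total += values[i];
    }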

Basically, DVCS are really made only for plain text documents. And yet, with some effort, some of their underlying design could be used to build a generic and automated synchronization system.

The Long and Snowy Road

Having a version control system on data structures made by software, rather than on plain text files made by humans, isn't particularly new. Microsoft Word does it internally. Photoshop also, through its "infinite undo" system.

Computing the difference between two file structures is surely something that was solved ages ago by computer scientists. In the end, if each file is just a combination of well-known structures like lists, sets, keyed sets and so on, then one could easily make a generic file difference engine. Plain text files, viewed as simple lists of lines, or even, to a certain extent, binary files, viewed as lists of bytes, could easily fit in such an engine.
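As a sketch of what such an engine's data model could look like (entirely hypothetical), every file becomes a tree built from a few well-known container kinds, and the difference engine only needs one recursive comparison per kind:

    /* Hypothetical value model for a generic file difference engine. */
    #include <stddef.h>

    enum node_kind { NODE_LIST, NODE_SET, NODE_MAP, NODE_BLOB };

    struct node {
        enum node_kind kind;
        union {
            struct { struct node **items; size_t count; } list; /* ordered   */
            struct { struct node **items; size_t count; } set;  /* unordered */
            struct { const char **keys;                         /* keyed set */
                     struct node **vals; size_t count; } map;
            struct { const unsigned char *bytes; size_t len; } blob; /* leaf */
        } u;
    };

    /* A plain-text file maps to a NODE_LIST of NODE_BLOB lines, a binary
     * file to a single NODE_BLOB, and a structured document to any mix. */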

There would be a great effort required to change any existing "programmer's DVCS" into a generic synchronization engine. Everything in them is oriented towards user interaction: source code text files, deltas represented as the output of a diff command, a single history graph for all the files in the repository, and so on. But going the other way, a generic DVCS could be used as the basis for a programmer's DVCS without too much issue. It would even be quite amazing for it to version control not only the source code, but also the source annotated with compiler output, so that the conflict resolution system would deal with the semantics of the source code and not just plain text.

There are a few novel issues specific to device synchronization that would affect a generic DVCS, for example having a mechanism to safely prune old data history that is no longer needed because all devices are up to date (to a known point). In fact, unlike existing DVCS, maybe not all devices need to support peer-to-peer synchronization.

But implementing a generic DVCS for automated synchronization does feel like reinventing the wheel. At minimum, things that are already implemented in today's DVCS would have to be redone in almost the same way, but with a few important changes to make them work on generic data structures.

On top of that, I have the nagging feeling somebody, somewhere already did this. Or maybe some asshole patented it, and then nobody can implement it anymore for fear of being sued into oblivion. This isn't that novel an idea anyway (so it shouldn't be patentable). Many have already considered backing up their entire computers with version control meant for source code, and while impractical, it kind of worked. The extra step of integrating that into a software's data model serialization isn't a big mental leap either. Apple could have done it with Core Data on iCloud, but, like everybody else, they treated synchronization conflicts as an afterthought.

Right now, if I had to write some software that synchronizes data with other devices, there would be no generic DVCS solution I could use, so I would surely do my best to ignore the hard problem of data conflicts until enough users complain. Or maybe by then I'll get fed up and write my own generic DVCS, if nobody else does it first.

Note to self: I may want to post this as an article on my web site.

Syndicated 2014-02-17 00:00:49 from Benad's Blog

Back to Objective-C

In lieu of the usual annual resolutions, this year I used some of those vacation days to plan what programming languages I will learn in the coming months. In the past few years, I learned Python, Erlang and D, but sadly I couldn't come up with any excuse to use them in a programming project, be it at home or at work.

A factor that I glossed over when choosing a programming language is its usefulness in the "work market", be it for my current work, seeking a new job, or even self-employment. Another factor is what non-programmers expect of me. And, of course, since I'm a user of many Apple products, people wonder when they'll see my iPhone App.

I did learn a little bit of Objective-C about a decade ago, in the early days of Mac OS X. I did some very basic "Cocoa" desktop development, though back then the programming environment for it was more NeXTSTEP's than Apple's. Since then, I focused so much on UNIX, Linux and server-side development that it felt weird to be using a Mac while having forgotten how to write desktop software for it.

Unlike with Erlang and D, there are surely tons of books on iOS development, and a few on Mac development. I will have a shock using the latest Xcode, but I can deal with that. While Mac development is unrestricted, my greatest worry is about iOS, with its restrictive API and closed App Store. Do I need to have a corporation to publish an App? Will it be restricted to the Canadian store? If I want to place ads in the App, how do I deal with the revenue?

Of course, I could just do stuff "in-house", be it with their method of deploying Apps for testing purposes, or by using a jailbroken device. That might be preferable in my case, since I'm not doing this to make money from some "crazy idea" but simply for practice. Still, there are some fees ($100 a year?).

But there is the greater issue that developing for iOS feels like developing for the Java virtual machine, with none of the benefits of a virtual machine. I've always liked the idea of developing as close to the OS kernel as possible, and not being at the mercy of some restrictive API. Over the years, I became closer to a Linux developer than to a Windows, Mac, Android or iOS one. In an ideal world, I guess I'd be an open-source developer.

Still, I think it's a good time to jump back into Mac and iOS development. The Mac Cocoa APIs have surely matured over the past decade, without the cruft of the "Carbon" API from its Pascal days or the cross-compilation compatibility with PowerPC. As for iOS, the new visual style of iOS 7 is perfect for a drawing-inept developer like me, as all I need is good, clean spacing of tasteful fonts in hand-picked colours, and not the high-resolution pixelated Photoshop mess of the past few years.

As for what software I'll write, I don't know. Very likely it will be a wholly unoriginal idea that I will build for myself first, because the existing solutions either do it poorly or not to my taste. Then I'll release it for free and add it to a proper "software portfolio". Compared to the crazy challenges I've had in the past few years, iOS development shouldn't be that difficult…

Syndicated 2014-01-05 19:48:27 from Benad's Blog


benad certified others as follows:

  • benad certified benad as Journeyer
  • benad certified llasram as Journeyer
  • benad certified shlomif as Journeyer

Others have certified benad as follows:

  • benad certified benad as Journeyer
  • llasram certified benad as Journeyer
  • pasky certified benad as Apprentice


