Older blog entries for nbm (starting at number 107)

Google I/O: Underneath the covers at Google

This session deserves a much longer post, but I just wanted to put down the most interesting stuff quickly.  Basically, it was a back-end developer's guide to how Google is put together - from how a request that someone makes in a browser gets a response, to how those responses are put together from multiple sources, and how those sources are built up.

Everyone knows Google's love of lots of commodity hardware for their servers, but it was interesting to hear some other things: they use reasonably low-end networking gear too, and they're back where they started in terms of machines without cases shoved into in-house-designed racks.  The scale has changed dramatically, of course.

"If you have 10k servers, expect to lose 10 a day..."

GFS's masters run on the same server hardware as slaves, and take part in master election like any other machine.  Google puts "millions" of pages together in a GFS "file", since it uses 64MB chunks.  200+ clusters, many of them 1000s of machines, pools of 1000s of clients.  4+PB filesystems, 40GB/s read/write load (even while HW is failing constantly).
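To get a feel for these numbers, here is a rough back-of-the-envelope sketch.  It only does arithmetic on the figures quoted above (64MB chunks, 10 failures a day per 10k servers); the 10KB average page size and the function names are my own assumptions for illustration, not something from the session.

```python
CHUNK_SIZE = 64 * 1024 ** 2          # 64MB chunk size, as quoted in the session
FAILURES_PER_DAY_PER_10K = 10        # "If you have 10k servers, expect to lose 10 a day"


def chunks_needed(file_bytes):
    """Number of fixed-size chunks needed to hold a file (rounded up)."""
    return -(-file_bytes // CHUNK_SIZE)


def expected_daily_failures(servers):
    """Scale the quoted failure rate linearly to a cluster of a given size."""
    return servers * FAILURES_PER_DAY_PER_10K / 10_000


# Assumption: an "average" page of 10KB (a made-up figure, purely illustrative).
ASSUMED_PAGE_BYTES = 10 * 1024

if __name__ == "__main__":
    one_million_pages = 1_000_000 * ASSUMED_PAGE_BYTES
    print(chunks_needed(one_million_pages), "chunks for a million assumed-size pages")
    print(expected_daily_failures(1_000), "expected failures a day in a 1000-machine cluster")
```

Even with a made-up page size, the point stands: a single "file" of millions of pages is only a few hundred chunks, and a 1000-machine cluster should expect to lose about a machine a day.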

MapReduce usage within Google is growing fast - 700 new applications in a recent month at peak, currently around 10k applications.  From 171k MapReduce jobs in March 2006 to 2.2 million jobs in September 2007.  MapReduce is very optimised to keep jobs near the data they need to conserve precious network speed within the datacentre.
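The "keep jobs near the data" idea can be sketched as a toy scheduler.  This is not Google's scheduler or the MapReduce API, just a minimal greedy illustration under my own assumptions: each chunk has a set of machines holding a replica, and a task prefers an idle machine that already holds its chunk so nothing has to cross the network.  All names here are hypothetical.

```python
def assign_tasks(chunk_locations, idle_workers):
    """Greedily assign one task per chunk, preferring workers that hold the chunk.

    chunk_locations: dict mapping chunk id -> set of machines holding a replica
    idle_workers:    list of machines currently free to run a task
    Returns a dict mapping chunk id -> (worker, is_local).
    """
    free = list(idle_workers)
    assignment = {}
    for chunk, holders in chunk_locations.items():
        if not free:
            break  # no idle workers left; remaining chunks wait for the next round
        local = [worker for worker in free if worker in holders]
        # Prefer a worker with a local replica; otherwise fall back to any idle one.
        worker = local[0] if local else free[0]
        free.remove(worker)
        assignment[chunk] = (worker, worker in holders)
    return assignment


if __name__ == "__main__":
    locations = {"c1": {"m1", "m2"}, "c2": {"m2"}, "c3": {"m4"}}
    for chunk, (worker, is_local) in assign_tasks(locations, ["m1", "m2", "m3"]).items():
        print(chunk, "->", worker, "(local read)" if is_local else "(network read)")
```

In this example only the third chunk needs a network read, because the one machine holding it is not idle.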

Google still has one large shared source base(!), from low-level libraries used by anything to domain-specific libraries to applications.  Benefits are that it's easy to find examples of usage of something so you can use it correctly, and to reuse (ie, as a library).  Drawbacks being that such reuse causes some fairly tangled dependencies.

Language usage at Google: C++ for all high-performance, commonly-accessed web stuff.  Java is used for less-performance-oriented and/or lower-volume applications.  Python is used behind the scenes for things like configuration, administration, &c.

Syndicated 2008-05-29 01:39:39 from Cosmic Seriosity Balance

Announcements from the Google I/O keynote, Google App Engine opens signups

Some interesting news was delivered during the Google I/O keynote.

In terms of Google App Engine, the announcement that got the biggest applause was that it was now open to all signups - no waiting list and a few tens of thousands of developers.

Beyond that, two new APIs were announced - the memcache API and the Image API.

Some expected pricing was given for usage beyond the free quota:

  • CPU: 5 million "average" page views free, 10-12c per core-hour thereafter
  • Storage: 500MB free, 15-18c per GB-month thereafter.
  • Incoming traffic: 5 million "average" page views, 11-13c/GB thereafter
  • Outgoing traffic: 5 million "average" page views, 9-11c/GB thereafter
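As a rough sketch of what those ranges could mean for a bill, the snippet below multiplies usage beyond the free quota by the low and high ends of each quoted rate.  The rates come from the list above; the usage figures, the dictionary keys and the function name are hypothetical.  Since the free CPU and traffic quotas were expressed in "average" page views rather than in core-hours or GB, the caller has to supply the billable amounts directly.

```python
# (low, high) rates in US cents, taken from the keynote figures above.
RATES_CENTS = {
    "cpu_core_hours": (10, 12),      # per core-hour
    "storage_gb_months": (15, 18),   # per GB-month
    "incoming_gb": (11, 13),         # per GB
    "outgoing_gb": (9, 11),          # per GB
}


def estimate_cost_cents(billable_usage):
    """Return the (low, high) monthly cost in cents for usage beyond the free quota."""
    low = sum(RATES_CENTS[name][0] * amount for name, amount in billable_usage.items())
    high = sum(RATES_CENTS[name][1] * amount for name, amount in billable_usage.items())
    return low, high


if __name__ == "__main__":
    # Hypothetical month: all amounts are already net of the free quota.
    usage = {
        "cpu_core_hours": 100,
        "storage_gb_months": 2,
        "incoming_gb": 10,
        "outgoing_gb": 20,
    }
    low, high = estimate_cost_cents(usage)
    print(f"Estimated bill: ${low / 100:.2f} to ${high / 100:.2f}")
```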

The Google Web Toolkit 1.5 release candidate was released today, which brings Java 5 language features.

In terms of OpenSocial, the 0.8 version of the specification was released yesterday, and AOL has joined the OpenSocial initiative.

Syndicated 2008-05-28 22:19:54 from Cosmic Seriosity Balance

Google I/O keynote - moving the web forward

The Google I/O keynote was entitled Client, Connectivity, and the Cloud.  The message was that "Google cares about moving the web forward".

The obvious question is why, and four reasons were given:

  • Google is a company that has only existed and could only have existed because of the web.  Thus, moving the web forward is improving what they can deliver.
  • It's a virtuous cycle - richer web apps can reach more users, and this means more usage, and this means more revenue.
  • More softly, Google is a company of the web generation, and consensus and partnership are how the web was made.
  • Again softly, Google feels a debt of gratitude to the open source and web communities.

How will they move the web forward (and, perhaps, how should the web be moved forward)?  A history of the challenges and benefits of past and current models of computing was given:

The mainframe had a lot of power (for the time), but it was not easy to get your hands on one.  Deployment was easy, since you installed your software on one computer and used dumb terminals to use it.

The PC meant accessibility, but meant less power.  It lost the ease of deployment because you had to support a variety of hardware, operating systems, and applications/libraries.

The web brought the return of easy deployment with deployment on servers you control, and the use of the (relatively) dumb browser.  However, supporting scale in your application means needing "cloud computing" (perhaps a stretch for most current applications, but certainly true if you're aiming high), and the "accessibility" of clouds is currently not that great.

So, how should it move forward?

  • Making the client more powerful
  • Keeping connectivity pervasive
  • Making the cloud (ie, resources/power) more accessible 

In terms of Client, Google Gears was explained and demonstrated.  Basically, Gears is about extending the current browser to enable richer applications.  It's not just offline/cacheability - Allen Hurff from MySpace showed off how asynchronous threads, a SQL database and full-text search let their message view offer sorting and (more impressively) search without leaving the client.  The speed possible with this isn't what we expect from the web today.
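To illustrate the idea of sorting and searching messages in a local database rather than going back to the server, here is a small analogy in Python.  It is not the Gears API (Gears was used from JavaScript in the browser) and it is not MySpace's code: it uses Python's standard sqlite3 module with an in-memory database, and a plain LIKE query stands in for real full-text search.  The table layout and sample messages are made up.

```python
import sqlite3


def build_local_store(messages):
    """Load (sender, subject, body) tuples into an in-memory SQLite database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE messages ("
        "id INTEGER PRIMARY KEY, sender TEXT, subject TEXT, body TEXT)"
    )
    db.executemany(
        "INSERT INTO messages (sender, subject, body) VALUES (?, ?, ?)", messages
    )
    return db


def subjects_sorted_by_sender(db):
    """Sort entirely on the client side, with no server round trip."""
    rows = db.execute("SELECT subject FROM messages ORDER BY sender, id")
    return [row[0] for row in rows]


def search(db, term):
    """Substring search over subject and body (a stand-in for full-text search)."""
    pattern = f"%{term}%"
    rows = db.execute(
        "SELECT subject FROM messages "
        "WHERE subject LIKE ? OR body LIKE ? ORDER BY id",
        (pattern, pattern),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    store = build_local_store([
        ("zoe", "Lunch?", "Are you free on Friday"),
        ("adam", "Gig tonight", "Doors open at eight"),
        ("mia", "Re: Lunch?", "Friday works"),
    ])
    print(subjects_sorted_by_sender(store))
    print(search(store, "friday"))
```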

In terms of Connectivity, Android was brought out.  What wasn't obvious to me before is that Android is a full stack for mobile phones: not just an OS, but framework, libraries, and so forth.  The example wasn't all that impressive, since only one phone was used, but the idea of multiple phones with different abilities having a consistent set of applications and interaction is compelling.

The Cloud discussion was perhaps of most interest to me, since it discussed Google App Engine.  It was described as a way of making aspects of the Google infrastructure available to developers (ie, not just "machines").  It is supposed to take care of all the problems you have outside of your own application - ie, not having to worry about setting up machines, installing the OS, maintaining the OS and applying security updates, logs, monitoring, and so forth.

The key goals of Google App Engine are to make it easy to develop for, easy to scale, and free to start with.

Beyond or within these three main ways forward, three other projects were given attention: Google's GData APIs, Google Web Toolkit, and OpenSocial.

Syndicated 2008-05-28 22:06:40 from Cosmic Seriosity Balance

In Sunny San Francisco, at Google I/O

I won't say anything about the trip up, but I've been in San Francisco since Sunday afternoon (local time).  Monday was Memorial Day and the members of the SynthaSite team at the time I joined in November last year decided that we should get together for brunch, and then we headed out to see some sights around the area, taking a quick trip over the Golden Gate bridge and into Sausalito.

It was an overcast day, but I did get a glimpse of how beautiful the city can be.

Tuesday I pretended I had the capacity to do work and visited the SynthaSite offices in San Francisco.  Didn't get much done, but felt the office here in San Francisco shared a similar vibe to the one in Cape Town (although open plan feels weird after being in a three/four-person room for so long).

I've made it to Google I/O, which is way bigger than I expected it to be (and, according to one of the shirted staff-members, more than they originally expected too).  Hundreds of people were registering when I arrived, and by the time I got to the front of the A-B table (which was a good 75 people long), the A-B queue was longer than when I arrived.

Will try write something after each session, assuming the wireless works better than it does now...

Syndicated 2008-05-28 21:35:42 from Cosmic Seriosity Balance

In San Francisco next week, Google I/O, Sebastopol Pylons/WSGI Sprint

On Saturday, I'm heading off to San Francisco to attend Google I/O and also spend some time with my colleagues at SynthaSite in our US office.  Of most interest at the conference (at least in my personal capacity) is Google App Engine, but pretty much everything sounds interesting (with GWT being the big exception), and I can just imagine that making the decisions on what sessions to attend will be hard to do.  (And, you know, I guess I'm supposed to keep an eye out for things that might be useful to the company, or something...)

Over the weekend, I'll hopefully be heading to Sebastopol (in California Wine Country) for the Pylons/WSGI Sprint being held at O'Reilly Media's headquarters there.  There's two days of sprints, and I'm hoping to be there for most of both days - but it depends on travel arrangements.  If I get the time, I hope I can pop out and see a bit of the surrounding country and maybe one or two of those "places of interest".

In between the gatherings and travel, and before I head back, I'll spend time at the SynthaSite offices, doing what I'd generally be doing in Cape Town, but with better connectivity and less rainy cold winter.

If you want to catch me while I'm in San Francisco (or in London for the half-day I'll be there on the trip back) send me an email or leave a comment.

Syndicated 2008-05-22 23:03:13 from Cosmic Seriosity Balance

CTPUG in the Global Python Sprint weekend

On Saturday (May 10th) the Cape Town Python User Group held a Python Sprint meeting as part of the Global Python Sprint weekend.  8 or so of us got together on and off from 10:30am until about 9:30pm at the SynthaSite offices around a table and worked through 10 or so issues in the Python issue database.

Thanks to The Other Neil and Simon for most of the organisation effort, and to them and Adrianna, Russell, Jonathan, Jeremy, Brad, and David for coming through and taking part.

And thanks to SynthaSite for coffee, coke, crisps, chocolates, and other goodies.

According to The Other Neil, we worked on:

Syndicated 2008-05-12 15:28:46 from Cosmic Seriosity Balance

A team apart

For about two weeks, ending about two weeks ago, we had a full house of current employees at the SynthaSite offices in Cape Town - which allowed everyone to get to know everyone else both at work and at play.  Over the past two weeks and continuing for another week or so, people have been heading back to the US office or heading to work from there for the foreseeable future.

The time together was great and necessary, and the time apart is necessary also, but it's hard to not want to see my new and old friends at the office.  The offices feel too quiet (although we've got new friends starting next week).

It is early days yet, but I know from previous experience how distance can allow one to treat people unfairly - it is easier to disappoint and easier to pretend to forget and easier to believe that the other is being stupid or lazy when you don't see each other regularly.  Yes, even geeks.

I'm quite interested in the challenge of making this not happen, and I'm hoping to see how our experiments in project management and communication and structure turn out.

I identified tools, process and people as our main strengths that will help us get through this new period, and then realised they were also our greatest challenges.  It's amazing how much your outlook can affect how you feel about a prospect like this.  If you start out, like I did, with "We've always been good with tools, but...", it leaves you feeling like you're entering a big unknown without much help.  But if you say "This might mean having to retool somewhat, but we've learned a lot about getting tools right", it makes you feel up for the fight.

I'll try write up my observations as they happen - although this recent three week break wasn't for lack of things to write but more for lack of the energy to write.  (I'll try catch up, but no promises...)

Syndicated 2008-04-30 08:56:55 from Cosmic Seriosity Balance

Traffic accounting with ulogd, by Stefano

When I first started at the Bandwidth Barn, the traffic accounting that such an environment required just wasn't available off-the-shelf or in the open source world.  I've often been asked for the hacked-together combination of scripts and pmacct that maintains the Bandwidth Barn traffic system - which includes "buying" more monthly traffic, setting traffic limits per month per person, up-to-date graphs of usage per protocol and per client available to each company in the Barn, and months of historical data in case of queries or complaints about the billing.

Looks like ulogd, some iptables rules, and a few simple cronned SQL scripts make this a lot easier these days, thanks to this post about ulogd for bandwidth accounting by Stefano.
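The core of that kind of accounting is just summing bytes per client per month and comparing against a limit.  As a minimal sketch of that step only (it doesn't touch ulogd, iptables or pmacct, and isn't Stefano's or the Barn's actual code), assuming the logged traffic has already been reduced to hypothetical (client, month, bytes) records:

```python
from collections import defaultdict


def monthly_usage(records):
    """Sum bytes per (client, month) from an iterable of (client, month, bytes)."""
    totals = defaultdict(int)
    for client, month, nbytes in records:
        totals[(client, month)] += nbytes
    return dict(totals)


def over_limit(totals, limits, month):
    """Return the sorted clients whose usage in `month` exceeds their limit.

    Clients without an entry in `limits` are treated as unlimited.
    """
    return sorted(
        client
        for (client, used_month), used in totals.items()
        if used_month == month and used > limits.get(client, float("inf"))
    )


if __name__ == "__main__":
    # Made-up records; in practice these would come from the logging database.
    records = [
        ("acme", "2008-04", 700),
        ("acme", "2008-04", 400),
        ("bolt", "2008-04", 300),
        ("acme", "2008-03", 5000),
    ]
    totals = monthly_usage(records)
    print(over_limit(totals, {"acme": 1000, "bolt": 1000}, "2008-04"))
```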

Syndicated 2008-04-07 13:49:14 from Cosmic Seriosity Balance

SynthaSite planning week and boat trip

Pictures from the boat trip

This past week at SynthaSite has been the first with the full newly-expanded international team together in the Cape Town office. This has been an opportunity to get to know the new hires and for everyone to come together with their ideas and come up with goals, plans, and specifications. Which meant a week with at least one meeting going on at any one time.

A big potential challenge with new hires, especially in management and other senior positions, is balancing their ability to contribute new things to your existing team against the risk of getting swept away with those things and hurting the common thread in your team. I must admit that I was a little worried about the decisions being made in meetings I wasn't a part of. This is a bad habit I've picked up over the years, and despite all indicators to the contrary and belief in those involved in the meetings, I couldn't entirely shake it.

On Thursday, the outcomes from the various meetings over the past few days were presented to the whole team. The most striking part of the meeting to me was how those who weren't in the earlier meetings were able to accurately predict the long-term and short-term goals and features and markets and so forth that were presented. The next most striking was how flexible and accepting those who'd spent hours in meetings to come up with these outcomes were of additions and removals from what they presented.

That was a perfect precursor to our reward for the week's work and a celebration of meeting a few internal targets in the last month — a boat trip out from the Cape Town waterfront on Friday afternoon.

Such a trip does have the potential to be a disaster — making a bunch of people wet and cold, forcing them to maintain their balance and their stomach, and otherwise messing with people isn't the best setting if there are issues between your people or if there's nothing binding them already. We did have new hires, after all.

But our new hires are much like the rest of us. No suits or fancy clothes when we're all office-bound. Shoes are optional. But when it comes to work, serious. More experienced than most of us, and older than most of us, but with the same youthful excitement and wonder for the space we're in and what we're doing. They're also just nice people — I've enjoyed watching every possible combination of new and old employee having multiple one-on-one conversations over the past week.

So, no disaster.

The boat trip itself was a lot of fun for me, despite getting absolutely soaked and nearly falling overboard a few times. I guess one has to do it to understand how that can be enjoyable, since I can't think of much to say in explanation. We ended up cutting the trip a bit short to avoid the setting sun and the ensuing cold and to rather have a warm supper in a warm restaurant. The review and exchange of photographs meant many laughs all around, and the shared adventure meant ample topic for discussion.

Syndicated 2008-04-06 16:48:00 from Cosmic Seriosity Balance

SynthaSite release 2.2.3

About a week ago, we released the latest iteration of SynthaSite.

We had a pretty tough iteration compared to usual - probably the biggest stumbling block being people in the US attending conferences, seeing people, and the travel and recuperation time around that.  We managed to do some pretty cool stuff with those of us who had more stable availability (aka being left behind), which makes me very excited about what we can achieve when we're working full steam ahead.

Probably the biggest wins, as expressed by our users, were around styles - we added 22, and also enabled a whole bunch of them to have customisable banners.  Behind the scenes, I was pleasantly surprised to see initiative was taken in that we now have some tools to speed up these processes.

We've also majorly beefed up our support materials - we have a bunch of new tutorials that are easily available within our site builder, and a number of other goodies.

I've been enjoying watching our support systems grow over the past 3-6 weeks - we're starting to see support regulars helping others as well as an increasing proportion of support queries beyond the standard tool familiarisation ones.

Two or three rare, but long-standing, bugs have also been squashed, which has made a few of our users who had the right combination of factors very happy.

With my "process enablement" cap on, it seems that we've now grown confident in our release and update processes after employing them on the past few iterations.  That means a lot less stress for everyone involved, and I even think everyone actually is starting to perhaps even sometimes enjoy the QA period - discovering how everything comes together, saving us from face-palming, and so forth.

Syndicated 2008-04-03 22:35:01 from Cosmic Seriosity Balance
