Recent blog entries for badger

Digging Hoe

For those of you unsure whether to laugh about this at FUDCon, just a link to show these really exist, despite the name.

Dear lazyweb, is this possible with AMQP?

Producer sends messages to the server. Messages are not lost if the server is restarted. Messages are not lost if the producer is sending a message and the server is down.

Multiple consumers can connect to the server and receive all of the messages. Some of the consumers will be connected almost all the time. Other consumers will poll (they will be fired off by cron jobs). In both cases, the consumers should receive all of the messages that the producer has sent.

Terminology like "fanout" and "durable queues" seems to be where I need to look but I'm not sure if they're really the same concepts.
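One common reading of these requirements in AMQP terms is one durable queue per consumer, each bound to a fanout exchange, so that every queue gets its own copy of each message and holds it until that consumer shows up. The shell sketch below is only a toy model of that idea: plain files stand in for the queues, and every name in it (`SPOOL`, `bind_queue`, `publish`, `consume`, the consumer names) is made up for illustration. It does not speak AMQP, and it does not cover the case where the producer cannot reach the server at all, which would need buffering on the producer side.

```shell
#!/bin/sh
# Toy model only: files stand in for a broker's durable queues and a loop
# stands in for a fanout exchange. Nothing here talks to an AMQP server.
SPOOL=$(mktemp -d)

# "Declare" a queue for one consumer; the file persists between runs.
bind_queue() {
    : >> "$SPOOL/$1.queue"
}

# "Fanout": append a copy of the message to every bound queue.
publish() {
    for q in "$SPOOL"/*.queue ; do
        [ -e "$q" ] || continue
        printf '%s\n' "$1" >> "$q"
    done
}

# A consumer drains its own queue whenever it connects.
consume() {
    cat "$SPOOL/$1.queue"
    : > "$SPOOL/$1.queue"
}

bind_queue daemon     # stands in for a consumer that is almost always connected
bind_queue cronjob    # stands in for a consumer fired off by cron

publish 'build finished'
publish 'package updated'
consume daemon        # prints the two messages sent so far
publish 'mirror synced'
consume cronjob       # prints all three, even though it connected late
```

The point of the model is that each consumer has its own queue, so a slow or absent consumer never causes another consumer to miss a message.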

FUDCon Live!

Just a quick FUDCon note: We are trying to make it easier for people who are not able to make it to FUDCon itself to see some of the things going on and get some value from the sessions. Check out the FUDCon Live wiki page for a list of sessions, when they're happening, and logs from finished sessions.

Looking at the logs, you may notice that they're logs from IRC. This is because we have people transcribing highlights from all the sessions into IRC channels as they're happening. This is an opportunity for people not just to follow along at home but also to ask questions and join in with the conference sessions.

How to join the Fedora Live IRC channels?

  1. First, view the FUDCon Live schedule to see what sessions are currently taking place.
  2. Then use your IRC client to go to irc.freenode.net
  3. Third, join the #fudcon-room-[NUMBER] channel that corresponds to the room number that the session is being held in.
  4. Sit back, read, and ask questions!

Here's an example:

I'm interested in the Sysadmin & Developer Panel. I see that it is in Room 7 on the wiki schedule. I open up konversation and go to irc.freenode.net as the server. Then I /join #fudcon-room-7 and participate in the conversation that's going on.

FUDBus

I was afraid that the FUDBus to Toronto FUDCon would be a bust when I found out that our planned wifi and AC power was not to be but it turned out to be quite productive. I spent most of the day talking to Dave Malcolm about getting Python 3 Packaging Guidelines and packages into Fedora. We've simplified a bit of it, clarified other bits, and the only really difficult thing left is figuring out whether to allow one srpm to handle modules that build both python2 and python3 subpackages or to require separate srpms for python2 and python3. There are pros and cons to each that we'll have to weigh against each other before we can settle on a solution. We'll keep thinking about this and hopefully have something finished by the end of FUDCon.

I also got to talk to mizmo about how to become better at designing user interfaces with inkscape. By happy chance, she's doing a presentation tomorrow on just that topic and gave me a sneak peek at the slides she prepared. So as not to spoil her presentation, all I'll say for now is that I think it's a *very* useful presentation. Mo showed both a broad overview of making mockups in inkscape and the key specific features that she uses. Much like mmcgrath's presentation last year about how system admins in infrastructure have handled problems, this is a presentation that will help you on multiple levels. It will give you skills to design UIs and better work with inkscape to create artwork. It will also show you something of the process and thinking of a UI designer, an often underappreciated but hugely important part of our work.

Other bus topics: A bit of discussion about replacing cvs with a new version control system, updating packagedb and cvs acls to allow a comaintainer-only packager group, and arranging with lmacken to do a TG2 quickstart with csrf and fas auth plugins sometime during the hackfest. If I can get that to work, I'll be able to write a guide that tells others how to use these pieces of python-fedora in their TG2 applications.

Before Language

Patter patter swish is a shower running in the morning
Tchic-tchic, tchic-tchic is a car that won't turn over in the cold
Crunch crunch crunch is you and I walking through frost sheathed grass.

-Toshio Kuratomi, Dec 2009

Doing what we do Best

By chance, I happened to see a conversation on #fedora-latam today about whether changes are needed in how Fedora is presented in Latin America. It was interesting (even if google translate couldn't do it justice) and left me thinking that there will be some contentious discussions in the near future but the latam ambassadors are doing good work to break some new areas for reaching new contributors.

Some of the issues raised:

  • What contributions make someone a valuable contributor?
  • What activities can grow the contributor base?
  • How to measure and grow the active contributors?
  • How to work together as a latam group instead of individual communities in each country?

Bearing in mind that I'm not a Fedora Ambassador and not a member of the Latin American community, I'd like to contribute some thoughts to this.

What contributions are valuable

All constructive contributions are valuable.

Coders, packagers, and release engineers have always been valued in Fedora. However, a good number of us in Fedora are aware that there are other forms of contribution and those forms are just as important to cultivate. Documentation writers, designers, artists, translators, planners, end user support, teachers, etc. The trick is figuring out how to fit the special talents that someone has with a role that they can fill in Fedora.

Are some contributions more valuable than others? Yes. But it's not necessarily the contributions that we normally think of. It's important to any Linux distribution to have packagers, for instance, but most Linux distributions already have attracted a large number of those. Teachers and UI designers (in my Fedora experience) have been in short supply. That means that even though a Linux distribution could not survive without any packagers but could survive without any UI designers, attracting one additional UI designer may be more valuable than attracting another packager.

How to grow the contributor base?

With the understanding that we do need a varied contributor base, the ways that we grow and nurture those contributors change. I think it's fairly common for free software developers to think of the process of contributing purely through their own experience. First, they were a computer user. Then they were a free software user. Then they became a free software coder. Or from computer user to system admin to Linux packager to software coder. The danger in this unspoken assumption is that not everyone has the desire to become a software coder in the end even if they have the desire to contribute to the free software community.

I think that one of the challenges that the Fedora Latin American community needs to address is to identify the steps designers, teachers, and other non-coders take as they become more and more involved in the project. Step by step:

  1. What prompted them to try Fedora?
  2. What kept them using it after the initial use?
  3. What got them involved in the Fedora Community as opposed to just being a Fedora user?
  4. What roles have they stepped into since they first became involved in Fedora?
  5. What roles do they want to fill eventually?

Answering these questions helps us to understand what motivates other new contributors and therefore become better at nurturing them as they grow as a Fedora contributor. For instance, let's say we had these answers (note, I'm making this story up; find some real stories for some real answers):

My teacher showed me Fedora in school. I discovered that inkscape was better for drawing than photoshop (which I didn't own a legal copy of anyway) and the gimp was just as good for photo manipulation after I got used to the slight differences. After that, I heard about the call for a Fedora 9 theme and submitted a mockup. Once I did that and started getting involved in critiquing the other submissions, I started hanging out on IRC and talking to the other Fedora contributors regularly. Now I'm on the design team and work on artwork for Fedora proper and localized versions of art for fedora-latam. I'm hoping to get more into UI Design in the future.

What are some things that we can draw from a story like this?

  • School is one venue for recruiting new people. Having events at schools and training teachers could both lead to more users.
  • The tools they needed to do their work were more important early on than the software being free as in speech. They were using photoshop for a job better served by a vector drawing program -- perhaps because they couldn't get a free (as in beer) copy of the latter. Showing people tools that are better for what they do than what they have now is one way to make an impression.
  • Fedora made a request for the particular type of assistance that the person could provide. The person didn't hang around asking how they could contribute. Having "contest"-like events can be an entry point for new contributors. Note that they stuck around to critique other people's work -- so design was the entry point but there was a smooth transition into contributing in other ways. This could also mean that equipping ambassadors with an understanding of how to put people who want to contribute in touch with someone who can give them a task and mentor them right away will lead to better contribution than expecting people to ask over email days after meeting the ambassador.
  • Real time communication played a role in forming a bond to the Fedora community.
  • The contributor feels like they belong to a group now (I'm on the design team).
  • They want to advance by learning how to do UI design. We should get some of our current UI designers to give a class on that.

If we have real stories to think about, we can be better at deciding what types of events we need to organize to get people interested in Fedora and what we need to do after the events to get those interested people involved as contributors, not just users.

Growing active contributors

The Fedora Account System has about 38,000 accounts. Roughly 17,000 of those have signed the cla. Roughly 2,500 belong to another group in addition to the cla_signed group. As the commitment to working on Fedora increases, the number of people who are working on those things decreases -- not just in Latin America but in the project as a whole. I don't have any valuable insight on how to tell that contributors will be active in Fedora but I do know that if the latam group figures out something that works very well, it won't be by copying what the project as a whole has already done. They might take pieces of what we do and adapt them but they will also need to experiment and try out new ideas. Not only do they have a different audience than other regions, but what is being done in other regions also has definite room for improvement.

Working Together for A Better Tomorrow

One thing that was brought up was that Latin America only has two commonly used languages. It should be much easier for latam to communicate and share resources (like documentation and posters) than Europe where there's a multitude of languages. And yet it seems like much of the work in fedora-latam is being done on a country by country level. Listening to the people doing the work, it seems like the main problem with working together is that collaboration takes time. When you have a small group of people that you can meet or talk to regularly, it is easy to arrange to do things together. When you expand to try to talk to other people that you only see once a year, have time zone differences, and see the needs of the people around you differently, you have a harder time getting anything done.

I think that we see this in all of the Fedora project, not just in fedora-latam. There are very definitely people who talk about things, people who make decisions, and people who get work done. There is overlap among the sets of people but there are other people who want to talk forever. I think that working together is definitely something to work towards but those who do things should not be slowed down by those who talk. If someone is willing to work on tools to help collaborate more, create it. If someone is off doing great things, report back what worked and what didn't so others can benefit from your experiences. Try to be open to other ideas but don't wait on other ideas being finalized to implement them if talking about them is dragging on and you think you can do a good job with the idea now.


Well, that's enough of my uninformed opinions for now :-) I'm just excited to hear what fedora-latam starts doing as they're pushing into new territory, figuring out how to bring in contributors that are underrepresented in Fedora at this time.

español (google translate)

I love this quote: "Basically, my job is to be contagiously enthusiastic" -- Mel Chua in this interview.

Wanted: C++ Programmer to work with Inkscape upstream

One of the things to have emerged from the hallway track at the Google Summer of Code Mentor Summit was the need for a robust, featureful, free software whiteboarding tool. This would allow people to collaboratively work on project design, model workflow, and do things more visually than the current round of instant messaging, pastebins, collaborative text editors, and voip.

Currently, I know of two potential competitors for this. The first is Coccinella, a tcl program that does free-form drawing with a few caveats. Here's what mizmo, one of the main Fedora Design Team members, has to say about it:

For free-form drawing, Jabber-based Coccinella gets me close, but it's a little clunky and when people join a meeting late they don't get to see what was drawn on the whiteboard before they joined. I'd like it to automatically snapshot the whiteboard at various points and synchronize the snaps with the text conversation and automatically email me a report.

Additionally, coccinella doesn't have many of the tools that make diagramming, flow charting, and other, more structured drawings easier. For this, many artists use inkscape. Inkscape allows artists and designers to make mockups and quickly prototype new designs. At least a few open source developers also use it for making charts and diagrams to visualize their program's structure and execution. It would be great if we could collaborate on these over the Internet using inkscape's rich toolset. This is where the inkscape whiteboard plugin enters the picture.

The whiteboard plugin, inkboard, was written as a GSoC project in 2005. Although there's been some work on it since then, development has not kept pace with the rest of inkscape. Currently, it is disabled in the configure script since it doesn't work. However, I talked with inkscape developer Jon A. Cruz at the Mentor Summit and found that all is not lost. Although someone is needed to step up and work on inkboard to bring it back, recent changes in the core of inkscape will make it easier to implement. Removal of id tags in the SVG that bloated the image size and caused potential conflicts between two synchronizing inkscape programs as well as incorporation of a new XMPP implementation should make the next version of inkboard easier to write and more robust.

Now where do you come in? From time to time someone will write me an email that says, "I've been using Linux for years and now I want to give back to the community. I've got programming experience in C++, how can I help?" This is your chance to step up! Contact Jon or subscribe directly to the inkscape developers mailing list. Check out the inkscape code from svn. And then get hacking!

Adel Gadllah (dragoo1) ran my script on his computer with a couple other compressors: pbzip2 (a parallel implementation of bzip2) and pigz (a parallel version of gzip). His computer is a quad core with 6GB of RAM. A definite upgrade from the machine I tested on (dual core with 1GB of RAM). The results are quite interesting.

Since no new algorithms were introduced, just new implementations, the compression ratios didn't change much. But the times for the parallel implementations were very interesting. pbzip2 runs faster than gzip. pigz -9 runs faster than lzop -1! If compression is the only process being run on the machine then the parallel implementations are definitely worthwhile.
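The extension lookup in the benchmark script only recognizes program names starting with gz, bz, xz, or lz, so adding pigz and pbzip2 means teaching it two more names. How Adel actually changed the script isn't shown here; the function below is a hypothetical sketch (`ext_for` is an invented name) that relies on pigz writing .gz files and pbzip2 writing .bz2 files by default.

```shell
#!/bin/sh
# Hypothetical helper: map a compressor command line to the suffix it produces.
ext_for() {
    case $1 in
        gzip*|pigz*)    echo '.gz'  ;;
        bzip2*|pbzip2*) echo '.bz2' ;;
        xz*)            echo '.xz'  ;;
        lzop*)          echo '.lzo' ;;
        *)
            echo "error! No configured compressor extension for: $1" >&2
            return 1
            ;;
    esac
}

# Show the suffix each program in the extended list would be checked against.
for program in 'lzop -U' 'gzip' 'pigz' 'bzip2' 'pbzip2' 'xz' ; do
    printf '%-10s %s\n' "$program" "$(ext_for "$program")"
done
```

Matching on the full program name rather than a two-letter prefix keeps pigz and pbzip2 from falling through to the error branch.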

Well, after reading this message from notting about speeds and sizes of xz compression at various levels, I got curious about how gzip falls into the picture. So I wrote a little script to do some naive testing, found a 64MB text file (an sql database dump), and ran a naive benchmark. First, the script so you can all see what horrible assumptions I'm making:


#!/bin/sh

LZOP='lzop -U'
GZIP='gzip'
BZIP='bzip2'
XZ='xz'

TESTFILE='/var/tmp/test.dump'

for program in "$LZOP" "$GZIP" "$BZIP" "$XZ" ; do
    case $program in
        gz*) ext='.gz' ;;
        bz*) ext='.bz2' ;;
        xz*) ext='.xz' ;;
        lz*) ext='.lzo' ;;
        *)
            echo 'error! No configured compressor extension'
            exit
            ;;
    esac

    COMPRESSEDFILE="$TESTFILE$ext"

    for lvl in `seq 1 9` ; do
        c_time=`/usr/bin/time -f '%E' 2>&1 $program -$lvl $TESTFILE`
        c_size=`ls -l $COMPRESSEDFILE |awk '{print $5}'`
        d_time=`/usr/bin/time -f '%E' 2>&1 $program -d $COMPRESSEDFILE`
        printf '%-10s %10s %10s %10s\n' "$program -$lvl" $c_time $c_size $d_time
    done
done

As you can see, I'm not flushing caches between runs or anything fancy to make this a truly rigorous test. I'm also running this on my desktop (although I wasn't actively doing anything on that machine, it was logged into a normal X session with all the wakeups and polling and etc that that implies.) I also only used a single input file for data. Binary files or tarballs with a mixture of text and images and executables could certainly give different results. Grab the script and try this out on your own sample data. And if you get radically different results, post them!


Compressor   Compress       Size Decompress
----------   --------   -------- ----------
none [*]_     0:00.43   67348587    0:00.00

lzop -U -1    0:00.57   16293912    0:00.35
lzop -U -2    0:00.62   16292914    0:00.40
lzop -U -3    0:00.62   16292914    0:00.34
lzop -U -4    0:00.57   16292914    0:00.42
lzop -U -5    0:00.57   16292914    0:00.42
lzop -U -6    0:00.67   16292914    0:00.41
lzop -U -7    0:13.53   12824930    0:00.30
lzop -U -8    0:39.71   12671642    0:00.32
lzop -U -9    0:41.92   12669217    0:00.28

gzip -1       0:01.96   11743900    0:01.02
gzip -2       0:02.04   11397943    0:00.92
gzip -3       0:02.77   11054616    0:00.89
gzip -4       0:02.59   10480013    0:00.82
gzip -5       0:03.42   10157139    0:00.78
gzip -6       0:05.44    9972864    0:00.77
gzip -7       0:06.71    9703170    0:00.76
gzip -8       0:13.64    9592825    0:00.91
gzip -9       0:15.89    9588291    0:00.76

bzip2 -1      0:20.17    7695217    0:04.73
bzip2 -2      0:21.68    7687633    0:03.69
bzip2 -3      0:23.48    7709616    0:03.63
bzip2 -4      0:26.00    7710857    0:03.69
bzip2 -5      0:25.45    7715717    0:04.09
bzip2 -6      0:26.95    7716582    0:03.95
bzip2 -7      0:28.13    7733192    0:04.23
bzip2 -8      0:29.71    7756200    0:04.36
bzip2 -9      0:31.39    7809732    0:04.50 [@]_

xz -1         0:08.21    7245616    0:01.86
xz -2         0:10.75    7195168    0:02.23
xz -3         0:59.45    5767852    0:01.90
xz -4         1:01.75    5739644    0:01.83
xz -5         1:09.70    5705752    0:02.60
xz -6         1:46.23    5443748    0:02.09
xz -7         1:50.37    5431004    0:02.19
xz -8         2:02.41    5417436    0:02.19
xz -9 [#]_    2:18.12    5421508    0:02.55

.. _[*]: Time to copy the file.
.. _[@]: What's up with bzip2? Why does the size increase with higher levels?
.. _[#]: Note, xz -9 is unfair on two counts: 1) it pushed me into swap. 2) As for the size, xz had this output during that run::

    Adjusted LZMA2 dictionary size from 64 MiB to 35 MiB to not exceed the memory usage limit of 397 MiB

My conclusions based upon entirely too little data :-)

  • If you want transparent compression, use lzop at one of the lower compression settings. I got 25% of the size at 100 MB/s with lzop -2.
  • Do not use lzop with -7 or higher. If you want more compression than -2/3/4/5/6 (the algorithm for these is currently all the same) use gzip. You'll get better compression with better speed.
  • The only reason to use bzip2 is if you must have both a smaller size than gzip and you can't deploy xz there. If you don't need the smaller size or the remote side can get xz then bzip2 is a waste. This applies to distributing source code tarballs as two formats, for instance. If you're going to release in two formats, use tar.gz and tar.xz instead of tar.gz and tar.bz2.
  • xz gets the smallest size but it's versatile in other ways too: xz -2 is faster than gzip -9 with better compression ratios.
  • gzip beats xz at decompression but not nearly as badly as it beats bzip2.
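The "25% of the size at 100 MB/s" figure in the first bullet is just arithmetic on the table rows. A small sketch of that calculation follows; the helper names (`to_seconds`, `summarize`) are invented for illustration, and throughput is measured against the uncompressed size in MiB.

```shell
#!/bin/sh
# Convert /usr/bin/time's %E format ([h:]m:ss.cc) to seconds.
to_seconds() {
    echo "$1" | awk -F: '{ s = 0; for (i = 1; i <= NF; i++) s = s * 60 + $i; printf "%.2f\n", s }'
}

# usage: summarize ORIGINAL_BYTES COMPRESSED_BYTES ELAPSED
# Prints the compressed size as a percentage of the original and the
# rate at which the original data was consumed.
summarize() {
    secs=$(to_seconds "$3")
    awk -v o="$1" -v c="$2" -v s="$secs" 'BEGIN {
        printf "%.1f%% of original, %.0f MiB/s\n", c / o * 100, o / s / 1048576
    }'
}

summarize 67348587 16292914 0:00.62   # the lzop -U -2 row
summarize 67348587 9588291 0:15.89    # the gzip -9 row
```

For the lzop -U -2 row this works out to 24.2% of the original size at about 104 MiB/s, which is where the rounded numbers in the bullet come from.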
