Older blog entries for badger (starting at number 85)

Before Language

Patter patter swish is a shower running in the morning
Tchic-tchic, tchic-tchic is a car that won't turn over in the cold
Crunch crunch crunch is you and I walking through frost sheathed grass.

-Toshio Kuratomi, Dec 2009

Doing what we do Best

By chance, I happened to see a conversation on #fedora-latam today about whether changes are needed in how Fedora is presented in Latin America. It was interesting (even if google translate couldn't do it justice) and left me thinking that there will be some contentious discussions in the near future, but the latam ambassadors are doing good work to break new ground in reaching new contributors.

Some of the issues raised:

  • What contributions make someone a valuable contributor?
  • What activities can grow the contributor base?
  • How to measure and grow the active contributors?
  • How to work together as a latam group instead of individual communities in each country?

Bearing in mind that I'm not a Fedora Ambassador and not a member of the Latin American community, I'd like to contribute some thoughts to this.

What contributions are valuable

All constructive contributions are valuable.

Coders, packagers, and release engineers have always been valued in Fedora. However, a good number of us in Fedora are aware that there are other forms of contribution and that those forms are just as important to cultivate: documentation writers, designers, artists, translators, planners, end user support, teachers, etc. The trick is figuring out how to fit the special talents that someone has with a role that they can fill in Fedora.

Are some contributions more valuable than others? Yes. But it's not necessarily the contributions that we normally think of. It's important for any Linux distribution to have packagers, for instance, but most Linux distributions have already attracted a large number of those. Teachers and UI designers (in my Fedora experience) have been in short supply. That means that even though a Linux distribution could not survive without any packagers but could survive without any UI designers, attracting one additional UI designer may be more valuable than attracting another packager.

How to grow the contributor base?

With the understanding that we do need a varied contributor base, the ways that we grow and nurture those contributors changes. I think it's fairly common for free software developers to think of the process of contributing purely through their own experience. First, they were a computer user. Then they were a free software user. Then they became a free software coder. Or from computer user to system admin to Linux packager to software coder. The danger in this unspoken assumption is that not everyone has the desire to become a software coder in the end even if they have the desire to contribute to the free software community.

I think that one of the challenges that the Fedora Latin American community needs to address is to identify the steps designers, teachers, and other non-coders take as they become more and more involved in the project. Step by step:

  1. What prompted them to try Fedora?
  2. What kept them using it after the initial use?
  3. What got them involved in the Fedora Community as opposed to just being a Fedora user?
  4. What roles have they stepped into since they first became involved in Fedora?
  5. What roles do they want to fill eventually?

Answering these questions helps us understand what motivates new contributors and therefore become better at nurturing them as they grow as Fedora contributors. For instance, let's say we had these answers (note, I'm making this story up; find some real stories for some real answers):

My teacher showed me Fedora in school. I discovered that inkscape was better for drawing than photoshop (which I didn't own a legal copy of anyway) and the gimp was just as good for photo manipulation after I got used to the slight differences. After that, I heard about the call for a Fedora 9 theme and submitted a mockup. Once I did that and started getting involved in critiquing the other submissions, I started hanging out on IRC and talking to the other Fedora contributors regularly. Now I'm on the design team and work on artwork for Fedora proper and localized versions of art for fedora-latam. I'm hoping to get more into UI Design in the future.

What are some things that we can draw from a story like this?

  • School is one venue for recruiting new people. Having events at schools and training teachers could both lead to more users.
  • Having the tools they needed to do their work mattered more early on than the software being free as in speech. They were using photoshop for a job better served by a vector drawing program -- perhaps because they couldn't get a free (as in beer) copy of the latter. Showing people tools that are better for what they do than what they have now is one way to make an impression.
  • Fedora made a request for the particular type of assistance that the person could provide. The person didn't hang around asking how they could contribute. Having "contest"-like events can be an entry point for new contributors. Note that they stuck around to critique other people's work -- so design was the entry point but there was a smooth transition into contributing in other ways. This could also mean that equipping ambassadors with an understanding of how to get people who want to contribute in touch with someone who can give them a task and mentor them right away will lead to more contribution than expecting people to ask by email days after meeting the ambassador.
  • Real time communication played a role in forming a bond to the Fedora community.
  • The contributor feels like they belong to a group now (I'm on the design team).
  • They want to advance by learning how to do UI design. We should get some of our current UI designers to give a class on that.

If we have real stories to think about, we can be better at deciding what types of events we need to organize to get people interested in Fedora and what we need to do after the events to get those interested people involved as contributors, not just users.

Growing active contributors

The Fedora Account System has about 38,000 accounts. Roughly 17,000 of those have signed the CLA. Roughly 2,500 belong to another group in addition to the cla_signed group. As the commitment to working on Fedora increases, the number of people who are doing that work decreases -- not just in Latin America but in the project as a whole. I don't have any valuable insight on how to tell which contributors will become active in Fedora, but I do know that if the latam group figures out something that works very well, it won't be by copying what the project as a whole has already done. They might take pieces of what we do and adapt them, but they will also need to experiment and try out new ideas -- not only because they have a different audience than other regions but also because what is being done in other regions has definite room for improvement.

Working Together for A Better Tomorrow

One thing that was brought up was that Latin America has only two commonly used languages. It should be much easier for latam to communicate and share resources (like documentation and posters) than for Europe, where there's a multitude of languages. And yet it seems like much of the work in fedora-latam is being done on a country-by-country level. Listening to the people doing the work, it seems like the main problem with working together is that collaboration takes time. When you have a small group of people that you can meet or talk to regularly, it is easy to arrange to do things together. When you expand to try to talk to other people that you only see once a year, have time zone differences with, and who see the needs of the people around them differently, you have a harder time getting anything done.

I think that we see this in all of the Fedora project, not just in fedora-latam. There are very definitely people who talk about things, people who make decisions, and people who get work done. There is overlap among the sets of people, but there are other people who want to talk forever. I think that working together is definitely something to work towards, but those who do things should not be slowed down by those who talk. If someone is willing to work on tools that help people collaborate, they should create them. If someone is off doing great things, they should report back what worked and what didn't so others can benefit from their experiences. Try to be open to other ideas, but if the talking is dragging on and you think you can do a good job with an idea now, don't wait for it to be finalized before implementing it.


Well, that's enough of my uninformed opinions for now :-) I'm just excited to hear what fedora-latam starts doing as they push into new territory, figuring out how to bring in contributors that are underrepresented in Fedora at this time.

español (google translate)

I love this quote: "Basically, my job is to be contagiously enthusiastic" -- Mel Chua in this interview.

Wanted: C++ Programmer to work with Inkscape upstream

One of the things to have emerged from the hallway track at the Google Summer of Code Mentor Summit was the need for a robust, featureful, free software whiteboarding tool. This would allow people to collaboratively work on project design, model workflow, and do things more visually than the current round of instant messaging, pastebins, collaborative text editors, and voip.

Currently, I know of two potential contenders for this. The first is Coccinella, a tcl program that does free-form drawing with a few caveats. Here's what mizmo, one of the main Fedora Design Team members, has to say about it:

For free-form drawing, Jabber-based Coccinella gets me close, but it's a little clunky and when people join a meeting late they don't get to see what was drawn on the whiteboard before they joined. I'd like it to automatically snapshot the whiteboard at various points and synchronize the snaps with the text conversation and automatically email me a report.

Additionally, coccinella doesn't have many of the tools that make diagramming, flow charting, and other, more structured drawings easier. For this, many artists use inkscape. Inkscape allows artists and designers to make mockups and quickly prototype new designs. At least a few open source developers also use it for making charts and diagrams to visualize their program's structure and execution. It would be great if we could collaborate on these over the Internet using inkscape's rich toolset. This is where the inkscape whiteboard plugin enters the picture.

The whiteboard plugin, inkboard, was written as a GSoC project in 2005. Although there's been some work on it since then, development has not kept pace with the rest of inkscape. Currently, it is disabled in the configure script since it doesn't work. However, I talked with inkscape developer Jon A. Cruz at the Mentor Summit and found that all is not lost. Although someone needs to step up and work on inkboard to bring it back, recent changes in the core of inkscape will make it easier to implement. The removal of id tags in the SVG that bloated the image size and caused potential conflicts between two synchronizing inkscape programs, as well as the incorporation of a new XMPP implementation, should make the next version of inkboard easier to write and more robust.

Now where do you come in? From time to time someone will write me an email that says, "I've been using Linux for years and now I want to give back to the community. I've got programming experience in C++, how can I help?" This is your chance to step up! Contact Jon or subscribe directly to the inkscape developers mailing list. Check out the inkscape code from svn. And then get hacking!
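For reference, the checkout looked something like this at the time (a sketch -- treat the repository URL as an assumption and check the inkscape site for the current location):

# Sketch: grabbing the inkscape source from the project's svn repository.
svn checkout https://inkscape.svn.sourceforge.net/svnroot/inkscape/inkscape/trunk inkscape
cd inkscape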

Adel Gadllah (dragoo1) ran my script on his computer with a couple other compressors: pbzip2 (a parallel implementation of bzip2) and pigz (a parallel version of gzip). His computer is a quad core with 6GB of RAM. A definite upgrade from the machine I tested on (dual core with 1GB of RAM). The results are quite interesting.

Since no new algorithms were introduced, just new implementations, the compression ratios didn't change much. But the times for the parallel implementations were very interesting. pbzip2 runs faster than gzip. pigz -9 runs faster than lzop -1! If compression is the only process being run on the machine, the parallel implementations are definitely worthwhile.
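For reference, the parallel implementations are invoked much like their serial counterparts (a sketch; the file name is made up, and -p pins the core count, which both tools otherwise autodetect):

# Sketch: parallel compression of a sample file, pinned to 4 cores.
pigz -9 -p 4 test.dump      # gzip-compatible output in test.dump.gz
pbzip2 -9 -p4 test.dump     # bzip2-compatible output in test.dump.bz2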

Well, after reading this message from notting about speeds and sizes of xz compression at various levels, I got curious about how gzip falls into the picture. So I wrote a little script to do some naive testing, found a 64MB text file (an sql database dump), and ran the benchmark. First, the script so you can all see what horrible assumptions I'm making:


#!/bin/sh

LZOP='lzop -U' GZIP='gzip' BZIP='bzip2' XZ='xz'

TESTFILE='/var/tmp/test.dump'

for program in "$LZOP" "$GZIP" "$BZIP" "$XZ" ; do
    case $program in
        gz*) ext='.gz' ;;
        bz*) ext='.bz2' ;;
        xz*) ext='.xz' ;;
        lz*) ext='.lzo' ;;
        *) echo 'error! No configured compressor extension'; exit ;;
    esac
    COMPRESSEDFILE="$TESTFILE$ext"
    for lvl in `seq 1 9` ; do
        c_time=`/usr/bin/time -f '%E' 2>&1 $program -$lvl $TESTFILE`
        c_size=`ls -l $COMPRESSEDFILE | awk '{print $5}'`
        d_time=`/usr/bin/time -f '%E' 2>&1 $program -d $COMPRESSEDFILE`
        printf '%-10s %10s %10s %10s\n' "$program -$lvl" $c_time $c_size $d_time
    done
done

As you can see, I'm not flushing caches between runs or doing anything fancy to make this a truly rigorous test. I'm also running this on my desktop (although I wasn't actively doing anything on that machine, it was logged into a normal X session with all the wakeups and polling that that implies). I also only used a single input file for data. Binary files or tarballs with a mixture of text, images, and executables could certainly give different results. Grab the script and try this out on your own sample data. And if you get radically different results, post them!
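Running it amounts to something like this (hypothetical file and script names; note that the script compresses and decompresses the test file in place):

# Hypothetical invocation: stage some sample data where the script
# expects it, then run the script with plain sh.
cp ~/dumps/mydatabase.sql /var/tmp/test.dump
sh compress-bench.sh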


Compressor   Compress     Size   Decompress
----------   --------   -------  ----------
none [*]_     0:00.43   67348587    0:00.00


lzop -U -1    0:00.57   16293912    0:00.35
lzop -U -2    0:00.62   16292914    0:00.40
lzop -U -3    0:00.62   16292914    0:00.34
lzop -U -4    0:00.57   16292914    0:00.42
lzop -U -5    0:00.57   16292914    0:00.42
lzop -U -6    0:00.67   16292914    0:00.41
lzop -U -7    0:13.53   12824930    0:00.30
lzop -U -8    0:39.71   12671642    0:00.32
lzop -U -9    0:41.92   12669217    0:00.28

gzip -1       0:01.96   11743900    0:01.02
gzip -2       0:02.04   11397943    0:00.92
gzip -3       0:02.77   11054616    0:00.89
gzip -4       0:02.59   10480013    0:00.82
gzip -5       0:03.42   10157139    0:00.78
gzip -6       0:05.44    9972864    0:00.77
gzip -7       0:06.71    9703170    0:00.76
gzip -8       0:13.64    9592825    0:00.91
gzip -9       0:15.89    9588291    0:00.76

bzip2 -1      0:20.17    7695217    0:04.73
bzip2 -2      0:21.68    7687633    0:03.69
bzip2 -3      0:23.48    7709616    0:03.63
bzip2 -4      0:26.00    7710857    0:03.69
bzip2 -5      0:25.45    7715717    0:04.09
bzip2 -6      0:26.95    7716582    0:03.95
bzip2 -7      0:28.13    7733192    0:04.23
bzip2 -8      0:29.71    7756200    0:04.36
bzip2 -9      0:31.39    7809732    0:04.50 [@]_

xz -1         0:08.21    7245616    0:01.86
xz -2         0:10.75    7195168    0:02.23
xz -3         0:59.45    5767852    0:01.90
xz -4         1:01.75    5739644    0:01.83
xz -5         1:09.70    5705752    0:02.60
xz -6         1:46.23    5443748    0:02.09
xz -7         1:50.37    5431004    0:02.19
xz -8         2:02.41    5417436    0:02.19
xz -9 [#]_    2:18.12    5421508    0:02.55

.. _[*]: Time to copy the file.
.. _[@]: What's up with bzip2? Why does the size increase with higher levels?
.. _[#]: Note, xz -9 is unfair on two counts: 1) it pushed me into swap. 2) As for the size, xz had this output during that run::

    Adjusted LZMA2 dictionary size from 64 MiB to 35 MiB to not exceed the memory usage limit of 397 MiB

My conclusions based upon entirely too little data :-)

  • If you want transparent compression, use lzop at one of the lower compression settings. I got 25% of the size at 100 MB/s with lzop -2.
  • Do not use lzop with -7 or higher. If you want more compression than -2/3/4/5/6 (the algorithm for these is currently all the same), use gzip. You'll get better compression with better speed.
  • The only reason to use bzip2 is if you need a smaller size than gzip and you can't deploy xz there. If you don't need the smaller size or the remote side can get xz, then bzip2 is a waste. This applies to distributing source code tarballs in two formats, for instance. If you're going to release in two formats, use tar.gz and tar.xz instead of tar.gz and tar.bz2 (there's a sketch of this after the list).
  • xz gets the smallest size but it's versatile in other ways too: xz -2 is faster than gzip -9 with better compression ratios.
  • gzip beats xz at decompression but not nearly as badly as it beats bzip2.
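To illustrate the two-format release from the third point, GNU tar can produce both directly (a sketch; the directory name is made up, and the -J flag for xz needs a reasonably recent tar -- with older versions, pipe through xz instead):

# Sketch: releasing a source tree in the two suggested formats.
tar czf project-1.0.tar.gz project-1.0/   # gzip for maximum compatibility
tar cJf project-1.0.tar.xz project-1.0/   # xz for the smaller download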

So thanks to cdfrey, I'm a little closer on two fronts.

First, the problem as given has a solution for hack #2 but apparently not hack #1. Here's the new sequence of commands:


git checkout base_url
git log
# Manually find the last commit in staging before I branched
git rebase --onto master [COMMIT ID FROM ABOVE]
git checkout master
git merge base_url

So no more patches, yay! However, you'll notice that we still have to use git log to find the branchpoint. After some discussion of this, it seems that if we have merged from the feature branch back to the branch it came from, there's no way around this. git does not maintain the history of where something came from and where it goes back to; it holds onto the heads and then follows the chain of commits back. So once we're merged, there's no branch point anymore... the trees are the same.
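A quick way to convince yourself of this (a sketch using this post's branch names):

# Once base_url has been merged back into staging, the best common
# ancestor git can find is base_url's own tip -- the original branch
# point is no longer recoverable from the commit graph alone.
git merge-base staging base_url   # prints the same commit id...
git rev-parse base_url            # ...as the tip of the feature branch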

However, we did figure out a potential way to implement our workflow in the future. Instead of branching from staging, the feature branch should start off branching from master. After it's been worked on, it gets merged to staging. But since it started off from master, that should still leave the feature branch with a clear path of changes to apply to master. Once the changes have been tested in staging, we can merge the feature branch into master and it's then "okay" for the branchpoint to disappear since the work is completed.
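In command form, that proposed workflow would look something like this (a sketch with the same branch names; the actual feature commits are elided):

git checkout master
git checkout -b base_url    # feature branch starts from master, not staging
# ...hack on the feature, commit...
git checkout staging
git merge base_url          # test the feature in the staging environment
# once it checks out in staging:
git checkout master
git merge base_url          # only the feature's own commits come along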

Okay, git lovers, I have an incredibly simple problem but so far the only working solution is a kludge. I'm hoping someone can tell me what the elegant way to solve this problem is.

I'm working with three branches keeping configuration information for our environment. master is where our production configs live. staging is a branch where we merge changes and test them in our staging environment. Once tested, they get cherrypicked to master.

base_url is where I've been working on a new change that spans several commits. It was branched off of staging. After completion, I merged the changes into the staging branch and tested. So far so good.

Now I want to merge my branch into master. How do I do that?

Here's an idealized diagram of the branch relationships. In reality, sometimes changes go into master before staging.


master       staging   base_url
  |             |  merge |  ___
  | cherrypicks +<-------+   ^
  +<------------+        |   |
  |    (cp)     |        |  How do I merge these to master?
  +<------------+ branch |   |
  |    (cp)     +------->+  _V_
  +<------------+
  |   branch    |
  +------------>+
  |
  |
/srv/puppet

So far everything I've tried with git rebase or git merge seems to send changes into master that were on the staging branch before I branched base_url off of it. I don't want that. I did my changes on a separate branch so I could merge just my changes to both staging and master later.

Here's the kludge that did work:


git checkout base_url
git log
# Manually find the last commit in staging before I branched
git format-patch [COMMIT ID FROM ABOVE]
git checkout master
git am [patch 0001 0002 0003....etc]

The two things that I find too hacky about this solution are:

  1. using git log to find the branch point. git should know where I branched from staging... I just need to tell it that I want to pull changes from the branch point forward somehow and it should find the proper commit id.
  2. generating patches and then applying them. git should be able to do this without generating "temporary files" like this. The deltas are in the repo, why pull them out into text files?

I have copies of my repository before I "fixed" it with the patch commands. So send me your recipes and I'll see if any of them work. Once we have a winner, I'll post strategies that worked and ones that didn't.

Of course, even after I know how to do this, there are still all sorts of follow-on questions -- like, what happens if this new feature took a long time and I needed to remerge the base_url branch with staging in the middle?

a.badger@gmail.com or abadger1999 on irc.freenode.net

Do not buy Swingline stapler model #545xx

My wife was having problems stapling today, so I looked inside this one and found that the staples had fallen over inside the stapler. So, instead of the staples forming the upside down "U", all ready for the teeth to punch into the paper being stapled, the staples were positioned with the teeth to the front and the base of the "U" facing the spring-loaded rear of the stapler.

This is just poor design by some misguided engineer trying to cut costs. All other staplers I've seen have some sort of platform down the middle of the staple feed chamber. This lets the base of the "U" rest supported on the platform so the staples don't depend on standing upright on their feet. Getting rid of that platform means that the staples can fall over when the stapler is loaded or if the spring's tension is off.

Or perhaps it isn't poor design -- a little experimentation showed that the stapler has one feature sure to please a company exec, provided it's an exec of Swingline: it's nearly impossible to load small quantities of staples into this design. With nothing to support them, they just fall over and slide underneath the other staples.

Had a productive evening planned out but didn't get to do any of it because of a chicken emergency. First time I've actually seen "it gave a spasm that threw its whole body in the air and died." Parents get back tomorrow night so hopefully I can start working long hours again starting next week.
