Older blog entries for badger (starting at number 87)

FUDCon Live!

Just a quick FUDCon note: We are trying to make it easier for people who are not able to make it to FUDCon itself to see some of the things going on and get some value from the sessions. Check out the FUDCon Live wiki page for a list of sessions, when they're happening, and logs from finished sessions.

Looking at the logs, you may notice that they're logs from IRC. This is because we have people transcribing highlights from all the sessions into IRC channels as they're happening. This gives people a chance not just to follow along at home but also to ask questions and join in with the conference sessions.

How to join the FUDCon Live IRC channels

  1. First, view the FUDCon Live schedule to see what sessions are currently taking place.
  2. Then use your IRC client to connect to irc.freenode.net.
  3. Third, join the #fudcon-room-[NUMBER] channel that corresponds to the room number that the session is being held in.
  4. Sit back, read, and ask questions!

Here's an example:

I'm interested in the Sysadmin & Developer Panel. I see that it is in Room 7 on the wiki schedule. I open up konversation and go to irc.freenode.net as the server. Then I /join #fudcon-room-7 and participate in the conversation that's going on.


I was afraid that the FUDBus to the Toronto FUDCon would be a bust when I found out that our planned wifi and AC power were not to be, but it turned out to be quite productive. I spent most of the day talking to Dave Malcolm about getting Python 3 packaging guidelines and packages into Fedora. We've simplified a bit of it, clarified other bits, and the only really difficult thing left is figuring out whether to allow one srpm to build both python2 and python3 subpackages for a module or to require separate srpms for python2 and python3. There are pros and cons to each that we'll have to weigh against each other before we can settle on a solution. We'll keep thinking about this and hopefully have something finished by the end of FUDCon.

I also got to talk to mizmo about how to become better at designing user interfaces with inkscape. By happy chance, she's doing a presentation tomorrow on just that topic and gave me a sneak peek at the slides she prepared. So as not to spoil her presentation, all I'll say for now is that I think it's a *very* useful presentation. Mo showed both a broad overview of making mockups in inkscape and the key specific features that she uses. Much like mmcgrath's presentation last year about how system admins in infrastructure have handled problems, this is a presentation that will help you on multiple levels. It will give you skills to design UIs and to work better with inkscape when creating artwork. It will also show you something of the process and thinking of a UI designer, an often underappreciated but hugely important part of our work.

Other bus topics: A bit of discussion about replacing cvs with a new version control system, updating packagedb and cvs acls to allow a comaintainer-only packager group, and arranging with lmacken to do a TG2 quickstart with csrf and fas auth plugins sometime during the hackfest. If I can get that to work, I'll be able to write a guide that tells others how to use these pieces of python-fedora in their TG2 applications.

Before Language

Patter patter swish is a shower running in the morning
Tchic-tchic, tchic-tchic is a car that won't turn over in the cold
Crunch crunch crunch is you and I walking through frost sheathed grass.

-Toshio Kuratomi, Dec 2009

Doing what we do Best

By chance, I happened to see a conversation on #fedora-latam today about whether changes are needed in how Fedora is presented in Latin America. It was interesting (even if google translate couldn't do it justice) and left me thinking that there will be some contentious discussions in the near future, but the latam ambassadors are doing good work breaking new ground in reaching new contributors.

Some of the issues raised:

  • What contributions make someone a valuable contributor?
  • What activities can grow the contributor base?
  • How to measure and grow the active contributors?
  • How to work together as a latam group instead of individual communities in each country?

Bearing in mind that I'm not a Fedora Ambassador and not a member of the Latin American community, I'd like to contribute some thoughts to this.

What contributions are valuable

All constructive contributions are valuable.

Coders, packagers, and release engineers have always been valued in Fedora. However, a good number of us in Fedora are aware that there are other forms of contribution and those forms are just as important to cultivate. Documentation writers, designers, artists, translators, planners, end user support, teachers, etc. The trick is figuring out how to fit the special talents that someone has with a role that they can fill in Fedora.

Are some contributions more valuable than others? Yes. But it's not necessarily the contributions that we normally think of. It's important for any Linux distribution to have packagers, for instance, but most Linux distributions have already attracted a large number of those. Teachers and UI designers (in my Fedora experience) have been in short supply. That means that even though a Linux distribution could not survive without any packagers but could survive without any UI designers, attracting one additional UI designer may be more valuable than attracting another packager.

How to grow the contributor base?

With the understanding that we do need a varied contributor base, the ways that we grow and nurture those contributors change. I think it's fairly common for free software developers to think of the process of contributing purely through their own experience. First, they were a computer user. Then they were a free software user. Then they became a free software coder. Or from computer user to system admin to Linux packager to software coder. The danger in this unspoken assumption is that not everyone has the desire to become a software coder in the end, even if they have the desire to contribute to the free software community.

I think that one of the challenges that the Fedora Latin American community needs to address is to identify the steps designers, teachers, and other non-coders take as they become more and more involved in the project. Step by step:

  1. What prompted them to try Fedora?
  2. What kept them using it after the initial use?
  3. What got them involved in the Fedora Community as opposed to just being a Fedora user?
  4. What roles have they stepped into since they first became involved in Fedora?
  5. What roles do they want to fill eventually?

Answering these questions helps us to understand what motivates other new contributors and therefore become better at nurturing them as they grow as Fedora contributors. For instance, let's say we had these answers (note, I'm making this story up; find some real stories for some real answers):

My teacher showed me Fedora in school. I discovered that inkscape was better for drawing than photoshop (which I didn't own a legal copy of anyway) and the gimp was just as good for photo manipulation after I got used to the slight differences. After that, I heard about the call for a Fedora 9 theme and submitted a mockup. Once I did that and started getting involved in critiquing the other submissions, I started hanging out on IRC and talking to the other Fedora contributors regularly. Now I'm on the design team and work on artwork for Fedora proper and localized versions of art for fedora-latam. I'm hoping to get more into UI Design in the future.

What are some things that we can draw from a story like this?

  • School is one venue for recruiting new people. Having events at schools and training teachers could both lead to more users.
  • Having the tools they needed to do their work mattered more early on than the software being free as in speech. They were using photoshop for a job better served by a vector drawing program -- perhaps because they couldn't get a free (as in beer) copy of the latter. Showing people tools that are better for what they do than what they have now is one way to make an impression.
  • Fedora made a request for the particular type of assistance that the person could provide. The person didn't hang around asking how they could contribute. Having "contest"-like events can be an entry point for new contributors. Note that they stuck around to critique other people's work -- so design was the entrypoint but there was a smooth transition into contributing in other ways. This could also mean that equipping ambassadors with an understanding of how to get people who want to contribute in touch with someone who can give them a task and mentor them right away will lead to better contribution than expecting people to ask by email days after meeting the ambassador.
  • Real time communication played a role in forming a bond to the Fedora community.
  • The contributor feels like they belong to a group now (I'm on the design team).
  • They want to advance by learning how to do UI design. We should get some of our current UI designers to give a class on that.

If we have real stories to think about, we can be better at deciding what types of events we need to organize to get people interested in Fedora and what we need to do after the events to get those interested people involved as contributors, not just users.

Growing active contributors

The Fedora Account System has about 38,000 accounts. Roughly 17,000 of those have signed the cla. Roughly 2,500 belong to another group in addition to the cla_signed group. As the commitment to working on Fedora increases, the number of people who are working on those things decreases -- not just in Latin America but in the project as a whole. I don't have any valuable insight on how to tell which contributors will stay active in Fedora, but I do know that if the latam group figures out something that works very well, it won't be by copying what the project as a whole has already done. They might take pieces of what we do and adapt it, but they will also need to experiment and try out new ideas. Not only because they have a different audience than other regions but also because what is being done in other regions has definite room for improvement.

Working Together for A Better Tomorrow

One thing that was brought up was that Latin America only has two commonly used languages. It should be much easier for latam to communicate and share resources (like documentation and posters) than Europe where there's a multitude of languages. And yet it seems like much of the work in fedora-latam is being done on a country by country level. Listening to the people doing the work, it seems like the main problem with working together is that collaboration takes time. When you have a small group of people that you can meet or talk to regularly, it is easy to arrange to do things together. When you expand to try to talk to other people that you only see once a year, have time zone differences, and see the needs of the people around you differently, you have a harder time getting anything done.

I think that we see this in all of the Fedora project, not just in fedora-latam. There are very definitely people who talk about things, people who make decisions, and people who get work done. There is overlap among the sets of people, but there are other people who want to talk forever. I think that working together is definitely something to work towards, but those who do things should not be slowed down by those who talk. If you're willing to work on tools to help people collaborate, create them. If you're off doing great things, report back what worked and what didn't so others can benefit from your experience. Try to be open to other ideas, but don't wait for other ideas to be finalized before implementing your own if the talking is dragging on and you think you can do a good job with the idea now.

Well, that's enough of my uninformed opinions for now :-) I'm just excited to hear what fedora-latam starts doing as they're pushing into new territory figuring out how to bring in contributors that are under represented in Fedora at this time.

español (google translate)

I love this quote: "Basically, my job is to be contagiously enthusiastic" -- Mel Chua in this interview.

Wanted: C++ Programmer to work with Inkscape upstream

One of the things to have emerged from the hallway track at the Google Summer of Code Mentor Summit was the need for a robust, featureful, free software whiteboarding tool. This would allow people to collaboratively work on project design, model workflow, and do things more visually than the current round of instant messaging, pastebins, collaborative text editors, and voip.

Currently, I know of two potential contenders for this. The first is Coccinella, a tcl program that does free-form drawing with a few caveats. Here's what mizmo, one of the main Fedora Design Team members, has to say about it:

For free-form drawing, Jabber-based Coccinella gets me close, but it's a little clunky and when people join a meeting late they don't get to see what was drawn on the whiteboard before they joined. I'd like it to automatically snapshot the whiteboard at various points and synchronize the snaps with the text conversation and automatically email me a report.

Additionally, coccinella doesn't have many of the tools that make diagramming, flow charting, and other, more structured drawings easier. For this, many artists use inkscape. Inkscape allows artists and designers to make mockups and quickly prototype new designs. At least a few open source developers also use it for making charts and diagrams to visualize their program's structure and execution. It would be great if we could collaborate on these over the Internet using inkscape's rich toolset. This is where the inkscape whiteboard plugin enters the picture.

The whiteboard plugin, inkboard, was written as a GSoC project in 2005. Although there's been some work on it since then, development has not kept pace with the rest of inkscape. Currently, it is disabled in the configure script since it doesn't work. However, I talked with inkscape developer Jon A. Cruz at the Mentor Summit and found that all is not lost. Although someone is needed to step up and work on inkboard to bring it back, recent changes in the core of inkscape will make it easier to implement. Removal of id tags in the SVG that bloated the image size and caused potential conflicts between two synchronizing inkscape programs as well as incorporation of a new XMPP implementation should make the next version of inkboard easier to write and more robust.

Now where do you come in? From time to time someone will write me an email that says, "I've been using Linux for years and now I want to give back to the community. I've got programming experience in C++, how can I help?" This is your chance to step up! Contact Jon or subscribe directly to the inkscape developers mailing list. Check out the inkscape code from svn. And then get hacking!

Adel Gadllah (dragoo1) ran my script on his computer with a couple other compressors: pbzip2 (a parallel implementation of bzip2) and pigz (a parallel version of gzip). His computer is a quad core with 6GB of RAM. A definite upgrade from the machine I tested on (dual core with 1GB of RAM). The results are quite interesting.

Since no new algorithms were introduced, just new implementations, the compression ratios didn't change much. But the times for the parallel implementations were very interesting. pbzip2 runs faster than gzip. pigz -9 runs faster than lzop -1! If compression were the only process being run on the machine, then the parallel implementations are definitely worthwhile.
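A crude way to see why parallel compression works at all: split the input into chunks, compress the chunks on separate cores, and concatenate the results. Concatenated gzip members are themselves a valid gzip stream. This is not how pigz is actually implemented, just a throwaway sketch of the idea, with made-up file names:

```shell
# Poor-man's parallel gzip: compress chunks concurrently, then concatenate.
rm -f /tmp/chunk.* /tmp/parallel-sample.txt /tmp/parallel-sample.txt.gz
f=/tmp/parallel-sample.txt
seq 1 200000 > "$f"                             # ~1.5MB of compressible text
split -b 512k "$f" /tmp/chunk.                  # chunk.aa, chunk.ab, ...
ls /tmp/chunk.?? | xargs -P 4 -n 1 gzip -f -9   # up to 4 gzips in parallel
cat /tmp/chunk.??.gz > "$f.gz"                  # valid multi-member gzip file
gzip -dc "$f.gz" | cmp - "$f" && echo "round trip OK"
```

The real tools are smarter about block boundaries and dictionary reuse, but the round trip above shows the output is still a perfectly ordinary gzip file.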

Well, after reading this message from notting about speeds and sizes of xz compression at various levels, I got curious about how gzip falls into the picture. So I wrote a little script to do some naive testing, found a 64MB text file (an sql database dump), and ran a naive benchmark. First, the script so you can all see what horrible assumptions I'm making:


LZOP='lzop -U'
GZIP='gzip'
BZIP='bzip2'
XZ='xz'
# These two weren't defined in the snippet I pasted; fill them in so it runs:
TESTFILE=${1:-dump.sql}    # the file to benchmark (the 64MB sql dump in my case)

for program in "$LZOP" "$GZIP" "$BZIP" "$XZ" ; do
    case $program in
        gz*) ext='.gz' ;;
        bz*) ext='.bz2' ;;
        xz*) ext='.xz' ;;
        lz*) ext='.lzo' ;;
        *) echo 'error! No configured compressor extension' ; exit ;;
    esac
    COMPRESSEDFILE="$TESTFILE$ext"

    for lvl in `seq 1 9` ; do
        c_time=`/usr/bin/time -f '%E' 2>&1 $program -$lvl $TESTFILE`
        c_size=`ls -l $COMPRESSEDFILE | awk '{print $5}'`
        d_time=`/usr/bin/time -f '%E' 2>&1 $program -d $COMPRESSEDFILE`
        printf '%-10s %10s %10s %10s\n' "$program -$lvl" $c_time $c_size $d_time
    done
done

As you can see, I'm not flushing caches between runs or doing anything fancy to make this a truly rigorous test. I'm also running this on my desktop (although I wasn't actively doing anything on that machine, it was logged into a normal X session with all the wakeups, polling, and so on that that implies). I also only used a single input file for data. Binary files or tarballs with a mixture of text, images, and executables could certainly give different results. Grab the script and try this out on your own sample data. And if you get radically different results, post them!
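To turn the raw byte counts in the table below into percentages, plain shell arithmetic is enough. Here's the gzip -1 row against the uncompressed size:

```shell
# Compression ratio for gzip -1, from the sizes in the results table.
# Integer math in tenths of a percent, to avoid needing bc.
orig=67348587            # size of the uncompressed test file
gz1=11743900             # size after gzip -1
echo "$(( 1000 * gz1 / orig ))"    # prints 174, i.e. about 17.4%
```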

Compressor     Compress       Size   Decompress
----------     --------   --------   ----------
none [*]        0:00.43   67348587      0:00.00

lzop -U -1      0:00.57   16293912      0:00.35
lzop -U -2      0:00.62   16292914      0:00.40
lzop -U -3      0:00.62   16292914      0:00.34
lzop -U -4      0:00.57   16292914      0:00.42
lzop -U -5      0:00.57   16292914      0:00.42
lzop -U -6      0:00.67   16292914      0:00.41
lzop -U -7      0:13.53   12824930      0:00.30
lzop -U -8      0:39.71   12671642      0:00.32
lzop -U -9      0:41.92   12669217      0:00.28

gzip -1         0:01.96   11743900      0:01.02
gzip -2         0:02.04   11397943      0:00.92
gzip -3         0:02.77   11054616      0:00.89
gzip -4         0:02.59   10480013      0:00.82
gzip -5         0:03.42   10157139      0:00.78
gzip -6         0:05.44    9972864      0:00.77
gzip -7         0:06.71    9703170      0:00.76
gzip -8         0:13.64    9592825      0:00.91
gzip -9         0:15.89    9588291      0:00.76

bzip2 -1        0:20.17    7695217      0:04.73
bzip2 -2        0:21.68    7687633      0:03.69
bzip2 -3        0:23.48    7709616      0:03.63
bzip2 -4        0:26.00    7710857      0:03.69
bzip2 -5        0:25.45    7715717      0:04.09
bzip2 -6        0:26.95    7716582      0:03.95
bzip2 -7        0:28.13    7733192      0:04.23
bzip2 -8        0:29.71    7756200      0:04.36
bzip2 -9 [@]    0:31.39    7809732      0:04.50

xz -1           0:08.21    7245616      0:01.86
xz -2           0:10.75    7195168      0:02.23
xz -3           0:59.45    5767852      0:01.90
xz -4           1:01.75    5739644      0:01.83
xz -5           1:09.70    5705752      0:02.60
xz -6           1:46.23    5443748      0:02.09
xz -7           1:50.37    5431004      0:02.19
xz -8           2:02.41    5417436      0:02.19
xz -9 [#]       2:18.12    5421508      0:02.55

[*] Time to copy the file.
[@] What's up with bzip2? Why does the size increase with higher levels?
[#] Note, xz -9 is unfair on two counts: 1) it pushed me into swap. 2) As for
    the size, xz had this output during that run:
    "Adjusted LZMA2 dictionary size from 64 MiB to 35 MiB to not exceed the
    memory usage limit of 397 MiB"

My conclusions based upon entirely too little data :-)

  • If you want transparent compression, use lzop at one of the lower compression settings. I got 25% of the size at 100 MB/s with lzop -2.
  • Do not use lzop with -7 or higher. If you want more compression than -2/3/4/5/6 (the algorithm for these is currently all the same) use gzip. You'll get better compression with better speed.
  • The only reason to use bzip2 is if you need a smaller size than gzip but can't deploy xz on the other side. If you don't need the smaller size, or the remote side can get xz, then bzip2 is a waste. This applies to distributing source code tarballs in two formats, for instance: if you're going to release in two formats, use tar.gz and tar.xz instead of tar.gz and tar.bz2.
  • xz gets the smallest size but it's versatile in other ways too: xz -2 is faster than gzip -9 with better compression ratios.
  • gzip beats xz at decompression but not nearly as badly as it beats bzip2.
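The two-format release advice in those last bullets is a one-liner each with GNU tar (the project name and contents here are made up):

```shell
# Build both recommended release formats from the same tree.
mkdir -p myproject-1.0 && echo 'hello' > myproject-1.0/README
tar -czf myproject-1.0.tar.gz myproject-1.0/   # gzip: universally readable
tar -cJf myproject-1.0.tar.xz myproject-1.0/   # xz (-J): smaller download
ls -l myproject-1.0.tar.*
```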

So thanks to cdfrey, I'm a little closer on two fronts.

First, the problem as given has a solution for hack #2 but apparently not hack #1. Here's the new sequence of commands:

git checkout base_url
git log
# Manually find the last commit in staging before I branched
git rebase --onto master [COMMIT ID FROM ABOVE]
git checkout master
git merge base_url

So no more patches, yay! However, you'll probably notice that we still have to use git log to find the branchpoint. After some discussion of this, it seems that if we have merged from the feature branch back to the branch it came from, there's no way around this. git does not maintain the history of where something came from and where it goes back to; it holds onto the heads and then follows the chain of commits back. So once we've merged, there's no branch point anymore... the trees are the same.
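You can watch the branch point disappear in a throwaway repo. Everything here is hypothetical (empty commits, scratch directory), and `git init -b` needs a reasonably recent git:

```shell
# Demonstrate that once a feature branch is merged, git can no longer
# tell you where it forked: merge-base just returns the branch tip.
cd "$(mktemp -d)" && git init -q -b master .
git config user.email demo@example.com && git config user.name demo
git commit -q --allow-empty -m 'work before branching'
git checkout -q -b base_url
git commit -q --allow-empty -m 'feature work'
git checkout -q master && git merge -q base_url   # trees are now identical
git merge-base master base_url    # prints the same hash as...
git rev-parse base_url            # ...the tip of base_url
```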

However, we did figure out a potential way to implement our workflow in the future. Instead of branching from staging, the feature branch should start off branching from master. After it's been worked on, it gets merged to staging. But since it started off from master, that should still leave the feature branch with a clear path of changes to apply to master. Once the changes have been tested in staging, we can merge the feature branch into master and it's then "okay" for the branchpoint to disappear since the work is completed.
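Sketched as commands, the workflow we came up with might look like this. The branch names are from the post; the scratch repo and empty commits are made up so the sketch runs end to end:

```shell
cd "$(mktemp -d)" && git init -q -b master .
git config user.email demo@example.com && git config user.name demo
git commit -q --allow-empty -m 'production config'
git branch staging                      # staging starts in sync with master
git checkout -q -b base_url master      # feature branch comes off master
git commit -q --allow-empty -m 'feature work'
git checkout -q staging
git merge -q --no-edit base_url         # test the feature in staging first
git checkout -q master
git merge -q --no-edit base_url         # later: a clean merge to master
git log --oneline master                # just the initial commit + the feature
```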

Okay, git lovers, I have an incredibly simple problem but so far the only working solution is a kludge. I'm hoping someone can tell me what the elegant way to solve this problem is.

I'm working with three branches keeping configuration information for our environment. master is where our production configs live. staging is a branch where we merge changes and test them in our staging environment. Once tested, they get cherrypicked to master.

base_url is where I've been working on a new change that spans several commits. It was branched off of staging. After completion, I merged the changes into the staging branch and tested. So far so good.

Now I want to merge my branch into master. How do I do that?

Here's an idealized diagram of the branch relationships. In reality, sometimes changes go into master before staging.

master       staging   base_url
  |             |  merge |  ___
  | cherrypicks +<-------+   ^
  +<------------+        |   |
  |    (cp)     |        |  How do I merge these to master?
  +<------------+ branch |   |
  |    (cp)     +------->+  _V_
  |   branch    |

So far everything I've tried with git rebase or git merge seems to be sending changes from the staging branch that were present before I branched to base_url into master. I don't want that. I did my changes on a separate branch so I could merge just my changes to both staging and master later.

Here's the kludge that did work:

git checkout base_url
git log
# Manually find the last commit in staging before I branched
git format-patch [COMMIT ID FROM ABOVE]
git checkout master
git am [patch 0001 0002 0003....etc]

The two things that I find too hacky about this solution are:

  1. using git log to find the branch point. git should know where I branched from staging... I just need to tell it that I want to pull changes from the branch point forward somehow and it should find the proper commit id.
  2. generating patches and then applying them. git should be able to do this without generating "temporary files" like this. The deltas are in the repo, why pull them out into text files?

I have copies of my repository before I "fixed" it with the patch commands. So send me your recipes and I'll see if any of them work. Once we have a winner, I'll post strategies that worked and ones that didn't.

Of course, even after I know how to do this, there's still all sorts of follow on questions -- like, what happens if this new feature took a long time and I needed to remerge the base_url branch with staging in the middle?

a.badger@gmail.com or abadger1999 on irc.freenode.net
