Older blog entries for acj (starting at number 5)

Yesterday, 10 July, was the due date for the second milestone of my work on DebGraph. I am happy to report that we are roughly two weeks ahead of schedule, so meeting this milestone was not a cause for worry.

We now have support for the following graph operators:

  • Difference
  • Intersection
  • Filter (by properties)
  • Find Cycles (via Tarjan's algorithm)
  • Find Dependencies
  • Find Reverse Dependencies
  • Symmetric Difference (XOR)
  • Union

The next milestone includes the development of a high-level language (or integration with an existing extension language) that streamlines the construction of complex queries using the operators listed above. We can build arbitrarily complex queries using the C++ operators, but dealing with the static typing and compiler toolchain can be very clunky. As such, I have spent the past week working on Lua bindings for DebGraph, which will enable us to query the graph of Debian packages from a clean, dynamically typed language. Lua has a powerful C API that exposes the Lua stack and lets us move DebGraph information to and from the Lua interpreter. I'm writing documentation that outlines how to use DebGraph from both C++ and Lua in order to make this work accessible to more folks.

Sneak peak
As an example, we can utilize the FindCycles operator in Lua as follows:


libdg = package.loadlib("./libdebgraph.so",
"luaopen_libdebgraph")
libdg()
LoadPackages('cache')
-- 'g' is the main graph of unstable binary-arm packages
cycles = FindCycles(g)
print("Found " .. #fc .. " cycles")
for comp_key,comp in pairs(cycles) do
    comp_nodes = ""
    for node_key,node in pairs(comp) do
        prop = GetProperty(node, "Package")
        comp_nodes = comp_nodes .. prop .. " "
    end
    print("* " .. comp_nodes)
end

The above Lua script produces the following result:


reading cache/unstable/main/binary-arm/Packages
Found 61 cycles
* debconf-english debconf-i18n debconf
* perl-modules perl 
* cdebconf fontconfig-config fontconfig gsfonts-x11
libcairo2 libfontconfig1 libgtk2.0-0 libpango1.0-0
libpango1.0-common libxft2 ucf x11-utils xutils
[...]

These are the package names in the strongly connected components, shown in alphabetical order.

There is still a lot of work to do before the Lua binding will achieve feature parity with the core library, but we have now laid the foundation (and a brick or two).

The overall situation (admittedly a bit of a stretch):
The chicken cycle
(The soil depends on the chicken to keep it fertilized. Work with me here.)

There are an infinitude of cycles that could be extracted from this, but we're only interested in the set of shortest-length cycles.

Exhibit A:

Exhibit B:

And the classic philosophical gem, Exhibit C:

Is the problem solved, then? Not quite. The current DebGraph cycle detection operator works well for dependencies, but we also want to discover cycles of conflicting packages. Furthermore, one of our documented use cases involves finding nodes that depend on and conflict with one another.

The GNU C library (packaged as libc6 in Debian) is the bedrock of a lot of code. What are the packages that it depends on? (Yo, DebGraph!)

Original image [997 kB]

libc6 is one of the well-connected vertices in the lower-left corner (third quadrant, if you like). For graphs like this one, we would like DebGraph to identify dependency cycles (loops). For example, libc6 depends on libgcc1, which in turn depends on libc6. This can be seen in the graph above by applying some careful scrutiny, but why not let the computer work its magic for us? (To be continued.)

As a bonus, who depends on arping 2.05-2 for i386?

The roadmap for DebGraph outlines three sets of graph operators: essential, useful, and exotic. Naturally, each successive set extends the previous set's functionality, and in many cases the complexity of the work performed by the operators increases accordingly.

Over the weekend, I wrapped up work on the essential operators and separated the unit test suite into its own subdirectory and executable. Having a few simple operators around means that we can do basic queries:

  • Show me all 3dchess packages.
  • Show me all all libc6 packages.
  • Show me the union of the two previous results, including dependencies between the two groups.
(Click the graphs for a larger version.)

These graphs can be helpful, but they are rather trivial. More useful would be an operator that could, say, find all dependency loops (cycles) and return them in such a way that each cycle could be examined individually.

Next step: an operator that computes all dependencies for a given package.

Last week I bought a Netgear WG511 wireless adapter for my laptop, intending to replace the archaic prism2-based adapter that didn't support WPA and barely worked. Initial searches indicated success in making it work with Linux, so I wasn't worried.

It turns out that there are at least three different versions of this card, each with a different chipset, and the only clear delineation is the "Made in X" label on the adapter, where X is either "China" or "Taiwan". My WG511 was made in China and has a Marvell chipset. Among other things, this means that I'm forced to use ndiswrapper and the Windows 2000 drivers. (The adapters made in Taiwan are natively supported by the prism54 driver included with the Linux kernel.) Okay, fine, I thought. It's not ideal, but the adapter is popular enough that maybe development of a native Marvell driver is underway.

The next step was to get WPA working. The NetworkManager applet listed my ESSID with good signal strength, but after repeated unsuccessful attempts to associate it was clear that something was wrong. The applet would spin for a few seconds trying to connect before immediately returning to the "no connection" icon. I tried the Windows XP drivers with the same result. I tried rebooting to give the card a chance to reset: same result.

The solution was to use wpasupplicant directly, specifying my ESSID, pre-shared key, algorithms, and so on directly in the wpa_supplicant.conf. After a few seconds of key negotiation, I was associated and had acquired an IP address.

So:

  • Manufacturers need to play ball more. Openness will sell more units!
  • What exactly is causing nm-applet to fail?
  • Yay for SoC hacking on the front porch.

This summer, I'm working on DebGraph (for Debian, as part of GSoC) with Robert Lemmen. Below is a brief description of where we're headed.

DebGraph is a tool that provides a simple, intuitive interface to the Debian package graph. The package graph contains information about packages (name, version, description, etc), their dependencies, and other useful details. By providing an easy way to interact with this graph, DebGraph enables useful analyses that can identify interesting or problematic constellations.

As an example, it is helpful for the Debian QA team to track cyclic dependencies (that is, packages that [in]directly depend on themselves). Such dependency cycles can be symptoms of package problems, and therefore it is important that there is tool support for locating and tracking them.

To date, many tools have been written to perform various tasks that involve analyzing Debian packages. The `lintian' tool performs tests on packages to ensure that they are complete. `Britney', another package analysis tool, assists the release and quality assurance teams by verifying package dependencies before promoting packages from the `unstable' distribution to `testing'.

As the number of packages in Debian has grown, the complexity of the package dependency graph has become difficult to manage. Development of new scripts that interact with package information involves addressing that complexity. This involves writing code that can represent the graph in memory and manipulate it, and such work is typically delicate and error-prone. It would be helpful to separate this functionality into a library that could provide generic graph infrastructure to anyone who needs to query or manipulate the Debian package dependency graph.

DebGraph aims to provide this generic infrastructure through a set of graph operators (e.g., union, intersection, find-cycles, and so on) that can be composed in complex ways to facilitate powerful analysis of the graph. In addition, the development of a functional query language will enable graph queries from domain-specific languages, thus removing the constraint of using C++ code to perform queries.

This work relieves developers of the burden of duplicating graph-specific code when developing new tools (or extending existing ones) that need access to the package dependency graph. By providing a generic infrastructure to support such tools, DebGraph enables tool developers (and others in the community) to focus on the issue at hand without becoming swamped with low-level details.

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!