timj is currently certified at Master level.

Name: Tim Janik
Member since: N/A
Last Login: 2011-04-03 01:59:01

Homepage: http://timj.testbit.eu/

Notes:

I'm a long-standing GTK+ maintainer and also regularly or randomly hack on various other free software components. Of course I also have a couple of private projects that can be found on my homepage. Although I never use Advogato for blogging, you can find my personal ramblings on my homepage.

Projects

Recent blog entries by timj

Syndication: RSS 2.0

Sayepurge.sh – determine deletion of aging backups

About

Determine candidates and delete from a set of directories containing aging backups.
As a follow-up to the release of sayebackup.sh last December, here’s a complementary tool we’re using at Lanedo. Suppose a number of backup directories have piled up after a while, using sayebackup.sh or any other tool that creates time-stamped file names:

 drwxrwxr-x etc-2010-02-02-06:06:01-snap
 drwxrwxr-x etc-2011-07-07-06:06:01-snap
 drwxrwxr-x etc-2011-07-07-12:45:53-snap
 drwxrwxr-x etc-2012-12-28-06:06:01-snap
 drwxrwxr-x etc-2013-02-02-06:06:01-snap
 lrwxrwxrwx etc-current -> etc-2012-12-28-06:06:01-snap

Which file should be deleted once the backup device starts to fill up?
Sayepurge parses the timestamps from the names of this set of backup directories, computes the time deltas, and determines good deletion candidates so that the remaining backups are spaced out over time as evenly as possible. The exact behavior can be tuned by specifying the number of recent files to guard against deletion (-g), the number of historic backups to keep around (-k) and the maximum number of deletions for any given run (-d). In the above set of files, the two backups from 2011-07-07 are less than 7 hours apart, so they make good purging candidates. For example:

 $ sayepurge.sh -o etc -g 1 -k 3 
 Ignore: ./etc-2013-02-02-06:06:01-snap
 Purge:  ./etc-2011-07-07-06:06:01-snap
 Keep:   ./etc-2012-12-28-06:06:01-snap
 Keep:   ./etc-2011-07-07-12:45:53-snap
 Keep:   ./etc-2010-02-02-06:06:01-snap
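
The selection idea can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the logic of sayepurge.sh itself: the names pick_purge_candidate and select_purges are made up, time stamps are assumed to be already parsed into integer seconds, ties are broken in favor of purging the backup whose other neighbour is closer, and the -d deletion limit is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical helper: index of the backup that sits closest to its
// neighbours. `stamps` holds time stamps in seconds, sorted ascending.
std::size_t pick_purge_candidate(const std::vector<std::int64_t>& stamps) {
  assert(!stamps.empty());
  const std::int64_t none = std::numeric_limits<std::int64_t>::max();
  std::size_t best = 0;
  std::pair<std::int64_t, std::int64_t> best_score(none, none);
  for (std::size_t i = 0; i < stamps.size(); ++i) {
    // Gaps to the older and newer neighbour; the ends have only one gap.
    const std::int64_t prev = i > 0 ? stamps[i] - stamps[i - 1] : none;
    const std::int64_t next =
        i + 1 < stamps.size() ? stamps[i + 1] - stamps[i] : none;
    // Score: (nearest gap, other gap); the smallest score is purged first.
    const std::pair<std::int64_t, std::int64_t> score(std::min(prev, next),
                                                      std::max(prev, next));
    if (score < best_score) {
      best = i;
      best_score = score;
    }
  }
  return best;
}

// Hypothetical driver: returns the time stamps that would be purged so
// that `nkeeps` backups remain besides the `nguarded` newest ones.
std::vector<std::int64_t> select_purges(std::vector<std::int64_t> stamps,
                                        std::size_t nguarded,
                                        std::size_t nkeeps) {
  std::sort(stamps.begin(), stamps.end());
  // The newest `nguarded` backups are never considered for deletion.
  stamps.resize(stamps.size() - std::min(nguarded, stamps.size()));
  std::vector<std::int64_t> purged;
  while (stamps.size() > nkeeps) {
    const std::size_t i = pick_purge_candidate(stamps);
    purged.push_back(stamps[i]);
    stamps.erase(stamps.begin() + static_cast<std::ptrdiff_t>(i));
  }
  return purged;
}
```

Called with five time stamps, nguarded = 1 and nkeeps = 3, the sketch drops the newest entry from consideration and purges one of the two closely spaced entries, which mirrors the shape of the run shown above.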

For day-to-day use, it makes sense to combine both tools, e.g. via crontab. Here’s a sample command to perform daily backups of /etc/ and then keep 6 directories’ worth of daily backups stored in a toplevel directory for backups:

 /bin/sayebackup.sh -q -C /backups/ -o etc /etc/ && /bin/sayepurge.sh -q -o etc -g 3 -k 3

Let me know in the comments what mechanisms you are using to purge aging backups!

Resources

The GitHub release tag is here: backups-0.0.2
Script URL for direct downloads: sayepurge.sh

Usage
Usage: sayepurge.sh [options] sources...
OPTIONS:
  --inc         merge incremental backups
  -g <nguarded> recent files to guard (8)
  -k <nkeeps>   non-recent to keep (8)
  -d <maxdelet> maximum number of deletions
  -C <dir>      backup directory
  -o <prefix>   output directory name (default: 'bak')
  -q, --quiet   suppress progress information
  --fake        only simulate deletions or merges
  -L            list all backup files with delta times
DESCRIPTION:
  Delete candidates from a set of aging backups to spread backups most evenly
  over time, based on time stamps embedded in directory names.
  Backups older than <nguarded> are purged, so that only <nkeeps> backups
  remain. In other words, the number of backups is reduced to <nguarded>
  + <nkeeps>, where <nguarded> are the most recent backups.
  The purging logic will always pick the backup with the shortest time
  distance to other backups. Thus, the number of <nkeeps> remaining
  backups is most evenly distributed across the total time period within
  which backups have been created.
  Purging of incremental backups happens via merging of newly created
  files into the backup's predecessor. Thus merged incrementals may
  contain newly created files from after the incremental backup's creation
  time, but the function of reverse incremental backups is fully
  preserved. Merged incrementals use a different file name ending (-xinc).

See Also

Sayebackup.sh – deduplicating backups with rsync

Syndicated 2013-02-08 15:28:59 from Tim Janik

Performance of a C++11 Signal System

First, a quick intro for the uninitiated: signals in this context are structures that maintain lists of callback functions with arbitrary arguments, plus assorted reentrant machinery to modify the callback lists and to call the callbacks. They allow customization of object behavior in response to signal emissions by the object (i.e. notifying the callbacks by means of invocations).

Over the years, I have rewritten each of GtkSignal, GSignal and Rapicorn::Signal at least once, but most of that was a long time ago, some of it more than a decade. With the advent of lambdas, template argument lists and std::function in C++11, it became time for me to dive into rewriting a signal system once again.

So for the task at hand, which is mainly to update the Rapicorn signal system to something that fits in nicely with C++11, I’ve settled on the most common signal system requirements:

  • Signals need to support arbitrary argument lists.
  • Signals need to provide single-threaded reentrancy, i.e. it must be possible to connect and disconnect signal handlers and re-emit a signal while it is being emitted in the same thread. This one is absolutely crucial for any kind of callback list invocation that’s meant to be remotely reliable.
  • Signals should support non-void return values (of little importance in Rapicorn but widely used elsewhere).
  • Signals can have return values, so they should support collectors (i.e. GSignal accumulators or boost::signal combiners) that control which handlers are called and what is returned from the emission.
  • Signals should have only moderate memory impact on class instances, because at runtime many instances that support signal emissions will actually have 0 handlers connected.

For me, the result is pretty impressive. With C++11, a simple signal system that fulfills all of the above requirements can be implemented in less than 300 lines in a few hours, without the need to resort to any preprocessor magic, scripted code generation or libffi.
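
To give a feel for how little code the core needs, here is a minimal sketch covering only two of the requirements: arbitrary argument lists and single-threaded reentrancy. It is not the code from simplesignal.cc. The class name MiniSignal and its id-based API are made up for illustration, return values and collectors are left out, and reentrancy is handled by the simple (and allocation-heavy) technique of emitting over a snapshot of the handler list.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical minimal signal: void handlers only, no collectors.
template<typename... Args>
class MiniSignal {
  struct Link {
    std::size_t id;
    std::function<void(Args...)> fn;
    bool active;
  };
  std::list<std::shared_ptr<Link>> links_;
  std::size_t next_id_ = 1;

public:
  // Add a handler and return an id that can be passed to disconnect().
  std::size_t connect(std::function<void(Args...)> fn) {
    assert(fn != nullptr);
    links_.push_back(
        std::make_shared<Link>(Link{next_id_, std::move(fn), true}));
    return next_id_++;
  }

  // Remove a handler; safe to call from inside a running handler.
  bool disconnect(std::size_t id) {
    for (auto it = links_.begin(); it != links_.end(); ++it) {
      if ((*it)->id == id) {
        (*it)->active = false;  // seen by emissions that are in progress
        links_.erase(it);
        return true;
      }
    }
    return false;
  }

  // Call all handlers that were connected when the emission started.
  void emit(Args... args) {
    // The snapshot keeps each Link alive and keeps iteration valid even
    // if handlers connect, disconnect or re-emit during this emission.
    std::vector<std::shared_ptr<Link>> snapshot(links_.begin(),
                                                links_.end());
    for (const auto& link : snapshot)
      if (link->active)
        link->fn(args...);
  }
};
```

With this sketch, a handler connected during an emission is first called on the next emission, and a handler disconnected during an emission is skipped if it has not run yet. A production implementation would avoid the per-emission copy.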

I say “simple”, because over the years I’ve come to realize that many of the bells and whistles implemented in GSignal or boost::signals2 don’t matter much in my practical day-to-day programming, such as the ability to block specific signal handlers, automated tracking of signal handler argument lifetimes, emission details, restarts, cancellations, cross-thread emissions, etc.

Beyond the simplicity that C++11 allows, it’s of course the performance that is most interesting. The old Rapicorn signal system (C++03) comes with its own set of callback wrappers named “slot”, which support between 0 and 16 arguments and essentially mimic std::function. The new C++11 std::function implementation, in contrast, is opaque to me and supports an unlimited number of arguments, so I was especially curious to see the performance of a signal system based on it.

I wrote a simple benchmark that just measures the times for a large number of signal emissions with negligible time spent in the actual handler.

The signal handler just does a simple uint64_t addition and returns. While the scope of this benchmark is clearly very limited, it serves quite well to give an impression of the emission overhead of a signal system, which is the most common performance-relevant aspect in practical use.
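
A timing loop of this kind can be sketched as follows. This is not the benchmark behind the numbers in this post, only an assumed minimal shape of one: the function name ns_per_call is made up, the handler is called directly through std::function rather than through a signal, and nothing is done to stop an optimizer from simplifying the loop.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>

// Hypothetical helper: average wall-clock nanoseconds per handler call.
double ns_per_call(const std::function<void(std::uint64_t)>& handler,
                   std::uint64_t ncalls) {
  assert(ncalls > 0);
  const auto start = std::chrono::steady_clock::now();
  for (std::uint64_t i = 0; i < ncalls; ++i)
    handler(i);  // stands in for one signal emission
  const auto stop = std::chrono::steady_clock::now();
  const auto ns =
      std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
          .count();
  return static_cast<double>(ns) / static_cast<double>(ncalls);
}
```

Passing a lambda that adds its argument to a uint64_t counter reproduces the kind of workload described above. Printing the counter afterwards is one simple way to keep the work observable.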

Without further ado, here are the results of the time spent per emission (less is better) and memory overhead for an unconnected signal (less is better):

Signal System            Emit() in nanoseconds   Static Overhead (bytes)   Dynamic Overhead (bytes)
GLib GSignal             341.156931              0                         0
Rapicorn::Signal, old    178.595930              64                        0
boost::signals2           92.143549              24                        400 (=265+7+8*16)
boost::signal             62.679386              40                        392 (=296+6*16)
Simple::Signal, C++11      8.599794              8                         0
Plain Callback             1.878826              -                         -

 

Here, “Plain Callback” indicates the time spent on the actual workload, i.e. without any signal system overhead. All measurements were taken on an Intel Core i7 at 2.8GHz. Considering the workload, the performance of the C++11 signals is probably close to ideal, and I’m more than happy with it. I’m also seriously impressed with the speed that std::function allows for; I was originally expecting its overhead to be at least an order of magnitude larger.

The memory overhead figures account for a signal with 0 connections after its constructor has been called, on a 64-bit platform. The “static overhead” is what’s usually embedded in a C++ instance, the “dynamic overhead” is what the embedded signal allocates with operator new in its constructor (the size calculations correspond to effective heap usage, including malloc boundary marks).

The reason GLib’s GSignal has 0 static and 0 dynamic overhead is that it keeps track of signals and handlers in a hash table and sorted arrays, which only consume memory per (instance, signal, handler) triplet, i.e. instances without any signal handlers really have 0 overall memory impact.

Summary:

  • If you need built-in thread safety plus other bells and whistles and can spare lots of memory per signal, boost::signals2 is the best choice.
  • For tight scenarios without any spare byte per instance, GSignal will treat your memory best.
  • If you just need raw emission speed and can spare the extra whistles, the C++11 single-file simplesignal.cc excels.

For the interested, the brief C++11 signal system implementation can be found here: simplesignal.cc
The API docs for the version that went into Rapicorn are available here: aidasignal.hh

PS: In retrospect I need to add that, in this day and age, the better trade-off for GLib could be one or two pointers consumed per instance and signal, if those allowed emission optimizations by a factor of 3 to 5. However, given its complexity and the number of wrapping layers involved, this might be hard to accomplish.

Syndicated 2013-01-25 15:11:11 from Tim Janik

Sayebackup.sh – deduplicating backups with rsync

About

By popular request, I’m putting up a polished version of the backup script that we’ve been using over the years at Lanedo to back up our systems remotely. This script uses a special feature of rsync(1) v2.6.4 to create backups that share storage space with previous backups by hard-linking files.
The various options needed for rsync and ssh to minimize transfer bandwidth over the Internet, time-stamping for the backups and handling of several rsync oddities warranted encapsulation of the logic into a dedicated script.

Resources

The GitHub release tag is here: backups-0.0.1
Script URL for direct downloads: sayebackup.sh

Example

This example shows creation of two consecutive backups and displays the sizes.

$ sayebackup.sh -i ~/.ssh/id_examplecom user@example.com:mydir # create backup as bak-.../mydir
$ sayebackup.sh -i ~/.ssh/id_examplecom user@example.com:mydir # create second bak-2012...-snap/
$ ls -l # show all the backups that have been created
drwxrwxr-x 3 user group 4096 Dez  1 03:16 bak-2012-12-01-03:16:50-snap
drwxrwxr-x 3 user group 4096 Dez  1 03:17 bak-2012-12-01-03:17:12-snap
lrwxrwxrwx 1 user group   28 Dez  1 03:17 bak-current -> bak-2012-12-01-03:17:12-snap
$ du -sh bak-* # the second backup is smaller due to hard links
4.1M    bak-2012-12-01-03:16:50-snap
128K    bak-2012-12-01-03:17:12-snap
4.0K    bak-current

Usage
Usage: sayebackup.sh [options] sources...
OPTIONS:
  --inc         make reverse incremental backup
  --dry         run and show rsync with --dry-run option
  --help        print usage summary
  -C <dir>      backup directory (default: '.')
  -E <exclfile> file with rsync exclude list
  -l <account>  ssh user name to use (see ssh(1) -l)
  -i <identity> ssh identity key file to use (see ssh(1) -i)
  -P <sshport>  ssh port to use on the remote system
  -L <linkdest> hardlink dest files from <linkdest>/
  -o <prefix>   output directory name (default: 'bak')
  -q, --quiet   suppress progress information
  -c            perform checksum based file content comparisons
  --one-file-system
  -x            disable crossing of filesystem boundaries
  --version     script and rsync versions
DESCRIPTION:
  This script creates full or reverse incremental backups using the
  rsync(1) command. Backup directory names contain the date and time
  of each backup run to allow sorting and selective pruning.
  At the end of each successful backup run, a symlink '*-current' is
  updated to always point at the latest backup. To reduce remote file
  transfers, the '-L' option can be used (possibly multiple times) to
  specify existing local file trees from which files will be
  hard-linked into the backup.
 Full Backups:
  Upon each invocation, a new backup directory is created that contains
  all files of the source system. Hard links are created to files of
  previous backups where possible, so extra storage space is only required
  for contents that changed between backups.
 Incremental Backups:
  In incremental mode, the most recent backup is always a full backup,
  while the previous full backup is degraded to a reverse incremental
  backup, which only contains differences between the current and the
  last backup.
 RSYNC_BINARY Environment variable used to override the rsync binary path.

See Also

Testbit Tools – Version 11.09 Release

Syndicated 2012-12-01 02:32:59 from Tim Janik

ListItemFilter Mediawiki Extension

For a while now, I’ve been maintaining my todo lists as backlogs in a Mediawiki repository. I’m regularly deriving sprints from these backlogs for my current task lists. This means identifying important or urgent items that can be addressed next, which can be quite tedious for really huge backlogs.

A SpecialPage extension that I’ve recently implemented now helps me through the process. Using it, I automatically get a filtered list of all “IMPORTANT:”, “URGENT:” or otherwise classified list items. The special page can be used on its own or via template inclusion from another wiki page. The extension page at mediawiki.org has more details.

The Mediawiki extension page is here: http://www.mediawiki.org/wiki/Extension:ListItemFilter

The GitHub page for downloads is here: https://github.com/tim-janik/ListItemFilter

Syndicated 2012-11-23 17:58:17 from Tim Janik

Meeting up at LinuxTag 2012

 

Like every year, I am driving to Berlin this week to attend LinuxTag 2012 and its excellent program. If you want to meet up and chat about projects, technologies, Free Software or other things, send me an email or leave a comment on this post and we will arrange for it.

Syndicated 2012-05-15 13:12:56 from Tim Janik


timj certified others as follows:

  • timj certified raph as Master
  • timj certified federico as Master
  • timj certified miguel as Master
  • timj certified macricht as Journeyer
  • timj certified stric as Journeyer
  • timj certified Adrian as Journeyer
  • timj certified shawn as Journeyer
  • timj certified tigert as Journeyer
  • timj certified lewing as Journeyer
  • timj certified hp as Master
  • timj certified andersca as Master
  • timj certified jacob as Journeyer
  • timj certified nether as Journeyer
  • timj certified vicious as Journeyer
  • timj certified jrb as Journeyer
  • timj certified clahey as Journeyer
  • timj certified LotR as Journeyer
  • timj certified yosh as Master
  • timj certified flaggz as Journeyer
  • timj certified kenelson as Journeyer
  • timj certified bit as Journeyer
  • timj certified jmacd as Master
  • timj certified xach as Journeyer
  • timj certified jlbec as Journeyer
  • timj certified sjburges as Journeyer
  • timj certified alan as Master
  • timj certified Guillaume as Journeyer
  • timj certified Slow as Journeyer
  • timj certified cameron as Apprentice
  • timj certified pavlov as Journeyer
  • timj certified terop as Journeyer
  • timj certified neo as Master
  • timj certified shaver as Master
  • timj certified notzed as Journeyer
  • timj certified feldspar as Apprentice
  • timj certified johnsonm as Master
  • timj certified tml as Journeyer
  • timj certified mjs as Journeyer
  • timj certified pat as Journeyer
  • timj certified riel as Journeyer
  • timj certified martin as Journeyer
  • timj certified blizzard as Journeyer
  • timj certified jbuck as Apprentice
  • timj certified campd as Apprentice
  • timj certified Jimbob as Journeyer
  • timj certified chrisd as Apprentice
  • timj certified bertrand as Apprentice
  • timj certified jsh as Master
  • timj certified jamesh as Master
  • timj certified terral as Apprentice
  • timj certified kelly as Apprentice
  • timj certified justin as Apprentice
  • timj certified Ricdude as Apprentice
  • timj certified lupus as Apprentice
  • timj certified eskil as Apprentice
  • timj certified Raphael as Journeyer
  • timj certified DV as Journeyer
  • timj certified happybob as Apprentice
  • timj certified jonas as Journeyer
  • timj certified mathieu as Apprentice
  • timj certified Telsa as Journeyer
  • timj certified dcm as Master
  • timj certified rms as Master
  • timj certified munizao as Apprentice
  • timj certified mitch as Journeyer
  • timj certified aersoy as Apprentice

Others have certified timj as follows:

  • raph certified timj as Journeyer
  • hp certified timj as Master
  • lewing certified timj as Master
  • stric certified timj as Journeyer
  • clahey certified timj as Master
  • vicious certified timj as Master
  • flaggz certified timj as Master
  • yosh certified timj as Master
  • jacob certified timj as Master
  • Centove certified timj as Master
  • sjburges certified timj as Master
  • Slow certified timj as Master
  • shawn certified timj as Master
  • bit certified timj as Master
  • andrei certified timj as Master
  • ole certified timj as Master
  • cameron certified timj as Master
  • neo certified timj as Master
  • Acapnotic certified timj as Master
  • feldspar certified timj as Master
  • mjs certified timj as Master
  • harold certified timj as Master
  • bombadil certified timj as Master
  • dcm certified timj as Master
  • jsh certified timj as Master
  • Raphael certified timj as Master
  • listen certified timj as Master
  • mitch certified timj as Master
  • mathieu certified timj as Master
  • aaronl certified timj as Master
  • gstein certified timj as Master
  • duncan certified timj as Master
  • lupus certified timj as Journeyer
  • jimmac certified timj as Master
  • odaf certified timj as Master
  • asmodai certified timj as Master
  • kelly certified timj as Master
  • Darin certified timj as Master
  • Adrian certified timj as Master
  • eskil certified timj as Master
  • dsueiro certified timj as Master
  • Guillaume certified timj as Master
  • nils certified timj as Master
  • harinath certified timj as Master
  • jonas certified timj as Journeyer
  • nelsonrn certified timj as Master
  • lauris certified timj as Master
  • nomis certified timj as Master
  • rodrigo certified timj as Master
  • jae certified timj as Master
  • jsheets certified timj as Master
  • dbartold certified timj as Master
  • timg certified timj as Master
  • jules certified timj as Master
  • jonkare certified timj as Master
  • inri certified timj as Master
  • bratsche certified timj as Master
  • timur certified timj as Master
  • motty certified timj as Master
  • jLoki certified timj as Master
  • jfleck certified timj as Master
  • jamesh certified timj as Master
  • adulau certified timj as Master
  • rw certified timj as Master
  • andersca certified timj as Master
  • ricardo certified timj as Master
  • murrayc certified timj as Master
  • gka certified timj as Journeyer
  • carol certified timj as Master
  • mathrick certified timj as Master
  • dbrock certified timj as Master
  • lucasr certified timj as Master
  • cinamod certified timj as Master
  • kfoltman certified timj as Master
  • henrique certified timj as Master
  • nedko certified timj as Master
