Older blog entries for gpoo (starting at number 8)

How many recent files does an application really need?

A few days ago, Claudio wrote about the time Eog spent saving the filename of the recently used image into ~/.recently-used.xbel. The reason: ~/.recently-used.xbel was too big. If I remember correctly, the FileChooser used to have a similar issue in the past.

Going from 5.8 MiB to 1.8 MiB by deleting all the items whose files no longer exist seems a big gain. I wanted to go a bit further and wondered: how many recent files does an application really need? (Sorry, not that much further :-) I do not think more than 10, but let me know if I am wrong.

I wrote my own version of <a href="http://www.gnome.org/~csaavedra/news-2008-03.html#D23">Claudio's program</a> with that in mind, and my ~/.recently-used.xbel file went from 1.2 MiB to 54 KiB. Before getting to the script, let me show the numbers I got on a computer with less than two months of light use:

gpoo@pendragon:~$ python clean-recently-used.py -v
Summary:
     1 Reproductor de películas Totem
     1 Glade
     4 GNU Image Manipulation Program
     4 Navegador web
     9 Visor de documentos Evince
     9 File Roller
    14 Web Browser
    15 Gnumeric Spreadsheet
    26 gedit
    34 Administrador de archivos
    36 Evince Document Viewer
    52 Totem Movie Player
   292 File Manager
  1151 Eye of GNOME Image Viewer

When I load Eog, it only shows me the last 5 files I opened. Why does it need 1146 extra items stored?

Never mind. The <a href="http://www.gnome.org/~gpoo/bag/clean-recently-used.py">script</a> I wrote is simple. It deletes the entries whose files do not exist (the same strategy as Claudio's program), but it also deletes the entries that are not so recently used, and I got the following numbers:

gpoo@pendragon:~$ python clean-recently-used.py -v
Summary:
     1 Glade
     3 GNU Image Manipulation Program
     4 Navegador web
     8 File Roller
     9 Visor de documentos Evince
    10 Totem Movie Player
    10 Eye of GNOME Image Viewer
    10 Web Browser
    10 Gnumeric Spreadsheet
    12 Evince Document Viewer
    13 Administrador de archivos
    14 gedit
    41 File Manager

Now you can set the <a href="http://www.gnome.org/~gpoo/bag/clean-recently-used.py">script</a> to run when you start your session, or you can schedule a cron job to do it.

You can play with the script by running it with just -v, which only gives you a summary of use.

It is slow when it deletes items (it takes some seconds), but things are much better once the file is under control.
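
For readers who want to see the idea without downloading anything, here is a minimal sketch of the two pruning steps described above. It is not the actual clean-recently-used.py. It assumes the XBEL layout that GTK+ writes (bookmark elements with href and modified attributes, and bookmark:application children naming each program), and the names prune, local_file_exists and keep are only illustrative.

```python
import os
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import unquote, urlparse

# Namespace assumed for the <bookmark:application> elements.
BOOKMARK_NS = "http://www.freedesktop.org/standards/desktop-bookmarks"
APP_TAG = "{%s}application" % BOOKMARK_NS


def local_file_exists(href):
    """Return True for non-file URIs and for file URIs whose target exists."""
    parsed = urlparse(href)
    if parsed.scheme != "file":
        return True  # remote items cannot be checked cheaply, so keep them
    return os.path.exists(unquote(parsed.path))


def prune(root, keep=10, exists=local_file_exists):
    """Drop bookmarks for missing files, then keep the `keep` newest per app."""
    by_app = defaultdict(list)
    for bm in list(root.findall("bookmark")):
        if not exists(bm.get("href", "")):
            root.remove(bm)
            continue
        for app in bm.iter(APP_TAG):
            by_app[app.get("name")].append(bm)

    keepers = set()
    for items in by_app.values():
        # Assumption: timestamps are ISO 8601 in one timezone, so they
        # sort correctly as plain strings.
        items.sort(key=lambda bm: bm.get("modified", ""), reverse=True)
        keepers.update(id(bm) for bm in items[:keep])

    # A bookmark survives if it is among the newest for at least one
    # application; bookmarks registered by no application are dropped.
    for bm in list(root.findall("bookmark")):
        if id(bm) not in keepers:
            root.remove(bm)
    return root
```

Because a file registered by several applications is kept if any one of them still counts it among its newest, the totals per application can end up above the limit, as with the 41 items of the File Manager above. If you write the tree back with ElementTree, register the namespace prefixes first with ET.register_namespace, otherwise they may be renamed.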

Syndicated 2008-04-01 07:01:33 from Germán Poó-Caamaño

How much space are your thumbnails eating?

Thumbnails are created by applications and, thanks to a proposed draft, are shared among desktops. But that does not mean that every thumbnail stored in your home directory is useful for the purpose it was created for. Some of them point to a file that does not exist anymore, some are broken images, and some were created by applications that do not respect the proposed draft.

Basically there are two sizes of thumbnails: normal (128x128 pixels) and large (256x256 pixels). Each thumbnail must contain at least two key/value pairs: one is the URI of the original file and the other is the last time the file was modified.

To get the file name of a thumbnail, an MD5 sum must be applied to the URI of the original file. If you move the file to a new location, then the name of the thumbnail must be updated (and so must its metadata).
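
As a small illustration of that rule, the following sketch computes where a thumbnail would live for a given URI. The function name thumbnail_path and its parameters are made up for this example, and the base directory is the ~/.thumbnails folder mentioned in this post; other setups may use a different location.

```python
import hashlib
import os


def thumbnail_path(uri, size="normal", base="~/.thumbnails"):
    """Return the expected thumbnail file for `uri` (size: normal or large)."""
    # The file name is the hexadecimal MD5 digest of the URI plus ".png".
    digest = hashlib.md5(uri.encode("utf-8")).hexdigest()
    return os.path.join(os.path.expanduser(base), size, digest + ".png")


if __name__ == "__main__":
    # Moving the file changes its URI, and therefore the thumbnail name.
    print(thumbnail_path("file:///home/user/old/photo.png"))
    print(thumbnail_path("file:///home/user/new/photo.png"))
```

Since the name depends only on the URI, nothing links the old thumbnail to the new one after a move, which is why leftovers pile up.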

When you delete a file through Nautilus, the file is moved to the Trash folder, so its thumbnail must be updated. Nautilus does it right, which is good. But when you empty the Trash, only the original file is deleted, not the thumbnail, which is bad but easy to fix.

On the other hand, when you rename a folder, the next time the folder is visited (now under its new name) the thumbnails will be regenerated, because there is no thumbnail associated with any of the new URIs. Now you have two thumbnails stored for the same file, but only one is valid. If you repeat this step often, your .thumbnails folder will get polluted with useless thumbnails.

Instead of renaming the folder, you can create a new folder, move the files there, and finally delete the old one. In this case Nautilus will not regenerate the thumbnails; it will update them correctly. At least at the first level of the hierarchy (I have not tested it deeply).

The worst case happens when the files are moved or deleted by an application that is not freedesktop compliant (or only kind of compliant), let's say the shell. The thumbnails associated with those files will not be updated or deleted. (inotify to the rescue?)

A normal thumbnail takes 25 KB of space on average, while a large one takes 75 KB. If you keep a lot of pictures over a long period of time (with all the file management involved), you probably have a fair amount of space wasted by useless thumbnails.

At least, I had. And I have the feeling that some other people do, too. A time to live for thumbnails has been requested; it is filed in bugzilla as #150483.

Instead of deleting my old thumbnails, I prefer to delete only the useless ones (in the sense of my first paragraph). So I wrote a little script in Python (shorter than this comment) that estimates how much space I am wasting because of useless thumbnails.
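
The script itself is not reproduced here, so what follows is only a rough sketch of how such an estimate could be made, not the original code. It assumes the thumbnails are PNG files that store the original URI in a tEXt chunk under the key Thumb::URI; the function names are hypothetical. A thumbnail counts as useless when it is not a readable PNG, lacks that key, or points to a local file that no longer exists.

```python
import os
import struct
from urllib.parse import unquote, urlparse

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_text_chunks(data):
    """Return the tEXt key/value pairs of a PNG given as bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts, pos = {}, len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte length, 4-byte type, body, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt" and b"\x00" in body:
            key, value = body.split(b"\x00", 1)
            texts[key.decode("latin-1")] = value.decode("latin-1")
        elif ctype == b"IEND":
            break
        pos += 12 + length
    return texts


def is_useless(data, exists=os.path.exists):
    """Apply the three criteria from the first paragraph to one thumbnail."""
    try:
        uri = read_text_chunks(data).get("Thumb::URI")
    except ValueError:
        return True  # broken image
    if uri is None:
        return True  # written by an application that ignores the draft
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        return False  # remote original: cannot tell, so keep it
    return not exists(unquote(parsed.path))


def wasted_bytes(thumb_dir="~/.thumbnails", exists=os.path.exists):
    """Sum the sizes of all useless thumbnails below `thumb_dir`."""
    total = 0
    for dirpath, _dirs, files in os.walk(os.path.expanduser(thumb_dir)):
        for name in files:
            with open(os.path.join(dirpath, name), "rb") as fh:
                data = fh.read()
            if is_useless(data, exists):
                total += len(data)
    return total
```

Calling wasted_bytes() only reports a number; nothing is deleted, which seems the safer default for a first run.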

Rupert in the scene

Rupert in Code Monkey at Work

After the success of "Code Monkey at Work", where Rupert was the young hero of the movie, Rupert was invited to make a cameo in "Aardvark'd: 12 Weeks with Geeks", a documentary film (an idea of Joel Spolsky's) about the whole process of building Copilot (the simplest way to use VNC through a reflector; full of features, with a simple user interface and a better name).

Rupert in Aardvark'd
Rupert in the documentary
(Rupert also appears in the trailer)

I received my copy on Saturday 10 and watched it the same afternoon. Interesting, whether or not you agree with what Joel usually writes about software development.

Anyway, I thought Fog Creek was bigger than it actually is, and I was not expecting their concerns about risk in their first days (probably the same happened at Ximian, Fluendo and other related companies).

A keynote by Joel Spolsky could be very interesting indeed. Or we could have a short documentary about our community (we have the chance at GUADEC, the Boston Summit, and so on).

II Fórum GNOME in Brazil

Finally I'm going to Brazil for the second edition of Fórum do GNOME, which will take place in Curitiba. I'm looking forward to meeting Lucas Rocha, Sandino Flores, Tim Ney, Marcos Mazoni, Izabel Valverde and the local community. (Thanks to Celepar for funding my ticket to get there.)

I will also be speaking about GNOME at "I Seminário de Software Livre", at Universidade Metodista de São Paulo, on Tuesday 29.

Sadly, I won't be able to attend "III SOLISC - Congresso Catarinense de Software Livre" in Florianópolis, because I must be at work on Friday 2.

I'll do my best to get my slides into Brazilian Portuguese (with Spanish and English as backup). I'm just a little worried, not because of the language itself (I know they understand Spanish, and I understand some Portuguese as well), but because I usually speak Spanish fast, even by Chilean standards. I hope nobody minds stopping me any time it is needed.

A graph of gnome-session

Thanks to a coursework assignment of Tatiana's, we took a little time to make a graph of the GNOME session startup using Bootchart.

Bootchart is a program that collects information about the disk and processor usage of the system. It was created to gather real data for finding the bottlenecks in the Linux init process and thus be able to speed it up.

It seemed to be an "invasive" process that would perhaps require modifying the kernel or something similar. In fact, collecting the data only requires installing a script, which must be invoked by the kernel. Remember that the first process executed in UNIX is init; instead of invoking init, the kernel must invoke bootchartd, which leaves some processes running in the background and then invokes init. It is as simple as adding the following option at boot:

init=/sbin/bootchartd

The collected data is stored in /var/log/bootchart.tgz (which contains 4 files). That file is the input for bootchart, a program written in Java that generates a graph from the collected data.

Basically the program reads the virtual files /proc/diskstats, /proc/stat and /proc/*/stat (one for each running process) every 2 seconds. That being so, it can also be used to measure other programs; for example, the GNOME session startup.
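
To give a feeling of what such sampling looks like, here is a tiny sketch that polls only the aggregate cpu line of /proc/stat and turns two consecutive samples into a busy fraction. It is not how bootchartd is implemented (that is a shell script that stores the raw files for later processing); the function names and the choice to count iowait as idle time are assumptions made for this example.

```python
import time


def parse_cpu_line(stat_text):
    """Return (busy, total) jiffies from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # user nice system idle iowait irq softirq steal
            values = [int(v) for v in fields[1:9]]
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            total = sum(values)
            return total - idle, total
    raise ValueError("no aggregate cpu line found")


def cpu_usage(before, after):
    """Fraction of time the CPU was busy between two (busy, total) samples."""
    busy = after[0] - before[0]
    total = after[1] - before[1]
    return busy / total if total else 0.0


def sample(period=2.0, count=5, path="/proc/stat"):
    """Yield `count` CPU usage fractions, one every `period` seconds."""
    def read():
        with open(path) as fh:
            return parse_cpu_line(fh.read())

    previous = read()
    for _ in range(count):
        time.sleep(period)
        current = read()
        yield cpu_usage(previous, current)
        previous = current


if __name__ == "__main__":
    for usage in sample():
        print("cpu busy: %5.1f%%" % (usage * 100))
```

The counters in /proc/stat only ever grow, so every figure has to be computed as a difference between two samples; the same holds for the disk counters in /proc/diskstats.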

Perhaps many have already realized that the startup depends heavily on the time gconf takes to read all the preferences and schemas. Federico has been pointing this out for a month already, and Havoc has said that it is a pending issue in gconf, waiting for some developer to take on the task of solving it.

As a picture is worth a thousand words, I ran bootchartd to make a graph of my GNOME session on a notebook (Centrino 1.8 GHz, 512 MB RAM):

Graph of the GNOME session startup
Graph of the second session. The first one takes more time, but it's possible to get the idea.

It shows the time each application takes to start, as well as the disk usage of each program. The process really starts when gdmgreeter is replaced by x-session-manager, which happens when the user name and password are entered, and it finishes when the interface is ready to be used; in my case that happened at second 23.

The amount of reading gconf does varies with the number of stored keys, which is directly related to how many applications a user uses. Thus, in a user's first session, gconf performs less than 1/3 of the reads shown in the graph of my session today. It also varies depending on whether it is the first session since the computer started up.

As can be seen in the graph, there is another process that does a lot of disk reading: Jamboree, which reads the music files. There is also a python process that makes intensive use of the CPU for a short period of time (it is related to the Ubuntu update notifier).

The process of data collection is simple:

$ sudo bootchartd start
$ (one begins the session, via gdm, startx, etc.)
$ sudo bootchartd stop

Once the process finishes, it is time to run bootchart to generate the graph. If Java is not available on the machine, it is still possible to get the graph by filling in a form on the Bootchart website.

At first sight, the process of data collection seemed complex. But it is not. The main work is processing the collected data to get the graph.

Cultivating Third World Countries Developers

I just uploaded the slides of my talk/BOF at the 6th GUADEC in Stuttgart.

My original idea was to write about the discussion that we had. It was quite interesting for me. But I haven't had enough time to do it.

Anyway, some of the slides were written to be funny and superficial. It was a mix of joke and serious talk, mainly to give an idea of some differences that people perceive in third world countries; some of them could seem stupid/strange to a structured mind.

The original idea was born in the middle of a conversation that Roozbeh and I had at the 5th GUADEC in Kristiansand, Norway.

New place for GNOME writings

Now that I have my own space to store my writings, I decided to use it to write about GNOME-related topics.

Also, my main blog is written in Spanish and I'll keep writing there in my native language, because I know it's read by people who only speak Spanish.

I started writing as a personal record of my activities and, after that, to keep my friends updated without repeating every activity every time.

Because there are friends and people who don't speak Spanish, I decided to start writing in English as well. At first, only GNOME-related topics, since that is where most of the people I know belong.

25 Sep 2002 (updated 22 Sep 2005 at 03:45 UTC) »

Please follow my diary, written in Spanish. I also sometimes write a blog in English.
