Older blog entries for simosx (starting at number 3)

Can't type accented characters on the Linux console while in Unicode mode?

The dead keys functionality for Unicode mode needs some more love.

A workaround is to convert the console keymaps so that they do not require dead keys. For example, with dead keys, to put an accent on "a", you would normally type ";", then "a". You can convert the keymap so that, e.g. Alt+a will print the "a" with an accent.
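As a rough sketch of what such a converted keymap entry could look like, the helper below prints one keymap line that binds Alt plus a key to a Unicode code point. The function name alt_binding is made up for illustration, keycode 30 for the "a" key is an assumption that holds for common PC keyboards, and I am assuming the keymap is loaded by a loadkeys that accepts the U+XXXX keysym notation.

```shell
# Hypothetical helper: print a keymap line binding Alt+<keycode> to a code point.
# Assumption: the keymap format accepts "alt keycode N = U+XXXX" lines.
alt_binding() {
    keycode=$1      # numeric keycode of the key (30 is assumed to be "a")
    codepoint=$2    # four hex digits, e.g. 00E1 for "a" with acute accent
    printf 'alt keycode %d = U+%s\n' "$keycode" "$codepoint"
}

# Alt+a -> U+00E1 (a with acute accent)
alt_binding 30 00E1
```

The printed line could be appended to a copy of the keymap file, one line per accented letter, and that copy loaded with loadkeys.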

To sum up the current support on the Linux console (that's not xterm but what you get with Ctrl-Alt-F1 :), you can view Unicode documents in a wide range of character sets as long as combining characters are not required (Hindi no, Vietnamese yes). You can input (write) in Unicode mode as long as no dead keys are involved.

A dead key is a key that appears to do nothing when you press it. When you then press a second key, the combined character is printed (for example, ";"+"a" for ά).

I tried Chris Heath's patch on Fedora Core 2, and here is my take from the user's point of view.

The patch works surprisingly well. It can easily be integrated into the various distributions simply by modifying the /etc/sysconfig/i18n and /etc/sysconfig/keyboard configuration files.

The system scripts essentially call the following two commands (assuming we are already in Unicode mode):

% setfont <font-name> -m <console-screen-map>
% loadkeys <keymap>

For example, for Spanish:

% setfont latarcyrheb-sun16 -m 8859-1
% loadkeys es

For Finnish:

% setfont latarcyrheb-sun16 -m 8859-1
% loadkeys fi

For Greek:

% setfont iso07u-16 -m 8859-7
% loadkeys gr
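
The three examples share one pattern, which can be sketched as a small dry-run helper. It only prints the two commands rather than running them. The function name console_commands is made up for illustration, and the font and screen-map pairs are just the ones listed above, not a complete table.

```shell
# Dry-run sketch: print the setfont/loadkeys pair for a given keymap name.
# Only the three keymaps from the examples above are known to this helper.
console_commands() {
    case "$1" in
        es|fi) font=latarcyrheb-sun16 map=8859-1 ;;
        gr)    font=iso07u-16 map=8859-7 ;;
        *)     echo "no settings known for keymap '$1'" >&2; return 1 ;;
    esac
    printf 'setfont %s -m %s\n' "$font" "$map"
    printf 'loadkeys %s\n' "$1"
}

# Show what would be run for Greek
console_commands gr
```

Printing instead of executing keeps the sketch safe to try from any terminal; on a real console you would run the printed commands yourself.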

The character and key maps used are the "old" 8-bit versions. (setfont loads a Unicode map with the "-u" option instead of "-m".) Also, the key maps for a few languages have been updated (for example, "gr-utf" for Greek). The new files (very few) cannot be used here. No need to update them ;-).

I tried a few languages; what follows shows the characters I could produce on the console by composing. I used "vim" as my editor.

gr: Greek ά έ ί ό ύ ώ ϊ ϋ ϊ Ά Έ Ί Ή Ύ Ϋ
es: Spanish ñ á é í ý ú ü ï ÿ ä ë
nl: Dutch á é í ó ú ý à è ì ò ù
cz: Czech ä ë ö
us-acentos: á é í ó ú ý ä ë ï ö ü ÿ
cf: French-Canadian à è ì ò ù
fi: Finnish ä ë ï ö ü â ê î ô û
fr: French â ê î ô û ä ë ï ö ü ÿ

Therefore, from the user's point of view the patch works.


Iñtërnâtiônàlizætiøn: interesting name, isn't it? It started as an entry on Sam Ruby's blog, recommending that people try this specific string in their blog software to test whether it processes Unicode properly.

If you google for Iñtërnâtiônàlizætiøn, you will get around a thousand results. Plus one.
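
A quick way to see why the string makes a good test is to compare its byte count with its character count. This is only a sketch: the helper names utf8_bytes and utf8_chars are made up, and it assumes the string is stored as UTF-8 with precomposed accented letters.

```shell
s='Iñtërnâtiônàlizætiøn'

# Number of bytes in the stored (assumed UTF-8) encoding.
utf8_bytes() { printf '%s' "$1" | wc -c | tr -d ' '; }

# Number of characters: count every byte except UTF-8
# continuation bytes (0x80-0xBF, octal 200-277).
utf8_chars() { printf '%s' "$1" | LC_ALL=C tr -d '\200-\277' | wc -c | tr -d ' '; }

echo "bytes: $(utf8_bytes "$s"), characters: $(utf8_chars "$s")"
```

Under that assumption each of the seven accented letters takes two bytes, so the two counts should differ by seven. Software that treats bytes as characters gets the length wrong or garbles the accents.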

11 Dec 2004 (updated 11 Dec 2004 at 22:49 UTC)

While the Linux Desktop has been fully converted to Unicode and UTF-8, the console is still lacking in one significant area. Before going into details, let's start from the basics.

Most mainstream Linux distributions already put the console in Unicode mode. In addition, they load a suitable console font to display common languages. For example, the font latarcyrheb-sun16 covers Latin-based alphabets, Arabic, Cyrillic and Hebrew. If a filename uses those characters, they show correctly. The font files have been updated to record the Unicode code point of each glyph. If a character is not found in the font, a generic character appears in its place. You may have noticed it; it says "LF" in one character cell.

Suppose you want to view Greek? Replace the font with the corresponding Greek console font.

The file /etc/sysconfig/i18n contains the i18n settings for the console (though it is not limited to the console). Only this file needs to be edited to accommodate other languages. By default it looks like


To configure to display, for example, Greek, the file should look like


The font files are located in /lib/kbd/consolefonts/ and you need to use the variant that has the letter "u" in its filename, which denotes a file that lists the Unicode code point for each character.

I am not sure whether there is a tool that edits the i18n file for you when you simply select the system language. That would be best.

The script /sbin/setsysfont reads /etc/sysconfig/i18n and does all the dirty work. Read the script for the details.

Up to now we dealt with displaying. How do we write? We write thanks to /bin/loadkeys. The kernel comes with a default keymap so you can write common English in any case.

Both /sbin/setsysfont and /bin/loadkeys are invoked by /etc/rc.d/rc.sysinit, during system startup.

Now, what's the problem? Well, the writing part is not done in "full UTF-8" but rather in 8-bit, which makes it impossible to add accents and therefore difficult to write things such as Iñtërnâtiônàlizætiøn.
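
To make the 8-bit versus UTF-8 difference concrete, here is a small sketch that dumps the bytes of "ñ" in both encodings. The helper name hex_bytes is made up; the byte values themselves are the standard ISO 8859-1 and UTF-8 encodings of U+00F1.

```shell
# Hypothetical helper: read bytes on stdin, print them as hex without spaces.
hex_bytes() { od -An -tx1 | tr -d ' \n'; }

# "ñ" as an 8-bit keymap would send it (ISO 8859-1): the single byte 0xF1
printf '\361' | hex_bytes; echo

# "ñ" as a console in Unicode mode expects it (UTF-8): the bytes 0xC3 0xB1
printf '\303\261' | hex_bytes; echo
```

The lone byte from the first command is not a valid UTF-8 sequence on its own, which illustrates why 8-bit input and a Unicode-mode console do not mix.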

There are some patches to enable support; see below.

I have just sent an e-mail to linux-kernel and I am waiting to see the response. I hope something happens.

X fingers!
