I always took for granted that my name -- Benoît -- was 6 chars long. I was wrong.
It's 7 bytes, UTF-8 speaking.
It took me some time this afternoon to understand why sizeof "Benoît" == 8 where I would expect 7 (6 characters plus the terminating NUL). So I hexdump'ed my C file and realized that î is encoded as two bytes, 0xC3 0xAE. I'm glad that ASCII chars are still encoded on a single byte in UTF-8, so hacks like this one:
char buf[magic]; /* enough to hold "plop" */
are still OK. I'll try to be less lazy and always write:
char buf[sizeof "plop"];
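To see the three different "lengths" side by side, here is a minimal sketch. The array name, the helper utf8_codepoints and the function check_name_lengths are made up for this post, and the counter assumes its input is valid UTF-8. The î is written with explicit byte escapes so the result does not depend on how the source file is saved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "Benoît" spelled with explicit UTF-8 bytes for î (0xC3 0xAE). */
static const char name[] = "Beno\xC3\xAEt";

/* Hypothetical helper: count code points in a UTF-8 string by
   skipping continuation bytes (those of the form 10xxxxxx).
   Assumes the input is valid UTF-8. */
static size_t utf8_codepoints(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* The three numbers from this post, checked in one place. */
static void check_name_lengths(void)
{
    assert(sizeof name == 8);           /* 7 bytes + terminating NUL */
    assert(strlen(name) == 7);          /* bytes, not characters     */
    assert(utf8_codepoints(name) == 6); /* what I had in my head     */
}
```

Calling check_name_lengths() from main should pass silently: sizeof and strlen count bytes, and only the helper counts what a human would call characters.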
WTF is î ?
U+00EE LATIN SMALL LETTER I WITH CIRCUMFLEX
UTF-8: 0xC3 0xAE
In French, circumflexes '^' on vowels often replace an old French 's':
- hôpital for hospital
- hôtel for hostel (= hotel)
- Benoît for Benoist
- côte for coste (= coast)
- etc
Benoît is the French form of Benedict.
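Back to the bytes for a moment: here is a rough sketch of how U+00EE becomes 0xC3 0xAE. Code points from U+0080 to U+07FF take two bytes in UTF-8: the top 5 bits go into a byte of the form 110xxxxx and the low 6 bits into a byte of the form 10xxxxxx. The function name utf8_encode_2byte is my own invention and it only handles that two-byte range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode a code point in U+0080..U+07FF as two
   UTF-8 bytes. Returns 2 on success, 0 if the code point is outside
   that range (other lengths are not handled in this sketch). */
static size_t utf8_encode_2byte(uint32_t cp, unsigned char out[2])
{
    if (cp < 0x80 || cp > 0x7FF)
        return 0;
    out[0] = (unsigned char)(0xC0 | (cp >> 6));   /* 110xxxxx: top 5 bits */
    out[1] = (unsigned char)(0x80 | (cp & 0x3F)); /* 10xxxxxx: low 6 bits */
    return 2;
}
```

For î, 0xEE >> 6 is 3, giving 0xC0 | 0x03 = 0xC3, and 0xEE & 0x3F is 0x2E, giving 0x80 | 0x2E = 0xAE, which matches the hexdump.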