This Time Self-Hosted

International problems

I knew I was probably quite a strange person, but I never thought I would have so many problems with internationalisation, especially on Linux, though not only there. I have written before that I have problems with my name (a similar issue happened last week, when the MacBook I ordered for my mom was sent by TNT to “Diego Petten?”, which the computer system then could not find when looking up the package by name), but lately I have been having even worse problems.

One of the first problems happened while mailing patches with git to mailing lists hosted on the Kernel.org servers; my messages were rejected because I used “Diego E. ‘Flameeyes’ Pettenò” as the sender, without double quotes around it. According to the mail format RFCs (RFC 2822 and its successor RFC 5322), when a period is present in the sender or recipient name, the whole name has to be enclosed in double quotes, but git does not seem to know about that and sends malformed messages that get rejected. Even adding escaped quotes in the configuration file didn’t help, so in the end I send my git email with my (new) full name, “Diego Elio ‘Flameeyes’ Pettenò”, even if it’s tremendously long and boring to read, and Lennart scolded me because I now appear under three different aliases in PulseAudio (on the other hand, ohloh handles that gracefully).
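To illustrate the quoting rule, here is a minimal sketch using `email.utils.formataddr` from Python’s standard library, which wraps a display name in double quotes when it contains one of the RFC “special” characters such as a period. The address is a placeholder and the names are ASCII-only simplifications (a name with “ò” would additionally be RFC 2047-encoded); this only shows the expected header format, not what git does internally.

```python
from email.utils import formataddr

# Placeholder address for illustration only.
ADDRESS = "flameeyes@example.com"

# A display name containing a period must be sent as a quoted string.
with_period = formataddr(("Diego E. 'Flameeyes' Petteno", ADDRESS))

# Without the period, the name needs no quoting at all.
without_period = formataddr(("Diego Elio 'Flameeyes' Petteno", ADDRESS))

print(with_period)
print(without_period)
```

The first line comes out wrapped in double quotes, the second does not, which is consistent with the longer name getting through where the abbreviated one was rejected.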

A little parenthesis, if you’re curious where the “Elio” part comes from: I legally changed my name last fall, adding “Elio” as part of my first name (it’s not a “second name” in the strict meaning of the term, because Italy does not have the concept of a second name; it’s actually part of my first name). The reason is that there are four other “Diego Pettenò” in my city, two of whom are around my age, and the Italian system is known for mistaking identities; while just adding a second name does not make me entirely safe, it should make a mistake less likely. I chose Elio because that was my grandfather’s name.

So this was one of the problems; nothing really major, and it was solved easily. The next problem happened today when I sat down to write some notes about extending the guide (for which I still fail to find a publisher; unless I find one, it’ll keep the open donation approach). Since the amount of blogging about the subject lately has been massive, I wanted to make sure I used the proper typographical quotation marks. It would have been easy to type them on OS X, but on Linux it seems to be quite a bit more difficult.

On OS X, I can reach the quotation marks on the “1” and “2” keys, adding the Option and Shift keys as needed (single and double, open and closed); on Linux, with the US English, Alternate International keyboard I’m using, things are quite a bit more difficult. The sequence would be something like Right Control, followed by AltGr and ' (or "), followed by < or >. Even if I didn’t have to use AltGr to get the proper keys (without AltGr, the two symbols are “dead keys” on the Alternate International keyboard and are used for composing, which is quite important since I write both English and Italian with the same keyboard), it would be quite a clumsy way to reach the marks. And it also wouldn’t work with GNU Emacs on X11.
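For reference, the four characters in question can be pinned down by code point regardless of how they end up being typed. The following is just a small sketch using Python’s standard `unicodedata` module; the labels in the dictionary and the `describe` helper are my own naming for illustration.

```python
import unicodedata

# The four typographical quotation marks: single and double, open and closed.
QUOTES = {
    "single open": "\u2018",
    "single closed": "\u2019",
    "double open": "\u201c",
    "double closed": "\u201d",
}


def describe(char):
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


for label, char in QUOTES.items():
    print(f"{label:14} {char}  {describe(char)}")
```

Knowing the code points is handy when checking a keyboard layout, since it removes any doubt about whether a key produces the typographical mark or the plain ASCII one.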

My first idea was to use xmodmap to just change the mappings of “1” and “2”, adding third and shifted third levels, just like on OS X. Unfortunately, adding extra levels with xmodmap seems to work only with the “mode switch” key rather than with the “ISO Level 3” key; the final result is that I had to “sacrifice” the right Command key (I use an Apple keyboard on Linux) as “mode switch”, keeping the right Option as Level 3 shift, and then map the “1” and “2” keys the way I wanted. The result is usable, but it also means that all the modifiers on the right side have a completely different meaning from what they were designed for, and it is not easy to remember all of them.

I thought about using the Keyboard Layout Editor, but it requires antlr3 for Python, which is not available in Gentoo and seems to be difficult to update, so for now I’m stuck with this solution. Next week, when the iMac should arrive, I’ll probably spend some more time on the issue (I already spent almost the whole afternoon on it, more than I should have). I’d sincerely love to be able to set up the exact same keyboard layout on both systems, so I don’t have to remember which one I’m on to get the combinations right. I already publish my custom OSX layout, which basically implements the Xorg alternate international layout in OSX (the same layout is already available in Windows as “US International”, so OSX was the only one lacking it), so I’ll probably just start maintaining layouts for both systems in the future.

Update 2021-10-18: if you’re interested in that particular layout for macOS, check out the US International (PC) layout on modern macOS, since they eventually integrated it and I didn’t have to bother with maintaining my own!

And I don’t even want to start talking about setting up proper IME for Japanese under this configuration…

Comments 6
  1. I’m not much of a GNOME user, but GTK apps have an additional input method I’d like to see in KDE. Ctrl-Shift-U starts a Unicode input where you can just type the required symbols in hex. For instance, here in Firefox, {ctrl+shift+u}{8}{0}{0}{space} renders “ࠀ”, and {ctrl+shift+u}{2}{0}{0}{e}{space} renders the magical, invisible U+200E (“Left-to-Right Mark”), which is so great for escaping text without users seeing it being escaped, i.e.: <b>Not Parsed</b>. That said, as a predominantly English user, I love compose keys and wish they were easier to use and modify. Sacrificing Caps Lock is something I greatly approve of, because it otherwise serves no function for me. «Diego Pettenò» is reasonably easy for me to type with compose keys; I guess it’s just a case of getting used to it.

  2. The problem is that “Pettenò” is far from the only thing with accents that I write on a daily basis. I could switch the keyboard between US and Italian, but at the end of the day I prefer the modified OSX layout… As for the Unicode input method, using SCIM you should get something similar for Qt, but I don’t know how to get it to work on Qt4. And anyway, remembering all the Unicode codepoints is a bit… difficult.

  3. Posting this on the remote chance that it’s of any help… On my keyboard (X11, en-GB default layout), almost every key has something on AltGr, and a few of them are dead keys that produce accents. I can get curly quotes (if they’re the ones I think you’re talking about) with [Shift+]AltGr+{V,B}. Maybe the international layout is causing more problems than it fixes?

  4. I need to be able to input some ‘accents’ on my GB keyboard. At first I just composed them, but it was rather tedious, especially since I use two languages at once, and I didn’t want to move away from the GB layout since it’s different from the US and derived layouts (swapped ” and @, £ in place of #, etc.). I used this article: http://preview.tinyurl.com/… to modify my GB map so I can input the diacritics I use with AltGr. You could modify your keyboard map (add accents and quotes) just as you did under OSX (the method is probably slightly different, though). All you need is a text editor, and the keyboard layout files are really straightforward.

  5. Hi Diego, like you, I advocate for UTF-8. I was once working on a project and a bug drove me crazy. After two weeks looking for the cause, I had to realize it wasn’t my fault but that of a bad programming guideline. Previous programmers had simply mixed up charsets in the database, so at any request from the search engine the database returned quite strange stems for the words entered. That’s why I advocate strict UTF-8 in any programming guideline. And there is a project I want to ask you to come over to as a UTF-8 avenging angel 😉 It’s Akonadi, used as the PIM API storage backend in KDE 4. They just mix up Latin-1 and UTF-8 tables in the database. See “https://bugs.kde.org/show_b…” for further tactical information 😉
