Translating ebuilds, a proof of concept

There was only one comment to my previous post, but I actually didn’t expect that one either, first because I probably lost most of my publicity as I’m no more on Planet Gentoo or Gentoo Universe, and second because I know it’s not an issue that concern most of the people who actually use Gentoo at the current stage.

Nonetheless, I wanted to give it a try to a proof of concept of ebuilds translation. If you’re interested in this, you probably want to fetch my overlay, and look at the media-sound/pulseaudio ebuild that is there. Right now there is only Italian translation for it, as that’s the only language I can translate it to, but it works as a proof of concept for me. To try it out, run LC_ALL=“it_IT” and then emerge –1 pulseaudio.

The trick is actually simple once you know it:

messages_locale() {
        locale | grep LC_MESSAGES | cut -d '=' -f 2 | tr -d '"' | cut -d '.' -f 1

This function is used to extract the locale currently set for LC_MESSAGE value. Why is this needed? Well, it’s simple: you might be using LC_ALL rather than LC_MESSAGE to set the locale, you also might be using just LANG rather than setting the LC_* variables, so at the end, using locale is the best shot to make sure we get the proper language for messages set up on the system. In the example I have you above, by rewriting LC_ALL we bypass all the other settings.

local msgfile="${FILESDIR}/${P}-postinst"
[[ -f "${msgfile}.$(messages_locale|cut -d '_' -f 1)" ]] && msgfile="${msgfile}.$(messages_locale|cut -d '_' -f 1)"
[[ -f "${msgfile}.$(messages_locale)" ]] && msgfile="${msgfile}.$(messages_locale)"

einfo ""
local save_IFS="${IFS}"
while read line; do
        elog "$line"
done < "${msgfile}"
einfo ""

This is instead the code that actually handles the loading of the translated message that is then printed on screen for the user. It’s a very rough code as it is, I know already, so no need for pointing me at that: the tools’ chain shown above is ran at least two times up to four if the current language is in a country locale form (like pt_BR), rather than just a language name (it). The code is also prone to errors as it’s quite long by itself.

But as I said, this is a proof of concept rather than an actual implementation, this is just to demonstrate that it is possible to translate messages in ebuilds without filling the ebuilds with the messages in 20 different languages. Of course to avoid adding that big boilerplate code it should go either in portage itself in some way (but that makes adoption of translation a very long term idea, maybe EAPI=1 related) or in a more feasible i18n.eclass, that would handle all of it, included caching the value returned by $(messages_locale) so that it’s not called four times, but once only, and converting from UTF-8 (the usual encoding for in-tree files) to the local encoding, with iconv, if present.

This works well for the long log messages that are added at the postinst phase for instance, because they rarely change between one version and the next one and so have time to be translated. It doesn’t really fly for the short informative messages we have around, nor it works fine for eclasses messages.

For those, what I can think of on the fly is to try to standardise the strings as much as possible (for instance by letting the eclasses to the job), and then use gettext to translate those, with an “app-i18n/portage-i18n” package where the eclass can get their data from. I’ll try to see if I can get a proof of concept of that too.