This Time Self-Hosted

Translating ebuilds, a proof of concept

There was only one comment on my previous post, but I didn’t actually expect even that one: first because I probably lost most of my audience now that I’m no longer on Planet Gentoo or Gentoo Universe, and second because I know it’s not an issue that concerns most of the people who actually use Gentoo at this stage.

Nonetheless, I wanted to try a proof of concept of ebuild translation. If you’re interested in this, you probably want to fetch my overlay and look at the media-sound/pulseaudio ebuild in there. Right now there is only an Italian translation for it, as that’s the only language I can translate it to, but it works as a proof of concept for me. To try it out, run export LC_ALL="it_IT" and then emerge -1 pulseaudio.

The trick is actually simple once you know it:

messages_locale() {
        locale | grep LC_MESSAGES | cut -d '=' -f 2 | tr -d '"' | cut -d '.' -f 1
}

This function extracts the locale currently in effect for LC_MESSAGES. Why is this needed? Well, it’s simple: you might be using LC_ALL rather than LC_MESSAGES to set the locale, or you might be using just LANG rather than setting the LC_* variables, so in the end, asking locale is the best way to make sure we get the language actually used for messages on the system. In the example I gave you above, setting LC_ALL overrides all the other settings.
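The precedence described above can also be sketched without calling locale at all. This is only an illustration, assuming the usual POSIX order (LC_ALL overrides LC_MESSAGES, which overrides LANG, and an empty variable counts as unset); the function name messages_locale_env is made up for this example and is not part of the overlay.

```shell
# Hypothetical helper: resolve the message locale from the environment,
# following the POSIX precedence LC_ALL > LC_MESSAGES > LANG.
messages_locale_env() {
        local loc="${LC_ALL:-${LC_MESSAGES:-${LANG:-C}}}"
        # Drop the encoding (".UTF-8") and any modifier ("@euro").
        loc="${loc%%.*}"
        loc="${loc%%@*}"
        printf '%s\n' "${loc}"
}
```

It can be called the same way as messages_locale, for instance in a command substitution; the locale-based version remains the safer choice, since it asks the system rather than re-implementing its rules.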

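# Look up every translation against the untranslated base name, so that
# a language match (it) does not change the path tested for it_IT.
local basefile="${FILESDIR}/${P}-postinst"
local msgfile="${basefile}"
[[ -f "${basefile}.$(messages_locale|cut -d '_' -f 1)" ]] && msgfile="${basefile}.$(messages_locale|cut -d '_' -f 1)"
[[ -f "${basefile}.$(messages_locale)" ]] && msgfile="${basefile}.$(messages_locale)"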

einfo ""
local save_IFS="${IFS}"
IFS=""
while read -r line; do
        elog "$line"
done < "${msgfile}"
einfo ""
IFS="${save_IFS}"

This is the code that actually loads the translated message and prints it on screen for the user. It’s very rough code as it stands, I know that already, so there’s no need to point it out to me: the pipeline shown above is run at least twice, and up to four times if the current locale is in language-plus-country form (like pt_BR) rather than just a language name (it). The code is also error-prone, as it is quite long for what it does.
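One way to avoid the repeated pipeline runs is to resolve the locale once and then walk a list of candidate files from most to least specific. The following is a rough sketch, not the code in the overlay; pick_msgfile and its two arguments are names invented here.

```shell
# Hypothetical helper: print the best matching message file.
#   $1 = base path (e.g. "${FILESDIR}/${P}-postinst")
#   $2 = locale, already resolved once by the caller (e.g. "pt_BR")
pick_msgfile() {
        local base="$1" loc="$2" candidate
        # Try "pt_BR" first, then "pt", then the untranslated file.
        for candidate in "${base}.${loc}" "${base}.${loc%%_*}" "${base}"; do
                if [ -f "${candidate}" ]; then
                        printf '%s\n' "${candidate}"
                        return 0
                fi
        done
        # Nothing found, not even the untranslated file.
        return 1
}
```

In the ebuild this could be used as msgfile="$(pick_msgfile "${FILESDIR}/${P}-postinst" "$(messages_locale)")", which calls messages_locale exactly once.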

But as I said, this is a proof of concept rather than an actual implementation; it just demonstrates that it is possible to translate messages in ebuilds without filling the ebuilds with the messages in 20 different languages. Of course, to avoid adding that much boilerplate to every ebuild, the code should go either into Portage itself in some way (but that makes adoption of translations a very long-term idea, maybe tied to EAPI=1) or into a more feasible i18n.eclass. The eclass would handle all of it, including caching the value returned by $(messages_locale) so that it’s called once rather than four times, and converting from UTF-8 (the usual encoding for in-tree files) to the local encoding with iconv, if present.
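The iconv step could look roughly like the sketch below. No i18n.eclass exists yet, so print_msgfile and its arguments are purely hypothetical; the only assumptions about the system are that iconv, when installed, accepts the standard -f and -t options.

```shell
# Hypothetical helper: print a UTF-8 message file in the given encoding.
#   $1 = message file, $2 = target encoding (e.g. "ISO-8859-1")
print_msgfile() {
        local file="$1" target="$2" converted
        if [ "${target}" != "UTF-8" ] && command -v iconv >/dev/null 2>&1; then
                # Convert into a variable first, so that a failed conversion
                # never leaves half a message on screen.
                if converted="$(iconv -f UTF-8 -t "${target}" "${file}" 2>/dev/null)"; then
                        printf '%s\n' "${converted}"
                        return 0
                fi
        fi
        # No conversion needed, no iconv, or conversion failed: print as-is.
        cat "${file}"
}
```

The target encoding could be taken from the output of locale charmap, and the output of print_msgfile fed to the elog loop in place of the plain file.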

This works well for the long log messages that are printed in the postinst phase, for instance, because they rarely change between one version and the next and so there is time to translate them. It doesn’t really fly for the short informative messages we have around, nor does it work well for eclass messages.

For those, what I can think of off the top of my head is to standardise the strings as much as possible (for instance by letting the eclasses do the job), and then use gettext to translate them, with an “app-i18n/portage-i18n” package that the eclasses can get their data from. I’ll try to see if I can get a proof of concept of that too.
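As a first idea of what the gettext side might look like, here is a minimal sketch. The text domain name portage-i18n is an assumption taken from the package name suggested above (no such catalogue exists yet), and i18n_msg is a made-up wrapper name.

```shell
# Hypothetical wrapper: translate a standardised string through gettext.
# "portage-i18n" is an assumed text domain; no such catalogue exists yet.
i18n_msg() {
        if command -v gettext >/dev/null 2>&1; then
                # gettext prints the original string unchanged when no
                # translation is available for it.
                TEXTDOMAIN="portage-i18n" gettext "$1"
        else
                # Without gettext installed, fall back to the original string.
                printf '%s' "$1"
        fi
}
```

An eclass could then call something like elog "$(i18n_msg "some standardised message")", and users without the catalogue installed would simply keep seeing the English text.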

Comments 3
  1. Well, I imagine there are a few of us users around here still… I’ve been thinking about learning about writing ebuilds but I never seem to find the time. The idea sounds very interesting, though. I can offer only encouragement (or however it’s spelled) and I’m sure you still have enough friends in the developer community to push for your idea. Oh, and speaking of developers, you did know that you won the developer of the year poll? You beat Ciaran by a few votes so feel free to gloat. 🙂

  2. oh, we’re here all right ;-) this looks interesting. what’s the possible overhead of this solution? still, it would be nice if it was possible to turn it off via some option.

  3. As this is just a proof of concept, and not a solution, there are no options to turn this off, also because there’s not really anything that can control it. It also has a lot of overhead as it stands, because the messages are added directly into $FILESDIR, and you can’t even rsync_exclude them. A proper solution would have either a subdirectory in files/ or a new linguas/ directory in which to store the files. There is still overhead on the tree size, as we have to add the translations to the tree for them to be available during emerge -K, but at least a separate directory would allow using rsync_exclude to remove the files. Again, it’s still suboptimal because you end up wasting inodes and blocks. As I said, this is a proof of concept; someone should look into a more proper implementation to get it working, which is why I suggested SoC. I’ll still post a few other alternatives during this week.
