Multi-language site with gettext

This is a short recipe for building a multi-language site.

First of all, I load all my languages during WebKit start-up into a variable that is visible from anywhere, i.e. I create this variable in the __builtin__ dictionary. So my contextInitialize() method contains:

__builtin__.__dict__['LANGUAGES'] = Languages()

where Languages() is a dictionary-like object whose keys are language codes and whose values are gettext.translation() objects:

import gettext
import os

class Languages:

    def __init__(self):
        self._langs = {}
        self.load()

    def __getitem__(self, lang):
        return self._langs[lang]

    def load(self):
        localeDir = 'yourLocaleDirectory'
        dirItems = os.listdir(localeDir)
        for dirItem in dirItems:
            if os.path.isdir(os.path.join(localeDir, dirItem)):
                try:
                    self._langs[dirItem] = gettext.translation('yourDomain', localeDir, [dirItem])
                except Exception:
                    # No usable catalog in this directory; report it and carry on.
                    print 'Error: Loading language: "%s"' % dirItem

    def gettextFunc(self, lang):
        if lang in self._langs:
            return self._langs[lang].gettext
        else:
            return lambda x: x
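
For reference, gettext.translation() looks for compiled catalogs at localeDir/<lang>/LC_MESSAGES/yourDomain.mo. The following is a minimal sketch of the same loader written for Python 3 (the code above is Python 2). The names LanguageRegistry, gettext_func and demo_dir are hypothetical, and the sketch assumes that a missing catalog surfaces as OSError, which is what Python 3's gettext.translation() raises when no .mo file is found.

```python
import gettext
import os
import tempfile


class LanguageRegistry:
    """Hypothetical Python 3 variant of the Languages class above."""

    def __init__(self, locale_dir, domain):
        self._langs = {}
        for name in sorted(os.listdir(locale_dir)):
            if not os.path.isdir(os.path.join(locale_dir, name)):
                continue
            try:
                # Looks for <locale_dir>/<name>/LC_MESSAGES/<domain>.mo
                self._langs[name] = gettext.translation(domain, locale_dir, [name])
            except OSError:
                # No compiled catalog for this directory; skip it.
                print('Error: Loading language: "%s"' % name)

    def gettext_func(self, lang):
        if lang in self._langs:
            return self._langs[lang].gettext
        # Unknown language: hand back the untranslated string.
        return lambda message: message


# Demo with a locale tree that has an 'fi' directory but no catalog in it.
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, "fi", "LC_MESSAGES"))
registry = LanguageRegistry(demo_dir, "yourDomain")
translate = registry.gettext_func("fi")
print(translate("Hello"))
```

Because the demo tree contains no .mo file, the registry stays empty and every lookup falls back to the identity function, which is the same behaviour gettextFunc() has for an unknown language code.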

So, I have all the translations in memory, and now I need to use them. I need to know which language to choose. Because my site is also for registered users, there are two ways:

In the awake() method of a servlet I set the right gettext function:

def awake(self, transaction):
    ...
    self._ = LANGUAGES.gettextFunc('languageCode')
    ...

Every string that needs to be translated I mark, as usual, with _('stringToTranslate'), and at the beginning of every method where I need it I just do:

def someMethod(self):
    _ = self._

That's it. :-)

One last thing. Strings that need to be translated are not only in servlets but in other places too (for example in some modules), so I need a _() function there as well. At the beginning of such a place I have:

_ = lambda x: x
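
Note that this identity function only marks the strings; nothing is translated at that point, so the real lookup has to happen later, where the language is known. A minimal Python 3 sketch of that deferred step, in which STATUS_LABELS, status_label and the Finnish table are hypothetical names made up for illustration:

```python
# Module-level marker: makes the strings visible to extraction tools,
# but returns them unchanged.
_ = lambda message: message

STATUS_LABELS = {
    "open": _("Open"),
    "closed": _("Closed"),
}


def status_label(status, translate):
    """Translate a marked label at render time with the request's function."""
    return translate(STATUS_LABELS[status])


# Hypothetical per-request function, e.g. what a servlet keeps in self._
finnish_table = {"Open": "Avoin", "Closed": "Suljettu"}
to_finnish = lambda message: finnish_table.get(message, message)

label = status_label("open", to_finnish)
```

The module itself still holds the original English strings; only the call that receives a real translation function produces translated output.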

If you have lots of situations where you need to internationalize code that won't have access to your context (i.e., to self._), then you might want to try keeping track of the language on a per-thread basis. (If you are using that last recipe, _ = lambda x: x, then you might want to try this instead.) An untested example of this:

def awake(self, trans):
    self._ = LANGUAGES.gettextFunc('languageCode')
    languagecontext.register(self._)

def sleep(self, trans):
    languagecontext.deregister()

# then in languagecontext:
import threading

class LanguageContextTracker:

    def __init__(self):
        self._threads = {}

    def curName(self):
        return threading.currentThread().getName()

    def register(self, obj):
        self._threads[self.curName()] = obj

    def deregister(self):
        try:
            del self._threads[self.curName()]
        except KeyError:
            pass

    def language(self):
        return self._threads[self.curName()]

TheLanguageContextTracker = LanguageContextTracker()
register = TheLanguageContextTracker.register
deregister = TheLanguageContextTracker.deregister
language = TheLanguageContextTracker.language

Then use languagecontext.language() to get your _ object, even after you've lost track of the servlet you are associated with.
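
A small Python 3 sketch of how this behaves with two concurrent requests. It is not the code above: the tracker is re-stated with Python 3 spellings, and handle_request, render_greeting and the lookup tables are hypothetical. It also assumes that thread names are unique, which holds for the default names Python assigns.

```python
import threading


class LanguageContextTracker:
    """Per-thread registry keyed by thread name, as in the recipe above."""

    def __init__(self):
        self._threads = {}

    def _key(self):
        return threading.current_thread().name

    def register(self, obj):
        self._threads[self._key()] = obj

    def deregister(self):
        self._threads.pop(self._key(), None)

    def language(self):
        return self._threads[self._key()]


tracker = LanguageContextTracker()
results = {}


def render_greeting():
    # Code far away from any servlet: it only knows about the tracker.
    _ = tracker.language()
    return _("Hello")


def handle_request(name, table):
    # Plays the role of awake() ... sleep() around one request.
    tracker.register(lambda message: table.get(message, message))
    try:
        results[name] = render_greeting()
    finally:
        tracker.deregister()


workers = [
    threading.Thread(target=handle_request, args=("fi", {"Hello": "Hei"})),
    threading.Thread(target=handle_request, args=("de", {"Hello": "Hallo"})),
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

Each thread gets back the function it registered itself, and the try/finally mirrors the awake()/sleep() pairing so that no entry is left behind once a request has finished.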

(If someone implements this, please correct this code as necessary)

-- Ian Bicking


Based on very light testing, the above approach seems to work, although I implemented it slightly differently. Instead of gettext functions, TheLanguageContextTracker keeps track of preferred translation languages (as strings like 'en', 'fi', ...).

My LangUtil.py looks like this:

import threading
import gettext

class LanguageContextTracker:

    def __init__(self):
        self._threads = {}

    def curName(self):
        return threading.currentThread().getName()

    def register(self, obj):
        self._threads[self.curName()] = obj


    def deregister(self):
        try:
            del self._threads[self.curName()]
        except KeyError:
            pass

    def language(self):
        return self._threads[self.curName()]

TheLanguageContextTracker = LanguageContextTracker()
register = TheLanguageContextTracker.register
deregister = TheLanguageContextTracker.deregister
language = TheLanguageContextTracker.language

_translations = {}

def initialize():
    "Load translations and put them to a dictionary. ADD ERROR HANDLING HERE"
    _translations['fi'] = gettext.translation("mydomain","C:/mywebkitproject/locale",["fi"])
    _translations['en'] = gettext.translation("mydomain","C:/mywebkitproject/locale",["en"])


def mygettext(s):
    "Look up the preferred translation and return the translated string"
    try:
        return _translations[language()].lgettext(s)
    except Exception:
        # No language registered for this thread, or no catalog for it.
        return s

In a Page class from which all other Page classes in the site inherit, I have the following:

def awake(self,trans):
    if trans.request().hasField("lang"):
        trans.session().setValue("lang",trans.request().field("lang"))
    LangUtil.register(trans.session().value('lang','en'))
    SiteTemplate.awake(self,trans)  # I've got a Cheetah template at the top of my Page-class hierarchy


def sleep(self,trans):
    SiteTemplate.sleep(self,trans)
    LangUtil.deregister()
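
The selection logic in awake() can be isolated into a small function for testing. The Python 3 sketch below uses plain dictionaries in place of the WebKit request and session objects. The function name choose_language and the supported tuple are hypothetical, and the check against supported languages is an addition that the code above does not perform.

```python
def choose_language(fields, session, supported=("en", "fi"), default="en"):
    """Return the language for this request and remember explicit choices.

    fields  -- request fields, e.g. {"lang": "fi"} for ?lang=fi
    session -- mutable per-user mapping that outlives the request
    """
    requested = fields.get("lang")
    if requested in supported:
        # An explicit, valid choice is stored for later requests.
        session["lang"] = requested
    return session.get("lang", default)


session = {}
first = choose_language({"lang": "fi"}, session)   # explicit choice
second = choose_language({}, session)              # remembered from session
third = choose_language({"lang": "xx"}, {})        # unsupported code, new user
```

Filtering the request field first means that an arbitrary ?lang= value never ends up stored in the session.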

And in Launch.py I've added the following:

# Initialize translations and have _ point to LangUtil.mygettext
import LangUtil
LangUtil.initialize()
__builtins__.__dict__['_'] = LangUtil.mygettext
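
For readers on Python 3, where the module is called builtins, the same step might look like the sketch below. The mygettext here is a stand-in that upper-cases its argument so the effect is visible, and the two dictionaries passed to exec() play the role of other modules. The second case shows a caveat: a module that assigns its own _ (as in the _ = lambda x: x recipe) shadows the installed one.

```python
import builtins


def mygettext(message):
    """Stand-in for LangUtil.mygettext; upper-cases to make the effect visible."""
    return message.upper()


# Python 3 spelling of the Launch.py step: make _ resolvable from every module.
builtins._ = mygettext

# A module that never imports or defines _ still finds it through builtins.
other_module = {}
exec("greeting = _('Hello')", other_module)

# A module-level assignment shadows the builtin for that module only.
shadowing_module = {}
exec("_ = lambda x: x\ngreeting = _('Hello')", shadowing_module)
```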

-- jranki

One issue with the above implementation is that LangUtil can be loaded multiple times: once in Launch.py and again in your Page class. Each copy then has its own LanguageContextTracker, so when _ is called there is never anything in the LanguageContextTracker._threads variable of the copy that _ belongs to.
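
The effect can be reproduced outside WebKit. The Python 3 sketch below writes a tiny module to a temporary directory and loads the same file under two different module names. The file name LangUtilDemo.py and the package name MyContext are hypothetical, and loading via importlib only imitates how a double import might come about.

```python
import importlib.util
import os
import tempfile

# A tiny module with module-level state, like LangUtil's tracker.
source = (
    "registry = {}\n"
    "\n"
    "def register(key, value):\n"
    "    registry[key] = value\n"
)
path = os.path.join(tempfile.mkdtemp(), "LangUtilDemo.py")
with open(path, "w") as handle:
    handle.write(source)


def load_as(module_name):
    """Execute the file as a fresh module object under the given name."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


first = load_as("LangUtilDemo")             # e.g. imported by the launcher
second = load_as("MyContext.LangUtilDemo")  # e.g. imported by the Page class

# Registering through one copy is invisible to the other.
first.register("worker-1", "fi")
```

After this, first.registry holds the entry while second.registry is still empty, which matches the symptom described above.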

A better solution is to store the module you loaded in Launch.py:

__builtins__.__dict__['__langutil__'] = LangUtil

and then in your Page class:

__langutil__.register(trans.session().value('lang','en'))

Within LangUtil you could also use a threading.local() object, which has been available since Python 2.4, to store the thread-specific language string instead of the _threads dictionary:

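import threading

class LangUtil:
    def __init__(self):
        # Each thread sees its own attributes on this object.
        self.data = threading.local()

    def register(self, obj):
        self.data.lang = obj

    def deregister(self):
        try:
            del self.data.lang
        except AttributeError:
            pass

    def language(self):
        return self.data.lang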
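
A short Python 3 sketch of how the threading.local() variant behaves across threads. ThreadLanguage and the 'en' default are hypothetical; the default is an addition, so that language() returns something sensible in a thread that never registered.

```python
import threading


class ThreadLanguage:
    """Stores one language string per thread in a threading.local()."""

    def __init__(self, default="en"):
        self._data = threading.local()
        self._default = default

    def register(self, lang):
        self._data.lang = lang

    def deregister(self):
        try:
            del self._data.lang
        except AttributeError:
            pass  # this thread never registered a language

    def language(self):
        # Fall back to the default when nothing is set in this thread.
        return getattr(self._data, "lang", self._default)


current = ThreadLanguage()
current.register("fi")  # registered in the main thread only

seen = []


def worker():
    seen.append(current.language())  # nothing registered in this thread yet
    current.register("de")
    seen.append(current.language())


thread = threading.Thread(target=worker)
thread.start()
thread.join()

main_language = current.language()  # unaffected by the worker thread
```

The worker first sees the default and then its own value, while the main thread keeps the value it registered. No dictionary keyed by thread name is needed.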

-- bcctech