Thursday, September 03, 2009

Roland Bouman's blog goes i18n (Powered by Google Translate)

Now that Pentaho Solutions is in print, and the first few copies are finding their way to readers, I felt like doing something completely unrelated. So, I hacked up a little page translation widget, based on the Google Language API. You can see the result at the top of the left sidebar of my blog right now:

[Image: translator]

Using it is very simple: just pick your language of choice, and the page (text and some attributes like alt and title) will be translated. Pick the first entry in the list to see the original language again.

This all happens inline by dynamic DOM manipulation, without having to reload the page. I tested it on Chrome 2, Firefox 3.5, Opera 10, Safari 4 and Internet Explorer 6 and 8. So far, it seems to work for all these browsers.
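To give an idea of what this kind of inline translation involves, here is a minimal sketch of how the translatable pieces of a page could be collected and written back. It is not the actual code of the Translator class: the names collectTranslatable and applyTranslation, and the list of skipped tags, are assumptions made up for this illustration. It only relies on standard DOM properties (nodeType, nodeValue, childNodes, getAttribute, setAttribute).

```javascript
// Hypothetical helpers, not part of the Translator class.
var TEXT_NODE = 3, ELEMENT_NODE = 1;
// Assumption: content of these elements should never be translated.
var SKIP_TAGS = { SCRIPT: true, STYLE: true, TEXTAREA: true };
// The attributes mentioned in the post.
var ATTRS = ["alt", "title"];

// Walk a DOM subtree and collect every text node and every alt/title
// attribute that holds something worth translating.
function collectTranslatable(root, out) {
  out = out || [];
  if (root.nodeType === TEXT_NODE) {
    // Ignore whitespace-only text nodes.
    if (/\S/.test(root.nodeValue)) {
      out.push({ node: root, attr: null, text: root.nodeValue });
    }
  } else if (root.nodeType === ELEMENT_NODE && !SKIP_TAGS[root.tagName]) {
    for (var i = 0; i < ATTRS.length; i++) {
      var value = root.getAttribute(ATTRS[i]);
      if (value) out.push({ node: root, attr: ATTRS[i], text: value });
    }
    for (var j = 0; j < root.childNodes.length; j++) {
      collectTranslatable(root.childNodes[j], out);
    }
  }
  return out;
}

// Write a translated string back in place, so no page reload is needed.
function applyTranslation(entry, translated) {
  if (entry.attr) entry.node.setAttribute(entry.attr, translated);
  else entry.node.nodeValue = translated;
}
```

In a browser you could call collectTranslatable(document.body), send each entry's text to the translation service, and pass the result to applyTranslation. Keeping the original text in each entry is also what would make it cheap to switch back to the original language.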

Personally, I feel that the user experience you get with this widget is superior to what you would get with the Google translation gadget. In addition, it is pretty easy to configure the Translator class.

The code to add this to your page is, in my opinion, reasonably simple:

<!-- add a placeholder for the user interface -->
<div id="toolbar"></div>

<!-- Include script that defines the Translator class -->
<script type="text/javascript" src="Translator-min.js"></script>
<!-- Instantiate a translator, have it create its gui and render to placeholder -->
<script type="text/javascript">
var translator = new Translator();
var gui = translator.createGUI(null, "Language");
document.getElementById("toolbar").appendChild(gui);
</script>


This really is all the code you need - there are no dependencies on external JavaScript frameworks. If you don't need or like the GUI, you can of course skip the GUI placeholder code as well as the second script and interact with the Translator object programmatically.

The minified JavaScript file is about 7k, which is not too bad in my opinion. I haven't worried too much about optimizations, and I think it should be possible to cut down on code size.

Another thing I haven't focused on just now is integration with frameworks - on the contrary, I made sure you can use it standalone. But in order to do that, I had to write a few methods to facilitate DOM manipulation and JSON parsing, and it's almost certain you will find that functions like that are already in your framework.
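As an illustration of the kind of helper I mean, here is a minimal sketch of a standalone JSON parsing function. The name parseJSON is made up for this example and this is not the code from the Translator file. It assumes the text comes from a trusted source, because the fallback for browsers without a native JSON object simply evaluates the text as JavaScript.

```javascript
// Hypothetical helper, shown only to illustrate the idea.
function parseJSON(text) {
  // Use the native parser where the browser provides one.
  if (typeof JSON !== "undefined" && typeof JSON.parse === "function") {
    return JSON.parse(text);
  }
  // Fallback for older browsers without a JSON object.
  // This evaluates the text as code, so only use it on trusted input.
  return (new Function("return (" + text + ");"))();
}

var data = parseJSON('{"languages": ["en", "nl"]}');
```

A framework that already ships its own JSON and DOM utilities would make helpers like this redundant, which is where most of the possible size savings are.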

Anyway, readers, I'd like to hear from you... is this a useful feature on this blog? Would you like to use it on your own blog? If there are enough people who want it, I will make it available on Google Code or something like that.

4 comments:

Anonymous said...

Hi Roland

Yes, it's indeed easy to implement but the quality of the translation is really poor.
Dunno if Google Translate is a good solution for this kind of thing.

rpbouman said...

Hi Anonymous,

thanks for your comment. May I ask which languages you tried, and what your native language(s) are? I'd really appreciate that information.

Regarding the translation quality: I get varying results. I noticed that my English prose doesn't translate very well back to my native language (Dutch). But I also noticed that some texts from Wikipedia seem to yield pretty reasonable translations.

I am hoping that it is possible to adopt a simpler writing style to increase the quality of translations. That may seem like a backwards way of thinking - why adapt human behaviour and not the translation process, right? But sometimes you have some text that you just want to make as accessible as possible for as many people as possible. In these cases, I would certainly consider adapting my writing style if that would give me translations that are decent enough about 80% of the time.

Unknown said...

Roland,

You should modify this widget to use the Worldwide Lexicon (www.worldwidelexicon.org/api). WWL is a collaborative human/machine translation system that allows users to edit and score translations, and uses machine translations as a fallback.

See also our universal translator for Firefox, which makes browsing foreign language sites seamless and automatic (it kicks in to translate pages when it senses a page is in another language). I can be reached as bsmcconnell@gmail.com to discuss further.

Would love to have a widget like this to distribute with our open source release.

Brian McConnell

Anonymous said...

Indeed the translation is quite poor, at least regarding Greek
