WordPress : Preferred Languages Research

Pascal Birchler of WordPress blogs about some interesting research he did on handling preferred languages, and on how different systems – ranging from browsers, wikis, and social networks to all kinds of content management systems – approach and solve the problem.

Drupal

Drupal 8 has a rather powerful user interface text language detection mechanism. There is a per session, per user and per browser option in the detection settings. However, users can only choose one language, so they cannot say (in core at least) that they want German primarily and Spanish if German is not available. But the language selected by the user is part of the larger fallback system, so it may fall back further down to other options.

The Language fallback module allows defining one fallback for a language, while the Language Hierarchy module provides a GUI to change the language fallback system. It allows setting up language hierarchies where translations of a site’s content, settings and interface can fall back to parent language translations, without ever falling back to English. This module might be the most interesting one for our research.

Apart from the research itself, I think this is an interesting example of how complex some seemingly simple features are.
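To get a feel for the fallback idea described in the quote, here is a minimal Python sketch. It is not how Drupal or the Language Hierarchy module actually implement it; the `PARENTS` map, the `TRANSLATIONS` data, and both function names are hypothetical and made up purely for illustration. It assumes a user can list several preferred languages, and that each language may have a parent language to fall back to.

```python
# Hypothetical parent map: each language falls back to its parent, if any.
PARENTS = {"de-AT": "de", "de-CH": "de", "pt-BR": "pt"}

# Hypothetical translation store: language code -> {string key -> text}.
TRANSLATIONS = {
    "de": {"greeting": "Hallo"},
    "es": {"greeting": "Hola", "bye": "Adiós"},
}


def fallback_chain(lang, parents=PARENTS):
    """Return lang followed by its ancestors, stopping if a cycle appears."""
    chain = []
    while lang is not None and lang not in chain:
        chain.append(lang)
        lang = parents.get(lang)
    return chain


def translate(key, preferred, translations=TRANSLATIONS, parents=PARENTS):
    """Return the first translation found while walking the user's preferred
    languages in order, trying each language's parents before moving on.
    Returns None if nothing matches (no implicit fallback to English)."""
    for lang in preferred:
        for candidate in fallback_chain(lang, parents):
            text = translations.get(candidate, {}).get(key)
            if text is not None:
                return text
    return None


# Austrian German first, Spanish second.
print(translate("greeting", ["de-AT", "es"]))
print(translate("bye", ["de-AT", "es"]))
```

Even this toy version has to make decisions: what happens on a cycle, whether to try parents before the next preferred language, and what to return when nothing matches. That is probably a good hint at why the real thing is complicated.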

Programming and Greek

One thought that cracks me up every now and then is about Greek programmers. In Greek, a semicolon is used instead of a question mark.

In many programming languages, a semicolon is used to mark the end of a statement. So, this:

$a = $b + $c;
print $a;

must look like this to Greek programmers:

$a = $b + $c?
print $a?

I don’t know about you, but to me this would be a constant confidence issue. It’s almost like I’m not sure what I’m doing and am asking the computer to confirm.

I’m sure though they have their ways of working around this …
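As a side note, Unicode does have a separate code point for the Greek question mark (U+037E), which looks like a semicolon but is a different character from the ASCII one (U+003B). Here is a small Python sketch that shows the difference, using only the standard `unicodedata` module. The `describe` helper and the sample string are my own, for illustration only.

```python
import unicodedata

GREEK_QUESTION_MARK = "\u037e"  # looks like ";" in most fonts
SEMICOLON = ";"                 # the ASCII statement terminator


def describe(ch):
    """Return the code point and the official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe(GREEK_QUESTION_MARK))
print(describe(SEMICOLON))

# The two characters are not equal, even though they look alike.
print(GREEK_QUESTION_MARK == SEMICOLON)

# Unicode normalization (NFC) replaces the Greek question mark
# with the plain semicolon.
greek_source = "print(1)" + GREEK_QUESTION_MARK
normalized = unicodedata.normalize("NFC", greek_source)
print(normalized == "print(1);")
```

So a Greek question mark pasted into source code is not the semicolon a parser expects, unless something normalizes the text first.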

By the way, while reading through the Wikipedia article linked above, I thought that the possible origins of the question mark were quite interesting:

[Image: questio]

That would also explain why not all languages use the question mark character.

The Definitive Guide to Natural Language Processing

“The Definitive Guide to Natural Language Processing” is an easy-to-follow article on what a challenging task it is for machines to understand human language. There’s also this cool video of two bots talking to each other.

Global email in Gmail. Bad idea.

The Gmail blog reports that Google is working on a more global email. The first step is internationalized email addresses, like this:

[Image: an internationalized email address]

As someone who has worked in international environments for years, I strongly dislike this idea. There is a whole array of issues related to it: readability of the email address (yes, read it!), display issues (do you have a font with all the necessary characters?), writing an email address (searching through the address book, for example), or even copy-pasting an email address (have you tried copy-pasting English strings from Hebrew or Arabic documents? Now you’ll be copy-pasting international email addresses from English documents – so much fun!). On top of that, there are all the usual things related to spam filters, trust issues (is this a company, free email hosting, or a personal domain?), etc. Can you spell out this email address over the phone? How about typing it on a mobile phone? Do you even know which language it is in?

Using non-accented Latin characters is a pain for all those people who don’t speak English. But it has worked nonetheless for the last few decades. Now we are heading towards a future where that pain won’t be limited to those who don’t read English but will extend to everyone, since you can’t really learn all the languages of the world or control which languages the email addresses arriving in your inbox are written in. Remember that just because an email address is in a given language, it doesn’t mean that the content of the email is in the same language.

On top of that, we’ve already tried this with international URLs. See how well that worked out. Yeah, some people sure use them. But try copy-pasting such a URL around and I guarantee you’ll end up with a whole bunch of long and cumbersome escaped strings. The same or a similar fate will hit the emails…
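To show what I mean by escaped strings, here is a tiny Python sketch using `quote` and `unquote` from the standard `urllib.parse` module. The path is a made-up example, not a real URL from anywhere. It only covers the path part of a URL; internationalized domain names are encoded differently and are not shown here.

```python
from urllib.parse import quote, unquote

# Hypothetical URL path containing Greek characters.
path = "/wiki/Ελληνικά"

# Percent-encode the path: every non-ASCII character becomes
# one "%XX" triplet per UTF-8 byte, while "/" is left alone.
escaped = quote(path)

print(path)
print(escaped)
print(len(path), "characters before,", len(escaped), "after")

# Decoding brings back the original, readable path.
print(unquote(escaped) == path)
```

Each Greek letter takes two bytes in UTF-8, so every single letter turns into six characters of escape codes. That is the form these URLs tend to take once they are copied out of the address bar.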

Cryptic changelog messages

Being a developer myself, I’m of course also guilty of an occasional cryptic changelog message.  But this one, from the latest update of the SEO Ultimate WordPress plugin, puzzled quite a few people I showed it to:

Version 7.6.4.3 (April 14, 2014)

  • Bugfix: Rich Snippet Creator’s “Place” search result type address fields appearance fix

What is that all about? How many nouns can you use one after another in a single sentence?