Site icon Leonid Mamchenkov

Language Detection Library for PHP

patrickschur/language-detection – is a language detection library for PHP, which detects the language from a given text string.  Now, a bit more detailed:

This library can detect the language of a given text string. It can parse given training text in many different idioms into a sequence of N-grams and builds a database file in JSON format to be used in the detection phase. Then it can take a given text and detect its language using the database previously generated in the training phase. The library comes with text samples used for training and detecting text in 106 languages.

I tried it briefly with a few languages that I can master a phrase or two in, and it works better with some than with others.  Greek was good, Russian not so much.

Hopefully, the sample data used for training will improve over time, but it’s definitely a good start.

Via this blog post.

 

Exit mobile version