Leonid Mamchenkov

PHP regular expression to match English/Latin characters only

Today at work I came across a task which turned out to be much easier and simpler than I originally thought it would be.  We have a site with some user registration forms.  The site is translated into a number of languages, but due to regulatory procedures, we have to force users to enter their registration details in English only, using Latin characters, numbers, and punctuation.

I refreshed my knowledge of Unicode and PCRE, and then came up with the following method, which seems to do the job just fine.

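/**
 * Check that given string only uses Latin characters, digits, and punctuation
 *
 * Note: \w also admits the underscore, and \s admits any whitespace.
 *
 * @param string $string String to validate
 * @return boolean True if Latin only, false otherwise
 */
public function validateLatin($string) {
    // No /u modifier on purpose: the subject is matched byte by byte, so
    // (in the default C locale) \w covers only ASCII letters, digits, and
    // the underscore. preg_match() returns 1 on a match.
    return preg_match('/^[\w\d\s.,-]*$/', $string) === 1;
}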

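In other words, just a standard regular expression with no Unicode trickery.  Adding the ‘/u’ modifier would actually break the check: in PHP it also switches \w, \d, and \s over to Unicode character properties, so letters and digits from any script (Cyrillic, Greek, and so on) would pass.  Good to know.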
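To see the difference for yourself, here is a minimal sketch that runs the same pattern with and without the modifier.  The function names matchesWithoutU() and matchesWithU() are made up for this illustration, and the results in the comments assume the script has not changed the locale with setlocale().

```php
<?php
// Hypothetical helpers for illustration only; the pattern is the one
// used in validateLatin() above.

// Byte-oriented matching: \w is limited to ASCII letters, digits, and "_".
function matchesWithoutU(string $s): bool {
    return preg_match('/^[\w\d\s.,-]*$/', $s) === 1;
}

// With /u, PHP treats the subject as UTF-8 and applies Unicode
// character properties, so \w accepts letters from any script.
function matchesWithU(string $s): bool {
    return preg_match('/^[\w\d\s.,-]*$/u', $s) === 1;
}

var_dump(matchesWithoutU('John Smith, Jr.'));  // bool(true)
var_dump(matchesWithoutU('Иван Петров'));      // bool(false)
var_dump(matchesWithU('Иван Петров'));         // bool(true)
```

If the underscore or locale-dependent behaviour of \w is a concern, spelling out the class as [A-Za-z0-9] is a stricter option.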
