Getting started with workflows in PHP

For a large project at work, we need to integrate or develop a workflow engine.  I worked a little bit with workflow engines in the past, but the subject is way too big and complex for me to claim any expertise in it.

So, I am looking at what’s available these days and what our options are.  This post is a collection of initial links and thoughts, and its goal is mostly to document my research process and findings, not to provide any answers or solutions yet.


CakePHP 3 : Remove Shell Welcome Header

CakePHP 3 has excellent support for command line Shells, Tasks, and Console Tools.  A few are bundled with the framework itself, and others come from a variety of plugins.  And, of course, you can have your own commands, specific to your application.

$ ./bin/cake

Welcome to CakePHP v3.4.3 Console
---------------------------------------------------------------
App : src
Path: /home/leonid/Work/cakephp_test/src/
PHP : 7.0.16
---------------------------------------------------------------
Current Paths:

* app:  src
* root: /home/leonid/Work/cakephp_test
* core: /home/leonid/Work/cakephp_test/vendor/cakephp/cakephp

Available Shells:

[Bake] bake

[DebugKit] benchmark, whitespace

[Migrations] migrations

[CORE] cache, i18n, orm_cache, plugin, routes, server

[app] console

To run an app or core command, type `cake shell_name [args]`
To run a plugin command, type `cake Plugin.shell_name [args]`
To get help on a specific command, type `cake shell_name --help`

There is one tiny little annoyance though.  Sometimes, it’s useful to take the output of a CakePHP Shell and use it in another script.  For example, you might need to get a list of all loaded plugins and loop over them, performing another action, outside of CakePHP.  Say, in a bash script.  Getting a list of loaded plugins is easy with the bundled shell, like so:

$ ./bin/cake plugin loaded

Welcome to CakePHP v3.4.3 Console
---------------------------------------------------------------
App : src
Path: /home/leonid/Work/cakephp_test/src/
PHP : 7.0.16
---------------------------------------------------------------
Bake
DebugKit
Migrations

But, as you can see, the output is not very useful for machine processing. The welcome header is in the way.  Sure, you can parse it out with regular expressions, or even a simple line count.  But that lacks elegance.  Is there a better way?  I thought there was.

My first approach was to use the --quiet option, which, I thought, would leave me with just the needed output.  It turns out, that’s not what it does.  It strips out all the output, so there is no list of plugins at all.

The second approach worked out better.  I learned about it from this thread.  The solution is to extend the needed CakePHP shell and override the protected _welcome() method.  Here’s the content of the newly created application level shell in src/Shell/PluginShell.php:

<?php
namespace App\Shell;

use Cake\Shell\PluginShell as Shell;

class PluginShell extends Shell
{
    /**
     * Silence the welcome message
     *
     * @return void
     */
    protected function _welcome()
    {
    }
}

And now running the same command as before produces much cleaner output:

$ ./bin/cake plugin loaded
Bake
DebugKit
Migrations

This now can be easily used in other scripts without any need for regular expressions and other trimming techniques.

Validating JSON against schema in PHP

GitHub was rather slow yesterday, which affected the speed of installing composer dependencies (since most of them are hosted on GitHub anyway).  Staring at a slowly scrolling list of installed dependencies, I noticed something interesting.

...
  - Installing seld/jsonlint (1.6.0)
  - Installing justinrainbow/json-schema (5.1.0)
...

Of course, I’ve heard of seld/jsonlint before.  It’s a PHP port of the zaach/jsonlint JavaScript tool, written by Jordi Boggiano, aka Seldaek, the genius who brought us the Composer dependency manager and the packagist.org repository.

But JSON schema? What’s that?

The last time I heard the word “schema” in a non-database context, it was in the XML domain.  And I hate XML with a passion.  It’s ugly and horrible and should die a quick death.  The sooner, the better.

But with all its ugliness, XML does something right: it allows a schema definition, against which the XML file can be validated later.

Can I have the same with JSON?  Well, apparently, yes!

The justinrainbow/json-schema package allows one to define a schema for what’s allowed in the JSON file, and then validate against it.  And even more than that, it supports both required values and default values too.

Seeing the package being installed right next to something by Seldaek, I figured, composer might be using it already.  A quick look in the repository confirmed my suspicion.  Composer documentation provides more information, and links to an even more helpful JSON-Schema.org.

Mind.  Officially.  Blown.

At work, we use a whole lot of configuration files for many of our projects.  The files intended for tech-savvy users are usually in JSON or PHP format, without much validation attached to them.  The files for non-technical users usually rely on even simpler formats like INI and CSV.  I see this all changing and improving soon.

But before any of that happens, I need to play around with these amazing tools.  Here’s a quick first look that I did:

  1. Install the JSON validator: composer require justinrainbow/json-schema
  2. Create an example config.json file that I will be validating.
  3. Create a simple schema.json file that defines what is valid.
  4. Create a simple index.php file to tie it all together, mostly just copying code from the documentation.

My config.json file looks like this:

{
	"blah": "foobar",
	"foo": "bar"
}

My schema.json file looks like this:

{
	"type": "object",
	"properties": {
		"blah": {
			"type": "string"
		},
		"version": {
			"type": "string",
			"default": "v1.0.0"
		}
	}
}

And, finally, my index.php file looks like this:

<?php
require_once 'vendor/autoload.php';

use JsonSchema\Validator;
use JsonSchema\Constraints\Constraint;

$config = json_decode(file_get_contents('config.json'));
$validator = new Validator;
$validator->validate(
	$config,
	(object)['$ref' => 'file://' . realpath('schema.json')],
	Constraint::CHECK_MODE_APPLY_DEFAULTS
);

if ($validator->isValid()) {
	echo "JSON validates OK\n";
} else {
	echo "JSON validation errors:\n";
	foreach ($validator->getErrors() as $error) {
		print_r($error);
	}
}

print "\nResulting config:\n";
print_r($config);

When I run it, I get the following output:

$ php index.php 
JSON validates OK

Resulting config:
stdClass Object
(
    [blah] => foobar
    [foo] => bar
    [version] => v1.0.0
)

What if I change my config.json to have something invalid, like an integer instead of a string?

{
	"blah": 1,
	"foo": "bar"
}

The validation fails with a helpful error message:

$ php index.php 
JSON validation errors:
Array
(
    [property] => blah
    [pointer] => /blah
    [message] => Integer value found, but a string is required
    [constraint] => type
)

Resulting config:
stdClass Object
(
    [blah] => 1
    [foo] => bar
    [version] => v1.0.0
)

This is great. Maybe even beyond great!

The possibilities here are endless.  First of all, we can obviously validate the configuration files.  Secondly, we can automatically generate the documentation for the supported configuration options and values.  It’s probably not going to be super fantastic at first, but it will cover ALL supported cases and will always be up-to-date.  Thirdly, this whole thing can be taken to the next level very easily, since the schema files are JSON themselves, which means schemas can be generated on the fly.

For example, in our projects, we allow the admin/developer to specify which database field of a table is used as the display field (in links and such).  Only existing database fields should be allowed.  So, we can generate the schema with the available fields on project deployment, and then validate the user configuration against that particular database setup.
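
Here is a rough sketch of what generating such a schema could look like.  It is only an illustration: the buildDisplayFieldSchema() helper, the display_field property name, and the hard-coded column list are all made up for this example (in a real deployment the columns would come from the database), and nothing here is part of the justinrainbow/json-schema package.  It relies on the standard JSON Schema "enum" and "required" keywords to restrict the value to existing fields.

```php
<?php
/**
 * Hypothetical helper: build a JSON Schema that only accepts
 * one of the given column names as the "display_field" value.
 *
 * @param string[] $columns Column names of the table
 * @return string Schema encoded as JSON
 */
function buildDisplayFieldSchema(array $columns): string
{
    $schema = [
        'type' => 'object',
        'properties' => [
            'display_field' => [
                'type' => 'string',
                // Only existing columns are valid values
                'enum' => array_values($columns),
            ],
        ],
        'required' => ['display_field'],
    ];

    return json_encode($schema, JSON_PRETTY_PRINT);
}

// Assumption: a hard-coded list stands in for a real database lookup
$columns = ['id', 'name', 'email'];
echo buildDisplayFieldSchema($columns) . "\n";
```

The resulting JSON could be saved as a schema file at deployment time and passed to the validator in the same way as the schema.json above.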

There are probably even better ways to utilize all this, but I’ll have to think about it, which is not easy with the mind blown…

Update (March 16, 2017): also have a look at some alternative JSON Schema validators.  JSON Guard might be a slightly better option.

Language Detection Library for PHP

patrickschur/language-detection is a language detection library for PHP, which detects the language of a given text string.  Now, in a bit more detail:

This library can detect the language of a given text string. It can parse given training text in many different idioms into a sequence of N-grams and builds a database file in JSON format to be used in the detection phase. Then it can take a given text and detect its language using the database previously generated in the training phase. The library comes with text samples used for training and detecting text in 106 languages.

I tried it briefly with a few languages that I can muster a phrase or two in, and it works better with some than with others.  Greek was good, Russian not so much.

Hopefully, the sample data used for training will improve over time, but it’s definitely a good start.

Via this blog post.