Dependency resolution with graphs in PHP

One of the projects I am working on at work presented an interesting problem.  I had a list of items with dependencies on one another and I needed to figure out the order in which to use those items, based on their dependencies.

For the sake of the example, think of a list of database tables which reference each other.  I need a way to export those tables so that, when they are imported back in, every table is imported after all the tables on which it depends.  (It’s not the real task I’m working on, but close enough.)

Consider the following list as an example of input data:

// List of items with dependencies.  Order is not important
$tables = [ 
    'articles_meta' => ['articles'],
    'articles' => ['users', 'categories', 'tags'],
    'categories' => [],
    'comments' => ['users', 'articles'],
    'options' => [],
    'tags' => [],
    'users' => [],
    'users_meta' => ['users'],
];

The result of the dependency resolution should give me a list like this (there are variations in order, of course, as long as the dependencies are satisfied):

categories
options
tags
users
users_meta
articles
articles_meta
comments
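
Since more than one order is acceptable, it helps to have a way of checking any candidate order against the dependencies.  Here is a minimal sketch of such a check.  The function name order_satisfies_dependencies() and the $good / $bad variables are made up for illustration and are not part of the code discussed in this post:

```php
<?php
/**
 * Check that every table appears after all of its dependencies.
 * (Hypothetical helper, used here only to illustrate the requirement.)
 */
function order_satisfies_dependencies(array $order, array $tables): bool
{
    $position = array_flip($order); // table name => index in the order
    foreach ($tables as $table => $deps) {
        if (!isset($position[$table])) {
            return false; // Table is missing from the order
        }
        foreach ($deps as $dep) {
            if (!isset($position[$dep]) || $position[$dep] > $position[$table]) {
                return false; // Dependency is missing or comes too late
            }
        }
    }
    return true;
}

$tables = [
    'articles_meta' => ['articles'],
    'articles' => ['users', 'categories', 'tags'],
    'categories' => [],
    'comments' => ['users', 'articles'],
    'options' => [],
    'tags' => [],
    'users' => [],
    'users_meta' => ['users'],
];

// The order listed above
$good = ['categories', 'options', 'tags', 'users', 'users_meta', 'articles', 'articles_meta', 'comments'];
// Same list, but articles_meta comes before articles
$bad = ['categories', 'options', 'tags', 'users', 'users_meta', 'articles_meta', 'articles', 'comments'];

$goodOk = order_satisfies_dependencies($good, $tables); // true
$badOk = order_satisfies_dependencies($bad, $tables);   // false
```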

There are several ways to solve this problem.  My first attempt took about 50 lines of code and worked fine, but it lacked elegance.  It had too many nested loops and tricky conditions and was difficult to read.  My second attempt was slightly better, with a bit of recursion, but still looked somewhat off.  It felt like there was a better way to do it, and that I’d done something similar before, but I couldn’t put my finger on it.

I thought I’d take a look at something that solves a similar problem.  Composer, the PHP package and dependency manager, surely had something to offer.  One brief look at its GitHub repository, and I dropped that idea.  Composer deals with much more complex issues, so its Dependency Resolver code is not something I can grasp in a few minutes.

It was time for some Googling.  Moments later, my deja vu feeling of “I’ve seen this before” was easily explained.  This problem belongs to graph theory, which I probably last used back in my college years.  Of course, I could have grabbed the book off the shelf and refreshed my knowledge, practicing the sacred art of Real Programming.  But time was an issue, so I cheated.

I found this “Dependency resolving algorithm” blog post by Ferry Boender over at Electric Monk (thanks man!).  He had exactly what I needed – a simple and straightforward recursive algorithm for walking the graph, circular dependency detection, and even some performance optimization.

The only problem was that his code is all in Python.  But that’s not really a problem.  So I’ve rewritten his code in PHP and got exactly what I needed.  Here it is:

// List of items with dependencies.  Order is not important
$tables = [
    'articles_meta' => ['articles'],
    'articles' => ['users', 'categories', 'tags'],
    'categories' => [],
    'comments' => ['users', 'articles'],
    'options' => [],
    'tags' => [],
    'users' => [],
    'users_meta' => ['users'],
];

$resolved = []; 
$unresolved = []; 
// Resolve dependencies for each table
foreach (array_keys($tables) as $table) {
    try {
        list ($resolved, $unresolved) = dep_resolve($table, $tables, $resolved, $unresolved);
    } catch (\Exception $e) {
        die("Oops! " . $e->getMessage());
    }   
}

// Print out result
foreach ($resolved as $table) {
    $deps = empty($tables[$table]) ? 'none' : join(',', $tables[$table]);
    print "$table (deps: $deps)\n";
}

/**
 * Recursive dependency resolution
 * 
 * @param string $item Item to resolve dependencies for
 * @param array $items List of all items with dependencies
 * @param array $resolved List of resolved items
 * @param array $unresolved List of unresolved items
 * @return array
 */
function dep_resolve($item, array $items, array $resolved, array $unresolved) {
    array_push($unresolved, $item);
    foreach ($items[$item] as $dep) {
        if (!in_array($dep, $resolved)) {
            if (!in_array($dep, $unresolved)) {
                array_push($unresolved, $dep);
                list($resolved, $unresolved) = dep_resolve($dep, $items, $resolved, $unresolved);
            } else {
                throw new \RuntimeException("Circular dependency: $item -> $dep");
            }
        }
    }
    // Add $item to $resolved if it's not already there
    if (!in_array($item, $resolved)) {
        array_push($resolved, $item);
    }
    // Remove all occurrences of $item in $unresolved
    while (($index = array_search($item, $unresolved)) !== false) {
        unset($unresolved[$index]);
    }

    return [$resolved, $unresolved];
}

Running the above code produces the following result:

$ php dependecy.php 
users (deps: none)
categories (deps: none)
tags (deps: none)
articles (deps: users,categories,tags)
articles_meta (deps: articles)
comments (deps: users,articles)
options (deps: none)
users_meta (deps: users)

Which is exactly what I was looking for.  And now that I have it here, I’ll probably be needing it again and again.  It’s an elegant hammer to a lot of my nails.
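
For the curious, here is a sketch of a variant of the same depth-first walk, showing the circular dependency detection in action.  It is not the code above and not Ferry’s original.  The names resolve_into() and resolve_all() are hypothetical, the state is passed by reference and kept in arrays keyed by item name (so lookups don’t need in_array()), and a dependency that is missing from the list is assumed to have no dependencies of its own:

```php
<?php
/**
 * Hypothetical variant of dep_resolve(): same recursive walk, but the
 * resolved and in-progress sets are keyed by item name.
 */
function resolve_into(string $item, array $items, array &$resolved, array &$visiting): void
{
    if (isset($resolved[$item])) {
        return; // Already placed in the output order
    }
    $visiting[$item] = true; // Item is on the current path through the graph
    foreach ($items[$item] ?? [] as $dep) { // Assumption: unknown items have no deps
        if (isset($visiting[$dep])) {
            throw new \RuntimeException("Circular dependency: $item -> $dep");
        }
        resolve_into($dep, $items, $resolved, $visiting);
    }
    unset($visiting[$item]);
    $resolved[$item] = true; // PHP arrays preserve insertion order
}

/**
 * Resolve every item and return the names in dependency order.
 */
function resolve_all(array $items): array
{
    $resolved = [];
    $visiting = [];
    foreach (array_keys($items) as $item) {
        resolve_into($item, $items, $resolved, $visiting);
    }
    return array_keys($resolved);
}

// A small graph without cycles
$order = resolve_all(['b' => ['a'], 'a' => [], 'c' => ['b', 'a']]);
// $order is ['a', 'b', 'c']

// Two items that depend on each other
$error = null;
try {
    resolve_all(['a' => ['b'], 'b' => ['a']]);
} catch (\RuntimeException $e) {
    $error = $e->getMessage();
}
// $error is "Circular dependency: b -> a"
```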

How to Read and Improve the C.R.A.P Index of your code

Levi Hackwith has an excellent post explaining “How to Read and Improve the C.R.A.P Index of your code”:

The C.R.A.P. (Change Risk Analysis and Predictions) index is designed to analyze and predict the amount of effort, pain, and time required to maintain an existing body of code.

It reiterates the old bits of wisdom – write simpler code and cover it with unit tests – but it does so in a very simple and measurable way.
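
To make the “measurable” part concrete: as the index is commonly defined, the score of a method m is CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m), where comp is the cyclomatic complexity and cov is the test coverage in percent.  A minimal sketch of that arithmetic follows.  The function name crap_score() is made up for illustration, and your tooling may round or report the number differently:

```php
<?php
/**
 * CRAP score for a single method (illustrative helper, not a library API).
 *
 * @param int $complexity Cyclomatic complexity of the method
 * @param float $coverage Test coverage of the method, in percent (0-100)
 * @return float
 */
function crap_score(int $complexity, float $coverage): float
{
    $uncovered = 1 - $coverage / 100; // Share of the method not covered by tests
    return $complexity ** 2 * $uncovered ** 3 + $complexity;
}

$untested = crap_score(5, 0);   // 5^2 * 1^3 + 5 = 30
$tested = crap_score(5, 100);   // 5^2 * 0^3 + 5 = 5
```

Both levers from the post are visible here: lowering the complexity shrinks every term, and raising the coverage shrinks the squared term towards zero.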

He also reminds us that:

…software metrics, in general, are just tools. No single metric can tell the whole story; it’s just one more data point. Metrics are meant to be used by developers, not the other way around – the metric should work for you, you should not have to work for the metric. Metrics should never be an end unto themselves. Metrics are meant to help you think, not to do the thinking for you. ~Alberto Savoia

Terminology – split screen terminal alternative to Terminator

If you are spending a lot of time in the console, and have to manage multiple windows, there are a few options for you – screen, tmux, and, of course, Terminator.  Recently, I’ve come across one more – Terminology.

Terminology is a console with built-in window multiplexing.  It feels a bit fancier than the options above and I enjoyed using it for about half a day.  After that, the look, feel, and unfamiliar mouse and keyboard behavior sent me back to the Terminator window.  But if you were looking for an alternative to the well established options, here is one to try.

Robo – Modern Task Runner for PHP

There are a whole lot of ways to build and deploy web applications these days.  I’ve done my own circle of trials and errors and have some very strong opinions on which ones are good, which ones are bad, and which ones are ugly.

My most recent discovery was Robo – a modern task runner for PHP.  I’ve been putting off trying it out until one of those weekends when I have nothing better to do.  And today I did.  Not just tried it out, but replaced a part of our infrastructure, which was previously running on Laravel Envoy.

Robo is very nice.  On one hand, it’s simple and straightforward, and it is very easy to get started with.  On the other hand, it provides quite a bit of functionality to help you with the build and deploy process.  Here are some of the things that I love about it:

  • It’s PHP!  Which makes it perfect for PHP projects, the kind I’m dealing with most of the time.  No need to translate your PHP into XML (hey Phing), or into a weird Rake/Capistrano-like syntax.
  • Instant support for command line arguments and their validation, help screens, ANSI colors, and the like.  Honestly, I don’t know why we are still fighting these things in 2016.
  • Transparency.  You install robo with composer, create your RoboFile.php, and you are done.  All public methods of the class in the RoboFile.php will be available as robo commands.  All parameters of public methods are populated by the command line arguments.  There is no black magic to it.  All that is executed is whatever you write.
  • Extensions (or Tasks and Stacks).  There are a few to utilize already (SSH, git, and more).  And it is trivial to write your own and share them with composer/Packagist.  That’s one of the things that is difficult with our current build setup based on phake (hence our phake-builder).

Things I don’t like (remember I have been using Robo for just about 3 minutes):

  • Logging.  While it’s trivial to add logs into the RoboFiles using your choice of logging library (monolog is awesome), I haven’t found a way to get all the output after the command execution.  For now, I’ve wrapped the Robo run into a shell script, which collects all the output and sends it by email to our deployment archives storage.
  • Remote power.  Robo has some very cool tasks and stacks for local work.  But I haven’t found a way to utilize them when using Robo’s SSH task.  This slightly spoils the clean remote commands with sprinkles of bash, which I would love to avoid.

Overall, it looks like a very nice and elegant system and I’ll probably be using it for much more.  Once I get a bit more comfortable with it, I will probably replace our phake-builder setup with Robo.

If you are looking for a good tool to use for your build and deploy needs, give it a try.  You’ll probably like it a lot.

vimrcfu – shared knowledge of vimrc

Dear all contributors to vimrcfu,

thank you very much for all my sleepless nights this week.  I’ve almost forgotten what my bed looks like.  On the other hand, I’ve learned a tonne and have significantly rearranged my vimrc and related files, expanding them with new bits and pieces.

The sleep I can get back.  The awesome features of Vim at my fingertips now – couldn’t have happened without you.

You rock!

Best regards,

yours truly.

Do you know YAML?

I thought I did.  Especially after all the hours spent with Ansible.  Turns out I don’t.  I have a very limited understanding of the YAML format.  How do I know that, you ask?  Well, that’s because I am reading the YAML specification now.

Holy moly, that’s an interesting format!  Highly recommended weekend reading.

base32 advantages over base64

Andrey shares some of the advantages of base32 over base64 encoding:

  1. The resulting character set is all one case, which can often be beneficial when using a case-insensitive filesystem, spoken language, or human memory.
  2. The result can be used as a file name because it can not possibly contain the ‘/’ symbol, which is the Unix path separator.
  3. The alphabet can be selected to avoid similar-looking pairs of different symbols, so the strings can be accurately transcribed by hand. (For example, the RFC 4648 symbol set omits the digits for one, eight and zero, since they could be confused with the letters ‘I’, ‘B’, and ‘O’.)
  4. A result excluding padding can be included in a URL without encoding any characters.
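
PHP ships base64_encode() but, as far as I know, no built-in base32 function, so here is a rough sketch to see the difference side by side.  The function base32_encode_sketch() is hypothetical and written only for this illustration, using the RFC 4648 alphabet (A–Z and 2–7) with ‘=’ padding.  It favors readability over speed:

```php
<?php
/**
 * Encode a binary string as RFC 4648 base32 (illustrative sketch only).
 */
function base32_encode_sketch(string $data): string
{
    if ($data === '') {
        return '';
    }
    $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';

    // Turn the input into one long string of bits, 8 per byte
    $bits = '';
    foreach (str_split($data) as $char) {
        $bits .= str_pad(decbin(ord($char)), 8, '0', STR_PAD_LEFT);
    }

    // Map every group of 5 bits to one symbol, zero-padding the last group
    $encoded = '';
    foreach (str_split($bits, 5) as $chunk) {
        $encoded .= $alphabet[bindec(str_pad($chunk, 5, '0'))];
    }

    // Pad the result with '=' up to a multiple of 8 characters
    return str_pad($encoded, (int) ceil(strlen($encoded) / 8) * 8, '=');
}

$raw = "\xff\xff\xff";

$b64 = base64_encode($raw);        // "////" - not usable as a file name
$b32 = base32_encode_sketch($raw); // "77776===" - one case, no slashes
```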

Personally, I don’t think I’d heard about base32 until today.

tagbar-phpctags: Vim plugin for PHP developers

If you are using the Vim editor to write PHP code, you probably already know about the excellent tagbar plugin, which lists methods, variables and the like in an optional window split.  Recently, I’ve learned of an awesome tagbar-phpctags plugin, which extends and improves this functionality via the phpctags tool, which has a deeper knowledge of PHP than the classic ctags tool.

Once installed, you’ll have a more organized browser of your code, with support for namespaces, classes, interfaces, constants, and variables.

O’Reilly Free Programming Ebooks

O’Reilly is giving away some programming ebooks for free.  Not the greatest of selections, but they might still come in handy, as subjects vary from Java and Python to micro-services and software architecture.  The books are available in ePub, Mobi, and PDF, but you’ll need to register / login to download them.

PHP: array_merge_recursive() vs. array_replace_recursive()

Here is a nice blog post describing the important differences between the array_merge_recursive() and array_replace_recursive() functions in PHP.  These differences are often overlooked when testing new developments with simpler data structures.  Troubleshooting the resulting bugs later is not straightforward.
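
A quick sketch of the difference, using made-up configuration arrays ($defaults and $override are just illustrative names):

```php
<?php
$defaults = [
    'db' => ['host' => 'localhost', 'port' => 3306],
    'tags' => ['a'],
];
$override = [
    'db' => ['host' => 'example.com'],
    'tags' => ['b'],
];

// Values under the same string key are collected into an array,
// and numerically indexed values are appended
$merged = array_merge_recursive($defaults, $override);
// [
//     'db' => ['host' => ['localhost', 'example.com'], 'port' => 3306],
//     'tags' => ['a', 'b'],
// ]

// Values under the same key (string or numeric) are overwritten
$replaced = array_replace_recursive($defaults, $override);
// [
//     'db' => ['host' => 'example.com', 'port' => 3306],
//     'tags' => ['b'],
// ]
```

Note how 'host' silently turns from a string into an array in the merged version.  With flat test data both functions can appear to behave the same, which is why the difference tends to show up late.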