Dependency resolution with graphs in PHP

One of the projects I am working on at work presented an interesting problem.  I had a list of items with dependencies on one another and I needed to figure out the order in which to use those items, based on their dependencies.

For the sake of the example, think of a list of database tables, which reference each other.  I need a way to export those tables in such a way, that when imported back in, tables that have dependencies will be imported after all those tables on which they depend.  (It’s not the real task I’m working on, but close enough.)

Consider the following list as an example of input data:

// List of items with dependencies.  Order is not important
$tables = [ 
    'articles_meta' => ['articles'],
    'articles' => ['users', 'categories', 'tags'],
    'categories' => [],
    'comments' => ['users', 'articles'],
    'options' => [],
    'tags' => [],
    'users' => [],
    'users_meta' => ['users'],
];

The result of the dependency resolution should give me the list like this (there are variations in order, of course, as long as the dependencies are satisfied):

categories
options
tags
users
users_meta
articles
articles_meta
comments

There are several ways to solve this problem.  My first attempt took about 50 lines of code and worked fine, but it lacked elegance.  It had too many nested loops and tricky conditions and was difficult to read.  My second attempt was slightly better, with a bit of a recursion, but still looked somewhat off.  It felt like there is a better way to do it, and that I’ve done something similar before, but I could put my finger on it.

I thought I’d take a look at something that solves a similar problem.  Composer, PHP package and dependency manager, surely had something to offer.  A brief check of the GitHub repository, and that idea is out of my hand.  Composer deals with much more complex issues, so its Dependency Resolver code is not something I can grasp in a few minutes.

It was time for some Googling.  Moments later, my deja vu feeling of “I’ve seen this before” was easily explained.  This problem fits into the graph theory, which I probably used last back in my college years.  Of course, I could have grabbed the book off the shelf and refresh my knowledge, practicing the sacred art of the Real Programming.  But time was an issue, so I cheated.

I found this “Dependency resolving algorithm” blog post by Ferry Boender over at Electric Monk (thanks man!).  He had exactly what I needed – simple and straight forward recursive algorithm for walking the graph, circular dependency detection, and even some performance optimization.

dep_graph1

The only problem was that his code is all in Python.  But that’s not really a problem.  So I’ve rewritten his code in PHP and got exactly what I needed.  Here it is:

// List of items with dependencies.  Order is not important
$tables = [
    'articles_meta' => ['articles'],
    'articles' => ['users', 'categories', 'tags'],
    'categories' => [],
    'comments' => ['users', 'articles'],
    'options' => [],
    'tags' => [],
    'users' => [],
    'users_meta' => ['users'],
];

$resolved = []; 
$unresolved = []; 
// Resolve dependencies for each table
foreach (array_keys($tables) as $table) {
    try {
        list ($resolved, $unresolved) = dep_resolve($table, $tables, $resolved, $unresolved);
    } catch (\Exception $e) {
        die("Oops! " . $e->getMessage());
    }   
}

// Print out result
foreach ($resolved as $table) {
    $deps = empty($tables[$table]) ? 'none' : join(',', $tables[$table]);
    print "$table (deps: $deps)\n";
}

/**
 * Recursive dependency resolution
 * 
 * @param string $item Item to resolve dependencies for
 * @param array $items List of all items with dependencies
 * @param array $resolved List of resolved items
 * @param array $unresolved List of unresolved items
 * @return array
 */
function dep_resolve($item, array $items, array $resolved, array $unresolved) {
    array_push($unresolved, $item);
    foreach ($items[$item] as $dep) {
        if (!in_array($dep, $resolved)) {
            if (!in_array($dep, $unresolved)) {
                array_push($unresolved, $dep);
                list($resolved, $unresolved) = dep_resolve($dep, $items, $resolved, $unresolved);
            } else {
                throw new \RuntimeException("Circular dependency: $item -> $dep");
            }
        }
    }
    // Add $item to $resolved if it's not already there
    if (!in_array($item, $resolved)) {
        array_push($resolved, $item);
    }
    // Remove all occurrences of $item in $unresolved
    while (($index = array_search($item, $unresolved)) !== false) {
        unset($unresolved[$index]);
    }

    return [$resolved, $unresolved];
}

Running the above code produces the following result:

$ php dependecy.php 
users (deps: none)
categories (deps: none)
tags (deps: none)
articles (deps: users,categories,tags)
articles_meta (deps: articles)
comments (deps: users,articles)
options (deps: none)
users_meta (deps: users)

Which is exactly what I was looking for.  And now that I have it here, I’ll probably be needing it again and again.  It’s an elegant hammer to a lot of my nails.

Algorithm Economy and Containers

Containers (Docker, et al) have been getting all the hype recently.  I’ve played around with these a bit, but I’m not yet convinced this is the next greatest thing for projects that I am involved with currently.  However, it helps to look at these from different perspectives.  Here’s a blog post that ties containers to a new term that I haven’t heard before – algorithm economy.

The “algorithm economy” is a term established by Gartner to describe the next wave of innovation, where developers can produce, distribute, and commercialize their code. The algorithm economy is not about buying and selling complete apps, but rather functional, easy to integrate algorithms that enable developers to build smarter apps, quicker and cheaper than before.

CSS Colorguard – keep a watchful eye on your CSS colors

CSS Colorguard – keep a watchful eye on your CSS colors.

Here is a better description from the README:

Every CSS project starts out with good intentions, but inevitably, one too many people eye-dropper colors into nooks and crannies that you never knew existed. CSS Colorguard helps you maintain the color set that you want, and warns you when colors you’ve added are too similar to ones that already exist. Naturally, it’s all configurable to your tastes.

And here is my favorite part:

Colorguard uses the CIEDE2000 algorithm to determine the similarity of each of the colors in your CSS file. This algorithm is quite complex, but is used in the broadcasting community as the best approximation of human ability to discern differences in color. RGB on the other hand, is pretty bad at representing differences in color purely based on the numerical difference of the hex values.

Luckily, someone else already implemented CIEDE2000, so I didn’t have to. Tight. Cause this thing is mathy as hell.

CIEDE2000 math

John Carmack talks about the physics of light and rendering

For a while now I am thinking that you don’t really know something until you can easily explain it or talk about it, in simple words and with people who might not even know one thing about the subject.  John Carmack is well known and respected in the field of computer graphics and gaming, and watch him talk about light and rendering!  I now nothing of it, and I watch this whole talk, glued to the screen, catching every word.

Apart from the physics of light, this provokes thought on other subjects too.   The complexity of simple things comes to mind.  Something that we all observe every day and seldom think about – turns out to be so complex.  The importance of computer games is another subject.  I’m a big fan of Quake in particular, and I’ve heard a billion times people asking questions on why is this important at all and how this makes the world better.  Well, I guess, that question is easy to answer now.  Some game makers push the technology, push the science, and they do make the world better.  But they need us – gamers, once in a while, to pay for that and to provide feedback on what works and what doesn’t.

The Archive of Interesting Code

The Archive of Interesting Code

The Archive of Interesting Code is an (ambitious) effort on my part to research, intuit, and code up every interesting algorithm and data structure ever invented. In doing so, I hope both to learn the mathematical techniques that power these technologies and to improve my skills as a programmer.