100 Favorite Programming, Computer and Science Books

Peteris Krumins, of the Browserling fame, has a series of blog posts on his top favorite programming, computer and science books.  It’s an excellent selection of titles, from which I’ve read only a fraction.  Good timing for the Christmas shopping too.  Here are the blog posts in the series so far (5 books per post):

Even with the 30 books mentioned so far, there are new things to read and learn.  I wonder how many of the notes to self I’ll have by the time the whole 100 are listed.

Quick and easy introduction into PHP Mess Detector (PHPMD)

PHP Mess Detector is yet another one of those tools that help to keep the code base manageable and clean.  Here’s the description straight from the site:

What PHPMD does is: It takes a given PHP source code base and look for several potential problems within that source. These problems can be things like:

  • Possible bugs
  • Suboptimal code
  • Overcomplicated expressions
  • Unused parameters, methods, properties

Here is how you can jump right in.  It’s super easy.  It only takes 6 steps.

Step 1: Pick a project to try it on.

You can use any of your own PHP projects, or grab one from GitHub.  It doesn’t matter.  You’ll know better where to apply it once you get comfortable with the tool.  For sake of this quick guide, I’ll use one of our Open Source repositories – cakephp-groups plugin.

cd /tmp
git clone git@github.com:QoboLtd/cakephp-groups.git
cd cakephp-groups

Step 2: Install PHPMD with composer.

composer require phpmd/phpmd

Step 3: Run PHPMD.

If you run “./vendor/bin/phpmd“, you’ll see a help screen. But what’s the purpose of this blog post if you have to read the manual, right? So, let me simplify it for you. PHPMD needs three parameters:

  1. Path to the PHP source code that it will be examining.  We’ll use “src/“.
  2. Report format – one of: xml, text, or html.  We’ll use “html“.
  3. A choice of mess detection rules that you want it to apply.  You can create your own, or you can pick one from: cleancode, codesize, controversial, design, naming, unusedcode.  We’ll use “unusedcode“.

Also, we’ll give it an extra one: “–reportfile“, because by default PHPMD will spit everything to the standard output.  So, let’s put it together and see what we’ve got.

phpmd src/ html unusedcode --reportfile phpmd.html

Step 4: Examine the report.

After running PHPMD command above, you’ll find a phpmd.html file in the same folder. Here’s how it looked for me, when open in the browser.

PHP mess detector

So, PHPMD found one problem in the “src/Shell/Task/ImportTask.php” file on line 93.  Here’s the relevant piece of code:

    protected function _getImportErrors($entity)
    {
        $result = []; 
        if (!empty($entity->errors())) {
            foreach ($entity->errors() as $field => $error) {
                if (is_array($error)) {
                    $msg = implode(', ', $error);
                } else {
                    $msg = $errors;
                }
                $result[] = $msg . ' [' . $field . ']';
            }
        }

        return $result;
    }

As you can see (line 09 above is line 93 in the report), the issue reported by the PHPMD is a typo in the variable name. It should be $error, not $errors.

Step 5: Fix the problem.

  • Rename the $errors variable to $error.
  • Rerun the PHPMD report as per Step 3.
  • Examine report as per Step 4 to make sure that the problem is fixed and no new issues were introduced.
  • Create a new branch.
  • Commit the code.
  • Push the branch to GitHub.
  • Create the Pull Request.

All of the above mini steps took about 7 seconds.

Step 6: Pour yourself a drink.

You’ve just learned how to use a new tool, found a bug, and submitted a patch to the Open Source project.  At least I hope you did.

Not bad at all.

If you are wondering what to do next, here are a few suggestions:

  • Try running PHPMD for other types of issues.  As I said, it supports cleancode, codesize, controversial, design, naming, unusedcode, and we’ve only ran it for the “unusedcode”.  See what else is there.
  • Integrate PHPMD into your projects, to run automatically, together with your unit tests.  You do have automated unit tests, right?
  • Customize the ruleset that PHPMD is using to find more/less issues, which are maybe more specific to your project.
  • Use your newly acquired knowledge to fix issues with more Open Source projects.  You’ll make a name for yourself and you’ll make a world a better place.

Let me know how it goes.

PHP Static Analysis Tool – discover bugs in your code without running it!

Ondřej Mirtes shares the idea behind the creation of PHPStan – a static analysis tool for PHP:

Compiled languages need to know about the type of every variable, return type of every method etc. before the program runs. This is why the compiler needs to make sure that the program is “correct” and will happily point out to you these kinds of mistakes in the source code, like calling an undefined method or passing a wrong number of arguments to a function. The compiler acts as a first line of defense before you are able to deploy the application into production.

On the other hand, PHP is nothing like that. If you make a mistake, the program will crash when the line of code with the mistake is executed. When testing a PHP application, whether manually or automatically, developers spend a lot of their time discovering mistakes that wouldn’t even compile in other languages, leaving less time for testing actual business logic.

I’d like to change that.

This made sense to me, so I rushed to the repository.  I have quite a few projects to try this on.  I hurried so much that I didn’t pay attention to the important notes (aka prerequisities).  These are:

PHPStan requires PHP 7.0. You have to run it in environment with PHP 7 but the actual code does not have to use PHP 7 features. (Code written for PHP 5.6 and earlier can run on 7 mostly unmodified.)

PHPStan works best with modern object-oriented code. The more strongly-typed your code is, the more information you give PHPStan to work with.

Properly annotated and typehinted code (class properties, function and method arguments, return types) helps not only static analysis tools but also other people that work with the code to understand it.

Erm … if I had properly annotated and typehinted code, which is nicely organized into objects, I think, I wouldn’t need PHPStan as much as I need it now.  Anybody can analyze beautiful code.  Try figuring out what’s going on in a WordPress theme with 150 PHP files, split into classes, functions and chunks of unmaintainable code.  That’s where I wanted PHPStan to help me.

But OK.  Let’s see what it can do.  Gladly, my laptop already runs PHP 7 – here is a good first use for it.

Intstalling PHPStan with composer was easy.  All I had to do was resolve the nikic/php-parser dependency conflict between PHPStan and Sami, which is our source code documentation tool of choice (the newer version uses a much more recent version of the PHP Parser, so it wasn’t rocket science).

Once installed, a simple “vendor/bin/phpstan analyse ./src” command produced a report with a few issues.  Most of those were false positives, which can be fixed with a bit of PHPStan configuration.  But a few real problems that were found, were indeed bits that sneaked through our automated and manual testing.  For example:

------ ---------------------------------
 Line   src/Shell/EmailShell.php
------ ---------------------------------
 37      Return typehint of method App\Shell\EmailShell::getOptionParser() has invalid type App\Shell\ConsoleOptionParser.
------ ---------------------------------

I don’t think we’ll use PHPStan across all our code base just yet.  It’ll be too noisy for some projects.  And the PHP 7 requirement is not that easy to satisfy just yet.  But maybe sometime next year, once we finalize our move to PHP 7, I will integrate it into our automatic testing process.

All in all, it’s quite a useful tool and much needed for larger code bases.

PHP : Microsoft Office 365 and Active Directory

Disclaimer: I am not the biggest fan of Microsoft.  On the contrary.  I keep running into situations, where Microsoft technologies are a constant source of pain.  If that annoys you, please stop reading this post now and go away.  I don’t care.  You’ve been warned.

A few recent projects that I’ve been working on in the office required integration with Microsoft Office 365.  Office 365 is a new kid on the block as far as I am concerned, so I had no experience of integrating with these services.

The first look at what needs to be done resulted in a heavy drinking session and a mild depression.  Here are a few links to get you started on that path, if you are interested:

We’ve discussed the options with the client and decided to go a different route – limit the integration to the single sign-on (SSO) only, and use their Active Directory server (I’m not sure about the exact setup on the client side, but I think they use Active Directory Federation Services to have a local server in the office synchronized with the Office 365 directory).

Exposing the Active Directory server to the entire Internet is not the smartest idea, so we had to wrap this all into a virtual private network (VPN).  You can read my blog post on how to setup the CentOS 7 server as an automated VPN client.

Once the Active Directory was established, PHP LDAP module was very useful for avoiding any low-level programming (sockets and such).  With a bit of Google searching and StackOverflow reading, we managed to figure out the magic combination of parameters for ldap_connect(), ldap_set_option(), and ldap_search().

It took longer than expected, but some of it was due to the non-standard configuration and permissions on the client side.  Anyways, it worked, which were the good news.

The client accepted the implementation and we could just close the chapter, have another drink, and forget about this nightmare.  But something was bothering me about it, so I was thinking the heavy thoughts at the back of my mind.

The things that bother me about this implementation are the following:

  • Although it works, it’s a rather raw implementation, with very limited flexibility (filters, multiple servers, etc).
  • The code is difficult to test, due to the specifics of the AD setup and the network access limitations.
  • There is a lack of elegance to the solution.  Working code is good, but I like things to be beautiful too.  As much as possible at least.

So, I was keeping an eye open and I think today I came across a couple of links that can help make things better:

  1. adLDAP PHP library, which provides LDAP authentication and integration with Active Directory.  I don’t know how I missed it so far, but I think now things will be much easier and cleaner.
  2. Search Filter Syntax documentation on MSDN.
  3. This Reddit thread.  Yes, a lot of the things I’ve learned today are linked from it.  But it’ll be much easier for me to find all this information in my own blog, next time I’ll have to deal with Microsoft again.
  4. Public-facing LDAP server thanks to Georgia Institute of Technology, for testing connection and simple queries.

Armed with this new knowledge, I’m sure the current working solution can be improved a lot – simplified with fewer lines of code, based on the much more robust and tested code base, and given a basic test script to make sure the code works somewhere else, outside of a particular client’s setup.

I wish I came across that all much earlier.

 

Taking the Pain Out of MySQL Schema Changes

Taking the Pain Out of MySQL Schema Changes” covers the following approaches to deploying MySQL schema changes:

  1. Schema Change in Downtime
  2. Role Swap (cluster setup)
  3. pt-online-schema-change

The last one is the usage of pt-online-schema-change tool developed by Percona guys, as part of their Percona Toolkit – an Open Source set of command-line tools for MySQL.