tagbar-phpctags : Vim plugin for PHP developers


If you use the Vim editor to write PHP code, you probably already know about the excellent tagbar plugin, which lists methods, variables and the like in an optional window split.  Recently, I’ve learned of an awesome tagbar-phpctags plugin, which extends and improves this functionality via the phpctags tool, which has a deeper knowledge of PHP than the classic ctags tool.

Once installed, you’ll have a more organized browser of your code, with support for namespaces, classes, interfaces, constants, and variables.

PHP: array_merge_recursive() vs. array_replace_recursive()

Here is a nice blog post describing the important differences between the array_merge_recursive() and array_replace_recursive() functions in PHP.  These differences are easy to overlook when testing new code with simple data structures, and the resulting bugs are not obvious to troubleshoot later.
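To make the difference concrete, here is a minimal sketch. The $defaults and $override arrays are made up for illustration and do not come from the linked post.

```php
<?php
// Hypothetical configuration arrays, for illustration only.
$defaults = [
    'db'   => ['host' => 'localhost', 'port' => 3306],
    'tags' => ['a', 'b'],
];
$override = [
    'db'   => ['host' => 'db.example.com'],
    'tags' => ['c'],
];

// array_merge_recursive() combines values that share a string key into
// an array, and appends values that have numeric keys.
$merged = array_merge_recursive($defaults, $override);
// $merged['db']['host'] is ['localhost', 'db.example.com']
// $merged['db']['port'] is 3306
// $merged['tags']       is ['a', 'b', 'c']

// array_replace_recursive() overwrites values key by key, including
// numeric keys, and keeps whatever the second array does not mention.
$replaced = array_replace_recursive($defaults, $override);
// $replaced['db']['host'] is 'db.example.com'
// $replaced['db']['port'] is 3306
// $replaced['tags']       is ['c', 'b']
```

With flat, single-value test data the two functions can look interchangeable. The difference only shows up once both arrays carry the same key.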

BitBucket Pipelines and Docker for PHP Developers

I’ve been meaning to look into Docker for a long while now.  But, as always, time is the issue.  In the last couple of days though I’ve been integrating BitBucket Pipelines into our workflow.  BitBucket Pipelines is a continuous integration solution, which runs your project tests in a Docker container.  So, naturally, I had to get a better idea of how the whole thing works.

The “Docker for PHP Developers” article was super useful, even though it wasn’t immediately applicable to BitBucket Pipelines: they don’t currently support multiple containers, so everything has to run within a single container.

The default BitBucket Pipelines configuration suggests the phpunit/phpunit image.  If you want to run PHPUnit tests only, that works fine.  But if you want to have a full-blown Nginx and MySQL setup for extra bits (UI tests, integration tests, etc.), then you might find the smartapps/bitbucket-pipelines-php-mysql image much more useful.  Here’s the full bitbucket-pipelines.yml file that I’ve ended up with.

MySQL, PHP and “Integrity constraint violation: 1062 Duplicate entry”

Anna Filina blogs about an interesting problem she encountered while working on a PHP and MySQL project:

MySQL was complaining about “Integrity constraint violation: 1062 Duplicate entry”. I had all the necessary safeguards in my code to prevent duplicates in that column.

I gave up on logic and simply dumped the contents of the problematic column for every record. I found that there was a record with and without an accent on one of the characters. PHP saw each as a unique value, but MySQL did not make a distinction, which is why it complained about a duplicate value. It’s a good thing too, because based on my goal, these should have been treated as duplicates.

She also mentions two possible solutions to the problem:

My solution was to substitute accented characters before filtering duplicates in the code. This way, similar records were rejected before they were sent to the database.


As pointed out in the comments, a more robust and versatile solution would be to check the collation on the column.
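The first approach, folding accents before filtering duplicates, might look roughly like the sketch below. It is not Anna’s actual code: the function names are hypothetical, and it assumes the intl extension (which provides the Normalizer class) is loaded. Which strings MySQL really treats as equal depends on the column’s collation, so this is only an approximation.

```php
<?php
// Hypothetical helper: strips accents so that "Zoé" and "Zoe" compare equal.
// Assumes the intl extension is available for the Normalizer class.
function fold_accents(string $value): string
{
    // Decompose accented characters into base letter + combining mark.
    $decomposed = Normalizer::normalize($value, Normalizer::FORM_D);
    if ($decomposed === false) {
        return $value; // not valid UTF-8, leave untouched
    }
    // Drop the combining marks (Unicode category Mn).
    return preg_replace('/\p{Mn}+/u', '', $decomposed) ?? $value;
}

// Hypothetical helper: keeps the first occurrence of each accent-folded value.
function unique_ignoring_accents(array $values): array
{
    $seen = [];
    $unique = [];
    foreach ($values as $value) {
        $key = fold_accents($value);
        if (!isset($seen[$key])) {
            $seen[$key] = true;
            $unique[] = $value;
        }
    }
    return $unique;
}

$names = unique_ignoring_accents(['Zoé', 'Zoe', 'Anna']);
// $names is ['Zoé', 'Anna']
```

Checking the collation on the column, as suggested in the comments, avoids having to keep this PHP-side rule in sync with the database.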

I’m sure this will come in handy one day.

400,000 GitHub repositories, 1 billion files, 14 terabytes of code: Spaces or Tabs?

Here is an interesting bit of research – do people prefer tabs or spaces when programming the most popular languages?

Tabs or spaces. We are going to parse a billion files among 14 programming languages to decide which one is on top.

The results are not very surprising and somewhat disappointing (for all of us tab fans):

[Figure: tabs vs. spaces]

As far as PHP goes, I’m sure the choice of spaces has to do with the PSR-2 coding style guide, which states:

Code MUST use 4 spaces for indenting, not tabs.

On a more technical note, I think this is also related to the explosion of editors and IDEs in recent years, which, as good as they are, aren’t as good as Vim.  Vim allows for a very flexible configuration, where your code can be formatted and re-formatted any way you like, making tabs or spaces a non-issue.

Regardless of the results of the study, what’s more interesting is the method and tools used.  I’ve had my eye on Google BigQuery for a while now, but I’m too busy these days to give it a try.  The article gives a few insights into how awesome the tool is.  1.6 terabytes of data processed in 864.6 seconds:

That query took a relative long time since it involved joining a 190 million rows table with a 70 million rows one, and over 1.6 terabytes of contents. But don’t worry about having to run it, since I left the result publicly available at [fh-bigquery:github_extracts.contents_top_repos_top_langs].


Analyzing each line of 133 GBs of code in 16 seconds? That’s why I love BigQuery.
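On a single machine, the same per-line question can be asked of one file at a time. The sketch below is a small local analogue in PHP, not the BigQuery query from the article, and the count_indentation() name is made up for illustration.

```php
<?php
// Hypothetical helper: counts lines that begin with a tab versus a space.
function count_indentation(string $source): array
{
    $counts = ['tabs' => 0, 'spaces' => 0];
    // \R matches any line ending (LF, CRLF, CR).
    foreach (preg_split('/\R/', $source) as $line) {
        if ($line === '') {
            continue; // empty lines carry no indentation
        }
        if ($line[0] === "\t") {
            $counts['tabs']++;
        } elseif ($line[0] === ' ') {
            $counts['spaces']++;
        }
    }
    return $counts;
}

$sample = "if (true) {\n\techo 1;\n    echo 2;\n\techo 3;\n}";
$counts = count_indentation($sample);
// $counts is ['tabs' => 2, 'spaces' => 1]
```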

If you enjoyed this article, also have a look at “Analyzing GitHub issues and comments with BigQuery“, which works with a similarly sized dataset, trying to figure out how to write bug reports and pull request comments so that they are acted upon faster.

PHP backdoors

The PHP backdoors repository is a collection of obfuscated and deobfuscated PHP backdoors. (For educational or testing purposes only, obviously.)  These provide a great insight into what kind of functionality the attackers are looking for when they exploit your application.  Most of these revolve around file system operations, executing commands, and sending emails.

One of the things in those files that I haven’t seen before is FOPO, the Free Online PHP Obfuscator tool.

The Twelve-Factor App

I first heard about the twelve-factor app a couple of years ago, in Berlin, during the International PHP Conference.  It was the basis for a talk by David Zuelke (of Heroku fame) on best practices for modern-day PHP applications.

The twelve-factor app is a methodology for building software-as-a-service apps that:

  • Use declarative formats for setup automation, to minimize time and cost for new developers joining the project;
  • Have a clean contract with the underlying operating system, offering maximum portability between execution environments;
  • Are suitable for deployment on modern cloud platforms, obviating the need for servers and systems administration;
  • Minimize divergence between development and production, enabling continuous deployment for maximum agility;
  • And can scale up without significant changes to tooling, architecture, or development practices.

The twelve-factor methodology can be applied to apps written in any programming language, and which use any combination of backing services (database, queue, memory cache, etc).

Here are the 12 factors, each one covered in detail on the site:

  1. Codebase: one codebase tracked in revision control, many deploys.
  2. Dependencies: explicitly declare and isolate dependencies.
  3. Config: store config in the environment.
  4. Backing services: treat backing services as attached resources.
  5. Build, release, run: strictly separate build and run stages.
  6. Processes: execute the app as one or more stateless processes.
  7. Port binding: export services via port binding.
  8. Concurrency: scale out via the process model.
  9. Disposability: maximize robustness with fast startup and graceful shutdown.
  10. Dev/prod parity: keep development, staging, and production as similar as possible.
  11. Logs: treat logs as event streams.
  12. Admin processes: run admin/management tasks as one-off processes.

These seem simple and straightforward, but in reality they are not always easy to follow.  Regardless, they are a good goal to aim for.
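As a small illustration of factor 3 (Config), a PHP application might read its settings from environment variables along these lines. This is only a sketch: the env_or_default() helper, the variable names, and the default values are all hypothetical and do not come from the twelve-factor site.

```php
<?php
// Hypothetical helper: reads a setting from the environment, with a fallback.
function env_or_default(string $name, string $default): string
{
    $value = getenv($name);
    // getenv() returns false when the variable is not set.
    return $value === false ? $default : $value;
}

// Hypothetical variable names and defaults, for illustration only.
$config = [
    'database_url' => env_or_default('DATABASE_URL', 'mysql://localhost/app'),
    'log_target'   => env_or_default('LOG_TARGET', 'php://stderr'),
];
```

The same code then runs unchanged in development, staging, and production. Only the environment differs between deploys.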

504 Gateway Timeout error on Nginx + FastCGI (php-fpm)


The “504 Gateway Timeout” error is a very common issue when using Nginx with PHP-FPM.  Usually, it means that PHP-FPM took longer to generate the response than Nginx was willing to wait.  A few possible reasons for this are:

  • Nginx timeout configuration uses very small values (expecting the responses to be unrealistically fast).
  • The web server is overloaded and takes longer than it should to process requests.
  • The PHP application is slow (maybe because the database behind it is slow).

There is plenty of advice online on how to troubleshoot and sort out these issues.  But when it comes down to increasing the timeouts, I found such advice to be scattered, incomplete, and often outdated.  This page, however, has a good collection of tweaks.  They are:

  1. Increase PHP maximum execution time in /etc/php.ini: max_execution_time = 300
  2. Increase PHP-FPM request terminate timeout in the pool configuration (/etc/php-fpm.d/www.conf): request_terminate_timeout = 300
  3. Increase Nginx FastCGI read timeout (in /etc/nginx/nginx.conf): fastcgi_read_timeout 300;
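After changing these values and reloading the services, it can help to confirm that PHP actually picked up the new limit. The snippet below is a minimal sketch with a hypothetical describe_time_limit() helper. It should be requested through the web server rather than run from the command line, where max_execution_time defaults to 0.

```php
<?php
// Hypothetical helper: turns the raw ini value into a readable message.
function describe_time_limit(string $raw): string
{
    $seconds = (int) $raw;
    return $seconds === 0
        ? 'no execution time limit'
        : "limit: {$seconds} seconds";
}

// ini_get() reports the value in effect for the current request.
echo describe_time_limit((string) ini_get('max_execution_time')), PHP_EOL;
```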

Also, see this Stack Overflow thread for more suggestions.

P.S.: while you are sorting out your HTTP errors, have a quick look at HTTP Status Dogs, which I blogged about a while back.

httpoxy – a CGI application vulnerability for PHP, Go, Python and others


httpoxy is a set of vulnerabilities that affect application code running in CGI, or CGI-like environments.

It comes down to a simple namespace conflict:

  • RFC 3875 (CGI) puts the HTTP Proxy header from a request into the environment variables as HTTP_PROXY
  • HTTP_PROXY is a popular environment variable used to configure an outgoing proxy

This leads to a remotely exploitable vulnerability. If you’re running PHP or CGI, you should block the Proxy header now.
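The conflict follows directly from the CGI naming rule for request headers, sketched below in PHP. The cgi_variable_name() helper is hypothetical. It only mirrors the rule (uppercase the name, replace dashes with underscores, prefix with HTTP_) and is not a mitigation.

```php
<?php
// Hypothetical helper: mirrors the RFC 3875 rule for naming the
// meta-variable that carries a request header.
function cgi_variable_name(string $header): string
{
    return 'HTTP_' . strtoupper(str_replace('-', '_', $header));
}

// A client-supplied "Proxy" header ends up under the same name as the
// environment variable commonly used to configure an outgoing proxy.
$name = cgi_variable_name('Proxy');
// $name is 'HTTP_PROXY'
```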

Composer magic

Now that everyone is super comfortable with Composer, I thought I’d share these two gems which I didn’t know or think about.

composer info

This command lists all of your packages installed with composer.  This is super handy if you want to include a page in your project, listing all the libraries and versions which are currently installed.  It also gives you a description of each library as provided by the package.

composer outdated

This command lists the packages you are using that have updates available.  With this you can have a better understanding of what will happen if you run composer update (depending on your composer.json, of course).

Update (July 21, 2016): Guess what? There is even a way to combine the two with one command: composer info -l.  This will list all the packages, with their versions and descriptions, and with an additional column of the latest version for each package.