Monitoring PHP errors, warnings, and notices

There are a number of ways to monitor PHP errors, warnings, and notices. Your application code can trigger its own error handling, you can use PHP's built-in methods, you can run scripts in the background that analyze the logs, and so on. You probably do some of this already, but here is something that you’ll find handy.

First of all, don’t log all PHP noise into a single file. You can easily keep a separate log for each project. Somewhere at the top of your project, right when it starts loading, add the following configuration settings:

ini_set('error_reporting', E_ALL);
ini_set('log_errors', '1');
ini_set('error_log', '/path/to/project/logs/php_errors.log');
ini_set('display_errors', '0');

This will log all errors, warnings, and notices to the file that you specified. At the same time, it will stop PHP from displaying errors to your visitors (something that you should definitely do on a production server).

Once you’ve done that, you’ll notice another problem. If your application is of any considerable size and/or uses a lot of third-party code, you’ll get buried in all those warnings and notices. The file will quickly become very large and boring, and you’ll stop paying attention to it. Not good. While you can fight the size of the file with a tool like logrotate, the boredom is a more serious problem. The same notices and warnings appear over and over and over. You’ll fix some of them and the others will stay there forever. What you need is a way to get a quick overview of what is broken and what is noisy.

Today I wrote a quick cronjob to do just that. Here it is in its entirety.

#!/bin/bash

# This script parses the project PHP errors logs every hour, creates the summary of all
# errors/warnings/notices/etc and emails that summary to the email specified below.

EMAIL="[email protected]"
SUBJECT="here.com PHP errors summary for the last hour"
PHP_ERRORS_FILE="/path/to/project/logs/php_errors.log"

# Each log line starts with a timestamp like [01-Mar-2010 12:48:56]. The timestamp plus
# one space occupy 23 bytes, so the message itself starts at byte 24.
ONE_HOUR_AGO=$(date +'[%d-%b-%Y %H:' -d '1 hour ago')

# The escaped backslash puts a literal "\" in front of the opening square bracket of the
# date pattern, so that grep does not read it as the start of a bracket expression.
grep "^\\$ONE_HOUR_AGO" "$PHP_ERRORS_FILE" | cut -b 24- | sort | uniq -c | sort -n -r | mail -s "$SUBJECT" "$EMAIL"

You can drop this file into /etc/cron.hourly/report_php_errors.sh, make it executable, and wait for the next run of hourly scripts. (On Debian-based systems, run-parts skips file names that contain a dot, so you may need to drop the .sh extension there.) If you’ve updated the variables inside the script to reflect the correct email address and path to the log file, you’ll get an email every hour which will look something like this:

From: [email protected]
To: [email protected]
Subject: here.com PHP errors summary for the last hour

    14 PHP Notice:  Use of undefined constant PEAR_LOG_DEBUG - assumed 'PEAR_LOG_DEBUG' in /some/path/to/some/file.php on line 17
    12 PHP Notice:  Undefined index:  is_printed in /path/to/something.php on line 2035
     9 PHP Notice:  Undefined index:  blah in /some/foo/bar.php on line 42
     7 PHP Notice:  Undefined offset:  1 in /some/verifier/script.php on line 120

The email will not be limited to 3 or 4 lines. It will actually contain each and every individual notice, error, and warning that occurred during the last hour in your project. The list will be sorted by how often each warning occurred, with the most frequent entries at the top.
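If you want to try the summarizing step without sending any mail, here is a minimal sketch of it as a shell function. The name summarize_php_errors and its two arguments are made up for this illustration, and it swaps grep and cut for awk so that the timestamp prefix is matched as a plain string and needs no escaping. It still assumes the 23-byte timestamp prefix described above.

```shell
#!/bin/bash

# Hypothetical helper (not part of the original cronjob):
#   summarize_php_errors PREFIX LOGFILE
# Prints "count message" lines for every log entry whose line starts with
# PREFIX, most frequent entries first.
summarize_php_errors() {
    local prefix="$1" logfile="$2"

    # index($0, p) == 1 means "the line starts with the prefix", compared as a
    # plain string, so the square bracket is not special here.
    # substr($0, 24) drops the timestamp and the space that follows it.
    awk -v p="$prefix" 'index($0, p) == 1 { print substr($0, 24) }' "$logfile" \
        | sort | uniq -c | sort -n -r
}

# Example call (assumes GNU date and the log path used earlier):
# summarize_php_errors "$(date +'[%d-%b-%Y %H:' -d '1 hour ago')" /path/to/project/logs/php_errors.log
```

Piping the output of that call into mail gives you the same report as the cronjob.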

With this list you can start fixing your most frequently seen problems, and you can also notice weird activity much faster than just checking the log file and hoping to catch it with your own eyes.

Enjoy!

CakePHP : Building factories with models and behaviors

CakePHP is a wonderful framework. Recently I proved it to myself once again (not that I need much of that proof anyway). The problem that we had at work was a whole lot of code in one place and no obvious way to break that code into smaller, more manageable pieces. MVC is a fine architecture, but it wasn’t obvious to me how to apply it to larger projects.

In our particular case, we happen to have several data types, which are very similar to each other, yet should be treated differently.  Two examples are:

  1. Client account registrations. Our application supports different types of accounts and each type has its own processing, forms, validations, etc. However, each type of account is still an account registration.
  2. Financial transactions. Our clients can send us money via a number of payment methods – credit cards, PayPal, bank wires, etc. Each type of transaction has its own processing, forms, validations, etc. However, each type of transaction is still a financial transaction of money coming in.

Having a separate model for each type of account or for each type of transaction seems excessive.  There are differences between each type, but not enough to justify a separate model.  Having a single model though means that it’ll continue to grow with each and every difference that needs to be coded in.  Something like a class factory design pattern would solve the problem nicely, but the question is how to fit it into an existing MVC architecture.  Read the rest of this post for a demonstration.


PHP variables, strings, and curly braces

For the last couple of days we have had a number of arguments at work about the best way to surround a complex PHP variable inside a double-quoted string. More specifically, should the sigil ($, dollar sign) be on the inside of the braces or on the outside? Consider an example:

# my way
echo "Result: ${blah['something']}\n";
# the highway
echo "Result: {$blah['something']}\n";

In the examples we considered, there seemed to be no difference – both ways work. We’d still need to pick one for consistency reasons though. And I, as an ex-Perl programmer, was suggesting that we should use the dollar sign on the outside of the expression. This is how I remember it being in Perl (and PHP borrowed much of its syntax from Perl). This is how I am used to it. And this is what makes the most sense to me – a dollar sign immediately warns the programmer that a variable is ahead.

However, after consulting the PHP documentation, I was proved wrong. It says that both ways often work, but that it is much safer to put the dollar sign on the inside. The manual page even provides a few examples where the dollar on the outside won’t work (such as with objects).

While this is just a small thing to know and get used to, it still looks annoying to me.

PHP date() and 53 weeks

Let’s say you have a bunch of statistical data. And all that data is date-related. And let’s say you want to display that data on a chart, as a weekly average or something along those lines. One of the ways for you to place a value into the proper week would be something like this:

$week = date('W', strtotime($stats_date));
$values[$week][] = $stats_value;

And if you did it this way, sooner or later, you’d notice that something is not quite working right at the edges of your chart.  With code as simple and straight-forward as this, you’d probably look for the problem elsewhere.  Maybe it’s your statistical data which is wrong, or the graph is not generated properly.  But the problem is here.

How many weeks do you think there are in a year? Common knowledge says 52. However, if you think for a moment about how the weeks are related to the year, you’ll realize that the first and last weeks don’t necessarily start and end at the edge of the year. If you play around with the 1st of January and the 31st of December across several years, you’ll notice that sometimes they fall into the 53rd week. (As do a few more days, not just these two.)

And here, the problem with the “W” date() format starts to emerge. When scattering your data across a single year, you’d most often expect January to start with the first week of the year. But it doesn’t. Sometimes the start of it falls into the 53rd week of the previous year. And date('W', $your_time) will happily return 53. What will this do to your chart? Two things are most likely. The first week’s values would get reduced, and the last week’s values would get increased. Or the values of the first week would vanish from the graph altogether. That is, unless you are careful. Which I hope you’ll now be.
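The week numbering behind “W” is the ISO-8601 one, so you can poke at it from the shell as well. Here is a small sketch, assuming GNU date (for the -d option); the function name iso_week_key is made up for this illustration. GNU date's %V is the counterpart of PHP's “W”, and %G is the counterpart of PHP's “o” format character, the year that the ISO week belongs to.

```shell
#!/bin/bash

# Hypothetical helper: print the ISO-8601 week key for a given date.
# %G is the ISO week-numbering year and %V is the ISO week number (01..53).
# Assumes GNU date, because of the -d option.
iso_week_key() {
    date -d "$1" +'%G-W%V'
}

iso_week_key 2010-01-01   # 2009-W53: the 1st of January, but still last year's week
iso_week_key 2010-01-04   # 2010-W01: the first Monday of the new ISO year
```

Grouping by the week number together with the ISO year, instead of the week number alone, keeps those early January days out of the wrong end of the chart.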

See the comments on the PHP date() manual page for several solutions to this problem.

Subversion is not dead

Git is on the rise right now, especially in Open Source Software development circles. Some have even gone as far as predicting the death of Subversion. As much as I appreciate git (here is a link for you, if you don’t) and what it is doing for Open Source Software, I have to agree with Brandon Savage:

Corporate America needs a centralized version control system. Subversion still offers this: Subversion centralizes the repository and simply checks out a working copy (versus Git, which gives you a complete repository). Corporate America still needs to have canonical version numbers, and the ability to see the progress of a product over time as a single line – not a bunch of branches and independent repositories.

And this is true not only for corporate America.