Blog of Leonid Mamchenkov

You just stepped in a pile of posts.

Entries Tagged ‘PHP’

Whatever happened to programming

Via this Slashdot post I came across an excellent blog rant – Whatever happened to programming (and the follow-up).  The subject in focus is modern programming, and how boring it has become (mostly).

Today, I mostly paste libraries together.  So do you, most likely, if you work in software.  Doesn’t that seem anticlimactic?  We did all those courses on LR grammars and concurrent software and referentially transparent functional languages.  We messed about with Prolog, Lisp and APL.  We studied invariants and formal preconditions and operating system theory.  Now how much of that do we use?

Of course, when a subject like that is brought up, it’s pretty much guaranteed that the web will respond with numerous discussions on whether and how much of it is true, how we got here, how we can get out, and anything else remotely or not at all related.  And that’s just what happened.  You can read Slashdot or Reddit comments or Google for more.  But I think, if you do programming for a living, you’d probably agree with the main point of the article.  And even if you don’t, it’s still fun to read.  Like this bit, for example:

Especially, I have learned that anything that has “Enterprise” in its name is so incredibly boring that the people who use it had to shove the name of the Star Trek ship into its title just to keep themselves awake.

On a serious note though, working mainly with two programming languages – Perl and PHP – I see that there is indeed a difference in the degree of “being boring”.  PHP is way more boring than Perl.  Surprisingly so, given that Perl is so well known for CPAN – a huge archive of modules and libraries to use.  I guess it has something to do with There Is More Than One Way To Do It – the motto of Perl.

Monitoring PHP errors, warnings, and notices

There are a number of ways to monitor PHP errors, warnings, and notices.  You can have your application code trigger some error handling, you can use PHP built-in methods, you can have some scripts running in the background analyzing logs, etc.  While you probably already do some of it, here is something that you’ll find handy.

First of all, don’t log all PHP noise into a single file.  You can easily make separate logs for each project.  Somewhere at the top of your project, right when it starts loading, add the following configuration settings:

ini_set('error_reporting', E_ALL);
ini_set('log_errors', '1');
ini_set('error_log', '/path/to/project/logs/php_errors.log');
ini_set('display_errors', '0');

This will enable logging of all errors, warnings, and notices into the file that you specified.  At the same time, it will disable the display of those messages to your visitors (something that you should definitely do on a production server).

Once you’ve done that, you’ll notice another problem.  If your application is of any considerable size and/or if it uses a lot of third-party code, you’ll get buried in all those warnings and notices.  The file will quickly become very large and boring, and you will stop paying attention to it.  Not good.  While you can fight the size of the file with a tool like logrotate, the boredom is a more serious problem.  The same notices and warnings appear over and over and over.  You’ll fix some of them and the others will stay there forever.  What you need is a way to have a quick overview of what is broken and what is noisy.

Today I wrote a quick cronjob to do just that.  Here it is in its entirety.

#!/bin/bash

# This script parses the project PHP errors logs every hour, creates the summary of all
# errors/warnings/notices/etc and emails that summary to the email specified below.

EMAIL="me@here.com"
SUBJECT="here.com PHP errors summary for the last hour"
PHP_ERRORS_FILE="/path/to/project/logs/php_errors.log"

# Each log line starts with a timestamp like [01-Mar-2010 12:48:56]. The timestamp plus one space
# occupy 23 bytes, so the message itself starts at byte 24.
ONE_HOUR_AGO=$(date -d '1 hour ago' +'[%d-%b-%Y %H:')

# The backslash escapes the opening square bracket of the timestamp, which grep would otherwise
# treat as the start of a character class.
grep "^\\$ONE_HOUR_AGO" "$PHP_ERRORS_FILE" | cut -b 24- | sort | uniq -c | sort -n -r | mail -s "$SUBJECT" "$EMAIL"

You can drop this file into /etc/cron.hourly/report_php_errors.sh, change permissions to executable, and wait for the next run of hourly scripts. If you’ve updated the variables inside the script to reflect the correct email address and path to log file, you’ll get an email every hour which will look something like this:

From: cron@your.host
To: me@here.com
Subject: here.com PHP errors summary for the last hour

     14 PHP Notice:  Use of undefined constant PEAR_LOG_DEBUG - assumed 'PEAR_LOG_DEBUG' in /some/path/to/some/file.php on line 17
     12 PHP Notice:  Undefined index:  is_printed in /path/to/something.php on line 2035
      9 PHP Notice:  Undefined index:  blah in /some/foo/bar.php on line 42
      7 PHP Notice:  Undefined offset:  1 in /some/verifier/script.php on line 120

The email will not be limited to 3 or 4 lines. It will actually contain each and every individual notice, error, and warning that occurred during the last hour in your project. The list will be sorted by how often each warning occurred, with the most frequent entries at the top.

With this list you can start fixing your most frequently seen problems, and you can also notice weird activity much faster than just checking the log file and hoping to catch it with your own eyes.
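
If you would rather keep the whole thing in PHP, the same filter, strip, count, and sort pipeline can be sketched roughly as below.  This is only an illustration of what the shell one-liner does, not a replacement for it: the function name summarize_php_errors() and the sample usage are made up for this example, and it assumes the log lines start with a timestamp in the [01-Mar-2010 12:48:56] form shown above.

```php
<?php
/**
 * Count identical messages for one hour of a PHP error log.
 * Hypothetical helper: $hourPrefix has the same shape the cron script
 * greps for, e.g. "[01-Mar-2010 11:".
 */
function summarize_php_errors(array $lines, string $hourPrefix): array
{
    $counts = [];
    foreach ($lines as $line) {
        // Keep only the entries logged during the requested hour.
        if (strncmp($line, $hourPrefix, strlen($hourPrefix)) !== 0) {
            continue;
        }
        // Drop the timestamp: everything up to and including "] ".
        $end = strpos($line, '] ');
        if ($end === false) {
            continue;
        }
        $message = rtrim(substr($line, $end + 2));
        $counts[$message] = ($counts[$message] ?? 0) + 1;
    }
    arsort($counts); // most frequent messages first
    return $counts;
}

// Possible usage (path is the placeholder from the post):
// $lines = file('/path/to/project/logs/php_errors.log', FILE_IGNORE_NEW_LINES);
// $hour  = date('[d-M-Y H:', strtotime('-1 hour'));
// foreach (summarize_php_errors($lines, $hour) as $message => $count) {
//     printf("%7d %s\n", $count, $message);
// }
```

The result is an array of message => count pairs, so it is easy to feed into mail(), a dashboard, or whatever else you use for reporting.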

Enjoy!

CakePHP : Building factories with models and behaviors

CakePHP is a wonderful framework.  Recently I proved it to myself once again (not that I need much of that proof anyway).  The problem that we had at work was a whole lot of code in one place and no obvious way to break that code into smaller, more manageable pieces.  MVC is a fine architecture, but it wasn’t obvious to me how to apply it to larger projects.

In our particular case, we happen to have several data types, which are very similar to each other, yet should be treated differently.  Two examples are:

  1. Client account registrations.   Our application supports different types of accounts and each type has its own processing, forms, validations, etc.  However, each type of account is still an account registration.
  2. Financial transactions.  Our clients can send us money via a number of payment methods – credit cards, PayPal, bank wires, etc.  Each type of the transaction has its own processing, forms, validations, etc.  However, each type of the transaction is still a financial transaction of money coming in.

Having a separate model for each type of account or for each type of transaction seems excessive.  There are differences between each type, but not enough to justify a separate model.  Having a single model though means that it’ll continue to grow with each and every difference that needs to be coded in.  Something like a class factory design pattern would solve the problem nicely, but the question is how to fit it into an existing MVC architecture.  Read the rest of this post for a demonstration.

[Read the rest of this entry...]

PHP variables, strings, and curly braces

For the last couple of days we had a number of arguments at work about the best way to surround a complex PHP variable inside a double-quoted string.  More specifically, should the sigil ($, dollar sign) be on the inside of the braces or on the outside?  Consider an example:

# my way
echo "Result: ${blah['something']}\n";
# the highway
echo "Result: {$blah['something']}\n";

In the examples we looked at, there seemed to be no difference – both ways work.  We’d still need to pick one for consistency reasons though.  And I, as an ex-Perl programmer, was suggesting that we should use the dollar sign on the outside of the braces.  This is how I remember it being in Perl (and PHP originated from Perl).  This is how I am used to it.  And this is what makes the most sense to me – a dollar sign immediately warns the programmer that a variable is ahead.

However, after consulting the PHP documentation, I was proved wrong.  It says that both ways often work, but it is much safer to use the dollar sign on the inside.  The manual page even provides a few examples where the dollar on the outside won’t work (such as with objects).

While this is just a small thing to know and get used to, it still looks annoying to me.
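
For reference, here is a small sketch of the dollar-on-the-inside form handling the cases where it matters most: nested arrays and chained object properties.  The variable names and values are invented for the illustration.  One more reason to settle on this form: as far as I know, PHP 8.2 deprecated the dollar-on-the-outside ${...} interpolation, so the example below deliberately avoids it.

```php
<?php
// Hypothetical sample data, just for the demonstration.
$blah = ['something' => 'value', 'nested' => ['key' => 'deep']];

$user = new stdClass();
$user->profile = new stdClass();
$user->profile->name = 'someone';

// Dollar sign inside the braces: the whole expression is interpolated.
$simple = "Result: {$blah['something']}";
$nested = "Result: {$blah['nested']['key']}";
$object = "Result: {$user->profile->name}";

// Without braces, "$user->profile->name" only picks up $user->profile
// and then treats "->name" as plain text, which is not what you want.

echo $simple, "\n", $nested, "\n", $object, "\n";
```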

PHP date() and 53 weeks

Let’s say you have a bunch of statistical data.  And all that data is date-related.  And let’s say you want to display that data on a chart, a weekly average or something along those lines.  One of the ways for you to place the value into the proper week would be something like this:

$week = date('W', strtotime($stats_date));
$values[$week][] = $stats_value;

And if you did it this way, sooner or later, you’d notice that something is not quite working right at the edges of your chart.  With code as simple and straight-forward as this, you’d probably look for the problem elsewhere.  Maybe it’s your statistical data which is wrong, or the graph is not generated properly.  But the problem is here.

How many weeks do you think there are in a year?  Common knowledge says 52.  However, if you think for a moment about how the weeks are related to the year, you’ll realize that the first and last weeks don’t necessarily start and end at the edge of the year.  If you play around with the 1st of January and the 31st of December across several years, you’ll notice that sometimes they fall into the 53rd week.  (As do a few more days, not just these two.)

And here, the problem with the “W” date() format starts to emerge.  When scattering your data across a single year, you’d most often expect January to start with the first week of the year.  But it doesn’t.  Sometimes the start of it falls into the 53rd week of the previous year.  And date('W', $your_time) will happily return 53.  What will this do to your chart?  Two things are most likely.  Either the first week’s values get reduced and the last week’s values get increased, or the values of the first week vanish from the graph altogether.  That is, unless you are careful.  Which I hope you’ll now be.

See comments to PHP date() manual for several solutions of this problem.
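
One possible fix, sketched below, is to group by the ISO week-numbering year together with the week number.  PHP’s date() has the “o” format character for exactly that year.  The helper name week_key() and the "2009-W53" key format are my own choices for the illustration, and the timezone is pinned to UTC only to keep the example predictable.

```php
<?php
date_default_timezone_set('UTC'); // assumption: keeps the example deterministic

/**
 * Hypothetical helper: build a chart bucket such as "2009-W53".
 * 'o' is the ISO-8601 week-numbering year that the 'W' week belongs to,
 * which can differ from the calendar year ('Y') around New Year.
 */
function week_key(string $date): string
{
    $timestamp = strtotime($date);
    if ($timestamp === false) {
        throw new InvalidArgumentException("Cannot parse date: $date");
    }
    return date('o-\WW', $timestamp);
}

// The first of January 2010 still belongs to the last ISO week of 2009.
echo week_key('2010-01-01'), "\n"; // 2009-W53
echo week_key('2010-01-04'), "\n"; // 2010-W01
```

With keys like these, the days at the start of January that belong to week 53 land in the previous year’s last bucket instead of colliding with the current year’s December.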

Web statistics and visitor tracking : things you need to know

First of all, just to make it clear, I don’t recommend writing your own web statistics / analytics / tracking application.  Google Analytics can track and report pretty much everything you will ever need.  Period.  If you think it can’t do it, chances are you just don’t know how.  That’s much easier to correct than to write your own tracking / reporting application.  I promise.  If Google Analytics really doesn’t do something that you need, grab one of the Open Source applications and modify it to suit.  While not as easy as learning Google Analytics, that would still be much easier than doing your own thing from scratch.

However, if you still decide to roll out your own tracker, here are a few things that you need to know.

  • Use the bicycle, don’t reinvent it. Most of the tracking applications that I’ve seen use some form of JavaScript, which is appended right before the end of the page markup.  Said JavaScript collects as much statistics as you need and generates a request to an image on the remote server (your tracking application), passing gathered statistics as parameters to the image.  On the server side, your tracking application gathers sent parameters, merges them with whatever else you can get from the server side, and saves in the database or in your data storage of choice.
  • Keep ad blocking applications in mind. Many ad blocking plugins for different browsers block 1×1 pixel images from remote servers.  Be a bit more creative – use a 2×1 or a 1×2 pixel image.  If it is a transparent GIF at the bottom of the page, nobody will notice it anyway.
  • Gather as much as you can from the server side. It’s simpler, and you minimize the chances of breaking things with a URL which is too long (your GET request for the image with all parameters can run pretty long, especially if you pass current page and referring page URLs).
  • Minimize the length of your parameter names and values when you pass them to image GET request. Again, this is to avoid extremely long URLs.  You can sacrifice readability in your JavaScript and instead document parameters in the server side tracker application.
  • Record both the client’s IP address and a possible proxy server’s IP address. The latter is available to you in the request headers ($_SERVER['HTTP_X_FORWARDED_FOR'] in PHP, for example).  Once you have the IP addresses, use GeoIP to look up the country, region, city, coordinates, etc.  It’s better to do so at the time you record the data.  There is a free GeoIP service as well, but it will give you much less information.  The commercial one is not that expensive.
  • Record client’s browser information. Browsercap is very useful for that.  However, it’s better to parse user agent string with browsercap at the report / export time, not at the request recording time.  This will guarantee that you always have the most correct information about the browser in your report.  Browsercap gets updated with new signatures pretty often.
  • If you are tracking a secure site (HTTPS), chances are you won’t have referrer information available to you.  Apparently, that’s a security feature.
  • If you use both JavaScript and PHP to figure out the referrer, keep in mind that JavaScript uses document.referrer, while PHP uses $_SERVER['HTTP_REFERER'].  Notice that one is spelled with two Rs, while the other – with one.  That might save you some troubleshooting time.
  • It’s better to use the same JavaScript code snippet across all your sites.  To avoid SSL-related security warnings, your JavaScript needs to figure out whether it’s running on an HTTPS web site or on a plain HTTP one.  See the Google Analytics example on how to actually do that.  It doesn’t hurt to have a signed SSL certificate for the HTTPS hosting of your tracker application.
  • Don’t forget about HTML and URL escaping / encoding. Check that everything works properly for you in different browsers.  JavaScript is still hard to nail right sometimes.
  • Keep the version of the tracker application in every request log entry. This will greatly simplify your migrations later.  One of the ways to keep this automated is to use tags / keyword substitutions in your version control software (here is how to do this in Subversion).
  • Make sure your tracker spits out that transparent image no matter what. Broken image icons are very visible and you don’t want those on your site just because your tracker database went down temporarily.
  • For the best cross-site tracking, start a tracker session, which will remain the same when a visitor goes from one of your tracked web sites to another.  If your tracked web sites use sessions, pass their IDs to the tracker, so that both tracked and tracker session IDs can be logged in the same request. This will help you link stats from several sites together, as well as do all sorts of drill-downs into site-specific stats straight from the bird’s-eye view reports.
  • Don’t be evil! There is a lot that you can collect about your visitors.  Make sure that you tell them exactly what you are collecting and how you are using it.  Aggregate and anonymize your logs to prevent negative consequences.  I’m sure you know what I mean.
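
To make a few of the points above more concrete, here is a rough sketch of the server-side part: short GET parameter names mapped to readable fields, everything else gathered from $_SERVER, and the tracker version stored with every hit.  Everything named here (build_hit(), the parameter map, the version string, save_hit(), pixel.gif) is hypothetical and only serves the illustration.  Keep in mind that X-Forwarded-For is supplied by the client or a proxy, so treat it as a hint, not as a fact.

```php
<?php
const TRACKER_VERSION = '1.0'; // hypothetical; could come from VCS keyword substitution

// Short names used in the image URL => readable names used in storage.
// The mapping is documented here, on the server side, not in the JavaScript.
const PARAM_MAP = [
    'u' => 'page_url',
    'r' => 'js_referrer',      // document.referrer as seen by JavaScript
    's' => 'site_session_id',  // session ID of the tracked site, if any
];

/** Merge client-sent parameters with what the server already knows. */
function build_hit(array $query, array $server): array
{
    $hit = ['tracker_version' => TRACKER_VERSION];
    foreach (PARAM_MAP as $short => $field) {
        $hit[$field] = isset($query[$short]) ? (string) $query[$short] : null;
    }
    $hit['remote_addr']   = $server['REMOTE_ADDR'] ?? null;
    $hit['forwarded_for'] = $server['HTTP_X_FORWARDED_FOR'] ?? null;
    $hit['http_referer']  = $server['HTTP_REFERER'] ?? null; // one R, as in the header
    $hit['user_agent']    = $server['HTTP_USER_AGENT'] ?? null; // parse at report time
    return $hit;
}

// In the endpoint itself, a storage failure must never break the image:
// try {
//     save_hit(build_hit($_GET, $_SERVER));   // save_hit() is hypothetical
// } catch (Throwable $e) {
//     error_log($e->getMessage());
// }
// header('Content-Type: image/gif');
// readfile(__DIR__ . '/pixel.gif');           // hypothetical transparent GIF
```

Keeping build_hit() free of side effects makes it easy to test without a web server, and leaves the "always send the image" logic in one small place.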

Once again, think really hard before you decide to do one yourself.  It’s not an easy job.  And even if you grab all the data you want and save it in your database, there is an incomparably bigger issue yet to solve – reports, graphs, export, and the overall visualization and analytics part of that data.  Why would you even want to go into that?

Enforcing coding styles in PHP

I came across a plugin for CakePHP which helps to check whether your code follows the CakePHP coding style.  While I haven’t tried it, I think the better way is to utilize CodeSniffer.  As per the PHP_CodeSniffer PEAR page:

PHP_CodeSniffer tokenises PHP, JavaScript and CSS files and detects violations of a defined set of coding standards.

Which basically means that PHP_CodeSniffer is a generic tool for validating your code.  You can use it for CakePHP, WordPress, or any other PHP project that you are working on.  The best part is that you can create your own set of rules regarding coding style and then make sure that your team follows it.  If you don’t care that much for your own rules, then you can use one of the many existing rulesets.  Some of these come together with the CodeSniffer package, others are available on the Web.

Setting up CodeSniffer for my team at work has been a long-standing TODO item, but it looks like I will be able to start working on it next week.  Once the ruleset is created, tested, and everyone is happy with it, we’ll have it in the pre-commit hook in our Subversion repository.  This way, we will prevent commits of any code that does not follow our rules.  Of course, I plan to only run CodeSniffer against the code that we wrote in-house.  There is no need to re-format all the third-party code just for the sake of it.  Plus, we rarely modify the third-party code at all.

Attending PHP UK Conference 2009

The conference day.  We woke up early to get in the queue at registration, which opened at 08:30.  When we got to the Olympia Conference Center, which was about a five-minute walk from our hotel, it was full of people.  More than a hundred people already, and we were early.  Got our badges and notepads, grabbed a coffee, and started wandering around.  There were a few sponsor stands, so we had something to do.

Honestly, I thought there would be more stands, and from companies more closely related to web development.  We got to O’Reilly to buy some books at 35% discount (I was the first customer of the day, beta-testing the receipt issuing procedure, hehe).  Looked at iBuildings stand briefly.  Looked at Sun MySQL something to do with reporting tool something.  It was crowded over there and I had a cup of coffee in my hands, so didn’t get too close.  Saw a few people playing with Wii and some more with MS Xbox 360.  Seemed like fun.

The conference itself featured a few talks, and it was a double track, so each attendee had to choose which of the two concurrent talks to attend.  Here are the ones that I went to:

  • Keynote talk: The future’s so bright, I gotta wear shades by Aral Balkan. It was a bit too lengthy for the points it made, but inspirational nonetheless.
  • Sharding Architectures by David Soria Parra.  Very interesting discussion on scaling database across several servers. Sharding technique described can be applied to much more than just that.
  • Of Lambda Functions, Closures and Traits by Sebastian Bergmann.  A look into some advanced features of PHP 5.3.  These will make writing PHP code a bit more fun, and the result a bit more pleasant to look at.
  • Living with Frameworks by Stuart Herbert.  Nice, balanced look at why frameworks are important.  It was a bit misplaced though, since it was more for people who don’t yet use frameworks, while most of the audience was from the frameworks camp.
  • Myphp-busters: symfony framework by Stefan Koopmanschap.  An overview of Symfony framework, which made me love CakePHP even more.
  • Security-Centered Design – exploring the impact of human behavior by Chris Shiflett. Interesting discussion (with cool examples) of the social part in security approaches.

Sharding Architectures and Lambda Functions were two of my favourite talks for technical insight.  Security-Centered Design and Living with Frameworks were the two favourites for non-technical inspiration.

After the last talk there were a few free beers at the venue, and after that there was another beer session at Brook Green Hotel.  Quite a few people, quite a few pints, quite a few interesting conversations and contacts made, excellent buffet, and overall a time well spent.

A note to conference organizers: I know you guys worked hard to make this happen, and that you are a bunch of hobbyists who are not getting paid to do this, so, first of all, thank you.  I really enjoyed the event.  Here are a few things that I think could be improved, just in case you have control over them the next time:

  • WiFi coverage.  Yes, it was there and it was sort of working, but it was also slow and unstable.  At the beginning I thought that was just me for some reason, but then heard a few more people complain.
  • Power sockets.  I remember seeing only 3.   Maybe I just didn’t find them, of course, but they are sort of important.
  • Beer is the ultimate conversation maker.  Have it nearby from lunch on and more magic would happen.  (It doesn’t have to be free)
  • Merchandise.  Stickers, t-shirts, badges, etc. to help remember and promote the event.
  • More stands.  I wanted to see people who do hosting, consulting, training, build tools, and more of the related.

As I said, I had an excellent time, learned a few new things, got inspired, met interesting people, etc.  The event was definitely a success and I’d gladly attend the future ones as well.  Oh, and I took a few pictures, which are available in my PHP UK Conference 2009 Flickr set.

Programming religions

I’m slowly catching up with the news stream and all the jokes of the last few weeks.  “If programming languages were religions” is a nice one.  Here is PHP, which I spend the most time with these days:

PHP would be Cafeteria Christianity – Fights with Java for the web market. It draws a few concepts from C and Java, but only those that it really likes. Maybe it’s not as coherent as other languages, but at least it leaves you with much more freedom and ostensibly keeps the core idea of the whole thing. Also, the whole concept of “goto hell” was abandoned.

And here is Perl, which is my favourite programming language so far:

Perl would be Voodoo – An incomprehensible series of arcane incantations that involve the blood of goats and permanently corrupt your soul. Often used when your boss requires you to do an urgent task at 21:00 on friday night.

Check the rest of them for fun and profit.

Perl vs. PHP : variable scoping

I’ve mentioned quite a few times that I am a big fan of the Perl programming language.  However, most of my programming time these days is spent in PHP.  The languages are often similar, with PHP having its roots in Perl, and Perl being such an influence in the world of programming languages.  This similarity is often very helpful.  However, there are a few differences, some of which are obvious and others are not.

One such difference that I came across recently (in someone else’s code though) was about variable scoping.  Consider an example in Perl:

#!/usr/bin/perl -w
use strict;
my @values = qw(foo bar hello world);
foreach my $value (@values) {
    print "Inside loop value = $value\n";
}
print "Outside loop value = $value\n";

The above script will generate a compilation error due to undefined variable $value.  The one outside the loop.

Very similar code in PHP, though:

#!/usr/bin/php
<?php
$values = array('foo','bar','hello','world');
foreach ($values as $value) {
    print "Inside loop value = $value\n";
}
print "Outside loop value = $value\n";
?>

Will output the following:

Inside loop value = foo
Inside loop value = bar
Inside loop value = hello
Inside loop value = world
Outside loop value = world

In Perl, variable $value is scoped inside the loop.  Once execution leaves the loop, there is no such thing as $value anymore, hence the compilation error (thanks to use strict).  In PHP, a loop does not create a scope of its own: $value belongs to the enclosing scope, which in this script is the global one, so the last value “world” is carried further down the road.  If you reuse variable names in different places of your program, counting on the scope to be different, you might get some really interesting and totally unexpected results.  And they won’t be easy to track down either.  Be warned.
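
Two habits help here, sketched below with made-up function names: keep loops inside functions (function bodies do get their own scope in PHP), and unset() the loop variable when you really need it gone.  Treat this as an illustration, not a style rule.

```php
<?php
/**
 * The loop variable outlives the loop, but not the function:
 * whatever $value the caller has is left untouched.
 */
function last_seen(array $values): ?string
{
    foreach ($values as $value) {
        // ... work with $value ...
    }
    // Still set here after the loop (or never set, if the array was empty).
    return $value ?? null;
}

/** Explicit cleanup when the leftover variable would be a trap. */
function loop_with_cleanup(array $values): bool
{
    foreach ($values as $value) {
        // ... work with $value ...
    }
    unset($value);
    return isset($value); // false: the variable is gone
}

$value = 'untouched';
echo last_seen(['foo', 'bar', 'hello', 'world']), "\n"; // world
echo $value, "\n";                                      // untouched
```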