Perl vs. PHP: variable scoping

I’ve mentioned quite a few times that I am a big fan of the Perl programming language. However, most of my programming time these days is spent in PHP. The languages are often similar, with PHP having its roots in Perl, and Perl being such an influence in the world of programming languages. This similarity is often very helpful. However, there are a few differences, some of which are obvious and others not.

One such difference that I came across recently (in someone else’s code though) was about variable scoping. Consider an example in Perl:

#!/usr/bin/perl -w
use strict;
my @values = qw(foo bar hello world);
foreach my $value (@values) {
    print "Inside loop value = $value\n";
}
print "Outside loop value = $value\n";

The above script will fail to compile because of an undeclared variable $value: the one outside the loop.

Very similar code in PHP though:

#!/usr/bin/php
<?php
$values = array('foo','bar','hello','world');
foreach ($values as $value) {
    print "Inside loop value = $value\n";
}
print "Outside loop value = $value\n";
?>

Will output the following:

Inside loop value = foo
Inside loop value = bar
Inside loop value = hello
Inside loop value = world
Outside loop value = world

In Perl, the variable $value is scoped to the loop. Once execution leaves the loop, there is no such thing as $value anymore, hence the compilation error (thanks to use strict). PHP has no block scope: a variable created by a loop lives on in the enclosing scope, which here is the global one, so the last value “world” is carried further down the road. If you reuse variable names in different places of your program, counting on the scope to be different, you might get some really interesting and totally unexpected results. And they won’t be easy to track down either. Be warned.
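Two common ways to keep a PHP loop variable from leaking are sketched below. This is only an illustration of the behaviour described above; the helper name collect_lines() is made up for the example.

```php
<?php
// Hypothetical helper: variables used inside a function stay inside it.
function collect_lines(array $items)
{
    $lines = array();
    foreach ($items as $item) {
        $lines[] = "Inside loop value = $item";
    }
    return $lines;
}

$values = array('foo', 'bar', 'hello', 'world');

foreach ($values as $value) {
    // ... work with $value ...
}
// The loop is over, but $value still holds the last element.
$leftover = $value;

// Option 1: drop the variable explicitly once the loop is done.
unset($value);
$stillSet = isset($value);

// Option 2: keep the loop inside a function, so that $item
// never reaches the calling scope.
$lines = collect_lines($values);
$itemLeaked = isset($item);
```

Option 1 is easy to forget; option 2 costs a function call but makes the scope obvious to the next reader.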

Oracle and PHP – the deadly mix

I’ve spent most of the last week getting into, around, and out of the issues related to interoperability of Oracle and PHP. Before you start laughing, cursing, and blaming, Oracle wasn’t my choice of the database for this specific project. It’s just that the company already had it installed and working for the back end, and there needed to be some integration with the front end, which is of course MySQL and PHP based.

First thing I do, obviously, is visit PHP.net to check for the prefix of the functions that I need for Oracle. Throughout my experience with PHP, that’s about the only thing I need to know to start working with a new database. Oh, and the PHP module installed to provide those functions. The Oracle interface for PHP is called OCI8. All you need to do now is install the oci8 module.

Here comes the first trouble. oci8 is not provided as a pre-compiled package for Fedora Linux. There is an alternative yum repository – Remi, which has oci8 RPMs, but first of all, the oci8 module is compiled against somewhat outdated Oracle headers (version 10.2.0.4 instead of the latest 11.1.0.1), and it also needs to replace your native PHP and MySQL packages. I tried that, and it sort of worked, but I wasn’t happy. So I got my Fedora packages back and decided that I needed to compile oci8 myself.

In order to compile oci8, one needs to download Oracle InstantClient (basic package) and some header files (devel package). These can be downloaded from the Oracle web site, for free, minus the time for the registration. The little trick here is that during the oci8 compilation process, the header files are searched for in locations which do not include the one used by the Oracle RPM. I did a simple symlink of the includes folder to where the Oracle headers were, and compilation went on just fine. (Hint: otherwise you’ll get a whole lot of Zend related messages and a fatal error). Luckily, I only had to do this path correction on the Fedora 9 machine. My production server with Red Hat Enterprise Linux 5 compiled oci8 without any problems all by itself.

Update: more detailed instructions on the actual installation can be found here and here.

Now that oci8 was installed and configured, I spent some time figuring out the correct way to specify the DSN. Oracle uses some weirdly named file (tnsnames.ora) in some weird location, but luckily there is a way to go around it. More so, I recommend that you remove the tnsnames.ora file altogether, since it can add to your troubles. For example, if you mix spaces and tabs as whitespace in that file, you are screwed. So, just get rid of it. The way you specify the DSN is directly in the PHP script, and you use the syntax like so: “//hostname.or.ip:port/dbname“. Intuitive, I know.
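As a rough sketch of the connection step, assuming the oci8 extension is loaded: the helper names, host, port, database name, and credentials below are all placeholders, not values from the setup described here.

```php
<?php
// Hypothetical helper: builds a connect string of the form
// //hostname.or.ip:port/dbname, so no tnsnames.ora is needed.
function oracle_connect_string($host, $port, $dbname)
{
    return sprintf('//%s:%d/%s', $host, $port, $dbname);
}

// Hypothetical wrapper around oci_connect() that fails loudly.
function oracle_connect($user, $password, $dsn)
{
    if (!function_exists('oci_connect')) {
        throw new RuntimeException('The oci8 extension is not loaded');
    }
    $conn = @oci_connect($user, $password, $dsn);
    if ($conn === false) {
        // With no argument, oci_error() reports the last connection error.
        $err = oci_error();
        throw new RuntimeException('Oracle connection failed: ' . $err['message']);
    }
    return $conn;
}

// Placeholder values for illustration only.
$dsn = oracle_connect_string('db.example.com', 1521, 'orcl');
// $conn = oracle_connect('some_user', 'some_password', $dsn);
```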

Once you get connected to the server, you have a whole bag of surprises waiting for you. That is, if you are too used to working with MySQL. First is the syntax. Oracle is using PL/SQL, so you wipe the dust off that really old Pascal textbook that you have somewhere. “begin :result := some.procedure.call(:param1, :param2); end;” – that sort of thing. Secondly, you’ll be happy to know that prepared queries are supported. So your workflow will slightly change. Perl programmers will feel more at home here. oci_bind_by_name() and oci_execute() are your friends here. Oh, and while you are at it, get familiar with the types of the parameters, because they are important. And don’t forget that you’ll have to bind each and every variable in the query, or get a fatal error. And since you are learning something here, get ready for the Oracle errors. The most frequent one you’ll get would be something like “Failed to retrieve the error message for ORA-12345”, where 12345 would be the number of the error. So you’ll google for ORA-12345 and ORA-54321 and ORA-XYZZZ a lot. But then you’ll have a wrapper library and you’ll be OK.

Update: as was noted in the comments, PL/SQL is just an option, not a requirement. Also, most of the headaches of the above paragraph could be avoided by using one of the PHP frameworks. I personally haven’t tried a framework yet, since I’d like to see things working directly first. Especially since we are no longer in test mode only.
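The parse, bind, execute workflow could look roughly like the sketch below. It assumes an open oci8 connection in $conn; the wrapper name, the stored function my_pkg.my_func, and the 4000-byte result buffer are made up for the example.

```php
<?php
// Hypothetical wrapper: calls a stored function through a PL/SQL block
// and returns its result. $conn is an open oci8 connection.
function call_stored_function($conn, $param1, $param2)
{
    // my_pkg.my_func is a placeholder name, not a real procedure.
    $sql = 'begin :result := my_pkg.my_func(:param1, :param2); end;';
    $stmt = oci_parse($conn, $sql);
    if ($stmt === false) {
        $err = oci_error($conn);
        throw new RuntimeException($err['message']);
    }

    // Every placeholder in the statement has to be bound.
    $result = null;
    oci_bind_by_name($stmt, ':param1', $param1);
    oci_bind_by_name($stmt, ':param2', $param2);
    // An OUT bind needs a maximum length; 4000 is an arbitrary choice.
    oci_bind_by_name($stmt, ':result', $result, 4000);

    if (!@oci_execute($stmt)) {
        // For execution errors, oci_error() takes the statement handle.
        $err = oci_error($stmt);
        oci_free_statement($stmt);
        throw new RuntimeException($err['message'], $err['code']);
    }

    oci_free_statement($stmt);
    return $result;
}
```

Turning the error array into an exception in one place is one way to get the small wrapper library mentioned above.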

The bigger surprise is still waiting for you though.  You are very likely to discover that OCI8 implementation for PHP is very slow.  And I do mean extremely very slow.  I couldn’t believe that it could be slow, so I went into the source code and OMG!  It is really slow.  The slow part is around fetch_all() against fetch_row().  Basically, it’s always row by row and never all, even if you tell it how many rows you need fetched.

In my case, I have the server a bit far away, and there is a possibility to get many rows back.  So even for a simple query with 140 rows in results I was getting 20 seconds execution time.  Oracle was serving results fast, the network was OK, machines on both sides were powerful and all, but it was still taking 20 seconds or more.

I am still trying to find the solution to this issue, but so far it seems that the current way I do it will be the way to do it. And the way I do it now is the following. Never ever run direct SQL queries. Everything goes through a stored procedure. The results are returned all in a single row. And that single row has the BLOB (CLOB actually) with all results in one single XML. Fetching works well enough to get it, and then parsing is done with one of the billion XML parsers for PHP.

In my case MiniXML worked pretty well until bigger results started coming in. That’s when I learned an important lesson. MiniXML parses XML with a regular expression. PHP has a couple of settings in the configuration file that limit the amount of backtracking and recursion during regex matching – pcre.backtrack_limit and pcre.recursion_limit. If you really want to kill your server, set these to -1 (instead of the default 100000) and try a regex against a 1 MB XML file. Enjoy, ’cause it won’t be long before everything goes down. I didn’t feel like changing from MiniXML, so we just implemented some limits in the queries and stored procedures on the Oracle side, and added a few checks in PHP so that it fails rather than crashes the system.

So, to sum it up, here is my experience with Oracle and PHP from the last week:
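For illustration, here is how a single XML result could be unpacked into rows. This sketch uses PHP’s built-in SimpleXML rather than MiniXML, and the rows/row layout of the document is an assumption, not the format used in the project.

```php
<?php
// Hypothetical helper: converts one XML document (as it might arrive
// in a CLOB) into an array of associative arrays, one per row.
function rows_from_xml($xml)
{
    // Collect parse errors quietly instead of emitting warnings.
    libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        throw new RuntimeException('Could not parse XML result');
    }

    $rows = array();
    foreach ($doc->row as $row) {
        $fields = array();
        foreach ($row->children() as $field) {
            $fields[$field->getName()] = (string) $field;
        }
        $rows[] = $fields;
    }
    return $rows;
}

// Assumed sample layout: a <rows> root with one <row> per record.
$xml = '<rows>'
     . '<row><id>1</id><name>foo</name></row>'
     . '<row><id>2</id><name>bar</name></row>'
     . '</rows>';
$rows = rows_from_xml($xml);
```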
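A minimal sketch of the "fail rather than crash" idea on the PHP side: check the input size before matching, and check preg_last_error() afterwards. Both helper names are hypothetical, and the size limit is whatever suits your data.

```php
<?php
// Hypothetical guard: refuse oversized input before it reaches a regex.
function within_size_limit($text, $maxBytes)
{
    return strlen($text) <= $maxBytes;
}

// Hypothetical wrapper: runs preg_match() and turns PCRE failures
// (such as hitting pcre.backtrack_limit) into an exception.
function checked_match($pattern, $subject)
{
    $matches = array();
    $found = preg_match($pattern, $subject, $matches);
    if ($found === false || preg_last_error() !== PREG_NO_ERROR) {
        throw new RuntimeException(
            'Regex failed, PCRE error code ' . preg_last_error()
        );
    }
    // Return the matches, or null when the pattern simply did not match.
    return $found === 1 ? $matches : null;
}

$xml = '<name>foo</name>';
if (within_size_limit($xml, 1024)) {
    $hit = checked_match('/<name>(.*?)<\/name>/', $xml);
}
```

Without the preg_last_error() check, a match that was aborted by a limit looks exactly like "no match", which is how the problem tends to go unnoticed.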

  • I had to register on Oracle web site to download packages
  • I had to re-learn my long forgotten compilation skills
  • I had to go read some C
  • I had to step on the “re-inventing the wheel” path more than once
  • I am parsing XML when working with the database
  • I had a headache more than twice
  • I didn’t have much fun
  • After all, it works.  Sort of.

One last point in this saga is about Googling. Ask me any question, and I do mean any question, about MySQL. Heck, even PostgreSQL. And the answer is just there, on the first page of Google results. In any human or programming language. For any operating system. You’ll be sorted out and working in less than a minute. Then, try asking even the simplest of the simplest questions about Oracle and PHP. Sometimes you’ll find something. Some other times, you won’t. The overall feeling I have is that not a lot of people are using Oracle with PHP, and most of those who do are not very happy.

Now I’ve joined the army.

Programming language barrier

One of the frequent things that I hear about programmers is that it doesn’t matter which language the person is using and which language you need him to use, because if he is any good he’ll learn and catch up pretty fast.  In other words, if you take a decent Java programmer and push him to write PHP code for you, you’ll only have issues for a few days.  Or weeks, at most.

I understand the reasons for this statement, but I don’t agree with it.  At least not completely.

Firstly, the reasons. They are rather obvious, but I’d rather state them anyway. Computer Science is not specific to any programming language. The concepts and approaches are more or less the same everywhere. Flow control, data structures, and algorithms are not language specific. Each language has its own best practices and recommended variations, but a bubble sort in PHP will be very similar to a bubble sort in Java. Then you need some common sense, which is also not language bound at all.

Secondly, the disagreement. I think that the Computer Science theory and common sense aren’t the only things that make up a programmer. What makes a lot of difference is experience. Programming languages, in their practical application, are just collections of software – compilers, linkers, debuggers, libraries, IDEs, etc. Like any other software, programming language software has bugs, undocumented features, and Days When Things Just Don’t Work. It’s the experience with the language that teaches the programmer how to handle the issues of each software piece. And that experience is priceless (almost).

Even if you managed to push a Java programmer into writing PHP code, that would be a waste of resources. A Java programmer is a Java programmer, not a PHP programmer. He will, of course, learn PHP nuances with time, but he’ll probably lose a part of his priceless (almost) baggage. Sounds a lot like misuse of resources.

Another part of my disagreement is not so much reasoned as emotional. I’ve seen a few C and Java developers switch to Perl and PHP for their new positions. Not that I was forcing them to or anything, but they did. And the switch was mostly painful, to say the least. Here are some of the areas that I noticed as being hard to comprehend.

Compiling vs. interpreting. Those people who were used to their compilation process were missing something for the first few days. Some needed as much as a week to adapt, even though the write-save-reload-browser cycle was done a few hundred times a day.

Debugging. There are two major camps here. In the first one are all those people who live in the debugger. They know all the keyboard shortcuts and they have their highlighting customized. In the other camp are people of the simpler nature, those who use print() and die() for most of their debugging needs. It seems that most people coming from C and Java prefer the debugger way. Most of the interpreted languages do have either a standalone debugger or a built-in debugging tool, but it seems that the majority of the interpreted language crowd use the print() and die() approach.

Sigils. If you don’t know what a sigil is, read this Wikipedia page. Because you do know what it is. Many strongly typed languages don’t use any sigils. Most of the loosely typed languages do. Furthermore, when both the language from which you are changing and the language to which you are changing use sigils, chances are there will still be a difference. PHP, for example, uses $ for both scalars and arrays. In Perl though, you’ll get a $ for a scalar, @ for an array, and % for a hash. Perl’s sigils are extremely helpful when figuring out someone else’s code. I remember the pain of having just a $ in PHP, when I was learning it. And I can’t even imagine how confusing it is for people who are used to non-sigilized programming languages.

Types. As already mentioned above, programmers coming from strongly typed languages are often confused by the fact that variables can change their type on the fly, and that they don’t even need to be declared before use. Programmers coming from loosely typed languages will often complain about the requirement to define their types. Three of the most common questions that I’ve heard regarding this matter were:
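A tiny PHP illustration of the point: the same $ sigil fronts a scalar, a list, and a map, so the name alone tells you nothing about the shape of the data. The variable names are arbitrary.

```php
<?php
// All three use the same sigil, whatever they hold.
$name   = 'world';                        // scalar ($name in Perl)
$list   = array('foo', 'bar');            // list (@list in Perl)
$lookup = array('foo' => 1, 'bar' => 2);  // map (%lookup in Perl)

// Only a runtime check reveals which is which.
$kinds = array(gettype($name), gettype($list), gettype($lookup));
```

Here $kinds ends up holding 'string', 'array', 'array': PHP does not even distinguish the list from the map.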

  • “How do I define an array of elements of a certain type of a certain length?”
  • “Is this line a piece of nonsense or does it really do something:   $sum += 0; ?”
  • “What’s wrong with writing:  int amount; amount = 2.5; ?”
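To make the last two questions concrete, here is a short PHP sketch of what they trip over. The starting values are arbitrary examples.

```php
<?php
// "$sum += 0;" is not nonsense: it forces a numeric string into a number.
$sum = '42';                  // e.g. a value that arrived as text
$typeBefore = gettype($sum);  // 'string'
$sum += 0;
$typeAfter = gettype($sum);   // 'integer'

// There is no "int amount;" declaration, and nothing stops a variable
// that held an integer from holding a float a moment later.
$amount = 2;
$amountTypeFirst = gettype($amount);  // 'integer'
$amount = 2.5;
$amountTypeThen = gettype($amount);   // 'double'
```

Note that gettype() reports floats as 'double' for historical reasons.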

There are, of course, more areas than just those – include paths, include files, OOP, database abstraction, loops (“What the heck is foreach?”), memory management, libraries, and so on and so forth.

Even the list of the resources for each programming language takes time to build. Yes, time. And time is one thing that’s always against us. Everything else we can handle.

Follow-up to “Where did all the PHP programmers go?”

This is a quick follow-up to yesterday’s post – “Where did all the PHP programmers go?“.

First of all, let me take the moment and say “Wow!”.  Somebody submitted the post to Reddit and it made it to the front page and got an unbelievable amount of comments.  Almost 500, and still coming.  Thank you all.

Secondly, the comments on this blog are fixed finally.  Murphy’s Law in action – they got broken just before the wave came in and they got fixed shortly after.

Thirdly, I should clear up a few things. My apologies for getting you guys confused. I never asked any candidate to compare sorting algorithms, much less to implement them. I asked to sort an array. I was expecting one of those PHP function calls in return. But I only got it a few times. Many candidates didn’t know how to sort an array (apparently they use MySQL to sort an array). A few suggested “bubble sort”. Probably thinking that the task was about testing sorting algorithms. One even went as far as implementing a bubble sort in PHP. With pen and paper. This one was the toughest to decide about, by the way.
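For readers who have not met them, the built-in calls in question are presumably along these lines. The sample data is made up.

```php
<?php
// sort() orders values in place and renumbers the keys.
$names = array('hello', 'world', 'foo', 'bar');
sort($names);

// rsort() does the same in descending order.
$numbers = array(3, 1, 2);
rsort($numbers);

// asort() sorts by value but keeps the keys attached;
// ksort() sorts by key instead.
$ages = array('bob' => 30, 'alice' => 25);
asort($ages);
$namesByAge = array_keys($ages);
```

After this runs, $names is bar, foo, hello, world; $numbers is 3, 2, 1; and $namesByAge is alice, bob.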

Fourthly, the correction.  The language is indeed called Ruby, not Ruby on Rails. I am aware of that.  I was just trying to catch a thought.  Thanks for pointing it out though.

Fifthly, explanation for the pen and paper. Yes, I know that programmers are used to typing code. I know that they are used to their tools and online references. But. This is an interview. My time is limited and I have to make a decision. If I give all the tools and references to my mother, she will be able to solve the problem I am giving in reasonable time. She is not a PHP developer. She has no experience with PHP. But she has enough common sense to do it. If I take everything away – she won’t be able to do that. But any semi-decent programmer will. Further on, I am not feeding the resulting paper into the machine. The only parser that sees that code is the one embedded in my brain. And I assure you it is very tolerant of minor syntax errors and missing parameters. I want to see the process. The approach. Some data structures and algorithms. A bit of style in variable names, indentation, and empty lines, if I am lucky. That’s all.

Sixthly, on the exercise itself. I like to think that I am pretty flexible with answers. For this particular exercise, the Perl programmer inside me thinks an associative array is the best data structure. (And yes, before you start bashing further, I know that associative arrays in PHP aren’t the same as hashes in Perl.) I can accept an OOP solution just fine. What I find hard to accept is a single dimensional array with hopping over a pre-defined number of fields per record.

Seventhly, this post, once it got to Reddit and then further on to other news streams, generated more candidates and hints on where to find them than all of my previous efforts. Thanks to all of you who sent me resumes, links, and pointers. My inbox is a bit overwhelmed right now, but I’ll reply to everyone over the next few days.

Thanks a lot to all of you.

Where did all the PHP programmers go?

During the last six months or so, I’ve been looking to hire a PHP programmer for at least three companies. I have spoken to quite a few people on the phone, reviewed a bunch of resumes, and even interviewed a few. Out of all those candidates I recommended to hire exactly zero.

Before you start bashing my high standards, let me explain.  I wasn’t looking for a rocket scientist or anything remotely similar.  Not even a senior PHP developer.  Someone with enough knowledge to take over maintenance of a couple of projects, both of which are based on famous open source software – CakePHP and WordPress.

I can understand that not everyone has worked with or even heard of CakePHP or WordPress. I can understand that getting used to that source code and going through documentation might need some time. I can understand that not everyone is familiar with the open source software development model and that not everyone has worked in groups, so familiarity with version control software, documentation tools, and bug tracking was never a requirement.

What I cannot understand is why a person who has (according to him) developed more than two dozen web projects with PHP and MySQL cannot write the simplest piece of code with pen and paper. What I cannot understand is how a “senior web developer” with years of PHP experience and team leading becomes useless when his Dreamweaver is taken away. What I cannot understand is why people with more than one Bachelor’s degree in Computer Science recommend using bubble sort. What I cannot understand is why programmers start teaching the potential employer about the interviewing process instead of answering technical questions. And what I don’t understand is why technical people with years of team work get pissed off or burst into tears when you ask them a technical question, and a simple one at that, during the job interview.

If you are wondering what sort of questions I’ve been asking, here is an example. A simple question would be something like: “What is the difference between the stack (also known as FILO) and the queue (also known as pipe, also known as FIFO)?“. Most of the answer is already in the question, isn’t it?

Those of the candidates who were boasting about their years of experience and previous projects were given a simple programming task, which could be something like: “Using PHP programming language, create a list to store information about people. For each person you’ll need to store name, age, and gender. Populate the list with three sample records. Then, print out an alphabetically sorted list of names of all males in that list. Bonus points for not using the database.“. Each candidate was given a piece of paper, a pen, and an unlimited amount of time. And in the last six months I haven’t seen one candidate who could write the code to solve that problem.

We’ve been through all job sites, newspapers, local and foreign forums, and recruiting agencies, trying to find the candidate. We haven’t found even one. At least three are needed right now. More will be needed in the nearest future.

Hopefully, by now you will agree with me that the situation with human resources on the island of Cyprus is disastrous. There is more demand than there is supply, and it’s not getting any better.

Those of you who argue in favour of Cyprus being a small, unimportant country in the middle of technological nowhere, might want to wait. Last year I was in Greece at the Greek Blogger Camp. This year I was in Amsterdam at The Next Web Conference. At both events I chatted with a lot of people from all over Europe and the USA. I’ve also been all over forums and job web sites both local and foreign. And the feeling I’ve got is that the problem is not Cyprus specific, although, of course, Cyprus has it a bit worse than others, due to its position in the technology world, as well as its geographical location.
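In PHP terms, the difference can be shown with nothing more than a plain array. This is just one way to illustrate it.

```php
<?php
// Stack: the last item pushed is the first one taken off.
$stack = array();
array_push($stack, 'a', 'b', 'c');
$fromStack = array_pop($stack);    // takes from the end

// Queue: the first item added is the first one served.
$queue = array();
array_push($queue, 'a', 'b', 'c');
$fromQueue = array_shift($queue);  // takes from the front
```

$fromStack ends up as 'c' and $fromQueue as 'a'.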
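For reference, one possible answer is sketched below, using a list of associative arrays. The sample records are invented, and this is only one of several acceptable shapes for the solution.

```php
<?php
// A list of people, each stored as an associative array.
$people = array(
    array('name' => 'Nick',  'age' => 30, 'gender' => 'male'),
    array('name' => 'Maria', 'age' => 25, 'gender' => 'female'),
    array('name' => 'Alex',  'age' => 41, 'gender' => 'male'),
);

// Collect the names of all males.
$maleNames = array();
foreach ($people as $person) {
    if ($person['gender'] === 'male') {
        $maleNames[] = $person['name'];
    }
}

// Sort alphabetically and print, one name per line.
sort($maleNames);
foreach ($maleNames as $name) {
    print "$name\n";
}
```

With the records above, this prints Alex and then Nick.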

While still spending a lot of time looking for a PHP programmer, I was thinking about the roots of the problem.  PHP seems to be quite a popular language.  So, why is it such a problem to find a good PHP programmer? (note: “good”, not “great” or even “very good”) Thinking about the roots of the problem, I got this theory, which isn’t even a theory yet, but rather a raw chain of assumptions and conclusions.  Here is how it goes.

PHP is an ugly language

I know a few good programmers personally. I also read blogs and comments of a few more good programmers on the Web. And even though many of them use PHP often, or even on a daily basis, I don’t remember any of them ever saying that they enjoy PHP. If given the choice of a programming language for a new project, they’ll pick anything – Java, C, Python, Perl, Ruby, Haskell… Anything, but not PHP. PHP has its pros, but being a beautiful or convenient language is not one of them.

PHP is newbie safe

One of the reasons why PHP is so popular is that it is newbie safe. You don’t need to know much about anything to start programming in PHP. Most of the hosting companies will provide you with a PHP enabled hosting account for just a few dollars a month. You can write PHP in any text editor, so you won’t need a high end machine or an expensive IDE. The PHP.net web site has all the documentation and examples that you’ll ever need, so you don’t need to study hard in college or pay for a subscription to a developers’ network. All of these make PHP very attractive to beginner programmers.

PHP avoidance

Most of the good programmers that I know have learned PHP to some degree. Most of the bad programmers that I know have also learned PHP to some degree. But for good programmers PHP was either not the first programming language under their belt, or they’ve moved forward to some other programming language. Most of the bad programmers that I know only know one programming language – PHP – and they don’t know it well enough. So, for good programmers, learning and using PHP is more like a temporary state, while for the bad programmers using PHP is more like a constant state.

PHP is rich with secondary reasons

There are many reasons why PHP is so popular. It is free. It is open source. It is easy to set up. Most hosting companies offer PHP-enabled packages, as well as a lot of PHP software pre-installed.

With primary technical reasons (execution speeds, required resources, development speed, etc) not being very different from many other programming languages, PHP wins a lot of popularity with its secondary powers.

PHP is getting mature

PHP started off as a handy Perl library for web development. It grew and expanded over time. And so did the projects which were written in PHP. Where most PHP scripts used to do the simplest of things, such as contact and registration forms, visitor counters and some templating, most projects now are closer to full scale applications with user management, financial operations, high availability and load balancing setups, etc.

The moment of conflict

And here comes the moment of conflict. The complexity of PHP applications is growing higher and higher (see above). And the language is not beautiful enough to attract good programmers and make them stay (see above). The result? More and more applications are written by underqualified programmers, and it becomes harder and harder to find qualified personnel (the complexity of your own projects is growing too).

Questions?

How can we attract good programmers to PHP development? What are really the reasons for using PHP all that often, if it shares the biggest problem with the other languages – the impossibility of finding qualified personnel? Is there any other programming language that can solve this problem? Is there any solution at all?

Solutions

These, of course, I don’t have, as usual. But. I am looking with interest at hosted application services. The ones like from Amazon and Google. I think these will mature over the next few months and years. And there will be a few more (Yahoo, Microsoft, and IBM maybe?).

The way I see hosted application services is like this. They will split the programmers into two categories. The first category will be all those novice programmers, who don’t know how or don’t have the resources to take care of everything. They’ll be using hosting, databases, libraries, and programming interfaces provided by hosted application services. (Of course, good programmers will be using these too, but they will have a choice, not like the newbies). Hosted application services will (not yet though) make it easy to cover the ignorance and help to make a few bucks here and there. Exactly like PHP has been doing for years now. The good programmers though will mostly participate in in-house projects and customization developments, which won’t fit into hosted application services, and will require additional knowledge and experience.

Summary

If you are a PHP developer looking for a job in Cyprus, please let me know.

PHP 6 – hopefully not the end of the road

I’ve heard plenty of positive buzz about PHP 6 in the last few weeks.  Yes, it’s coming out.  Yes, it brings quite a few improvements, including better Unicode support, better security, and more help for larger projects through namespaces.  However, I hope that it won’t be the last PHP release, since there are so many other things that need fixing.

Here is a good overview, as compared to the best programming language ever – Perl.  But this probably reminds you of a famous Euro-English joke, no?   But I do miss sigils and proper hashes.  I’d love to see better memory management when programming objects.  I’d love to see improved database interfaces with prepared statements and database abstraction layer.  I would really welcome a cleanup in function names and return values. I … I … I … I hope that PHP 6 is not the end of the road, and that PHP 7, PHP 8, and PHP 9 will follow.

MIME type of uploaded files in PHP

Today I came across something that rather puzzled me at first, seemed irresponsible and such, but was cleared up later, upon reading the manual. When uploading files in PHP, the variable $_FILES stores a bunch of information about each file. One of those stored bits is the MIME type of the file. I was puzzled by how easy it was to trick PHP into setting a wrong MIME type. However, the documentation clearly says that:

The mime type of the file, if the browser provided this information. An example would be “image/gif”. This mime type is however not checked on the PHP side and therefore don’t take its value for granted.
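If the type matters, it can be checked on the server instead. Below is a minimal sketch using PHP’s fileinfo extension, assuming it is available; the helper name and the form field name 'upload' are hypothetical.

```php
<?php
// Hypothetical helper: asks fileinfo to detect the MIME type from the
// file contents, instead of trusting what the browser claimed.
function detect_mime_type($path)
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    return $finfo->file($path);
}

// 'upload' is an assumed form field name.
if (isset($_FILES['upload'])) {
    $claimed = $_FILES['upload']['type'];  // supplied by the browser
    $actual  = detect_mime_type($_FILES['upload']['tmp_name']);
    if ($claimed !== $actual) {
        error_log("MIME type mismatch: claimed $claimed, detected $actual");
    }
}
```

Content sniffing is not foolproof either, so it is best treated as one check among several rather than a guarantee.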

Upgraded to WordPress 1.5.1.3

I have finally upgraded this blog to WordPress 1.5.1.3. A couple of security issues with XML RPC are fixed by this release. I was a bit slow, since the fixes have been out for over a week now, but not to worry – my PHP installation already had all the fixes for XML RPC installed.

Slashdot is running a story on the issue. One of the comments shows an easy way of upgrading PEAR that not everyone might be familiar with:

pear clear-cache
pear upgrade XML_RPC

PHP turns 10 or new ways of starting holy wars

Slashdot has an article about PHP turning 10 years old. Scrolling through the comments for the post, it seems that Slashdot editors found a new way of starting holy wars. But instead of limiting themselves to the “X vs. Y” type of holy war, they do it in an “X vs. everything else” manner. Makes for some interesting reading.