Unlike some other languages, PHP does not keep running in the background once requests stop coming in. Compare PHP to Java, for example, and you'll see what I mean. This can pose problems for large requests, where the browser stays locked up longer than most users are willing to endure. Another issue I've run into deals with updating static content. Content sometimes changes, and detecting those changes often takes enough work to call the benefit of static content into question, or you resort to caching, which brings its own set of problems. Other times you just need to do something at a given interval, for example checking every 30 minutes whether an RSS feed has changed. Your options are to check the time when a user accesses a page and decide whether to do some background work, or to resort to a cronjob and let it take care of this business. Most of the time one of these two options will work just fine.
I'm a firm believer that an out-of-the-box application should be as stupid-proof as possible, and that as the application packager I should assume the user knows nothing and do everything I can to make installation seamless. This is why I use a Phar to distribute apps and have them extract their own SQLite database, for example. With this line of thinking it's just not safe to assume the user has the slightest clue about cronjobs or how to set them up, let alone has access to them. So, what do you do?
Initially I assumed I would set up a single cronjob and call a single script which would dictate scheduling for programmatically defined "jobs". This would allow any one of my various modules to create a job on the fly and not have to concern itself with the particulars of how that job would be executed. Sounds fine, but it still doesn't fit my dummy-proof approach. Then I stumbled across this code snippet in the PHP manual while looking at ignore_user_abort()...
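<?php
// Without an argument ignore_user_abort() only reports the current setting,
// so pass true to keep the script alive after the client disconnects.
ignore_user_abort(true);
set_time_limit(0); // remove the execution time limit so the loop can run forever
$interval = 60 * 15; // do every 15 minutes...
do {
    // add the script that has to be run every 15 minutes here
    // ...
    sleep($interval); // wait 15 minutes
} while (true);
?>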
Some purists won't like this code, but I think it's golden! I've been running some tests locally, and as best I can tell, if you're smart about your "job" code you won't run into any issues. Part of the trick is making sure the scheduling script is actually running. I do this by updating a row in a database to record the last time the scheduler ran, and if that last run falls outside a given range (say twenty minutes with the above example), I trigger the scheduler again. For this to work, though, keep in mind the scheduler needs to be triggered with an HTTP request. You can do this either in the HTML world using JavaScript or something similar, i.e. an AJAX request to the scheduler. Or you can use PHP itself like this...
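<?php
// Give up waiting for a response after one second.
$ctx = stream_context_create(array(
    'http' => array(
        'timeout' => 1
    )
));
// The second argument is use_include_path; false is the idiomatic value here.
file_get_contents("http://www.mydomain.com/scheduler", false, $ctx);
?>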
Notice that the file_get_contents() request will time out after one second. The purpose is to kick off the scheduler, not to see what the scheduler is doing or wait for it. Another important thing to note is that the scheduler should check that it isn't already running when it starts up. I realize we just covered spawning the scheduler, but the check belongs in the actual scheduler code. Depending on your setup, it's entirely possible for a scheduler to get triggered while another one is executing, and you want to be sure to avoid tying up your Apache processes with unnecessary instances of the scheduler.
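To make those two guards concrete, here is a rough sketch of how the "is the scheduler stale?" check and the "is another scheduler already running?" check might look. Everything named here is my own invention for illustration: the `scheduler_state` table, the function names, and the lock file path are assumptions, not something from the PHP manual. It also assumes PHP 7.1 or later, a PDO connection to SQLite (`REPLACE INTO` is not portable to every database), and swaps in a lock file with `flock()` for the running check, which is only one of several ways to do it.

```php
<?php
// Hypothetical helpers: names, table layout, and paths are illustrative only.

/** True when no run was ever recorded or the last run is older than $maxAge seconds. */
function schedulerIsStale(?int $lastRun, int $now, int $maxAge = 1200): bool
{
    return $lastRun === null || ($now - $lastRun) > $maxAge;
}

/**
 * Reads the last-run timestamp from a one-row table. Assumed schema:
 * scheduler_state(id INTEGER PRIMARY KEY, last_run INTEGER).
 */
function readLastRun(PDO $db): ?int
{
    $value = $db->query('SELECT last_run FROM scheduler_state WHERE id = 1')->fetchColumn();
    return ($value === false || $value === null) ? null : (int) $value;
}

/** Records a heartbeat; the scheduler would call this on every pass of its loop. */
function recordRun(PDO $db, int $now): void
{
    $stmt = $db->prepare('REPLACE INTO scheduler_state (id, last_run) VALUES (1, ?)');
    $stmt->execute([$now]);
}

/**
 * Tries to take an exclusive, non-blocking lock on $lockFile.
 * Returns the open handle on success (keep it open while the scheduler runs)
 * or null when the file cannot be opened or another instance holds the lock.
 */
function acquireSchedulerLock(string $lockFile)
{
    $handle = fopen($lockFile, 'c'); // create if missing, never truncate
    if ($handle === false) {
        return null;
    }
    if (!flock($handle, LOCK_EX | LOCK_NB)) {
        fclose($handle);
        return null;
    }
    return $handle;
}

// On a normal page request (sketch):
//   if (schedulerIsStale(readLastRun($db), time())) { /* fire the HTTP trigger */ }
//
// At the top of the scheduler script (sketch):
//   $lock = acquireSchedulerLock('/path/to/scheduler.lock');
//   if ($lock === null) { exit; } // another scheduler is already running
```

The lock is released automatically when the scheduler process ends, so a crashed scheduler should not leave the lock stuck, and the stale heartbeat will eventually cause a new one to be triggered.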
As a final precaution, realize that you need to be smart about your code in the scheduler. You need to be aware of variable usage and what you're doing to the memory allotted to PHP. I suggest even going so far as to evaluate memory usage in the scheduler itself, shut down the scheduler when necessary, and spawn a subsequent scheduler process. You also have to deal with error handling, both from PHP and from user-land. Cron takes output and e-mails it to you; that's one option, and an external log file is another. Either way you want to be aware of errors and make sure the right people know about them.
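Here is a minimal sketch of what that memory check and log file could look like. The function names, the 80% threshold, and the log path in the comments are all assumptions of mine, so treat this as a starting point rather than a finished implementation. It relies on `memory_get_usage()`, `ini_get('memory_limit')`, and `error_log()` with message type 3, which appends to a file.

```php
<?php
// Hypothetical helpers: the names and the 80% threshold are illustrative only.

/** Converts a php.ini shorthand size such as "128M" to bytes; -1 means unlimited. */
function iniSizeToBytes(string $value): int
{
    $value = trim($value);
    if ($value === '' || $value === '-1') {
        return -1;
    }
    $number = (int) $value; // the cast keeps the leading digits: "128M" becomes 128
    switch (strtolower(substr($value, -1))) {
        case 'g':
            return $number * 1024 * 1024 * 1024;
        case 'm':
            return $number * 1024 * 1024;
        case 'k':
            return $number * 1024;
        default:
            return $number;
    }
}

/** True when usage exceeds $ratio of the limit; never true when the limit is unlimited. */
function memoryIsTight(int $usedBytes, int $limitBytes, float $ratio = 0.8): bool
{
    return $limitBytes > 0 && $usedBytes > $limitBytes * $ratio;
}

/** Appends a timestamped line to a log file owned by the scheduler. */
function logSchedulerMessage(string $logFile, string $message): void
{
    // Message type 3 appends to the given file and does not add a newline itself.
    error_log(date('c') . ' ' . $message . PHP_EOL, 3, $logFile);
}

// Inside the scheduler loop, after each pass over the jobs (sketch):
//   $limit = iniSizeToBytes((string) ini_get('memory_limit'));
//   if (memoryIsTight(memory_get_usage(true), $limit)) {
//       logSchedulerMessage('/path/to/scheduler.log', 'Memory high, handing over');
//       // fire the HTTP trigger for a fresh scheduler here, then stop this one
//       exit;
//   }
```

Handing over to a fresh process is a blunt tool, but it keeps a slow leak in one job from taking the whole scheduler down without anyone noticing.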
Well, happy scheduling!