Home Page
Archive > Posts > Tags > PHP
Archive > Posts > Tags > PHP
Search:
RABiD BUNNY FEVER
K.T.K

Warning: you do not have javascript enabled. This WILL cause layout glitches.

Opening IntelliJ via the Symfony ide setting
Nasty Escaping Problems

I wanted a simple setup in Symfony where the programmer could define their ide in the parameters file. Sounds simple, right? Just add something like ide_url: 'phpstorm' to parameters.yml->parameters and ide: '%ide_url%' to config.yml->framework. And it worked great, however, my problem was much more convoluted.

I am actually running the Symfony server on another machine and am accessing the files via NFS on Windows. So, it would try to open PHPStorm with the incorrect path. Symfony suggests the solution to this is writing your own custom URL handler with %f and %l to fill in the filename and line, and use some weird formatting to do string replaces. So I wrote in 'idea://%%f:%%l&/PROJECT_PATH_ON_SERVER/>DRIVE_LETTER:/PATH_ON_WINDOWS/' (note the double parenthesis for escaping) directly in the config.yml and that worked, kind of. The URL was perfect, but IntelliJ does not seem to register the idea:// protocol handler like PHPStorm theoretically does (according to some online threads) with phpstorm://. So I had to write my own solution.

This answer on stackoverflow has the answer on how to register a protocol handler in Windows. But the problem now was that the first parameter passed to IntelliJ started with the idea:// which broke the command-line file-open. So I ended up writing a script to fix this, which is at the bottom.

OK, so we’re almost there; I just had to paste the string I came up with back into the parameters.yml, right? I wish. While this was now working properly in a Symfony error page, a new problem arose. The Symfony bin/console debug:config framework command was failing with You have requested a non-existent parameter "f:". The darn thing was reading the unescaped string as 'idea://%f:%l&...' and it thought %f:% was supposed to be a variable. Sigh.

So the final part was to double escape the strings with 4 percent signs. 'idea://%%%%f:%%%%l&...'. Except now the URL on the error pages gave me idea://%THE_PATH:%THE_LINE_NUMBER. It was adding an extra parenthesis before both values. This was simple to resolve in the script I wrote, so I was finally able to open scripts directly from the error page. Yay.



So here is the final set of data that has to be added to make this work:
Registry: HKCR/idea/(default) = URL:idea Protocol HKCR/idea/URL Protocol = "" HKCR/idea/shell/open/command = "PATH_TO_PHP" -f "PATH_TO_SCRIPT" "%1" "%2" "%3" "%4" "%5" "%6" "%7" "%8" "%9" parameters.yml: parameters: ide_url: 'idea://%%%%f:%%%%l&/PROJECT_PATH_ON_SERVER/>DRIVE_LETTER:/PATH_ON_WINDOWS/' config.yml: framework: ide: '%ide_url%' PHP_SCRIPT_FILE:
<?php
function DoOutput($S)
{
	//You might want to do something like output the error to a file or do an alert here
	print $S;
}

if(!isset($argv[1]))
	return DoOutput('File not given');
if(!preg_match('~^idea://(?:%25|%)?([a-z]:[/\\\\][^:]+):%?(\d+)/?$~i', $argv[1], $MatchData))
	return DoOutput('Invalid format: '.$argv[1]);

$FilePath=$MatchData[1];
if(!file_exists($FilePath))
	return DoOutput('Cannot find file: '.$FilePath);

$String='"C:\Program Files\JetBrains\IntelliJ IDEA 2018.1.6\bin\idea64.exe" --line '.$MatchData[2].' '.escapeshellarg($FilePath);
DoOutput($String);
shell_exec($String);
?>
Compiling XDebug for PHP from source in Windows

I decided to try out the PHP IDE PHPStorm by jet brains (you know, the people that made android studio) again recently but ran into a few problems. I tried it a while back and I abandoned it because I was using the browser connect method instead of xdebug and it was a major PITA. This time around, I tried xdebug and everything works much smoother and easier. However, there is a major bug with it in which single line statements inside control statements could not be stopped on in the debugger (breakpoint or stepping). This has been a problem since 2012 and was marked until very recently as unfixable. I am honestly shocked this is not a thing more cited. Do people just not use that form of formatting anymore? I find it much cleaner. Example:

if(true)
  $a=1; //This line could never be reached
So all code like this had to be changed to the following for debugging.
if(true) {
  $a=1;
}

However, in a comment by one of the developers on 2016-12-11 (just 2 months ago) he said “This turned out to be a duplicate of 1165, for which I now have cooked up a fix, which will make it into 2.5.1.”. Unfortunately, there has not yet been a release of this version, so I was forced to a compile a copy myself on Windows. This turned out to also be a major PITA. I have documented the process here. Here is the version I compiled (7.1.2 threadsafe x86)

  • To compile a php extension, you also have to compile php itself. So first download both the source and zip files for your chosen version. I used the latest version, 7.1.2 VC14 x86 Thread Safe (2017-Feb-14 23:28:41) [7.1.2 source]. The source is needed for obvious reasons. The binary is needed so we can extract some dlls not included with the source. Extract the php source root directory files to “c:\php\src.
  • You also need to grab the xdebug source [github] via “git clone git://github.com/xdebug/xdebug.git”. I used commit #62ac284bf36f7521e78d2b21492ce2899f7cc4ff #379ab7ab68d28d0af8f637aa28c7e9fab921f27a, but it actually had a bug in it which I fixed in my fork. I sent a pull request so it should hopefully be integrated soon. Clone the git repo in “c:\php\src-extensions” (it will create the “xdebug” directory inside of it)
  • You will also need visual studio 14.0 (VS 2015) [direct download] installed.
  • Once visual studio is installed, you need to open “Developer Command Prompt for VS2015”. It can be found in your start menu under “Programs\Visual Studio 2015\Visual Studio Tools”.
  • Run the following commands to get php compiled:
    cd\php\src
    buildconf --add-modules-dir=..\src-extensions
    :: You may want some different flags for the below command
    configure "--enable-snapshot-build" "--enable-debug-pack" "--without-pdo-mssql" "--enable-com-dotnet=shared" "--with-mcrypt=static" "--without-analyzer"
    :: At this point we are going to actually build php to get the phpize.bat
    :: You may receive some warnings at the end that say something to the extent of
    :: “'-9' is not recognized as an internal or external command”. You can ignore these
    nmake snap
    :: This next command will “install” php to “c:\php”.
    nmake install
  • Next we get to compile xdebug. But it requirements bison, which can be downloaded here [Direct download]. Extract bin/bison.exe and bin/m4.exe to the VC/bin directory, most likely at “C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin”.

    For some reason, the first time I did this process I needed the following files also in VC/bin directory, all found at http://gnuwin32.sourceforge.net. However, during my second run to create this tutorial, I did not need them. iconv.exe, libcharset1.dll, libiconv2.dll, libintl3.dll, mspdb110.dll, regex2.dll. And the “share” folder from the bison zip file needed to be extracted to “C:\Program Files (x86)”.
  • Some file edits in the xdebug directory:
    • Add the following line to the top of configure.js: “PHP_SECURITY_FLAGS=true;
    • If you are using the commit I mentioned, Add the following to line 471 of xdebug_code_coverage.c “zend_op *base_address = &(opa->opcodes[0]);
    • Also edit line 9 of template.rc by changing it from “#include "main/php_version.h"” to “#include "../../src/main/php_version.h"
  • And now to actually compile xdebug
    cd ..\src-extensions\xdebug
    ..\..\SDK\phpize
    configure --with-xdebug
    nmake php_xdebug.dll
    copy Release_TS\php_xdebug.dll ..\..\ext
  • Your complete php install is now located in “c:\php”. But exclude the “src” and “src-extensions” folder. It still needs a few dlls that are found in the php zip file you downloaded earlier. Extract all the dlls from that root of that file to your root php folder, but do not overwrite anything that is already there. If you wanted to compile your own version of the apache dll, you can add one of “--enable-apache2handler, --enable-apache2-2handler, --enable-apache2-4handler” to the src/configure command.
  • For some really odd reason, my version of php would always give me the following error “Packets out of order. Expected 3 received 2. Packet size=22” when trying to connect to mysql (both pdo and mysqli go through mysqlnd). So I just ended up using the php files provided through the zip file, with my newly compiled xdebug dll.
  • Definitely have to mention that the following blog post was a big help to me in figuring this all out: http://blog.benoitblanchon.fr/build-php-extension-on-windows/
Monitoring PHP calls

I recently had a Linux client that was, for whatever odd reason, making infinite recursive HTTP calls to a single script, which was making the server process count skyrocket. I decided to use the same module as I did in my Painless migration from PHP MySQL to MySQLi post, which is to say, overriding base functions for fun and profit using the PHP runkit extension. I did this so I could gather, for debugging, logs of when and where the calls that were causing this to occur.


The below code overrides all functions listed on the line that says “List of functions to intercept” [Line 9]. It works by first renaming these built in functions to “OVERRIDE_$FuncName[Line 12], and replacing them with a call to “GlobalRunFunc()” [Line 13], which receives the original function name and argument list. The GlobalRunFunc():

  1. Checks to see if it is interested in logging the call
    • In the case of this example, it will log the call if [Line 20]:
      • Line 21: curl_setopt is called with the CURLOPT_URL parameter (enum=10002)
      • Line 22: curl_init is called with a first parameter, which would be a URL
      • Line 23: file_get_contents or fopen is called and is not an absolute path
        (Wordpress calls everything absolutely. Normally I would have only checked for http[s] calls).
    • If it does want to log the call, it stores it in a global array (which holds all the calls we will want to log).
      The logged data includes [Line 25]:
      • The function name
      • The function parameters
      • 2 functions back of backtrace (which can often get quite large when stored in the log file)
  2. It then calls the original function, with parameters intact, and passes through the return [Line 27].

The “GlobalShutdown()” [Line 30] is then called when the script is closing [Line 38] and saves all the logs, if any exist, to “$GlobalLogDir/$DATETIME.srl”.

I have it using “serialize()” to encode the log data [Line 25], as opposed to “json_encode()” or “print_r()” calls, as the latter were getting too large for the logs. You may want to have it use one of these other encoding functions for easier log perusal, if running out of space is not a concern.

<?
//The log data to save is stored here
global $GlobalLogArr, $GlobalLogDir;
$GlobalLogArr=Array();
$GlobalLogDir='./LOG_DIRECTORY_NAME';

//Override the functions here to instead have them call to GlobalRunFunc, which will in turn call the original functions
foreach(Array(
        'fopen', 'file_get_contents', 'curl_init', 'curl_setopt', //List of functions to intercept
) as $FuncName)
{
        runkit_function_rename($FuncName, "OVERRIDE_$FuncName");
        runkit_function_add($FuncName, '', "return GlobalRunFunc('$FuncName', func_get_args());");
}

//This optionally 
function GlobalRunFunc($FuncName, $Args)
{
        global $GlobalLogArr;
        if(
                ($FuncName=='curl_setopt' && $Args[1]==10002) || //CURLOPT enumeration can be found at https://curl.haxx.se/mail/archive-2004-07/0100.html
                ($FuncName=='curl_init' && isset($Args[0])) ||
                (($FuncName=='file_get_contents' || $FuncName=='fopen') && $Args[0][0]!='/')
        )
                $GlobalLogArr[]=serialize(Array('FuncName'=>$FuncName, 'Args'=>$Args, 'Trace'=>array_slice(debug_backtrace(), 1, 2)));

        return call_user_func_array("OVERRIDE_$FuncName", $Args);
}

function GlobalShutdown()
{
        global $GlobalLogArr, $GlobalLogDir;
        $Time=microtime(true);
        if(count($GlobalLogArr))
                file_put_contents($GlobalLogDir.date('Y-m-d_H:i:s.'.substr($Time-floor($Time), 2, 3), floor($Time)).'.srl', implode("\n", $GlobalLogArr));

}
register_shutdown_function('GlobalShutdown');
?>
PHP String Concatenation - Stringbuilder results

I wrote the code at the end of this post to test the different forms of string concatenation and they really are all almost exactly equal in both memory and time footprints.


The two primary methods I used are concatenating strings onto each other, and filling an array with strings and then imploding them. I did 500 string additions with a 1MB string in PHP 5.6 (so the result is a 500MB string). At every iteration of the test, all memory and time footprints were very very close (at ~$IterationNumber*1MB). The runtime of both tests was 50.398 seconds and 50.843 seconds consecutively which are most likely within acceptable margins of error.

Garbage collection of strings that are no longer referenced seems to be pretty immediate, even without ever leaving the scope. Since the strings are mutable, no extra memory is really required after the fact.


HOWEVER, The following tests showed that there is a different in peak memory usage WHILE the strings are being concatenated.


$OneMB=str_repeat('x', 1024*1024);
$Final=$OneMB.$OneMB.$OneMB.$OneMB.$OneMB;
print memory_get_peak_usage();
Result=10,806,800 bytes (~10MB w/o the initial PHP memory footprint)

$OneMB=str_repeat('x', 1024*1024);
$Final=implode('', Array($OneMB, $OneMB, $OneMB, $OneMB, $OneMB));
print memory_get_peak_usage();
Result=6,613,320 bytes (~6MB w/o the initial PHP memory footprint)

So there is in fact a difference that could be significant in very very large string concatenations memory-wise (I have run into such examples when creating very large data sets or SQL queries).

But even this fact is disputable depending upon the data. For example, concatenating 1 character onto a string to get 50 million bytes (so 50 million iterations) took a maximum amount of 50,322,512 bytes (~48MB) in 5.97 seconds. While doing the array method ended up using 7,337,107,176 bytes (~6.8GB) to create the array in 12.1 seconds, and then took an extra 4.32 seconds to combine the strings from the array.


Anywho... the below is the benchmark code I mentioned at the beginning which shows the methods are pretty much equal. It outputs a pretty HTML table.

<?
//Please note, for the recursion test to go beyond 256, xdebug.max_nesting_level needs to be raised.
//You also may need to update your memory_limit depending on the number of iterations

//Output the start memory
print 'Start: '.memory_get_usage()."B<br><br>Below test results are in MB<br>";

//Our 1MB string
global $OneMB, $NumIterations;
$OneMB=str_repeat('x', 1024*1024);
$NumIterations=500;

//Run the tests
$ConcatTest=RunTest('ConcatTest');
$ImplodeTest=RunTest('ImplodeTest');
$RecurseTest=RunTest('RecurseTest');

//Output the results in a table
OutputResults(
  Array('ConcatTest', 'ImplodeTest', 'RecurseTest'),
  Array($ConcatTest, $ImplodeTest, $RecurseTest)
);

//Start a test run by initializing the array that will hold the results and manipulating those results after the test is complete
function RunTest($TestName)
{
  $CurrentTestNums=Array();
  $TestStartMem=memory_get_usage();
  $StartTime=microtime(true);
  RunTestReal($TestName, $CurrentTestNums, $StrLen);
  $CurrentTestNums[]=memory_get_usage();

  //Subtract $TestStartMem from all other numbers
  foreach($CurrentTestNums as &$Num)
    $Num-=$TestStartMem;
  unset($Num);

  $CurrentTestNums[]=$StrLen;
  $CurrentTestNums[]=microtime(true)-$StartTime;

  return $CurrentTestNums;
}

//Initialize the test and store the memory allocated at the end of the test, with the result
function RunTestReal($TestName, &$CurrentTestNums, &$StrLen)
{
  $R=$TestName($CurrentTestNums);
  $CurrentTestNums[]=memory_get_usage();
  $StrLen=strlen($R);
}

//Concatenate 1MB string over and over onto a single string
function ConcatTest(&$CurrentTestNums)
{
  global $OneMB, $NumIterations;
  $Result='';
  for($i=0;$i<$NumIterations;$i++)
  {
    $Result.=$OneMB;
    $CurrentTestNums[]=memory_get_usage();
  }
  return $Result;
}

//Create an array of 1MB strings and then join w/ an implode
function ImplodeTest(&$CurrentTestNums)
{
  global $OneMB, $NumIterations;
  $Result=Array();
  for($i=0;$i<$NumIterations;$i++)
  {
    $Result[]=$OneMB;
    $CurrentTestNums[]=memory_get_usage();
  }
  return implode('', $Result);
}

//Recursively add strings onto each other
function RecurseTest(&$CurrentTestNums, $TestNum=0)
{
  Global $OneMB, $NumIterations;
  if($TestNum==$NumIterations)
    return '';

  $NewStr=RecurseTest($CurrentTestNums, $TestNum+1).$OneMB;
  $CurrentTestNums[]=memory_get_usage();
  return $NewStr;
}

//Output the results in a table
function OutputResults($TestNames, $TestResults)
{
  global $NumIterations;
  print '<table border=1 cellspacing=0 cellpadding=2><tr><th>Test Name</th><th>'.implode('</th><th>', $TestNames).'</th></tr>';
  $FinalNames=Array('Final Result', 'Clean');
  for($i=0;$i<$NumIterations+2;$i++)
  {
    $TestName=($i<$NumIterations ? $i : $FinalNames[$i-$NumIterations]);
    print "<tr><th>$TestName</th>";
    foreach($TestResults as $TR)
      printf('<td>%07.4f</td>', $TR[$i]/1024/1024);
    print '</tr>';
  }

  //Other result numbers
  print '<tr><th>Final String Size</th>';
  foreach($TestResults as $TR)
    printf('<td>%d</td>', $TR[$NumIterations+2]);
  print '</tr><tr><th>Runtime</th>';
    foreach($TestResults as $TR)
      printf('<td>%s</td>', $TR[$NumIterations+3]);
  print '</tr></table>';
}
?>
Painless migration from PHP MySQL to MySQLi

The PHP MySQL extension is being deprecated in favor of the MySQLi extension in PHP 5.5, and removed as of PHP 7.0. MySQLi was first referenced in PHP v5.0.0 beta 4 on 2004-02-12, with the first stable release in PHP 5.0.0 on 2004-07-13[1]. Before that, the PHP MySQL extension was by far the most popular way of interacting with MySQL on PHP, and still was for a very long time after. This website was opened only 2 years after the first stable release!


With the deprecation, problems from some websites I help host have popped up, many of these sites being very, very old. I needed a quick and dirty solution to monkey-patch these websites to use MySQLi without rewriting all their code. The obvious answer is to overwrite the functions with wrappers for MySQLi. The generally known way of doing this is with the Advanced PHP Debugger (APD). However, using this extension has a lot of requirements that are not appropriate for a production web server. Fortunately, another extension I recently learned of offers the renaming functionality; runkit. It was a super simple install for me.

  1. From the command line, run “pecl install runkit”
  2. Add “extension=runkit.so” and “runkit.internal_override=On” to the php.ini

Besides the ability to override these functions with wrappers, I also needed a way to make sure this file was always loaded before all other PHP files. The simple solution for that is adding “auto_prepend_file=/PATH/TO/FILE” to the “.user.ini” in the user’s root web directory.

The code for this script is as follows. It only contains a limited set of the MySQL functions, including some very esoteric ones that the web site used. This is not a foolproof script, but it gets the job done.


//Override the MySQL functions
foreach(Array(
    'connect', 'error', 'fetch_array', 'fetch_row', 'insert_id', 'num_fields', 'num_rows',
    'query', 'select_db', 'field_len', 'field_name', 'field_type', 'list_dbs', 'list_fields',
    'list_tables', 'tablename'
) as $FuncName)
    runkit_function_redefine("mysql_$FuncName", '',
        'return call_user_func_array("mysql_'.$FuncName.'_OVERRIDE", func_get_args());');

//If a connection is not explicitely passed to a mysql_ function, use the last created connection
global $SQLLink; //The remembered SQL Link
function GetConn($PassedConn)
{
    if(isset($PassedConn))
        return $PassedConn;
    global $SQLLink;
    return $SQLLink;
}

//Override functions
function mysql_connect_OVERRIDE($Host, $Username, $Password) {
    global $SQLLink;
    return $SQLLink=mysqli_connect($Host, $Username, $Password);
}
function mysql_error_OVERRIDE($SQLConn=NULL) {
    return mysqli_error(GetConn($SQLConn));
}
function mysql_fetch_array_OVERRIDE($Result, $ResultType=MYSQL_BOTH) {
    return mysqli_fetch_array($Result, $ResultType);
}
function mysql_fetch_row_OVERRIDE($Result) {
    return mysqli_fetch_row($Result);
}
function mysql_insert_id_OVERRIDE($SQLConn=NULL) {
    return mysqli_insert_id(GetConn($SQLConn));
}
function mysql_num_fields_OVERRIDE($Result) {
    return mysqli_num_fields($Result);
}
function mysql_num_rows_OVERRIDE($Result) {
    return mysqli_num_rows($Result);
}
function mysql_query_OVERRIDE($Query, $SQLConn=NULL) {
    return mysqli_query(GetConn($SQLConn), $Query);
}
function mysql_select_db_OVERRIDE($DBName, $SQLConn=NULL) {
    return mysqli_select_db(GetConn($SQLConn), $DBName);
}
function mysql_field_len_OVERRIDE($Result, $Offset) {
    $Fields=$Result->fetch_fields();
    return $Fields[$Offset]->length;
}
function mysql_field_name_OVERRIDE($Result, $Offset) {
    $Fields=$Result->fetch_fields();
    return $Fields[$Offset]->name;
}
function mysql_field_type_OVERRIDE($Result, $Offset) {
    $Fields=$Result->fetch_fields();
    return $Fields[$Offset]->type;
}
function mysql_list_dbs_OVERRIDE($SQLConn=NULL) {
    $Result=mysql_query('SHOW DATABASES', GetConn($SQLConn));
    $Tables=Array();
    while($Row=mysqli_fetch_assoc($Result))
        $Tables[]=$Row['Database'];
    return $Tables;
}
function mysql_list_fields_OVERRIDE($DBName, $TableName, $SQLConn=NULL) {
    $SQLConn=GetConn($SQLConn);
    $CurDB=mysql_fetch_array(mysql_query('SELECT Database()', $SQLConn));
    $CurDB=$CurDB[0];
    mysql_select_db($DBName, $SQLConn);
    $Result=mysql_query("SHOW COLUMNS FROM $TableName", $SQLConn);
    mysql_select_db($CurDB, $SQLConn);
    if(!$Result) {
        print 'Could not run query: '.mysql_error($SQLConn);
        return Array();
    }
    $Fields=Array();
    while($Row=mysqli_fetch_assoc($Result))
        $Fields[]=$Row['Field'];
    return $Fields;
}
function mysql_list_tables_OVERRIDE($DBName, $SQLConn=NULL) {
    $SQLConn=GetConn($SQLConn);
    $CurDB=mysql_fetch_array(mysql_query('SELECT Database()', $SQLConn));
    $CurDB=$CurDB[0];
    mysql_select_db($DBName, $SQLConn);
    $Result=mysql_query("SHOW TABLES", $SQLConn);
    mysql_select_db($CurDB, $SQLConn);
    if(!$Result) {
        print 'Could not run query: '.mysql_error($SQLConn);
        return Array();
    }
    $Tables=Array();
    while($Row=mysql_fetch_row($Result))
        $Tables[]=$Row[0];
    return $Tables;
}
function mysql_tablename_OVERRIDE($Result) {
    $Fields=$Result->fetch_fields();
    return $Fields[0]->table;
}

And here is some test code to confirm functionality:
global $MyConn, $TEST_Table;
$TEST_Server='localhost';
$TEST_UserName='...';
$TEST_Password='...';
$TEST_DB='...';
$TEST_Table='...';
function GetResult() {
    global $MyConn, $TEST_Table;
    return mysql_query('SELECT * FROM '.$TEST_Table.' LIMIT 1', $MyConn);
}
var_dump($MyConn=mysql_connect($TEST_Server, $TEST_UserName, $TEST_Password));
//Set $MyConn to NULL here if you want to test global $SQLLink functionality
var_dump(mysql_select_db($TEST_DB, $MyConn));
var_dump(mysql_query('SELECT * FROM INVALIDTABLE LIMIT 1', $MyConn));
var_dump(mysql_error($MyConn));
var_dump($Result=GetResult());
var_dump(mysql_fetch_array($Result));
$Result=GetResult(); var_dump(mysql_fetch_row($Result));
$Result=GetResult(); var_dump(mysql_num_fields($Result));
var_dump(mysql_num_rows($Result));
var_dump(mysql_field_len($Result, 0));
var_dump(mysql_field_name($Result, 0));
var_dump(mysql_field_type($Result, 0));
var_dump(mysql_tablename($Result));
var_dump(mysql_list_dbs($MyConn));
var_dump(mysql_list_fields($TEST_DB, $TEST_Table, $MyConn));
var_dump(mysql_list_tables($TEST_DB, $MyConn));
mysql_query('CREATE TEMPORARY TABLE mysqltest (i int auto_increment, primary key (i))', $MyConn);
mysql_query('INSERT INTO mysqltest VALUES ()', $MyConn);
mysql_query('INSERT INTO mysqltest VALUES ()', $MyConn);
var_dump(mysql_insert_id($MyConn));
mysql_query('DROP TEMPORARY TABLE mysqltest', $MyConn);
Getting HTML from Simple Machine Forum (SMF) Posts

When I first created my website 10 years ago, from scratch, I did not want to deal with writing a comment system with HTML markups. And in those days, there weren’t plugins for everything like there is today. My solution was setting up a forum which would contain a topic for every Project, Update, and Post, and have my pages mirror the linked topic’s posts.

I had just put in a quick hack at the time in which the pulled SMF message’s body had links converted from bbcode (there might have been 1 other bbcode I also hooked). I had done this with regular expressions, which was a nasty hack.

So anywho, I finally got around to writing a script that converts SMF messages’ bbcode to HTML and caches it. You can download it here, or see the code below. The script is optimized so that it only ever needs to load SMF code when a post has not yet been cached. Caching happens during the initial loading of an SMF post within the script’s main function, and is discarded if the post is changed.

The script requires that you run the query on line #3 of itself in your SMF database. Directly after that are 3 variables you need to set. The script assumes you are already logged in to the appropriate user. To use it, call “GFTP\GetForumTopicPosts($ForumTopicID)”. I have the functions split up so you can do individual posts too if needed (requires a little extra code).


<?
//This SQL command must be ran before using the script
//ALTER TABLE smf_messages ADD body_html text, ADD body_md5 char(32) DEFAULT NULL;

namespace GFTP;

//Forum database variables
global $ForumInfo;
$ForumInfo=Array(
    'DBName'=>'YourDatabase_smf',
    'Location'=>'/home/YourUser/www',
    'MessageTableName'=>'smf2_messages',
);

function GetForumTopicPosts($ForumTopicID)
{
    //Change to the forum database
    global $ForumInfo;
    $CurDB=mysql_fetch_row(mysql_query('SELECT database()'))[0];
    if($CurDB!=$ForumInfo['DBName'])
        mysql_select_db($ForumInfo['DBName']);
    $OldEncoding=SetEncoding(true);

    //Get the posts
    $PostsInfos=Array();
    $PostsQuery=mysql_query('SELECT '.implode(', ', PostFields())." FROM $ForumInfo[MessageTableName] WHERE id_topic='".intval($ForumTopicID).
        "' AND approved=1 ORDER BY id_msg ASC LIMIT 1, 9999999");
    if($PostsQuery) //If query failed, do not process
        while(($PostInfo=mysql_fetch_assoc($PostsQuery)) && ($PostsInfos[]=$PostInfo))
            if(md5($PostInfo['body'])!=$PostInfo['body_md5']) //If the body md5s do not match, get new value, otherwise, use cached value
                ProcessPost($PostsInfos[count($PostsInfos)-1]); //Process the lastest post as a reference

    //Restore from the forum database
    if($CurDB!=$ForumInfo['DBName'])
        mysql_select_db($CurDB);
    SetEncoding(false, $OldEncoding);

    //Return the posts
    return $PostsInfos;
}

function ProcessPost(&$PostInfo) //PostInfo must have fields id_msg, body, body_md5, and body_html
{
    //Load SMF
    global $ForumInfo;
    if(!defined('SMF'))
    {
        global $context;
        require_once(rtrim($ForumInfo['Location'], DIRECTORY_SEPARATOR).DIRECTORY_SEPARATOR.'SSI.php');
        mysql_select_db($ForumInfo['DBName']);
        SetEncoding();
    }

    //Update the cached body_html field
    $ParsedCode=$PostInfo['body_html']=parse_bbc($PostInfo['body']);
    $EscapedHTMLBody=mysql_escape_string($ParsedCode);
    $BodyMD5=md5($PostInfo['body']);
    mysql_query("UPDATE $ForumInfo[MessageTableName] SET body_html='$EscapedHTMLBody', body_md5='$BodyMD5' WHERE id_msg=$PostInfo[id_msg]");
}

//The fields to select in the Post query
function PostFields() { return Array('id_msg', 'poster_time', 'id_member', 'subject', 'poster_name', 'body', 'body_md5', 'body_html'); }

//Swap character encodings. Needs to be set to utf8
function SetEncoding($GetOld=false, $NewSet=Array('utf8', 'utf8', 'utf8'))
{
    //Get the old charset if required
    $CharacterVariables=Array('character_set_client', 'character_set_results', 'character_set_connection');
    $OldSet=Array();
    if($GetOld)
    {
        //Fill in variables with default in case they are not found
        foreach($CharacterVariables as $Index => $Variable)
            $OldSet[$Variable]='utf8';

        //Query for the character sets and update the OldSet array
        $Query=mysql_query('SHOW VARIABLES LIKE "character_%"');
        while($VariableInfo=mysql_fetch_assoc($Query))
            if(isset($OldSet[$VariableInfo['Variable_name']]))
                $OldSet[$VariableInfo['Variable_name']]=$VariableInfo['Value'];

        $OldSet=array_values($OldSet); //Turn back into numerical array
    }

    //Change to the new database encoding
    $CompiledSets=Array();
    foreach($CharacterVariables as $Index => $Variable)
        $CompiledSets[$Index]=$CharacterVariables[$Index].'="'.mysql_escape_string($NewSet[$Index]).'"';
    mysql_query('SET '.implode(', ', $CompiledSets));

    //If requested, return the previous values
    return $OldSet;
}
?>
Missing phar wrapper

Phar files are PHP’s way of distributing an entire PHP solution in a single package file. I recently had a problem on my Cygwin PHP server that said “Unable to find the wrapper "phar" - did you forget to enable it when you configured PHP?”. I couldn’t find any solution for this online, so I played with it a bit.

The quick and dirty solution I came up with is to include the phar file like any normal PHP file, which sets your current working directory inside of the phar file. After that, you can include files inside the phar and then change your directory back to where you started. Here is the code I used:

if(preg_match('/^(?:win|cygwin)/i', PHP_OS))
{
    $CWD=getcwd();
    require_once('Scripts/PHPExcel.phar');
    require_once('PHPExcel/IOFactory.php');
    chdir($CWD);
}
else
    require_once('phar://Scripts/PHPExcel.phar/PHPExcel/IOFactory.php');
Pulling HTML from Github markdown for external use
Although, converting to markdown is a time consuming pain

So I started getting on the Github bandwagon FINALLY. I figured that while I was going to the trouble of remaking readme files for the projects into github markdown files, I might as well duplicate the compiled HTML for my website.

The below code is a simple PHP script to pull in the converted HTML from Github’s API and then do some more modifications to facilitate directly inserting it into a website.


Usage:
  • The variables that can be updated are all at the top of the file.
  • The script will always output the finished result to the user’s browser, but can also optionally save it to an external file by setting the $SaveFileName variable.
  • Stylesheet:
    • The script automatically includes a specified stylesheet from the $StylesheetLocation variable.
    • The stylesheet I used is from https://gist.github.com/somebox/1082608. I’m not too happy with its coloring scheme, but it’ll do for now.
    • The required modifications that need to be made to the css are to change “body” to “.GHMarkdown”, and then add “.GHMarkdown” before all other rules.
    • This is the one I am currently using for my website, but it also has a few modifications made specifically for my layouts.
  • Modifications
    • In my markdowns, I like to link to internal sections by first creating a bookmark as “<div name="BOOKMARK_NAME">...</div>” and then linking via “[LinkName](#BOOKMARK_NAME)”. While this works on github, the bookmark’s names are actually changed to something like “user-content-BOOKMARK-NAME”, which is not useable outside of github. The first $RegexModifications item therefore updates the bookmarks back to their original name, and turns them into <span>s (which github does not support).
    • The second rule just removes the “aria-hidden” attributes, which my W3C checking scripts throw a warning on.
  • Note that sometimes, the script may return an error of “transfer closed with XXX bytes remaining to read”. This means that github denied the request (probably due to too many requests in too short a timespan), but the input is too large so github prematurely terminated the connection. If this happens, try sending a tiny input and see if you get back a proper error.

<?php
//Variables
$SaveFileName='Output.html'; //Optionally save output to a file. Comment out to not save
$InputFile='Input.md';
$StylesheetLocation='github-markdown.css';
$RegexModifications=Array(
        '/<div name="user-content-(.*?)"(.*?)<\/div>/s'=>'<span id="$1"$2</span>', //Change <div name="user-contentXXX ---TO--- <span name="XXX
        '/ ?aria-hidden="true"/'=>'' //Remove aria-hidden attribute
);

//Set the curl options
$CurlHandle=curl_init(); //Init curl
curl_setopt_array($CurlHandle, Array(
        CURLOPT_URL=>           'https://api.github.com/markdown/raw', //Markdown/raw takes and returns plain text input and output
        CURLOPT_FAILONERROR=>   false,
        CURLOPT_FOLLOWLOCATION=>1,
        CURLOPT_RETURNTRANSFER=>1, //Return result as a string
        CURLOPT_TIMEOUT=>       300,
        CURLOPT_POST=>          1,
        CURLOPT_POSTFIELDS=>    file_get_contents($InputFile), //Pull in the requested file
        CURLOPT_HTTPHEADER=>    Array('Content-type: text/plain'), //Github expects the given data to be plaintext
        CURLOPT_SSL_VERIFYPEER=>0, //In case there are problems with the PHP ssl chain (often the case in Windows), ignore the error
        CURLOPT_USERAGENT=>     'Curl/PHP' //Github now requires a useragent to process the request
));

//Pull in the html converted markdown from Github
$Return=curl_exec($CurlHandle);
if(curl_errno($CurlHandle)) //Check for error
        $Return=curl_error($CurlHandle);
curl_close($CurlHandle);

//Make regex modifications
$Return=preg_replace(array_keys($RegexModifications), array_values($RegexModifications), $Return);

//Generate the final HTML. It will also be output here if not saving to a file
header('Content-Type: text/html; charset=utf-8');
if(isset($SaveFileName)) //If saving to a file, buffer output
        ob_start();
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Markdown pull</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link href="<?=$StylesheetLocation?>" rel=stylesheet type="text/css">
</head><body><div class=GHMarkdown>
<?=$Return?>
</div></body></html>
<?php
//Save to a file if requested
if(isset($SaveFileName))
        file_put_contents($SaveFileName, ob_get_flush()); //Actual output happens here too when saving to a file
?>
Installing PSLib for PHP
I was setting up a brand new server yesterday running Cent OS 6.4 and had the need to install PSLib (PostScript) for PHP.
The initial setup commands all worked as expected:
cd /src #A good directory to install stuff in. You may need to create it first

#Install intltool (required for pslib)
yum install intltool

#Install pslib
wget http://sourceforge.net/projects/pslib/files/latest/download?source=files
tar -xzvf pslib-*.tar.gz
cd pslib-*
./configure
make
make install
cd ..

#Install pslib wrapper for php
pecl download ps
tar -xzvf ps-*.tgz
cd ps-*
phpize
./configure
make

At this point, the make failed with
/src/ps-1.3.6/ps.c:58: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘ps_functions’

After a little bit of browsing the code and a really lucky first guess, I found that changing the following fixed the problem:
In ps.c change: “function_entry ps_functions[] = {” to “zend_function_entry ps_functions[] = {

Then to finish the install, just run:
make
make install
and you’re done!
Using Twitter API v1.1

Twitter recently turned off their v1.0 API which broke a multitude of applications, many of which are still broken today. I had the need to immediately update at least one website to the new Twitter v1.1 API, but I could not find a simple bare bones example on the internet. Pretty much all the code/libraries out there I could find for the new API version were hundreds to thousands of lines long, or didn’t work >.<; . So anywho, here is the simple PHP code I put together to make API calls.

This code requires certain consumer/authentication keys and secrets. You can find how to generate them elsewhere online.

//OAuth Request parameters
$ConsumerKey='FILL ME IN';
$ConsumerSecret='FILL ME IN';
$AccessToken='FILL ME IN';
$AccessTokenSecret='FILL ME IN';

function EncodeParam($input) { return strtr(rawurlencode($input), Array('+'=>' ', '%7E'=>'~')); }
function SendTwitterRequest($RequestURL, $Params=Array())
{
	//Compile the OAuth parameters
	global $ConsumerKey, $ConsumerSecret, $AccessToken, $AccessTokenSecret;
	$Params=array_merge(
		$Params,
		Array('oauth_version'=>'1.0', 'oauth_nonce'=>mt_rand(), 'oauth_timestamp'=>time(), 'oauth_consumer_key'=>$ConsumerKey, 'oauth_signature_method'=>'HMAC-SHA1'),
		isset($AccessToken) ? Array('oauth_token'=>$AccessToken) : Array()
	);
	uksort($Params, 'strcmp'); //Must be sorted to determine the signature
	foreach($Params as $Key => &$Val) //Create the url encoded parameter list
		$Val=EncodeParam($Key).'='.EncodeParam($Val);
	$Params=implode('&', $Params); //Combine the parameter list
	$Params.='&oauth_signature='.EncodeParam(base64_encode(hash_hmac('sha1', 'GET&'.EncodeParam($RequestURL).'&'.EncodeParam($Params), EncodeParam($ConsumerSecret).'&'.EncodeParam($AccessTokenSecret), TRUE)));

	//Do the OAuth request
	$CurlObj=curl_init();
	foreach(Array(CURLOPT_URL=>"$RequestURL?$Params", CURLOPT_SSL_VERIFYHOST=>0, CURLOPT_SSL_VERIFYPEER=>0, CURLOPT_RETURNTRANSFER=>1) as $Key => $CurlVal)
		curl_setopt($CurlObj, $Key, $CurlVal);
	$Result=curl_exec($CurlObj);
	curl_close($CurlObj);
	return $Result;
}

If you don’t have an AccessToken and AccessSecret yet, you can get them through the following code:
$OAuthResult=SendTwitterRequest('https://twitter.com/oauth/request_token');
parse_str($OAuthResult, $OauthRet);
if(!isset($OauthRet['oauth_token']))
	throw new Exception("OAuth error: $OAuthResult");
$AccessToken=$OauthRet['oauth_token'];
$AccessSecret=$OauthRet['oauth_token_secret'];

Here is an example to pull the last 4 tweets from a user:
$UserName='TheUserName';
$Result=json_decode(SendTwitterRequest('https://api.twitter.com/1.1/statuses/user_timeline.json', Array('screen_name'=>$UserName, 'count'=>4)));
if(isset($Result->{'errors'}))
	throw new Exception($Result->{'errors'}[0]->{'message'});
$Tweets=Array();
foreach($Result as $Tweet)
	$Tweets[]=$Tweet->text;

print implode('<br>', $Tweets);
Setting the time zone through a numeric offset
They never make it easy

I had the need today to be able to set the current time zone for an application in multiple computer languages by the hourly offset from GMT/UTC, which turned out to be a lot harder than I expected. It seems most time zone related functions, at least in Linux, expect you to use full location strings to set the current time zone offset (i.e. America/Chicago).


After a lot of research and experimenting, I came up with the following results. All of these are confirmed working in Linux, and most or all of them should work in Windows too.

Language Format Note Format for GMT+5 Format for GMT-5
C Negate GMT-5 GMT5
Perl Negate GMT-5 GMT5
SQL Requires Sign +5:00 -5:00
PHP Negate, Requires Sign Etc/GMT-5 Etc/GMT+5

And here are examples of using this in each language. The “TimeZone” string variable should be a 1-2 digit integer with an optional preceding negative sign:
Language Example
C
#include <stdio.h> //snprintf
#include <stdlib.h> //setenv, atoi
#include <time.h> //tzset

...

char Buffer[10];
snprintf(Buffer, 10, "GMT%i", -atoi(TimeZone));
setenv("TZ", Buffer, 1);
tzset();
		
Perl
use POSIX qw/tzset/;
$ENV{TZ}='GMT'.(-$TimeZone);
tzset;
		
SQL [Query string created via Perl]
$Query='SET time_zone="'.($TimeZone>=0 ? '+' : '').$TimeZone.':00"';
		
PHP
date_default_timezone_set('Etc/GMT'.($TimeZone<=0 ? '+' : '').(-$TimeZone));
		
Something I feel JavaScript really got right
Language design is a PITA though... so bleh

One thing I always really miss when working in other dynamic languages that aren’t JavaScript is the ability to access known (non dynamic) members of an associative array/object/hash (called a hash from here on out) through just a single dot. This matches C’s syntax of accessing struct members, as opposed to being forced into using array syntax which is harder to read IMO in languages like PHP and Perl. For example...


Creating a hash in:
JavaScript:var Hash={foo:1, bar:2};
Perl:my %Hash=(foo=>1, bar=>2);
PHP:$Hash=Array('foo'=>1, 'bar'=>2);

Accessing a Hash’s member:
JavaScript:Hash.foo or Hash['foo']
Perl:$Hash{foo};
PHP:$Hash['foo']

The reason this is preferable to me is it can make code like the following
Zones[Info['Zone']]['DateInfo'][Info['Date']]['UniqueEnters']+=Info['Count'];
much more readable by turning it into the following
Zones[Info.Zone].DateInfo[Info.Date].UniqueEnters+=Info.Count;
UTF8 BOM
When a good idea is still considered too much by some

While UTF-8 has almost universally been accepted as the de-facto standard for Unicode character encoding in most non-Windows systems (mmmmmm Plan 9 ^_^), the BOM (Byte Order Marker) still has large adoption problems. While I have been allowing my text editors to add the UTF8 BOM to the beginning of all my text files for years, I have finally decided to rescind this practice for compatibility reasons.

While the UTF8 BOM is useful so that editors know for sure what the character encoding of a file is, and don’t have to guess, they are not really supported, for their reasons, in Unixland. Having to code solutions around this was becoming cumbersome. Programs like vi and pico/nano seem to ignore a file’s character encoding anyways and adopt the character encoding of the current terminal session.

The main culprit in which I was running into this problem a lot with is PHP. The funny thing about it too was that I had a solution for it working properly in Linux, but not Windows :-).

Web browsers do not expect to receive the BOM marker at the beginning of files, and if they encounter it, may have serious problems. For example, in a certain browser (*cough*IE*cough*) having a BOM on a file will cause the browser to not properly read the DOCTYPE, which can cause all sorts of nasty compatibility issues.

Something in my LAMP setup on my cPanel systems was removing the initial BOM at the beginning of outputted PHP contents, but through some preliminary research I could not find out why this was not occurring in Windows. However, both systems were receiving multiple BOMs at the beginning of the output due to PHP’s include/require functions not stripping the BOM from those included files. My solution to this was a simple overload of these include functions as follows (only required when called from any directly opened [non-included] PHP file):

<?
/*Safe include/require functions that make sure UTF8 BOM is not output
Use like: eval(safe_INCLUDETYPE($INCLUDE_FILE_NAME));
where INCLUDETYPE is one of the following: include, require, include_once, require_once
An eval statement is used to maintain current scope
*/

//The different include type functions
function safe_include($FileName)	{ return real_safe_include($FileName, 'include'); }
function safe_require($FileName)	{ return real_safe_include($FileName, 'require'); }
function safe_include_once($FileName)	{ return real_safe_include($FileName, 'include_once'); }
function safe_require_once($FileName)	{ return real_safe_include($FileName, 'require_once'); }

//Start the processing and return the eval statement
function real_safe_include($FileName, $IncludeType)
{
	ob_start();
	return "$IncludeType('".strtr($FileName, Array("\\"=>"\\\\", "'", "\\'"))."'); safe_output_handler();";
}

//Do the actual processing and return the include data
function safe_output_handler()
{
	$Output=ob_get_clean();
	while(substr($Output, 0, 3)=='?') //Remove all instances of UTF8 BOM at the beginning of the output
		$Output=substr($Output, 3);
	print $Output;
}
?>

I would have like to have used PHP’s output_handler ini setting to catch even the root file’s BOM and not require include function overloads, but, as php.net puts it “Only built-in functions can be used with this directive. For user defined functions, use ob_start().”.

As a bonus, the following bash command can be used to find all PHP files in the current directory tree with a UTF8 BOM:

grep -rlP "^\xef\xbb\xbf" . | grep -iP "\.php\$"

[Edit on 2015-11-27]
Better UTF8 BOM file find code (Cygwin compatible):
 find . -name '*.php' -print0 | xargs -0 -n1000 grep -l $'^\xef\xbb\xbf'
And to remove the BOMs (Cygwin compatible):
find . -name '*.php' -print0 | xargs -0 -n1000 grep -l $'^\xef\xbb\xbf' | xargs -i perl -i.bak -pe 'BEGIN{ @d=@ARGV } s/^\xef\xbb\xbf//; END{ unlink map "$_$^I", @d }' "{}"
Simpler remove BOMs (not Cygwin/Windows compatible):
find . -name '*.php' -print0 | xargs -0 -n1000 grep -l $'^\xef\xbb\xbf' | xargs -i perl -i -pe 's/^\xef\xbb\xbf//' "{}"
Live Streaming SHOUTcast through Flash
The one time I decide to look online before trying it out myself

A client of mine wanted their website to have an applet that played streaming music from a SHOUTcast server. The easy solution would have been to just embed a Windows Media Player applet into the page, but that would only work for IE.

I thoroughly searched the web and was unable to find a Flash applet (or other solution) that already did this (and actually worked). Most of the information I was finding was people having problems getting this kind of thing working in Flash with no answer provided. After giving up on finding a resolution online, I decided to load up Flash and see what I could find from some tinkering.

Quite frankly, I’m shocked people were having so many problems with this. I started an ActionScript 2.0 project and put in the following code, and it worked right away in Flash CS3 (v9.0) with no problem:

var URL="http://example.shoutcast.castledragmire.com:1234/" //The URL to the SHOUTcast server
var MySound:Sound=new Sound(this);
MySound.loadSound(URL,true);

Unfortunately, once I exported the Flash applet and loaded it up in my browsers, it was no longer working. After a few minutes of poking around, I had a hunch that the SHOUTcast host might be sending different data depending on the [Browser’s] User Agent. I changed Firefox’s User Agent to “Flash” through a Firefox add-on (User Agent Switcher), and it worked :-D.

Once again, unfortunately, this was not a viable solution because I couldn’t have every user who visited the client’s web page change their browser User Agent string :-). The quickest solution at this point to the problem was to just create a passthrough script that grabbed the live stream on their server and passed it to the client. The following is the PHP script I used for this:

$streamname='example.shoutcast.castledragmire.com';
$port      ='1234';
$path      ='/';

header('Content-type: audio/mpeg');
$sock=fsockopen($streamname,$port);
fputs($sock, "GET $path HTTP/1.0\r\n");
fputs($sock, "Host: $streamname\r\n");
fputs($sock, "User-Agent: WinampMPEG/2.8\r\n");
fputs($sock, "Accept: */*\r\n");
fputs($sock, "Connection: close\r\n\r\n");
fpassthru($sock);
fclose($sock);

The final two steps to get this working were:
  1. Setting the Flash Applet’s URL variable to the PHP file
  2. Turning off PHP output buffering for the file. This can only be done through Apache or the php.ini depending on the server setup. This is very important, as if it’s on, the data will never get sent to the user.

The only problem with this method is that it taxes the server that is passing the data through, especially since it uses PHP... This kind of thing could very easily be done in C though (as a matter of fact, I will be writing a post on something very close to that very soon).

Alamo Draft House Schedule List
Simple information reorganization example

After discovering the Alamo Draft House’s coolness a few months ago, I’ve been trying to watch what they’re playing to make sure I catch anything I might want to see on the big screen. Unfortunately, it is not easy to get a good quick idea of all the movies playing from their calendar because it shows movies per day with showtimes, making the list repetitive and crowded with extra information.

I decided to throw together a real quick PHP script that would parse their data so I could organize it however I wanted. The final result can be viewed here. The code is as follows:

//The list of calendar pages in format TheaterName=>URL
$PagesToGrab=Array(
	'Ritz'=>'http://www.originalalamo.com/Calendar.aspx?l=2',
	'Village'=>'http://www.originalalamo.com/Calendar.aspx?l=3',
	'South Lamar'=>'http://www.originalalamo.com/Calendar.aspx?l=4'
);

foreach($PagesToGrab as $Name => $URL) //Grab the movies for each theater
{
	print "<b>$Name</b><br>"; //Output the theater name
	$TheHTML=file_get_contents($URL); //Grab the HTML
	$ShowList=Array(); //This will contain the final list of shows and what days they are on
	
	preg_match_all('/<td class="day">.*?<\/td>/', $TheHTML, $DayMatches); //Extract all table cells containing a day's info
	foreach($DayMatches[0] as $DayInfo) //Loop over each day's info
	{
		//Determine the day of month
		preg_match('/<a class=\"daynumber\" title=".*?, (.*?),/', $DayInfo, $DayOfMonth);
		$DayOfMonth=$DayOfMonth[1];
		
		//Determine all the shows for the day
		preg_match_all('/<span class="show"><a href=".*?">(.*?)<\/a>/', $DayInfo, $AllShows);
		foreach($AllShows[1] as $Show)
		{
			$Show=preg_replace('/^\s+|\s+$/', '', $Show); //Remove start and end of line whitespace
			if(!isset($ShowList[$Show])) //If show has not yet been added to overall list, add it
				$ShowList[$Show]=Array();
			$ShowList[$Show][]=$DayOfMonth; //Add this day as a time for the show
		}
	}
	
	//Output the shows and their days
	print '<table>';
	foreach($ShowList as $ShowName => $Days)
		print "<tr><td>$ShowName</td><td>".implode(', ', $Days).'</td></tr>';
	print '</table><br><br>';
}	
<? PageFooter(); ?>
</body></html>
PHP file inclusion weirdness
PHP has its own quirks too
In PHP, you cannot include files in parent directories “../” from a file that has already been included from another file in a different directory. This has been a nuisance for a long time.

Here is a test case: (Files followed by their code)

/test1.php (This is the file that is called directly by the browser/apache)
<?
//This all works fine
print 'test1'_start;
require('foo/bar/test2.php');
print 'test1_end';
?>

/foo/bar/test2.php
<?
print 'test2_start';
require('blah/test3.php'); //This works fine because the include file it is in a subdirectory, not a parent directory of test2.php
require('../test4.php'); //This does not call fine (an error is thrown by php) because it is in a parent directory relative to test2.php, which was already included from the parent file (test1.php) in another directory (/). To fix this, use 'foo/test4.php'
print 'test2_end';
?>

/foo/bar/blah/test3.php
<? print 'test3'; ?>

/foo/test4.php (This file is not reached by this example without the fixes mentioned in either the comment in test2.php, or below)
<? print 'test4'; ?>

The obvious method to fix this whole conundrum is to always set all includes relative to one root path, and then make sure that path is always used with the set_include_path function if your parent file is not in the root directory. For example:
set_include_path(get_include_path().':/PATH_TO_ROOT_DIRECTORY');

Another method would be to write a require/include wrapper that calculates paths from the current directory whenever a parent path “../” is used. Here is an example of this:
function _require($IncludeFile, $CurrentFile)
{
	$CurrentPath=preg_replace('/[^\/]+$/', '', $CurrentFile); //Remove the filename to get the current path
	require(realpath("$CurrentPath$IncludeFile"));
}

This method is called with 2 parameters: the relative path from the current include file to the file you want to include, and __FILE__
For example, line 4 of “/foo/bar/test2.php” above would now be:
_require('../test4.php', __FILE__);

The first line of the _require function could also be removed by using the __DIR__ constant (instead of __FILE__) which was added in PHP 5.3.0.
XML Problems in PHP
I hate debugging other peoples’ libraries :-\

We recently moved one of our important web server clients to a newly acquired server (our 12th server at ThePlanetThePlanet [Used to be called EV1Servers, and before that RackShack], one of, if not the largest, server-farm hosting company in the states). A bad problem cropped up on his site in the form of a PHP script (CaRP) that deals with parsing XML.

The problem was that whenever the XML was parsed and then returned, all XML entities (escaped XML characters like “&gt;” “&lt;” and “&quot;”) were removed/deleted. I figured the problem had to do with a bad library, as the code worked perfectly on our old server, and the PHP settings on both were almost identical, but I wasn’t sure which one. After an hour or two of manipulating the code and debugging, I narrowed it down to which XML function calls had the problem, and that it was definitely not the scripts themselves. The following code demonstrates the problem.

$MyXMLData='<?xml version="1.0" encoding="iso-8859-1"?><description>&lt;img test=&quot;a&quot;</description>';
$MyXml=xml_parser_create(strtoupper('ISO-8859-1'));
xml_parser_set_option($MyXml,XML_OPTION_TARGET_ENCODING,'ISO-8859-1');
xml_parse_into_struct($MyXml, $MyXMLData, $MyData);
print htmlentities($MyData[0]['value']);
On the server with the problem, the following was outputted:
img test=a
while it should have outputted the following:
<img test="a"

I went with a hunch at this point and figured that it might be the system’s LibXML libraries, so I repointed them away from version 2.7.1, which appears to be buggy, to an older version that was also on the system, 2.6.32. And low and behold, things were working again, yay :-D.

Technical data (This is a cPanel install): In “/opt/xml2/lib/” delete the symbolic links “libxml2.so” & “libxml2.so.2” and redirect them as symbolic links to “libxml2.so.2.6.32” instead of “libxml2.so.2.7.1”.

Data Format Conversion
Moving from Point A to Point B

I am often asked to transfer data sets into MySQL databases, or other formats. In this case, I’ll use a Microsoft Excel file without line breaks in the fields to MySQL as an example. While there are many programs out there to do this kind of thing, this method doesn’t take too long and is a good example use of regular expressions.


First, select all the data in Excel (ctrl+a) and copy (ctrl+c) it to a text editor with regular expression support. I recommend EditPad Pro as a very versatile and powerful text editor.

Next, we need to turn each row into the format “('FIELD1','FIELD2','FIELD3',...),”. Four regular expressions are needed to format the data:

SearchReplaceExplanation
'\\'Escape single quotes
\t','Separate fields and quote as strings
^('Start of row
$'),End of row
From there, there are only 2 more steps to complete the query.
  • Add the start of the query: “INSERT INTO TABLENAME VALUES”
  • End the query by changing the last row's comma “,” at the very end of the line to a semi-colon “;”.

For example:
a	b	c
d	e	f
g	h	i
would be converted to
INSERT INTO MyTable VALUES
('a','b','c'),
('d','e','f'),
('h','h','i');

Sometimes queries may get too long and you will need to separate them by performing the “2 more steps to complete the query” from above.


After doing one of these conversions recently, I was also asked to make the data searchable, so I made a very simple PHP script for this.

This script lets you search through all the fields and lists all matches. The fields are listed on the 2nd line in an array as "SQL_FieldName"=>"Viewable Name". If the “Viewable Name” contains a pound sign “#” it is matched exactly, otherwise, only part of the search string needs to be found.

<?
$Fields=Array('ClientNumber'=>'Client #', 'FirstName'=>'First Name', 'LastName'=>'Last Name', ...); //Field list
print '<form method=post action=index.php><table>'; //Form action needs to point to the current file
foreach($Fields as $Name => $Value) //Output search text boxes
	print "<tr><td>$Value</td><td><input name=\"$Name\" style='width:200px;' value=\"".
		(isset($_POST[$Name]) ? htmlentities($_POST[$Name], ENT_QUOTES) : '').'"></td></tr>';//Text boxes w/ POSTed values,if set
print '</table><input type=submit value=Search></form>';

if(!isset($_POST[key($Fields)])) //If search data has not been POSTed, stop here
	return;
	
$SearchArray=Array('1=1'); //Search parameters are stored here. 1=1 is passed in case no POSTed search parameter are ...
                           //... requested so there is at least 1 WHERE parameter, and is optimized out with the MySQL preprocessor anyways.
foreach($Fields as $Name => $Value) //Check each POSTed search parameter
	if(trim($_POST[$Name])!='') //If the POSTed search parameter is empty, do not use it as a search parameter
	{
		$V=mysql_escape_string($_POST[$Name]); //Prepare for SQL insertion
		$SearchArray[]=$Name.(strpos($Value, '#')===FALSE ? " LIKE '%$V%'" : "='$V'"); //Pound sign in the Viewable Name=exact ...
			//... value, otherwise, just a partial patch
	}
//Get data from MySQL
mysql_connect('SQL_HOST', 'SQL_USERNAME', 'SQL_PASSWORD');
mysql_select_db('SQL_DATABASE');
$q=mysql_query('SELECT * FROM TABLENAME WHERE '.implode(' AND ', $SearchArray));

//Output retrieved data
$i=0;
while($d=mysql_fetch_assoc($q)) //Iterate through found rows
{
	if(!($i++)) //If this is the first row found, output header
	{
		print '<table border=1 cellpadding=0 cellspacing=0><tr><td>Num</td>'; //Start table and output first column header (row #)
		foreach($Fields as $Name => $Value) //Output the rest of the column headers (Viewable Names)
			print "<td>$Value</td>";
		print '</tr>'; //Finish header row
	}
	print '<tr bgcolor='.($i&1 ? 'white' : 'gray')."><td>$i</td>"; //Start the data field's row. Row's colors are alternating white and gray.
	foreach($Fields as $Name => $Value) //Output row data
		print '<td>'.$d[$Name].'</td>';
	print '</tr>'; //End data row
}

print ($i==0 ? 'No records found.' : '</table>'); //If no records are found, output an error message, otherwise, end the data table
?>
Outputting directory contents in PHP
Rebuilding the wheel
A friend just asked me to write a PHP function to list all the contents of a directory and its sub-directories.
Nothing special here... just a simple example piece of code and boredom...
function ListContents($DirName)
{
	print '<ul>';
	$dir=opendir($DirName);
	while($file=readdir($dir))
		if($file!='.' && $file!='..')
		{
			$FilePath="$DirName/$file";
			$IsDir=is_dir($FilePath);
			print "<li>$file [".($IsDir ? 'D' : number_format(filesize($FilePath), 0, '.', ',')).']';
			if($IsDir)
				ListContents($FilePath);
			print '</li>';
		}
	closedir($dir);
	print '</ul>';
}
It wouldn’t be a bad idea to turn off PHP’s “output buffering” and on “implicit flush” when running something like this for larger directories.
Example output for “ListContents('c:\\temp');”:
  • A.BMP [230]
  • Dir1 [D]
    • codeblocks-1.0rc2_mingw.exe [13,597,181]
    • Dir1a [D]
      • DEBUGUI.C [25,546]
  • Dir2 [D]
    • Dir3 [D]
      • HW.C [12,009]
      • INIFILE.C [9,436]
    • NTDETECT.COM [47,564]


    I decided to make it a little nicer afterwards by bolding the directories, adding their total size, and changing sizes to a human readable format. This function is a lot more memory intensive because it holds data in strings instead of immediately outputting.
    function HumanReadableSize($Size)
    {
    	$MetricSizes=Array('Bytes', 'KB', 'MB', 'GB', 'TB');
    	for($SizeOn=0;$Size>=1024 && $SizeOn<count($MetricSizes)-1;$SizeOn++) //Loops until Size is < a binary thousand (1,024) or we have run out of listed Metric Sizes
    		$Size/=1024;
    	return preg_replace('/\\.?0+$/', '', number_format($Size, 2, '.', ',')).' '.$MetricSizes[$SizeOn]; //Forces to a maximum of 2 decimal places, adds comma at thousands place, appends metric size
    }
    
    function ListContents2($DirName, &$RetSize)
    {
    	$Output='<ul>';
    	$dir=opendir($DirName);
    	$TotalSize=0;
    	while($file=readdir($dir))
    		if($file!='.' && $file!='..')
    		{
    			$FilePath="$DirName/$file";
    			if(is_dir($FilePath)) //Is directory
    			{
    				$DirContents=ListContents2($FilePath, $DirSize);
    				$Output.="<li><b>$file</b> [".HumanReadableSize($DirSize)."]$DirContents</li>";
    				$TotalSize+=$DirSize;
    			}
    			else //Is file
    			{
    				$FileSize=filesize($FilePath);
    				$Output.="<li>$file [".HumanReadableSize($FileSize).']</li>';
    				$TotalSize+=$FileSize;
    			}
    		}
    	closedir($dir);
    	$RetSize=$TotalSize;
    	$Output.='</ul>';
    	return $Output;
    }
    
    Example output for “print ListContents2('c:\\temp', $Dummy);”:
    • A.BMP [230 Bytes]
    • Dir1 [12.99 MB]
      • codeblocks-1.0rc2_mingw.exe [12.97 MB]
      • Dir1a [24.95 KB]
        • DEBUGUI.C [24.95 KB]
    • Dir2 [0 Bytes]
      • Dir3 [20.94 KB]
        • HW.C [11.73 KB]
        • INIFILE.C [9.21 KB]
      • NTDETECT.COM [46.45 KB]


      The memory problem can be rectified through a little extra IO by calculating the size of a directory before its contents is listed, thereby not needing to keep everything in a string.
      function CalcDirSize($DirName)
      {
      	$dir=opendir($DirName);
      	$TotalSize=0;
      	while($file=readdir($dir))
      		if($file!='.' && $file!='..')
      			$TotalSize+=(is_dir($FilePath="$DirName/$file") ? CalcDirSize($FilePath) :  filesize($FilePath));
      	closedir($dir);
      	return $TotalSize;
      }
      
      function ListContents3($DirName)
      {
      	print '<ul>';
      	$dir=opendir($DirName);
      	$TotalSize=0;
      	while($file=readdir($dir))
      		if($file!='.' && $file!='..')
      		{
      			$FilePath="$DirName/$file";
      			$IsDir=is_dir($FilePath);
      			$FileSize=($IsDir ? CalcDirSize($FilePath) : filesize($FilePath));
      			$TotalSize+=$FileSize;
      			print '<li>'.($IsDir ? '<b>' : '').$file.($IsDir ? '</b>' : '').' ['.HumanReadableSize($FileSize).']';
      			if($IsDir) //Is directory
      				$TotalSize+=ListContents3($FilePath);
      			print '</li>';
      		}
      	closedir($dir);
      	print '</ul>';
      }
      
      Example output: for “ListContents3('c:\\temp');”:
      • A.BMP [230 Bytes]
      • Dir1 [12.99 MB]
        • codeblocks-1.0rc2_mingw.exe [12.97 MB]
        • Dir1a [24.95 KB]
          • DEBUGUI.C [24.95 KB]
      • Dir2 [0 Bytes]
        • Dir3 [20.94 KB]
          • HW.C [11.73 KB]
          • INIFILE.C [9.21 KB]
        • NTDETECT.COM [46.45 KB]


        Of course, after all this, my friend took the original advice I gave him before writing any of this code, which was that using bash commands might get him to his original goal much easier.