
only contain a couple of dozen lines of code each. This is because just about
everything is written in reusable functions.
So once again the important thing is to understand the functions kept in
/book/functions/basic.php and /book/guestbook2k/functions.php, as well as the
startup code run in /book/guestbook2k/header.php.



Code Breakdown
As mentioned in the previous section, the vast majority of the work of this
application is done in functions, and these functions are kept in files that are
included in the pages called from the browser.

From functions/basic.php
The following are the main functions from basic.php that we'll be using in this
application. We'll cover other functions later on, and they're all briefly documented
in Appendix F. The functions are grouped by their general purpose, and that's the
order in which we'll go through them.

GENERAL UTILITY FUNCTIONS
(OR, “IF ONLY PHP HAD A FUNCTION TO . . .”)
Here are some utility functions that we find helpful:

Chapter 8: Guestbook 2003, the (Semi-)Bulletproof Guestbook

ARRAY_KEY_VALUE() With the advent of the new “superglobal” PHP variables like
$_POST, you'll be getting most of the values you use in your code from associative
arrays ($_GET, $_POST, and so on). The trouble is that if a particular key hasn't been
defined in the array, trying to access it causes an error (strictly speaking, a
warning, but it's a good idea to eliminate those, too). You could type something like

$country = isset($my_array['country']) ? $my_array['country'] : '';

over and over again, but that isn't great because if $my_array['country'] is
set to NULL, isset() will return FALSE, and that may not be the behavior you
want. After all, the key is present in the array. A better choice would be
array_key_exists(), which returns TRUE if the key exists, no matter what its
contents. But typing lines like the following repeatedly is no fun at all:

$country = array_key_exists('country', $my_array) ?
    $my_array['country'] : '';

The following function will help:

function array_key_value($arr='', $name='', $default='')
{
    // cast in case $arr is an object
    $arr = (array)$arr;
    // a single key name: return its value, or the default if it is absent
    if (!is_array($name))
    {
        if (array_key_exists($name,$arr))
            $default = $arr[$name];
        return $default;
    }
    // an array of key names: return a list of values in the same order
    $results = array();
    foreach ($name as $n)
    {
        if (array_key_exists($n,$arr))
        {
            $results[] = $arr[$n];
        }
        else
        {
            $results[] = $default;
        }
    }
    return $results;
}

You can see this function used on almost every page of the application. For
example, the sign.php page contains several lines that look like this:

$email = array_key_value($_POST,'email');
238 Part III: Simple Applications

When array_key_value() is called as in the preceding example, $name is the string
'email' rather than an array, so the first if block runs. Inside it, the
array_key_exists() function checks whether a key named email exists in the $_POST
array. If it does, the value of $_POST['email'] is returned and assigned to $email.
If the email key does not exist, the default is returned and $email will contain an
empty string.
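
As a rough illustration of the cases discussed above, the following sketch repeats a compact copy of array_key_value() so that it runs on its own. The $post array and its values are made up for the example and stand in for $_POST.

```php
<?php
// Compact copy of array_key_value() so this sketch is self-contained.
function array_key_value($arr = array(), $name = '', $default = '')
{
    $arr = (array)$arr;
    if (!is_array($name))
    {
        return array_key_exists($name, $arr) ? $arr[$name] : $default;
    }
    $results = array();
    foreach ($name as $n)
    {
        $results[] = array_key_exists($n, $arr) ? $arr[$n] : $default;
    }
    return $results;
}

// Hypothetical form data standing in for $_POST.
$post = array('email' => 'jay@example.com', 'country' => NULL);

$email   = array_key_value($post, 'email');           // key present
$name    = array_key_value($post, 'name', 'unknown'); // key missing: default is used
$country = array_key_value($post, 'country', 'n/a');  // key present but NULL: NULL comes back

// Passing an array of names returns the values in the same order.
list($email2, $name2) = array_key_value($post, array('email', 'name'));

// isset() treats the NULL value as "not set"; array_key_exists() does not.
$via_isset  = isset($post['country']);
$via_exists = array_key_exists('country', $post);
```

Note that a key holding NULL is returned as NULL rather than replaced by the default, which is exactly the difference between array_key_exists() and isset() described earlier.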

IS_ASSOC() Sometimes you need to know not just if a variable is an array, which
the PHP function is_array() can tell you, but if it's an associative array. In other
words, do all the elements have named (non-numeric) keys? The is_assoc() function
exists for this situation.

function is_assoc($a)
{
    if (is_array($a) || is_object($a))
    {
        // cast so that array_keys() also works when $a is an object
        $nkeys = array_filter(array_keys((array)$a),'is_numeric');
        if (empty($nkeys))
        {
            return TRUE;
        }
    }
    return FALSE;
}
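
To make the behavior concrete, here is a small sketch that calls a self-contained copy of is_assoc() on a few sample arrays. The sample data is invented for the example, and the copy casts its argument to an array before calling array_keys(), which is an assumption made so that objects are handled too.

```php
<?php
// Self-contained copy of is_assoc(); the (array) cast is an assumption
// added here so that objects can be passed to array_keys() safely.
function is_assoc($a)
{
    if (is_array($a) || is_object($a))
    {
        $nkeys = array_filter(array_keys((array)$a), 'is_numeric');
        return empty($nkeys);
    }
    return FALSE;
}

// Every key is a name, so this counts as associative.
$named = is_assoc(array('name' => 'Jay', 'email' => 'jay@example.com'));

// An ordinary list has the numeric keys 0, 1, 2.
$list = is_assoc(array('a', 'b', 'c'));

// A single numeric key is enough to fail the test.
$mixed = is_assoc(array('name' => 'Jay', 'extra'));

// Anything that is not an array or object is never associative.
$scalar = is_assoc('hello');
```

One consequence of testing keys with is_numeric() is that a key such as '10' counts as numeric even when it was written as a string.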

DEFENSIVE AND TEXT-HANDLING FUNCTIONS
A shocking amount of programming work has to do with the manipulation of
strings. It seems there's always text to be chopped up, stuck together, searched for,
or formatted. This section deals with some of the text-processing functions
available in PHP.

CHARSET() This function sends out an HTTP header that explicitly sets the
character-set-encoding value for the page. Unless told otherwise, it uses ISO-8859-1
with a MIME type of text/html:

function charset($charset='',$mimetype='')
{
    if (empty($charset))
    {
        $charset = 'ISO-8859-1';
    }
    if (empty($mimetype))
    {
        $mimetype = 'text/html';
    }
    header("Content-Type: $mimetype; charset=$charset");
}

If the character set is left undefined, it can be much more difficult to detect and
prevent hacks into your scripts (by looking for < and > characters, for example). If
you have access to the php.ini file for your site, you can uncomment the
'default_charset' value there to the same effect. You can find more information
about this topic at the following sites:

http://www.cert.org/tech_tips/malicious_code_mitigation.html
http://www.apache.org/info/css-security/encoding_examples.html

CLEANUP_TEXT() This function goes a long way toward making sure we don't
insert malicious text into our database.

function cleanup_text ($value='', $preserve='', $allowed_tags='')
{
    if (empty($preserve))
    {
        $value = strip_tags($value, $allowed_tags);
    }
    $value = htmlspecialchars($value);
    return $value;
}

This function accomplishes two things. First, it removes HTML tags. The
strip_tags() function takes care of that. We can indicate tags we want to keep
with the third argument ($allowed_tags). For instance, if we want to allow bold
and italic tags, the second argument to strip_tags() can be a string like this:
<b><i>. If we want to leave all tags as they are, we can indicate this with a
non-empty value in the second argument, $preserve.
Then htmlspecialchars() changes characters like ampersands (&) and double quotes
to their equivalent HTML entities (&amp; and &quot;, respectively). After being run
through this little function, your text is ready to be inserted in the database.
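
The three ways of calling the function can be sketched as follows. The function body is repeated so the example runs on its own, and the sample comment text is invented for illustration.

```php
<?php
// Copy of cleanup_text() so this sketch is self-contained.
function cleanup_text($value = '', $preserve = '', $allowed_tags = '')
{
    if (empty($preserve))
    {
        $value = strip_tags($value, $allowed_tags);
    }
    return htmlspecialchars($value);
}

// Hypothetical guestbook comment.
$raw = 'Great <b>site</b> & <i>nice</i> work';

// Default: every tag is stripped, then special characters are encoded.
$plain = cleanup_text($raw);

// Allow only <b>: the <i> tags are stripped, and the surviving <b> tags
// are then encoded as entities along with the ampersand.
$bold_only = cleanup_text($raw, '', '<b>');

// Non-empty $preserve: no tags are stripped, everything is just encoded.
$preserved = cleanup_text($raw, 'keep');
```

Because htmlspecialchars() runs last, even the tags that were allowed through end up stored as &lt;b&gt; and so on, which is why a reversing step is needed before display.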

REVERSE_CLEANUP_TEXT() So we've run all the text from your users through
cleanup_text() before storing it in our database, for safety's sake. Now, though,
we need to get that text back out of the database and display it on a Web page. If
we did allow some HTML tags to be included, we'll need to reverse the effects of
cleanup_text(), or instead of seeing this (with the film title in italics):

My mom won't let me watch The Exorcist tonight!

you'll see this:

My mom won't let me watch <i>The Exorcist</i> tonight!

function reverse_cleanup_text ($value)
{
    static $reverse_entities = NULL;
    if ($reverse_entities === NULL)
    {
        $reverse_entities = array_flip(
            get_html_translation_table(HTML_ENTITIES)
        );
    }
    return strtr($value,$reverse_entities);
}

The HTML translation table HTML_ENTITIES is a list of all the special characters
that have HTML-entity equivalents. We use array_flip() to turn it around,
so that strtr() can go through the string and replace each HTML entity it finds
with the special single character it represents: &lt; to <, &amp; to &, and so on.
(We save the flipped translation table in a static variable, so subsequent calls to
this function won't have to recreate it.)
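
A short round trip shows the effect. This is only a sketch: the stored string is a made-up example of what might come back from the database after cleanup, and the function is repeated so the block runs by itself.

```php
<?php
// Copy of reverse_cleanup_text() so this sketch is self-contained.
function reverse_cleanup_text($value)
{
    static $reverse_entities = NULL;
    if ($reverse_entities === NULL)
    {
        // entity => character, for example '&lt;' => '<'
        $reverse_entities = array_flip(
            get_html_translation_table(HTML_ENTITIES)
        );
    }
    return strtr($value, $reverse_entities);
}

// Hypothetical text as stored after htmlspecialchars() was applied.
$stored = '&lt;i&gt;The Exorcist&lt;/i&gt; &amp; more';

// Entities are turned back into the characters they stand for.
$decoded = reverse_cleanup_text($stored);

// Encoding and then reversing returns the original text.
$original   = 'Watch <i>The Exorcist</i> & more';
$round_trip = reverse_cleanup_text(htmlspecialchars($original));
```

Since strtr() does not rescan text it has already replaced, a doubly encoded sequence such as &amp;lt; is decoded only one level, to &lt;.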

MAKE_PAGE_TITLE() For most pages, we use the same text in the <title> that
appears in an HTML heading <h1>. But some characters are inappropriate for the
<title> tag. For instance, if we set $page_title to “Jos&eacute;'s Review of
<i>The Exorcist</i>”, within a rendered <h1> tag we'd see the correct value,
José's Review of The Exorcist. But the title of the browser window will show
José's Review of <i>The Exorcist</i>. To avoid this, we can use this little
function:

function make_page_title ($title='')
{
    return reverse_cleanup_text(cleanup_text($title));
}
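
Put together, the two helpers strip the markup and leave plain text suitable for a <title>. The sketch below repeats all three functions so it can run on its own; the sample heading is invented and deliberately avoids apostrophes, whose encoding differs between PHP versions.

```php
<?php
// Copies of the three helpers so this sketch is self-contained.
function cleanup_text($value = '', $preserve = '', $allowed_tags = '')
{
    if (empty($preserve))
    {
        $value = strip_tags($value, $allowed_tags);
    }
    return htmlspecialchars($value);
}

function reverse_cleanup_text($value)
{
    static $reverse_entities = NULL;
    if ($reverse_entities === NULL)
    {
        $reverse_entities = array_flip(
            get_html_translation_table(HTML_ENTITIES)
        );
    }
    return strtr($value, $reverse_entities);
}

function make_page_title($title = '')
{
    return reverse_cleanup_text(cleanup_text($title));
}

// Hypothetical heading text containing markup, an entity, and an ampersand.
$page_title = 'Jos&eacute; on <i>The Exorcist</i> & More';

// The <i> tags are removed; the entity and the ampersand come through unchanged.
$window_title = make_page_title($page_title);
```

The &eacute; entity survives because cleanup_text() encodes its ampersand and reverse_cleanup_text() then decodes only that one level.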

FROM /GUESTBOOK2K/HEADER.PHP
Once again, this file will be included in every page in this application. It includes
the functions.php and globals.php files, where we'll keep all the functions and
global variables specific to this application. In addition, the first few lines of this
file see to a few details. For instance, they set the PHP include_path
configuration variable to cover the /book/functions directory. (If you can edit the
php.ini file for your installation, you can set include_path in there and remove
this code.)

// make sure that the book/functions directory is in the include path

// realpath() turns a relative path into an absolute one, and
// DIRECTORY_SEPARATOR is a predefined PHP constant that is
// a slash (/) on Unix and a backslash (\) on Windows
$funcdir = realpath('..'.DIRECTORY_SEPARATOR.'functions');
$include_path = ini_get('include_path');
if (strpos($include_path, $funcdir) === FALSE)
{
    // the only time there's a semicolon in the include path is on Windows.
    // (far as i know, at least...)
    $ps = strchr($include_path, ';') ? ';' : ':';
    ini_set('include_path',$include_path.$ps.$funcdir);
}


require_once('basic.php');


// set the character encoding
charset();


// display all errors and warnings, but not notices
error_reporting(E_ALL ^ (E_NOTICE | E_USER_NOTICE));


require_once('globals.php');
require_once('functions.php');


mysql_dbconnect();
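
As a variation on the include-path code above, and not what header.php itself does, the same result can be sketched with PHP's predefined PATH_SEPARATOR constant, which avoids having to guess between a colon and a semicolon. The directory used here is a hypothetical stand-in for the book/functions path.

```php
<?php
// Hypothetical directory to add; the application uses its book/functions path.
$funcdir = getcwd() . DIRECTORY_SEPARATOR . 'functions';

// Split the current include path into its individual directories.
// PATH_SEPARATOR is ':' on Unix-like systems and ';' on Windows.
$paths = explode(PATH_SEPARATOR, get_include_path());

// Append the directory only if it is not already listed.
if (!in_array($funcdir, $paths, true))
{
    $paths[] = $funcdir;
    set_include_path(implode(PATH_SEPARATOR, $paths));
}

// Read the setting back to confirm what PHP will now search.
$updated = explode(PATH_SEPARATOR, get_include_path());
```

Comparing whole directory entries rather than using strpos() also avoids a false match when one path happens to be a substring of another.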


A few words about including files. You'll notice that we use two different
functions to do this, require_once() and include_once(). They work exactly the
same way, except that, as the name implies, require_once() won't take no for
an answer. If it can't find the file it's trying to include, the script will fail right
there. The include_once() function will just issue a warning and move on.
We also generally prefer the include_once() and require_once() functions
over include() and require(). Again, as the names imply, the difference is that
the _once functions will only include a file if it hasn't already been included at
some point. This enables us to put calls to a file like header.php in all of our files,
even if some of them might end up including others.
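
The difference can be seen in a small experiment. This sketch writes a throwaway include file to the system temporary directory (a made-up stand-in for something like header.php) that counts how many times it has been run.

```php
<?php
// Create a temporary include file that bumps a global counter each time it runs.
$file = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents(
    $file,
    '<?php $GLOBALS["load_count"] = ($GLOBALS["load_count"] ?? 0) + 1;'
);

$GLOBALS['load_count'] = 0;

include_once $file;   // first request: the file runs
include_once $file;   // already included: skipped
require_once $file;   // also skipped, for the same reason
$after_once = $GLOBALS['load_count'];

include $file;        // plain include always runs the file again
$after_plain = $GLOBALS['load_count'];

// A missing file only produces a warning with include_once (suppressed
// here with @) and the call returns FALSE. require_once would stop the script.
$missing = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'no_such_file_' . uniqid() . '.php';
$missing_result = @include_once $missing;

unlink($file);        // remove the temporary file
```

The counter stays at one through all three _once calls and only rises when the plain include is used.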

FROM /GUESTBOOK2K/GLOBALS.PHP
In the following code, we have included something interesting: a constant, here
named DEFAULT_LIMIT. A constant is like a variable in that it contains a value (in
this instance 2). However, that value cannot be changed by a simple assignment; in
fact, once a constant has been defined with the define() function, it can't be
changed at all. Constants do not run into the same scope problems that are
encountered with variables, so you can use them within functions without having to
pass them in as arguments or worry about declaring globals. After you run the
