Com4tzone



 

 

 

Dynamic generation of static webpages

Sometimes you may need to generate a static version of your dynamically generated web pages. For instance, you might have to produce a web CD, or you might want to offer a compressed version of some content on your website, to be downloaded and read offline. In some cases, you may need to show a work-in-progress web site to someone who doesn't have a connection to the Internet (yes, sometimes it happens). Or you might want to publish your web pages on a host that doesn't offer server-side programming (as is the case with most free hosting sites).

In these circumstances, what you have to produce is a set of static web pages that can be viewed by a user who doesn't have a webserver running, or hosted where you have no control over dynamic generation.

What if you wanted to use PHP to produce those pages, taking advantage of its features with minimum effort? Below I describe something you can do quite easily. Extra ideas on this basis are welcome.

You'll need only some tools you probably already have: a webserver, PHP, and a browser that you can control in batch mode, such as lynx (present in most Linux distributions, and freely available for other platforms at http://lynx.browser.org).
 

 
The idea

The idea is simple. You develop your website at home, using all PHP features you like, such as inclusion of text files, extensive use of pre-defined and user-declared functions, classes, templates, and so on.

Once you are satisfied with your work, you could use your preferred browser to view all your pages, one by one, and save them in a directory on your hard disk, changing file extensions from .php4 to .html. But this approach would be quite time-consuming, especially if you have several pages and need to regenerate them often.

Using lynx is a good alternative, since it can fetch the pages from the webserver and write their source to standard output. You could then write a batch procedure (a shell script) such as

lynx -source http://localhost/dynpages/index.php4 > C:\Documents\StaticPages\mywebsite\index.html
lynx -source http://localhost/dynpages/page2.php4 > C:\Documents\StaticPages\mywebsite\page2.html
lynx -source http://localhost/dynpages/page3.php4 > C:\Documents\StaticPages\mywebsite\page3.html
lynx -source http://localhost/dynpages/page4.php4 > C:\Documents\StaticPages\mywebsite\page4.html
The example here is for Windows. On other platforms you should change the path after the > operator accordingly.

This way, when you want to regenerate all your pages, you just need to launch your batch file.

If you have a number of files, and/or their names change often, you could think about using PHP to generate the lines of your batch file:

<?php

define('SLASH', '\\'); //* this is for use under Win9x;
                       //* under linux, use '/'.

define('CRLF', chr(13).chr(10));

$httpbase="http://localhost/samplesite"; //* local URL for your pages

$dyndir="E:/xitami/webpages/samplesite"; //* (do not use backslashes)
                                         //* where your webserver keeps your pages

$staticdir="C:/windows/desktop/samplewebsite/generatedpages/online";
                                         //* where you want your pages to be put

$parameters="?outputas=online"; //* parameters you might need

function changeslashes($path) {
    return str_replace("/", SLASH, $path); //* useful under Windows
}

function subdir($path) {
    //* returns the part of $path below $dyndir (e.g. "/docs"),
    //* or '' for $dyndir itself
    global $dyndir;

    if ($dyndir!=$path) {
        return substr($path, strlen($dyndir)-strlen($path));
    }
    else {
        return '';
    }
}

function destination($path) {
    global $staticdir;

    $dest=$staticdir . subdir($path);

    $dest=changeslashes($dest); //* useful under Win
    return $dest;
}

function searchdir($dirname) {
    global $httpbase, $parameters;

    $handle=opendir($dirname);
    //* strict test, so that a file named "0" does not end the loop
    while (($file=readdir($handle))!==false) {
        if ($file !='.' && $file !='..') {
            if (is_dir($dirname.'/'.$file)) {
                //* subdirectory: create it at the destination, then recurse
                if (!is_dir(destination($dirname).'/'.$file)) {
                    echo "mkdir " . destination($dirname). SLASH . "$file" . CRLF;
                }
                searchdir($dirname.'/'.$file);
            } else {
                if (substr($file, -5)==".php4") {
                    //* PHP page: fetch it through the webserver as .html
                    $newname=substr($file, 0, strlen($file)-5).'.html';
                    echo "lynx -source $httpbase" . subdir($dirname) .
                        "/$file$parameters > " . destination($dirname) . SLASH . "$newname" . CRLF;
                } else {
                    //* any other file: just copy it
                    echo "copy " . changeslashes($dirname) .
                        SLASH. "$file " . destination($dirname) . SLASH. "$file" . CRLF;
                }
            }
        }
    }
    closedir($handle);
}

searchdir($dyndir);
?>

The script is recursive. It walks the whole directory tree and writes a batch file that fetches every PHP page through the webserver (saving it as a normal HTML file) and simply copies all other files.

You might capture the output of this PHP script into a batch file and run it at once. For instance:

PHP -q generatepages.php4 > generatepages.bat
generatepages.bat
These two lines can themselves be a batch procedure, so that a single command regenerates all the pages of your web site.

I wrote and tested this script under Windows. It should also work under Linux with a few changes (you wouldn't need to change slashes to backslashes, you would use cp instead of copy, etc.).
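The platform-specific pieces mentioned above could be chosen at run time instead of being edited by hand. The following is only a sketch, assuming that PHP's built-in DIRECTORY_SEPARATOR constant is an acceptable way to tell Windows from Unix-like systems. The function name platform_commands() and the array keys are hypothetical and not part of the script above.

```php
<?php
// Hypothetical helper (not part of the original script): returns the
// path separator, shell commands and line ending to use when writing
// the batch file, keyed on the directory separator.
function platform_commands($separator = DIRECTORY_SEPARATOR) {
    if ($separator === '\\') {
        // Windows: backslashes, "copy", CRLF line endings for .bat files
        return array('slash' => '\\', 'copy' => 'copy',
                     'mkdir' => 'mkdir', 'eol' => "\r\n");
    }
    // Linux and other Unix-like systems: forward slashes, "cp", LF
    return array('slash' => '/', 'copy' => 'cp',
                 'mkdir' => 'mkdir -p', 'eol' => "\n");
}

// Example: emit a copy command for the platform PHP is running on
$cmd = platform_commands();
echo $cmd['copy'] . ' style.css generated' . $cmd['slash'] . 'style.css' . $cmd['eol'];
```

With something like this, the SLASH and CRLF constants and the literal "copy" in the script could be replaced by the returned values, at the cost of one more level of indirection.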

As you may notice, you can encode in the URL address (the query string) one or more parameters, as usual. I often take advantage of this, using parameters to control which stylesheet to use, or which content to display on my pages.

To make this clear, I'll show you the values I give to the outputas parameter:

online: used when you generate static pages to be published on an Internet site (links get the .html extension; links to directories get no file name added);

offline: used when you generate static pages to be viewed offline, with no webserver involved (links get the .html extension; links to directories get index.html appended);

print: used when you want to print the pages you are generating (I choose a different stylesheet, and write remote URLs in brackets after the linked text).

Since the parameter's value is available to every function you call in your page, you can put it to good use.
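As an illustration of how a page might read this parameter, here is a minimal sketch. It assumes the value arrives in $_GET (older PHP setups with register_globals exposed it directly as $outputas). The function names output_mode() and stylesheet_for(), and the stylesheet file names, are hypothetical.

```php
<?php
// Hypothetical helper: returns a validated outputas value, or '' (normal
// dynamic browsing) when the parameter is missing or not recognised.
function output_mode($query) {
    $allowed = array('online', 'offline', 'print');
    $mode = isset($query['outputas']) ? $query['outputas'] : '';
    return in_array($mode, $allowed, true) ? $mode : '';
}

// Hypothetical stylesheet names: only print mode gets a different one.
function stylesheet_for($mode) {
    return ($mode === 'print') ? 'print.css' : 'screen.css';
}

$outputas = output_mode($_GET);
echo '<LINK REL="stylesheet" HREF="' . stylesheet_for($outputas) . '">' . "\n";
```

Validating the value against a fixed list keeps an unexpected query string from silently selecting a mode the page does not handle.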

While you develop your website, you will normally use the extension .php3 or .php4 for your files. If you write a link straight into your HTML code, it will be broken once you generate the .html files, since the file names have changed.

As a workaround for this, I normally use this trick: instead of writing the normal code for a link

<A HREF="page2link.php4">linked text</A>

I call a PHP function

<?PHP makelink("linked text", "page2link"); ?>

which, by default, outputs an equivalent link to the .php4 page (with a few extra attributes), but can output something different when specific parameters are given in the query string.

<?php

function makelink($linkedtext, $linkhref, $kind=0, $fragment="", $title="", $target="_top") {
    //* $kind values: 0 means internal; 1 means external;

    //* $outputas must hold the value of the outputas query parameter;
    //* where register_globals is not available, set it yourself from $_GET
    global $outputas;

    $class = ($kind==0)? 'internal' : 'external';

    if ($kind==0) {
        if (substr($linkhref, -1)=='/') {
            if ($outputas=='offline') {
                $linkhref .= "index.html"; //* you can't have links to dirs when offline
            }
        }
        else {
            if ($outputas=='online' || $outputas=='offline') {
                $linkhref.='.html';
            }
            else {
                $linkhref.='.php4';
            }
        }
    }

    echo "<A HREF=\"$linkhref$fragment\" CLASS=\"$class\" TITLE=\"$title\" TARGET=\"$target\">";
    echo "$linkedtext</A>";

    //* for printed things, I want to write URLs in brackets after links
    if ($outputas=='print' && $kind==1) {
        echo " <SMALL>[$linkhref]</SMALL>";
    }
}

?>
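To see what the internal-link branch of makelink() does in each mode, the rule can be isolated in a small function and printed side by side. This is only a sketch for illustration: resolve_href() is a hypothetical name, and it takes the mode as an argument rather than reading the global.

```php
<?php
// Hypothetical helper mirroring the internal-link branch of makelink().
function resolve_href($linkhref, $outputas) {
    if (substr($linkhref, -1) == '/') {
        // Directory link: only offline viewing needs an explicit file name
        return ($outputas == 'offline') ? $linkhref . 'index.html' : $linkhref;
    }
    // Page link: static modes get .html, anything else keeps .php4
    $static = ($outputas == 'online' || $outputas == 'offline');
    return $linkhref . ($static ? '.html' : '.php4');
}

// Print the result for a page link and a directory link in each mode
foreach (array('', 'online', 'offline') as $mode) {
    echo "[$mode] " . resolve_href('page2link', $mode) . ' '
        . resolve_href('docs/', $mode) . "\n";
}
```

With no mode set the page link stays page2link.php4. In online mode it becomes page2link.html and the directory link is left as docs/. In offline mode the directory link becomes docs/index.html.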

One more thing. Since your generated files may sit in different directories, you may need to adjust the path to some resources (for instance, the stylesheet or the home page).

For these things, you can work out how deep your page sits in the directory hierarchy, and then produce a relative path (for instance, "../../") to use where needed.

You just have to write this at the top of your pages:

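<?php

//* -3 discounts the leading empty element, the site directory and the
//* file name, so this assumes the site lives one directory below the web root
$level=count(explode('/', $_SERVER['PHP_SELF']))-3;
$rp = $level>0 ? str_repeat("../", $level) : './';
?>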
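The same calculation can be wrapped in a function so that it can be tried on sample paths. A minimal sketch, assuming the site's top directory sits a known number of segments below the web root. The name relative_prefix() and the $rootDepth parameter are hypothetical.

```php
<?php
// Hypothetical helper: $rootDepth is the number of path segments between
// the web root and the site's top directory (1 for /samplesite/...).
function relative_prefix($selfPath, $rootDepth = 1) {
    // "/samplesite/sub/page.php4" -> array('', 'samplesite', 'sub', 'page.php4')
    $parts = explode('/', $selfPath);
    // Discount the leading empty element, the file name and the root segments
    $level = count($parts) - 2 - $rootDepth;
    return $level > 0 ? str_repeat('../', $level) : './';
}

echo relative_prefix('/samplesite/index.php4') . "\n";       // ./
echo relative_prefix('/samplesite/docs/faq/a.php4') . "\n";  // ../../
```

A page would then write, for example, HREF="<?php echo $rp; ?>style.css" with $rp set from relative_prefix($_SERVER['PHP_SELF']), where style.css is again just a placeholder name.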
       

Send your suggestions and comments to web@com4tzone.dk
Com4tzone • web developer digest © 2019-2023
Copenhagen, Denmark