Auto Linking

by @jehiah on 2004-10-07 16:17UTC
Filed under: All, HTML, Javascript

There are three parts to the auto-link problem for a site: simple coding of links, mapping titles to URLs, and redirecting requests. Read on for some of my ideas on tackling the problem, and a few solutions.

Simple Link Creation

The first part of the problem is creating the HTML links easily, in a way that does not require knowing where in the site navigation the destination page is. One option is to always create a simple link to /a and have a JavaScript function rewrite it to /go_autolink.php?page=page_title from the document onload event. While this might suck for long pages which are slow to finish loading (the links will not work until they have been rewritten), it will work. Another problem with this solution is that spiders will not be able to crawl your site through these links, so make sure you have a complete site index or similar page with real links so they can still index your content. Note: you will need to fire the autolink() function at onload for this to work.
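As a minimal sketch of that last note, the helper below attaches a function to the load event without throwing away a handler that is already assigned. The name addLoadHandler is hypothetical and not part of the script that follows; the sketch also assumes autolink() is defined globally on the page.

```javascript
// Hypothetical helper: run fn when target (normally window) finishes loading,
// while still calling any onload handler that was assigned earlier.
function addLoadHandler(target, fn) {
  var previous = target.onload;
  target.onload = function (event) {
    if (typeof previous === "function") {
      previous.call(target, event);
    }
    fn.call(target, event);
  };
}

// In a browser page that includes the autolink script, hook it up once.
if (typeof window !== "undefined" && typeof autolink === "function") {
  addLoadHandler(window, autolink);
}
```

If nothing else on the page uses onload, a plain window.onload = autolink; does the same job.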

Rewrite Link Code

<script language="javascript">
/* Author: Jehiah Czebotar October 7, 2004
 * http://www.jehiah.com/
 * Function autolink(), Function mangleURL(), Function getDomain()
 *
 * Feel free to use this script under the terms of the GNU General Public
 * License, as long as you do not remove or alter this notice.
 */

function autolink()
{
	var base_url = getDomain(); 

Note: this is where you would change your starting link, and the structure of your destination link.

	var auto_link_text = "a"; // link that exists in the page (really "/a")
	var prefix = "go_autolink.php?page=";  // format we want the link to end up as
	// get all page links
	var all_links = document.getElementsByTagName("a")
	// loop through links and find the 'autolinks'
	for (var i=0;i < all_links.length;i++)
	{
		// debugging : 
		// alert(all_links[i].href);		
		// if this link is an autolink
		if (all_links[i].href == base_url + auto_link_text)
			// href == prefix & mangle(H_getText(this))
			all_links[i].href = base_url + prefix + mangleURL(H_getText(all_links[i]));
	} 	
	return true;
};

function getDomain(){
    var myregexp = new RegExp("(http|https|file|ftp)://[^/]*/");
    var m = myregexp.exec(location.href);
    if (m == null)
        return "";
    else
        return m[0];
}

function mangleURL(str)
{
	// collapse consecutive whitespace into a single space
	str = str.replace(/\s{2,}/g," ");
	// trim leading space
	str = str.replace(/^\s/,"");
	// autocase ?
	str = str.toLowerCase();
	// replace spaces with underscore
	str = str.replace(/\s/g,"_");
	return str;
};

/* Author: Mihai Bazon, September 2002
 * http://students.infoiasi.ro/~mishoo
 *
 * Function H_getText()
 *
 * Feel free to use this script under the terms of the GNU General Public
 * License, as long as you do not remove or alter this notice.
 */

function H_getText(el) {
	var text="";
	for (var i=el.firstChild;i!=null;i=i.nextSibling) {
		if(i.nodeType==3) {
			text += i.data;
		} 
		else if (i.firstChild!=null) { 
			text+=H_getText(i);
		}
	}
	return text;
};
</script>
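To make the title mangling concrete, here is a standalone sketch of the same steps, with two additions that are my assumptions rather than part of the script above: it also trims a trailing space, and it URL-encodes the result so that characters such as & survive the trip through the query string. The names titleToKey and buildAutolinkHref are hypothetical.

```javascript
// Hypothetical variant of mangleURL(): collapse whitespace, trim, lowercase,
// then turn the remaining spaces into underscores.
function titleToKey(text) {
  return text
    .replace(/\s+/g, " ")   // collapse every run of whitespace to one space
    .replace(/^ | $/g, "")  // drop the single leading/trailing space, if any
    .toLowerCase()
    .replace(/ /g, "_");
}

// Build the rewritten href; encodeURIComponent protects characters such as
// "&" and "?" that would otherwise break the page= parameter.
function buildAutolinkHref(baseUrl, linkText) {
  return baseUrl + "go_autolink.php?page=" + encodeURIComponent(titleToKey(linkText));
}

console.log(titleToKey("  Javascript   isDefined "));
// prints "javascript_isdefined"
console.log(buildAutolinkHref("http://www.jehiah.com/", "Q & A"));
// prints "http://www.jehiah.com/go_autolink.php?page=q_%26_a"
```

PHP decodes the page parameter before it reaches $_GET, so the lookup still sees the plain title.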

Mapping Titles and URLs

The middle step of auto-linking titles to page urls is to gather the data needed to map between the two. I’ll leave you to come up with a good way to manage the page links between title and the real url.

I created a simple database to store the two, and added a few entries manually. Below is my table structure.

Table Structure

CREATE TABLE autolink_map (
    ID int(11) NOT NULL auto_increment,
    title varchar(255) default NULL,
    url varchar(255) default NULL,
    PRIMARY KEY  (ID)
) TYPE=MyISAM;

A sample entry for my site would be

INSERT INTO autolink_map (title, url) VALUES ('javascript_isdefined','/archive/javascript-isdefined-function');

Link Resolution

The third part of the problem is intelligently resolving a link from the auto-link title to an actual page (which may be virtually anywhere). If you are lucky and are the czar for your site, you might be able to make the page URL the same as the auto-link URL and skip the mapping altogether (wikis would be a good case for that). For the rest of us, we have to use some finesse. For this illustration I'll require an exact match, though you could probably come up with weighted expressions to handle close matches with soundex and such.
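As a rough illustration of the close-match idea (not something the exact-match code below does), here is a sketch of American Soundex applied word by word to an underscore-separated title. The helpers soundexKey and findCloseMatch are hypothetical names, and comparing whole keys for equality is just one simple way to use the codes.

```javascript
// Consonant groups used by American Soundex; vowels, y, h and w have no code.
var SOUNDEX_CODES = {
  b: 1, f: 1, p: 1, v: 1,
  c: 2, g: 2, j: 2, k: 2, q: 2, s: 2, x: 2, z: 2,
  d: 3, t: 3, l: 4, m: 5, n: 5, r: 6
};

// Return the four-character Soundex code of one word ("" if it has no letters).
function soundex(word) {
  var letters = word.toLowerCase().replace(/[^a-z]/g, "");
  if (letters === "") { return ""; }
  var result = letters.charAt(0).toUpperCase();
  var prev = SOUNDEX_CODES[letters.charAt(0)] || 0;
  for (var i = 1; i < letters.length && result.length < 4; i++) {
    var ch = letters.charAt(i);
    if (ch === "h" || ch === "w") { continue; } // h and w do not separate equal codes
    var code = SOUNDEX_CODES[ch] || 0;
    if (code !== 0 && code !== prev) { result += code; }
    prev = code; // a vowel resets prev, so a repeated code after it counts again
  }
  return (result + "000").slice(0, 4);
}

// Hypothetical helper: one Soundex code per word of a mangled title.
function soundexKey(title) {
  var words = title.split("_");
  var keys = [];
  for (var i = 0; i < words.length; i++) {
    var code = soundex(words[i]);
    if (code !== "") { keys.push(code); }
  }
  return keys.join("_");
}

// Hypothetical helper: first known title that sounds like the requested one.
function findCloseMatch(title, knownTitles) {
  var wanted = soundexKey(title);
  for (var i = 0; i < knownTitles.length; i++) {
    if (soundexKey(knownTitles[i]) === wanted) { return knownTitles[i]; }
  }
  return null;
}

console.log(soundex("Robert")); // prints "R163"
console.log(findCloseMatch("javascript_isdefind", ["javascript_isdefined"]));
// prints "javascript_isdefined"
```

A lookup like this would most likely serve only as a fallback when the exact title is not found, since unrelated titles can share a key.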

go_autolink.php

Make a Database Connection

<?php
// make a db connection
$hostname_dbcon = "localhost";
$database_dbcon = "database_name";
$username_dbcon = "username";
$password_dbcon = "password";
$dbcon = mysql_pconnect($hostname_dbcon, $username_dbcon, $password_dbcon) or die(mysql_error());

Query the database

// lookup in the database
mysql_select_db($database_dbcon, $dbcon);
// escape the requested title before it goes into the SQL string
$query_records = "SELECT url FROM autolink_map WHERE title = '".mysql_real_escape_string(stripslashes($_GET["page"]), $dbcon)."' ";
$records = mysql_query($query_records, $dbcon) or die(mysql_error());
$row_records = mysql_fetch_assoc($records);
$totalRows_records = mysql_num_rows($records);

Give the browser a redirect response.

// give the redirect
if ($totalRows_records == 1)
        {header('location:'.$row_records['url']);}
else
        {header('location:/page_not_found.php');}
?>

Jehiah Czebotar