Python Programming, news on the Voidspace Python Projects and all things techie.


The Laws of Tagging

Debian users can use a package called debtags to browse through their 16769 different binary packages.

This program uses principles developed in the nineteen-thirties by a mathematician called Shiyali Ramamrita Ranganathan. They were originally intended for classifying library books. If we adapt his principles for website contents instead:

  1. Websites are for use.
  2. Every viewer his or her page.
  3. Every page its viewer.
  4. Save the time of the viewer.
  5. The website is a growing organism.

Now obviously number 5 doesn't apply to every website - hopefully it applies to Voidspace though.

See the section on faceted classification about his PMEST (Personality, Matter, Energy, Space, Time) system of classification:

  Personality: what the object is primarily about. This is considered the "main facet".
  Matter: the material of the object.
  Energy: the processes or activities that take place in relation to the object.
  Space: where the object happens or exists.
  Time: when the object occurs.

I'm not convinced this has a great deal of relevance to tagging webpages - other than indirectly. I think the real problem is: how do we best relate pages to each other, so that the user is best able to get at the information they need? A good system will be easy to use - but will also prompt users to explore information that may be helpful to them. It will do more than just take them to the specific information they are looking for; it will present other pages that may be useful.
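One possible way of "presenting other pages that may be useful" is to rank pages by how many tags they share with the one being viewed. This is only a sketch of that idea: the page names, the tags and the related_pages helper are all made up for illustration, and nothing like it exists in rest2web.

```python
# Hypothetical page -> tags data; every name here is invented for the example.
page_tags = {
    "gallery.html": {"python", "images", "rest2web"},
    "plugins.html": {"python", "rest2web"},
    "configobj.html": {"python", "config"},
    "cyberpunk.html": {"articles"},
}


def related_pages(page, page_tags, limit=3):
    """Rank the other pages by how many tags they share with `page`."""
    own = page_tags[page]
    scored = []
    for other, tags in page_tags.items():
        if other == page:
            continue
        shared = len(own & tags)
        if shared:
            scored.append((shared, other))
    # Most shared tags first; ties are broken alphabetically.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [other for shared, other in scored[:limit]]


print(related_pages("gallery.html", page_tags))
```

Pages with no tags in common are simply left out, so an unrelated page never shows up as a suggestion.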

A 3D representation of a site has been suggested - with tags acting as nodes linking to many pages. Not only is this difficult to represent in a conventional browser, but you would probably need many dimensions to visualise all the possible links. Hmmm... Tagging (multi-categories) with an effective search mechanism seems like the best solution so far.

Mind you - by the time you see the bandwagon... it's too late....

Posted by Fuzzyman on 2005-07-25 13:01:04



Imagine This

As mentioned in my previous blog, I'm creating a plugin system for rest2web. The first plugin I'm working on is a gallery plugin. This is because my gallery is badly out of date and I'm not going to manually update it.

I know there are lots of gallery programs already out there. My personal favourite is GallerPy. The trouble is that all the ones I've seen generate pages dynamically. This means that the server needs Python and PIL installed. I don't see the point of regenerating the same pages over and over again. I'd also like to make it easy to add a description to the images, specify the order they appear, etc. In true hacker tradition I'm writing my own.

This is the first time I've used PIL - and I'm amazed at how easy it makes it to do my simple task.

The following snippet goes through a directory and creates a thumbnail of each image. It also saves the details of each image in a ConfigObj. This allows you to add a description to each image. If you're using ConfigObj 4 you can also specify the order.

import os
from configobj import ConfigObj
# image imports
# (classic PIL also allowed a plain ``import Image``)
from PIL import Image

# config data
thumb_size = (150, 150)
dir = 'gallery/'
thumb_dir = 'gallery/thumbnails'
thumb_prefix = 'tn_'
config_name = 'gallery.ini'

# make sure the thumbnail directory exists
if not os.path.isdir(thumb_dir):
    os.makedirs(thumb_dir)

try:
    # for backwards compatibility with ConfigObj3
    config = ConfigObj(config_name, flatfile=False)
except TypeError:
    # no such thing as a flatfile in ConfigObj 4
    config = ConfigObj(config_name)

# now go through all the images in the directory
the_list = os.listdir(dir)
for image in the_list:
    path = os.path.join(dir, image)
    # just in case :-)
    if os.path.isdir(path):
        continue
    name, ext = os.path.splitext(image)

    # create the image object
    im = Image.open(path)
    size = im.size

    force = False
    try:
        # do we already have details of the image ?
        this = config[image]
    except KeyError:
        # No !
        config[image] = {}
        this = config[image]
        this['title'] = name.replace('_', ' ').title()
        this['description'] = ''
        this['size'] = size
        this['filesize'] = os.path.getsize(path)
    else:
        # Yes - values read back from the file are strings
        x = int(this['size'][0])
        y = int(this['size'][1])
        if not (x, y) == size:
            # the image has changed since last time (size is different)
            this['size'] = size
            this['filesize'] = os.path.getsize(path)
            force = True

    thumb_name = os.path.join(thumb_dir, thumb_prefix + name + ext)
    if not os.path.isfile(thumb_name) or force:
        # this creates the thumbnail
        # easy hey !
        im.thumbnail(thumb_size)
        im.save(thumb_name)
    else:
        # the thumbnail already exists, so just open it to read its size
        im = Image.open(thumb_name)
    # let's save the details of the thumbnail as well
    this['thumb_size'] = im.size
    this['thumbnail'] = thumb_prefix + name + ext

config.write()
# we have now created all the thumbnails and saved the config file
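To give an idea of what you would then edit by hand, here is roughly what one entry in gallery.ini could look like after a run. The filename and all the values are invented for illustration, and the exact layout depends on your ConfigObj version.

```ini
[sunset_beach.jpg]
title = Sunset Beach
description = ""
size = 1200, 800
filesize = 245760
thumb_size = 150, 100
thumbnail = tn_sunset_beach.jpg
```

Adding a description is just a matter of filling in the description value. The next run finds the existing entry and leaves the title and description alone.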

Posted by Fuzzyman on 2005-07-25 11:02:23



Tagging is certainly an interesting subject. The aim is to relate the content of various pages in a website, in a way that makes it easier for users to find content they want.

Traditional search engines don't fulfil this need, because not everyone knows what they are looking for. What is needed is a way of organising the pages/content so that users can see what is available.

The conventional hierarchical 'sitemap' also doesn't fulfil this. A sitemap is organised like a tree and only allows pages to belong to one category. As soon as it is in one branch of the tree, a page can't be in another. Sometimes the choice is arbitrary - and if the user heads down one branch they may miss the information they are looking for because it is arbitrarily located on another branch of the tree.

The current best solution for this is 'tagging' that allows you to mark content as being in several categories at the same time.
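As a rough illustration of the difference from a tree, here is a minimal sketch of a tag index in Python. The page names and tags are made up, and build_tag_index is a hypothetical helper rather than anything in rest2web. The point is just that one page can sit under several tags at once.

```python
from collections import defaultdict

# Hypothetical page -> tags mapping; the page names and tags are invented.
page_tags = {
    "configobj.html": ["python", "config", "modules"],
    "gallery.html": ["python", "images"],
    "cyberpunk.html": ["articles", "cyberpunk"],
}


def build_tag_index(page_tags):
    """Invert a page -> tags mapping into a tag -> set-of-pages mapping."""
    index = defaultdict(set)
    for page, tags in page_tags.items():
        for tag in tags:
            index[tag].add(page)
    return dict(index)


tag_index = build_tag_index(page_tags)
# gallery.html is listed under both 'python' and 'images'
print(sorted(tag_index["python"]))
print(sorted(tag_index["images"]))
```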

rest2web now supports [1] an additional keyword 'tags' that allows you to specify a list of tags for each page. Currently you have to use it yourself though. I'm using it as the meta keywords information for each page. This doesn't help users find the page - but does help with SEO. A friend of mine has recently demonstrated the power of SEO by becoming the first result in Google for all sorts of interesting searches [2].
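Turning a list of tags into the meta keywords element takes very little code. The sketch below is not how rest2web templates actually do it. The meta_keywords function is a hypothetical helper, shown only to make the idea concrete.

```python
from html import escape


def meta_keywords(tags):
    """Return a <meta> keywords element for a list of tags."""
    # Drop empty tags and surrounding whitespace before joining.
    content = ", ".join(tag.strip() for tag in tags if tag.strip())
    return '<meta name="keywords" content="%s" />' % escape(content, quote=True)


print(meta_keywords(["python", "rest2web", "tagging"]))
```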

There is a tagging spec developed by Technorati. It relies on marking a link as a tag - with the link pointing to a reference for the tag. It's always nice to use standards, but I'm not sure about this one.

I'd like to develop an application that allows dynamic searching of tags [3]. This will initially be a simple CGI without AJAX. A plugin to rest2web will create a data file for all the pages - which the CGI will search, allowing you to specify several tags and see all the pages which have all of those tags. It will need some kind of tag cloud. I'd rather the link associated with the tag pointed to the search application. Hmm.....
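The searching itself is straightforward with sets. Here is a minimal sketch, assuming the plugin's data file has already been loaded into a dictionary mapping each page to its set of tags. The data, the page names and both function names are hypothetical. The tag counts are the raw numbers a tag cloud could use to size each tag.

```python
from collections import Counter

# Hypothetical data, standing in for whatever the plugin's data file holds.
page_tags = {
    "configobj.html": {"python", "config", "modules"},
    "gallery.html": {"python", "images", "rest2web"},
    "plugins.html": {"python", "rest2web"},
}


def pages_with_all_tags(page_tags, wanted):
    """Return the pages that carry every tag in `wanted`."""
    wanted = set(wanted)
    return sorted(page for page, tags in page_tags.items() if wanted <= tags)


def tag_counts(page_tags):
    """Count how many pages use each tag (the weights for a tag cloud)."""
    return Counter(tag for tags in page_tags.values() for tag in tags)


print(pages_with_all_tags(page_tags, ["python", "rest2web"]))
print(tag_counts(page_tags).most_common(2))
```

Each extra tag in the search can only narrow the result, which is what you want when the user keeps adding tags.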

In order to allow this rest2web needs to collate the tag information for all the pages in a site. I'm creating a plugin system that will make this easy(ish) to implement. The first plugin will be a gallery - because the gallery section of my website badly needs updating.

[1] The version now in SVN.
[2] E.g. cyberpunk articles - which is pretty impressive.
[3] This will be in addition to my normal hierarchical sitemap. Having rest2web build this automatically for me is going to require another plugin.

Posted by Fuzzyman on 2005-07-25 09:20:09


Happy Anniversary

Actually I missed it by about three weeks. I've just been browsing my old blog and discovered the entry from July 3rd 2003 - 2 years ago.

I've started to learn Python for a new Atlantis Project - I'll let you know as soon as I can do anything interesting with it !!

So I've been programming in Python for just over two years now - my first programming after a break of ten years. (When I used to program the Amiga, sigh - those were the days).

Posted by Fuzzyman on 2005-07-24 15:39:35

