Python Programming, news on the Voidspace Python Projects and all things techie.

#24

Beauty in Simplicity

I'm converting the contents of my website using docutils. There are hundreds of pages of badly marked up content, of which I want to keep a fair proportion. This is from the days when I was learning HTML.

Luckily using reST I can just copy and paste the text contents from the browser. Because reST is so simple, only the minimum of markup is needed. The next step is to use the html_parts function to generate the HTML, and insert it (along with doc title and breadcrumbs [1]) into a template.

I'm still learning about reST though, and I needed to do something that looked like a blockquote. To look right it needed to be indented - and reST uses indentation as part of its syntax. I trawled through the docs trying to find out how to mark up a passage to be displayed as a blockquote. Eventually I came across this gem:

A text block that is indented relative to the preceding text, without markup indicating it to be a literal block, is a block quote.

In other words - just leaving it alone caused it to be marked up correctly. Nice one.
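Here's a minimal sketch of that workflow in present-day Python 3, assuming docutils is installed. It calls `docutils.core.publish_parts` rather than the html_parts helper mentioned above, and the template, the `render_page` function and the breadcrumb format are all made up for illustration - they aren't the ones used on this site. Note that the indented paragraph in the sample source needs no markup at all to come out as a block quote.

```python
from html import escape
from string import Template

# Hypothetical page template; the real site template is not shown in this post.
PAGE_TEMPLATE = Template("""\
<html>
<head><title>$title</title></head>
<body>
<p class="breadcrumbs">$breadcrumbs</p>
$body
</body>
</html>
""")

SAMPLE_REST = """\
An ordinary paragraph.

    An indented paragraph, with no other markup,
    comes out as a block quote.
"""


def rest_to_body(rest_source):
    """Convert reST source to an HTML fragment (needs docutils installed)."""
    from docutils.core import publish_parts
    return publish_parts(rest_source, writer_name="html")["html_body"]


def render_page(title, trail, body_html):
    """Insert the title, breadcrumb trail and HTML body into the template.

    `trail` is a list of (label, url) pairs, outermost page first.
    """
    crumbs = " &gt; ".join(
        '<a href="%s">%s</a>' % (escape(url, quote=True), escape(label))
        for label, url in trail
    )
    return PAGE_TEMPLATE.substitute(
        title=escape(title), breadcrumbs=crumbs, body=body_html
    )
```

Something like `render_page("Sample", [("Home", "/index.shtml")], rest_to_body(SAMPLE_REST))` then gives a complete page. Restyling later means changing only `PAGE_TEMPLATE` and re-running the conversion.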

The advantage of this approach is that restyling in the future will be much easier - I just modify the template, or the HTML generation tools, and run it over the document tree. I'm not sure yet if I'll autogenerate the indexes; I'll still need to write the descriptions even if I do.

Even including images and blocks with different CSS classes is easy in reST. There's very little I won't be able to do with it.

[1] Ain't that a cute name for navigation trails? I only recently came across it.

Posted by Fuzzyman on 2005-04-08 17:24:11


#23

Get It While it's Hot

Get your luvverly fresh updates.... Yes it's that time of year again. Spring is in the air and things are changing over at the Voidspace Python Emporium [1].

Lots of updates, big and little, to various of the Python modules here. There'll be an announcement over at comp.lang.python.announce shortly, but here's a summary.

Last but not least, there's a new plugin for Firedrop. It's called FireMail - and it allows you to send your blog posts as emails. I've also updated FireSpell so that you can configure the language (amongst other improvements). See the docs (and download them) over in the Firedrop thingy.

[1] Sorry about that, couldn't help it.

Posted by Fuzzyman on 2005-04-07 14:32:45


#22

Aargh....

This is not a good time for this to happen: the FTP server of my website is truncating files. This means my blog files are all cut short. Normally it wouldn't matter - except Daily Python URL have just featured two of my blog entries. About 600 visitors will have just decided my blog is rubbish!!

Website Woes

Unfortunately my host wasn't too gracious when I complained about the problem. He (they? I'm actually not sure) claimed the problem was at my end. They don't handle problems very well - but on the other hand I only paid them $49 for the year and I'm using almost 20 GB of bandwidth every month. They also installed Python 2.3.4 and various extensions for me. Finding a professional host that offers all that will cost me at least five times as much.

Luckily the problem might be resolving itself. A customer of TBS might be hiring me to build his website and a custom gallery application (which I'll be able to open source). I may well be able to piggyback on the hosting package I arrange for him.

Whilst We're On the Subject

There has been some discussion on the docutils mailing list about using reST to build websites. I'm getting close to needing to convert a lot of articles to the new design - so I'll have to build something like this. Watch this space :-)

Posted by Fuzzyman on 2005-04-06 21:57:31


#21

Back to the Days of Mosaic

The Problem

Our internet at work is heavily restricted. Very heavily - it's whitelist only, which means 1% of 1% of 1% of the web. We have a single computer enabled for unrestricted web access.

As well as being a huge inconvenience, this is a great challenge. There are various possible solutions - all requiring access to a server 'on the outside'. As we are whitelist only, this will take a little bit of stealth and time. In the meantime I've found an unusual, temporary solution.

Now all google domains are allowed (which thankfully gives me access to comp.lang.python). Unfortunately, attempting to fetch pages from the google cache via the web interface usually fails. It accesses them by IP address - which IPCop blocks. Mercifully when I use the excellent pygoogle interface to the google web api it does return the cached page (if google has it). This gave me an idea.

Breaking Out in 50 Lines of Code

I've hacked up an implementation of a proxy server based on SimpleHTTPServer. Run it and set your browser proxy settings to localhost:8000. Any pages requested are then fetched from the google cache.

Retro Internet

Because the google API limits the number of fetches we can make (1000 per day), we restrict the pages we fetch to those that google is likely to have cached. This means pages that are plain text or html - forget javascript, CSS, images, etc. This is for those in similar situations - or those who want a real retro internet experience. This is back to the days of Mosaic. Not bad for many pages though. Here's www.python.org through the proxy. Even better, when I checked PlanetPython.org - it was yesterday's version, blimey.

[Screenshot: python.org through the google proxy]
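The "only fetch what's likely to be cached" check can be sketched on its own. This is a Python 3 variant for illustration, not the code from the script below: the names `CACHED_TYPES` and `worth_fetching` are made up, and it looks at the extension of the last path component rather than everything after the last dot in the path.

```python
import posixpath
from urllib.parse import urlsplit

# Extensions assumed likely to return text or HTML, mirroring the script's list.
CACHED_TYPES = {"txt", "html", "htm", "shtml", "shtm", "cgi", "pl", "py",
                "asp", "php", "xml"}


def worth_fetching(url):
    """Return True if the URL looks like a page google may have cached."""
    path = urlsplit(url).path                # drops the query string and fragment
    ext = posixpath.splitext(path)[1]        # e.g. ".html", or "" if there is none
    if not ext:
        return True                          # no extension: probably a page
    return ext[1:].lower() in CACHED_TYPES   # compare without the leading dot
```

With this, a request for a page or a bare directory URL goes through to the cache, while requests for images and stylesheets are turned away before they use up one of the 1000 daily fetches.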

Here's the Code

If you exclude comments and docstrings, there are actually 45 lines of code following. Update: it's now slightly longer - possibly 50 lines - as it now bypasses the google cache for anything in google domains, which allows you to do google searches!

# 2005/04/05
# v0.1.3
# googleCacheServer.py

# A simple proxy server that fetches pages from the google cache.

# Homepage : http://www.voidspace.org.uk/python/index.shtml

# Copyright Michael Foord, 2004 & 2005.
# Released subject to the BSD License
# Please see http://www.voidspace.org.uk/python/license.shtml

# For information about bugfixes, updates and support, please join the Pythonutils mailing list.
# http://groups.google.com/group/pythonutils/
# Comments, suggestions and bug reports welcome.
# Scripts maintained at http://www.voidspace.org.uk/python/index.shtml
# E-mail fuzzyman@voidspace.org.uk

"""
This is a simple implementation of a proxy server that fetches web pages
from the google cache.

It is based on SimpleHTTPServer.

It lets you explore the internet from your browser, using the google cache.
See the world how google sees it.

Alternatively - retro internet - no CSS, no javascript, no images, this is back to the days of MOSAIC !

Run this script and then set your browser proxy settings to localhost:8000

Needs google.py (and a google license key).
See http://pygoogle.sourceforge.net/
and http://www.google.com/apis/

Tested on Windows XP with Python 2.3 and Firefox/Internet Explorer
Also reported to work with Opera/Firefox and Linux

Because the google api will only allow 1000 accesses a day we limit the file types
we will check for.

A single web page may cause the browser to make *many* requests.
Using the 'cached_types' list we try to only fetch pages that are likely to be cached.

We *could* use something like scraper.py to modify the HTML to remove image/script/css URLs instead.

Some useful suggestions and fixes from 'vegetax' on comp.lang.python
"""


import google
import BaseHTTPServer
import shutil
from StringIO import StringIO       # cStringIO doesn't cope with unicode
import urlparse


__version__ = '0.1.3'

cached_types = ['txt', 'html', 'htm', 'shtml', 'shtm', 'cgi', 'pl', 'py',
                'asp', 'php', 'xml']
# Any file extension that returns a text or html page will be cached
google.setLicense(google.getLicense())
googlemarker = '''<i>Google is not affiliated with the authors of this page nor responsible for its content.</i></font></center></td></tr></table></td></tr></table>\n<hr>\n'''
markerlen = len(googlemarker)

import urllib2
# uncomment the next three lines to over ride automatic fetching of proxy settings
# if you set localhost:8000 as proxy in IE urllib2 will pick up on it
# you can specify an alternate proxy by  passing a dictionary to ProxyHandler
##proxy_support = urllib2.ProxyHandler({})
##opener = urllib2.build_opener(proxy_support)
##urllib2.install_opener(opener)

class googleCacheHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    server_version = "googleCache/" + __version__
    cached_types = cached_types
    googlemarker = googlemarker
    markerlen = markerlen
    txheaders = { 'User-agent' : 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)' }

    def do_GET(self):
        f = self.send_head()
        if f:
            self.copyfile(f, self.wfile)
            f.close()

    def send_head(self):
        """Only GET implemented for this.
        This sends the response code and MIME headers.
        Return value is a file object, or None.
        """

        print 'Request :', self.path # traceback to sys.stdout
        url_tuple = urlparse.urlparse(self.path)
        url = url_tuple[2]
        domain = url_tuple[1]
        if domain.find('.google.') != -1:   # bypass the cache for google domains
            req = urllib2.Request(self.path, None, self.txheaders)
            self.send_response(200)
            self.send_header("Content-type", 'text/html')
            self.end_headers()
            return urllib2.urlopen(req)

        dotloc = url.rfind('.') + 1
        if dotloc and url[dotloc:] not in self.cached_types:
            self.send_error(404, 'File type not fetched from the cache')
            return None     # not a cached type - don't even try

        print 'Fetching :', self.path # traceback to sys.stdout
        thepage = google.doGetCachedPage(self.path) # XXXX should we check for errors here ?
        headerpos = thepage.find(self.googlemarker)
        if headerpos != -1:
            pos = self.markerlen + headerpos
            thepage = thepage[pos:]

        f = StringIO(thepage)       # turn the page into a file like object

        self.send_response(200)
        self.send_header("Content-type", 'text/html')
        self.send_header("Content-Length", str(len(thepage)))
        self.end_headers()
        return f

    def copyfile(self, source, outputfile):
        shutil.copyfileobj(source, outputfile)


def test(HandlerClass = googleCacheHandler,
         ServerClass = BaseHTTPServer.HTTPServer):
    BaseHTTPServer.test(HandlerClass, ServerClass)


if __name__ == '__main__':
    test()

Shhhhh Don't Tell Anyone

Of course if my sysadmin sees this... it'll stop working :-)

Posted by Fuzzyman on 2005-04-05 11:39:16


#20

You Have Outgoing Mail

What's that quote - any sufficiently advanced program evolves the ability to send email? Well, Firedrop just became that advanced.

The new version with plugins is now released - and I've just finished FireMail. FireMail adds the ability to email an entry to a specified e-mail address - as HTML or text. This was something I really missed from Blogger.

In the new version Python source colouring is built in to Firedrop. Here's the email function I use (along with a function from the new Python Cookbook to construct the HTML email):

import smtplib

def mailme(to_email, msg, from_email=None, host='mail', port=25, user=None, password=None, html=True, subject=None):
    """Email function for an smtp server.
    Needs hostname, username, password etc.
    Takes either a single to_email or a list, and a single from_email
    Can either receive an HTML email (output by 'createhtmlmail') or will build message headers
    Pass in html=False keyword for it to build headers.
    """

    if isinstance(to_email, basestring):
        to_email = [to_email]       # accept a single address as well as a list

    head = "To: %s\r\n" % ','.join(to_email)
    if from_email is not None:
        head = head + ('From: %s\r\n' % from_email)
    if not html:
        if subject is not None:
            head = head + ("Subject: %s\r\n" % subject)
        head = head + "\r\n"        # a blank line ends the headers
    msg = head + msg

    server = smtplib.SMTP(host, port)
    if user:
        server.login(user, password)
    server.sendmail(from_email, to_email, msg)
    server.quit()
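The Cookbook function that builds the HTML email isn't reproduced here. As a stand-in, here's a rough Python 3 sketch of the same idea using the standard library's email package. `build_html_mail` is a made-up name and this is not the Cookbook recipe - just one way to produce a message carrying both an HTML and a plain text version.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_html_mail(subject, html, text):
    """Return a multipart/alternative message, as a string.

    Only the Subject header is set here; To and From are left for
    whatever sends the message.
    """
    msg = MIMEMultipart("alternative")
    msg["Subject"] = subject
    # Plain text goes first and HTML last: mail clients normally show
    # the last alternative they are able to display.
    msg.attach(MIMEText(text, "plain", "utf-8"))
    msg.attach(MIMEText(html, "html", "utf-8"))
    return msg.as_string()
```

The string that comes back already has its own Subject and MIME headers, which is roughly the shape of input that mailme expects when html=True.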

Posted by Fuzzyman on 2005-04-04 10:59:47


#19

First Time Flasher

This weekend I flashed a BIOS for the first time. It was for Aidan, who had killed his 20 gigabyte hard drive. Bereft of the solitaire that she played on his computer, his gran bought him a nice new 120 gigabyte hard drive. Needless to say, his six-year-old motherboard went into a sulk and refused to boot with it plugged in.

Now, flashing a BIOS is a nerve-racking process. Get it wrong and you can't use the computer at all. Thankfully Gigabyte [1] (manufacturers of the motherboard) made it almost problem free. Their website is very easy to navigate and it was easy to find the right BIOS image, even for a piece of hardware that is pretty obsolete. They also had clear instructions on how to do the upgrade.

I say almost problem free..... when we ran the BIOS upgrade program from a DOS boot disk it warned us that the BIOS image we had didn't match the hardware. According to their instructions the next screen ought to have given us an option to back up the existing BIOS - so I merrily hit continue... which proceeded to flash the BIOS anyway!! Thankfully it was the correct image and the new hard drive is successfully installed.

The Panda Replies

I had a very nice reply to my blog from Josh Yelon, who is the maintainer for Panda3D. He says that although Panda3D doesn't natively support open source modelling tools, there is work afoot on a Blender exporter, and a program called MilkShape 3D might be able to produce DirectX "X" files which can be converted. It's not overly good news, but if the Blender support gets there then I might have another look at Blender. Last time I looked, the UI seemed so unintuitive that I threw my hands up in horror. 3D modelling needn't be over-complex [2] - but Blender clearly states that they're not interested in newbies. Oh well.

It seems that Josh saw my blog entry through getting some referrals from it. This is probably because this blog is now listed at Planet Python. If I was to stick with tradition, I ought to start making lots of off topic and irrelevant blogs ;-) I suppose I haven't done too badly by starting with the tales of my BIOS exploits.

Starting to Cook

On a quick flick through I wasn't overly impressed with the Python Cookbook. Some of the early recipes covered dealing with whitespace through the aString.strip() method - and such complexities. However, now that I'm getting into it I'm finding it extremely useful. It's already solved a problem I had with binary uploads being terminated on Windows (need to switch to binary upload mode first!), and it has a neat recipe on sending HTML emails. This is becoming the new FireMail extension for Firedrop - so I can email new posts to Void-Shockz like I could with Blogger. Whilst we're on the Firedrop subject.... the new version (with plugins) is now available over at http://zephyrfalcon.org/download

[1] I would give you a link to their website, but I'm not online at home where I'm typing this blog. Offline blogging is one of the great features of Firedrop of course :-)
[2] At least Real3D on the Amiga was nice to use :-) An old program, Extreme 3D (for the PC), wasn't bad either.

Posted by Fuzzyman on 2005-04-03 15:09:57

