Python Programming, news on the Voidspace Python Projects and all things techie.

#67

A Tale of a Cheap Host

It's a good job I was aware of the risks, but my cheapo host Sapphiresoft.co.uk has just stiffed me. Despite their appalling lack of technical ability, my site was down surprisingly little during the time I was with them. A couple of weeks ago their site went down, along with all the sites they hosted. The outage lasted about a week, and I jumped ship.

I've probably lost a few emails, and a bit of data that's generated dynamically by my web apps. I kept a very good backup of everything though. Their site is back up now, but my account never came back online. When I asked why, they claimed I was running an IRC bot (on a cheapo shared hosting account!). I asked for my data back, and got this reply:

Terms are Terms and not to obey them is not our Fault, how serious it can affect us is the point of concern here and that is why with an immediate effect your account has been suspended and can't be restored. You are entitled to the fine of $500 if you are ready to clear the fine we can give your 24 hrs temporary access to your data to backup it. Also, make sure that mostlikely within next 48 hrs your account will be completely removed from the servers.

I think the legal term for that is extortion!

Due to circumstances, I had little choice but to use a very cheap host when I set up with them. Thanks to a website I've put together for someone else I now have a bit more choice of replacement host. The options seem to be:

  • A more reliable cheap host, like streamline.net [1]
  • A managed account like Python-hosting.com, that allows me to experiment with a web application framework
  • A virtual server account from someone like unixshell.com, where I have to install and configure everything

Hmmm.......

[1] And use the time to learn how to use the facilities a virtual server would offer.

Posted by Fuzzyman on 2005-07-01 08:50:03


#66

A Word from the Editor

I'm now mainly using SPE as my IDE. A chap called Nicola Larosa [1] has forced me to clean up my coding style (no long lines) - so I needed to complete my move from IDLE (which has no hard right edge) to a proper IDE (tabbed multiple files, with integrated shell). SPE is very nice - I've used it before, but never had the impetus to switch over properly. The fact that it loads quicker than Komodo [2] was the deciding factor in choosing SPE.

I struggled with SPE at first. I didn't realise that enter behaved differently from return. With enter, no indentation is done - but it looked to me like sometimes indentation would work, and sometimes it wouldn't.... One quick change of habits later, and all is fine again.

I still miss project management [3] - but the file browser is nearly as good. Another thing I'd like to see in the future - integration with SVN!

SPE also has Kiki built in to it, for working with regular expressions. Unfortunately it's not much use to me. I usually edit regexes in VERBOSE mode - over several lines. This makes it quicker to test them with the shell, where I can just paste several lines.

My only other minor gripe with SPE (so far) is that it attempts auto completion inside triple quotes. It's amazing how often you use the period (triggering auto completion) when typing comments. As far as I can tell, it also lacks a way of converting between line endings [4].
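For what it's worth, the conversion itself only takes a few lines. This is a minimal sketch rather than anything SPE provides: the function name `normalize_newlines` is made up for illustration, and it is written for current Python rather than the 2.x of this post.

```python
def normalize_newlines(data, newline=b"\n"):
    """Convert CRLF, lone CR and LF line endings to a single style."""
    # Collapse CRLF first, so its CR is not counted as a second line break.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", newline)


mixed = b"one\r\ntwo\rthree\nfour"
print(normalize_newlines(mixed))            # everything becomes LF
print(normalize_newlines(mixed, b"\r\n"))   # everything becomes CRLF
```

Working on bytes (a file opened in binary mode) avoids any newline translation being done behind your back while reading.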

Anyway, that has all sounded far more negative than I intended. The bottom line is that I've chosen an IDE - and SPE has come out on top. No program is perfect, and one of the great advantages of SPE is that it is under constant development, and Stani is very open to suggestions. It's no surprise it's the editor that's bundled with Movable Python. As soon as the next version of SPE is released, I think it's time to bring Movable Python up to date with the latest releases of Python.

One Last Thing

I've been listening to the audio version of the ITC interview with Guido, about the development of Python and the Python community. It's very interesting - stick this in your iPod and chew on it. Part I is a 24.5 MB mp3, Part II is a bit longer - and I'm in the process of downloading it now.

One of the really interesting parts is when the guy who introduces Guido explains some of the things he's doing with Python. He's doing an application that uses medusa, PIL, and reportlab... hmm.... So medusa is still useful, hey? Last time I looked at medusa it was full of code written for Python 1.5 [5].

Another Last Thing

The experimental replacement for ConfigObj is growing. The initial implementation was in about 160 lines of code. I'm now up to about 300, but the core has hardly changed at all [6]. I'm adding string interpolation, reading in list values, proper error handling (etc etc). I'm finding it quicker to work my way through the list of features than I expected.


[1] He's been helping me with rest2web. He actually cleaned up a lot of the code himself.
[2] Although I like the wavy underlines that highlight errors and warnings in Komodo. If I used the debugger more that might swing it too. Stani is looking for a debugger to integrate with SPE; if he can't find one he's threatening to write one himself...
[3] Which Komodo has.
[4] This means that when I get a file with mixed line endings I have to use another tool.
[5] I am looking for an easy way into web frameworks. Someone told me that Medusa plus Quixote was probably the best way to start. Eventually I'd like to get Twisted.
[6] It quite elegantly handles nested sections.

Posted by Fuzzyman on 2005-07-01 08:19:52


#65

Regular Expressions

Regular expressions are evil, right?

Well, one of the reasons they can be so intimidating is that they look like Perl... I mean they're hard to read. Well, they don't have to be that hard to read.

I accidentally stumbled across the ONLamp article Five Habits for Successful Regular Expressions. He lays out a few guidelines, which include using the non greedy operators and also using the re.VERBOSE setting to make your regular expressions readable. Sound advice. My new config file reader uses regular expressions. The one for detecting a section marker line looks like:

import re

sectionmarker = re.compile(r'''^
(\s*)                   # indentation
\[\s*
(                       # section name
    (?:".+?")|          # double quotes
    (?:'.+?')|          # single quotes
    (?:[^'"].*?)        # unquoted
)
\s*\]
(\s*\#.*)?$             # optional comment
''', re.VERBOSE)

This pulls out the indentation level, section name, and any comment in three groups.
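Here is a quick check of those three groups, using the same pattern on a few made-up lines. The sample lines are illustrative only, and the snippet is written to run on current Python.

```python
import re

sectionmarker = re.compile(r'''^
(\s*)                   # indentation
\[\s*
(                       # section name
    (?:".+?")|          # double quotes
    (?:'.+?')|          # single quotes
    (?:[^'"].*?)        # unquoted
)
\s*\]
(\s*\#.*)?$             # optional comment
''', re.VERBOSE)

samples = [
    '[section 1] # comment',
    '    [section 2 sub 1]',
    '["quoted name"]',
    'key = value',              # not a section marker at all
]
for line in samples:
    match = sectionmarker.match(line)
    # groups() is (indentation, section name, comment or None)
    print(repr(line), '->', match.groups() if match else None)
```

Note that a quoted section name comes back with its quotes still attached, and the comment group is None when there is no comment.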

Warning

You have to remember to escape any # symbols you use when in verbose mode....

It took me a while to track that one down.
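To see why it bites, here is a small sketch of the pitfall. The `colour:` pattern is invented purely for the demonstration. In verbose mode an unescaped # (outside a character class) starts a comment, so the rest of that line of the pattern silently disappears.

```python
import re

# Unescaped: everything from '#' to the end of the line is a comment,
# so this pattern is really just "colour:\s*".
unescaped = re.compile(r'''
    colour:\s*
    #[0-9a-f]{6}        this whole line is silently ignored
''', re.VERBOSE)

# Escaped: '\#' is a literal hash character.
escaped = re.compile(r'''
    colour:\s*
    \#[0-9a-f]{6}       # a six digit hex colour
''', re.VERBOSE)

text = 'colour: not a colour'
print(unescaped.match(text) is not None)    # matches, which is the bug
print(escaped.match(text) is not None)      # correctly fails to match
print(escaped.match('colour: #ff8800').group())
```

The nasty part is that the broken pattern still compiles and still matches things, just not the things you meant.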

Posted by Fuzzyman on 2005-06-28 13:22:53


#64

Other News and Stuff

There is now a mailing list for rest2web development. This will probably consist of myself and Nicola Larosa discussing the ways to implement tagging and site maps and improve the support for multiple translations. See:

http://lists.sourceforge.net/mailman/listinfo/rest2web-develop

The latest release of rest2web can also now be downloaded from Sourceforge [1].

Thanks to the Google Summer of Code, there will now be two developers working on Wax over the summer. Fantastic news. For those who don't know, Wax is a very friendly GUI toolkit built on wxWidgets.

Pythonware-Daily-URL has had a slow few weeks. During this time I've discovered the joys of the Del.icio.us Python Tags - this is interesting Python stuff, as discovered by Python users across the globe. Let other people do the hunting for you.

And in another delightful snippet, it turns out that Google highly rates my Python programs. Search for useful python programs to see what I mean...

[1] Word from Sourceforge is that they see moving over to SVN as a priority. That's also very good news for open source projects.

Posted by Fuzzyman on 2005-06-28 13:10:37


#63

Blimey, That Was Quick

The slow progression towards ConfigObj 4 is actually happening. I now have a working implementation of a config file reader that can read config files with nested sections. Subsections are nested by indentation.

The initial reader is done in about 160 lines of code and is quite elegant. It's a re-write from the ground up and uses a couple of regexes to do its parsing, which makes it much faster than ConfigObj 3.

There is a huge list of features to add now - not least of which is the write and writein methods.

Many thanks to Nicola Larosa for helping me work through the specs. As usual, actually creating the reader helped me hammer out the details. The new version will break backwards compatibility in several ways. The first of these may be controversial - we're gaining case sensitivity. The new implementation, as well as adding nested sections, allows us to lose the empty section and the difference between flatfiles and non-flatfiles. You just have a straight dictionary interface to all config files.

The new version will read all current config files without change. You may need to change your code to use it though....
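To give an idea of what a straight dictionary interface means in practice, this is the shape the reader in the listing produces for its second test config, written out by hand as plain dicts (all values stay as strings at this stage). The `walk` helper is hypothetical, just a way of showing the nesting, and the snippet is written to run on current Python.

```python
# Hand-written equivalent of the parsed second test config.
config = {
    'key1': 'val1',
    'key2': 'val2',
    'key3': 'val3',
    'section 1': {'keys11': 'val1', 'keys12': 'val2', 'keys13': 'val3'},
    'section 2': {
        'keys21': 'val1',
        'keys22': 'val2',
        'keys23': 'val3',
        'section 2 sub 1': {'fish': '3'},
    },
}

# Sections are just dictionaries inside dictionaries.
print(config['key1'])
print(config['section 2']['section 2 sub 1']['fish'])


def walk(section, depth=0):
    """Return an outline of a section, one line per keyword or subsection."""
    lines = []
    pad = '    ' * depth
    for name, value in section.items():
        if isinstance(value, dict):
            lines.append('%s[%s]' % (pad, name))
            lines.extend(walk(value, depth + 1))
        else:
            lines.append('%s%s = %s' % (pad, name, value))
    return lines


print('\n'.join(walk(config)))
```

A side effect of this design is that a keyword and a subsection can't share a name within the same section, since they live in the same dictionary.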

For those who can't wait, this is the reader. It doesn't currently extract comments or lists from values - it just separates keyword from value:

# configreader.py
# 26th June 2005
# An experimental config file reader that supports
# nested sections in config files.

# Will become ConfigObj 4

# Copyright Michael Foord, 2004 & 2005.
# Released subject to the BSD License
# Please see http://www.voidspace.org.uk/python/license.shtml

# For information about bugfixes, updates and support, please join the
# Pythonutils mailing list.
# http://groups.google.com/group/pythonutils/
# Comments, suggestions and bug reports welcome.
# Scripts maintained at http://www.voidspace.org.uk/python/index.shtml
# E-mail fuzzyman@voidspace.org.uk

import os
import sys
import re
# This is a workaround for python 2.2 (Which doesn't have basestring)
try:
    basestring
except NameError:
    basestring = (unicode,str)


class Section(dict):
    """
    A dictionary like object that represents a section in a config file.
    """

    def __init__(self, parent, indent, main, *args, **kwargs):
        """
        parent is the section above (or None)
        indent is the indent level of this section
        main is the main ConfigObj
        """

        dict.__init__(self, *args, **kwargs)
        self.parent = parent
        self.main = main
        self.indent = indent
        self.sequence = []


class ConfigObj(Section):
    #
    indent_dict = { ' ' : '\t', '\t' : ' ' }
    #
    def __init__(self, infile, options=None):
        if options is None:
            options = {}
        Section.__init__(self, self, 0, self)
        #
        keyword = re.compile(r'''^
        (\s*)                   # indentation
        (                       # keyword
            (?:".+?")|
            (?:'.+?')|
            (?:[^'"\s=][^=\s]*?)
        )
        \s*=\s*                 # divider
        (.*)$                   # value
        ''', re.VERBOSE)
        #
        sectionmarker = re.compile(r'''^
        (\s*)                   # indentation
        \[\s*((?:".+?")|        # section name - double quotes
            (?:'.+?')|          # single quotes
            (?:[^'"].*?))       # unquoted
        \s*\]
        (\s*\#.*)?$               # optional comment
        ''', re.VERBOSE)
        #
        if isinstance(infile, basestring):
            self.filename = infile
            self.filepath = os.path.abspath(infile)
            # XXXX generator instead ?
            infile = open(self.filepath).readlines()
        #
        # strip trailing '\n' from lines
        infile = [line.rstrip() for line in infile]
        #
        # initialise a few variables
        self.errors = []
        this_section = self
        self.indent_type = None
        #
        maxline = len(infile) - 1 # 2 ?
        self.index = -1
        while self.index < maxline:
            self.index += 1
            line = infile[self.index]
            sline = line.strip()
            # do we have anything on the line ?
            if not sline or sline.startswith('#'):
                continue
            mat = sectionmarker.match(line)
#            print sline, mat
            if mat is not None:
                # is a section line
                indent, sec_name, comment = mat.groups()
                try:
                    self.check_indent(indent)
                except SyntaxError:
                    raise SyntaxError, 'Mixed indentation line %s.' % self.index
                indent_len = len(indent)
                if indent_len == this_section.indent:
                    # the new section is in the parent of the current section
                    parent = this_section.parent
                elif indent_len > this_section.indent:
                    # the new section is *in* the current section
                    parent = this_section
                else:
                    # the new section is dropping back to a previous level
                    try:
                        parent = self.match_indent(this_section,
                                    indent_len).parent
                    except SyntaxError:
                        raise SyntaxError, 'Indent level doesn\'t ' \
                                            'match at line %s.' % self.index
                # look up the section name itself, not the literal 'sec_name'
                if parent.has_key(sec_name):
                    raise SyntaxError, 'Duplicate section ' \
                                            'name at line %s.' % self.index
                this_section = Section(parent, indent_len, self)
                parent[sec_name] = this_section
                parent.sequence.append(sec_name)
                continue
            #
            mat = keyword.match(line)
#            print sline, mat
            if mat is not None:
                # is a keyword value
                # value will include any inline comment
                indent, key, value = mat.groups()
                try:
                    self.check_indent(indent)
                except SyntaxError:
                    raise SyntaxError, 'Mixed indentation line %s.' % self.index
                indent_lev = len(indent)
                if this_section.indent != indent_lev:
                    try:
                        this_section = self.match_indent(this_section,
                                                            indent_lev)
                    except SyntaxError:
                        raise SyntaxError, 'Indent level doesn\'t ' \
                                            'match at line %s.' % self.index
                # check the parsed key (not the ``keyword`` regex),
                # whether or not the indent level changed
                if this_section.has_key(key):
                    raise SyntaxError, 'Duplicate keyword ' \
                                            'name at line %s.' % self.index
                this_section[key] = value

    def check_indent(self, indent):
        """
        Check for consistent indentation.
        Raise a ``SyntaxError`` if indentation is incorrect or mixed.
        Set the ``self.indent_type`` attribute if not already set.
        """

        if self.indent_type is not None:
            if self.indent_dict[self.indent_type] in indent:
                raise SyntaxError, 'Mixed indentation line %s.' % self.index
        elif ' ' in indent and '\t' in indent:
            raise SyntaxError, 'Mixed indentation line %s.' % self.index
        elif indent:
            self.indent_type = indent[0]

    def match_indent(self, this_section, indent_len):
        """
        Given a section and an indent level,
        walk back through the sections parent to see if the indent level
        matches a previous section.

        Return a marker to the right section,
        or raise a SyntaxError.
        """

        sect = this_section
        while indent_len < sect.indent:
            if sect is sect.parent:
                # we've reached the top level already
                raise SyntaxError
            sect = sect.parent
        if sect.indent == indent_len:
            return sect
        # indentation didn't match - too bad
        raise SyntaxError


if __name__ == '__main__':
    testconfig1 = """
key1= val
key2= val
[lev 1]
key1= val
key2= val
[lev1b]
key1= val
key2= val
    [lev2]
    key1= val
    [lev2-section1]
    key1= val
[lev1c]
    [lev2]
        [lev2-section1]
    key1= val
"""


    testconfig2 = """
key1 = val1
key2 = val2
key3 = val3
[section 1] # comment
keys11 = val1
keys12 = val2
keys13 = val3
[section 2]
keys21 = val1
keys22 = val2
keys23 = val3
    [section 2 sub 1]
    fish = 3
"""


    print ConfigObj(testconfig1.split('\n'))
    print
    print ConfigObj(testconfig2.split('\n'))


##################################################

"""
Need code to check if the root section has an indentation > 0
    (default 'this_section' has indent=0 set)
Feature ? (or limitation) - a keyword can't have the same name as a section.
TODO
    list values
    String Interpolation
    write method
    writein method
    configspec
    validate
    make ``Section`` an ordered dict - that follows the sequence attribute
    Store comments from the same line
    Store comments from the line above (several lines of comments ?)
    Preserve start and end comments.
    Allow triple quoted multiple line values.
    Allow multiline comments.
    infile can be a filename, a list of lines, a file, or a StringIO
        (any bounded iterable ? - could we allow a string ?)
    Store errors and re-raise *after* parsing
    Unicode support - (needed ?)
    Remove (and preserve ?) unicode signatures (BOM)
    A walker method - which will walk all keywords and/or values and perform
                        an operation on them, e.g. unicode, or 'string-escape'
                        could be written as part of validate support anyway
    Preserve amount of whitespace around divider ?
    Preserve quoting used around keywords and arguments ?

OPTIONS TODO
    configspec
    fileerror
    createempty
    stringify
    lists
    interpolation
    encoding
    backup_encoding
    (newline, force_return, default)

INCOMPATIBLE CHANGES
    Case sensitive
    The only valid divider is '='
    We've removed line continuations with '\'
    No recursive lists in values
    No empty section.
    No distinction between flatfiles and non flatfiles.

ISSUES

"""

Caution!

This parser works with the two example configs here. I haven't yet built proper tests for it - so it's possible it still has some basic bugs.
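As a sketch of the sort of test that's missing, here is the indent-matching walk from the listing pulled out on its own and exercised directly. `Node` is a hypothetical stand-in for `Section` that carries only a parent and an indent, the error messages are added for readability, and the code is written for current Python rather than 2.x.

```python
class Node:
    """Hypothetical stand-in for Section: just a parent and an indent."""

    def __init__(self, parent, indent):
        # As in the reader, the root section is its own parent.
        self.parent = self if parent is None else parent
        self.indent = indent


def match_indent(section, indent_len):
    """Walk up the parents until one sits at exactly indent_len."""
    sect = section
    while indent_len < sect.indent:
        if sect is sect.parent:
            raise SyntaxError('reached the top level')
        sect = sect.parent
    if sect.indent == indent_len:
        return sect
    raise SyntaxError('indent level does not match')


root = Node(None, 0)
lev1 = Node(root, 0)
lev2 = Node(lev1, 4)
lev3 = Node(lev2, 8)

assert match_indent(lev3, 4) is lev2    # dropping back one level
assert match_indent(lev3, 0) is lev1    # dropping back two levels
try:
    match_indent(lev3, 6)               # falls between two levels
except SyntaxError:
    print('indent 6 rejected')
else:
    raise AssertionError('expected a SyntaxError')
```

Checks like these are cheap to write for the helper methods, and would sit alongside tests that parse whole configs.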

Posted by Fuzzyman on 2005-06-28 12:43:22


#62

The March of Progress

I didn't have too productive a weekend; unfortunately it was full of all sorts of meetings (sigh).

I did manage to finish an update to rest2web. This is available for direct download, and via SVN. [1]

Sourceforge will be updated soon [2].

We're now up to version 0.3.0. The changes since the last announced release are :

Version 0.3.0       2005/06/27
Code refactored and better commented.
    Thanks to Nicola Larosa for input.
Minor bugfix - an encoding was missing.
Added stylesheet to docutils options override.

Version 0.2.3       2005/06/25
Code style cleanup with help from Nicola Larosa.
Start of the refactoring (some code is simpler internally)
``uservalues`` now compatible with reST.
docs updated appropriately.

The major changes are the code cleanup and refactoring. If you want to understand how rest2web works, it should now be possible to follow it in the code [3]!

From the user's point of view, the biggest change is that you can now use uservalues with reST markup. This does have limitations that I might try to find a way round. See the uservalues page for the details.

ConfigObj 4

As promised a while ago, I've actually started working on a ground-up rewrite of ConfigObj. The biggest change is that it will support nested config files - using indentation for nesting the sections.

I've worked out a spec with Nicola Larosa. There will be several backward incompatible changes, and the full feature set (write and writein methods, validation etc) is going to take a while. I ought to have a working config reader running in a few days though.

[1] The SVN repository also has the Pythonutils dependencies.
[2] Flipping work internet restrictions, grumble.
[3] No more passing data structures in and out of the restprocessor.

Posted by Fuzzyman on 2005-06-27 09:31:22


#61

With Added Schwing......

There is a major re-schwing going on here at Voidspace. Justin is weaving his CSS magic. At some point I'll put up a proper link to fuschiashock.co.uk for him.

My Voidspace email and the Pythonutils mail list are still down, but everything else is funky.

Posted by Fuzzyman on 2005-06-25 15:12:41


Hosted by Webfaction
