API Documentation for listquote

Autogenerated API Docs

This is an autogenerated API Doc for the module "listquote".

It was generated on: Monday, January 01 06:46 PM.

Having written modules to handle turning a string representation of a list back into a list (including nested lists) and also a very simple CSV parser, I realised I needed a more solid set of functions for handling lists (comma delimited lines) and quoting/unquoting elements of lists.

The doctests shown with each function provide useful examples of how the functions work.

Functions

csvread

def csvread(infile):


Given an infile as an iterable, return the CSV as a list of lists.

infile can be an open file object or a list of lines.

If any of the lines are badly built then a ``CSVError`` will be raised.
This has a ``csv`` attribute - which is a reference to the parsed CSV.
Every line that couldn't be parsed will have ``[]`` for its entry.

The error *also* has an ``errors`` attribute. This is a list of all the
errors raised. Each error in this list has an ``index`` attribute, which is the
line number (counting from zero), and a ``line`` attribute - which is the
actual line that caused the error.

Example of usage :

    handle = open(filename)
    # remove the trailing '\n' from each line
    the_file = [line.rstrip('\n') for line in handle.readlines()]

    csv = csvread(the_file)

>>> a = '''"object 1", 'object 2', object 3
...     test 1 , "test 2" ,'test 3'
...     'obj 1',obj 2,"obj 3"'''
>>> csvread(a.splitlines())
[['object 1', 'object 2', 'object 3'], ['test 1', 'test 2', 'test 3'], ['obj 1', 'obj 2', 'obj 3']]
>>> csvread(['object 1,'])
[['object 1']]
>>> try:
...     csvread(['object 1, "hello', 'object 1, # a comment in a csv ?'])
... except CSVError, e:
...     for entry in e.errors:
...         print entry.index, entry
0 Value is badly quoted: ""hello"
1 Comment not allowed :
object 1, # a comment in a csv ?
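
The error-collecting behaviour described above can be pictured with a small stand-alone sketch. It does not use listquote itself: ``parse_line``, ``LineProblem``, ``CollectedErrors`` and ``read_all`` are hypothetical names, and the toy parser is an assumption that only rejects lines containing a ``#``. The point is the pattern, where parsing carries on past a bad line, leaves ``[]`` in its place, and raises a single error at the end.

```python
class LineProblem(Exception):
    """Hypothetical per-line error carrying the line's position and text."""

    def __init__(self, message, index, line):
        super().__init__(message)
        self.index = index
        self.line = line


class CollectedErrors(Exception):
    """Hypothetical aggregate error, loosely modelled on the CSVError description."""

    def __init__(self, csv, errors):
        super().__init__("%d line(s) could not be parsed" % len(errors))
        self.csv = csv        # parsed rows, with [] for each bad line
        self.errors = errors  # one LineProblem per bad line


def parse_line(line):
    """Toy parser (an assumption): split on commas and reject comments."""
    if '#' in line:
        raise ValueError("Comment not allowed")
    return [part.strip() for part in line.split(',') if part.strip()]


def read_all(lines):
    """Parse every line, collecting failures instead of stopping at the first."""
    rows, errors = [], []
    for index, line in enumerate(lines):
        try:
            rows.append(parse_line(line))
        except ValueError as exc:
            rows.append([])  # placeholder keeps row numbers aligned
            errors.append(LineProblem(str(exc), index, line))
    if errors:
        raise CollectedErrors(rows, errors)
    return rows


try:
    read_all(['a, b', 'c, # oops', 'd,'])
except CollectedErrors as err:
    report = [(problem.index, problem.line) for problem in err.errors]
    partial = err.csv
```

After the ``except`` block, ``report`` names the failing line by index and ``partial`` still holds the rows that did parse.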

csvwrite

def csvwrite(inlist, stringify=False):


Given a list of lists, it turns each inner list into a line of CSV and
returns the lines as a list of strings.

The lines will *not* be ``\n`` terminated.

Set stringify to True (default is False) to convert entries to strings before creating the line.

If stringify is False then any non string value will raise a TypeError.

Every member will be quoted using elem_quote, but no escaping is done.

Example of usage :

    # escape each entry in each line (optional)
    for index in range(len(the_list)):
        the_list[index] = [quote_escape(val) for val in the_list[index]]
    #
    the_file = csvwrite(the_list)
    # add a '\n' to each line - ready to write to file
    the_file = [line + '\n' for line in the_file]

>>> csvwrite([['object 1', 'object 2', 'object 3'], ['test 1', 'test 2', 'test 3'], ['obj 1', 'obj 2', 'obj 3']])
['"object 1", "object 2", "object 3"', '"test 1", "test 2", "test 3"', '"obj 1", "obj 2", "obj 3"']
>>> csvwrite([[3, 3, 3]])
Traceback (most recent call last):
TypeError: Can only quote strings. "3"
>>> csvwrite([[3, 3, 3]], True)
['3, 3, 3']
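
Because the returned lines carry no terminator, something has to add one when they are written out. A minimal sketch of that step, independent of listquote: ``write_lines`` is a hypothetical helper, the input lines are copied from the doctest output above, and ``io.StringIO`` stands in for an open file.

```python
import io


def write_lines(lines, handle):
    """Write each line to handle, adding the line feed the lines lack."""
    for line in lines:
        handle.write(line + '\n')


# Lines in the shape shown in the csvwrite doctest above.
lines = ['"object 1", "object 2", "object 3"', '"test 1", "test 2", "test 3"']
buffer = io.StringIO()  # stands in for a file opened for writing
write_lines(lines, buffer)
text = buffer.getvalue()
```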

elem_quote

def elem_quote(member, nonquote=True, stringify=False, encoding=None):


Simple method to add the most appropriate quote to an element - either single
quotes or double quotes.

If member contains ``\n`` a QuoteError is raised - multiline values can't be quoted by elem_quote.

If nonquote is set to True (the default) and member contains none of the characters '," []()#; then it isn't quoted at all.

If member contains both single quotes and double quotes then all double quotes (") will be escaped as &mjf-quot; and member will then be quoted with double quotes.

If stringify is set to True (the default is False) then non string (unicode or byte-string) values will be first converted to strings using the str function. Otherwise elem_quote raises a TypeError.

If encoding is not None and member is a byte string, then it will be decoded into unicode using this encoding.
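
The quoting rules above can be summarised in a short stand-alone sketch. This is a re-reading of the description, not the module's code: ``quote_sketch`` is a hypothetical name, ``ValueError`` stands in for ``QuoteError``, and the ``stringify`` and ``encoding`` options and empty strings are not modelled.

```python
# Characters that force quoting, taken from the nonquote rule above.
SPECIAL = set('\'," []()#;')


def quote_sketch(member, nonquote=True):
    """Pick a quote style for member following the documented rules."""
    if '\n' in member:
        # Stand-in for QuoteError: multiline values are rejected.
        raise ValueError("Multiline values can't be quoted.")
    if nonquote and not (SPECIAL & set(member)):
        return member  # nothing special inside, so leave it bare
    if "'" in member and '"' in member:
        # Both quote types present: escape the double quotes, then use them.
        return '"%s"' % member.replace('"', '&mjf-quot;')
    if '"' in member:
        return "'%s'" % member  # only double quotes inside: wrap in single
    return '"%s"' % member      # otherwise wrap in double quotes


plain = quote_sketch('hello')
forced = quote_sketch('hello', nonquote=False)
single = quote_sketch('"hello"')
both = quote_sketch('Quote "heck\'')
```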

>>> elem_quote('hello')
'hello'
>>> elem_quote('hello', nonquote=False)
'"hello"'
>>> elem_quote('"hello"')
'\'"hello"\''
>>> elem_quote(3)
Traceback (most recent call last):
TypeError: Can only quote strings. "3"
>>> elem_quote(3, stringify=True)
'3'
>>> elem_quote('hello', encoding='ascii')
u'hello'
>>> elem_quote('\n')
Traceback (most recent call last):
QuoteError: Multiline values can't be quoted.
"
"

lineparse

def lineparse(inline, options=None, **keywargs):

A compatibility function that mimics the old lineparse.

Also more convenient for single line use.

Note: It still uses the new LineParser - and so takes the same keyword arguments as that.

>>> lineparse('''"hello", 'goodbye', "I can't do that", 'You "can" !' # a comment''')
(['hello', 'goodbye', "I can't do that", 'You "can" !'], '# a comment')
>>> lineparse('''"hello", 'goodbye', "I can't do that", 'You "can" !' # a comment''', comment=False)
Traceback (most recent call last):
CommentError: Comment not allowed :
"hello", 'goodbye', "I can't do that", 'You "can" !' # a comment
>>> lineparse('''"hello", 'goodbye', "I can't do that", 'You "can" !' # a comment''', recursive=False)
(['hello', 'goodbye', "I can't do that", 'You "can" !'], '# a comment')
>>> lineparse('''"hello", 'goodbye', "I can't do that", 'You "can" !' # a comment''', csv=True)
Traceback (most recent call last):
CommentError: Comment not allowed :
"hello", 'goodbye', "I can't do that", 'You "can" !' # a comment
>>> lineparse('''"hello", 'goodbye', "I can't do that", 'You "can" !' ''', comment=False)
['hello', 'goodbye', "I can't do that", 'You "can" !']
>>> lineparse('')
('', '')
>>> lineparse('', force_list=True)
([], '')
>>> lineparse('[]')
([], '')
>>> lineparse('()')
([], '')
>>> lineparse('()', force_list=True)
([], '')
>>> lineparse('1,')
(['1'], '')
>>> lineparse('"Yo"')
('Yo', '')
>>> lineparse('"Yo"', force_list=True)
(['Yo'], '')
>>> lineparse('''h, i, j, (h, i, ['hello', "f"], [], ([]),), k''')
(['h', 'i', 'j', ['h', 'i', ['hello', 'f'], [], [[]]], 'k'], '')
>>> lineparse('''h, i, j, (h, i, ['hello', "f"], [], ([]),), k''', recursive=False)
Traceback (most recent call last):
BadLineError: Line is badly built :
h, i, j, (h, i, ['hello', "f"], [], ([]),), k
>>> lineparse('fish#dog')
('fish', '#dog')
>>> lineparse('"fish"#dog')
('fish', '#dog')
>>> lineparse('(((())))')
([[[[]]]], '')
>>> lineparse('((((,))))')
Traceback (most recent call last):
BadLineError: Line is badly built :
((((,))))
>>> lineparse('hi, ()')
(['hi', []], '')
>>> lineparse('"hello", "",')
(['hello', ''], '')
>>> lineparse('"hello", ,')
Traceback (most recent call last):
BadLineError: Line is badly built :
"hello", ,
>>> lineparse('"hello", ["hi", ""], ""')
(['hello', ['hi', ''], ''], '')
>>> lineparse('''"member 1", "member 2", ["nest 1", ("nest 2", 'nest 2b', ['nest 3', 'value'], nest 2c), nest1b]''')
(['member 1', 'member 2', ['nest 1', ['nest 2', 'nest 2b', ['nest 3', 'value'], 'nest 2c'], 'nest1b']], '')
>>> lineparse('''"member 1", "member 2", ["nest 1", ("nest 2", 'nest 2b', ['nest 3', 'value'], nest 2c), nest1b]]''')
Traceback (most recent call last):
BadLineError: Line is badly built :
"member 1", "member 2", ["nest 1", ("nest 2", 'nest 2b', ['nest 3', 'value'], nest 2c), nest1b]]

list_stringify

def list_stringify(inlist):

Recursively rebuilds a list - making sure all the members are strings.

Can take any iterable or a sequence as the argument and always returns a list.

Useful before writing out lists.

Used by makelist if stringify is set.

Uses the str function for stringification.

Every element will be a string or a unicode object.

Doesn't handle decoding strings into unicode objects (or vice-versa).

>>> list_stringify([2, 2, 2, 2, (3, 3, 2.9)])
['2', '2', '2', '2', ['3', '3', '2.9']]
>>> list_stringify(None)
Traceback (most recent call last):
TypeError: iteration over non-sequence
>>> list_stringify([])
[]

FIXME: can receive any iterable - e.g. a sequence

>>> list_stringify('')
[]
>>> list_stringify('Hello There')
['H', 'e', 'l', 'l', 'o', ' ', 'T', 'h', 'e', 'r', 'e']
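
The recursive rebuild can be sketched as follows. This is an illustration written for Python 3, not the module's implementation: ``stringify_sketch`` is a hypothetical name, and treating ``str`` and ``bytes`` members as leaves is an assumption made so that nested strings are not split into characters.

```python
def stringify_sketch(items):
    """Rebuild items as a list whose leaves are all strings."""
    result = []
    for item in items:
        if isinstance(item, (str, bytes)):
            result.append(item)  # already a string: kept as it is
            continue
        try:
            iter(item)
        except TypeError:
            result.append(str(item))  # not iterable: convert with str
        else:
            result.append(stringify_sketch(item))  # nested iterable: recurse
    return result


rebuilt = stringify_sketch([2, 2, 2, 2, (3, 3, 2.9)])
```

As in the FIXME above, a string passed as the top-level argument is still iterated character by character; only nested strings are protected.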

makelist

def makelist(inlist, listchar='', stringify=False, escape=False, encoding=None):

Given a list - turn it into a string that represents that list. (Suitable for parsing by LineParser).

listchar should be '[', '(' or ''. This is the type of bracket used to enclose the list. ('' meaning no bracket of course).

If you have nested lists and listchar is '', makelist will automatically use '[' for the nested lists.

If stringify is True (default is False) makelist will stringify the inlist first (using list_stringify).

If escape is True (default is False) makelist will call quote_escape on each element before passing them to elem_quote to be quoted.

If encoding keyword is not None, all strings are decoded to unicode with the specified encoding. Each item will then be a unicode object instead of a string.

>>> makelist([])
'[]'
>>> makelist(['a', 'b', 'I can\'t do it', 'Yes you "can" !'])
'a, b, "I can\'t do it", \'Yes you "can" !\''
>>> makelist([3, 4, 5, [6, 7, 8]], stringify=True)
'3, 4, 5, [6, 7, 8]'
>>> makelist([3, 4, 5, [6, 7, 8]])
Traceback (most recent call last):
TypeError: Can only quote strings. "3"
>>> makelist(['a', 'b', 'c', ('d', 'e'), ('f', 'g')], listchar='(')
'(a, b, c, (d, e), (f, g))'
>>> makelist(['hi\n', 'Quote "heck\''], escape=True)
'hi&mjf-lf;, "Quote &mjf-quot;heck\'"'
>>> makelist(['a', 'b', 'c', ('d', 'e'), ('f', 'g')], encoding='UTF8')
u'a, b, c, [d, e], [f, g]'
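
The bracket rules for listchar can be illustrated on their own. The sketch below is a hypothetical stand-in (``render_sketch`` is not part of listquote) that assumes every member is a string needing no quoting, and it does not address the trailing comma a single-member list would need in order to parse back as a list.

```python
CLOSERS = {'[': ']', '(': ')'}


def render_sketch(items, listchar=''):
    """Join items with ', ', wrapping nested lists in brackets."""
    # With no bracket at the top level, nested lists fall back to '['.
    nested_char = listchar or '['
    parts = []
    for item in items:
        if isinstance(item, (list, tuple)):
            parts.append(render_sketch(item, nested_char))
        else:
            parts.append(item)
    body = ', '.join(parts)
    if not listchar:
        # An empty top-level list is still written as '[]'.
        return body if items else '[]'
    return listchar + body + CLOSERS[listchar]


data = ['a', 'b', 'c', ('d', 'e'), ('f', 'g')]
bare = render_sketch(data)
round_brackets = render_sketch(data, listchar='(')
```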

quote_escape

def quote_escape(value, lf='&mjf-lf;', quot='&mjf-quot;'):


Escape a string so that it can safely be quoted. You should use this if the
value to be quoted *may* contain line-feeds or both single quotes and double
quotes.

If the value contains ``\n`` then it will be escaped using lf. By default this is &mjf-lf;.

If the value contains single quotes and double quotes, then all double quotes will be escaped using quot. By default this is &mjf-quot;.

>>> quote_escape('hello')
'hello'
>>> quote_escape('hello\n')
'hello&mjf-lf;'
>>> quote_escape('hello"')
'hello"'
>>> quote_escape('hello"\'')
"hello&mjf-quot;'"
>>> quote_escape('hello"\'\n', '&fish;', '&wobble;')
"hello&wobble;'&fish;"

quote_unescape

def quote_unescape(value, lf='&mjf-lf;', quot='&mjf-quot;'):

Unescape a string escaped by quote_escape.

If it was escaped using anything other than the defaults for lf and quot you must pass them to this function.

>>> quote_unescape("hello&wobble;'&fish;",  '&fish;', '&wobble;')
'hello"\'\n'
>>> quote_unescape('hello')
'hello'
>>> quote_unescape('hello&mjf-lf;')
'hello\n'
>>> quote_unescape("'hello'")
"'hello'"
>>> quote_unescape('hello"')
'hello"'
>>> quote_unescape("hello&mjf-quot;'")
'hello"\''
>>> quote_unescape("hello&wobble;'&fish;",  '&fish;', '&wobble;')
'hello"\'\n'
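
The escape and unescape steps described for quote_escape and quote_unescape amount to a pair of string replacements. A minimal sketch, assuming plain ``str.replace`` is enough; ``escape_sketch`` and ``unescape_sketch`` are hypothetical names, not the module's functions.

```python
def escape_sketch(value, lf='&mjf-lf;', quot='&mjf-quot;'):
    """Replace line feeds, and double quotes when both quote types occur."""
    value = value.replace('\n', lf)
    if "'" in value and '"' in value:
        value = value.replace('"', quot)
    return value


def unescape_sketch(value, lf='&mjf-lf;', quot='&mjf-quot;'):
    """Reverse escape_sketch; lf and quot must match those used to escape."""
    return value.replace(lf, '\n').replace(quot, '"')


original = 'hello"\'\n'
escaped = escape_sketch(original, '&fish;', '&wobble;')
restored = unescape_sketch(escaped, '&fish;', '&wobble;')
```

One consequence of this approach is that a value which already contains the literal escape text (for example ``&mjf-lf;``) will not survive a round trip unchanged.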

simplelist

def simplelist(inline):

Parse a string to a list.

A simple regex that extracts quoted items from a list.

It retains quotes around elements, so each element needs to be unquoted afterwards.

>>> simplelist('''hello, goodbye, 'title', "name", "I can't"''')
['hello', 'goodbye', "'title'", '"name"', '"I can\'t"']

FIXME: This doesn't work fully (allows some badly formed lists), e.g.

>>> simplelist('hello, fish, "wobble" bottom hooray')
['hello', 'fish', '"wobble"', 'bottom hooray']

unquote

def unquote(inline, fullquote=True, retain=False):

Unquote a value.

If the value isn't quoted it returns the value.

If the value is badly quoted it raises UnQuoteError.

If retain is True (default is False) then the quotes are left around the value (but leading or trailing whitespace will have been removed).

If fullquote is False (default is True) then unquote will only unquote the first part of the inline. If there is anything after the quoted element, this will be returned as well (instead of raising an error).

In this case the return value is (value, rest).

>>> unquote('hello')
'hello'
>>> unquote('"hello"')
'hello'
>>> unquote('"hello')
Traceback (most recent call last):
UnQuoteError: Value is badly quoted: ""hello"
>>> unquote('"hello" fish')
Traceback (most recent call last):
UnQuoteError: Value is badly quoted: ""hello" fish"
>>> unquote("'hello'", retain=True)
"'hello'"
>>> unquote('"hello" fish', fullquote=False)
('hello', ' fish')
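
How fullquote and retain interact can be sketched without the module. ``unquote_sketch`` is a hypothetical stand-in, ``ValueError`` replaces ``UnQuoteError``, and the return shape for an unquoted value with ``fullquote=False`` is not modelled because the text above does not specify it.

```python
def unquote_sketch(text, fullquote=True, retain=False):
    """Strip one pair of matching quotes from the start of text."""
    stripped = text.strip()
    if not stripped or stripped[0] not in ('"', "'"):
        return text  # not quoted: returned unchanged
    quote = stripped[0]
    end = stripped.find(quote, 1)  # position of the closing quote
    if end == -1:
        raise ValueError('Value is badly quoted: "%s"' % text)
    rest = stripped[end + 1:]
    value = stripped[:end + 1] if retain else stripped[1:end]
    if fullquote:
        if rest.strip():
            # Anything after the closing quote is an error in this mode.
            raise ValueError('Value is badly quoted: "%s"' % text)
        return value
    return value, rest


whole = unquote_sketch('"hello"')
kept = unquote_sketch("'hello'", retain=True)
partial = unquote_sketch('"hello" fish', fullquote=False)
```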

Classes

Class BadLineError

class BadLineError(ListQuoteError):

A line is badly built.

Class CSVError

class CSVError(ListQuoteError):

The CSV File contained errors.

Class CommentError

class CommentError(BadLineError):

A line contains a disallowed comment.

Class LineParser

class LineParser(object):
    def __init__(self, options=None, **keywargs):

An object to parse nested lists from strings.

Attributes

  • liststart = {'(': ')', '[': ']'}
  • quotes = ["'", '"']

Methods

feed

def feed(self, inline, endchar=None):

Parse a single line (or fragment).

Uses the options set in the parser object.

Can parse lists - including nested lists. (If recursive is False then nested lists will cause a BadLineError).

Return value depends on options.

If comment is False it returns outvalue.

If comment is True it returns (outvalue, comment). (Even if comment is just '').

If force_list is False then outvalue may be a list or a single item.

If force_list is True then outvalue will always be a list - even if it has just one member.

List syntax :

  • Comma separated lines a, b, c, d

  • Lists can optionally be between square or ordinary brackets
    • [a, b, c, d]
    • (a, b, c, d)
  • Nested lists must be between brackets - a, [a, b, c, d], c

  • A single element list can be shown by a trailing comma - a,

  • An empty list is shown by () or []

Elements can be quoted with single or double quotes (but can't contain both).

The line can optionally end with a comment (preceded by a '#'). This depends on the comment attribute.

If the line is badly built then this method will raise one of :

CommentError, BadLineError, UnQuoteError

Using the csv option is the same as setting :

'recursive': False
'force_list': True
'comment': False
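
The csv shorthand can be pictured as a preset applied over the other options. This is only a sketch: ``resolve_options`` is a hypothetical helper, the defaults are inferred from the lineparse doctests rather than stated by the module, and letting the preset win over explicit keywords is an assumption.

```python
# Defaults inferred from the lineparse doctests (an assumption).
DEFAULTS = {'recursive': True, 'comment': True, 'force_list': False}

# The three settings the csv option is documented to imply.
CSV_PRESET = {'recursive': False, 'force_list': True, 'comment': False}


def resolve_options(**keywargs):
    """Merge keyword options over the defaults, expanding csv=True."""
    options = dict(DEFAULTS)
    csv = keywargs.pop('csv', False)
    options.update(keywargs)
    if csv:
        options.update(CSV_PRESET)  # assumption: the preset takes precedence
    return options


plain = resolve_options()
as_csv = resolve_options(csv=True)
no_comment = resolve_options(comment=False)
```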

reset

def reset(self, options=None, **keywargs):

Reset the parser with the specified options.

Class ListQuoteError

class ListQuoteError(SyntaxError):

Base class for errors raised by the listquote module.

Class QuoteError

class QuoteError(ListQuoteError):

This value can't be quoted.

Class UnQuoteError

class UnQuoteError(ListQuoteError):

The value is badly quoted.
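
Because every error class derives from ListQuoteError, a single except clause on the base class covers them all. The sketch below redeclares the hierarchy exactly as listed above so that it runs without the module; ``handled_by_base`` is a hypothetical helper used only for the demonstration.

```python
class ListQuoteError(SyntaxError):
    """Base class for errors raised by the listquote module."""


class QuoteError(ListQuoteError):
    """This value can't be quoted."""


class UnQuoteError(ListQuoteError):
    """The value is badly quoted."""


class BadLineError(ListQuoteError):
    """A line is badly built."""


class CommentError(BadLineError):
    """A line contains a disallowed comment."""


class CSVError(ListQuoteError):
    """The CSV File contained errors."""


def handled_by_base(error_class):
    """Return True if an instance is caught by 'except ListQuoteError'."""
    try:
        raise error_class("example")
    except ListQuoteError:
        return True


all_caught = all(
    handled_by_base(cls)
    for cls in (QuoteError, UnQuoteError, BadLineError, CommentError, CSVError)
)
```

Note that CommentError is also caught by ``except BadLineError``, and everything here is caught by ``except SyntaxError``.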