Python Snippets and Recipes
This is the page for various snippets, examples, and smaller Python modules. If you have ever seen the ActiveState Python Cookbook then you will know exactly what to expect here. In fact several of these recipes appear there, and a couple even made it into the print version. Check out the Python Index Page for links to my Python CGIs, modules, and a few full-blown programs.
A Simple HTML Parser
Version 1.3.0 6th September 2004

Scraper is a class to parse HTML files. It contains methods to process the 'data portions' of an HTML page, and also the tags. These can be overridden to implement your own HTML processing methods in a subclass. This class does most of what HTMLParser.HTMLParser does, except without choking on bad HTML. It uses the regular expression and a chunk of logic from sgmllib.py (standard Python distribution). It is most useful when you want to modify a part of a page and guarantee that the rest of the HTML will be unmodified by your parser.

Download it here:

Reducing Boilerplate with a Mixin
May 2007

Having to provide all of the rich comparison methods in Python has always annoyed me. Python Rich Comparison is an article showing how CPython differs from IronPython in this respect. It also implements a 'rich comparison mixin' class, so that you can have all of them just by providing two methods (instead of six).

You can download the mixin class and tests here:

pathutils 0.2.6 November 8th 2007
cgiutils 0.3.5 November 26th 2005

These two modules contain functions and constants for 'general' use. They are both included as part of the Voidspace Pythonutils Package. Pathutils contains various functions for working with files and paths. This includes (amongst other things):
See the pathutils Homepage.

CGI utils has (guess what... It includes (amongst other things):
See the cgiutils Homepage. Download them here:
The zip file has text versions of the docs for these two modules.

Exploring Code Objects and Bytecode

AnonymousCodeBlock is a function which transforms the body of a function into a code object which can be executed in the current scope, rather than creating a new scope. This is an approximation of Ruby's anonymous code blocks for Python. It was created as part of the article Anonymous Code Blocks in Python, which is an exploration of code objects, byte-code and the Python scoping rules. If you're interested in Python byte-code and the Python VM, this article is a good introduction to the subject. You will need the Byteplay module to follow the article or use the AnonymousCodeBlock function.

Exploring Metaclasses

Metaclasses have a reputation for being deep Python black magic. Actually you can understand the basics of them, and use them, quite straightforwardly. My article Metaclasses Made Easy explains the principles and takes you through a couple of simple metaclasses. It ends up with a metaclass factory function, which can automatically decorate every method in a class. This could be useful for profiling, for example. The actual example shown is used to remove the need to explicitly declare self. This is a silly use, but it can be adapted for other purposes (and the MetaClassFactory can be used directly). If you want to use the Selfless example, you will need the Byteplay module again.

Responding to Error 401
Version 1.0.1 3rd December 2004

This code is actually a tutorial on HTTP BASIC authentication. It demonstrates what basic authentication is, and shows two ways of handling it from Python code. It is also a nice example of using urllib2. The code fills in quite a few gaps in the urllib2 documentation, particularly on handling errors.

Download it here:

This piece of example code has also been made into an article: Article on Basic Authentication.
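As a rough illustration of what BASIC authentication involves, here is a minimal sketch in Python 3, where the urllib2 module lives on as urllib.request. It shows two common approaches: building the Authorization header by hand, and letting a handler answer the 401 challenge. These may or may not match the two approaches in the tutorial itself, and the function names are hypothetical.

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """Return the Authorization header value for BASIC authentication."""
    # The credentials are joined with a colon and base64 encoded.
    # This is an encoding, not encryption: anyone can decode it.
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def build_auth_opener(top_level_url, username, password):
    """Return an opener that answers 401 challenges under top_level_url."""
    password_manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # Passing None as the realm applies the credentials to any realm.
    password_manager.add_password(None, top_level_url, username, password)
    handler = urllib.request.HTTPBasicAuthHandler(password_manager)
    return urllib.request.build_opener(handler)


def request_with_header(url, username, password):
    """Build (but do not send) a request with the header set up front."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(username, password))
    return request
```

The first approach sends the credentials with every request, whether or not the server asks for them. The handler approach waits for the server to respond with error 401 and then retries with the credentials.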
Handling Cookies When Fetching Web Pages
Version 1.0.2 14th February 2005

cookielib is a module new to Python 2.4. It is for client side handling of cookies when making requests for web pages (using urllib2). Prior to Python 2.4 it existed as an extension module called ClientCookie. cookielib isn't a drop-in replacement for ClientCookie though. cookielib is easier to use because it works directly with urllib2.

This example demonstrates using cookielib and ClientCookie. It also shows code that will work with either cookielib or ClientCookie (or even neither). This is useful if your script might be used on machines with different versions of Python.

cookielib itself will work with urllib2 to fetch webpages for you. Using CookieJar instances it will save cookies sent with the pages you fetch. Further fetches will result in the cookies being handled in the same way a browser does. The cookies can even be saved between sessions. For fetching pages from some servers, proper cookie handling is essential. cookielib makes this basically transparent to the programmer.

This code acts as a useful intro to the ClientCookie module, which at the time of writing had gaps in the documentation regarding the LWPCookieJar class used to save cookies.

Download it here:

This piece of example code has also been made into an article: Article on cookielib and ClientCookie.

See the World Through the Eyes of Google
Version 0.1.7 26th August 2005

This is a simple implementation of a proxy server that fetches web pages from the Google cache. It's based on SimpleHTTPServer and lets you explore the internet from your browser, using the Google cache. See the world how Google sees it. Alternatively, it's retro internet: no CSS, no JavaScript, no images. This is back to the days of Mosaic!

To use, run this script and then set your browser proxy settings to localhost:9080.
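The proxy is meant to be used from a browser, but the same setting can in principle be applied from a script. The following is only a sketch in Python 3 (where urllib2 became urllib.request), assuming the proxy script is already running on localhost:9080. The function names are hypothetical and are not part of googleCacheServer.py.

```python
import urllib.request

# Address taken from the text above; the proxy must already be listening there.
CACHE_PROXY = "http://localhost:9080"


def make_cache_proxy_opener(proxy_url=CACHE_PROXY):
    """Return an opener that sends plain HTTP requests through proxy_url."""
    proxy_handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(proxy_handler)


def fetch_via_cache(url, proxy_url=CACHE_PROXY):
    """Fetch url through the proxy and return the raw response bytes.

    This only succeeds while the proxy script is running.
    """
    opener = make_cache_proxy_opener(proxy_url)
    with opener.open(url) as response:
        return response.read()
```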
I've tested it on Windows XP with Python 2.3 and Firefox/IE and also had reports of it working with Opera/Firefox on Linux. It needs the google.py module (and a Google license key). Some useful suggestions and fixes were given by 'vegetax' on comp.lang.python, and Lee Joramo at the Python Cookbook.

Download it here: googleCacheServer.py (5k)

How Well Ranked is Your Website?
Version 0.1.0 18th April 2005

googlerank is another small recipe that uses the Google API. You give it a domain (by setting a variable), and then a set of search terms via stdin. It then does Google searches until a result from your domain is found (so long as it is within the first 200 results, which takes up to 20 searches). This tells you how far down the list your site appears for given search terms. This is a useful tool for working with website design, and it's also very interesting. It produces results that look like:
Download it here: googlerank.py (5k)

A Simple Server with SSI Processing
Version 0.4.1 15th September 2005

This is another simple server implementation, this one based on CGIHTTPServer. It does everything that CGIHTTPServer does, with some SSI processing as well. So far it handles include and flastmod, but adding additional instructions would be easy. Simply drop it in a directory and it will serve webpages, with that directory as the root directory. It performs SSI processing on any page whose name ends in .shtml or .shtm.

It can serve CGI scripts from locations with spaces in the path, and can also serve CGI scripts from subdirectories of the cgi-bin folder. Both of these are Windows-only fixes.

As another added bonus it can serve files from two locations. This means you can keep a single folder with your latest edits in, and have it check that folder first.

It can also run in proxy mode. This is useful for offline testing, where you want it to ignore any requests except those for the localhost domain.

Download it here:
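To give a feel for what processing the include and flastmod instructions involves, here is a minimal sketch in Python 3. It is not the code from the server described above. The regular expression, the process_ssi name, and the choice of time.ctime for the date format are all assumptions made for illustration. It does no checking that the target path stays inside the root directory, which a real server would need.

```python
import os
import re
import time

# Matches directives such as <!--#include file="footer.html" -->
SSI_DIRECTIVE = re.compile(r'<!--#(\w+)\s+(\w+)="([^"]*)"\s*-->')


def process_ssi(text, root):
    """Expand include and flastmod directives in text, relative to root."""

    def expand(match):
        command, _attribute, target = match.groups()
        path = os.path.join(root, target)
        if command == "include":
            # Replace the directive with the contents of the named file.
            with open(path, encoding="utf-8") as handle:
                return handle.read()
        if command == "flastmod":
            # Replace the directive with the file's last modification time.
            return time.ctime(os.path.getmtime(path))
        # Leave unrecognised directives untouched.
        return match.group(0)

    return SSI_DIRECTIVE.sub(expand, text)
```

Supporting another instruction would mean adding one more branch to expand, which fits the remark above that extra instructions would be easy to add.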
Last edited Fri Feb 15 13:42:08 2008.