repoze.catalog and ZODB beginners example – part 1

repoze.catalog and ZODB beginners example – part 1

Summary

The first of two posts which illustrate how to use repoze.catalog alongside ZODB

What’s ZODB ?

To quote the ZODB home page :

“The ZODB is a native object database, that stores your objects while allowing you to work with any paradigms that can be expressed in Python. Thereby your code becomes simpler, more robust and easier to understand”

What’s repoze.catalog ?

repoze.catalog is one of a number of frameworks which can be used to supply indexing for ZODB for those circumstances where accessing objects stored in ZODB would otherwise be unacceptably slow

Intended Audience

I’m assuming that readers of this post have a basic familiarity with ZODB. If you don’t there are lots of good resources out there of which ‘Example Driven ZODB‘ is a good example.

What’s the purpose of this post ?

For any reasonably experienced Python programmer using ZODB and repoze.catalog is pretty straightforward. Unfortunately a new user of repoze.catalog cannot find an example on the repoze.catalog site which shows both how to catalog items and save them into ZODB. This is understandable as repoze.catalog is not only for use with ZODB but I thought it was worthwhile doing a specific example for that scenario and that’s what on this page.

Where repoze.catalog helps ZODB

Because of the nature of ZODB it’s easy to access objects by the value they’re keyed on but otherwise it’s a question of a sequential search.

So instances of a class that look like this :

class Country(Persistent):
    def __init__(self, pop):
        self.name = name
        self.population = pop

Might be saved into the `root` property of a ZODB `connection` object like this :

root['un']['nz'] = Country('New Zealand', 4000000)

But subsequently if we wanted to obtain `Country` instances on the basis of population the key doesn’t help us at all and a scan of all `Country` objects would be necessary, like this :

for cou in root['un'].itervalues():
    if cou.population > 1000000:
        print cou.name

When we use repoze.catalog to catalogue a ZODB database we specify properties of objects that will be saved in ZODB and which interest us and by which will subsequently want to find the objects.

repoze.catalog allows us quickly and easily search for objects with property values that interest us.

Example Code

Here’s my example code and underneath I’ll expand a little more on what each part does:

'''
Demonstrates how to use repoze.catalog to catalogue objects
being stored in ZODB. This example has the catalog and ZODB
database as seperate repositories
'''
from repoze.catalog.catalog import FileStorageCatalogFactory
from repoze.catalog.catalog import ConnectionManager

from repoze.catalog.indexes.field import CatalogFieldIndex
from repoze.catalog.indexes.text import CatalogTextIndex

import transaction
from persistent import Persistent
from BTrees.OOBTree import OOBTree

from myzodb import MyZODB

factory = FileStorageCatalogFactory('../data/mdcatalog.db', 'mycatalog')

_initialized = False

def initialize_catalog():
    '''
    Create a repoze.catalog instance and specify
    indices of intereset

    NB: Use of global variable
    '''
    global _initialized
    if not _initialized:
        # create a catalog
        manager = ConnectionManager()
        catalog = factory(manager)
        # set up indexes
        catalog['names'] = CatalogTextIndex('name')
        catalog['populations'] = CatalogFieldIndex('population')
        # commit the indexes
        manager.commit()
        manager.close()
        _initialized = True

class City(Persistent):
    '''Represents a City by name and population'''
    def __init__(self, cityname, citypop):
        self.name = cityname
        self.population = citypop
    def __str__(self):
        return "%s  (Pop: %s)" % \
                (self.name, \
                str(self.population))

if __name__ == '__main__':
    initialize_catalog()
    manager = ConnectionManager()
    catalog = factory(manager)
    myzodbinstance = MyZODB('../data/mdzdb.fs')
    myzodbinstance.dbroot['cities'] = OOBTree()

    #For ease of demonstration set up a local dict
    #containing a number of `City` instances keyed
    #by a unique integer
    cities = {
        1:City('Windhoek', 322500),
        2:City('Pretoria', 525387),
        3:City('Nairobi', 3138295),
        4:City('Maputo', 1244227),
        5:City('Jakarta', 10187595),
        6:City('Canberra', 358222),
        7:City('Wellington', 393400),
        8:City('Santiago', 5428590),
        9:City('Buenos Aires', 2891082),
    }
    #Iterate over our local dict and for each
    #element generate the catlog entry for
    #repoze.catalog and add the corresponding
    #instance to the ZODB database we are
    #cataloguing
    for docid, doc in cities.items():
        catalog.index_doc(docid, doc)
        myzodbinstance.dbroot['cities'][docid] = doc
        transaction.commit()

    myzodbinstance.close()

Example Step by Step

Here’s a breakdown on what’s happening in the above example

Initialize repoze.catalog

initialize_catalog()
manager = ConnectionManager()
catalog = factory(manager)

The `initialize_catalog` function creates a repoze.catalog instance and initializes two indices : `names` and `populations`. These index the `name` and `population` properties of any objects indexed with the repoze.catalog instance just created

Make our ZODB database ready for use

myzodbinstance = MyZODB('../data/mdzdb.fs')

`MyZODB` is a convenience class which wraps up the instantiation of a ZODB database instance and provides : `storage`; `db`;`connection`; and `dbroot` properties to help the programmer interact with the ZODB database, connection, storage objects. `MyZODB` also provides a close method to cleanly close the ZODB database, connection and storage.

`MyZODB` is not explicitly included in the above example but it looks like this :

from ZODB import FileStorage, DB
class MyZODB(object):
    '''Manage the state of a ZODB FileStorage connection'''
    def __init__(self, path):
        self.storage = FileStorage.FileStorage(path)
        self.db = DB(self.storage)
        self.connection = self.db.open()
        self.dbroot = self.connection.root()
    def close(self):
        self.connection.close()
        self.db.close()
        self.storage.close()</pre>

Create a sub-tree in ZODB for our `City` objects

Now we have an instance of `MyZODB` we can treat the `dbroot` property (which corresponds to the ZODB `dbroot` property of the ZODB `connection` object) as a plain old dictionary and assign a value to it under some key of our choosing, for our example because we’re going to save a set of `City` objects we’ve chosen ‘cities’.

At this stage we just assign an instance of `OOBTree` to that ‘cities’ key. An OOBTree instance acts like a dictionary but, when a lot of elements are within it, works much more efficiently for the purposes of ZODB.

myzodbinstance.dbroot['cities'] = OOBTree()

Create a set of `City` objects

Now we pause for a moment and make ourselves a set of `City` objects and put them into a dictionary for later use.

What’s significant here is that the key used to save each `City` instance is a unique integer which has no specific meaning in itself, we’ll see why in a moment.

cities = {
1:City('Windhoek', 322500),
2:City('Pretoria', 525387),
3:City('Nairobi', 3138295),
4:City('Maputo', 1244227),
5:City('Jakarta', 10187595),
6:City('Canberra', 358222),
7:City('Wellington', 393400),
8:City('Santiago', 5428590),
9:City('Buenos Aires', 2891082),
}

Save `City` objects to ZODB and index them

Now at last we’re going to do what we’ve come for.

We iterate over our set of `City` instances and for each one we make use of the `index_doc` method of repoze.catalog . Notice that the two arguments are the integer we’ve arbitarily associated with each `City` instance, ‘docid’ in this example, and the `City` instance itself, ‘doc’ in this example. By using the `index_doc` method we update the catalog entries maintained by repoze.catalog

In the same interation we assign the `City` object instance, ‘doc’ to our `OOBTree` (stored under the ‘cities’ key of `dbroot`) using as an index the same integer we’ve just passed to the `index_doc` call.

Finally we make use of the ZODB Transaction manager to commit our changes. Because repoze.catalog is actually a ZODB database inside our single transaction is sufficient to commit both the catalog update and the actual update of the ZODB database.

for docid, doc in cities.items():
    catalog.index_doc(docid, doc)
    myzodbinstance.dbroot['cities'][docid] = doc
    transaction.commit()

Credit where credits due

The example I’ve shown here owes some parts to one of the examples on the repoze.catalog website. The structure of the `myZODB` was taken from the article cited above,  ‘Example Driven ZODB‘ . Lastly I got some useful advice in response to a question I posed on StackOverflow and I’m grateful to the people who provided answers .

In Closing

That’s all there is to it ! In many small scale instances there’s no need to do anything other than use ZODB as it comes and not worry about indexing – machines are fast and many applications deal with relatively small data sets however if you do need it repoze.catalog (or one of the other, similar, cataloguing tools) is a useful way to squeeze more speed out of ZODB.

This has been a very long blog post by my standards so I’m going to show how to access the data indexed under repoze.catalog (and prove that it all actually works !) in a blog post next week.

Droopy : Very simple HTTP file uploads

Droopy : Very simple HTTP file uploads

Summary

Droopy is a mini web server which makes allowing file uploads very easy

I just want to upload this file !

I’ve been writing a system which shares processing tasks across two machines.

Part of this involved shipping an image file from one machine to the other; doing some stuff to the file and then; bringing the file back again.

I was looking around for easy ways to move the files and I found droopy .

To quote the author, Pierre  Duqeusne, “Droopy is a mini Web server whose sole purpose is to let others upload files to your computer”. It’s a single python script so as long you’ve got Python installed starting the server is as simple as this

python droopy --message "Upload the bb images here" --picture 0.jpg --dl 8080

And you’re ready to upload files immediately

Image in screendump courtesy of paloetic via flickr

Because of the `–dl` argument used to launch droopy in my example above you also have the option to download files from the same director you’re uploading to.

How I used it

Uploading Files using Requests and Droopy

My solution was written in python so uploading files to the droopy server was very easy using the excellent requests library

def uploadfile(filepath, uploadurl, fileformelementname="upfile"):
    '''
    This will invoke an upload to the webserver
    on the VM
    '''

    files = {fileformelementname : open(filepath,'rb')}
    r = requests.post(uploadurl, files=files)
    return r.status_code

uploadStatus = uploadfile(currentFile.fullpath, UPLOADURL, "upfile")

Download Files using urllib and Droopy

I can’t now remember why but I decided to do the download using urllib instead of Requests

def downloadfile(filename, dloadurl, outputdirectory):
    '''
    Pull the converted file off the droopy server
    '''

    fullurl = urljoin(dloadurl, filename)
    fulloutputpath = os.path.join(outputdirectory, 'divided', filename)

    urllib.urlretrieve(fullurl, fulloutputpath)

downloadfile(currentFile.outputName, DOWNLOADURL, IMGDIR)

Summary

Droopy provides a very useful, very simple web server for both uploading and downloading files. Combined with Python it makes a very useful facility for moving files around under programmtic control.

Versions

All of the above was done on Python 2.7.x.

Using Lillypond to show the ‘sim’ notation in a score

How can you include ‘sim’ in your score ?

Summary

How to define the ‘sim’ notation when using Lilypond to engrave a score. This is a little off topic for my blog but I found it difficult to find this information myself so I’m hoping others will find it here.

Lilypond and Frescobaldi

Lilypond is a file format allowing you to define a musical score using simple text notation; Lilypond is also the program which takes that text notation and outputs a score in, for instance, Acrobat PDF format. It’s a pretty amazing technology.

In using Lilypond I’ve found it very useful to use the Frescobaldi sheet music text editor which provides support for writing files in the Lilypond format.

What I’ve done

My music theory knowledge isn’t great but I’ve really enjoyed using Lilypond and Frescobaldi to engrave a score for a relative of mine. It’s just an amazing technology and the output is beautiful.

It took a little time to get to grips with some of the ideas although the Lilypond manuals are very good. I did however find one little thing I couldn’t work out how to do and that is the subject of this, rather top heavy post.

‘sim’ notation on a score

Sometimes in scores you want the text ‘sim.’ to appear as shown below in the screen dump.

I looked all over the documentation and there just didn’t seem to be a way of including that. However I visited the Lilypond IRC channel (#lilypond@irc.freenode.net) which you can visit directly by going to the support page and paging down a little.

The good folk on there were able to point me in the right direction. By appending

_\markup{\large \italic "sim."}

to the note I wished ‘sim’ to appear under the right thing was done.

Just to make that a little clearer here are a few bars which include the annotation to create the ‘sim’

As you might imagine the

\large

only controls how large the ‘sim’ appear on the score and I fiddled around with this a little until I found an adjective which produced text that matched the rest of the score. There are a number of options for altering the size of text displayed as can be seen in the Selecting font and font size part of the Lilypond manual.

Crazy little thing called : SimpleHTTPServer

Crazy little thing called : SimpleHTTPServer

Summary

Python brings you a one line web-server !

Simplest webserver ever !

I was logged onto a tiny VPS I use sometimes and thinking it would be useful to have a web-server to help move a few files off it. The thought of installing a ‘real’ web-server seemed a bit much for what would five minutes of use so I started thinking about Python (of course !).

It turns out that you can fire up a (completely unsecured and rather wild-west) web server like this :

python -m SimpleHTTPServer

When you type that command the current directory becomes the root of a web-resource, accessible via port 8000 – rather than the default port 80 we usually use for HTTP, and any files, and sub-directories, in the current directory can be accessed like this

http://mylittlevps.kubadev.com:8000/thefileIwant.txt

and, for stuff in sub-directories, like this

http://mylittlevps.kubadev.com:8000/foo/bar/anotherfileIwant.txt

Don’t say you weren’t warned

There are of course all sort of circumstances where doing this would be a REALLY BAD IDEA ™ but all the same it’s nice to know it’s there if you need it !

optparse: simulating ‘–help’

optparse: simulating ‘–help’

Summary

Getting optparse to show a summary of script options under program control.

What do you use optparse for ?

optparse is a Python standard library for parsing command-line options.

optparse provides a relatively simple way of

* defining a set of command line arguments which the user may enter
* parsing the command line entered and storing the results of the parse

I need help !

By default optparse will respond to a either of :

<yourscript> -h

or

<yourscript> --help

by printing a summary of your scripts options.

What I learned today

In the script I was working on today I wanted to respond to the user not entering any argument at all by printing a summary of all options available (in other words as if they user had entered ‘–help’ or ‘-h’ as an argument).

There is a way you can do this but for some reason it’s not in the list of OptionParser methods shown in the documentation.

print_help()

If you call the method ‘print_help’ as shown in the code below optparse will respond by by printing the same text that would be shown if the user were to enter an argument of ‘–help’.

parser = OptionParser(description=desc, usage=usage)
parser.add_option(  "-i", "--inbox", action="store",  dest="inbox",
metavar="INBOX", help="Location of INBOX")
parser.add_option(  "-o", "--outpath", action="store", dest="outpath",
metavar="PATH", help="PATH to output csv file")
parser.add_option(  "-v", "--verbose", action="store_true",
dest="verbose", help="Show each file processed")

(options, args) = parser.parse_args()

if (options.inbox is None) and (options.outpath is None):
  parser.print_help()
  exit(-1)
elif not os.path.exists(options.inbox):
  parser.error('inbox location does not exist')
elif not os.path.exists(os.path.dirname(options.outpath)):
  parser.error('path to ouput location does not exist')

return options

Using Paramiko to control an EC2 instance

An example of using Paramiko to issue commands to an EC2 instance

Summary

An example of using the Python library Paramiko to ‘remote control’ an EC2 instance .

“Do that, Do this”

Recently I’ve been looking into the use of the Bellatrix library to start, control and stop Amazon EC2 instances (my posts about that are here and here).

Spinning off the side of that I’ve taken a look at the Paramiko module which “implements the SSH2 protocol for secure (encrypted and authenticated) connections to remote machines”.

There’s a good article on beginning to use Paramiko “SSH Programming with Paramiko” by Jesse Noller which I found very helpful but there’s enough stuff I had to change to deal with using EC2 and the controlling Python script running on Windows that I thought it would be worth recording my sample script.

Installing Paramiko

So first off I’d seen the comments about Paramiko maybe needing a special compilation step for installation to Windows but I’m pleased to say that’s not true, I downloaded 1.7.7.1 to my Windows Vista machine, did a quick…

python setup.py install

… and it all went very smoothly, just to be sure I tried out an import …

>>> import paramiko

… no problem.

Starting the EC2 instance

I now needed a server to talk to so I used Bellatrix to spin up a micro instance of Ubuntu like this :

python "C:\bin\installed\Python2.6\Scripts\bellatrix" start --security_groups mySecGrp ami-3e9b4957 mykeypair

The arguments you can see here are :

  • “mySecGrp” is a Security Group I’ve previously setup via the AWS Management Console;
  • ‘ami-3e9b4957’ is the AMI of Ubuntu 10.04 (Lucid Lynx); and
  •  ‘mykeypair’ is the name of a Key Pair that, again, I’ve previously setup via the AWS Management Console.

When you run that command you get an output that looks like this :

C:\Users\Richard Shea>python "C:\bin\installed\Python2.6\Scripts\bellatrix" start --security_groups mySecGrp ami-3e9b4957 mykeypair
2012-04-19 21:27:03,135 INFO starting EC2 instance...
2012-04-19 21:27:03,180 INFO ami:ami-3e9b4957 type:t1.micro key_name:mykeypair security_groups:mySecGrp new size:None
2012-04-19 21:27:07,657 INFO starting image: ami-3e9b4957 key mykeypair type t1.micro shutdown_behavior terminate new size None
2012-04-19 21:27:08,555 INFO we got 1 instance (should be only one).
2012-04-19 21:27:08,556 INFO tagging instance:i-f1234567 key:Name value:Bellatrix started me
2012-04-19 21:27:12,361 INFO instance:i-f1234567 was successfully tagged with: key:Name value:Bellatrix started me
2012-04-19 21:27:12,361 INFO getting the dns name for instance: i-f1234567 time out is: 300 seconds...
2012-04-19 21:27:34,173 INFO DNS name for i-f1234567 is ec2-10-20-30-40.compute-1.amazonaws.com
2012-04-19 21:27:34,173 INFO waiting until instance: i-f1234567 is ready. Time out is: 300 seconds...
2012-04-19 21:27:34,174 INFO Instance i-f1234567 is running

And the key thing here is that we now now have access to the host name of the EC2 instance we’ve just spun up:

ec2-10-20-30-40.compute-1.amazonaws.com

Talking to the EC2 instance

We’re now ready to send commands to our new instance. Making use of some of Jesses code I was able to write :

import paramiko
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect('ec2-107-22-80-32.compute-1.amazonaws.com',
            username='ubuntu',
            key_filename='''mykeypair-ssh2-rsa.openssh''')
stdin, stdout, stderr = ssh.exec_command("uptime;ls -l;touch mickymouse;ls -l;uptime")
stdin.flush()
data = stdout.read().splitlines()
for line in data:
    print line
ssh.close()

Anyone who’s got this far can probably see what’s happening here, but just to be sure :

  1. having instantiated an instance of paramiko.SSHClient we’re able to use our private key file and the address of our EC2 server to start an SSH session.
  2. We then use the exec_command method to submit a string of commands and get back three references to files corresponding to : standard input, standard output and standard error.
  3. By reading through the standard output file we can print locally the output from the commands executed on the EC2 instance.

The Key Thing

As you can see to identify ourselves to the remote server we’re doing a key exchange. Our private key is ‘mykeypair-ssh2-rsa.openssh’. A point worth mentioning here is that generally I logon to EC2 instances using the excellent PuTTY . The private key files used by default by PuTTY are not in the same format as the ones required by Paramiko so as a result when I first tried this I found Paramiko fell over complaining my ‘key_filename’ argument was ‘not a valid dsa private key file’.

PuttyGen to the rescue

Well the great thing is that PuTTY actually comes with a tool PuttyGen which will import a standard PuTTY key file (foo.ppk) – you need to do ‘Conversions’ | ‘Import Key’ and then ‘Conversions’ | ‘Export OpenSSH Key’

Ubuntu Specific

Bear in mind that the way the SSHClient connect method is used above is suitable for an Ubuntu instance as it is by default however you can’t rely on all *nix instance working just that way.

Seeing the Output

Just to close out I’ll show you the output

12:39:45 up  3:12,  0 users,  load average: 0.08, 0.02, 0.01
total 0
total 0
-rw-r--r-- 1 ubuntu ubuntu 0 2012-04-19 12:39 mickymouse
12:39:45 up  3:12,  0 users,  load average: 0.08, 0.02, 0.01

Powerful Stuff

The combination of Bellatrix allowing you to spin up EC2 instances with a single command and Paramiko allowing you send arbitary commands to those servers is powerful stuff and I’m impressed at the work done by their respective developers. Of course Bellatrix can do ‘for free’ what I’ve used Paramiko to do here are part of it’s Provisioning commands but it was an interesting exercise for me to do my own version of that.

Bellatrix Provisioning Commands

A quick guide to built-in Bellatrix provisioning commands

Summary

A table of Bellatrix provisioning commands as at 1.1.2

Bellatrix Provisioning

As mentioned in my previous post Bellatrix allows you to start and provision Amazon Web Services EC2 instances.In this context ‘provision’ refers to the process of installing software, creating directories, changing file permissions etc.

The built in commands are documented here in the official doco but before I found that I’d ripped the cmds.py file apart and produced the following table. It’s a little more succinct than the official version so I think it’s worth publishing here.

Version Alert

For obvious reasons this type of thing is prone to Bellatrix changing. This page is valid as of the 1.1.2 version of Bellatrix.

Provisioning Commands

Command Description
apt_get_install Return the “sudo apt-get install” command
apt_get_update Executes apt-get update.
chmod Applies the chmod command
createSoftLink Creates a new soft link
copy Copy a file using -f so it doesn’t fail if the destination exists
create_django_project Creates a Django project
createVirtualEnv Generate a new Python virtual environment using virtualenv
executeInVirtualEnv Execute a command within a virtualenv environmen
flatCommands Given a list of commands it will return a single one concatenated by ‘&&’ so they will be executed in sequence until any of them fail
install_pip Install pip using apt-get install
install_nginx Install Nginx in Ubuntu using the repo they provide as described in http://wiki.nginx.org/Install#Ubuntu_PPA
installPackageInVirtualEnv Install a Python package into a virtualenv
mkdir Created a new directory. “-p” flag is used so the command generates the same result regardless whether the directory exists or not.
pip_install Install a Python package using pip
sudo Execute a list of commands using sudo
wget Downloads a web resource using wget

Bellatrix + Windows Users – Two things to remember

Bellatrix – Windows Idiosyncracies

Summary

Two things you might find useful if you’re starting to use Bellatrix on Windows.

What’s Bellatrix Again ?

Bellatrix allows you browse and control Amazon EC2 instances using simple commands on your local machine.

For instance :

bellatrix start ami key_name

run on your machine will launch an EC2 instance within Amazon Web Services based on the named ami (Amazon Machine Image) and allow you to contact it using the key_name you have specified.

You’ve been able to do something like this with the Python package Boto for some time but Bellatrix provides a lot of ready to use commands rather than having to write your own scripts.

Bellatrix on Windows Then ?

Location of Home

I’m a bit embarrassed to write this but in the Bellatrix doco which describes what config files you need and where to place them it refers to

<your_home>

I’m familiar with the idea of ‘Home’ on a unix box but in a Windows enviroment ? I wasn’t really sure what it meant (I’ve only been using Windows for 15 years so you can see where the confusion crept in). I managed to persuade myself it was referring to the Bellatrix directory within the Python site-packages directory !

Well here’s the news for the Windows users out there who are as ‘home-challenged’ as I am. It turns out you should putting those config files in a directory located at :

%HOMEPATH%\.bellatrix

which for me is

C:\Users\Richard Shea\.bellatrix

Yes but what about the dots ?

Which brings me onto another windows specific thing – how do you produce a directory with a dot as its first character ? I started off trying to do it via File Explorer and discovered something strange. File Explorer won’t let you specify a directory name that starts with a dot … unless you specify the directory name as having a trailing dot as well … if you do that File Explorer quietly throws away the trailing dot and you have the name you wanted in the first place ! Weird … or perhaps “only on Windows”

In closing

Anyway the good news is I’ve got it up and running and listing my instances ! Next step is to to use Bellatrix to start an EC2 instance and provision it as a Nagare environment. There’ll be a blog post about that by the time I’ve done, I’m sure !

Datejs – Date.isAfter – no such function ?

date.js – missing methods

Summary

Save yourself a lot of pain and ensure you’re using the right version of  Datejs.

Datejs – It’s great !

Let me start by saying that Datejs, an open-source JavaScript Date Library, is great ! It provides a lot of much needed functionality to the area of dates in Javascript.

Weirdly missing methods

Having said that what I want to highlight is that if you use the standard download links offered by the website (as shown below) you’ll get a version of Datejs which is not the most current one.

What’s more it seems that the version you will get contains a number of defects.

I can vouch for this as I’ve spent a couple of hours today wondering why the ‘.isAfter’ and the ‘.compare’ methods didn’t seem to exist in the form they are documented ! Very frustrating !

What You Should Do

Instead of using the download link on the first page to get the date.js file go here instead : http://www.datejs.com/build/date.js. The version number is the same, “1.0 Alpha-1”, but the build date is “2008-05-13”.

At least at the time of writing – it’s the “2008-05-13” build you want.

Credit Where Credits Due

I’m grateful to Ben McIntyre whose post on the Datejs forums alerted me to this problem, I only wish I’d seen it earlier !

Find (Python) File in which object is defined

Every so often I have an object defined in some Python code and I want to work out in which file that object was defined.

Generally this can be done by a bit of grepping but if the object name in question is pretty generic this can be less than productive and it turns out there is a nice ready made way provided for us in the Python Standard Library.

So as an example currently I’m working with Pyro4 and I’m curious to know, for instance, where a particular constant, VERSION, is defined.

>>> import Pyro4
>>> print Pyro4.constants.VERSION
4.8

Well the ‘inspect’ module from the Python standard library allows you do just that, specifically the ‘getfile’ method – http://docs.python.org/library/inspect.html#inspect.getfile

>>> inspect.getfile(Pyro4.constants)
'C:\\Python27\\lib\\site-packages\\pyro4-4.8-py2.7.egg\\Pyro4\\constants.pyc'

As we can now see the VERSION object derives from the constants.pyc at the path given. In this case the definition is actually in the middle of an egg, pyro4-4.8-py2.7.egg,  which  means direct access to the underlying source code is not as straightforward as if it were implemented in plain old python scripts.

Nevertheless we can still make use of another inspect module function, the ‘getsource’ method (http://docs.python.org/library/inspect.html#inspect.getsource) to get some more information about the object we’re interested in as follows:

>>> inspect.getsource(Pyro4.constants)
'"""\nDefinitions of various hard coded constants.
Pyro - Python Remote Objects.  Copyright by Irmen de Jong.
irmen@razorvine.net - http://www.razorvine.net/projects/Pyro"""
# Pyro version\nVERSION = "4.8"
# standard object name for the Daemon object
DAEMON_NAME = "Pyro.Daemon"
# standard name for the Name server itself
NAMESERVER_NAME = "Pyro.NameServer"

… etc, etc.