Crazy little thing called : SimpleHTTPServer

Summary

Python brings you a one line web-server !

Simplest webserver ever !

I was logged onto a tiny VPS I use sometimes and thinking it would be useful to have a web-server to help move a few files off it. The thought of installing a ‘real’ web-server seemed a bit much for what would be five minutes of use so I started thinking about Python (of course !).

It turns out that you can fire up a (completely unsecured and rather wild-west) web server like this :

python -m SimpleHTTPServer

When you type that command the current directory becomes the root of a web server listening on port 8000 (rather than port 80, the usual default for HTTP). Any files and sub-directories in the current directory can then be accessed like this :

http://mylittlevps.kubadev.com:8000/thefileIwant.txt

and, for stuff in sub-directories, like this

http://mylittlevps.kubadev.com:8000/foo/bar/anotherfileIwant.txt
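
On Python 3 the SimpleHTTPServer module no longer exists under that name; its handler now lives in http.server, so the equivalent one-liner is python -m http.server. If you would rather start the server from a script, something like the following sketch should do it. The function name make_server and its arguments are made up for this example, and it assumes Python 3.7 or later.

```python
import functools
import http.server


def make_server(directory=".", port=8000):
    """Build (but do not start) an HTTP server rooted at `directory`."""
    # SimpleHTTPRequestHandler serves files relative to `directory`
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory)
    # An empty host string means "listen on every interface"
    return http.server.ThreadingHTTPServer(("", port), handler)


# To serve the current directory until interrupted:
#     make_server().serve_forever()
```

Just like the one-liner this is completely unsecured, so the same warnings apply.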

Don’t say you weren’t warned

There are of course all sorts of circumstances where doing this would be a REALLY BAD IDEA ™ but all the same it’s nice to know it’s there if you need it !

optparse: simulating ‘--help’

Summary

Getting optparse to show a summary of script options under program control.

What do you use optparse for ?

optparse is a module in the Python standard library for parsing command-line options.

optparse provides a relatively simple way of

* defining a set of command line arguments which the user may enter
* parsing the command line entered and storing the results of the parse

I need help !

By default optparse will respond to either of :

<yourscript> -h

or

<yourscript> --help

by printing a summary of your script’s options.

What I learned today

In the script I was working on today I wanted to respond to the user not entering any arguments at all by printing a summary of all the options available (in other words, as if the user had entered ‘--help’ or ‘-h’ as an argument).

There is a way you can do this, but for some reason it’s not in the list of OptionParser methods shown in the documentation.

print_help()

If you call the method ‘print_help’ as shown in the code below, optparse will respond by printing the same text that would be shown if the user were to enter an argument of ‘--help’.

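import os
import sys
from optparse import OptionParser

def get_options():
    # desc and usage were defined elsewhere in my script;
    # the values here are only placeholders
    desc = "placeholder description"
    usage = "usage: %prog [options]"

    parser = OptionParser(description=desc, usage=usage)
    parser.add_option("-i", "--inbox", action="store", dest="inbox",
                      metavar="INBOX", help="Location of INBOX")
    parser.add_option("-o", "--outpath", action="store", dest="outpath",
                      metavar="PATH", help="PATH to output csv file")
    parser.add_option("-v", "--verbose", action="store_true",
                      dest="verbose", help="Show each file processed")

    (options, args) = parser.parse_args()

    if (options.inbox is None) and (options.outpath is None):
        # Nothing useful entered: behave as if the user had asked for --help
        parser.print_help()
        sys.exit(-1)
    elif (options.inbox is None) or (options.outpath is None):
        # Guard against passing None to os.path below
        parser.error('both an inbox and an output path are required')
    elif not os.path.exists(options.inbox):
        parser.error('inbox location does not exist')
    elif not os.path.exists(os.path.dirname(os.path.abspath(options.outpath))):
        # abspath so that a bare file name is checked against the current directory
        parser.error('path to output location does not exist')

    return options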
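
If you want to convince yourself that print_help really does produce the same text as ‘--help’, here is a small sketch (written for Python 3) that captures the output in memory rather than printing it. The helper name help_text is my own invention for the example; print_help accepting a file argument and format_help returning the text are both part of optparse.

```python
import io
from optparse import OptionParser


def help_text(parser):
    """Return what parser.print_help() would have printed."""
    buffer = io.StringIO()
    # print_help writes to the given file object instead of sys.stdout
    parser.print_help(file=buffer)
    return buffer.getvalue()


parser = OptionParser(usage="usage: %prog [options]")
parser.add_option("-v", "--verbose", action="store_true",
                  dest="verbose", help="Show each file processed")

text = help_text(parser)
print(text)
```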

Using Paramiko to control an EC2 instance

An example of using Paramiko to issue commands to an EC2 instance

Summary

An example of using the Python library Paramiko to ‘remote control’ an EC2 instance.

“Do that, Do this”

Recently I’ve been looking into the use of the Bellatrix library to start, control and stop Amazon EC2 instances (I’ve written about that in a couple of earlier posts).

Spinning off the side of that I’ve taken a look at the Paramiko module which “implements the SSH2 protocol for secure (encrypted and authenticated) connections to remote machines”.

There’s a good article on beginning to use Paramiko “SSH Programming with Paramiko” by Jesse Noller which I found very helpful but there’s enough stuff I had to change to deal with using EC2 and the controlling Python script running on Windows that I thought it would be worth recording my sample script.

Installing Paramiko

So first off, I’d seen the comments about Paramiko maybe needing a special compilation step for installation on Windows but I’m pleased to say that’s not true. I downloaded 1.7.7.1 to my Windows Vista machine, did a quick…

python setup.py install

… and it all went very smoothly. Just to be sure I tried out an import …

>>> import paramiko

… no problem.

Starting the EC2 instance

I now needed a server to talk to so I used Bellatrix to spin up a micro instance of Ubuntu like this :

python "C:\bin\installed\Python2.6\Scripts\bellatrix" start --security_groups mySecGrp ami-3e9b4957 mykeypair

The arguments you can see here are :

  • ‘mySecGrp’ is a Security Group I’ve previously set up via the AWS Management Console;
  • ‘ami-3e9b4957’ is the AMI of Ubuntu 10.04 (Lucid Lynx); and
  • ‘mykeypair’ is the name of a Key Pair that, again, I’ve previously set up via the AWS Management Console.

When you run that command you get an output that looks like this :

C:\Users\Richard Shea>python "C:\bin\installed\Python2.6\Scripts\bellatrix" start --security_groups mySecGrp ami-3e9b4957 mykeypair
2012-04-19 21:27:03,135 INFO starting EC2 instance...
2012-04-19 21:27:03,180 INFO ami:ami-3e9b4957 type:t1.micro key_name:mykeypair security_groups:mySecGrp new size:None
2012-04-19 21:27:07,657 INFO starting image: ami-3e9b4957 key mykeypair type t1.micro shutdown_behavior terminate new size None
2012-04-19 21:27:08,555 INFO we got 1 instance (should be only one).
2012-04-19 21:27:08,556 INFO tagging instance:i-f1234567 key:Name value:Bellatrix started me
2012-04-19 21:27:12,361 INFO instance:i-f1234567 was successfully tagged with: key:Name value:Bellatrix started me
2012-04-19 21:27:12,361 INFO getting the dns name for instance: i-f1234567 time out is: 300 seconds...
2012-04-19 21:27:34,173 INFO DNS name for i-f1234567 is ec2-10-20-30-40.compute-1.amazonaws.com
2012-04-19 21:27:34,173 INFO waiting until instance: i-f1234567 is ready. Time out is: 300 seconds...
2012-04-19 21:27:34,174 INFO Instance i-f1234567 is running

And the key thing here is that we now have access to the host name of the EC2 instance we’ve just spun up:

ec2-10-20-30-40.compute-1.amazonaws.com

Talking to the EC2 instance

We’re now ready to send commands to our new instance. Making use of some of Jesse’s code I was able to write :

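import paramiko

ssh = paramiko.SSHClient()
# Accept the server's host key automatically; handy for a throwaway
# instance but it does skip host verification
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
# Host name as reported by Bellatrix above
ssh.connect('ec2-10-20-30-40.compute-1.amazonaws.com',
            username='ubuntu',
            key_filename='mykeypair-ssh2-rsa.openssh')
stdin, stdout, stderr = ssh.exec_command("uptime;ls -l;touch mickymouse;ls -l;uptime")
stdin.flush()
# Collect everything the remote commands wrote to standard output
data = stdout.read().splitlines()
for line in data:
    print line
ssh.close()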

Anyone who’s got this far can probably see what’s happening here, but just to be sure :

  1. Having created an instance of paramiko.SSHClient we’re able to use our private key file and the address of our EC2 server to start an SSH session.
  2. We then use the exec_command method to submit a string of commands and get back three references to files corresponding to : standard input, standard output and standard error.
  3. By reading through the standard output file we can print locally the output from the commands executed on the EC2 instance.

The Key Thing

As you can see, to identify ourselves to the remote server we’re using key-based authentication. Our private key is ‘mykeypair-ssh2-rsa.openssh’. A point worth mentioning here is that generally I log on to EC2 instances using the excellent PuTTY. The private key files used by default by PuTTY are not in the same format as the ones required by Paramiko, so when I first tried this I found Paramiko fell over complaining my ‘key_filename’ argument was ‘not a valid dsa private key file’.

PuttyGen to the rescue

Well the great thing is that PuTTY actually comes with a tool, PuttyGen, which will import a standard PuTTY key file (foo.ppk) and export it in the OpenSSH format: you need to do ‘Conversions’ | ‘Import Key’ and then ‘Conversions’ | ‘Export OpenSSH Key’.

Ubuntu Specific

Bear in mind that the way the SSHClient connect method is used above suits an Ubuntu instance in its default configuration (for example, logging in as the user ‘ubuntu’); you can’t rely on every *nix instance working just that way.

Seeing the Output

Just to close out I’ll show you the output

12:39:45 up  3:12,  0 users,  load average: 0.08, 0.02, 0.01
total 0
total 0
-rw-r--r-- 1 ubuntu ubuntu 0 2012-04-19 12:39 mickymouse
12:39:45 up  3:12,  0 users,  load average: 0.08, 0.02, 0.01

Powerful Stuff

The combination of Bellatrix allowing you to spin up EC2 instances with a single command and Paramiko allowing you to send arbitrary commands to those servers is powerful stuff and I’m impressed at the work done by their respective developers. Of course Bellatrix can do ‘for free’ what I’ve used Paramiko to do here, as part of its Provisioning commands, but it was an interesting exercise for me to do my own version of that.

Find (Python) File in which object is defined

Every so often I have an object defined in some Python code and I want to work out in which file that object was defined.

Generally this can be done by a bit of grepping but if the object name in question is pretty generic this can be less than productive and it turns out there is a nice ready made way provided for us in the Python Standard Library.

So as an example currently I’m working with Pyro4 and I’m curious to know, for instance, where a particular constant, VERSION, is defined.

>>> import Pyro4
>>> print Pyro4.constants.VERSION
4.8

Well the ‘inspect’ module from the Python standard library allows you to do just that, specifically the ‘getfile’ function (http://docs.python.org/library/inspect.html#inspect.getfile).

>>> import inspect
>>> inspect.getfile(Pyro4.constants)
'C:\\Python27\\lib\\site-packages\\pyro4-4.8-py2.7.egg\\Pyro4\\constants.pyc'

As we can now see the VERSION object is defined in the constants module, whose compiled file constants.pyc lives at the path given. In this case the module is actually in the middle of an egg, pyro4-4.8-py2.7.egg, which means direct access to the underlying source code is not as straightforward as if it were implemented in plain old Python scripts.

Nevertheless we can still make use of another inspect module function, ‘getsource’ (http://docs.python.org/library/inspect.html#inspect.getsource), to get some more information about the object we’re interested in as follows:

>>> print inspect.getsource(Pyro4.constants)
"""
Definitions of various hard coded constants.
Pyro - Python Remote Objects.  Copyright by Irmen de Jong.
irmen@razorvine.net - http://www.razorvine.net/projects/Pyro"""
# Pyro version
VERSION = "4.8"
# standard object name for the Daemon object
DAEMON_NAME = "Pyro.Daemon"
# standard name for the Name server itself
NAMESERVER_NAME = "Pyro.NameServer"

… etc, etc.
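
If you don’t have Pyro4 to hand, the same calls can be tried against a standard library module. The sketch below is written for Python 3 and uses json.decoder purely as a stand-in. Note that getfile wants a module, class, function or similar rather than a plain value such as a string, which is why the module gets passed in and not the constant itself.

```python
import inspect
import json.decoder

# Which file defines the json.decoder module?
path = inspect.getfile(json.decoder)
print(path)

# getsource returns the source of the whole module as one string
source = inspect.getsource(json.decoder)
print(source.splitlines()[0])

# For a class, getsourcelines also reports the line on which it starts
lines, first_line_number = inspect.getsourcelines(json.decoder.JSONDecoder)
print(inspect.getsourcefile(json.decoder.JSONDecoder), first_line_number)
```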

The Python dict() constructor

Summary

How to add key/value pairs to an existing dictionary (strictly speaking, to a copy of it) using the dict() constructor

Using dict()

I’ve fallen into the habit of building dictionaries in Python using the braces approach, that is :

d1 = {'name':'Jane', 'age':21}

I was reminded today that you can use the conventional constructor method associated with the dict type as follows :

d1 = dict(name='Jane', age=21)

This will produce the same dictionary as the previous example. Notice that the names of the keyword arguments (‘name’ and ‘age’) end up being the keys of the dictionary. Notice also that because they are keyword arguments to the function dict() they are not supplied as quoted strings.

What I learnt today

I was looking at some code today and discovered there’s something else the dict() function can do which I didn’t previously know of. If you have an existing dictionary and want a new dictionary with the same contents plus some extra key/value pairs you can do this.

#Create d1 from above
d1 = dict(name='Jane', age=21)

#Now produce a new dictionary, d2, based
#upon d1 and with extra key/value pairs

d2=dict(d1, weight=50, shoesize=7)

print d2
{'age': 21, 'shoesize': 7, 'name': 'Jane', 'weight': 50}

Taking it further

Not surprisingly you can use the same technique to get a new dictionary in which some of the existing key/value pairs have been changed, like this :

#Create d1 from above
d1 = dict(name='Jane', age=21)

#Now produce a new dictionary, d3, based
#upon d1 with a modified existing key/value pair

d3=dict(d1, name='John')

print d3
{'age': 21, 'name': 'John'}
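
One thing worth checking for yourself is that d1 is left alone by all of this: dict(d1, ...) copies d1 and then applies the keyword arguments on top. The sketch below (Python 3 syntax) shows that, and also one way around the restriction that keyword arguments can only supply keys which are valid Python identifiers. The key ‘shoe size’ and the name d4 are invented for the example.

```python
d1 = dict(name='Jane', age=21)

# Each call builds a separate dictionary; d1 itself is not touched
d2 = dict(d1, weight=50, shoesize=7)
d3 = dict(d1, name='John')

print(d1)
print(d2 is d1)  # False: d2 is a different object

# A key with a space in it cannot be written as a keyword argument,
# so copy first and then update from another dictionary
d4 = dict(d1)
d4.update({'shoe size': 7})
print(d4)
```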

Where is Django installed ?

Summary

It’s useful to know where Django is for a number of reasons – customising admin templates for instance.

Today’s Learning Point

Django is generally going to be installed in the site-packages directory of the Python installation you’re using. If you’re on an unfamiliar machine, however, where that is may not be immediately obvious, so here’s a way to be sure.

If you need to know where the Django installation is you can do that from within Python quite easily.

>>> import django
>>> print django.__path__
['/usr/lib/python2.5/site-packages/django']

Why __path__

__path__ is a special attribute of Python packages; it’s initialized to a list containing the name of the directory holding the package’s __init__.py. To put that in blunter terms, __path__ is going to tell you where the files that make up the package (in this case Django) are.
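
The same trick works for any package, not just Django. Here is a small sketch (Python 3) that wraps it in a function; the name package_dirs is my own, and the standard library package json is used as the example so that it runs on a machine without Django installed.

```python
import importlib
import os


def package_dirs(name):
    """Return the directories that make up the package called `name`."""
    module = importlib.import_module(name)
    # Only packages have __path__; plain modules give an empty list here
    return list(getattr(module, "__path__", []))


for directory in package_dirs("json"):
    # Each directory listed should contain the package's __init__.py
    print(directory, os.path.isfile(os.path.join(directory, "__init__.py")))
```

Calling package_dirs("django") on a machine with Django installed should give the same list as the interactive session above.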

Django to the world

When your shiny new Django site is invisible to other machines …

Summary

The default setting in Django means your development server is invisible to other machines.

Today’s Learning Point

When a developer creates a nice new Django site and uses the django-admin.py script :

django-admin.py runserver

to start the development server, that server by default responds only to requests made on port 8000 on IP address 127.0.0.1 (or the synonym ‘localhost’). As such you’re not going to be able to see that Django site from any other machine.

In most cases this is just what’s needed. The development server is intended for use by the developer only. However there may be circumstances where you want another developer to see your work – or as happened to me today where you are developing on a virtual machine running within your development machine.

If that’s the case there’s a way around it.

Specify IP on Server Launch

You can issue a slightly different command when starting the development server :

python manage.py runserver 0.0.0.0:8000

The above command will listen on port 8000 on every IPv4 address of the hosting machine, not just 127.0.0.1, and that in turn means other machines which can reach the host can access the Django site served through your development server.
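
The difference between the two addresses has nothing to do with Django itself; it comes from how the listening socket is bound. As a rough illustration, here is a sketch using only the standard library socket module (Python 3). The function name bound_address is invented for the example.

```python
import socket


def bound_address(host):
    """Bind a TCP socket to `host` on a free port and report where it listens."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 asks the OS for any free port
        sock.listen(1)
        return sock.getsockname()[0]


# Reachable only from the machine itself
print(bound_address("127.0.0.1"))
# Reachable on every IPv4 address the machine has
print(bound_address("0.0.0.0"))
```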

AttributeError: ‘str’ object has no attribute ‘digits’

‘str’ object has no attribute ‘digits’

On silly ways you can puzzle yourself – part 412

Summary

How to make Python report that a string object has no attribute ‘digits’

Confusing Yourself – the easy way


Today I was working on a little piece of code which I hadn’t originally written and which looked something like this :

import string
def foo(string):
    for c in string:
        if c in string.digits:
            pass  # do something

As is well known the Python string module contains a number of useful constants one of which is string.digits

>>> import string
>>> print string.digits
0123456789

Missing ‘digits’

My problem was that every time I went to execute this code it got to the reference to string.digits and the Python interpreter would report

AttributeError: 'str' object has no attribute 'digits'

I spent a happy half hour looking backwards and forwards trying to understand why the string module might think it had no attribute ‘digits’, when everything indicated quite clearly it did, until I realised what the problem was.

def foo(string):

That argument name ‘string’ had, within the function, carefully chucked away my previous reference to the string module and, as a str object has no attribute ‘digits’, the interpreter was quite reasonably complaining !

My defense

In my defense I wouldn’t normally use variable names, like ‘string’, that come quite that close to commonly used modules but then, like I say, I didn’t write the original function.

… but what I should have done

But then again what I should have done a great deal sooner than I did was to add a couple of lines to the function so that it read :

import string
import pprint
def foo(string):
    for c in string:
        pprint.pprint(dir(string))
        if c in string.digits:
            pass  # do something

which would have output something like this

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__getslice__',
 '__gt__',
 '__hash__',
 '__init__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '_formatter_field_name_split',
 '_formatter_parser',
 'capitalize',
 'center',
 'count',
 'decode',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'index',
 'isalnum',
 'isalpha',
 'isdigit',
 'islower',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']

… and made it pretty clear that things were not as I thought they were.
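
For completeness, here is a small sketch (Python 3) of the problem and the obvious cure, which is simply to give the argument a name that doesn’t hide the module. The function names and the digit counting are made up for the example; the original function did something else with each digit.

```python
import string


def count_digits_shadowed(string):
    # Inside this function `string` is the str argument, not the module
    return sum(1 for c in string if c in string.digits)


def count_digits(text):
    # With a different argument name `string` still means the module
    return sum(1 for c in text if c in string.digits)


try:
    count_digits_shadowed("abc123")
except AttributeError as exc:
    message = str(exc)
    print(message)

total = count_digits("abc123")
print(total)
```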

Sphinx – how to make autodoc really automatic ?

Automating sphinx.ext.autodoc

A tool to help use the ‘autodoc’ facilities of Sphinx.

Summary

Sphinx has a great extension, sphinx.ext.autodoc, which imports the modules you want to document and uses the docstrings as the basis for the documentation. A script I’ve just found makes using sphinx.ext.autodoc even easier.

generate_modules.py

Although the sphinx.ext.autodoc extension reduces the work of creating Sphinx source files a great deal I still found myself having to create a set of files which corresponded to the modules in a project. This was a bit of a bore particularly as I like to start creating documentation early in a project and so modules would come and go during development.

On a number of occasions I wished I had a script to automatically create the source files needed … well it turns out that Etienne Desautels has already done the heavy lifting and written generate_modules.py which does just what I want.

Making use of it

generate_modules.py is completely independent of Sphinx. Just download it from the above location and run it as

>python generate_modules.py --help

Usage: generate_modules.py [options] <package path> [exclude paths, ...]

Note: By default this script will not overwrite already created files.

Options:
 -h, --help            show this help message and exit
 -n HEADER, --doc-header=HEADER
                       Documentation Header (default=Project)
 -d DESTDIR, --dest-dir=DESTDIR
                       Output destination directory
 -s SUFFIX, --suffix=SUFFIX
                       module suffix (default=txt)
 -m MAXDEPTH, --maxdepth=MAXDEPTH
                       Maximum depth of submodules to show in the TOC
                       (default=4)
 -r, --dry-run         Run the script without creating the files
 -f, --force           Overwrite all the files
 -t, --no-toc          Don't create the table of content file

Most of this is pretty self-explanatory.

My index.rst looks like this :

Welcome to pySourceAid's documentation!
=======================================

Contents:
=========

.. toctree::
   :maxdepth: 2

   overview.rst

Modules
===============

.. toctree::
   :maxdepth: 20
   :numbered:
   :glob:

   modules/*

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

And that line in the index.rst

    modules/*

… means that Sphinx is going to go looking in a directory called modules and add every source file it finds there to the table of contents; those are the per-module files in which autodoc does its work. As a result when I run generate_modules.py I use a command like this

python generate_modules.py --suffix=rst --dest-dir=C:\myproject\docs\source\modules C:\myproject\src

Where the modules are living in C:\myproject\src
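
To give a feel for the kind of work a script like this saves, here is a much simplified sketch (Python 3) of the general idea: walk a source directory and write one stub file per module containing an automodule directive. This is not Etienne’s code and makes no attempt to match its output; the function name write_module_stubs and the layout of the stub are my own assumptions, and it assumes the source directory is on the Python path when Sphinx runs.

```python
import os


def write_module_stubs(package_dir, dest_dir, suffix="rst"):
    """Write one autodoc stub per .py file under package_dir.

    Returns the list of stub paths written.
    """
    os.makedirs(dest_dir, exist_ok=True)
    written = []
    for root, _dirs, files in os.walk(package_dir):
        for filename in sorted(files):
            if not filename.endswith(".py") or filename == "__init__.py":
                continue
            # Turn e.g. sub/beta.py into the dotted module name sub.beta
            rel = os.path.relpath(os.path.join(root, filename), package_dir)
            module = os.path.splitext(rel)[0].replace(os.sep, ".")
            body = "\n".join([
                module,
                "=" * len(module),
                "",
                ".. automodule:: " + module,
                "   :members:",
                "",
            ])
            path = os.path.join(dest_dir, module + "." + suffix)
            with open(path, "w") as handle:
                handle.write(body)
            written.append(path)
    return written
```

With the layout described above it would be called as write_module_stubs(r"C:\myproject\src", r"C:\myproject\docs\source\modules").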

Environment

This blog post is based on some work done using : Python 2.6 and Sphinx 1.0.5

Nose’ing out Tests

A Quick ‘nose’ tip

Why your unit tests might not be discovered by nose

Using nose for the first time

Out and about herding unit tests the other day I decided to try nose. nose is a unit test framework which provides test discovery and running facilities for Python-based unit tests. This seemed like what I needed given I had a lot of tests and I’d heard good things about nose.

Trying it Out

Everything went very well. I downloaded and installed – all very easy. Quickly scanned the usage doco and thought it was all good to go. The only problem was when I did

cd /path/to/project
nosetests

nose couldn’t find any of my tests. All very puzzling

A ‘gotcha’ I might save you from

Well I tried lots of stuff but the bottom line here is that I’d fallen prey to not being a regular expression parser … again !  The fact is that nose will, by default, look for files which match the regex

(?:^|[\b_\./-])[Tt]est

and what I hadn’t noticed was that this isn’t going to find files like MyClassTest.py. It will find, for instance, MyClass-Test.py but without that hyphen my precious tests were invisible ! Sadly (for me) every one of my tests was in a file named like MyClassTest.py !

More Generally

More generally (and courtesy of a post on StackOverflow by Mark Rushakoff I came across) the following file patterns will match the nose default:

  • TestFoo.py
  • Foo-Test.py
  • Foo_Test.py

but as I discovered this will not

  • FooTest.py
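
If, like me, you are not a regular expression parser, you can always get Python to do the parsing for you. The sketch below (Python 3) tries the default pattern against a few file names. It assumes nose looks for the pattern anywhere in the name (a search rather than a match anchored at the start), and the function name looks_like_test is my own.

```python
import re

# The nose default test-matching pattern (as used on POSIX systems)
NOSE_DEFAULT = re.compile(r"(?:^|[\b_\./-])[Tt]est")


def looks_like_test(filename):
    """True if the default pattern is found anywhere in filename."""
    return NOSE_DEFAULT.search(filename) is not None


for name in ["TestFoo.py", "Foo-Test.py", "Foo_Test.py", "FooTest.py"]:
    print(name, looks_like_test(name))
```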

Environment

I discovered this whilst working on nose 0.11.3 under Python 2.6.