repoze.catalog and ZODB beginners example – part 2
Summary
The second of two posts which illustrate how to use repoze.catalog alongside ZODB. The first post can be seen at “repoze.catalog and ZODB beginners example – part 1”.
Where we’re up to
In the first post I explained how objects stored in a ZODB database can be indexed by repoze.catalog, and why that is sometimes a good idea. In this post I’m going to demonstrate searching for the previously stored objects using repoze.catalog’s search facilities. If you haven’t read the first post, I suggest you read it now, because what follows assumes you have.
Finding ZODB objects with repoze.catalog
As discussed in the first post, repoze.catalog allows you to index arbitrary properties of the objects you save into a ZODB database and then run complex searches on those properties to extract only the objects you’re interested in.
The example here searches the objects we added in the previous post’s example, using a number of criteria.
Example Code
Here’s my example code; underneath, I’ll expand a little more on what each part does:
'''
Demonstrates how to use repoze.catalog to find objects
being stored in ZODB. This example has the catalog and ZODB
within the same repository
'''
from myzodb import MyZODB
from persistent import Persistent
from repoze.catalog.catalog import FileStorageCatalogFactory
from repoze.catalog.catalog import ConnectionManager
from repoze.catalog.query import InRange, Lt


class City(Persistent):
    '''Represents a City by name and population'''

    def __init__(self, cityname, citypop):
        self.name = cityname
        self.population = citypop

    def __str__(self):
        return "%s (Pop: %s)" % (self.name, str(self.population))


def print_all_city_instances(myzodbinst):
    '''
    Pull everything keyed under 'cities' out of the ZODB instance
    (without any regard to the repoze.catalog cataloguing)
    and print them
    '''
    print ""
    print "About to dump all City Instances:"
    for acity in myzodbinst.dbroot['cities'].itervalues():
        print acity
    print ""


def print_city_query_results(myzodbinst, res):
    '''
    Use the list of integers returned by a repoze.catalog query
    to pull elements keyed underneath 'cities' in the ZODB instance
    which we are using repoze.catalog to catalogue
    '''
    print ""
    print "Objects stored in ZODB corresponding"
    print "to the repoze.catalog resultset:"
    for idx in res:
        print myzodbinst.dbroot['cities'][idx]
    print ""


if __name__ == '__main__':
    # Setup access to the repoze.catalog instance
    factory = FileStorageCatalogFactory('../data/mdcatalog.db',
                                        'mycatalog')
    manager = ConnectionManager()
    catalog = factory(manager)

    # Setup access to the ZODB instance containing data
    # catalogued by the repoze.catalog instance
    myzodbinstance = MyZODB('../data/mdzdb.fs')

    # Demonstrate we really have all the Cities
    print_all_city_instances(myzodbinstance)

    # Demonstrate use of `Lt` on the `population` index
    print ""
    print "*" * 60
    print "Looking for 'less than' value on the `population` index"
    print "Populations less than 1,000,000"
    numdocs, results = catalog.query(Lt('populations', 1000000))
    print "Raw Result: "
    print (numdocs, [x for x in results])
    print_city_query_results(myzodbinstance, results)

    # Demonstrate use of `InRange` on the `population` index
    print ""
    print "*" * 60
    print "Looking for 'InRange' values on the `population` index"
    print "Populations between 1,000,000 and 4,000,000"
    numdocs, results = catalog.query(InRange('populations',
                                             1000000, 4000000))
    print "Raw Result: "
    print (numdocs, [x for x in results])
    print_city_query_results(myzodbinstance, results)
Example Step by Step
Here’s a breakdown of what’s happening in the above example:
Initialize repoze.catalog
factory = FileStorageCatalogFactory('../data/mdcatalog.db', 'mycatalog')
manager = ConnectionManager()
catalog = factory(manager)
Here we connect to our repoze.catalog repository and instantiate a `catalog` object.
Make our ZODB database ready for use
myzodbinstance = MyZODB('../data/mdzdb.fs')
`MyZODB` is a convenience class which wraps up the instantiation of a ZODB database and provides `storage`, `db`, `connection` and `dbroot` properties to help the programmer interact with the ZODB storage, database, connection and root object. `MyZODB` also provides a `close` method to cleanly close the ZODB connection, database and storage.
`MyZODB` is not explicitly included in the above example, but it looks like this:
from ZODB import FileStorage, DB


class MyZODB(object):
    '''Manage the state of a ZODB FileStorage connection'''

    def __init__(self, path):
        self.storage = FileStorage.FileStorage(path)
        self.db = DB(self.storage)
        self.connection = self.db.open()
        self.dbroot = self.connection.root()

    def close(self):
        self.connection.close()
        self.db.close()
        self.storage.close()
Dump contents of ZODB without using repoze.catalog
The first data access we do in the above example is just a simple dump of every object, held under the key ‘cities’, in our ZODB database. Notice we are not using repoze.catalog at all at this point. By viewing this data we can be sure that the subsequent queries using repoze.catalog do what we think they do.
So we call the function `print_all_city_instances`
print_all_city_instances(myzodbinstance)
which iterates over the ‘cities’ element of the `dbroot` property of our `MyZODB` instance to allow us to see everything stored under that key in the ZODB database.
for acity in myzodbinst.dbroot['cities'].itervalues():
    print acity
Our output looks like this:
About to dump all City Instances:
Windhoek (Pop: 322500)
Pretoria (Pop: 525387)
Nairobi (Pop: 3138295)
Maputo (Pop: 1244227)
Jakarta (Pop: 10187595)
Canberra (Pop: 358222)
Wellington (Pop: 393400)
Santiago (Pop: 5428590)
Buenos Aires (Pop: 2891082)
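One way to make that cross-check mechanical is to work out the expected answer by brute force from the full dump and compare it with whatever a catalog query returns. The sketch below uses a plain dict as a stand-in for `dbroot['cities']`; the document ids 1 to 9 and the helper `brute_force` are invented for illustration (the real ids are whatever part 1 used), while the names and populations are taken from the dump.

```python
# Stand-in for myzodbinst.dbroot['cities']: docid -> (name, population).
# The docids 1..9 are hypothetical; names and populations come from the dump.
CITIES = {
    1: ("Windhoek", 322500),
    2: ("Pretoria", 525387),
    3: ("Nairobi", 3138295),
    4: ("Maputo", 1244227),
    5: ("Jakarta", 10187595),
    6: ("Canberra", 358222),
    7: ("Wellington", 393400),
    8: ("Santiago", 5428590),
    9: ("Buenos Aires", 2891082),
}


def brute_force(cities, predicate):
    """Scan every city and return the sorted docids whose population
    satisfies `predicate`. No index involved, so it is slow but obvious."""
    return sorted(docid for docid, (_name, pop) in cities.items()
                  if predicate(pop))


# The answer a "population less than 1,000,000" query should agree with.
expected_lt = brute_force(CITIES, lambda pop: pop < 1000000)
print(expected_lt)
print([CITIES[docid][0] for docid in expected_lt])
```

Comparing `sorted(results)` from a real query against a list like `expected_lt` would flag an index that has drifted out of step with the stored objects.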
Demonstrating the `Lt` comparator of repoze.catalog
The next thing that happens in the sample is to make use of the `Lt` (“less than”) comparator offered by repoze.catalog.query:
numdocs, results = catalog.query(Lt('populations', 1000000))
In the previous post when we initialized our repoze.catalog we created a `populations` index which was associated with the `population` property of our `City` class (take a look at the previous post if you’ve forgotten the details).
Our use of `Lt` asks repoze.catalog to find all `City` instances stored in our ZODB database with a population of less than 1,000,000. As you can see, the query returns two objects, which I’ve named `numdocs` and `results`.
`numdocs` is an integer showing how many instances have been found which meet the criteria.
`results` is an iterable of integers: the keys under which the objects satisfying the search criteria were stored in ZODB.
We then use our function
print_city_query_results(myzodbinstance, results)
to output the objects found. The resulting output looks like this:
Objects stored in ZODB corresponding
to the repoze.catalog resultset:
Windhoek (Pop: 322500)
Pretoria (Pop: 525387)
Canberra (Pop: 358222)
Wellington (Pop: 393400)
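The two-step pattern here (query for keys, then look the keys up in ZODB) can be sketched without either library. In this minimal sketch `fake_query` is a hypothetical stand-in for `catalog.query(...)`, the dict stands in for `dbroot['cities']`, and the document ids are invented for illustration.

```python
# Hypothetical stand-in for myzodbinst.dbroot['cities']: docid -> description.
# The docids are invented; part 1 decides the real ones.
cities = {
    1: "Windhoek (Pop: 322500)",
    2: "Pretoria (Pop: 525387)",
    5: "Jakarta (Pop: 10187595)",
    6: "Canberra (Pop: 358222)",
    7: "Wellington (Pop: 393400)",
}


def fake_query():
    """Stand-in for a catalog query: returns (count, iterable of docids)."""
    docids = [1, 2, 6, 7]
    return len(docids), iter(docids)


numdocs, results = fake_query()
docids = list(results)                        # materialise the ids once
found = [cities[docid] for docid in docids]   # second step: fetch the objects
for city in found:
    print(city)
```

Materialising the ids with `list()` is harmless whatever kind of iterable comes back, and it makes it easy to check that `numdocs` matches the number of objects actually fetched.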
It’s worth mentioning that whilst there are many comparators offered by repoze.catalog.query, not all of them are applicable to all index types. In this example we are searching on the ‘populations’ index. Range comparators such as `Lt` are supported by field indexes (`CatalogFieldIndex`); other index types, such as text indexes (`CatalogTextIndex`), do not offer them.
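One way to see why a comparator like `Lt` only makes sense for some index types: an index that keeps its values in sorted order can answer “less than” with a binary search, whereas an index built around words in a text has no such ordering to exploit. The toy below illustrates the idea only. `ToyOrderedIndex` and its methods are invented here, the document ids are hypothetical, and none of this is how repoze.catalog is actually implemented.

```python
import bisect


class ToyOrderedIndex(object):
    """Toy index that keeps (value, docid) pairs sorted by value."""

    def __init__(self):
        self._pairs = []

    def add(self, docid, value):
        # insort keeps the list sorted as pairs are added.
        bisect.insort(self._pairs, (value, docid))

    def lt(self, limit):
        # (limit,) sorts before every (limit, docid), so everything to the
        # left of the cut has a value strictly less than `limit`.
        cut = bisect.bisect_left(self._pairs, (limit,))
        return [docid for _value, docid in self._pairs[:cut]]


index = ToyOrderedIndex()
for docid, population in [(1, 322500), (2, 525387), (3, 3138295),
                          (4, 1244227), (5, 10187595), (6, 358222),
                          (7, 393400), (8, 5428590), (9, 2891082)]:
    index.add(docid, population)

print(index.lt(1000000))
```

The ids come back in population order because that is how the toy stores them; a real catalog makes its own choices about result ordering.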
Demonstrating the `InRange` comparator of repoze.catalog
Finally, the sample shows off the `InRange` comparator offered by repoze.catalog.query:
numdocs, results = catalog.query(InRange('populations', 1000000, 4000000))
As with the previous example we utilise the previously created catalog index ‘populations’ to find instances of `City` – in this case those instances that have their `population` property set to a value between 1,000,000 and 4,000,000.
We do this by using the `InRange` comparator offered by repoze.catalog.query. As with the `Lt` example above, the query returns two objects, which I’ve named `numdocs` and `results`.
`numdocs` is an integer showing how many instances have been found which meet the criteria.
`results` is an iterable of integers: the keys under which the objects satisfying the search criteria were stored in ZODB.
We then use our function
print_city_query_results(myzodbinstance, results)
to output the objects found. The resulting output looks like this:
Objects stored in ZODB corresponding
to the repoze.catalog resultset:
Nairobi (Pop: 3138295)
Maputo (Pop: 1244227)
Buenos Aires (Pop: 2891082)
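None of the sample populations sits exactly on 1,000,000 or 4,000,000, so this output doesn’t reveal how a range query treats its endpoints. The sketch below spells out the choice in plain Python. The `in_range` helper and its `start_exclusive` / `end_exclusive` flags are assumptions made for this illustration, with inclusive endpoints assumed as the default; check the repoze.catalog documentation for how `InRange` itself behaves.

```python
def in_range(value, start, end, start_exclusive=False, end_exclusive=False):
    """Return True if `value` lies between `start` and `end`.
    Endpoints count as inside unless the matching flag is set."""
    low_ok = value > start if start_exclusive else value >= start
    high_ok = value < end if end_exclusive else value <= end
    return low_ok and high_ok


# Names and populations as printed in the dump earlier in the post.
populations = [
    ("Windhoek", 322500), ("Pretoria", 525387), ("Nairobi", 3138295),
    ("Maputo", 1244227), ("Jakarta", 10187595), ("Canberra", 358222),
    ("Wellington", 393400), ("Santiago", 5428590),
    ("Buenos Aires", 2891082),
]

matches = [name for name, pop in populations
           if in_range(pop, 1000000, 4000000)]
print(matches)

# A value sitting exactly on the lower bound shows the difference.
print(in_range(1000000, 1000000, 4000000))
print(in_range(1000000, 1000000, 4000000, start_exclusive=True))
```

With this data both interpretations select the same three cities, which is why the endpoint question is easy to overlook until a value lands exactly on a boundary.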
Credit where credit’s due
As with part one of this two-part post, the example I’ve shown here owes some parts to one of the examples on the repoze.catalog website, and the structure of `MyZODB` was taken from the article ‘Example Driven ZODB’.