Sunday, November 18th, 2012

Been playing around with Object Databases

Have you heard of a special kind of database called an object database? It has been around for quite some time and has many uses. One benefit shows up in object-oriented programming, since it can store actual objects in its database. You can think of Active Directory or an LDAP server as a sort of object database. It stores many class-like objects with attributes such as a person's name, phone number, and address information. It can store much more than that: it basically describes an entire corporate directory of employees, various contacts, and machines. With an object database, you can simulate such a system.

From this, you may understand the difference between a relational database and an object database. Although you can simulate an object database with a relational database, it won't be ideal due to the large number of queries and tables necessary. An object is very dynamic, whereas a defined table in a relational database is very static. You can add new attributes to an object on the fly, but you cannot easily add new columns to a table on the fly. This is one advantage of an object database. Another is being able to organize your data in a hierarchical tree of objects, much like a file system on your computer.
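To make the contrast concrete, here is a minimal sketch that uses plain nested Python dicts as a stand-in for an object database. The `directory` layout, the sample entries, and the `walk` helper are illustrative assumptions, not part of any real library.

```python
# Hypothetical directory tree: nested dicts stand in for an object database.
directory = {
    "people": {
        "alice": {"name": "Alice", "phone": "555-0100"},
    },
    "machines": {},
}

# A new attribute can be added to a single object without touching any schema.
directory["people"]["alice"]["office"] = "B-204"

# A new branch of the tree can be created just as easily.
directory["machines"]["web01"] = {"os": "Debian"}


def walk(node, prefix=""):
    """Yield the path of every entry in the tree, like listing a file system."""
    for key in sorted(node):
        path = "%s/%s" % (prefix, key)
        yield path
        if isinstance(node[key], dict):
            for child in walk(node[key], path):
                yield child


paths = list(walk(directory))
```

In a relational schema, the `office` attribute would typically mean altering a table or adding a side table of key/value pairs; here it is a single assignment.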

One very popular object database in Python is ZODB, the Zope Object Database. However, it requires a lot of dependencies from Zope itself, so in my playing around, I avoided it entirely. Instead, I ended up using shelve, which ships in the Python standard library. It essentially provides a persistent dict object, similar to what ZODB provides. I created a few components for Django to test out an object database through a website, much like how Zope did. A middleware checks the incoming URL to see if an object for it exists in the object database and provides that object to the browser. I also created a Template Loader to test the loading of templates from an object database, similar to what Zope allows. I store the pre-compiled Django Template object in the database, so there is no need to re-compile it at runtime. In order to make these components work, I needed to make a dict traversable:

from odb.models import odb_dict

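def get_object(path):
    """Return the object stored at a slash-separated path, or None if missing."""
    if path == '/':
        return odb_dict
    ob = odb_dict
    # Walk the nested dicts one segment at a time. Building a string for
    # eval() would let a crafted URL path execute arbitrary code.
    for e in path.strip('/').split('/'):
        try:
            ob = ob[e]
        except (KeyError, TypeError):
            return None
    return ob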

def put_object(path, o):
    """Store o at a slash-separated path; parent containers must already exist."""
    if path == '/':
        return False
    parts = path.strip('/').split('/')
    ob = odb_dict
    # Descend to the parent container without eval(), then assign the last key.
    for e in parts[:-1]:
        ob = ob[e]
    ob[parts[-1]] = o
    return True
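As a usage illustration, the sketch below runs the same kind of traversal against a shelve file. It is self-contained, so the helpers take the root container as an explicit `root` argument instead of importing `odb_dict`; that signature, the temporary file location, and the use of `writeback=True` are assumptions made for the example, not necessarily how the project opens its database.

```python
import os
import shelve
import tempfile


def get_object(root, path):
    """Return the object at a slash-separated path under root, or None."""
    ob = root
    for part in [p for p in path.split('/') if p]:
        try:
            ob = ob[part]
        except (KeyError, TypeError):
            return None
    return ob


def put_object(root, path, value):
    """Store value at path; parent containers must already exist."""
    parts = [p for p in path.split('/') if p]
    if not parts:
        return False
    ob = root
    for part in parts[:-1]:
        ob = ob[part]
    ob[parts[-1]] = value
    return True


db_path = os.path.join(tempfile.mkdtemp(), "odb")

# writeback=True makes shelve remember nested dicts that were mutated in
# place, so they are written back when the shelf is closed.
with shelve.open(db_path, writeback=True) as root:
    put_object(root, "/templates", {})
    put_object(root, "/templates/home.html", "<h1>Home</h1>")

# Reopen the file to confirm that the data really was persisted.
with shelve.open(db_path) as root:
    stored = get_object(root, "/templates/home.html")
    missing = get_object(root, "/templates/nope.html")
```

Without `writeback=True`, the second `put_object` call would modify a temporary copy of the `templates` dict and the change would be lost, which is an easy mistake to make with nested containers in shelve.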

It is very basic and simple, but works wonderfully. I am sure there is a better way to handle it, which will add more speed. However, for the sole purpose of playing around, there is no need for optimizations. Here is the middleware I created to handle the requests:

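from inspect import isclass

from django.forms import Form
from django.http import HttpResponse
from django.template import RequestContext, Template
from django.views.generic import FormView

# get_object is the traversal helper shown earlier.

class TraversalMiddleware(object):
    def process_request(self, req):
        try:
            o = get_object(req.path)
            # A container with an 'index' entry serves that entry instead.
            # Strings are excluded: 'index' in o would be a substring test.
            if hasattr(o, '__getitem__') and not isinstance(o, str) and 'index' in o:
                o = o['index']
            if hasattr(o, 'render') and callable(o.render):
                return o.render(req)
            elif isinstance(o, HttpResponse):
                return o
            elif isinstance(o, str):
                return HttpResponse(o)
            elif isclass(o) and issubclass(o, Form):
                return FormView.as_view(form_class=o, template_name="odb/form.html")(req)
            elif isinstance(o, Template):
                return HttpResponse(o.render(RequestContext(req, {'GET': req.GET})))
            elif isinstance(o, dict):
                # Only list a directory node when the URL ends with a slash.
                if not req.path.endswith('/'):
                    return None
                items = []
                for k, v in o.items():
                    # Sub-directories get a trailing slash in their link.
                    href = '%s/' % k if isinstance(v, dict) else k
                    items.append('<li><a href="%s">%s</a></li>' % (href, k))
                return HttpResponse("<ul>%s</ul>" % "".join(items))
            return None
        except Exception:
            # Any failure falls through to Django's normal URL resolution.
            return None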

As one can see, it's very messy. It was messier before I made some recent changes. The code near the bottom generates HTML directly because I was too lazy to create a template while I was playing around. For the most part, this middleware does a lot and detects a large number of Django object types. I am currently working on a special content object class which has a render() function. This will replace everything seen above with just the call to render(), which will handle the rendering of that particular object. Here is the very short and simple Template Loader:

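from django.template import TemplateDoesNotExist
# Location of the loader base class in Django releases of the time (around 1.4).
from django.template.loader import BaseLoader

class TraverseLoader(BaseLoader):
    is_usable = True

    def load_template(self, template_name, template_dirs=None):
        # Templates live under a fixed /templates/ node in the object database.
        template = get_object("/templates/%s" % template_name)
        if template is None:
            raise TemplateDoesNotExist(template_name)
        return template, None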

This does currently hardcode the template directory, but this will change if I decide to make something of this. The idea of storing the templates in the object database is that it allows the templates to be edited in a web-based interface, like any other object in the object database. I admired Zope for its nicely polished web interface, which allows the creation of various objects and file types. I am planning on replicating this experience in Django. The content object class will support the easy creation of new objects, and the ability to render them on a website. As in object-oriented programming, these objects will be the main logic for a web application. To extend this object-based system, one would create a new object with the appropriate attributes and logic. At first, it may be more for content management purposes, which is a great way to use an object database. Future plans are to create an Authentication Backend which hooks into the object database, so that user objects can be stored there. This can make extending a user object as simple as adding a new attribute to the object itself. Another exciting feature is a full ACL system for the database, which will prevent access to specific objects, or actions upon them, without proper credentials.
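The ACL system is only a plan at this point, so the following is purely a hypothetical sketch of one shape it could take: rules attached to path prefixes, with the most specific matching rule winning and everything else denied. The `ACL` table, the group names, and the `is_allowed` function are all invented for illustration.

```python
# Hypothetical ACL table: path prefix -> {action: set of allowed groups}.
ACL = {
    "/": {"read": {"anonymous", "staff"}, "write": {"staff"}},
    "/users": {"read": {"staff"}, "write": {"admin"}},
}


def is_allowed(path, action, groups):
    """Check the most specific ACL entry that covers the given path."""
    parts = [p for p in path.split("/") if p]
    # Try the full path first, then each parent, ending at the root.
    while True:
        prefix = "/" + "/".join(parts)
        rules = ACL.get(prefix)
        if rules is not None and action in rules:
            return bool(rules[action] & set(groups))
        if not parts:
            return False  # no rule found anywhere: deny by default
        parts.pop()


anonymous_reads_template = is_allowed("/templates/home.html", "read", ["anonymous"])
anonymous_reads_user = is_allowed("/users/kevin", "read", ["anonymous"])
admin_writes_user = is_allowed("/users/kevin", "write", ["admin"])
```

Because the rules follow the same tree as the objects, a check like this could sit in the traversal middleware right after the object lookup.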

By implementing a render() function and using it to render the actual object, I am hoping to add complete object transparency. Say you wish to use multiple template engines for different purposes: the render() function will transparently handle everything and provide a usable Template object to Django. This transparency will go a step further and provide full introspection to automatically generate editing forms based on the object. If a template is assigned to an object, it will automatically generate a context and push it to the template when viewed by end-users. Listing a directory node in the object database will automatically render an object list template for that node. Unlike Django's ListView, however, this is not limited to a single object type, but will display all object types in that node with minimal effort. Hooks into the Django relational database will also be provided, which will enable easy creation of querysets to be rendered. This will greatly cut down on coding, as the class-based views will be dynamically created and rendered based on attributes set on the object. Since these objects can be dynamically edited via an easy-to-use web-based interface, anybody can create querysets or utilize any of Django's class-based views with no Python programming knowledge whatsoever.
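A rough sketch of what such a content object might look like is shown below. It is not the planned implementation: the `ContentObject` class and its `context` and `form_fields` methods are hypothetical names, and the standard library's `string.Template` stands in for a real template engine so that the example runs without Django. A real version would presumably return an HttpResponse from render().

```python
from string import Template  # stdlib stand-in for a real template engine


class ContentObject(object):
    """Hypothetical base class: every stored object knows how to render itself."""

    template = Template("<h1>$title</h1><p>$body</p>")

    def __init__(self, **attrs):
        # Attributes are set dynamically, as with any object in the database.
        self.__dict__.update(attrs)

    def context(self):
        """Build the template context from the object's public attributes."""
        return dict((k, v) for k, v in vars(self).items() if not k.startswith("_"))

    def form_fields(self):
        """Introspect the object to list the fields an edit form would need."""
        return sorted((name, type(value).__name__)
                      for name, value in self.context().items())

    def render(self, request=None):
        # safe_substitute leaves unknown placeholders alone instead of raising.
        return self.template.safe_substitute(self.context())


page = ContentObject(title="Hello", body="Stored in the object database.")
html = page.render()
fields = page.form_fields()
```

With objects shaped like this, the middleware only needs its first branch, the check for a callable render attribute, and the output of `form_fields` is the kind of information an automatically generated editing form could be built from.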

In the end, this idea will be targeted more towards current users of competing CMS software than towards Django programmers. However, without Django programmers to extend the objects to new heights, this idea won't really lift far beyond the objects which will be provided when I complete such an ambitious Django app. I am hoping that this idea will take off and provide a great starting point for users of existing CMS products who are looking for something a bit more powerful.

Comment #1: Posted 1 year, 9 months ago by Sean Upton

That phrase "a lot of dependencies" -- I do not think it means what you think it means. ZODB has few Zope dependencies, maybe just zope.interface and transaction (not really zope-specific anymore) package distributions, IIRC. And ability to work on a wider variety of Python platforms is coming soon with some reorganization (see ZODB-dev).

Comment #2: Posted 1 year, 9 months ago by Carlos de la Guardia

It takes all of 40 seconds to set up a virtualenv with ZODB installed, including the whole lot of six dependencies, only one of which has the word zope in the name and none of which could be considered "dependencies of zope itself" in the sense that you imply.

You get this:

- Transparent persistence for Python objects.
- Full ACID-compatible transaction support (including savepoints).
- History/undo ability.
- Efficient support for binary large objects (BLOBs).
- And more.

Saying that shelve provides something "similar to what zodb provides" is...well, a stretch.

Comment #3: Posted 1 year, 9 months ago by Kevin Veroneau

I do believe I will move over my development efforts to ZODB, as in all reality, the dependencies aren't all that much. Considering how much I gain from using ZODB over Shelve, it makes sense to move over to ZODB. If this project ends up being used for something, I'd rather it use ZODB than Shelve. Thank you for all your comments.

The number of dependencies when installed using Debian Squeeze's APT is 10. A good number of the packages contain "zope", and some of them make me wonder, like python-zdaemon. I guess I just like to keep a lean machine, so I'll install from PyPI instead.

PyPI had very similar dependencies, including zdaemon and the compiling of BTree, for a total of 7 dependencies using PyPI. In my book, anything over 5 I consider a lot.

Comment #4: Posted 1 year, 9 months ago by alex dante

I'm a big fan of the shove module (http://pypi.python.org/pypi/shove) which is effectively shelve with other back ends, including ZODB. But for any real usage, I always rely on ZODB.


Python Powered | © 2012-2014 Kevin Veroneau