ArcheBuilder wrap up (a.k.a. Crossing the XML ocean in a paper boat)

As I mentioned in the last note, I’m resuming work on ArcheBuilder,
which is a Schema Editor for Archetypes. Archetypes, for those
who don’t know, is a Zope product for generating content classes
complete with accessors, mutators, form validation, and view and
edit forms, all automatically generated from a schema definition [1].

[1] The motivation and the concepts behind all of this were
somewhat based on the upcoming Zope 3 and the way its schema is
designed, though the way it actually works is slightly different.

Archetypes Schemas are written in Python, by putting together Fields
and Widgets. Here’s a small Schema definition:

schema = BaseSchema + Schema((
  StringField('group',
              vocabulary=ARTICLE_GROUPS,
              widget=SelectionWidget(),
              ),
  StringField('blurb',
              searchable=1,
              widget=TextAreaWidget(),
              ),
  TextField('body',
            searchable=1,
            required=1,
            primary=1,
            default_output_type='text/html',
            allowable_content_types=('text/plain',
                                     'text/structured',
                                     'text/restructured',
                                     'text/html',
                                     'application/msword'),
            widget=RichWidget(label='Body'),
            ),
  ),
  marshall=PrimaryFieldMarshaller(),
  )

As you can see, it’s pretty simple to build a schema using this
syntax. Turning this into a usable content class inside Zope is not
much more difficult. You just put the schema into a class and
register the class with Archetypes. Given that you already defined
your schema in a module-level schema variable, your class and the
registration should look like this:

from Products.Archetypes.public import BaseContent, registerType

class Article(BaseContent):
   """This is a sample article"""

   schema = schema

registerType(Article, 'MyProject')

(Note: Some stuff was deliberately omitted for simplicity, but that’s
more or less all there is to it.)

Back in September, we were all at the Castle Sprint (held in a neat
castle in Goldegg, Austria), and I started, together with
Paul Everitt and Godefroid Chapelle, to work on what came to be
ArcheBuilder. The goals are:

  • Build a web-based application capable of building a schema
    definition, to be used later to build a content class for use in
    Zope.
  • Be able to show existing schemas
  • Be able to modify existing (non-FS [2]) schemas
  • Be able to create new content classes based on existing
    (Archetypes-based [3]) ones.

[2] Non-FS schemas are what we currently call Through the web
Schemas. This is somewhat related to the idea of storing code
inside ZODB, which was already mentioned by Jeremy. The only
problem is that we don’t have this feature in a stable way in the
current incarnation of ZODB, so we instead store an XML definition
in the ZODB, which is then processed into a transient Schema instance
when requested.
[3] It may also be possible to include other base classes in the
future, though we may completely punt on that if we judge YAGNI.

To build the infrastructure for this, we created a couple of
registries inside Archetypes. There are registries for:

  • Fields
  • Widgets
  • Validators
  • Types
  • Storage (to be built; I was just reminded that this one is missing)

And from those registries, we build a big XML file using ZPT. You can
take a look at registry.xml if you are curious.
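
As a rough illustration of the registry-to-XML step, here is a minimal sketch. It uses the standard library’s ElementTree rather than ZPT, and everything in it is an assumption made for the example: the REGISTRIES dictionary, the build_registry_xml helper, and the element names are hypothetical and do not reflect the actual layout of registry.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the Archetypes registries: each one maps a
# registered name to a short description. The real registries hold classes.
REGISTRIES = {
    "fields": {"StringField": "A string field", "TextField": "A text field"},
    "widgets": {"SelectionWidget": "A drop-down", "RichWidget": "A rich text editor"},
}

def build_registry_xml(registries):
    """Serialize all registries into one XML document (assumed layout)."""
    root = ET.Element("registry")
    for kind in sorted(registries):
        group = ET.SubElement(root, kind)
        for name in sorted(registries[kind]):
            entry = ET.SubElement(group, "entry", name=name)
            entry.text = registries[kind][name]
    return ET.tostring(root, encoding="unicode")

registry_xml = build_registry_xml(REGISTRIES)
```

Sorting the names keeps the generated file stable between runs, which makes it easier to diff and to validate.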

We also defined a Relax NG schema for this file, to be able to
validate it later. Unfortunately the Relax NG schema is a bit out of
sync with the file currently, but I will fix it soon.

So, using registry.xml and some XSLT, we built the XUL interface
you saw above, which we called ArcheBuilder. The transformation is
done using Sarissa, which is a neat JavaScript library. Here’s the
blurb from the Sarissa homepage:

Sarissa is a JavaScript meta-API. It bridges the gap of DOM XML
extentions between Internet Explorer and Mozilla (or Moz-based)
browsers. It is an effort to provide a common interface for those
extentions, bringing them closer to each other. It was originally
created to protect my sanity ;-)

What we do basically is read registry.xml, then apply an XSLT
stylesheet to turn it into XUL. Currently, the editor is not complete.
It is able to display the types in registry.xml, but it cannot yet
create or modify anything. The loaded document is kept in memory and
updated as you make changes using the UI. Then, later, when you want
to export an XML file, we just grab a node of the tree and serialize
it to a file.
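
The editor does this in JavaScript through Sarissa, but the same load, modify, and serialize-one-node idea can be sketched in Python with ElementTree. The XML layout below is made up for the example and is not the real registry.xml format.

```python
import xml.etree.ElementTree as ET

# Made-up document layout, only for illustration.
REGISTRY_XML = """
<registry>
  <types>
    <type name="Article">
      <field name="blurb" type="StringField"/>
    </type>
  </types>
</registry>
"""

# Load once and keep the tree in memory.
document = ET.fromstring(REGISTRY_XML)

# A UI action (here: adding a field) mutates the in-memory tree.
article = document.find("types/type[@name='Article']")
ET.SubElement(article, "field", name="body", type="TextField")

# Exporting means serializing only the node we care about.
exported = ET.tostring(article, encoding="unicode")
```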

Soo... this brings us to the interesting point, where Python is going
to get into the process again. Once we have the schema the way we want
it inside the builder, we will want to export it as an XML document. I
was thinking about saving one type per file, which would then yield a
usage like this:

from Products.Archetypes.public import BaseContent, registerType, \
     XMLSchema

class Article(BaseContent):
   """This is a sample article"""

   schema = XMLSchema('article.xml')

registerType(Article, 'MyProject')

Note that this is not much different from the above, except that we
are using XMLSchema to read the schema from a file here. Then Paul
suggested using an XPointer-style syntax to address the schema.
Something like:

schema = XMLSchema("mytypes.xml#/types/type[name='Article']")

Which is not bad, but I think that most people would dislike using
the XPath/XPointer syntax. So I proposed to support this syntax, but
to default to the first (or only) schema defined in the XML file if
no path expression is given.
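
To make the defaulting rule concrete, here is a minimal sketch of how such a reference could be split and resolved. The helper names split_reference and select_type are hypothetical, the document layout is assumed from the expression in the example, and ElementTree only understands a small XPath subset (not XPointer), so the leading root step is checked by hand.

```python
import xml.etree.ElementTree as ET

def split_reference(reference):
    """Split 'file.xml#expr' into (filename, expr); expr is None if absent."""
    filename, _, expression = reference.partition("#")
    return filename, (expression or None)

def select_type(root, expression=None):
    """Return the <type> node named by expression, or the first one found."""
    if expression is None:
        # Default: the first (or only) type in document order.
        return root if root.tag == "type" else root.find(".//type")
    # ElementTree cannot evaluate absolute paths on an element, so match
    # the first step against the root tag and apply the rest relatively.
    steps = expression.lstrip("/").split("/", 1)
    if steps[0] != root.tag:
        return None
    if len(steps) == 1:
        return root
    return root.find(steps[1])

# Assumed layout: each <type> carries its name in a <name> child element.
SAMPLE = """
<types>
  <type><name>Page</name></type>
  <type><name>Article</name></type>
</types>
"""
root = ET.fromstring(SAMPLE)
filename, expression = split_reference("mytypes.xml#/types/type[name='Article']")
selected = select_type(root, expression)
default = select_type(root, split_reference("mytypes.xml")[1])
```

With no expression, the lookup falls back to the first type in the file, so a one-type-per-file setup never needs the path syntax at all.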

Now, this brings up an interesting issue. We have in hand:

  • An XML file
  • A Relax NG schema for validating the file
  • An (optional) expression for selecting the schema inside the XML
    file which we are going to process

And we need to:

  • Validate the file
  • Select the right node if an expression was given
  • Turn the node into a Schema class instance, with all the fields and
    widgets and properties, like the schema defined for Article at
    the top of this page.
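
The last step, turning the selected node into something schema-like, might look roughly like the sketch below. It is only an illustration under stated assumptions: FieldSpec is a stand-in for a real Archetypes Field and Widget pair, node_to_fields is a hypothetical helper, and the element and attribute names are invented. Validation is left out because Relax NG support is not part of the Python standard library.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

@dataclass
class FieldSpec:
    """Stand-in for an Archetypes Field plus its Widget (hypothetical)."""
    name: str
    field_type: str
    widget: str
    properties: dict = field(default_factory=dict)

def node_to_fields(type_node):
    """Turn a <type> node into a list of FieldSpec objects."""
    specs = []
    for node in type_node.findall("field"):
        # Each <property> child becomes one keyword-style setting.
        props = {p.get("name"): p.text for p in node.findall("property")}
        specs.append(FieldSpec(
            name=node.get("name"),
            field_type=node.get("type"),
            widget=node.get("widget"),
            properties=props,
        ))
    return specs

# Invented layout that mirrors part of the Article schema.
TYPE_XML = """
<type name="Article">
  <field name="group" type="StringField" widget="SelectionWidget"/>
  <field name="body" type="TextField" widget="RichWidget">
    <property name="required">1</property>
    <property name="default_output_type">text/html</property>
  </field>
</type>
"""
fields = node_to_fields(ET.fromstring(TYPE_XML))
```

A real implementation would presumably look up each type and widget name in the registries and instantiate the actual classes instead of recording strings.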

I was thinking about two choices: One was to do all of this with the
libxml2 Python bindings, and the other was using libxml2 for validating
and selecting the node, and then using SAX for the XML -> Schema
instance transformation. I think that it may even be possible to do it
using XSLT, but I don’t like the idea that much. What is your opinion?
What is the best way to do it? Do you like the overall idea? Leave
your comments!
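
For comparison, the SAX option could be sketched like this with the standard library’s xml.sax module. The FieldHandler class and the XML layout are assumptions made for the example; it only collects names and types, where a real handler would build Field and Widget instances.

```python
import xml.sax

class FieldHandler(xml.sax.ContentHandler):
    """Collect (name, type) pairs from <field> elements as they stream by."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def startElement(self, name, attrs):
        # Called once per opening tag; no tree is built in memory.
        if name == "field":
            self.fields.append((attrs.get("name"), attrs.get("type")))

# Invented layout, only for illustration.
TYPE_XML = b"""<type name="Article">
  <field name="group" type="StringField"/>
  <field name="body" type="TextField"/>
</type>"""

handler = FieldHandler()
xml.sax.parseString(TYPE_XML, handler)
```

The handler has to track its own state between callbacks, which is the main cost of SAX compared with walking a tree that is already in memory.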

One thought on “ArcheBuilder wrap up (a.k.a. Crossing the XML ocean in a paper boat)”

  1. You asked…
    I’ve got some experience with XML technologies, but little with Python XML bindings, so excuse the comments if they are somewhat ignorant.
    The overall idea is sound. I see value in being able to specify a node expression as well as defaulting to the root.
    I do not see an advantage to using SAX processing for instance transformation, as the XML files should be relatively small in size. The benefits of using SAX are execution time and a small memory footprint. I am unsure if your functionality will realize either benefit. Additionally, using straight libxml2 may simplify the process for newcomers like myself. My comment applies doubly to introducing XSLT. I could do it, but would rather not have the added complexity.
    Now with all this said, I assume the creation of the instance happens at design time and not at runtime. If I am wrong, then by all means take the benefits of using SAX. Plone/Zope has enough of a system resource footprint.
