Alzabo - A data modelling tool and RDBMS-OO mapper |
Alzabo - A data modelling tool and RDBMS-OO mapper
Cannot be summarized here.
Alzabo is a program and a suite of modules, with two core functions. Its first use is as a data modelling tool. Through either a schema creation interface or a perl program, you can create a set of schema, table, column, etc. objects to represent your data model. Alzabo is also capable of reverse engineering your data model from an existing system.
Its second function is as an RDBMS to object mapping system. Once you
have created a schema, you can use the
Alzabo::Runtime::Table
and
Alzabo::Runtime::Row
classes to access its
data. These classes offer a high level interface to common operations
such as SQL SELECT, INSERT, DELETE, and UPDATE commands.
A higher level interface can be created through the use of the
Alzabo::MethodMaker
module. This module
takes a schema object and auto-generates useful methods based on the
tables, columns, and relationships it finds in the module. The code
is generates can be integrated with your own code quite easily.
To take it a step further, you could then aggregate a set of rows from different tables into a larger container object which could understand the logical relationship between these tables.
The Alzabo::Runtime::Row
objects support
the use of a caching system. Caching modules are included with the
distribution. However, you may substitute any caching system you like
provided it has the appropriate method interface.
Alzabo has a lot of documentation. If you are primarily interested in using Alzabo as an RDBMS-OO wrapper, much of the documentation can be skipped. This assumes that you will create your schema via the schema creation interface or via reverse engineering.
Here is the suggested reading order:
The section for your RDBMS:
the Alzabo::Runtime::Schema manpage - The most important parts here are those
related to loading a schema and connecting to a database. Also be
sure to read about the join
method.
the Alzabo::Runtime::Table manpage - This contains most of the methods used to
fetch rows from the database, as well as the
insert
method.
the Alzabo::Runtime::Row manpage - The row objects are how data is updated,
deleted, and retrieved from the database. Its also important to read
the section on its import
, as
this is used to determine how caching is done (for now).
the Alzabo::Runtime::Cursor manpage - The most important part of the documentation here is the HANDLING ERRORS section.
the Alzabo::Runtime::RowCursor manpage - A cursor object that returns only a single row.
the Alzabo::Runtime::JoinCursor manpage - A cursor object that returns multiple rows.
the Alzabo::MethodMaker manpage - One of the most useful parts of Alzabo. This module can be used to auto-generate methods based on the structure of your schema.
the Alzabo::ObjectCache manpage - This describes how to select the caching modules you want to use. It contains a number of scenarios and descibes how they are affected by caching. If you plan on using Alzabo in a multi-process environment (such as mod_perl) this is very important.
the Alzabo::Exceptions manpage - Describes the nature of all the exceptions used in Alzabo.
the Alzabo::QuickRef manpage - A quick reference for the various methods of the Alzabo objects.
Every schema keeps track of whether it has been instantiated or not.
A schema that is instantiated is one that exists in an RDBMS backend.
This can be done explicitly by calling the schema's
create
method. It is also
implicitly set when a schema is created as the result of reverse engineering.
Instantiation has several effects. The most important part of this is
to realize that once a schema is instantiated, the way it generates
SQL for itself changes. Before it is instantiated, if you ask it to
generate SQL via
Alzabo::Create::Schema->make_sql
,
it will generate the set SQL statements that are needed to create the
schema in the RDBMS.
After is instantiated, the schema will instead generate the SQL necessary to convert the version in the RDBMS backend to match the object's current state. This can be though of as a SQL 'diff'.
While this feature is quite useful, it can be confusing too. The most
surprising aspect of this is that if you create a schema via reverse engineering and then call
Alzabo::Create::Schema->make_sql
,
you will not get any SQL. This is because the schema knows that it is
instantiated and it also knows that it is the same as the version in
the RDBMS, so no SQL is necessary.
The way to deal with this is to call the
Alzabo::Create::Schema->set_instantiated
method with a false value. Use this method with care.
In Alzabo, data is returned in the form of a row object. This object can be used to access the data for an individual row.
Unless you are retrieving a row via a unique identifier (usually its
primary key), you will be given a cursor
object. This is quite similar to how DBI
uses statement handles
and is done for similar reasons.
The first thing you'll want to do is create a schema. The easiest way to do this is via the included web based schema creation interface, which requires the HTML::Mason package from CPAN (www.cpan.org) to run.
This interface can be installed during the normal installation process, and you will be prompted as to whether or not you want to use it.
The other way to create a schema is via a perl script. Here's the beginning of such a script:
use Alzabo::Create::Schema;
eval { my $s = Alzabo::Create::Schema->new( name => 'foo', rdbms => 'MySQL' );
my $table = $s->make_table( name => 'some_table' );
my $a_col = $table->make_column( name => 'a_column', type => 'int', nullable => 0, sequenced => 0, attributes => [ 'unsigned' ] );
$table->add_primary_key($a_col);
my $b_col = $table->make_column( name => 'b_column', type => 'varchar', length => 240, nullable => 0 );
$table->make_index( columns => [ { column => $b_col, prefix => 10 } ] );
...
$s->save_to_file; };
if ($@) { handle exceptions }
Alzabo uses exceptions as its error reporting mechanism. This means
that pretty much all calls to its methods should be wrapped in
eval{}
. This is less onerous than it sounds. In general, there's
no reason not to wrap all of your calls in one eval, rather than each
one in a seperate eval. Then at the end of the block simply check the
value of $@
. See the code of the included HTML::Mason
based
interface for examples.
Also see the Alzabo::Exceptions
documentation,
which lists all of the different exception used by Alzabo.
Alzabo is a powerful tool but as with many powerful tools it can also be a bit overwhelming at first. The easiest way to understand some of its basic capabilities is through some examples.. Let's first assume that you've created the following schema:
TABLE: Movie movie_id tinyint -- primary key title varchar(200) release_year year
TABLE: Person person_id tinyint -- primary key name varchar(200) birthdate date birthplace_location_id tinyint -- foreign key to location
TABLE: Job job_id tinyint -- primary key job varchar(200) -- something like 'actor' or 'director'
TABLE: Credit movie_id tinyint -- primary key part 1, foreign key to movie person_id tinyint -- primary key part 2, foreign key to person job_id tinyint -- primary key part 3, foreign key to job
TABLE: Location location_id tinyint -- primary key location varchar(200) -- 'New York City' or 'USA' parent_location_id tinyint -- foreign key to location
This is a vastly scaled down version of the 90+ table database that Alzabo was written to support.
First of all, let's do something simple. Let's assume I have a person_id value and I want to find all the movies that they were in and print the title, year of release, and the job they did in the movie. Here's what it looks like:
my $schema = Alzabo::Runtime::Schema->load_from_file( name => 'movies' );
my $person_t = $schema->table('Person'); my $credit_t = $schema->table('Credit'); my $movie_t = $schema->table('Movie'); my $job_t = $schema->table('Job');
# returns a row representing this person. my $person = $person_t->row_by_pk( pk => 42 );
# all the rows in the credit table that have the person_id of 42. my $cursor = $person->rows_by_foreign_key( foreign_key => $person_t->foreign_keys_by_table($credit_t) );
print $person->select('name'), " was in the following films:\n\n"; while (my $credit = $cursor->next_row) { # rows_by_foreign_key returns a RowCursor object. We immediately # call its next_row method, knowing it will only have one row (if # it doesn't then our referential integrity is in trouble!) my $movie = $credit->rows_by_foreign_key( foreign_key => $credit_t->foreign_keys_by_table($movie_t) )->next_row;
my $job = $credit->rows_by_foreign_key( foreign_key => $credit_t->foreign_keys_by_table($job_t) )->next_row;
print $movie->select('title'), " released in ", $movie->select('release_year'), "\n"; print ' ', $job->('job'), "\n"; }
A more sophisticated version of this code would take into account that a person can do more than one job in the same movie.
The method names are admittedly verbose but the end result code is significantly simpler to read than the equivalent using raw SQL and DBI calls.
Let's redo the example to use
Alzabo::MethodMaker
;
# I'm assuming that the pluralize_english subroutine pluralizes # things as one would expect. use Alzabo::MethodMaker( schema => 'movies', all => 1, name_maker => \&method_namer );
my $schema = Alzabo::Runtime::Schema->load_from_file( name => 'movies' );
# instantiates a row representing this person. my $person = $schema->Person->row_by_pk( pk => 42 );
# all the rows in the credit table that have the person_id of 42. my $cursor = $person->Credits;
print $person->name, " was in the following films:\n\n"; while (my $credit = $cursor->next_row) { my $movie = $credit->Movie;
my $job = $credit->Job;
print $movie->title, " released in ", $movie->release_year, "\n"; print ' ', $job->job, "\n"; }
Let's assume that we've been passed a hash of values representing an update to the location table. Here's a way of making sure that that this update won't lead to a loop in terms of the parent/child relationships.
sub update_location { my $self = shift; # this is the row object my %data = @_; if ( $data{parent_location_id} ) { my $parent_location_id = $data{parent_location_id}; my $location_t = $schema->table('Location'); while ( my $location = eval { $location_t->row_by_pk( pk => $parent_location_id ) } ) { die "Insert into location would create loop" if $location->select('parent_location_id') == $data{location_id};
$parent_location_id = $location->select('parent_location_id'); } } }
Once again, let's rewrite the code to use
Alzabo::MethodMaker
:
sub update_location { my $self = shift; # this is the row object my %data = @_; if ( $data{parent_location_id} ) { my $location = $self; while ( my $location = eval { $location->parent } ) { die "Insert into location would create loop" if $location->parent_location_id == $data{location_id}; } } }
Each subclass of Alzabo::SQLMaker is capable of exporting functions
that allow you to use all the SQL functions that your RDBMS provides.
These functions are normal Perl functions. They take as argument
normal scalars (strings and numbers), Alzabo::Column
objects, or
the return value of another SQL function. They may be used to select
data via the
Alzabo::Runtime::Table->function
method. They may also be used as part updates, inserts, and where
clauses, any place that is appropriate.
Examples:
use Alzabo::SQLMaker::MySQL qw(MAX NOW PI);
my $max = $table->function( function => MAX( $table->column('budget') ), where => [ $table->column('country'), '=', 'USA' ] );
$table->insert( values => { create_date => NOW() } );
$row->update( pi => PI() );
my $cursor = $table->rows_where( where => [ $table->column('expire_date'), '<=', NOW() ] );
my $cursor = $table->rows_where( where => [ LENGTH( $table->column('password'), '<=', 5 ] );
The documentation for the Alzabo::SQLMaker subclass for your RDBMS will contain a detailed list of all exportable functions.
In MySQL, there are a number of various types of integers. The type TINYINT can hold values from -128 to 127. But what if have more than 127 movies? And if that's the case we might have more than 127 people too.
For safety's sake, it might be best to make all of the primary key integer columns INT columns instead. And while we're at it we want to make them UNSIGNED as well, as we don't need to insert negative numbers into these columns.
You could break out the RDBMS manual (because you probably forgot the
exact ALTER TABLE syntax you'll need). Or you could use Alzabo. Note
that this time it is an
Alzabo::Create::Schema
object, not
Alzabo::Runtime::Schema
.
my $schema = Alzabo::Create::Schema->load_from_file( name => 'movies' );
foreach my $t ( $schema->tables ) { foreach my $c ( $t->columns ) { if ( $c->is_primary_key and lc $c->type eq 'tinyint' ) { $c->set_type('int'); $c->add_attribute('unsigned'); } } } $schema->create( user => 'user', password => 'password' ); $schema->save_to_file;
Alzabo aims to be as cross-platform as possible. To that end, RDBMS specific operations are contained in several module hierarchies.
The first, the Alzabo::Driver::*
hierarchy, is used to handle
communication with the database. It uses DBI
and the appropriate
DBD::*
module to handle communications. It provides a higher level
of abstraction than DBI
, requiring that the RDBMS specific modules
implement methods to do such things as create databases or return the
next value in a sequence.
The second, the Alzabo::RDBMSRules::*
hierarchy, is used during
schema creation in order to validate user input such as schema and
table names. It also generates SQL to create the database or turn one
schema into another (sort of a SQL diff). Finally, it also handles
reverse engineering an existing database.
The this, the Alzabo::SQLMaker::*
hierarchy, is used to generate
SQL and handle bound parameters for select, insert, update, and delete
operations.
The RDBMS to be used is specified when creating the schema. Currently, there is no easy way to convert a schema from one RDBMS to another, though this is a future goal.
Alzabo does not provide support for all possible MySQL features but is trying to get there. Missing features currently include table type and transactions.
Postgres support in Alzabo is currently missing several features.
First, reverse engineering does not handle constraints (including foreign keys). This will change in the future.
Second, reverse engineering cannot determine from the existence of a sequence that the sequence is meant to be used for a particular column unless the sequence was created as a result of assigning the serial type to a column.
Third, there is no support for large objects. This was considered but given that 7.1 now supports rows larger than 32K it was determined that supporting large objects was not worth the amount of effort reqiured.
The general design of Alzabo is as follows.
There are objects representing the schema, which contains table objects. Table objects contain column, foreign key, and index objects. Column objects contain column definition objects. A single column definition may be shared by multiple columns, but has only one owner.
This is a diagram of these inheritance relationships:
Alzabo::* (::Schema, ::Table, ::Column, ::ColumnDefinition, ::ForeignKey, ::Index) / \ is parent to / \ Alzabo::Create::* Alzabo::Runtime::*
This a diagram of how objects contain other objects:
Schema - makes--Alzabo::SQLMaker subclass object (many) / \ contains contains--Alzabo::Driver subclass object (1) | \ Table (0 or more) Alzabo::RDBMSRules subclass object (1) / \ (* Alzabo::Create::Schema only) / \ contains-------------------- / \ \ / \ \ ForeignKey Column (0 or more) Index (0 or more) (0 or more) | contains | ColumnDefinition (1)
Note that more than one column _may_ share a single definition object
(this is explained in the
Alzabo::Create::ColumnDefinition
documentation). This is only relevant if you are writing a schema
creation interface.
Alzabo::Driver
Alzabo::SQLMaker
Alzabo::RDBMSRules
Second they are used to generate SQL for creating and updating the database and its tables.
And finally, they also handle the reverse engineering of an existing database.
Alzabo::ObjectCache
, Alzabo::ObjectCache::Store::*
and Alzabo::ObjectCache::Sync::*
Alzabo::ObjectCache::Store
hierarchy are used to store row objects
persistently. The modules in Alzabo::ObjectCache::Sync
are used to
make sure that multiple instances of the same row objects in different
processes don't step on each other's toes.
Alzabo::Runtime::Row
Alzabo::Runtime::Table
,
Alzabo::Runtime::RowCursor
, and
Alzabo::Runtime::JoinCursor
objects.
It is the sole interface by which actual data is retrieved, updated,
or deleted in a table.
Alzabo::Runtime::JoinCursor
and Alzabo::Runtime::RowCursor
Alzabo::Config
Alzabo::ChangeTracker
Alzabo::Create::Schema
.
Alzabo::MethodMaker
Alzabo::Exceptions
Alzabo::Util
There are several reasons for doing this:
Copyright (c) 2000-2001 Dave Rolsky
All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
Dave Rolsky, <autarch@urth.org>
Alzabo - A data modelling tool and RDBMS-OO mapper |