DBIx::DataModel - Classes and UML-style Associations on top of DBI |
insert_into_...
DBIx::DataModel - Classes and UML-style Associations on top of DBI
Declare the schema
use DBIx::DataModel; DBIx::DataModel->Schema('MySchema'); # MySchema is now a Perl package
Declare the tables with
(Perl name, DB name, primary key column(s))
.
Each table then becomes a Perl package.
MySchema->Table(qw/Employee Employee emp_id/); MySchema->Table(qw/Department Department dpt_id/); MySchema->Table(qw/Activity Activity act_id/);
Declare associations or compositions in UML style
( [table1 role1 multiplicity1 join1], [table2...]
).
MySchema->Composition([qw/Employee employee 1 /], [qw/Activity activities * /]); MySchema->Association([qw/Department department 1 /], [qw/Activity activities * /]);
Declare a n-to-n association, on top of the linking table
MySchema->Association([qw/Department departments * activities department/]); [qw/Employee employees * activities employee/]);
Declare ``column types'' with some handlers ..
# date conversion between database (yyyy-mm-dd) and user (dd.mm.yyyy) MySchema->ColumnType(Date => fromDB => sub {$_[0] =~ s/(\d\d\d\d)-(\d\d)-(\d\d)/$3.$2.$1/}, toDB => sub {$_[0] =~ s/(\d\d)\.(\d\d)\.(\d\d\d\d)/$3-$2-$1/}, validate => sub {$_[0] =~ m/(\d\d)\.(\d\d)\.(\d\d\d\d)/}); # 'percent' conversion between database (0.8) and user (80) MySchema->ColumnType(Percent => fromDB => sub {$_[0] *= 100 if $_[0]}, toDB => sub {$_[0] /= 100 if $_[0]}, validate => sub {$_[0] =~ /1?\d?\d/});
.. and apply these ``column types'' to some of our columns
Employee->ColumnType(Date => qw/d_birth/); Activity->ColumnType(Date => qw/d_begin d_end/); Activity->ColumnType(Percent => qw/activity_rate/);
Declare a column that will be filled automatically at each update
MySchema->AutoUpdateColumns(last_modif => sub{$ENV{REMOTE_USER}.", ".scalar(localtime)});
For details that could not be expressed in a declarative way, just add a new method into the table class (but in that case, Schema and Table declarations should be in a BEGIN block, so that the table class is defined before you start adding methods to it).
package Activity; sub activePeriod { my $self = shift; $self->{d_end} ? "from $self->{d_begin} to $self->{d_end}" : "since $self->{d_begin}"; }
Declare how to automatically expand objects into data trees
Activity->AutoExpand(qw/employee department/);
use MySchema;
Search employees whose name starts with 'D' (select API is taken from the SQL::Abstract manpage)
my $empl_D = Employee->select({lastname => {-like => 'D%'}});
idem, but we just want a subset of the columns, and order by age.
my $empl_F = Employee->select(-columns => [qw/firstname lastname d_birth/], -where => {lastname => {-like => 'F%'}}, -orderBy => 'd_birth');
Print some info from employees. Because of the 'fromDB' handler associated with column type 'date', column 'd_birth' has been automatically converted to display format.
foreach my $emp (@$empl_D) { print "$emp->{firstname} $emp->{lastname}, born $emp->{d_birth}\n"; }
Same thing, but using method calls instead of direct access to the hashref (must enable AUTOLOAD in the table or the whole schema)
Employee->Autoload(1); # or MySchema->Autoload(1) foreach my $emp (@$empl_D) { printf "%s %s, born %s\n", $emp->firstname, $emp->lastname, $emp->d_birth; }
Follow the joins through role methods
foreach my $act (@{$emp->activities}) { printf "working for %s from $act->{d_begin} to $act->{d_end}", $act->department->name; }
Role methods can take arguments too, like select()
my $recentAct = $dpt->activities({d_begin => {'>=' => '2005-01-01'}}); my @recentEmpl = map {$_->employee([qw/firstname lastname/])} @$recentAct;
Export the data : external helper modules usually expect a full data tree (instead of calling methods dynamically), so we need to expand the objects :
$_->expand('activities') foreach @$empl_D; my $export = {employees => $empl_D}; use Data::Dumper; print Dumper ($export); # export as PerlDump use XML::Simple; print XMLout ($export); # export as XML use JSON; print objToJson($export); # export as Javascript
Select associated tables directly from a database join, in one single SQL statement (instead of iterating through role methods).
my $lst = MySchema->ViewFromRoles(qw/Employee activities department/) ->select([qw/lastname dept_name d_begin/], {d_begin => {'>=' => '2000-01-01'}});
DBIx::DataModel
is a wrapper framework for building Perl
abstractions (classes, objects and datastructures) that interact
with relational database management systems (RDBMS).
Of course the ubiquitous DBI module is used as
a basic layer for communicating with databases; on top of that,
DBIx::DataModel
provides facilities for generating SQL queries,
joining tables automatically, navigating through the results,
converting values, and building complex datastructures so that other
modules can conveniently exploit the data.
There are many other CPAN modules offering similar features, like Class::DBI, DBIx::Class, Alzabo, Tangram, Rose::DB::Object, just to name a few well-known alternatives. Module frameworks in this family are called object-relational mappings (ORMs). The mere fact that they are so numerous demonstrates that there is more than one way to do it, and therefore it is quite unlikely that any ORM would ever cover all possible needs.
A brief discussion of the design space and of other Perl ORMs
is provided in section SEE ALSO of this
manual; for the moment, we will concentrate on introducing
the main concepts and features of DBIx::DataModel
.
Although the basic principles are quite simple, there are many details to discuss, so the documentation is quite long. In an attempt to accomodate for different needs of readers, it has been structured as follows :
DBIx::DataModel
; it is mainly
of interest if you are comparing various ORMs.
The QUICKSTART section summarizes the main
steps to get started with the framework.
The METHOD REFERENCE section is a complete reference
to all methods, divided according to usage steps:
creating a schema, populating it with table and associations,
parameterizing the framework, and finally data retrieval and
manipulation methods.
The OTHER CONSIDERATIONS section discusses
how this framework interacts with its context
(Perl namespaces, DBI layer, etc.).
The INTERNALS section documents the internal
structure of the framework, for programmers who might
be interested in extending it.
The SEE ALSO section briefly discusses
other ORMs.
The TO DO section lists some features that hopefully
will be implemented in a future release.
DISCLAIMER: although already in production in our organization, this code is still in beta, so the API may slightly change in future versions.
This section covers the main motivating principles for proposing yet
another ORM. Read it if you are currently evaluating whether
DBIx::DataModel
is suitable for your context. Skip it and jump to
the QUICKSTART section if you want to directly
start using the framework.
DBIx::DataModel
provides abstractions that help client applications
to automate some common tasks; however, access to lower-level
layers remains open, for cases where detailed operations are needed :
Besides basic SQL data definition statements,
RDBMS often come with their own helper tools for creating or modifying
a database schema (interactive editors for tables,
columns, datatypes, etc.). Therefore
DBIx::DataModel
provides no support in this area,
and assumes that the database schema is pre-existent.
To talk to the database, the framework only needs to know a bare minimum about the schema, namely the table names, primary keys, and UML associations; but no details are required about column names or their datatypes.
Most RDBMS have facilities for checking or ensuring integrity rules :
foreign key constraints, restricted ranges for values, cascaded
deletes, etc. DBIx::DataModel
can also do some validation
tasks, by setting up column types with a validate
handler;
however, it is better advised to exploit data integrity
checks within the RDBMS whenever possible.
Often in ORMs, columns in the table are in 1-to-1 correspondance with attributes in the class; so any transfer between database and memory systematically includes all the columns, both for selects and for updates. Of course this has the advantage of simplicity for the programmer. However, it may be very inefficient if the client program only wants to read two columns from a very_big_table.
Furthermore, unexpected concurrency problems may occur : in a scenario such as
client1 client2 ======= ======= my $obj = MyTable->fetch($key); my $obj = MyTable->fetch($key); $obj->set(column1 => $val1); $obj->set(column2 => $val2); $obj->update; $obj->update;
the final state of the row should theoretically
be consistent for any concurrent execution of client1
and client2
.
However, if the ORM layer blindly updates all columns, instead of just
the changed columns, then the final value of column1
or
column2
is unpredictable.
To diminish the efficiency problem, some ORMs offer the possibility to partition columns into several column groups. The ORM layer then transparently fetches the appropriate groups in several steps, depending on which columns are requested from the client. However, this might be another source of inefficiency, if the client frequently needs one column from the first group and one from the second group.
With DBIx::DataModel
, the client code has very precise control over
which columns to transfer, because these can be specified separately at
each method call. Whenever efficiency is not an issue, one
can be lazy and specify nothing, in which case the SELECT columns will
default to ``*''. Actually, the schema
does not know about column names, except for primary and
foreign keys, and therefore would be unable to transparently
decide which columns to retrieve. Consequently, objects from a
given class may be of variable size :
my $objs_A = MyTable->select(-columns => [qw/c1 c2/], -where => {name => {-like => "A%"}}; my $objs_B = MyTable->select(-columns => [qw/c3 c4 c5/], -where => {name => {-like => "B%"}}; my $objs_C = MyTable->select(# nothing specified : defaults to '*' -where => {name => {-like => "C%"}};
Therefore the programmer has much more freedom and control, but of
course also more responsability : in this example, attempts to access
column c1
in members of @$objs_B
would yield an error.
ORMs often have difficulties to exploit database joins, because joins contain columns from several tables at once. If tables are mapped to classes, and rows are mapped to objects of those classes, then what should be the class of a joined row ? Three approaches can be taken
DBIx::DataModel
.
Relationships are expressed in a syntax designed to closely reflect how they would be pictured in a Unified Modelling Language (UML) diagram. The general form is :
MySchema->Association([$class1, $role1, $multiplicity1, @columns1], [$class2, $role2, $multiplicity2, @columns2]);
yielding for example the following declaration
MySchema->Association([qw/Department department 1 /], [qw/Activity activities * /]);
which corresponds to UML diagram
+------------+ +------------+ | | 1 0..* | | | Department +-------------------------+ Activities | | | department activities | | +------------+ +------------+
This states that there is an association between classes Department
and Activity
, with the corresponding role names (roles are used
to navigate through the association in both directions), and with the
corresponding multiplicities (here an activity corresponds to exactly
one employee, while an employee may have many activities).
In the UML specification, role names and multiplicities are optional (as a matter of fact, many UML diagrams use association names, or even anonymous associations, instead of role names). Here, both role names and multiplicities are mandatory, because they are needed for code generation.
The association declaration is bidirectional, so it will simultaneously add features in both participating classes.
In order to generate the appropriate SQL join statements, the framework needs to know the join column names on both sides; these can be either given explicitly in the declaration, or they are guessed from the primary key of the table with multiplicity 1.
Role names declared in the association are used for a number of purposes : implementing methods for direct navigation, implementing methods for inserting new members into owner objects, and implementing multi-step navigation paths through several assocations, such as in :
$myDepartment->selectFromRoles(qw/activities employee spouse/) ->(-columns => \@someColumns, -where => \%someCriteria);
Information known by the schema about the associations will be used to automatically generate the appropriate database joins. The kinds of joins (INNER JOIN, LEFT OUTER JOIN) are inferred from the multiplicities declared in the association. These can also be explicitly overridden by writing
...->selectFromRoles(qw/activities <=> employee <=> spouse/) # inner joins
...->selectFromRoles(qw/activities => employee => spouse/) # left joins
If referential integrity rules are declared within the RDBMS, then
there is some overlap with what is declared here on the Perl
side. However, it would not be possible to automatically deduce all
association information from database metadata, because the database
does not know about role names and multiplicities. Therefore
DBIx::DataModel
has no ``loader'' facility to automatically
generate a schema.
Compositions are specific kinds of associations, pictured in UML
with a black diamond on the side of the composite class;
in DBIx::DataModel
, those are expressed by calling the
schemas's Composition method instead of Association.
As a result, the composite class will be able to perform
cascaded insertions and deletions on data trees (for example
from structured data received through an external XML or JSON file, and
inserted into the database in a single method call).
The reverse is also true : the composite class is able to automatically call its own methods to gather data from associated classes and build a complete data tree in memory. This is declared through the AutoExpand method and is useful for passing structured data to external modules, like for example XML or JSON exports.
A DBIx::DataModel
schema can declare some column types : these
are collections of handlers (callback functions) for
performing tasks such as data validation or transformation.
Handlers are then attached to specific columns belonging to that column
type.
The handler concept is generic and can be exploited by client programs
according to the application domain. However, some handler names
have a special meaning within the framework :
for example, handlers named fromDB
or toDB
are automatically
called when transfering data from or to the database.
Take for example the ``Percent'' column type shown in the Synopsis :
# 'percent' conversion between database (0.8) and user (80) MySchema->ColumnType(Percent => fromDB => sub {$_[0] *= 100 if $_[0]}, toDB => sub {$_[0] /= 100 if $_[0]}, validate => sub {$_[0] =~ /1?\d?\d/});
Note that this notion of ``type'' is independent from the actual datatypes defined within the database (integer, varchar, etc.). From the Perl side, these are all seen as scalar values. So a column type as defined here is just a way to specify some operations, programmed in Perl, that can be performed on the scalar values.
The default mechanism to access columns within a row is the hashref API:
do_something_with($my_row->{column_name});
However, a method call API can be turned on, which would then yield:
do_something_with($my_row->column_name());
A schema can contain View
declarations, which are
abstractions of SQL statements. This is exactly the
same idea as database views, except that they are implemented
within the ORM, not within the database. Such views
can join several tables, or can specify WHERE
clauses to filter the data. ORM views are useful to
implement application-specific or short-lived requests,
that would not be worth registering persistently within
the database model. They can also be useful if you have
no administration rights in the database.
Of course it is also possible to access database views,
because the ORM sees them as ordinary tables.
Every method involving a SELECT in the database (either when searching rows from a table or collection of tables, or when following associations from an existing row) accepts an number of optional parameters that closely correspond to SQL clauses. The programming interface reuses what is defined in the excellent SQL::Abstract module, with some extensions. Therefore it is possible for example to specify
All these parameters are specified at the statement level, and
therefore may vary between subsequent calls to the same class.
This is in contrast with many other ORMs where the set of columns
or the ordering criteria are specified at schema definition time.
As already stated above, DBIx::DataModel
gives more
freedom to client programs, but also more responsability.
DBIx::DataModel
only depends on DBI and
SQL::Abstract, so
it should be very easy to install even without help of tools
like ppm
, cpan
or cpanp
.
This section will show the main steps to get started
with DBIx::DataModel
. The goal here is conciseness, not
completeness; a full reference will be given in the next sections.
The tutorial is a gentle expansion of the examples given
in the SYNOPSIS, namely a small human resources management
system.
Before starting with DBIx::DataModel
, you should have
installed CPAN modules DBI and SQL::Abstract.
You also need a database management system with a DBD driver.
Use your database modeling tool to create some tables for employees, departments, activities (an employee working in a department from a start date to an end date), and employee skills. If you have no modeling tool, you can also feed something like the following SQL code to the database
CREATE TABLE t_employee ( emp_id INTEGER AUTO_INCREMENT PRIMARY KEY, lastname TEXT NOT NULL, firstname TEXT, d_birth DATE ); CREATE TABLE t_department ( dpt_code VARCHAR(5) PRIMARY KEY, dpt_name TEXT NOT NULL ); CREATE TABLE t_activity ( act_id INTEGER AUTO_INCREMENT PRIMARY KEY, emp_id INTEGER NOT NULL REFERENCES t_employee(emp_id), dpt_code VARCHAR(5) NOT NULL REFERENCES t_department(dpt_code), d_begin DATE NOT NULL, d_end DATE ); CREATE TABLE t_skill ( skill_code VARCHAR(2) PRIMARY KEY, skill_name TEXT NOT NULL ); CREATE TABLE t_employee_skill ( emp_id INTEGER NOT NULL REFERENCES t_employee(emp_id), skill_code VARCHAR(2) NOT NULL REFERENCES t_skill(skill_code), CONSTRAINT PRIMARY KEY (emp_id, skill_code) );
As can be seen from this SQL, we assume that the primary keys
for t_employee
and t_activity
are generated
automatically by the RDBMS. Primary keys for other tables
are character codes and therefore should be supplied by
the client program. We decided to use the suffixes
_id
for auto-generated keys, and _code
for user-supplied
codes.
DBIx::DataModel
needs to acquire some knowledge about
the datamodel. So we first declare a schema :
use DBIx::DataModel; DBIx::DataModel->Schema('HR');
Here we have chosen a simple acronym HR
as the schema name, but it
could as well have been something like Human::Resources
.
The schema now is a Perl class, so we invoke its Table
method to declare the first table within the schema :
HR->Table(qw/HR::Employee t_employee emp_id/);
This creates a new Perl class named HR::Employee
. It could as well
have been simply named Employee
, or any other legal Perl package
name; here the use of the schema name as a prefix is just a design
choice, not an obligation. The second argument t_employee
is the
database table, and the third argument emp_id
is the primary key.
So far nothing is declared about other columns in the table.
Other tables are declared in a similar fashion :
HR->Table(qw/HR::Department t_department dpt_code/); HR->Table(qw/HR::Activity t_activity act_id/); HR->Table(qw/HR::Skill t_skill skill_code/); HR->Table(qw/HR::EmployeeSkill t_employee_skill emp_id skill_code/);
This last declaration has 4 arguments because the primary key ranges over 2 columns.
RDBMS will usually require that dates be in ISO format of shape
yyyy-mm-dd
. Let's assume our users are European and want
to see and enter dates of shape dd.mm.yyyy
. Insert of converting
back and forth within the client code, it's easier to do it at the ORM
level. So we define conversion routines within a ``Date'' column type
HR->ColumnType(Date => fromDB => sub {$_[0] =~ s/(\d\d\d\d)-(\d\d)-(\d\d)/$3.$2.$1/ if $_[0]}, toDB => sub {$_[0] =~ s/(\d\d)\.(\d\d)\.(\d\d\d\d)/$3-$2-$1/ if $_[0]}, validate => sub {$_[0] =~ m/\d\d\.\d\d\.\d\d\d\d/});
and then apply this type to the appropriate columns
HR::Employee->ColumnType(Date => qw/d_birth/); HR::Activity->ColumnType(Date => qw/d_begin d_end/);
Here we just perform scalar conversions; another design choice could be to ``inflate'' the data to some kind of Perl objects.
Observe that ColumnType
is overloaded : when invoked on a schema, it
defines a column type; when invoked on a class, it
applies the column type to some columns.
Now we will declare a binary association between departements and activities:
HR->Association([qw/HR::Department department 1 /], [qw/HR::Activity activities * /]);
The Association
method takes two references to lists of arguments;
in each of them, we find : class name, role name, multiplicity, and
optionally the names of columns participating in the join. Here
column names are not specified, so the method assumes that the join
is on dpt_code
(from the primary key of the class
with multiplicity 1 in the association). Since associations
are symmetric, you could as well supply the two lists in the
reverse order.
The declaration should be read crosswise, like when reading a UML
class diagram : here we are stating that a department may be associated
with several activities; therefore the HR::Department
class will
contain an activities
method which returns an arrayref. Conversely,
an activity is associated with exactly one department, so the
HR::Activity
class will contain a department
method which returns a
single instance of HR::Department
.
The second association could be defined in a similar way; but here we will introduce the new concept of composition.
HR->Composition([qw/HR::Employee employee 1 /], [qw/HR::Activity activities * /]);
This looks exactly like an association declaration; but it states that an activity somehow ``belongs'' to an employee (cannot exist without being attached to an employee, and is often created and deleted together with the employee). In a UML class diagram, this would be pictured with a black diamond on the Employee side. Using a composition instead of an association in this particular example would perhaps be debated by some data modelers; but at least it allows us to illustrate the concept.
A composition declaration behaves in all respects like an association.
The main difference is in insert
and delete
methods, which will
be able to perform more complex operations on data trees : for example
it will be possible in one method call to insert an employee together
with its activities. Compositions also support auto-expansion
of data trees through the AutoExpand method.
Now comes the association between employees and skills, which is a many-to-many association. This happens in two steps: first we declare as usual the associations with the linking table :
HR->Association([qw/HR::Employee employee 1 /], [qw/HR::EmployeeSkill emp_skills * /]);
HR->Association([qw/HR::Skill skill 1 /], [qw/HR::EmployeeSkill emp_skills * /]);
Then we declare the many-to-many association:
HR->Association([qw/HR::Employee employees * emp_skills employee/], [qw/HR::Skill skills * emp_skills skill /]);
This looks almost exactly like the previous declarations, except that the last arguments are no longer column names, but rather role names: these are the sequences of roles to follow in order to implement the association. This example is just an appetizer; more explanations are provided in the reference section.
To use the schema, we first need to supply it with a database connection :
my $dbh = DBI->connect(...); # parameters according to your RDBMS HR->dbh($dbh); # give $dbh handle to the schema
Now we can start populating the database:
my ($bach_id, $berlioz_id, $monteverdi_id) = HR::Employee->insert({firstname => "Johann", lastname => "Bach" }, {firstname => "Hector", lastname => "Berlioz" }, {firstname => "Claudio", lastname => "Monteverdi"});
Observe that several rows can be created at once (of course you get the
same result by calling insert()
several times). According to our
earlier assumptions, keys are generated automatically within the
database, so they need not be supplied here. The return value of the
method is the list of generated ids (provided that your database driver
supports DBI's last_insert_id method).
Similarly, we create some departments and skills (here with explicit primary keys) :
HR::Department->insert({dpt_code => "CPT", dpt_name => "Counterpoint" }, {dpt_code => "ORCH", dpt_name => "Orchestration"});
HR::Skills->insert({skill_code => "VL", skill_name => "Violin" }, {skill_code => "KB", skill_name => "Keyboard"}, {skill_code => "GT", skill_name => "Guitar"},
To perform updates, there is either a class method or an object method. Here is an example with the class method :
HR::Employee->update($bach_id => {firstname => "Johann Sebastian"});
Associations have their own insert methods, named insert_into_*
:
my $bach = HR::Employee->fetch($bach_id); # get single record from prim.key $bach->insert_into_activities({d_begin => '01.01.1695', d_end => '18.07.1750', dpt_code => 'CPT'}); $bach->insert_into_emp_skills({skill_code => 'VL'}, {skill_code => 'KB'});
Compositions implement cascaded inserts from a given data tree :
HR::Employee->insert({firstname => "Richard", lastname => "Strauss", activities => [ {d_begin => '01.01.1874', d_end => '08.09.1949', dpt_code => 'ORCH' } ]});
The select()
method retrieves several records from a class :
my $all_employees = HR::Employee->select; foreach my $emp (@$all_employees) { do_something_with($emp); }
or maybe we want something more specific :
my @columns = qw/firstname lastname/; my %criteria = (lastname => {-like => 'B%'}); my $some_employees = HR::Employee->select(-columns => \@columns, -where => \%criteria, -orderBy => 'd_birth');
From a given object, role methods allow us to get associated objects :
foreach my $emp (@$all_employees) { print "$emp->{firstname} $emp->{lastname} "; my @skill_names = map {$_->{skill_name} }} @{$emp->skills}; print " has skills ", join(", ", @skill_names) if @skill_names; }
Passing arguments to role methods, we can restrict to
specific columns or specific rows, exactly like the
select()
method :
my @columns = qw/d_begin d_end/; my %criteria = (d_end => undef); my $current_activities = $someEmp->activities(-columns => \@columns, -where => \%criteria);
And it is possible to join on several roles at once:
my $result = $someEmp->selectFromRoles(qw/activities department/) ->(-columns => \@columns, -where => \%criteria);
This concludes our short tutorial. More examples are given in the reference section below.
Method names starting with an uppercase letter are meant to be compile-time class methods. These methods will typically be called when loading a module like 'MySchema.pm', and therefore will be executed during the BEGIN phase of the Perl compiler. They instruct the compiler to create classes, methods and datastructures for representing the elements of a database schema.
Method names starting with a lowercase letter are meant to be usual run-time methods, either for classes or for instances.
DBIx::DataModel->Schema($schemaName, %options)
Creates a new Perl class of name $schemaName
that represents a
database schema. That class inherits from
DBIx::DataModel::Schema
.
Possible options are :
dbh => $dbh
sqlDialect => $dialect
'MsAccess'
, 'BasisODBC'
,
'BasisJDBC'
or 'Default'
(contributions to enrich this list
are welcome). Otherwise $dialect
can also be a hashref in which you
supply the following information :
%s INNER JOIN %s ON %s
.
If your database does not support inner joins, set this to undef
and the generated SQL will be in the form T1, T2, ... Tn WHERE ... AND ...
.
%s LEFT OUTER JOIN %s ON %s
.
left
or right
%s AS %s
.
tableParent => 'My::Parent::Class::For::Tables'
DBIx::DataModel::Table
.
The argument may also be an arrayref of several parent class names.
viewParent => 'My::Parent::Class::For::Views'
DBIx::DataModel::View
.
The argument may also be an arrayref of several parent class names.
MySchema->Table($pckName, $dbTable, @primKey);
Creates a new Perl class of name $pckName
that represents a
database table. That class inherits from
DBIx::DataModel::Table
.
$dbTable
should contain the name of the table in the database.
@primKey
should contain the name of the column
(or names of columns) holding the primary
key for that table. This info will be used for interpreting arguments
to the fetch method, and for filling WHERE clauses in the SQL
generated by the update method.
MySchema->View($viewName, $columns, $dbTables, \%where, @parentTables);
Creates a new Perl class of name $viewName
that represents a
SQL SELECT request of shape
SELECT $columns FROM $dbTables [ WHERE %where ]
Therefore arguments $columns
and $dbTables
should be strings;
the optional \%where
argument should be a hashref, as explained below.
View()
is seldom called explicitly from client code; it is mainly
useful internally for implementing other methods like
ViewFromRoles or selectFromRoles. However, it could also
be used to build queries with specific SQL clauses like for example
MySchema->View(MyView => "DISTINCT column1 AS c1, t2.column2 AS c2", "Table1 AS t1 LEFT OUTER JOIN Table2 AS t2 ON t1.fk=t2.pk", {c1 => 'foo', c2 => {-like => 'bar%'}}, qw/Table1 Table2/);
The class generated by View()
has a select()
method, which will
$viewName
.
See the SQL::Abstract manpage and the select() method for a
complete description of what to put in the \%where
argument. For the
moment, just consider the following example:
my $lst = MyView->select({c3 => 22});
This would generate the SQL statement:
SELECT DISTINCT column1 AS c1, t2.column2 AS c2 FROM Table1 AS t1 LEFT OUTER JOIN Table2 AS t2 ON t1.fk=t2.pk WHERE (c1 = 'foo' AND c2 LIKE 'bar%' AND c3 = 22)
The \%where
argument can of course be undef
.
The optional list of @parentTables
contains names of Perl
table classes from which the view will also inherit.
If the SQL code in $dbTables
is a join between
several tables, then it is a good idea to mention these
tables in @parentTables
, so that their
role methods become available to instances
of MyView
. Be careful about table names :
the SQL code in $dbTables
should contain database table names,
whereas the members of @parentTables
should be
Perl table classes (might be the same, but not necessarily).
Perl views as defined here have nothing to do with views declared in
the database itself. Perl views are totally unknown to the database,
they are just abstractions of SQL statements. If you need to access
database views, just use the Table
declaration, like for a regular
table.
MySchema->Association([$class1, $role1, $multiplicity1, @columns1], [$class2, $role2, $multiplicity2, @columns2]);
Declares an association between two tables (or even two instances of the same table), in a UML-like fashion. Each side of the association specifies its table, the ``rolename'' of of this table in the association, the multiplicity, and the name of the column or list of columns that technically implement the association as a database join.
Role names should be chosen so as to avoid conflicts with column names in the same table.
Multiplicities should be written in the UML form '0..*', '1..*', '0..1', etc. (minimum .. maximum number of occurrences); this will influence how role methods and views are implemented, as explained below. The '*' for ``infinite'' may also be written 'n', i.e. '1..n'. Multiplicity '*' is a shortcut for '0..*', and multiplicity '1' is a shortcut for '1..1'. Other numbers may be given as multiplicity bounds, but this will be just documentary : technically, all that matters is
If @columns1
or @columns2
are omitted, they are guessed
as follows : for the table with multiplicity 1
or 0..1
,
the default is the primary key; for the other table, the default
is to take the same column names as the other side of the association.
As a result of the association declaration, the Perl class
$table1
will get an additional method named
$role2
for accessing the associated object(s)
in $table2
;
that method normally returns an arrayref, unless $multiplicity2
has maximum '1' (in that case the return value is a single object
ref). Of course, $table2
conversely gets a method named
$role1
.
To understand why tables and roles are crossed, look at the UML picture :
+--------+ +--------+ | | * 0..1 | | | Table1 +---------------------+ Table2 | | | role1 role2 | | +--------+ +--------+
so from an object of Table1
, you need a method role2
to access
the associated object of Table2
. In your diagrams, be careful to
get the role names correctly according to the UML
specification. Sometimes we see diagrams where role names are on the
wrong side, mainly because modelers have a background in
Entity-Relationship or Merise methods, where it is the other way
around.
Role methods perform joins within Perl (as opposed to joins directly performed within the database). That is, given a declaration
MySchema->Association([qw/Employee employee 1 /], [qw/Activity activities 0..*/]);
we can call
my activities = $anEmployee->activities
which will implicitly perform a
SELECT * FROM Activity WHERE emp_id = $anEmployee->{emp_id}
The role method can also accept additional parameters in the SQL::Abstract manpage format, exactly like the select() method. So for example
my $activities = $anEmployee->activities(-columns => [qw/act_name salary/], -where => {is_active => 'Y'});
would perform the following SQL request :
SELECT act_name, salary FROM Activity WHERE emp_id = $anEmployee->{emp_id} AND is_active = 'Y'
If the role method is called without any parameters, and
if that role was previously expanded (see expand method),
i.e. if the object hash contains an entry $obj->{$role}
,
then this data is reused instead of calling the database again.
To specify a unidirectional association, just supply
0 or an empty string (or even the string "0"
or '""'
or "none"
)
to one of the role names. In that case the corresponding role
method is not generated.
insert_into_...
When a role has multiplicity '*', another method
named insert_into_...
is also installed, that will
create new objects of the associated class, taking care
of the linking automatically :
$anEmployee->insert_into_activities({d_begin => $today, dpt_id => $dpt});
This is equivalent to
Activity->insert({d_begin => $today, dpt_id => $dpt, emp_id => $anEmployee->{emp_id}});
UML conceptual models may contain associations where both roles have multiplicity '*' (so-called many-to-many associations). However, when it comes to actual database implementation, such associations need an intermediate linking table to collect couples of identifiers from both tables.
DBIx::DataModel
supports many-to-many associations
as a kind of syntactic sugar, translated into
low-level associations with the linking table.
The linking table needs to be declared first :
MySchema->Table(qw/link_table link_table prim_key1 prim_key2/);
MySchema->Association([qw/table1 role1 0..1/], [qw/link_table links * /]);
MySchema->Association([qw/table2 role2 0..1/], [qw/link_table links * /]);
This describes a diagram like this :
+--------+ +-------+ +--------+ | | 0..1 * | Link | * 0..1 | | | Table1 +---------------+ -- +-----------------+ Table2 | | | role1 linksA | Table | linksB role2 | | +--------+ +-------+ +--------+
Then we can declare the many-to-many association, very much like
ordinary associations, except that the last items in the argument
lists are names of roles to follow, instead of names of columns to join.
In the diagram above, we must follow roles linksA
and role2
in order to obtain the rows of Table2
related to an instance
of Table1
; so we write
MySchema->Association([qw/table1 roles1 * linksB role1/], [qw/table2 roles2 * linksA role2/]);
which describes a diagram like this :
+--------+ +--------+ | | * * | | | Table1 +--------------------+ Table2 | | | roles1 roles2 | | +--------+ +--------+
The declaration has created a new method roles2
in
Table1
; that method is implemented by following roles
linksA
and role2
. So for an object obj1
of Table1
,
the call
my $obj2_arrayref = $obj1->roles2();
will generate the following SQL :
SELECT * FROM link_table INNER JOIN table2 ON link_table.prim_key2=table2.prim_key2 WHERE link_table.prim_key1 = $obj->{prim_key1}
Observe that roles2()
returns rows from a join,
so these rows will belong both to Table2
and to
Link_Table
.
Many-to-many associations do not have an
automatic insert_into_*
method : you must
explicitly insert into the link table.
In the previous section we were following two roles at once in order to implement a many-to-many association. More generally, it may be useful to follow several roles at once, joining the tables in a single SQL query. This can be done through the following methods :
View
that selects
from several tables, filling the joins automatically
selectFromRoles :
from a given object, follow a list of roles to get information
from associated tables.
MethodFromRoles :
add a new method in a table, that will follow a list of roles
(shortcut for repeated calls to selectFromRoles
).
MySchema->Composition([$class1, $role1, $multiplicity1, @columns1], [$class2, $role2, $multiplicity2, @columns2]);
Declares a composition between two tables, i.e an association with
some additional semantics. In UML class diagrams, compositions are
pictured with a black diamond on one side : this side will be called
the composite class, while the other side will be called the
component class. In DBIx::DataModel
, the diamond (the composite
class) corresponds to the first arrayref argument, and the component
class corresponds to the second arrayref argument, so the order of
both arguments is important (while for associations the order makes no
difference).
The UML intended meaning of a composition is that objects of the
component classes cannot exist outside of their composite class. Within
DBIx::DataModel
, the additional semantics for compositions is to
support cascaded insertions and deletions and auto-expansion :
insert
may contain references to subrecords.
The main record will be inserted in the composite class, and within
the same operation, subrecords will be inserted into the
component classes, with foreign keys automatically filled with
appropriate values.
the argument to a delete
may contain lists of component records to
be deleted together with the main record of the composite class.
roles declared through a Composition may then be supplied
to AutoExpand so that the composite class
can automatically fetch its component parts.
See the documentation of insert, delete and AutoExpand methods below for more details.
Note that compositions add nothing to the semantics of update operations.
Even though the arguments to a Composition
look exactly like for
Association
, there are some more constraints : the maximum
$multiplicity1
must be 1 (which is coherent with the notion of composition),
and the maximum $multiplicity2
must be greater than 1 (because
one-to-one compositions are not common and we don't know
exactly how to implement cascaded inserts or deletes in such a case).
Furthermore, a class cannot be component of several composite classes,
unless the corresponding multiplicities are all 0..1
instead of the
usual 1
.
my $view = MySchema->ViewFromRoles($table, $role1, $role2, ..);
Creates a View
class, starting from a given table class and
following one or several associations through their role names.
It calls the View method, with a collection of parameters
automatically inferred from the associations. So for example
MySchema->ViewFromRoles(qw/Department activities employee/);
is equivalent to
my $sql = <<_EOSQL_ Department LEFT OUTER JOIN Activity ON Department.dpt_id = Activity.dpt_id LEFT OUTER JOIN Employee ON Activity.emp_id = Employee.emp_id _EOSQL_ MySchema->View(DepartmentActivitiesEmployee => '*', $sql, qw/Department Activity Employee/);
For each pair of tables, the kind of join is chosen according to
the multiplicity declared with the role : if the minimum multiplicity is 0,
the join will be LEFT OUTER JOIN; otherwise it will be a usual inner join
(exception : after a first left join, all remaining tables are also
connected through additional left joins). The default kind of join
chosen by this rule may be overriden by inserting pseudo-roles
in the list, namely '<=>'
or INNER
for inner joins
and '=>'
or LEFT
for left joins. So for example
MySchema->ViewFromRoles(qw/Department <=> activities <=> employee/);
becomes equivalent to
my $sql = <<_EOSQL_ Department INNER JOIN Activity ON Department.dpt_id = Activity.dpt_id INNER JOIN Employee ON Activity.emp_id = Employee.emp_id _EOSQL_
All tables participating in a ViewFromRoles
are stacked,
and further roles are found by walking up the stack. So in
..->ViewFromRoles(qw/FirstTable role1 role2 role3/)
we must find a role1
in FirstTable
, from which we
know what will be the Table2
. Then, we must find
a role2
in in Table2
, or otherwise in FirstTable
,
in order to know Table3
. In turn, role3
must be
found either in Table3
, or in Table2
, or in FirstTable
, etc.
The resulting view name will be composed by concatenating the table
and the capitalized role names. If join kinds were explicitly
set, these also belong to the view name, like
Department_INNER_Activities
.
Since such names might be long and uncomfortable
to use, the view name is also returned as result of the method
call, so that the client code can store it in a variable and use
it as an alias.
The main purpose of ViewFromRoles
is to gain efficiency in
interacting with the database. If we write
foreach my $dpt (@{Department->select}) { foreach my $act ($dpt->activities) { my $emp = $act->employee; printf "%s works in %s since %s\n", $emp->{lastname}, $dpt->{dpt_name}, $act->{d_begin}; } }
many database calls are generated behind the scene. Instead we could write
my $view = MySchema->ViewFromRoles(qw/Department activities employee/); foreach my $row (@{$view->select}) { printf "%s works in %s since %s\n", $row->{lastname}, $row->{dpt_name}, $row->{d_begin}; }
which generates one single call to the database.
Caveat : ViewFromRoles
does not know about SQL table aliases.
Therefore, if the role list contains several occurrences of the
same table (for example through self-referential associations),
the generated SQL will be incorrect because of ambiguous table names.
my $lst = $obj->selectFromRoles(qw/role1 role2 .../) ->(-columns => [...], -where => {...}, -orderBy=>[...]);
Starting from a given object, returns a reference to a function that
selects a collection of data rows from associated tables, performing
the appropriate joins. Internally this is implemented throught the
ViewFromRoles
method, with an additional join criteria to constrain
on the primary key(s)
of $obj
. The returned function takes the
same arguments as the select method. So for example if
$emp->{emp_id} == 987
, then
$emp->selectFromRoles(qw/activities department/) ->(-where => {d_end => undef})
will generate
SELECT * FROM Activity INNER JOIN Department ON Activity.dpt_id = Department.dpt_id WHERE emp_id = 987 AND d_end IS NULL
$table->MethodFromRoles($meth_name => qw/role1 role2 .../);
Inserts into the table class a new method named $meth_name
,
that will automatically call selectFromRoles and
pass arguments to the resulting function.
This is useful for joining several tables at once, so for
example with
Department->MethodFromRoles(employees => qw/activities employee/);
we can then write
my $empl_ref = $someDept->employees(-where => {gender => 'F'}, -columns => [qw/firstname lastname]);
This method is used internally to implement many-to-many associations;
so if you have only two roles to follow, you would probably be better
off by defining the association, which is a more abstract notion.
Direct calls to MethodFromRoles
are still useful if you want
to follow three or more roles at once.
MyTable->DefaultColumns($columns);
Sets the default value for the -columns
argument to
select(). If nothing else is stated, the
default value for all tables is '*
'.
MyTable->ColumnHandlers($columnName => handlerName1 => coderef1, handlerName2 => coderef2, ...);
Associates some handlers to a given column in the current table class.
Then, when you call $obj->applyColumnHandler('someHandler')
,
each column having a handler of the corresponding name will execute the
associated code. This can be useful for all sorts of data manipulation :
Handlers receive the column value as usual through $_[0]
.
If the value is to be modified (for example for scalar
conversions or while inflating values into objects),
the result should be put back into $_[0]
.
In addition to the column value, other info is passed to the
handler :
sub myHandler { my ($columnValue, $obj, $columnName, $handlerName) = @; my $newVal = $obj->computeNewVal($columnValue, ...); $columnValue = $newVal; # WRONG : will be a no-op $_[0] = $newVal; # OK : value is converted }
The second argument $obj
is the object from where
$columnValue
was taken -- most probably an instance
of a Table or View class. Use this if you need to read some contextual
information, but avoid modifying $obj
: you would most
probably get unexpected results, since the collection of
available columns may vary from one call to the other.
Other arguments $columnName
and
$handlerName
are obvious.
Handler names fromDB and toDB have a special meaning : they are called automatically just after reading data from the database, or just before writing into the database. Handler name validate is used by the method hasInvalidColumns.
MySchema->ColumnType(type_name => handler_name1 => coderef1, handler_name2 => coderef2, ...);
Declares a collection of column handlers under name type_name
.
MyTable->ColumnType(type_name => qw/column1 column2 .../);
Retrieves all column handlers defined under type_name
in the schema
and calls ColumnHandlers to register those handlers to
column1
, column2
, etc.
MySchema->Autoload(1); # turn on AUTOLOAD MySchema->Autoload(0); # turn it off MyClass ->Autoload(1); # turn it on, just for one class
If AUTOLOAD is turned on (default is off),
then columns have implicit read accessors through AUTOLOAD.
So instead of $record->{column}
you can write
$record->column
.
Of course this is a bit slower than generating all accessors explicitly at compile time (through Class::Accessor or something similar), but the advantage is that you don't need to know all column names in advance. This is how we support variable column lists (two instances of the same Table do not necessarily hold the same set of columns, it all depends on what you chose when doing the SELECT).
Caveat : AUTOLOAD is a global feature, so
MySchema->Autoload(1)
actually turns it on
for all schemas.
MySchema->AutoInsertColumns( columnName1 => sub{...}, ... ); MyTable ->AutoInsertColumns( columnName1 => sub{...}, ... );
Declares handler code that will automatically fill column names
columnName1
, etc. at each insert, either for a single table, or (if
declared at the Schema level), for every table. For example, each
record could remember who created it and when with something
like
MySchema->AutoInsertColumns( created_by => sub{$ENV{REMOTE_USER} . ", " . localtime} );
The handler code will be called as
$handler->(\%record, $table)
so that it can know something about its calling context. In most cases, however, the handler will not need these parameters, because it just returns global information such as current user or current date/time.
MySchema->AutoUpdateColumns( columnName1 => sub{...}, ... ); MyTable ->AutoUpdateColumns( columnName1 => sub{...}, ... );
Just like AutoInsertColumns
, but will be called automatically
at each update and each insert. This is typically used to
remember the author and/or date and time of the last modification
of a record. If you use both AutoInsertColumns
and
AutoUpdateColumns
, make sure that the column names are not
the same.
When doing an update (i.e. not an insert), the handler code will be called as
$handler->(\%record, $table, \%where)
where %record
contains the columns to be updated and
%where
contains the primary key (column name(s)
and value(s)).
MySchema->NoUpdateColumns(@columns); MyTable ->NoUpdateColumns(@columns);
Defines an array of column names that will be excluded from INSERT/UPDATE statements. This is useful for example when some column are set up automatically by the database (like automatic time stamps or user identification). It can also be useful if you want to temporarily add information to memory objects, without passing it back to the database.
NoUpdate columns can be set for a whole Schema, or for a specific Table class.
Table->AutoExpand(qw/role1 role2 .../)
Generates an autoExpand method for the class, that will autoexpand on the roles listed (i.e. will call the appropriate method and store the result in a local slot within the object). In other words, the object knows how to expand itself, fetching information from associated tables, in order to build a data tree in memory. Only roles declared as Compositions may be auto-expanded.
Be careful about performance issues: when an object uses auto-expansion through a call to autoExpand, every auto-expanded role will generate an additional call to the database. This might be slow, especially if there are recursive auto-expansions; so in some cases it will be more appropriate to flatten the tree and use database joins, typically through the method selectFromRoles.
my $dbh = DBI::connect(...); Schema->dbh($dbh, %options); # set
my $dbh = Schema->dbh; # get back just the dbh my ($dbh, %options) = Schema->dbh; # get back all
Returns or sets the handle to a DBI database handle (see the DBI manpage).
This handle is schema-specific.
DBIx::DataModel
expects the handle to be opened with
RaiseError => 1
(see Transactions and error handling).
In %options
you may pass any key-value pairs, and retrieve
them later by calling dbh
in a list context.
DBIx::DataModel
will look in those options to try to find
the ``catalog'' and ``schema'' arguments for DBI
's
last_insert_id.
$schema = $class->schema; $schema = $object->schema;
Returns the name of the schema class for the given object or Table or View class.
$db_table = $class->db_table; $db_table = $object->db_table;
Returns the database table name registered via Schema->Table(..)
or collection of joined tables registered via Schema->View(..)
.
Schema->debug( 1 ); # will warn for each SQL statement Schema->debug( $debugObject ); # will call $debugObject->debug($sql) Schema->debug( 0 ); # turn off debugging
Debug mode is useful for seeing SQL statements generated
by DBIx::DataModel
. Enabling debugging with a $debugObject
will typically be useful in conjunction with something like
Log::Log4perl or
Log::Dispatch.
There is also another way to see the SQL code :
my $spy_sql = sub {my ($sql, @bind) = @_; print STDERR join "\n", $sql, @bind; return ($sql, @bind);}; my $result = $myClassOrObj->$someSelectMethod(-columns => \@columns, -where => \%criteria, -postSQL => $spy_sql);
my $val = $someClass ->classData->{someKey}; my $val = $someObject->classData->{someKey};
Returns a ref to a hash for storing class-specific data. Each subclass has its own hashref, so class data is NOT propagated along the inheritance tree. Class data should be mainly for reading; if you write into it, make sure you know what you are doing.
my @primKeyColumns = Table->primKey; my @primKeyValues = $obj->primKey;
If called as a class method, returns the list of columns
registered as primary key for that table (via Schema->Table(..)
).
If called as an instance method, returns the list of values in those columns.
When called in scalar context and the primary key has only one column,
returns that column (so you can call $my k = $obj->primKey
).
my @roles = Table->componentRoles;
Returns the list of roles declared through Composition.
my @cols = MySchema->noUpdateColumns;
Returns the array of column names declared as noUpdate through NoUpdateColumns.
my @cols = $obj->noUpdateColumns;
Returns the array of column names declared as noUpdate, either in the Schema or in the Table class of the invocant.
my @cols = MySchema->autoUpdateColumns;
Returns the array of column names declared as autoUpdate through AutoUpdateColumns.
my @cols = $obj->autoUpdateColumns;
Returns the array of column names declared as autoUpdate, either in the Schema or in the Table class of the invocant.
MySchema->selectImplicitlyFor('read only'); # will be added to SQL selects MySchema->selectImplicitlyFor(''); # turn it off
Gets or sets a default value for the -for
argument to
select(). Here it is set
at the Schema
level, so it will be applied to all tables.
$tableClass->selectImplicitlyFor('read only'); $viewClass->selectImplicitlyFor('');
Same thing, but at a Table or View level.
my $string = $object->selectImplicitlyFor;
Retrieves whatever whas set in the table or in the schema.
my @tables = MySchema->tables;
Returns an array of names of Table
subclasses declared in this schema.
my @views = MySchema->views;
Returns an array of names of View
subclasses declared in this schema.
MySchema->keepLasth(1); # schema will keep a reference to the last DBI handle MySchema->keepLasth(0); # turn it off
If true, the schema will keep a copy of the last generated
SQL statement handle (either select
, insert
or update
).
The handle may then be accessed through $obj->schema->lasth
.
This may be useful if you need to interact with the handle
for driver-specific operations. However, it also means that the
handle is not DESTROYed immediately, which might result in
resources being locked until the next statement. Therefore
keepLasth
is off by default.
my $sth = MySchema->lasth;
Returns the last DBI statement handle created by this module, if
keepLasth
is turned on.
my $coderef = sub {Table1->insert(...); Table2->update(...); ...}; MySchema->doTransaction($coderef);
Evaluates the code within a transaction. In case of failure, the transaction is rolled back, and an exception is raised with the error message and the status of the rollback (because the rollback itself may also fail).
Usually the coderef passed as argument will be a closure that may refer to variables local to the environment where the closure was created.
Nested calls to doTransaction
are supported : only
the top-level call will actually initiate and then
commit the transaction, while an exception at any level
will abort and rollback the whole thing.
my $record = MyTable->fetch(@keyValues, \%options);
Searches the single record whose primary key is @keyValues
.
Returns undef if none is found. The optional \%options
argument can specify things like -for
, -preExec
, -postExec
(see the select() method
for an explanation). Of course
it makes no sense to specify -where
in the options.
$records = MyTable->select(\@columns, \%where, \@order); $records = MyView->select(\@columns, \%where, \@order); $records = MyTable->select(-columns => \@columns, # OR : -distinct => \@columns, -where => \%where, -groupBy => \@groupings, -having => \%criteria, -orderBy => \@order, -for => 'read only', -postSQL => \&postSQL_callback, -preExec => \&preExec_callback, -postExec => \&preExec_callback, -resultAs => 'rows' || 'sth' || 'sql' || 'iterator'); $all_records = MyTable->select();
Applies a SQL SELECT to the associated table (or view), and returns
a result as specified by the -resultAs
argument (see below).
Arguments are all optional and may be passed either by name or by position
(but you cannnot combine both positional and named arguments in a single call).
The API is mostly borrowed from SQL::Abstract :
\@columns
is a reference to an array
of SQL column specifications (i.e. column names,
function or grouping operators, ``AS'' clauses, etc.).
A '|' in a column is translated into an 'AS' clause, according
to the current SQL dialect: this is convenient when
using perl qw/.../
operator for columns, as in
-columns => [ qw/table1.longColumn|t1lc table2.longColumn|t2lc/ ]
The argument to -columns
can also be a string instead of
an arrayref, like for example
"c1 AS foobar, MAX(c2) AS m_c2, COUNT(c3) AS n_c3"
.
If omitted, \@columns
takes the default, which is
usually '*', unless modified through DefaultColumns().
No verification is done on the list of retrieved \@columns
,
so it is OK if the list does not contain the primary or foreign keys ---
but then later attempts to perform joins or updates will obviously fail.
\%where
is a reference to a hash or array of
criteria that will be translated into SQL clauses. In most cases, this
will just be something like {col1 => 'val1', col2 => 'val2'}
;
see SQL::Abstract::select for
detailed description of the
structure of that hash or array. It can also be
a plain SQL string like "col1 IN (3, 5, 7, 11) OR col2 IS NOT NULL"
.
the third argument \@order
is a reference to a list
of columns for sorting. Again it can also be a plain SQL string
like "col1 DESC, col3, col2 DESC"
. Columns can
also be prefixed by '+' or '-' for indicating sorting directions,
so for example -orderBy => [qw/-col1 +col2 -col3/]
will generate the SQL clause
ORDER BY col1 DESC, col2 ASC, col3 DESC
.
If using named arguments, more options are available :
-distinct => \@columns
-columns
arguments, except that
keyword DISTINCT
will be included in the generated SQL.
-groupBy => "string"
or -groupBy => \@array
GROUP BY
clause in the SQL statement. Grouping columns are
specified either by a plain string or by an array of strings.
-having => "string"
or -having => \%criteria
HAVING
clause in the SQL statement (only makes
sense together with a GROUP BY
clause).
This is like a -where
clause, except that the criteria
are applied after grouping has occured.
-for => $clause
-for => 'read only'
or -for => 'update'
.
-postSQL => \&postSQL_callback
($sql, @bind) = $args->{-postSQL}->($sql, @bind) if $args->{-postSQL};
-preExec => \&preExec_callback, -postExec => \&postExec_callback
execute()
. So the sequence
will be more or less like this:
$sth = $dbh->prepare($sql_statement); $preExec_callback->($sth) if $preExec_callback; $sth->execute(@bind_values); $postExec_callback->($sth) if $postExec_callback;
This is mostly useful if you need to call driver-specific functions at those stages.
-resultAs => $result_kind
undef
is returned.
next
,
that fetches the next datarow and blesses it into the appropriate object.
The method returns undef
when there is no more data to fetch.
So a typical usage pattern is :
my $iterator = $class->select(-where => \%criteria, -resultAs => 'iterator'); while (my $row = $iterator->next) { do_something_with($row); }
DBI
statement handle. Then it is up to the
caller to retrieve data rows using the DBI API.
If needed, these rows can then be blessed into appropriate objects
through blessFromDB().
($sql, @bind)
, i.e. the SQL statement
together with the bind values.
my @ids = MyTable->insert({col1 => $val1, col2 => $val2, ...}, {...});
Applies the toDB
handlers, removes the noUpdate
columns,
and then inserts the new records into the database.
Because of the handlers, this operation may modify the argument data,
so it is not safe to access $val1
, $val2
, etc. after the call.
Primary key column(s)
should of course be present
in the supplied hashrefs, unless the the key is auto-generated.
Each hashref will be blessed into the MyTable
class, and
will be inserted through the internal _singleInsert method.
The default implementation of this method should be good enough
for most common uses, but you may want to refine it in your
table classes if you need some fancy handling on primary keys
(like for example computing a random key and checking whether
that key is free).
Scalar values returned by _singleInsert are collected into
an array, and then returned by insert()
; usually, these are
the primary keys of the inserted records (if on one single column).
In scalar context, the return value is the first id in the list above, which
makes sense if you call insert()
with a single argument. If you call
it with several arguments but from a scalar context, a warning is issued.
If the table is a composite class (see Composition above), then the component parts may be supplied within the hashref, with keys equal to the role names, and values given as arrayrefs of sub-hashrefs; then these will be inserted into the database, at the same time as the main record, with join values automatically filled in. For example :
HR::Employee->insert({firstname => "Johann Sebastian", lastname => "Bach", activities => [{d_begin => '01.01.1695', d_end => '18.07.1750', dpt_code => 'CPT'}]});
MyTable->update({col1 => $val1, ...}); MyTable->update(@primKey, {col1 => $val1, ...}); $obj->update;
This is both a class and an instance method.
It applies the toDB
handlers, removes the noUpdate
columns,
and then updates the database for the given record.
When called as a class method, the columns and values to update
are supplied as a hashref. The second syntax with
@primKey
is an alternate way to supply the values
for the primary key (it may be more convenient because you don't
need to repeat the name of primary key columns). So if emp_id
is the primary key of table Employee
, then the following
are equivalent :
Employee->update({emp_id => $eid, address => $newAddr, phone => $newPhone}); Employee->update($eid => {address => $newAddr, phone => $newPhone});
When called as an instance method, i.e.
$someEmployee->update;
the columns and values to update are taken from the object in
memory (ignoring all non-scalar values). After the update,
the memory for that object is emptied (to prevent any confusion,
because the 'toDB' handlers might have changed the values).
So to continue working with the same record, you must fetch it again
from the database (or clone it yourself before calling update
).
In either case, you have no control over the 'where' clause of the SQL update statement, that will be based exclusively on primary key columns. So if you need to simultaneously update several records with a SQL request like
UPDATE myTable SET col='newVal' WHERE otherCol like 'criteria%'
then you should generate the SQL yourself and pass it directly to
DBI->do($sql)
.
The update
method only updates the columns received
as arguments : it knows nothing about other columns that may sit
in the database table. Therefore if you have two concurrent clients
doing
(client1) MyTable->update($id, {c1 => $v1, c2 => $v2}); (client2) MyTable->update($id, {c3 => $v3, c4 => $v4, c5 => $v5});
the final state of record $id
in the database will
reflect changes from both clients.
MyTable->delete({column1 => value1, ...}); MyTable->delete(@primKey); $record->delete;
This is both a class and an instance method. It deletes a record from the database.
When called as a class method, the primary key of the record
to delete is supplied either as a hashref, or directly
as a list of values. Note that MyTable->delete(11, 22)
does not mean ``delete records with keys 11 and 22'', but rather
``delete record having primary key (11, 22)''; in other words,
you only delete one record at a time. In order to
simultaneously delete several records according to some
WHERE
criteria, you must generate
the SQL yourself and go directly to the DBI level.
When called as an instance method, the primary key is taken
from object columns in memory. After the delete,
the memory for that object is destroyed.
If the table is a composite class (see Composition above),
and if the object contains references to lists of component parts,
then those will be recursively deleted together with the main
object (cascaded delete). However, if there are other component parts
in the database, not referenced in the hashref, then those will
not be automatically deleted : in other words, the delete
method does not go by itself to the database to find all
dependent composite parts (this is the job of the client code, or
sometimes of the database itself).
$class ->applyColumnHandler($handlerName, \@objects); $object->applyColumnHandler($handlerName);
Inspects the target object or list of objects; for every
column that exists in the object, checks whether
a handler named $handlerName
was declared for
that column (see method ColumnHandlers), and if so,
calls the handler. By this definition, if a column
is absent in an object, then the handler for that column
is not called, even though it was declared in the class.
The results of handler calls are collected into a hashref, with an
entry for each column name. The value of each entry depends on how
applyColumnHandlers
was called : if it was called as an
instance method, then the result is something of shape
{columnName1 => resultValue1, columnName2 => resultValue2, ... }
if it was called as a class method (i.e. if \@objects
is defined),
then the result is something of shape
{columnName1 => [resultValue1forObject1, resultValue1forObject2, ...], columnName2 => [resultValue2forObject1, resultValue2forObject2, ...], ... }
If columnName
is not present in the target object(s), then the
corresponding result value is undef
.
my $invalid_columns = $obj->hasInvalidColumns; if ($invalid_columns) { print "wrong data in columns ", join(", ", @$invalid_columns); } else { print "all columns OK"; }
Applies the 'validate' handler to all existent columns. Returns a ref to the list of invalid columns, or undef if there are none.
Note that this is validation at the column level, not at the record level. As a result, your validation handlers can check if an existent column is empty, but cannot check if a column is missing (because in that case the handler would not be called).
Your 'validate' handlers, defined through ColumnHandlers,
should return 0 or an empty string whenever the column value is invalid.
Never return undef
, because we would no longer be able to
distinguish between an invalid existent column and a missing column.
$obj->expand($role [, @args] )
Executes the method $role
to follow an Association,
stores the result in the object itself under $obj->{$role}
,
and returns that result.
This is typically used to expand an object into a tree datastructure.
Optional @args
are passed to $obj->$role(@args)
, for
example for specifying -where
, -columns
or -orderBy
options.
After the expansion, further calls to
$obj->$role
(without any arguments) will reuse
that same expanded result instead of calling again the database.
This caching improves efficiency, but also introduces the risk
of side-effects across your code : after
$obj->expand(someRole => (-columns => [qw/just some columns/], -where => {someField => 'restriction'}))
further calls to $obj->someRole()
will just return
a dataset restricted according to the above criteria, instead
of a full join. To prevent that effect, you would need to
delete $obj->{someRole}
, or to call the role
with arguments : $obj->someRole('*')
.
$record->autoExpand( $recurse );
Asks the object to expand itself with some objects in foreign tables.
Does nothing by default. Should be redefined in subclasses,
most probably through the
AutoExpand method (with capital 'A').
If the optional argument $recurse
is true, then
autoExpand
is recursively called on the expanded objects.
$class->blessFromDB($record);
Blesses $record
into an object of the table or view class,
and applies the fromDB
column handlers.
my $meth = $class->preselectWhere({col1 => $val1, ...}, $multiplicity); # .. later my $result = $meth->(-where =>{otherCol => 'otherVal'}, -columns => \@cols);
Returns a reference to a function that will select data from
$class
, just like the select()
method, but where some
additional selection criteria are ``preselected''. The preselection
criteria are specified in SQL::Abstract format.
This method is
mainly for internal use; you only want to learn about it if you
intend to write your own role methods; an example is shown in section
Self-referential associations below.
If the optional $multiplicity
argument contains 1
or '0..1'
,
then the function behaves like
fetch() rather than select()|/select
:
that is, the result of $meth->(...)
will be a single
recordref, rather than an arrayref of records.
DBIx::DataModel
automatically generates Perl classes for Schemas,
Tables, Views, Associations. Before doing so, it checks that no Perl
package of the same name already exists. A similar check is performed
before adding role methods into classes.
The client code can insert additional methods into the generated
classes : just switch to the package and define your code. However,
because of the security checks just mentioned, DBIx::DataModel
must
create the package before you start adding methods to it, and
therefore the declarations should be inside a BEGIN block :
BEGIN { # make sure these declarations are immediately executed DBIx::DataModel->Schema('MySchema'); MySchema->Table(Activity => Activity => qw/act_id/); ... } # now we can safely add new methods package Activity; sub activePeriod { my $self = shift; $self->{d_end} ? "from $self->{d_begin} to $self->{d_end}" : "since $self->{d_begin}"; } package main; # switch back to the 'main' package
See the perlmod manpage for an explanation of BEGIN blocks.
DBIx::DataModel
follows the recommendations
of DBI
for transactions : it expects the database handle
to be opened with RaiseError => 1
and therefore does not check itself
for DBI
errors ; it is up to the client code
to catch the exceptions and deal with errors.
As explained in Transactions in the DBI manpage,
AutoCommit
should be set off for databases that support transactions;
then atomic operations are enclosed in an eval
, followed by either
$dbh->commit()
(in case of success) or
$dbh->rollback()
(in case of failure).
The doTransaction() method does all this for you
automatically.
Maybe you will encounter situations where
you need to generate SQL yourself (for example because of
clauses specific to your RDBMS), or to interact directly
with the DBI layer. This can be encapsulated in
additional methods incorporated into the classes
generated by DBIx::DataModel
.
In those methods, you may want to call blessFromDB()
so that the rows returned by DBI may be seen as
objects from your client program. Here is an example :
package MyTable; # switch to namespace 'MyTable'
sub fancyMethod {
# call the DBI API my $hash = $dbh->selectall_hashref($fancySQL, @keyFields);
# bless results into objects of MyTable MyTable->blessFromDB($_) foreach values %$hash;
return $hash; }
Associations can be self-referential, i.e. describing tree structures :
MySchema->Association([qw/OrganisationalUnit parent 1 ou_id/], [qw/OrganisationalUnit children * parent_ou_id/],
However, when there are several self-referential associations, we might get into problems : consider
MySchema->Association([qw/Person mother 1 pers_id/], [qw/Person children * mother_id/]); MySchema->Association([qw/Person father 1 pers_id/], [qw/Person children * father_id/]); # BUG
This does not work because there are two definitions of the ``children'' role name in the same class ``Person''. One solution is to distinguish these roles, and then write by hand a general ``children'' role :
MySchema->Association([qw/Person mother 1 pers_id/], [qw/Person motherChildren * mother_id/]); MySchema->Association([qw/Person father 1 pers_id/], [qw/Person fatherChildren * father_id/]); package Person; sub children { my $self = shift; my $id = $self->{pers_id}; my $sql = "SELECT * FROM Person WHERE mother_id = $id OR father_id = $id"; my $children = $self->dbh->selectall_arrayref($sql, {Slice => {}}); Person->blessFromDB($_) foreach @$children; return $children; }
Alternatively, since rolenames motherChildren
and fatherChildren
are most probably useless, we might just specify unidirectional
associations :
MySchema->Association([qw/Person mother 1 pers_id/], [qw/Person none * mother_id/]); MySchema->Association([qw/Person father 1 pers_id/], [qw/Person none * father_id/]);
And here is a more sophisticated way to define the ``children'' method, that will accept additional ``where'' criteria, like every regular method.
package Person; sub children { my $self = shift; # remaining args in @_ will be passed to select() my $id = $self->{pers_id}; my $select_func = Person->preselectWhere([mother_id => $id, father_id => $id]); return $select_func->(@_); }
This definition forces the join on mother_id
or
father_id
, while leaving open the possibility for the caller
to specify additional criteria. For example, all female children
of a person (either father or mother) can now be retrieved through
$person->children({gender => 'F'})
Observe that mother_id
and father_id
are inside an arrayref
instead of a hashref, so that the SQL::Abstract manpage will generate an SQL 'OR'.
This section documents some details that normally should not be relevant to clients; you only want to read about them if you intend to extend the framework.
The following picture shows the hierarchy of implementation classes :
+-----------------------+ | DBIx::DataModel::Base | +-----------------------+ / \ / \ +-------------------------+ +--------------------------------+ | DBIx::DataModel::Schema | | DBIx::DataModel::AbstractTable | +-------------------------+ +--------------------------------+ / \ / \ +------------------------+ +-----------------------+ | DBIx::DataModel::Table | | DBIx::DataModel::View | +------------------------+ +-----------------------+
+-----------------+ +---------------------------+ | DBIx::DataModel | | DBIx::DataModel::Iterator | +-----------------+ +---------------------------+
The Schema method creates a subclass of DBIx::DataModel::Schema. The Table method creates a subclass of DBIx::DataModel::Table. The View method and its related clients (ViewFromRoles, selectFromRoles, etc.) use multiple inheritance : views inherit first from DBIx::DataModel::View, but also from the supplied list of parent tables. As a result, instances of such views can exploit all role methods of their parent tables. The entry class DBIx::DataModel is just a façade interface to DBIx::DataModel::Schema. The helper class DBIx::DataModel::Iterator implements iterators returned by the select method.
DBIx::DataModel::Base->_setClassData($subclass, $data_ref);
DBIx::DataModel::Schema->_createPackage($pckName, $isa_arrayref);
Creates a new Perl package of name $pckName
that inherits from
@$isa_arrayref
. Raises an exception if the package name already
exists.
DBIx::DataModel::Schema->_defineMethod($pckName, $methName, $coderef);
Defines a new method in package $pckName
, bound to $coderef
;
or undefines a method if $coderef
is undef
.
Raises an exception if the method name already
exists in that package.
$obj->_rawInsert;
Internal implementation for insertions into the database :
takes keys and values within %$obj
, generates SQL for
insertion of those values into $obj->dbTable
,
and executes it. Never called directly, but used by the protected method
_singleInsert.
$obj->_singleInsert;
Implementation for inserting a record into the database; should never be called directly, but is used as a backend by the insert method.
This method receives an object blessed into some table class; the
object hash should only contain keys and values to be directly
inserted into the database, i.e. the noUpdateColumns
and all
references to foreign objects should have been removed already (this
is the job of the insert method). The method calls
_rawInsert for performing the database update, and then makes
sure that the object contains its own key (if not supplied by the
client code, for example when keys are auto-generated, then
the key has to be .
In the default implementation, this is done by calling DBI's
last_insert_id() whenever necessary. This may or
may not be meaningful, depending on your database driver. The four
arguments required by last_insert_id
are
supplied as follows : catalog and schema names are taken from options
given to Schema->dbh(...)
(or undef
otherwise), table and
column names are taken from the object's database table and
primary key, as declared in Schema->Table(...)
.
You may redeclare this method in your own table classes, for example if you need to compute a key, or construct it from other fields.
The scalar value returned by the method will in turn be returned by the insert method; usually this value if the primary key, if that key is on one single column.
Some alternative modules in this area are
the Class::DBI manpage, the DBIx::Class manpage,
Alzabo, Tangram,
the Rose::DB::Object manpage,
the Data::ObjectDriver manpage,
ORM,
SPOPS, the Class::PObject manpage, , the DBIx::RecordSet manpage,
the DBIx::SQLEngine manpage,the DBIx::Record manpage, , and a lot more
in the DBIx::*
namespace, all with different approaches.
For various reasons, none of these did fit nicely in my context,
so I decided to write DBIx:DataModel
.
Of course there might be also many reasons why DBIx:DataModel
will not fit in your context, so just do your own shopping.
A good place to start would be the general discussion on RDBMS - Perl
mappings at http://poop.sourceforge.net. There are also some
pointers in the Perl 5 Enterprise Environment website at
http://www.officevision.com/pub/p5ee/.
For discussions about DBIx::DataModel
,
please use the CPAN::Forum site at
http://www.cpanforum.com/dist/DBIx-DataModel.
- autoloader to get tables and associations from $dbh->table_info, etc. - 'hasInvalidColumns' : should be called automatically before insert/update ? - 'validate' record handler (not only column handlers) - 'normalize' handler : for ex. transform empty string into null - walk through WHERE queries and apply 'toDB' handler (not obvious!) - decide what to do with multiple inheritance of role methods in Views; use NEXT ? - implement table aliases - maybe it is not a good idea to modify data in place when performing inserts or updates; should perhaps clone the arguments. - more extensive and more organized testing - optional caching for fetch() within lookup tables .. but beware of caching if dbh changes - add support for UPDATE/DELETE ... WHERE ... - add PKEYS keyword in -columns, will be automatically replaced by names of primary key columns of the touched tables - design API for easy dynamic association of objects without dealing with the keys - remove spouse example from doc (because can't have same table twice in roles) - support for bind parameters for blobs
Laurent Dami, <laurent.dami AT etat geneve ch>
Thanks to Cedric Bouvier for bug fixes and improvements.
Copyright 2006, 2007 by Laurent Dami.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
DBIx::DataModel - Classes and UML-style Associations on top of DBI |