Alzabo::Runtime::Cursor - Base class for Alzabo cursors |
Alzabo::Runtime::Cursor - Base class for Alzabo cursors
use Alzabo::Runtime::Cursor;
This is the base class for cursors.
Virtual method.
Virtual method.
If the last next_row
or next_rows
call encountered a situation
where the SQL query returned primary keys not actually in the target
table, then the exception objects are stored in the cursor. This
method can be used to retrieve these objects. This allows you to
ignore these errors if you so desire without having to do explicit
exception handling. For more information on what you can do with this
method see the HANDLING ERRORS section.
A list of Alzabo::Exception::NoSuchRow
objects.
Resets the cursor so that the next next_row
or next_rows
call
will return the first row of the set.
Let's assume a database in with following tables:
TABLE NAME: movie
movie_id int title varchar(50)
TABLE NAME: movie_alias
alias_id int movie_id int alias varchar(50)
Now, let's assume you have a schema object $schema
and you execute
the following code:
my $cursor = $schema->join( tables => [ $schema->tables( 'movie', 'movie_alias' ) ], select => [ $schema->table('movie') ], where => [ $schema->table('movie_alias')->column('alias'), 'like', 'Foo%' ] );
The cursor returned is relying on the movie_id column in the
movie_alias table. It's possible that there are values in this column
that are not actually in the movie table but the cursor object will
ignore the exceptions caused by these bad ids. The next_row
(or
next_rows
) method will not return a false value until its
underlying DBI statement handle stops returning data. The reasoning
behind this is that otherwise there would be no way to distinguish
between: A) a false value caused by there being no more data coming
back from the query on the movie_alias table and B) a false value
caused by there being no row in the movie column matching a given
movie_id value.
It is certainly possible that there are situations when you don't care about referential integrity and you want to simply get all the rows you can. In other cases, you will want to handle errors. I would have used exceptions for this purpose except the following code would then not function properly.
while ( my $row = eval { $cursor->next_row } ) { do_something if $@; # or alternately just ignore $@
... do something with $row ... }
The reason is that throwing an exception in the eval block would cause
the eval to return an undef. This means that the 'do_something if
$@;' clause would _never_ get executed. In that case, you couldn't
ignore the exception if you wanted to because it interrupts the
while
loop. The workaround would be:
do { my $row = eval { $cursor->next_row }; # either do something with $@ or ignore it. } while ( $row || ( $@ && $@->isa('Alzabo::Exception::NoSuchRow') );
However, this is not an idiom I particularly want to encourage, as it is counter-intuitive.
So, while throwing an exception may be the most 'correct' way to do
it, I've instead created the errors
method.
This means that the idiom for checking errors from the next_row
or
next_rows
method is as follows:
while ( my $row = $cursor->next_row ) { do_something if $cursor->errors;
... do something with $row ... }
The advantage here is that ignoring the exception is easy. If you
want to check them then just remember that the errors
method will return a list of
Alzabo::Exception::NoSuchRow
objects that
occurred during the previous next_row
or next_rows
call.
Also note that other types of exceptions are rethrown from the
next_row
and next_rows
method.
Using cursors is definitely more complicated. However, there are two
excellent reasons for using them: speed and memory savings. As an
example, I did a test with the old code (which returned all its
objects at once) against a table with about 8,000 rows using the
Alzabo::Runtime::Table->all_rows
method. Under the old
implementation, it took significantly longer to return the first row.
Even more importantly than that, the old implementation used up about
10MB of memory versus about 4MB! Now imagine that with a 1,000,000
row table.
For those curious to know why, here's the reason. Under perl, the following code:
foreach (1..1_000_000) { print "$_\n"; }
first constructs a temporary array with _all_ the values (that's an array one million scalars long!) and then returns it one by one. It takes a nontrivial amount of time to construct that array, meaning that the first print statement is delayed. Even worse, the array uses up memory.
Thus Alzabo now uses cursors so can scale better. This is a particularly big win in the case where you are working through a long list of rows and may stop before the end is reached. With cursors, Alzabo creates only as many rows as you need. Plus the start up time on your loop is much, much quicker. In the end, your program is quicker and less of a memory hog. This is good.
Dave Rolsky, <autarch@urth.org>
Alzabo::Runtime::Cursor - Base class for Alzabo cursors |