PDL::Slices -- Stupid index tricks |
PDL::Slices -- Stupid index tricks
use PDL; $a = ones(3,3); $b = $a->slice('-1:0,(1)'); $c = $a->dummy(2);
This package provides many of the powerful PerlDL core index manipulation routines. These routines are usually two-way so you can get a unit matrix by
$a = zeroes(1000,1000); $a->diagonal(0,1) ++;
which is usually fairly efficient. See the PDL::Indexing manpage and the PDL::Tips manpage for more examples.
These functions are usually two-way:
$b = $a->slice("1:3"); $b += 5; # $a is changed!
If you want to force a copy and no ``flow'' backwards, you need
$b = $a->slice("1:3")->copy; $b += 5; # $a is not changed.
alternatively, you can use
$b = $a->slice("1:3")->sever;
which does not copy the struct but beware that after
$b = $a->slice("1:3"); $c = $b->sever;
the variables $b
and $c
point to the same object but with
->copy
they do not.
The fact that there is this kind of flow makes PDL a very powerful language in many ways: since you can alter the original data by altering some easier-to-use representation of it, many things are much easier to accomplish, just like making the above unit matrix.
Slicing is so central to the PDL language that a special compile-time syntax has been introduced to handle it compactly; see the PDL::NiceSlice manpage for details.
For the moment, you can't slice the empty piddle. This should probably change: slices of the empty piddle should probably return the empty piddle.
Many types of index errors are reported far from the indexing
operation that caused them. This is caused by the underlying architecture:
slice()
sets up a mapping between variables, but that mapping isn't
tested for correctness until it is used (potentially much later).
Signature: (P(); C())
Internal vaffine identity function.
Signature: (a(n); int ind(); [oca] c())
index
and index2d
provide rudimentary index indirection.
$c = index($source,$ind); $c = index2d($source2,$ind1,$ind2);
use the $ind
variables as indices to look up values in $source
.
index2d
uses separate piddles for X and Y coordinates. For more
general N-dimensional indexing, see the PDL::Slices manpage or the the PDL::NiceSlice manpage
syntax.
These functions are two-way, i.e. after
$c = $a->index(pdl[0,5,8]); $c .= pdl [0,2,4];
the changes in $c
will flow back to $a
.
index
provids simple threading: multiple-dimensioned arrays are treated
as collections of 1-D arrays, so that
$a = xvals(10,10)+10*yvals(10,10); $b = $a->index(3); $c = $a->index(9-xvals(10));
puts a single column from $a
into $b
, and puts a single element
from each column of $a
into $c
. If you want to extract multiple
columns from an array in one operation, see dice or indexND:/indexND.
Signature: (a(na,nb); int inda(); int indb(); [oca] c())
index
and index2d
provide rudimentary index indirection.
$c = index($source,$ind); $c = index2d($source2,$ind1,$ind2);
use the $ind
variables as indices to look up values in $source
.
index2d
uses separate piddles for X and Y coordinates. For more
general N-dimensional indexing, see the PDL::Slices manpage or the the PDL::NiceSlice manpage
syntax.
These functions are two-way, i.e. after
$c = $a->index(pdl[0,5,8]); $c .= pdl [0,2,4];
the changes in $c
will flow back to $a
.
index
provids simple threading: multiple-dimensioned arrays are treated
as collections of 1-D arrays, so that
$a = xvals(10,10)+10*yvals(10,10); $b = $a->index(3); $c = $a->index(9-xvals(10));
puts a single column from $a
into $b
, and puts a single element
from each column of $a
into $c
. If you want to extract multiple
columns from an array in one operation, see dice or indexND:/indexND.
Backwards-compatibility alias for indexND
Find selected elements in an N-D piddle, with optional boundary handling
$out = $source->indexND( $index, [$method] )
$source = 10*xvals(10,10) + yvals(10,10); $index = pdl([[2,3],[4,5]],[[6,7],[8,9]]); print $source->indexND( $index );
[ [23 45] [67 89] ]
IndexND collapses $index
by lookup into $source
. The
0th dimension of $index
is treated as coordinates in $source
, and
the return value has the same dimensions as the rest of $index
.
The returned elements are looked up from $source
. Dataflow
works -- propagated assignment flows back into $source
.
IndexND and IndexNDb were originally separate routines but they are both now implemented as a call to range, and have identical syntax to one another.
Signature: (P(); C(); SV *index; SV *size; SV *boundary)
Engine for range
Same calling convention as range, but you must supply all
parameters. rangeb
is marginally faster as it makes a direct PP call,
avoiding the perl argument-parsing step.
Extract selected chunks from a source piddle, with boundary conditions
$out = $source->range($index,[$size,[$boundary]])
Returns elements or rectangular slices of the original piddle, indexed by
the $index
piddle. $source
is an N-dimensional piddle, and $index
is
a piddle whose first dimension has order up to N. Each row of $index
is
treated as coordinates of a single value or chunk from $source
, specifying
the location(s)
to extract.
INPUTS
$index
and $size
can be piddles or array refs such as you would
feed to zeroes and its ilk. If $index
's 0th dimension
has order higher than the number of dimensions in $source
, then
$source
is treated as though it had trivial dummy dimensions of
order 1, up to the required size to be indexed by $index
-- so if
your source array is 1-D and your index array is a list of 3-vectors,
you get two dummy dimensions of order 1 on the end of your source array.
You can extract single elements or N-D rectangular ranges from $source
,
by setting $size
. If $size
is undef or zero, then you get a single
sample for each row of $index
. This behavior is similar to
indexNDb, which is in fact implemented as a call to range.
If $size
is positive then you get a range of values from $source
at
each location, and the output has extra dimensions allocated for them.
$size
can be a scalar, in which case it applies to all dimensions, or an
N-vector, in which case each element is applied independently to the
corresponding dimension in $source
. See below for details.
$boundary
is a number, string, or list ref indicating the type of
boundary conditions to use when ranges reach the edge of $source
. If you
specify no boundary conditions the default is to forbid boundary violations
on all axes. If you specify exactly one boundary condition, it applies to
all axes. If you specify more (as elements of a list ref, or as a packed
string, see below), then they apply to dimensions in the order in which they
appear, and the last one applies to all subsequent dimensions. (This is
less difficult than it sounds; see the examples below).
The boundary condition identifiers all begin with unique characters, so you can feed in multiple boundary conditions as either a list ref or a packed string. (The packed string is marginally faster to run). For example, the four expressions [0,1], ['forbid','truncate'], ['f','t'], and 'ft' all specify that violating the boundary in the 0th dimension throws an error, and all other dimensions get truncated.
If you feed in a single string, it is interpreted as a packed boundary array if all of its characters are valid boundary specifiers (e.g. 'pet'), but as a single word-style specifier if they are not (e.g. 'forbid').
OUTPUT
The output threads over both $index
and $source
. Because implicit
threading can happen in a couple of ways, a little thought is needed. The
returned dimension list is stacked up like this:
(index thread dims), (index dims (size)), (source thread dims)
The first few dims of the output correspond to the extra dims of
$index
(beyond the 0 dim). They allow you to pick out individual
ranges from a large, threaded collection.
The middle few dims of the output correspond to the size dims
specified in $size
, and contain the range of values that is extracted
at each location in $source
. Every nonzero element of $size
is copied to
the dimension list here, so that if you feed in (for example) $size
= [2,0,1]
you get an index dim list of (2,1)
.
The last few dims of the output correspond to extra dims of $source
beyond
the number of dims indexed by $index
. These dims act like ordinary
thread dims, because adding more dims to $source
just tacks extra dims
on the end of the output. Each source thread dim ranges over the entire
corresponding dim of $source
.
Dataflow: Dataflow is bidirectional.
Examples:
Here are basic examples of range
operation, showing how to get
ranges out of a small matrix. The first few examples show extraction
and selection of individual chunks. The last example shows
how to mark loci in the original matrix (using dataflow).
perldl> $src = 10*xvals(10,5)+yvals(10,5) perldl> print $src->range([2,3]) # Cut out a single element 23 perldl> print $src->range([2,3],1) # Cut out a single 1x1 block [ [23] ] perldl> print $src->range([2,3], [2,1]) # Cut a 2x1 chunk [ [23 33] ] perldl> print $src->range([[2,3]],[2,1]) # Trivial list of 1 chunk [ [ [23] [33] ] ] perldl> print $src->range([[2,3],[0,1]], [2,1]) # two 2x1 chunks [ [ [23 1] [33 11] ] ] perldl> # A 2x2 collection of 2x1 chunks perldl> print $src->range([[[1,1],[2,2]],[[2,3],[0,1]]],[2,1]) [ [ [ [11 22] [23 1] ] [ [21 32] [33 11] ] ] ] perldl> $src = xvals(5,3)*10+yvals(5,3) perldl> print $src->range(3,1) /* Thread over y dimension in $src */ [ [30] [31] [32] ]
perldl> $src = zeroes(5,4); perldl> $src->range(pdl([2,3],[0,1]),pdl(2,1)) .= xvals(2,2,1) + 1 perldl> print $src [ [0 0 0 0 0] [2 2 0 0 0] [0 0 0 0 0] [0 0 1 1 0] ]
CAVEAT: It's quite possible to select multiple ranges that intersect. In that case, modifying the ranges doesn't have a guaranteed result in the original PDL -- the result is an arbitrary choice among the valid values. For some things that's OK; but for others it's not. In particular, this doesn't work:
perldl> $photon_list = new PDL::RandVar->sample(500)->reshape(2,250)*10 perldl> histogram = zeroes(10,10) perldl> histogram->range($photon_list,1)++; #not what you wanted
The reason is that if two photons land in the same bin, then that bin doesn't get incremented twice. (That may get fixed in a later version...)
PERMISSIVE RANGING: If $index
has too many dimensions compared
to $source
, then $source is treated as though it had dummy
dimensions of order 1, up to the required number of dimensions. These
virtual dummy dimensions have the usual boundary conditions applied to
them.
Efficiency: Because range
isn't an affine transformation (it
involves lookup into a list of N-D indices), it is somewhat
memory-inefficient for long lists of ranges, and keeping dataflow open
is somewhat slower than for affine transformations (which don't have
to copy data around). Thus, if you have a few very large (order of
10^4 element) chunks, you might find that perl lists of ordinary
slices are faster.
range
is a perl front-end to a PP function, rangeb
. Calling
rangeb
is marginally faster but requires that you include all arguments.
DEVEL NOTES
* index thread dimensions are effectively clumped internally. This makes it easier to loop over the index array but a little more brain-bending to tease out the algorithm.
* Currently the index threads really do run fastest in memory; this is probably the wrong direction to thread, for fastest behavior -- modifying the appropriate dimincs in RedoDims ought to take care of it.
Signature: (int a(n); b(n); [o]c(m))
Run-length decode a vector
Given a vector $a
of the numbers of instances of values $b
, run-length
decode to $c
.
rld($a,$b,$c=null);
Signature: (c(n); int [o]a(n); [o]b(n))
Run-length encode a vector
Given vector $c
, generate a vector $a
with the number of each element,
and a vector $b
of the unique values. Only the elements up to the
first instance of 0
in $a
should be considered.
rle($c,$a=null,$b=null);
Signature: (P(); C(); int n1; int n2)
exchange two dimensions
Negative dimension indices count from the end.
The command
$b = $a->xchg(2,3);
creates $b
to be like $a
except that the dimensions 2 and 3
are exchanged with each other i.e.
$b->at(5,3,2,8) == $a->at(5,3,8,2)
Re-orders the dimensions of a PDL based on the supplied list.
Similar to the xchg method, this method re-orders the dimensions of a PDL. While the xchg method swaps the position of two dimensions, the reorder method can change the positions of many dimensions at once.
# Completely reverse the dimension order of a 6-Dim array. $reOrderedPDL = $pdl->reorder(5,4,3,2,1,0);
The argument to reorder is an array representing where the current dimensions
should go in the new array. In the above usage, the argument to reorder
(5,4,3,2,1,0)
indicates that the old dimensions ($pdl
's dims) should be re-arranged to make the
new pdl ($reOrderPDL
) according to the following:
Old Position New Position ------------ ------------ 5 0 4 1 3 2 2 3 1 4 0 5
Example:
perldl> $a = sequence(5,3,2); # Create a 3-d Array perldl> p $a [ [ [ 0 1 2 3 4] [ 5 6 7 8 9] [10 11 12 13 14] ] [ [15 16 17 18 19] [20 21 22 23 24] [25 26 27 28 29] ] ] perldl> p $a->reorder(2,1,0); # Reverse the order of the 3-D PDL [ [ [ 0 15] [ 5 20] [10 25] ] [ [ 1 16] [ 6 21] [11 26] ] [ [ 2 17] [ 7 22] [12 27] ] [ [ 3 18] [ 8 23] [13 28] ] [ [ 4 19] [ 9 24] [14 29] ] ]
The above is a simple example that could be duplicated by calling
$a->xchg(0,2)
, but it demonstrates the basic functionality of reorder.
As this is an index function, any modifications to the result PDL will change the parent.
Signature: (P(); C(); int n1; int n2)
move a dimension to another position
The command
$b = $a->mv(4,1);
creates $b
to be like $a
except that the dimension 4 is moved to the
place 1, so:
$b->at(1,2,3,4,5,6) == $a->at(1,5,2,3,4,6);
The other dimensions are moved accordingly. Negative dimension indices count from the end.
Signature: (P(); C(); int nth; int from; int step; int nsteps)
experimental function - not for public use
$a = oneslice();
This is not for public use currently. See the source if you have to. This function can be used to accomplish run-time changing of transformations i.e. changing the size of some piddle at run-time.
However, the mechanism is not yet finalized and this is just a demonstration.
Signature: (P(); C(); char* str)
Extract a rectangular slice of a piddle, from a string specifier.
slice
was the original Swiss-army-knife PDL indexing routine, but is
largely superseded by the NiceSlice source prefilter
and its associated nslice method. It is still used as the
basic underlying slicing engine for nslice, and is especially
useful in particular niche applications.
$a->slice('1:3'); # return the second to fourth elements of $a $a->slice('3:1'); # reverse the above $a->slice('-2:1'); # return last-but-one to second elements of $a
The argument string is a comma-separated list of what to do for each dimension. The current formats include the following, where a, b and c are integers and can take legal array index values (including -1 etc):
$a->slice(':,3')
is equal to $a->slice(',3')
).
$a
has dims (3,4,5)
then
$a->slice(':,(2),:')
has dimensions (3,5)
whereas
$a->slice(':,2,:')
has dimensions (3,1,5))
.
3:7:2
gives the indices
(3,5,7)
). This may be confusing to Matlab users but several other
packages already use this syntax.
An extension is planned for a later stage allowing
$a->slice('(=1),(=1|5:8),3:6(=1),4:6')
to express a multidimensional diagonal of $a
.
Trivial out-of-bounds slicing is allowed: if you slice a source
dimension that doesn't exist, but only index the 0th element, then
slice
treats the source as if there were a dummy dimension there.
The following are all equivalent:
xvals(5)->dummy(1,1)->slice('(2),0') # Add dummy dim, then slice xvals(5)->slice('(2),0') # Out-of-bounds slice adds dim. xvals(5)->slice((2),0) # NiceSlice syntax xvals(5)->((2))->dummy(0,1) # NiceSlice syntax
This is an error:
xvals(5)->slice('(2),1') # nontrivial out-of-bounds slice dies
Because slicing doesn't directly manipulate the source and destination pdl -- it just sets up a transformation between them -- indexing errors often aren't reported until later. This is either a bug or a feature, depending on whether you prefer error-reporting clarity or speed of execution.
Returns array of column numbers requested
line $pdl->using(1,2);
Plot, as a line, column 1 of $pdl
vs. column 2
perldl> $pdl = rcols("file"); perldl> line $pdl->using(1,2);
Signature: (P(); C(); SV *list)
Returns the multidimensional diagonal over the specified dimensions.
The diagonal is placed at the first (by number) dimension that is
diagonalized.
The other diagonalized dimensions are removed. So if $a
has dimensions
(5,3,5,4,6,5)
then after
$b = $a->diagonal(0,2,5);
the piddle $b
has dimensions (5,3,4,6)
and
$b->at(2,1,0,1)
refers
to $a->at(2,1,2,0,1,2)
.
NOTE: diagonal doesn't handle threadids correctly. XXX FIX
Signature: (P(); C(); int nthdim; int step; int n)
Returns a piddle of lags to parent.
Usage:
$lags = $a->lags($nthdim,$step,$nlags);
I.e. if $a
contains
[0,1,2,3,4,5,6,7]
then
$b = $a->lags(0,2,2);
is a (5,2) matrix
[2,3,4,5,6,7] [0,1,2,3,4,5]
This order of returned indices is kept because the function is called ``lags'' i.e. the nth lag is n steps behind the original.
$step
and $nlags
must be positive. $nthdim
can be
negative and will then be counted from the last dim backwards
in the usual way (-1 = last dim).
Signature: (P(); C(); int nthdim; int nsp)
Splits a dimension in the parent piddle (opposite of clump)
After
$b = $a->splitdim(2,3);
the expression
$b->at(6,4,x,y,3,6) == $a->at(6,4,x+3*y)
is always true (x
has to be less than 3).
Signature: (x(n); int shift(); [oca]y(n))
Shift vector elements along with wrap. Flows data back&forth.
Signature: (P(); C(); int id; SV *list)
internal
Put some dimensions to a threadid.
$b = $a->threadI(0,1,5); # thread over dims 1,5 in id 1
Signature: (P(); C())
A vaffine identity transformation (includes thread_id copying).
Mainly for internal use.
Signature: (P(); C(); int atind)
All threaded dimensions are made real again.
See [TBD Doc] for details and examples.
Dice rows/columns/planes out of a PDL using indexes for each dimension.
This function can be used to extract irregular subsets along many dimension of a PDL, e.g. only certain rows in an image, or planes in a cube. This can of course be done with the usual dimension tricks but this saves having to figure it out each time!
This method is similar in functionality to the slice
method, but slice requires that contiguous ranges or ranges
with constant offset be extracted. ( i.e. slice requires
ranges of the form 1,2,3,4,5
or 2,4,6,8,10
). Because of this
restriction, slice is more memory efficient and slightly faster
than dice
$slice = $data->dice([0,2,6],[2,1,6]); # Dicing a 2-D array
The arguments to dice are arrays (or 1D PDLs) for each dimension
in the PDL. These arrays are used as indexes to which rows/columns/cubes,etc
to dice-out (or extract) from the $data
PDL.
Use X
to select all indices along a given dimension (compare also
mslice). As usual (in slicing methods) trailing
dimensions can be omitted implying X
'es for those.
perldl> $a = sequence(10,4) perldl> p $a [ [ 0 1 2 3 4 5 6 7 8 9] [10 11 12 13 14 15 16 17 18 19] [20 21 22 23 24 25 26 27 28 29] [30 31 32 33 34 35 36 37 38 39] ] perldl> p $a->dice([1,2],[0,3]) # Select columns 1,2 and rows 0,3 [ [ 1 2] [31 32] ] perldl> p $a->dice(X,[0,3]) [ [ 0 1 2 3 4 5 6 7 8 9] [30 31 32 33 34 35 36 37 38 39] ] perldl> p $a->dice([0,2,5]) [ [ 0 2 5] [10 12 15] [20 22 25] [30 32 35] ]
As this is an index function, any modifications to the
slice change the parent (use the .=
operator).
Dice rows/columns/planes from a single PDL axis (dimension) using index along a specified axis
This function can be used to extract irregular subsets along any dimension, e.g. only certain rows in an image, or planes in a cube. This can of course be done with the usual dimension tricks but this saves having to figure it out each time!
$slice = $data->dice_axis($axis,$index);
perldl> $a = sequence(10,4) perldl> $idx = pdl(1,2) perldl> p $a->dice_axis(0,$idx) # Select columns [ [ 1 2] [11 12] [21 22] [31 32] ] perldl> $t = $a->dice_axis(1,$idx) # Select rows perldl> $t.=0 perldl> p $a [ [ 0 1 2 3 4 5 6 7 8 9] [ 0 0 0 0 0 0 0 0 0 0] [ 0 0 0 0 0 0 0 0 0 0] [30 31 32 33 34 35 36 37 38 39] ]
The trick to using this is that the index selects
elements along the dimensions specified, so if you
have a 2D image axis=0
will select certain X
values
- i.e. extract columns
As this is an index function, any modifications to the slice change the parent.
Copyright (C) 1997 Tuomas J. Lukka. Contributions by Craig DeForest, deforest@boulder.swri.edu. All rights reserved. There is no warranty. You are allowed to redistribute this software / documentation under certain conditions. For details, see the file COPYING in the PDL distribution. If this file is separated from the PDL distribution, the copyright notice should be included in the file.
PDL::Slices -- Stupid index tricks |