XML::XPathScript::Processor - the XML transformation engine in XML::XPathScript
In a stylesheet ->{testcode} sub for e.g. Docbook's <ulink> tag:
my $url = findvalue('@url', $self);
if (findnodes("node()", $self)) {
    # ...
    $t->{pre}  = qq'<a href="$url">';
    $t->{post} = qq'</a>';
    return DO_SELF_AND_KIDS;
} else {
    $t->{pre}  = qq'<a href="$url">$url</a>';
    $t->{post} = qq'';
    return DO_SELF_ONLY;
};
At the stylesheet's top-level one often finds:
<%= apply_templates() %>
The XML::XPathScript distribution offers an XML parser glue, an
embedded stylesheet language, and a way of processing an XML document
into a text output. This package implements the last of these: it takes
an already filled out $t template hash and an already parsed
XML document (which come from the XML::XPathScript manpage behind the scenes),
and provides a simple API to implement stylesheets. In particular, the
apply_templates function triggers the recursive expansion of
the whole XML document when used as shown in SYNOPSIS.
All of these functions are intended to be called solely from within
the ->{testcode} templates or <% %> or <%= %> blocks in XPathScript
stylesheets. They are automatically exported to both these contexts.
- DO_SELF_AND_KIDS, DO_SELF_ONLY, DO_NOT_PROCESS,
DO_TEXT_AS_CHILD
-
Symbolic constants evaluating respectively to 1, -1, 0 and 2, to be
used as mnemonic return values in ->{testcode} routines
instead of the numeric values, which are harder to
remember. Specifically:
- DO_SELF_AND_KIDS
-
tells XML::XPathScript::Processor to render the current node as
$t->{pre}, followed by the result of the call to
apply_templates on the subnodes, followed by $t->{post}.
- DO_SELF_ONLY
-
tells XML::XPathScript::Processor to render the current node simply
as $t->{pre}, followed by $t->{post}.
- DO_NOT_PROCESS
-
tells XML::XPathScript::Processor to render the current node as the
empty string.
- DO_TEXT_AS_CHILD
-
only meaningful for text nodes. When this value is returned, XML::XPathScript::Processor
pretends that the text is a child of the node, which basically means that
$t->{pre} and $t->{post} will frame the text instead of
replacing it.
For example:
$t->{pre} = '<text/>';
# will do <foo>bar</foo> => <foo><text/></foo>
$t->{pre} = '<t>';
$t->{post} = '</t>';
$t->{testcode} = sub{ DO_TEXT_AS_CHILD };
# will do <foo>bar</foo> => <foo><t>bar</t></foo>
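The effect of the four return values can be sketched with a toy renderer in plain Perl. This is only an illustration of the rules described above, not the module's implementation: the render function, the hash-based tree (elements as hashes with name and kids, text nodes as plain strings), the template table and the arguments passed to testcode are all assumptions made for this sketch. The constant values are the ones documented above.

```perl
use strict;
use warnings;

# Values as documented above; in a real stylesheet these are exported for you.
use constant {
    DO_SELF_AND_KIDS => 1,
    DO_SELF_ONLY     => -1,
    DO_NOT_PROCESS   => 0,
    DO_TEXT_AS_CHILD => 2,
};

# Hypothetical toy renderer. $templates maps an element name (or 'text()'
# for text nodes) to a hash with optional pre, post and testcode entries.
sub render {
    my ($node, $templates) = @_;
    my $is_text = !ref $node;
    my $tmpl    = $templates->{ $is_text ? 'text()' : $node->{name} };

    # No template: text passes through, elements keep their own tags.
    if (!$tmpl) {
        return $node if $is_text;
        my $kids = join '', map { render($_, $templates) } @{ $node->{kids} };
        return "<$node->{name}>$kids</$node->{name}>";
    }

    # Work on a copy so that testcode can adjust pre/post for this node only.
    my $t      = { pre => '', post => '', %$tmpl };
    my $action = $t->{testcode} ? $t->{testcode}->($node, $t) : DO_SELF_AND_KIDS;

    return ''                      if $action == DO_NOT_PROCESS;
    return $t->{pre} . $t->{post}  if $action == DO_SELF_ONLY;

    my $inner = '';
    if ($is_text) {
        # Only DO_TEXT_AS_CHILD keeps the text; otherwise pre/post replace it.
        $inner = $node if $action == DO_TEXT_AS_CHILD;
    } else {
        # On elements this sketch treats DO_TEXT_AS_CHILD like DO_SELF_AND_KIDS.
        $inner = join '', map { render($_, $templates) } @{ $node->{kids} };
    }
    return $t->{pre} . $inner . $t->{post};
}

my $doc = { name => 'foo', kids => ['bar'] };

# Text replaced by the template.
print render($doc, { 'text()' => { pre => '<text/>' } }), "\n";
# prints <foo><text/></foo>

# Text framed by the template.
print render($doc, { 'text()' => {
    pre      => '<t>',
    post     => '</t>',
    testcode => sub { DO_TEXT_AS_CHILD },
} }), "\n";
# prints <foo><t>bar</t></foo>
```

The two calls at the end reproduce the two cases from the example above, using the same <foo>bar</foo> input.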
- findnodes($path)
-
- findnodes($path, $context)
-
Returns a list of nodes found by XPath expression $path, optionally
using $context as the context node (default is the root node of the
current document). In scalar context it returns a NodeSet object, which
is best avoided (see "XPath scalar return values considered harmful" in the XML::XPathScript manpage).
- findvalue($path)
-
- findvalue($path, $context)
-
Evaluates XPath expression $path and returns the resulting value. If
the path returns one of the "Literal", "Numeric" or "NodeList" XPath
types, the stringification is done automatically for you using
xpath_to_string.
- xpath_to_string($blob)
-
Converts any XPath data type, such as "Literal", "Numeric",
"NodeList", text nodes, etc. into a pure Perl string (UTF-8 tainted
too - see is_utf8_tainted). Scalar XPath types are interpreted in
the straightforward way, DOM nodes are stringified into conforming XML,
and NodeLists are stringified by concatenating the stringification of
their members (in the latter case, the result is not
guaranteed to be valid XML).
See "XPath scalar return values considered harmful" in the XML::XPathScript manpage
on why this is useful.
- findvalues($path)
-
- findvalues($path, $context)
-
Evaluates XPath expression $path as a nodeset expression, just like
findnodes would, but returns a list of UTF8-encoded XML strings
instead of node objects or node sets. See also
XPath scalar return values considered harmful in the XML::XPathScript manpage.
- findnodes_as_string($path)
-
- findnodes_as_string($path, $context)
-
Similar to findvalues but concatenates the XML snippets. The
result obviously is not guaranteed to be valid XML.
- matches($node, $path)
-
- matches($node, $path, $context)
-
Returns true if the node matches the path (optionally in context $context).
- apply_templates()
-
- apply_templates($xpath)
-
- apply_templates($xpath, $context)
-
- apply_templates(@nodes)
-
This is where the whole magic in XPathScript resides: recursively
applies the stylesheet templates to the nodes provided either
literally (last invocation form) or through an XPath expression
(second and third invocation forms), and returns a string
concatenation of all results. If called without arguments at all,
renders the whole document (same as apply_templates("/")).
Calls to apply_templates() may occur both explicitly (at the top of
the document, and for rendering subnodes when the templates choose to
handle that by themselves), and implicitly (because testcode
routines require the XML::XPathScript::Processor to
DO_SELF_AND_KIDS).
If appropriate care is taken in all templates (especially the
testcode routines and the text() template), the string result of
apply_templates need not be UTF-8 (see
binmode in the XML::XPathScript manpage): it is thus possible to use XPathScript
to produce output in any character set without an extra translation
pass.
- call_template($node, $t, $templatename)
-
EXPERIMENTAL - allows testcode routines to invoke a template by
name, even if the selectors do not fit (e.g. one can apply template B
to an element node of type A). Returns the rendered string
computed out of $node, just like apply_templates would.
- is_element_node ( $object )
-
Returns true if $object is an element node, false otherwise.
- is_text_node ( $object )
-
Returns true if $object is a "true" text node (not a comment node),
false otherwise.
- is_comment_node ( $object )
-
Returns true if $object is an XML comment node, false otherwise.
- is_pi_node ( $object )
-
Returns true if $object is a processing instruction node, false otherwise.
- is_nodelist ( $object )
-
Returns true if $object is a node list (as returned by findnodes in
scalar context), false otherwise.
- is_utf8_tainted($string)
-
Returns true if Perl thinks that $string is a string of characters (in
UTF-8 internal representation), and false if Perl treats $string as a
meaningless string of bytes.
The dangerous part of the story is when concatenating a non-tainted
string with a tainted one, as it causes the whole string to be
re-interpreted into UTF-8, even the part that was supposedly
meaningless character-wise, and that happens in a nonportable fashion
(depends on locale and Perl version). So don't do that - and use this
function to prevent that from happening.
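The hazard can be reproduced with core Perl alone. The sketch below uses utf8::is_utf8, which reports Perl's internal UTF-8 flag; that is_utf8_tainted is equivalent to it is an assumption here, not something stated above. The helper concat_as_characters is hypothetical and not part of XML::XPathScript, and it further assumes that every non-flagged piece holds UTF-8 encoded bytes.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode every byte-string piece as UTF-8 before
# joining, so that no piece is silently reinterpreted by the concatenation.
sub concat_as_characters {
    return join '', map { utf8::is_utf8($_) ? $_ : decode('UTF-8', $_) } @_;
}

my $bytes = "\xC3\xA9";    # two raw bytes: the UTF-8 encoding of U+00E9
my $chars = "\x{263A}";    # one character, flagged as UTF-8 internally

my $naive = $bytes . $chars;                         # each byte becomes a character
my $safe  = concat_as_characters($bytes, $chars);    # bytes decoded first

printf "naive: %d characters, safe: %d characters\n",
    length($naive), length($safe);
# prints naive: 3 characters, safe: 2 characters
```

In a stylesheet, the same check could be made with is_utf8_tainted before concatenating a string obtained from the document with one built by hand.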
- get_xpath_of_node($node)
-
Returns an XPath string that points to $node, from the root. Useful to
create error messages that point at some location in the original XML
document.
Right now XML::XPathScript::Processor is just an auxiliary module
to the XML::XPathScript manpage which should not be called directly: in other
words, XPathScript's XML processing engine is not (yet) properly
decoupled from the stylesheet language parser, and thus cannot stand
alone.