CAM::PDF::Content - PDF page layout parser

NAME

CAM::PDF::Content - PDF page layout parser

LICENSE

SYNOPSIS

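    use CAM::PDF;
    my $pdf = CAM::PDF->new($filename);

    # Parse the layout commands of page 4 into a content tree
    my $contentTree = $pdf->getPageContentTree(4);
    $contentTree->validate() || die 'Syntax error';
    # Run the text renderer over the tree
    print $contentTree->render('CAM::PDF::Renderer::Text');
    # Flatten the tree and store it as the content of page 5
    $pdf->setPageContent(5, $contentTree->toString());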

DESCRIPTION

This class is used to manipulate the layout commands for a single page of a PDF document. The page content is passed as a scalar and parsed according to Adobe's PDF Reference, 3rd edition (for PDF v1.4). All of the commands from Appendix A of that document are parsed and understood.
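As a rough illustration of the parsing step described above, the sketch below builds a content tree directly from a scalar instead of reading it from a PDF file. The content string is an invented example, and the sketch assumes that CAM::PDF is installed. An empty DATA hash is passed, so only parsing and validation are attempted; treating a false return value from new() as a parse failure is an assumption.

```perl
use strict;
use warnings;
use CAM::PDF;
use CAM::PDF::Content;

# Invented page content: one text block that selects a font resource
# named /F1, moves to (72, 720), and shows a string.
my $content = 'BT /F1 12 Tf 72 720 Td (Hello, world) Tj ET';

# A true third argument asks the parser to warn about problems it finds.
# Assumption: a false return value means the content could not be parsed.
my $tree = CAM::PDF::Content->new($content, {}, 1)
    or die "Could not parse content\n";

# new() returns an unvalidated tree; validate() checks it against the
# PDF specification.
$tree->validate()
    or die "Content does not conform to the PDF specification\n";

print "Content parsed and validated\n";
```

In normal use the tree comes from getPageContentTree() on a CAM::PDF object, as in the SYNOPSIS, which also supplies the contextual data that toString() and the renderers need.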

Much of the content object's functionality is wrapped up in renderers that can be applied to it. See the canonical renderer, CAM::PDF::GS, and the render() method below for more details.

FUNCTIONS

$pkg->new($content)
$pkg->new($content, $data)
$pkg->new($content, $data, $verbose): Parse a scalar CONTENT containing PDF page layout content. Returns a parsed, but unvalidated, data structure. The DATA argument is a hash reference of contextual data that may be needed to work with the content. It is only needed by the toString() method (which needs doc => CAM::PDF object to work with images) and by the render methods, to which the DATA reference is passed verbatim. See the individual renderer modules for details about required elements. The VERBOSE boolean indicates whether the parser should warn (via Carp) when it encounters problems. The default is false.
$self->parse($contentref): This is intended to be called by the new() method. The argument should be a reference to the content scalar. It's passed by reference so it is never copied.
$self->validate(): Returns a boolean indicating whether the parsed content tree conforms to the PDF specification.
$self->render($rendererclass): Traverse the content tree using the specified rendering class. See CAM::PDF::GS or CAM::PDF::Renderer::Text for renderer examples. Renderers should typically derive from CAM::PDF::GS, but this is not essential. Typically returns an instance of the renderer class. The rendering class is loaded via require if it is not already in memory.
$self->computeGS()
$self->computeGS($skiptext): Traverses the content tree and computes the coordinates of each graphic point along the way. If the $skiptext boolean is true (default: false), text blocks are ignored to save time, since they do not change the global graphic state. This is a thin wrapper around render() with CAM::PDF::GS or CAM::PDF::GS::NoText selected as the rendering class.
$self->findImages(): Traverse the content tree, accumulating embedded images and image references, using the CAM::PDF::Renderer::Images renderer.
$self->traverse($rendererclass): This recursive method is typically called only by wrapper methods, like render(). It instantiates renderers as needed and calls methods on them.
$self->toString(): Flattens a content tree back into a scalar, ready to be inserted back into a PDF document. Since whitespace is discarded by the parser, the resulting scalar will not be identical to the original.
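
The methods above can be combined into a parse, validate, and write-back round trip. The following is a minimal sketch under stated assumptions, not a definitive implementation: rewrite_pages() is a hypothetical helper that is not part of CAM::PDF, the file names come from the command line, and skipping pages whose content cannot be parsed or validated is a choice made only for this example.

```perl
use strict;
use warnings;
use CAM::PDF;

# Hypothetical helper (not part of CAM::PDF): pass every page of $infile
# through its content tree and save the result as $outfile.
sub rewrite_pages {
    my ($infile, $outfile) = @_;

    my $pdf = CAM::PDF->new($infile)
        or die "Cannot open $infile\n";

    # CAM::PDF numbers pages starting from 1.
    for my $pagenum (1 .. $pdf->numPages()) {
        # Assumption for this sketch: pages that yield no tree are left alone.
        my $tree = $pdf->getPageContentTree($pagenum)
            or next;

        if (!$tree->validate()) {
            warn "Page $pagenum has a syntax error, leaving it unchanged\n";
            next;
        }

        # toString() discards the original whitespace, so the stored
        # content will not be byte-identical to what was read.
        $pdf->setPageContent($pagenum, $tree->toString());
    }

    $pdf->cleanoutput($outfile);
    return;
}

# Run only when both file names are supplied on the command line.
rewrite_pages(@ARGV) if @ARGV == 2;
```

Because the tree is obtained from getPageContentTree(), it already carries the doc reference that toString() needs in order to work with images.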

AUTHOR

See the CAM::PDF manpage
