PDF::Reuse - Reuse and mass produce PDF documents


NAME

PDF::Reuse - Reuse and mass produce PDF documents


SYNOPSIS

   use PDF::Reuse;                     
   prFile('myFile.pdf');
   prText(100, 500, 'Hello World !');
   prEnd();


DESCRIPTION

This module can be used when you want to mass produce similar (but not identical) PDF documents and reuse templates, JavaScripts and some other components. It is designed to be fast, so that your programs can produce many pages per second and very big PDF documents if necessary.

The module produces PDF-1.4 files. Some features of PDF-1.5, like ``object streams'' and ``cross reference streams'', are supported, but only at an experimental level. More testing is needed. (If you get problems with a new document from Acrobat 6 or 7, try to save it or recreate it as a PDF-1.4 document first, before using it together with this module.)

Templates
Use your favorite program, probably a commercial visual tool, to produce single PDF-files to be used as templates, and then use this module to mass produce files from them.

(If you want small PDF-files or want special graphics, you can use this module also, but visual tools are often most practical.)

Lists
The module uses ``XObjects'' extensively. This is a format that makes it possible to create big lists, which are compact at the same time.

PDF-operators
The module gives you a good possibility to program at a ``low level'' with the basic graphic operators of PDF, if that is what you want to do. You can build your own libraries of low level routines, with PDF-directives ``controlled'' by Perl.

Archive-format
If you want, you get your new documents logged in a format suitable for archiving or transfer.

PDF::Reuse::Tutorial might show you best what you can do with this module.

JavaScript
You can attach JavaScripts to your PDF-files.

You can have libraries of JavaScripts. No cutting or pasting, and those who include the scripts in documents only need to know how to initiate them. (Of course those who write the scripts have to know Acrobat JavaScript well.)

Remarks about JavaScript

Some of the functions handling JavaScript have to be rewritten for Acrobat 7.

There are many limitations with Acrobat JavaScript, and the rules often change. So what works for one version of Acrobat/Reader, might not work for another. Another complication is this: When documents are downloaded via the net by Acrobat, they are most often converted (!) and necessary JavaScripts are lost.


FUNCTIONS

All functions which are successful return specified values or 1.

The module doesn't make any attempt to import anything from encrypted files.


Overview

To write a program with PDF::Reuse, you need these components:

   First | Perhaps* | Always | Any or None | Probably** | Finally

   use PDF::Reuse, prFile, prInitVars, prExtract, prForm, prInit, prField,
   prImage, prJpeg, prFont, prFontSize, prGraphState, prAdd, prText, prJs,
   prCompress, prMbox, prBookmark, prStrWidth, prLink, prDoc, prPage,
   prSinglePage, prEnd

   prDocDir*, prLogDir*, prDocForm*, prGetLogBuffer*, prBar*, prLog*,
   prTouchUp*, prVers*, prCid*, prId*, prIdType*

   *  = internal/deprecated function
   ** = not needed before prEnd or a new prFile; in those cases prPage is
        automatically inserted


Mandatory Functions

prFile - define output

Alternative 1:

   prFile ( $fileName );

Alternative 2 with parameters in an anonymous hash:

   prFile ( { Name         => $fileName,
              HideToolbar  => 1,            # 1 or 0
              HideMenubar  => 1,            # 1 or 0
              HideWindowUI => 1,            # 1 or 0
              FitWindow    => 1,            # 1 or 0
              CenterWindow => 1   } );      # 1 or 0

Alternative 3:

   prFile ( $r );  # For mod_perl 2 pass the request object

$fileName is the file to create. It is optional, just like the rest of the parameters. If another file is current when this function is called, the first one is written and closed. Only one file is processed at a time. If $fileName is undefined, output is written to STDOUT.

HideToolbar, HideMenubar, HideWindowUI, FitWindow and CenterWindow control the way the document is initially displayed.

Look at any program in this documentation for examples. prInitVars() shows how this function could be used together with a web server.

prEnd - end/flush buffers

   prEnd ()

When the processing is going to end, the buffers of the last file have to be written to disk. If this function is not called, the page structure, xref part and so on will be lost.

Look at any program in this documentation for an example.


Optional Functions

prAdd - add ``low level'' instructions

    prAdd ( $string )

With this command you can add whatever you want to the current content stream. No syntactical checks are made, but if you use an internal name, the module tries to add the resource of the ``name object'' to the ``Resources'' of current page. ``Name objects'' always begin with a '/'.

(In this documentation I often talk about an ``internal name''. It denotes a ``name object''. When PDF::Reuse creates these objects, it assigns Ft1, Ft2, Ft3 ... for fonts, Ig1, Ig2, Ig3 for images, Fo1 .. for forms, Cs1 .. for color spaces, Pt1 .. for patterns, Sh1 .. for shading dictionaries, Gs0 .. for graphic state parameter dictionaries. These names are kept until the program finishes, and my ambition is also to keep the resources available in internal tables.)

This is a simple and very powerful function. You should study the examples and the ``PDF-reference manual'', if you want to use it. (When this text is written, a possible link to download it is: http://partners.adobe.com/asn/developer/acrosdk/docs.html)

This function is intended to give you detail control at a low level.

   use PDF::Reuse;
   use strict;
   prFile('myFile.pdf');
   my $string = "0 0 1 rg\n";           # blue (to fill)
   $string   .= "150 600 100 50 re\n";  # a rectangle
   $string   .= "b\n";                  # close, fill and stroke
   prAdd($string);                       
   prEnd();
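
Since prAdd() only takes a string, small Perl routines can generate the operators for you. The following is a minimal sketch of such a routine; filled_rect() is a hypothetical helper, not part of PDF::Reuse. It sets the fill colour before the path is built and wraps everything in q/Q, so the colour does not leak into later drawing.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::Reuse): returns the PDF operators
# for one filled rectangle. The colour components are numbers from 0 to 1.
sub filled_rect
{   my ($x, $y, $width, $height, $red, $green, $blue) = @_;
    return "q\n"                                # save the graphics state
         . "$red $green $blue rg\n"             # fill colour
         . "$x $y $width $height re\n"          # rectangle path
         . "f\n"                                # fill the path
         . "Q\n";                               # restore the graphics state
}

# Three blue bars of a simple chart, 40 points apart
my @values = (50, 120, 80);
my $string = '';
for my $i (0 .. $#values)
{   $string .= filled_rect(150 + 40 * $i, 600, 30, $values[$i], 0, 0, 1);
}

# In a PDF::Reuse program, between prFile() and prEnd(): prAdd($string);
print $string;
```

The string is printed here only so the sketch runs without the module; in a real program it would be passed to prAdd().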

prBookmark - define bookmarks

   prBookmark($reference)

Defines a ``bookmark''. $reference refers to a hash or array of hashes which look something like this:


          {  text  => 'Document',
             act   => 'this.pageNum = 0; this.scroll(40, 500);',
             kids  => [ { text => 'Chapter 1',
                          act  => '1, 40, 600'
                        },
                        { text => 'Chapter 2',
                          act  => '10, 40, 600'
                        } 
                      ]
          }

Each hash can have these components:

        text    the text shown beside the bookmark
        act     the action to be triggered. Has to be a JavaScript action.
                (Three simple numbers are translated to page, x and y in the
                sentences: this.pageNum = page; this.scroll(x, y); )
        kids    will have a reference to another hash or array of hashes
        close   if this component is present, the bookmark will be closed
                when the document is opened
        color   3 numbers, RGB-colors e.g. '0.5 0.5 1' for light blue
        style   0, 1, 2, or 3. 0 = Normal, 1 = Italic, 2 = Bold, 3 = Bold Italic
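
The components above can be combined in ordinary Perl data structures. Here is a small sketch that builds a bookmark tree from a list of chapters; chapter_bookmarks() is a hypothetical helper, not part of PDF::Reuse, and the coordinates 40, 700 are arbitrary. It converts page numbers that count from 1 to the zero-based numbers Acrobat JavaScript expects in act.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::Reuse): builds a bookmark tree
# from a list of [ $title, $pageNumber ] pairs, where pages count from 1.
sub chapter_bookmarks
{   my ($docTitle, @chapters) = @_;
    my @kids;
    for my $chapter (@chapters)
    {   my ($title, $page) = @$chapter;
        my $jsPage = $page - 1;          # Acrobat JavaScript counts from 0
        push @kids, { text => $title,
                      act  => "$jsPage, 40, 700" };   # page, x, y
    }
    return { text  => $docTitle,
             close => 1,                 # closed when the document is opened
             style => 2,                 # bold
             kids  => \@kids };
}

my $tree = chapter_bookmarks('Document',
                             [ 'Chapter 1',  1 ],
                             [ 'Chapter 2', 11 ]);

# In a PDF::Reuse program, after the pages are written: prBookmark($tree);
print "$_->{text}: $_->{act}\n" for @{ $tree->{kids} };
```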

Creating bookmarks for a document:

    use PDF::Reuse;
    use strict;
    my @pageMarks;
    prFile('myDoc.pdf');
    for (my $i = 0; $i < 100; $i++)
    {   prText(40, 600, 'Something is written');
        # ...
        my $page = $i + 1;
        my $bookMark = { text => "Page $page",
                         act  => "$i, 40, 700" };
        push @pageMarks, $bookMark;
        prPage();
    }
    prBookmark( { text  => 'Document',
                  close => 1,
                  kids  => \@pageMarks } );
    prEnd();

Traditionally bookmarks have mainly been used for navigation within a document,
but they can be used for many more things. You can e.g. use them to navigate within
your data. You can let your users go to external links also, so they can "drill down"
to other documents.

See ``Remarks about JavaScript''

prCompress - compress/zip added streams

   prCompress (1)

'1' here is a directive to compress all new streams of the current file. Streams which are included with prForm, prDocForm, prDoc or prSinglePage are not changed. New JavaScripts are also created as streams and compressed, if they are at least 100 bytes long. The streams are compressed in memory, so probably there is a limit of how big they can be.

prCompress(); is a directive not to compress. This is default.

See e.g. ``Starting to reuse'' in the tutorial for an example.

prDoc - include pages from a document

   prDoc ( $documentName, $firstPage, $lastPage )

or with the parameters in an anonymous hash:

   prDoc ( { file  => $documentName,
             first => $firstPage,
             last  => $lastPage } );

Returns number of extracted pages.

If ``first'' is not given, 1 is assumed. If ``last'' is not given, you don't have any upper limit. N.B. The numbering of the pages differs from Acrobat JavaScript. In JavaScript the first page has index 0.

Adds pages from a document to the one you are creating. N.B. From version 0.32 of this module: If there are contents created with prText, prImage, prAdd, prForm and so on, prDoc tries to put the contents on the first extracted page from the old document.


If it is the first interactive component ( prDoc() or prDocForm() ), the interactive functions are kept and also merged with JavaScripts you have added, if any. But if you specify a first page other than 1, or a last page, no JavaScript is extracted from the document, because then there is a risk that an included JavaScript function might refer to something not included.

   use PDF::Reuse;
   use strict;
   prFile('myFile.pdf');                  # file to make
   prJs('customerResponse.js');           # include a JavaScript file
   prInit('nameAddress(12, 150, 600);');  # init a JavaScript function
   prForm('best.pdf');                    # page 1 from best.pdf
   prPage();
   prDoc('long.pdf');                     # a document with 11 pages
   prForm('best.pdf');                    # page 1 from best.pdf
   prText(150, 700, 'Customer Data');     # a line of text
   prEnd();

To extract pages 2-3 and 5-7 from a document and create a new document:

   use PDF::Reuse;
   use strict;
    
   prFile('new.pdf');
   prDoc( { file  => 'old.pdf',
            first => 2, 
            last  => 3 });
   prDoc( { file  => 'old.pdf',
            first => 5, 
            last  => 7 });
   prEnd();
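
If the page ranges come from a user or a configuration file, they can be parsed first and then fed to prDoc() one by one. This is only a sketch; page_ranges() is a hypothetical helper, not part of PDF::Reuse, and the 'first-last,first-last' notation is an assumption made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::Reuse): turns a specification such
# as '2-3,5-7,9' into a list of [ $first, $last ] pairs.
sub page_ranges
{   my ($spec) = @_;
    my @ranges;
    for my $part (split /\s*,\s*/, $spec)
    {   if ($part =~ /^(\d+)-(\d+)$/ && $1 <= $2)
        {   push @ranges, [ $1, $2 ];
        }
        elsif ($part =~ /^(\d+)$/)
        {   push @ranges, [ $1, $1 ];       # a single page
        }
        else
        {   die "Bad page range: '$part'\n";
        }
    }
    return @ranges;
}

for my $range (page_ranges('2-3,5-7,9'))
{   my ($first, $last) = @$range;
    # In a PDF::Reuse program:
    # prDoc( { file => 'old.pdf', first => $first, last => $last } );
    print "pages $first to $last\n";
}
```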
   
   
To add a form, an image and a page number to each page of a 16-page document (the document Battery.pdf is cropped so each page is fairly small). You could also have used prSinglePage; look at a very similar example under that function.

   use PDF::Reuse;
   use PDF::Reuse::Util;
   use strict;
     
   prFile('test.pdf');

   my $pageNumber = 0;

   for (my $page = 1; $page < 17; $page++)
   {   $pageNumber++;
       prForm(  { file => 'Words.pdf',
                  page => 5,
                  x    => 150,
                  y    => 150 } );

       prImage( { file    => 'Media.pdf',
                  page    => 6,
                  imageNo => 1,
                  x       => 450,
                  y       => 450 } );
       blackText();
       prText(360, 250, $pageNumber);
       prDoc('Battery.pdf', $pageNumber, $pageNumber);
   }
   prEnd();

prDocDir - set directory for produced documents

   prDocDir ( $directoryName )

Sets the directory for produced documents.

   use PDF::Reuse;
   use strict;
   prDocDir('C:/temp/doc');
   prFile('myFile.pdf');         # writes to C:\temp\doc\myFile.pdf
   prForm('myFile.pdf');         # page 1 from ..\myFile.pdf
   prText(200, 600, 'New text');
   prEnd();

prDocForm - use an interactive page as a form

Alternative 1) You put your parameters in an anonymous hash (only file is really necessary, the others get default values if not given).

   prDocForm ( { file     => $pdfFile,       # template file
                 page     => $page,          # page number (of imported template)
                 adjust   => $adjust,        # try to fill the media box
                 effect   => $effect,        # action to be taken
                 tolerant => $tolerant,      # continue even with an invalid form
                 x        => $x,             # $x points from the left
                 y        => $y,             # $y points from the bottom
                 rotate   => $degree,        # rotate 
                 size     => $size,          # multiply everything by $size
                 xsize    => $xsize,         # multiply horizontally by $xsize
                 ysize    => $ysize } )      # multiply vertically by $ysize

Ex.:

    my $internalName = prDocForm ( {file     => 'myFile.pdf',
                                    page     => 2 } );

Alternative 2) You put your parameters in this order:

   prDocForm ( $pdfFile, [$page, $adjust, $effect, $tolerant, $x, $y, $degree,
               $size, $xsize, $ysize] )

In both cases the function returns, in list context, $intName, @BoundingBox, $numberOfImages, and in scalar context the $internalName of the form.

Look at prForm() for an explanation of the parameters.

N.B. Usually you shouldn't adjust or change size and proportions of an interactive page. The graphic and interactive components are independent of each other and there is a great risk that any coordination is lost.

This function redefines a page to an ``XObject'' (the graphic parts), then the page can be reused in a much better way. Unfortunately there is an important limitation here. ``XObjects'' can only have single streams. If the page consists of many streams, you should concatenate them first. Adobe Acrobat can do that. (If it is an important file, take a copy of it first. Sometimes the procedure fails.) Open the document with Acrobat. Then choose the ``TouchUp Text'' tool (icon or from the tools menu). Select a line of text somewhere on the page. Right-click the mouse. Choose ``Attributes''. Change font size or anything else, and then change it back to the old value. Save the document. If there was no text on the page, use some other ``Touch Up'' tool.


   use PDF::Reuse;
   use strict;
   prDocDir('C:/temp/doc');
   prFile('newForm.pdf');
   prField('Mr/Ms', 'Mr');
   prField('First_Name', 'Lars');
   prDocForm('myFile.pdf');
   prFontSize(24);
   prText(75, 790, 'This text is added');
   prEnd();

(You can use the output from the example in prJs() as input to this example. Remember to save that file before closing it.)

See ``Remarks about JavaScript''

prExtract - extract an object group

   prExtract ( $pdfFile, $pageNo, $oldInternalName )

$oldInternalName is a ``name'' object. This is the internal name you find in the original file. Returns a $newInternalName which can be used for ``low level'' programming. You had better look at graphObj_pl and the modules it has generated for the tutorial, e.g. thermometer.pm, to see how this function can be used.

When you call this function, the necessary objects will be copied to your new PDF-file, and you can refer to them with the new name you receive.


prField - assign a value to an interactive field

   prField ( $fieldName, $value )

$fieldName is an interactive field in the document you are creating. It has to be spelled exactly the same way here as it is spelled in the document. $value is what you want to assign to the field. Put all your sentences with prField early in your script: after prFile and before prDoc or prDocForm, and of course before prEnd. Each sentence with prField is translated to JavaScript and merged with old JavaScript.

See prDocForm() for an example

If you are going to assign a value to a field consisting of several lines, you can write like this:

   my $string = "This is the first line \r second line \n 3:rd line";
   prField('fieldName', $string);

You can also let '$value' be a snippet of JavaScript-code that assigns something to the field. Then you have to put 'js:' first in ``$value'' like this:

   my $sentence = encrypt('This will be decrypted by "unPack"(JavaScript) ');
   prField('Interest_9', "js: unPack('$sentence')");

If you refer to a JavaScript function, it has to be included with prJs first. (The JavaScript interpreter will simply not be aware of old functions in the PDF-document, when the initiation is done.)


prFont - set current font

   prFont ( $fontName )

$fontName is an ``external'' font name. The parameter is optional. In list context returns $internalName, $externalName, $oldInternalName, $oldExternalname. The first two variables refer to the current font, the last two to the font before the change. In scalar context returns $internalName.

If a font wasn't found, Helvetica will be set. These names are always recognized: Times-Roman, Times-Bold, Times-Italic, Times-BoldItalic, Courier, Courier-Bold, Courier-Oblique, Courier-BoldOblique, Helvetica, Helvetica-Bold, Helvetica-Oblique, Helvetica-BoldOblique or abbreviated TR, TB, TI, TBI, C, CB, CO, CBO, H, HB, HO, HBO. (Symbol and ZapfDingbats or abbreviated S, Z, also belong to the predefined fonts, but there is something with them that I really don't understand. You should print them first on a page, and then use other fonts, otherwise they are not displayed.)

You can also use a font name from an included page. It has to be spelled exactly as it is done there. Look in the file and search for ``/BaseFont'' and the font name. But take care, e.g. the PDFMaker which converts to PDF from different Microsoft programs, only defines exactly those letters you can see on the page. You can use the font, but perhaps some of your letters were not defined.

In the distribution there is a utility program, 'reuseComponent_pl', which displays included fonts in a PDF-file and prints some letters. Run it to see the name of the font and if it is worth extracting.

   use PDF::Reuse;
   use strict;
   prFile('myFile.pdf');
   ####### One possibility #########
   prFont('Times-Roman');     # Just setting a font
   prFontSize(20);
   prText(180, 790, "This is a heading");
   ####### Another possibility #######
   my $font = prFont('C');    # Setting a font, getting an  
                              # internal name
   prAdd("BT /$font 12 Tf 25 760 Td (This is some other text)Tj ET"); 
   prEnd();

The example above shows you two ways of setting and using a font: a simple one, and a more complicated one that gives you detailed control.


prFontSize - set current font size

   prFontSize ( $size )

Returns $actualSize, $fontSizeBeforetheChange. Without parameters prFontSize() sets the size to 12 points, which is default.

prForm - use a page from an old document as a form/background

Alternative 1) You put your parameters in an anonymous hash (only file is really necessary, the others get default values if not given).

   prForm ( { file     => $pdfFile,       # template file
              page     => $page,          # page number (of imported template)
              adjust   => $adjust,        # try to fill the media box
              effect   => $effect,        # action to be taken
              tolerant => $tolerant,      # continue even with an invalid form
              x        => $x,             # $x points from the left
              y        => $y,             # $y points from the bottom
              rotate   => $degree,        # rotate 
              size     => $size,          # multiply everything by $size
              xsize    => $xsize,         # multiply horizontally by $xsize
              ysize    => $ysize } )      # multiply vertically by $ysize

Ex.:

    my $internalName = prForm ( {file     => 'myFile.pdf',
                                 page     => 2 } );

Alternative 2) You put your parameters in this order:

   prForm ( $pdfFile, $page, $adjust, $effect, $tolerant, $x, $y, $degree,
            $size, $xsize, $ysize )

In both cases the function returns, in list context, $intName, @BoundingBox, $numberOfImages, and in scalar context the $internalName of the form.

If page is omitted, 1 is assumed.

adjust, could be 1, 2 or 0/nothing. If it is 1, the program tries to adjust the form to the current media box (paper size) and keeps the proportions unchanged. If it is 2, the program tries to fill as much of the media box as possible, without regards to the original proportions. If this parameter is given, ``x'', ``y'', ``rotate'', ``size'', ``xsize'' and ``ysize'' will be ignored.

effect can have 3 values: 'print', which is default, loads the page in an internal table, adds it to the document and prints it to the current page. 'add', loads the page and adds it to the document. (Now you can ``manually'' manage the way you want to print it to different pages within the document.) 'load' just loads the page in an internal table. (You can now take parts of a page like fonts and objects and manage them, without adding all the page to the document.) You don't get any defined internal name of the form, if you let this parameter be 'load'.

tolerant can be nothing or something. If it is undefined, you will get an error if your program tries to load a page which the system cannot really handle, if it e.g. consists of many streams. If it is set to something, you have to test the first return value $internalName to know if the function was successful. Look at the program 'reuseComponent_pl' for an example of usage.

x where to start along the x-axis (cannot be combined with ``adjust'')

y where to start along the y-axis (cannot be combined with ``adjust'')

rotate A degree 0-360 to rotate the form counter-clockwise (cannot be combined with ``adjust''). Often the form disappears out of the media box if degree >= 90. Then you can move it back with the x and y-parameters. If degree == 90, you can add the width of the form to x. If degree == 180, add both width and height to x and y, and if degree == 270, you can add the height to y.

rotate can also be one of 'q1', 'q2' or 'q3'. Then the system rotates the form clockwise 90, 180 or 270 degrees and tries to keep the form within the media box.

The rotation takes place after the form has been resized or moved.

Ex. To rotate from portrait (595 x 842 pt) to landscape (842 x 595 pt):

   use PDF::Reuse;
   use strict;
   
   prFile('New_Report.pdf');
   prMbox(0, 0, 842, 595);           
   
   prForm({file   => 'cert1.pdf',
           rotate => 'q1' } );  
   prEnd();

The same rotation can be achieved like this:

   use PDF::Reuse;
   use strict;
   
   prFile('New_Report.pdf');
   prMbox(0, 0, 842, 595);
               
   prForm({file   => 'cert1.pdf',
           rotate => 270,
           y      => 595 } );  
   prEnd();
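
The y offset of 595 in the last example is the width of the portrait page, which is the height of the form once it lies on its side. The arithmetic can be sketched as below, assuming that the lower left corner of the form is at the origin and that the rotation is counter-clockwise around that corner. rotation_offsets() is a hypothetical helper, not part of PDF::Reuse; it takes the width and height of the form as they are before the rotation and only handles multiples of 90 degrees.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::Reuse): how far a form of
# $width x $height points (measured before rotation) has to be moved
# to come back into view after a counter-clockwise rotation.
sub rotation_offsets
{   my ($degree, $width, $height) = @_;
    return (0,       0)       if $degree ==   0;
    return ($height, 0)       if $degree ==  90;
    return ($width,  $height) if $degree == 180;
    return (0,       $width)  if $degree == 270;
    die "Only 0, 90, 180 and 270 degrees are handled here\n";
}

my ($dx, $dy) = rotation_offsets(270, 595, 842);
print "x => $dx, y => $dy\n";       # x => 0, y => 595

# In a PDF::Reuse program:
# prForm( { file => 'cert1.pdf', rotate => 270, x => $dx, y => $dy } );
```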

size multiply every measure by this value (cannot be combined with ``adjust'')

xsize multiply horizontally by this value (cannot be combined with ``adjust'')

ysize multiply vertically by $ysize (cannot be combined with ``adjust'')

This function redefines a page to an ``XObject'' (the graphic parts), then the page can be reused and referred to as a unit. Unfortunately there is an important limitation here. ``XObjects'' can only have single streams. If the page consists of many streams, you should concatenate them first. Adobe Acrobat can do that. (If it is an important file, take a copy of it first. Sometimes the procedure fails.) Open the document with Acrobat. Then choose the ``TouchUp Text'' tool. Select a line of text somewhere. Right-click the mouse. Choose ``Attributes''. Change font size or anything else, and then change it back to the old value. Save the document. You could alternatively save the file as PostScript and redistill it with the distiller or with Ghostscript, but this is a little more risky. You might lose fonts or something else. Another alternative could be to use prSinglePage().


   use PDF::Reuse;
   use strict;
   prFile('myFile.pdf');
   prForm('best.pdf');    # Takes page No 1
   prText(75, 790, 'Dear Mr Gates');
   # ...
   prPage();
   prMbox(0, 0, 900, 960);
   my @vec = prForm(   { file => 'EUSA.pdf',
                         adjust => 1 } );
   prPage();
   prMbox();
   prText(35, 760, 'This is the final page');
   # More text ..
   #################################################################
   # We want to put a miniature of EUSA.pdf, 35 points from the left
   # 85 points up, and in the format 250 X 200 points
   #################################################################
   my $xScale = 250 / ($vec[3] - $vec[1]);
   my $yScale = 200 / ($vec[4] - $vec[2]);
   
   prForm ({ file => 'EUSA.pdf',
             xsize => $xScale,
             ysize => $yScale,
             x     => 35,
             y     => 85 });
   prEnd();

The first prForm(), in the code, is a simple and ``normal'' way of using the function. The second time it is used, the size of the imported page is changed. It is adjusted to the media box which is current at that moment. Also data about the form is taken, so you can control in more detail how it will be displayed.
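
The scale calculation in the example can be put in a small routine of its own. This is a sketch under the assumption that the bounding box has a positive width and height; fit_scale() is a hypothetical helper, not part of PDF::Reuse, and the values in @vec are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::Reuse): scale factors that make a
# form with the bounding box [ $llx, $lly, $urx, $ury ] fit into a box of
# $boxWidth x $boxHeight points. With $keepProportions set, the smaller
# factor is used in both directions.
sub fit_scale
{   my ($bbox, $boxWidth, $boxHeight, $keepProportions) = @_;
    my ($llx, $lly, $urx, $ury) = @$bbox;
    my $xScale = $boxWidth  / ($urx - $llx);
    my $yScale = $boxHeight / ($ury - $lly);
    if ($keepProportions)
    {   my $scale = $xScale < $yScale ? $xScale : $yScale;
        ($xScale, $yScale) = ($scale, $scale);
    }
    return ($xScale, $yScale);
}

# The same layout as a list returned by prForm():
# ($internalName, $llx, $lly, $urx, $ury, $numberOfImages)
my @vec = ('Fo1', 0, 0, 500, 800, 0);
my ($xsize, $ysize) = fit_scale([ @vec[1 .. 4] ], 250, 200, 1);
print "xsize => $xsize, ysize => $ysize\n";   # xsize => 0.25, ysize => 0.25

# In a PDF::Reuse program:
# prForm( { file => 'EUSA.pdf', xsize => $xsize, ysize => $ysize,
#           x => 35, y => 85 } );
```

Without the last parameter the routine returns the two independent factors, which is what the example above does.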

prGetLogBuffer - get the log buffer.

   prGetLogBuffer ()

Returns a $buffer with the log of the current page. (It could be used e.g. to calculate an MD5 digest of what has been registered so far, instead of accumulating the single values.) A log has to be active, see prLogDir() below.

Look at ``Using the template'' and ``Restoring a document from the log'' in the tutorial for examples of usage.

prGraphState - define a graphic state parameter dictionary

   prGraphState ( $string )

This is a ``low level'' function. Returns $internalName. The $string has to be a complete dictionary with initial ``<<'' and terminating ``>>''. No syntactical checks are made. Perhaps you will never have to use this function.

   use PDF::Reuse;
   use strict;
   prFile('myFile.pdf');
   ###################################################
   # Draw a triangle with Gs0 (automatically defined)
   ###################################################
   my $str = "q\n";
   $str   .= "/Gs0 gs\n";
   $str   .= "150 700 m\n";
   $str   .= "225 800 l\n";
   $str   .= "300 700 l\n";
   $str   .= "150 700 l\n";
   $str   .= "S\n";
   $str   .= "Q\n";
   prAdd($str);
   ########################################################
   # Define a new graph. state param. dic. and draw a new
   # triangle further down 
   ########################################################
   $str = '<</Type/ExtGState/SA false/SM 0.02/TR2 /Default'
                      . '/LW 15/LJ 1/ML 1>>';
   my $gState = prGraphState($str);
   $str  = "q\n";
   $str .= "/$gState gs\n";
   $str .= "150 500 m\n";
   $str .= "225 600 l\n";
   $str .= "300 500 l\n";
   $str .= "150 500 l\n";
   $str .= "S\n";
   $str .= "Q\n";
   prAdd($str);
   
   prEnd();

prImage - reuse an image from an old PDF document

Alternative 1) You put your parameters in an anonymous hash (only file is really necessary, the others get default values if not given).

   prImage( { file     => $pdfFile,       # template file
              page     => $page,          # page number
              imageNo  => $imageNo,       # image number
              adjust   => $adjust,        # try to fill the media box
              effect   => $effect,        # action to be taken
              x        => $x,             # $x points from the left
              y        => $y,             # $y points from the bottom
              rotate   => $degree,        # rotate 
              size     => $size,          # multiply everything by $size
              xsize    => $xsize,         # multiply horizontally by $xsize
              ysize    => $ysize } )      # multiply vertically by $ysize

Ex.:

   prImage( { file    => 'myFile.pdf',
              page    => 10,
              imageNo => 2 } );

Alternative 2) You put your parameters in this order:

   prImage ( $pdfFile, [$page, $imageNo, $effect, $adjust, $x, $y, $degree,
             $size, $xsize, $ysize] )

In scalar context the function returns $internalName, in list context $internalName, $width, $height.

Assumes that $page and $imageNo are 1, if not specified. If $effect is given and is anything other than 'print', the image will be defined in the document, but not shown at this moment.

For all other parameters, look at prForm().

   use PDF::Reuse;
   use strict;
   prFile('myFile.pdf'); 
   my @vec = prImage({ file  => 'best.pdf',
                       x     => 10,
                       y     => 400,
                       xsize => 0.9,
                       ysize => 0.8 } );
   prText(35, 760, 'This is some text');
   # ...
   prPage();
   my @vec2 = prImage( { file    => 'destiny.pdf',
                         page    => 1,
                         imageNo => 1,
                         effect  => 'add' } );
   prText(25, 760, "There shouldn't be any image on this page");
   prPage();
   ########################################################
   #  Now we make both images so that they could fit into
   #  a box 300 X 300 points, and they are displayed
   ########################################################
   prText(25, 800, 'This is the first image :');
   my $xScale = 300 / $vec[1];
   my $yScale = 300 / $vec[2];
   if ($xScale < $yScale)
   {  $yScale = $xScale;
   }
   else
   {  $xScale = $yScale;
   }
   prImage({ file   => 'best.pdf',
             x      => 25,
             y      => 450,
             xsize  => $xScale,
             ysize  => $yScale} );
   prText(25, 400, 'This is the second image :');
   $xScale = 300 / $vec2[1];
   $yScale = 300 / $vec2[2];
   if ($xScale < $yScale)
   {  $yScale = $xScale;
   }
   else
   {  $xScale = $yScale;
   }
   prImage({ file   => 'destiny.pdf',
             x      => 25,
             y      => 25,
             xsize  => $xScale,
             ysize  => $yScale} );
   prEnd();

On the first page an image is displayed in a simple way. While the second page is processed, prImage() loads an image, but it is not shown there. On the third page, the two images are scaled and shown.

In the distribution there is a utility program, 'reuseComponent_pl', which displays included images in a PDF-file and their ``names''.

prInit - add JavaScript to be executed at initiation

   prInit ( $string, $duplicateCode )

$string can be any JavaScript code, but you can only refer to functions included with prJs. The JavaScript interpreter will not know other functions in the document. Often you can add new things, but you can't remove or change interactive fields, because the interpreter hasn't come that far, when initiation is done.

$duplicateCode is undefined or anything. It duplicates the JavaScript code which has been used at initiation, so you can look at it from within Acrobat and debug it. It makes the document bigger. This parameter is deprecated.

   use PDF::Reuse;
   use strict;
   
   prFile('myFile.pdf');
   prInit('app.alert("This is displayed when opening the document");');
      
   prEnd();

Remark: Avoid using "return" in the code you use at initiation. If your user has
downloaded a page with Web Capture, and after that opens a PDF-document where a
JavaScript is run at initiation and that JavaScript contains a return-statement,
a bug occurs. The JavaScript interpreter "exits" instead of returning, so the execution
of the JavaScript might finish too early. This is a bug in Acrobat/Reader 5.

prInitVars - initiate global variables and internal tables

   prInitVars(1)

If you run programs with PDF::Reuse as persistent procedures, you probably need to initiate global variables. If you have '1' or anything as parameter, internal tables for forms, images, fonts and interactive functions are not initiated. The module ``learns'' offset and sizes of used objects, and can process them faster, but at the same time the size of the program grows.

   use PDF::Reuse;
   use strict;
   prInitVars();     # To initiate ALL global variables and tables
   # prInitVars(1);  # To make it faster, but more memory consuming
   $| = 1;
   print STDOUT "Content-Type: application/pdf \n\n";
   prFile();         # To send the document uncatalogued to STDOUT
   prForm('best.pdf');
   prText(25, 790, 'Dear Mr. Anders Persson');
   # ...
   prEnd();

If you call this function without parameters all global variables, including the internal tables, are initiated.


prJpeg - import a jpeg-image

   prJpeg ( $imageFile, $width, $height )

$imageFile contains one single jpeg-image. $width and $height also have to be specified. Returns the $internalName.

   use PDF::Reuse;
   use Image::Info qw(image_info dim);
   use strict;
   my $file = 'myImage.jpg';
   my $info = image_info($file);
   my ($width, $height) = dim($info);    # Get the dimensions
   prFile('myFile.pdf');
   my $intName = prJpeg("$file",         # Define the image 
                         $width,         # in the document
                         $height);
   my $str = "q\n";
   $str   .= "$width 0 0 $height 10 10 cm\n";
   $str   .= "/$intName Do\n";
   $str   .= "Q\n";
   prAdd($str);
   prEnd();

This is more of a reserve routine for adding images to the document. The simplest way is to use prImage().
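
If you place several jpeg-images, the four lines of PDF-operators from the example above can be collected in a small routine of your own. The following is only a sketch: image_stream and its parameters are invented here and are not part of PDF::Reuse, and 'Ig1' just stands in for whatever internal name prJpeg returned.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::Reuse): build the PDF-operators
# that paint an already defined image.
#   $name             internal name returned by prJpeg
#   $width, $height   size of the image
#   $x, $y            lower left corner in points
#   $scale            optional scale factor, defaults to 1
sub image_stream {
    my ($name, $width, $height, $x, $y, $scale) = @_;
    $scale = 1 unless defined $scale;
    my $w = $width  * $scale;
    my $h = $height * $scale;
    return "q\n"                     # save the graphics state
         . "$w 0 0 $h $x $y cm\n"    # scale and move the unit square
         . "/$name Do\n"             # paint the image
         . "Q\n";                    # restore the graphics state
}

# 'Ig1' is an invented name, used only for the demonstration
print image_stream('Ig1', 200, 100, 10, 10, 0.5);
```

The returned string is meant to be handed to prAdd(), just like $str in the example.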

prJs - add JavaScript

   prJs ( $string|$fileName )

Adds JavaScript to your new document. $string has to consist only of JavaScript functions: function a (..) { ... } function b (..) { ... } and so on. If $string doesn't contain '{', it is interpreted as a file name. In that case the file has to consist only of JavaScript functions.

See ``Remarks about JavaScript''
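
The rule for telling code from file names can be mirrored in a few lines, e.g. if you want to report a missing script file yourself before calling prJs. The helper js_argument_kind and the file name are invented for this sketch; only the '{' rule comes from the description above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::Reuse): predict how prJs will
# treat its argument, following the '{' rule described above.
sub js_argument_kind {
    my ($arg) = @_;
    return index($arg, '{') >= 0 ? 'code' : 'file';
}

print js_argument_kind('function a () { return 1; }'), "\n";   # code
print js_argument_kind('scripts/common.js'), "\n";             # file
```

A consequence of the rule is that a file name which happens to contain '{' would be taken for code.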

prLink - add a hyper link

   prLink( { page   => $pageNo,     # Starting with 1  !
             x      => $x,
             y      => $y,
             width  => $width,
             height => $height,
             URI    => $URI     } );

You can also call prLink like this:

   prLink($page, $x, $y, $width, $height, $URI);

You have to put prLink after prFile and before the statements where its page is created. The links are created at the page breaks. If the page has already been created, no new link will be inserted.

Here is an example where the links of a 4 page document are preserved, and a link is added at the end of the document. We assume that there is some suitable text at that place (x = 400, y = 350):

   use strict;
   use PDF::Reuse;
   prFile('test.pdf');
   prLink( {page   => 4,
            x      => 400,
            y      => 350,
            width  => 105,
            height => 15,
            URI    => 'http://www.purelyInvented.com/info.html' } );
   prDoc('fourPages.pdf');
   prEnd();

( If you are creating each page of a document separately, you can also use 'hyperLink' from PDF::Reuse::Util. Then you get an external text in Helvetica-Oblique, underlined and in blue.

  use strict;
  use PDF::Reuse;
  use PDF::Reuse::Util;
  prFile('test.pdf');
  prForm('template.pdf', 5);
  my ($from, $pos) = prText(25, 700, 'To get more information  ');
  $pos = hyperLink( $pos, 700, 'Press this link',
                    'http://www.purelyInvented.com/info.html' );
  ($from, $pos) = prText( $pos, 700, ' And get connected');
  prEnd();

'hyperLink' has a few parameters: $x, $y, $textToBeShown, $hyperLink and $fontSize (not shown in the example). It returns current x-position. )

prLog - add a string to the log

   prLog ( $string )

Adds whatever you want to the current log (a reference number, a comment, a tag?). A log has to be active; see prLogDir().

Look at ``Using the template'' and ``Restoring the document from the log'' in the tutorial for an example.

prLogDir - set directory for the log

   prLogDir ( $directory )

Sets a directory for the logs and activates the logging. A little log file is created for each PDF-file. Normally it should be much, much more compact than the PDF-file, and it should be possible to restore or verify a document with the help of it. (Of course you could compress the logs or store them in a database to save even more space.)

   use PDF::Reuse;
   use strict;
   prDocDir('C:/temp/doc');
   prLogDir('C:/run');
   prFile('myFile.pdf');
   prForm('best.pdf');
   prText(25, 790, 'Dear Mr. Anders Persson');
   # ...
   prEnd();

In this example a log file with the name 'myFile.pdf.dat' is created in the directory 'C:\run'. If that directory doesn't exist, the system tries to create it. (But, just as mkdir does, it only creates the last level in a directory tree.)
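
If your log directory lies several levels down in a tree that may not exist yet, you can create the tree yourself before calling prLogDir. A minimal sketch with the core module File::Path follows; ensure_log_dir and the directory names are invented for the illustration.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper (not part of PDF::Reuse): create every missing
# level of a directory tree and return the path.
sub ensure_log_dir {
    my ($dir) = @_;
    make_path($dir) unless -d $dir;   # dies with a message on failure
    return $dir;
}

# Demonstration in a scratch directory; the names are invented
my $base   = tempdir(CLEANUP => 1);
my $logDir = ensure_log_dir(File::Spec->catdir($base, 'run', '2005', 'logs'));
print "log directory ready\n" if -d $logDir;
```

The returned path can then be given to prLogDir() as usual.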

prMbox - define the format (MediaBox) for a new page.

   prMbox ( $lowerLeftX, $lowerLeftY, $upperRightX, $upperRightY )

If the function or the parameters are missing, they are set to 0, 0, 595, 842 points respectively. Only for new pages. Pages created with prDoc and prSinglePage keep their media boxes unchanged.

See prForm() for an example.
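
If you often switch between page formats, a small table of sizes saves you from remembering the numbers. This is only a sketch: the table and media_box are invented here and are not part of PDF::Reuse. The US Letter size (8.5 x 11 inches) is added as an extra illustration.

```perl
use strict;
use warnings;

# Page sizes in points (1 point = 1/72 inch). The table and the helper
# are illustrations only; they are not part of PDF::Reuse.
my %PAGE_SIZE = (
    A4     => [595, 842],   # the default stated above
    Letter => [612, 792],   # US Letter, 8.5 x 11 inches
);

# Return the four MediaBox values for a named format
sub media_box {
    my ($format, $landscape) = @_;
    my $size = $PAGE_SIZE{$format}
        or die "Unknown page format '$format'\n";
    my ($width, $height) = @$size;
    ($width, $height) = ($height, $width) if $landscape;
    return (0, 0, $width, $height);
}

print join(' ', media_box('A4')), "\n";          # 0 0 595 842
print join(' ', media_box('Letter', 1)), "\n";   # 0 0 792 612
```

The list returned has the same order as the parameters of prMbox, so it can be passed on directly.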


prPage - create/insert a page

   prPage ($noLog)

Don't use the optional parameter. It is only used internally, to avoid cluttering the log when automatic page breaks are made.


See prForm() for an example.

prSinglePage - take single pages, one by one, from an old document

   prSinglePage($file, $pageNumber)

$pageNumber is optional. If not given, the next page is assumed. Returns the number of remaining pages. This function is a variant of prDoc for single pages, with the addition that it keeps a counter of the last page read and the total number of pages of the old document, so it can be used to loop through a document.

   
To add a form, an image and a page number to each page of a document (the document Battery.pdf is cropped so each page is fairly small). You could also have used prDoc, but only if you knew in advance the number of pages of the old document:

   use PDF::Reuse;
   use PDF::Reuse::Util;
   use strict;
      
   prFile('test.pdf');
      
   my $pageNumber = 0;
   my $left = 1;            # Every valid PDF-document has at least 1 page,
                            # so that can be assumed  
    
   while ($left) 
   {   $pageNumber++;
       prForm(  { file =>'Words.pdf',
                  page => 5,
                  x    => 150,
                  y    => 150} );
                    
       prImage( { file    =>'Media.pdf',
                  page    => 6,
                  imageNo => 1,
                  x       => 450,
                  y       => 450 } );
       blackText();
       prText( 360, 250, $pageNumber);
       $left = prSinglePage('Battery.pdf');   
    } 
     
    prEnd;

prSinglePage creates a new page from an old document and adds new content (to the array of streams of that page). Most often you can add new contents to the page like the example above, and it works fine, but sometimes you get surprises. There can e.g. be instructions in the earlier contents to make filling color white, and then you will probably not see added new text. That is why PDF::Reuse::Util::blackText() is used in the example. There can be other instructions like moving or rotating the user space. Also new contents can end up outside the crop-box. Of course all new programs should be tested. If prSinglePage can't be used, try to use prForm followed by prPage instead.


prStrWidth - calculate the string width

   prStrWidth($string, $font, $fontSize)

Returns string width in points. Should be used in conjunction with one of these predefined fonts of Acrobat/Reader: Times-Roman, Times-Bold, Times-Italic, Times-BoldItalic, Courier, Courier-Bold, Courier-Oblique, Courier-BoldOblique, Helvetica, Helvetica-Bold, Helvetica-Oblique, Helvetica-BoldOblique. If some other font is given, Helvetica is used, and the returned value will at the best be approximate.
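
A typical use of a string width is to break a long text into lines that fit a column. The sketch below keeps the measuring outside the routine, so it can be tried without producing a PDF-file. wrap_text is invented for the illustration and is not part of PDF::Reuse, and the 6 points per character is only a stand-in for real measuring.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::Reuse): break $text into lines
# that are at most $maxWidth points wide. $measure is a code reference
# that returns the width of a string in points.
sub wrap_text {
    my ($text, $maxWidth, $measure) = @_;
    my @lines;
    my $line = '';
    for my $word (split ' ', $text) {
        my $candidate = length($line) ? "$line $word" : $word;
        if (length($line) && $measure->($candidate) > $maxWidth) {
            push @lines, $line;      # the line is full, start a new one
            $line = $word;
        }
        else {
            $line = $candidate;
        }
    }
    push @lines, $line if length($line);
    return @lines;
}

# Stand-in measure for the demonstration: 6 points per character.
# With PDF::Reuse loaded you would rather pass
#    sub { prStrWidth($_[0], 'Helvetica', 12) }
my @lines = wrap_text('Dear Mr. Anders Persson, welcome back',
                      120, sub { 6 * length($_[0]) });
print "$_\n" for @lines;
```

A single word that is wider than the column is not split; it simply gets a line of its own.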

prText - add a text-string

   prText ( $x, $y, $string, $align, $rotation )

Puts $string at position $x, $y. Returns 1 in scalar context. Returns ($xFrom, $xTo) in list context. $xTo will not be defined together with a rotation. prStrWidth() is used to calculate the length of the strings, so only the predefined fonts of Acrobat/Reader will give reliable values for $xTo.

$align can be 'left' (= default), 'center' or 'right'. The parameter is optional.

$rotation can be a degree 0 - 360, 'q1', 'q2' or 'q3'. Also optional.

Current font and font size are used. (If you use prAdd() before this function, many other things could also influence the text.)

   use strict;
   use PDF::Reuse;
   prFile('test.pdf');
   #####################################
   # Use a "cursor" ($pos) along a line
   #####################################
   my ($from, $pos) = prText(25, 800, 'First write this. ');
   ($from, $pos) = prText($pos, 800, 'Then write this. '); 
   prText($pos, 800, 'Finally write this.');
   #####################################
   # Right adjust and center sentences
   #####################################
   prText( 200, 750, 'A short sentence', 'right');
   prText( 200, 735, 'This is a longer sentence', 'right');
   prText( 200, 720, 'A word', 'right');
   prText( 200, 705, 'Centered around a point 200 points from the left', 'center');
   prText( 200, 690, 'The same center', 'center');
   prText( 200, 675, '->.<-', 'center');
   ############
   # Rotation
   ############
   prText( 200, 550, ' Rotate 0 degrees','', 0);
   prText( 200, 550, ' Rotate 60 degrees','', 60);
   prText( 200, 550, ' Rotate 120 degrees','', 120);
   prText( 200, 550, ' Rotate 180 degrees','', 180);
   prText( 200, 550, ' Rotate 240 degrees','', 240);
   prText( 200, 550, ' Rotate 300 degrees','', 300);
   prText( 400, 430, 'Rotate 90 degrees clock-wise','','q1');
   prText( 400, 430, 'Rotate 180 degrees clock-wise','', 'q2');
   prText( 400, 430, 'Rotate 270 degrees clock-wise','', 'q3');
   
   ##########################
   # Rotate and right adjust
   ##########################
   
   prText( 200, 230, 'Rotate 90 degrees clock-wise ra->','right','q1');
   prText( 200, 230, 'Rotate 180 degrees clock-wise ra->','right', 'q2');
   prText( 200, 230, 'Rotate 270 degrees clock-wise ra->','right', 'q3');
   
   prEnd();
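
If you need to know in advance where an aligned string will begin, e.g. to draw something in front of it, the arithmetic can be sketched like this. It assumes the usual meaning of the alignments (a right adjusted string ends at $x, a centered string is spread evenly around $x). text_start is invented for the illustration and is not part of PDF::Reuse.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::Reuse): where does a string
# start, if it is $width points wide and aligned around $x?
sub text_start {
    my ($x, $width, $align) = @_;
    $align = 'left' unless defined($align) && length($align);
    return $x               if $align eq 'left';
    return $x - $width / 2  if $align eq 'center';
    return $x - $width      if $align eq 'right';
    die "Unknown alignment '$align'\n";
}

print text_start(200, 80, 'center'), "\n";   # 160
print text_start(200, 80, 'right'), "\n";    # 120
```

The width would normally come from prStrWidth(), with the same font and font size as the text.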


INTERNAL OR DEPRECATED FUNCTIONS

prBar - define and paint bars for bar fonts

   prBar ($x, $y, $string)

Prints a bar font pattern on the current page. Returns $internalName for the font. $x and $y are coordinates in points and $string should consist of the characters '0', '1' and '2' (or 'G'). '0' is a white bar, '1' is a dark bar. '2' and 'G' are dark, slightly longer bars, guard bars. You can use e.g. GD::Barcode, or one of the modules in that group, to calculate the bar code pattern. prBar ``translates'' the pattern to white and black bars.

   use PDF::Reuse;
   use GD::Barcode::Code39;
   use strict;
   prFile('myFile.pdf');
   my $oGdB = GD::Barcode::Code39->new('JOHN DOE');
   my $sPtn = $oGdB->barcode();
   prBar(100, 600, $sPtn);
   prEnd();

Internally the module uses a font for the bars, so you might want to change the font size before calling this function. In that case, use prFontSize(). If you call this function without arguments, it defines the bar font but does not write anything to the current page.
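
If the pattern comes from somewhere else than a bar code module, you might want to check it before it is painted. A minimal sketch, where is_bar_pattern is invented for the illustration and is not part of PDF::Reuse:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::Reuse): accept only the
# characters that prBar is documented to understand.
sub is_bar_pattern {
    my ($pattern) = @_;
    return defined($pattern) && $pattern =~ /\A[012G]+\z/ ? 1 : 0;
}

print is_bar_pattern('G0G1011'), "\n";   # 1
print is_bar_pattern('10x01'), "\n";     # 0
```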

An easier and often better way to produce bar codes is to use PDF::Reuse::Barcode. Look at that module!

prCid - define time stamp/check id

   prCid ( $timeStamp )

An internal function. Don't bother about it. It is used in automatic routines when you want to restore a document. It gives the modification time of the next PDF-file or JavaScript. See ``Restoring a document from the log'' in the tutorial for more about the time stamp.

prId - define id-string of a PDF document

   prId ( $string )

An internal function. Don't bother about it. It is used e.g. when a document is restored and an id has to be set, not calculated.

prIdType - define id-type

   prIdType ( $string )

An internal function. Avoid using it. $string could be ``Rep'' for replace or ``None'' to avoid calculating an id.

Normally you don't use this function. Then an id is calculated with the help of Digest::MD5::md5_hex and some data from the run.
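
To give an idea of what such a calculated id looks like, here is a sketch with Digest::MD5. Which data the module really feeds into the digest is not described here, so the file name, the time and the process id below are pure assumptions, and make_id is invented for the illustration.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Sketch only: the data that PDF::Reuse really uses for the digest is
# not described here, so file name, time and process id are assumptions.
sub make_id {
    my ($fileName, $time, $pid) = @_;
    return md5_hex(join '|', $fileName, $time, $pid);
}

my $id = make_id('myFile.pdf', time(), $$);
print "$id\n";    # 32 hexadecimal characters
```

The same input always gives the same id, which is what makes it possible to set an id again with prId when a document is restored.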


prTouchUp - make changes and reuse more difficult

   prTouchUp (1);

By default and after you have issued prTouchUp(1), you can change the document with the TouchUp tool from within Acrobat. If you want to switch off this possibility, you use prTouchUp() without any parameter. Then the user shouldn't be able to change anything graphic by mistake. He has to do something premeditated and perhaps with a little effort. He could still save it as Postscript and redistill, or he could remove or add single pages. (Here is a strong reason why the log files, and perhaps also check sums, are needed. It would be very difficult to forge a document unless the forger also has access to your computer and knows how the check sums are calculated.)

Avoid switching off the TouchUp tool for your templates. It creates an extra level within the PDF-documents. Use this function for your final documents.

See ``Using the template'' in the tutorial for an example.

This function works for pages created with prPage, but not with prDoc and prSinglePage, so it is more or less deprecated as these functions have developed.

(To encrypt your documents: use the batch utility within Acrobat)


prVers - check version of log and program

   prVers ( $versionNo )

To check version of this module in case a document has to be restored.


SEE ALSO

   PDF::Reuse::Tutorial
   PDF::Reuse::Barcode
   PDF::Reuse::OverlayChart

To program with PDF-operators, look at ``The PDF-reference Manual'', which can probably be downloaded from http://partners.adobe.com/asn/tech/pdf/specifications.jsp . Look especially at chapters 4 and 5, Graphics and Text, and the Operator summary.

Technical Note # 5186 contains the ``Acrobat JavaScript Object Specification''. I downloaded it from http://partners.adobe.com/asn/developer/technotes/acrobatpdf.html

If you are serious about producing PDF-files, you probably need Adobe Acrobat sooner or later. It has a price tag. Other good programs are GhostScript and GSview. I got them via http://www.cs.wisc.edu/~ghost/index.html . Sometimes they can replace Acrobat. A nice little detail is e.g. that GSview shows the x- and y-coordinates better than Acrobat. If you need to convert HTML-files to PDF, HTMLDOC is a possible tool. Download it from http://www.easysw.com . A simple tool for vector graphics is Mayura Draw 2.04; download it from http://www.mayura.com . It is free. I have used it to produce the graphic OO-code in the tutorial. It produces postscript which the Acrobat Distiller (you get it together with Acrobat) or Ghostscript can convert to PDF. (The commercial product, Mayura Draw 4.01 or something higher, can produce PDF-files straight away.)

If you want to import jpeg-images, you might need

   Image::Info

To get definitions for e.g. colors, take them from

   PDF::API2::Util


LIMITATIONS

Meta data, info and many other features of the PDF-format have not been implemented in this module.

Many things can be added afterwards, after creating the files. If you e.g. need files to be encrypted, you can use a standard batch routine within Adobe Acrobat.


TODO

I have been experimenting a little with a helper application for Netscape or Internet Explorer and it is quite obvious that you could get very good performance and high reliability if you transferred the logs and constructed the documents at the target computer, instead of transferring the formatted documents. The reasons are:

The size of a log is usually only a fraction of the formatted document. The logs keep a time stamp for all source files, so you could have a simple caching. It is possible to put a time stamp on the log file and then you get a hierarchical structure. When the system reads a log file it could quickly find out which source files are missing. If it encounters the URL and time stamp of a cached log file, that would be sufficient. It would not be necessary to get it over the net. You would minimize the number of conversations and you would also increase the possibilities to complete a task even if the connections are bad.

The cache could function as a secondary library for forms and JavaScripts. When you work with HTML you are usually interested in the most recent version of a component. With PDF the emphasis is usually more on exactness, and PDF-documents tend to be more stable. This strengthens the motive for a functioning cache.

(Also I think you could skip some holy rules from HTML-processing. E.g. if an international body has forms and JavaScripts for booking a hotel room, any affiliated hotel should have the right to use the common files, so they could be used via the cache regardless of whether you are booking a room in Agadir or Shanghai. That would create libraries and rational reuse of code. I think security and legal problems would be possible to handle.)

At the present time PDF cannot compete with HTML, but if you used the log files and a simple cache, PDF would be just superior for repeated tasks.


THANKS TO

Martin Langhoff, Matisse Enzer and others who have contributed with code, suggestions and error reports.

The functionality of prDoc and prSinglePage to include new contents was developed for a specific task with support from the Electoral Enrolment Centre, Wellington, New Zealand.


MAILING LIST

   http://groups.google.com/group/PDF-Reuse


AUTHOR

Lars Lundberg larslund@cpan.org


COPYRIGHT

Copyright (C) 2003 - 2004 Lars Lundberg, Solidez HB.

Copyright (C) 2005 Karin Lundberg. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.


DISCLAIMER

You get this module free as it is, but nothing is guaranteed to work, whatever is implicitly or explicitly stated in this document, and everything you do, you do at your own risk - I will not take responsibility for any damage, loss of money and/or health that may arise from the use of this module.
