Syntax::Highlight::Universal - Syntax highlighting module based on the Colorer library
use Syntax::Highlight::Universal;

my $highlighter = Syntax::Highlight::Universal->new;

$highlighter->addConfig("hrc/proto.hrc");
$highlighter->setPrecompiledConfig("precompiled.hrcc");
$highlighter->setCacheDir("/tmp/highlighter");
$highlighter->setCachePrefixLen(2);

my $result = $highlighter->highlight("perl", "print 'Hello, World!'");

my $callbacks = {
    initParsing     => \&myInitHandler,
    addRegion       => \&myRegionHandler,
    enterScheme     => \&mySchemeStartHandler,
    leaveScheme     => \&mySchemeEndHandler,
    finalizeParsing => \&myFinalizeHandler,
};
$highlighter->highlight("perl", "print 'Hello, World!'", $callbacks);

$highlighter->precompile("precompiled.hrcc");
This module can process text of any format and produce a syntax highlighted version of it. The default output format is (X)HTML; custom formats are also possible. It uses parts of the Colorer library (http://colorer.sf.net/) and supports its HRC configuration files (http://colorer.sf.net/hrc-ref/). Configuration files for about 100 file formats are included.
Syntax::Highlight::Universal doesn't export any functions. You can call its methods either statically or through an object. The result will be the same but we will use the latter here as it is the more convenient of the two.
my $highlighter = Syntax::Highlight::Universal->new;
This will create a new object and bind it to the Syntax::Highlight::Universal namespace. It can be used to call the methods of this module in a more convenient way. However, this object has no other meaning: any configuration changes performed through it will have global effect.
my $result = $highlighter->highlight(FORMAT, TEXT, [CALLBACKS]);
This will process the text and produce its syntax highlighted variant, by default in (X)HTML format.
c, cpp, asm, perl, java, idl, pascal, csharp, jsnet, vbnet, forth, fortran, vbasic, html, css, html-css, svg-css, jsp, php, php-body, xhtml-trans, xhtml-strict, xhtml-frameset, asp.vb, asp.js, asp.ps, svg, coldfusion, jScript, actionscript, vbScript, xml, dtd, xslt, xmlschema, relaxng, xlink, clarion, Clipper, foxpro, sqlj, paradox, sql, mysql, Batch, shell, apache, config, hrc, hrd, delphiform, javacc, javaProperties, lex, yacc, makefile, regedit, resources, TeX, dcl, vrml, rarscript, nsi, iss, isScripts, c1c, ada, abap4, AutoIt, awk, dssp, adsp, Baan, cobol, cache, eiffel, icon, lisp, matlab, modula2, picasm, python, rexx, ruby, sml, ocaml, tcltk, sicstusProlog, turboProlog, verilog, vhdl, z80, asm80, filesbbs, diff, messages, text, default
{ initParsing => \&Syntax::Highlight::Universal::initParsing, addRegion => \&Syntax::Highlight::Universal::addRegion, enterScheme => \&Syntax::Highlight::Universal::enterScheme, leaveScheme => \&Syntax::Highlight::Universal::leaveScheme, finalizeParsing => \&Syntax::Highlight::Universal::finalizeParsing, }
The callbacks are explained in detail below.
In the default output, regions are enclosed in <span> elements; the class attribute is set to the region's name.
The resulting code can be formatted via CSS. The css directory of the distribution contains some sample CSS files created from Colorer's HRD files with the hrd2css script (also in this directory). Keep in mind that some of these color schemes are meant for a specific background color.
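To connect the highlighted fragment with one of the CSS files, the fragment can be embedded in a small page. The following is a minimal sketch, not part of the module: the helper wrap_page and the stylesheet path css/example.css are hypothetical, and wrapping the fragment in a <pre> element is an assumption about how whitespace should be preserved.

```perl
use strict;
use warnings;

# Wrap an already highlighted (X)HTML fragment in a minimal page.
# wrap_page and $css_href are illustrative names, not part of the module.
sub wrap_page {
    my ($fragment, $css_href, $title) = @_;
    $title = 'Highlighted source' unless defined $title;
    return join "\n",
        '<!DOCTYPE html>',
        '<html>',
        '<head>',
        "<title>$title</title>",
        qq{<link rel="stylesheet" type="text/css" href="$css_href"/>},
        '</head>',
        '<body>',
        "<pre>$fragment</pre>",   # assumption: <pre> keeps line breaks intact
        '</body>',
        '</html>', '';
}

# Usage, guarded so the sketch still runs where the module is not installed.
if (eval { require Syntax::Highlight::Universal; 1 }) {
    my $highlighter = Syntax::Highlight::Universal->new;
    my $html = $highlighter->highlight('perl', "print 'Hello, World!'");
    print wrap_page($html, 'css/example.css');   # hypothetical CSS file
}
```

Since some color schemes expect a specific background color, the page's own stylesheet may need to set that background as well.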
If the default callback functions are overridden, the return value of either the initParsing or the finalizeParsing callback will be returned, depending on whether initParsing returns a value.
$highlighter->addConfig(FILE, ...);
This method imports a list of configuration files. They replace hrc/proto.hrc, which is used by default.
$highlighter->precompile(FILE);
Parsing HRC files takes a while, which makes processing the first text slow. To speed this up, configuration files can be preprocessed into a binary file. This reduces the time to load the configuration by a factor of 5 to 10 and also decreases memory usage. However, the binary file can't be edited and has to be rebuilt every time the HRC files change. Furthermore, it isn't platform independent and should always be rebuilt when moving to another server or operating system.
This method processes the current configuration and writes it to a file in a binary format. It might take some time because the whole configuration needs to be loaded into memory.
make test will create a precompiled configuration file, precompiled.hrcc. It can be copied into the library directory of the module and used instead of the HRC configuration.
$highlighter->setPrecompiledConfig([FILE]);
This method will load a precompiled configuration file. It can only be called once; combining several files isn't supported yet. addConfig can't be used together with a precompiled configuration either.
If FILE is omitted, precompiled.hrcc will be loaded.
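Because a precompiled configuration and addConfig exclude each other, an application may want to pick one of them at startup. A minimal sketch, assuming the caller passes in both paths; configure_highlighter is a hypothetical helper, and only setPrecompiledConfig and addConfig are methods of the module:

```perl
use strict;
use warnings;

# Load the precompiled configuration when it exists and is readable;
# otherwise fall back to the HRC files. Helper name and paths are
# illustrative only.
sub configure_highlighter {
    my ($highlighter, $precompiled, @hrc_files) = @_;
    if (-r $precompiled) {
        # Must be the only configuration call: addConfig cannot be mixed in.
        $highlighter->setPrecompiledConfig($precompiled);
        return 'precompiled';
    }
    $highlighter->addConfig(@hrc_files);
    return 'hrc';
}

# Usage, guarded so the sketch still runs where the module is not installed.
if (eval { require Syntax::Highlight::Universal; 1 }) {
    my $highlighter = Syntax::Highlight::Universal->new;
    configure_highlighter($highlighter, 'precompiled.hrcc', 'hrc/proto.hrc');
}
```

The existence check does not detect a stale or foreign-platform file; rebuilding with precompile after HRC changes or a server move remains a separate step.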
$highlighter->setCacheDir(DIR);
This method enables caching of the results and defines the cache directory. A text then goes through the complete processing only if there is no file for it in the cache directory. Syntax highlighting takes time, so caching is generally a good idea. However, it won't be of much use if the texts processed are always different. Another problem is cleaning the cache directory: cache files are never removed, so this has to be done separately, e.g. with a cron script that empties the cache directory every two days.
Caching only works if the default callback functions are used.
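The cleanup job mentioned above could look roughly like the following. This is a sketch under assumptions: clean_cache is a hypothetical helper, the directory is the one from the setCacheDir example, and files are selected by modification age rather than simply emptying the directory.

```perl
use strict;
use warnings;
use File::Find;

# Delete regular files under $dir that were last modified more than
# $max_age_days ago. Returns the number of files removed.
sub clean_cache {
    my ($dir, $max_age_days) = @_;
    my $removed = 0;
    find({
        no_chdir => 1,
        wanted   => sub {
            my $path = $File::Find::name;
            return unless -f $path;
            # -M is the age in days, measured from script start time.
            return unless -M _ > $max_age_days;
            $removed++ if unlink $path;
        },
    }, $dir);
    return $removed;
}

# Typical call from a cron-driven script (hypothetical directory):
# clean_cache('/tmp/highlighter', 2);
```

Subdirectories are left in place, so the layout created by the module stays usable after cleaning.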
$highlighter->setCachePrefixLen(LENGTH);
This method defines how many characters should be used for subdirectories of the cache directory.
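How the module names its cache files is not described here. Purely as an illustration of what a prefix length usually means in such a layout, the sketch below derives a path from a key and uses the first characters of the key as a subdirectory. The key derivation with MD5 and the helper sharded_path are assumptions, not the module's documented behavior.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use File::Spec;

# Illustration only: use the first $prefix_len characters of $key as a
# subdirectory name and the remainder as the file name.
sub sharded_path {
    my ($cache_dir, $key, $prefix_len) = @_;
    return File::Spec->catfile($cache_dir, $key) unless $prefix_len;
    return File::Spec->catfile(
        $cache_dir,
        substr($key, 0, $prefix_len),
        substr($key, $prefix_len),
    );
}

my $key = md5_hex("print 'Hello, World!'");   # hypothetical cache key
print sharded_path('/tmp/highlighter', $key, 2), "\n";
```

Splitting by prefix keeps the number of entries per directory small, which matters on file systems that slow down with very large directories.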
All callback functions get a reference to the list of text lines as their first parameter. The other parameters differ:
initParsing and finalizeParsing are called before and after the parsing of the text, respectively. If initParsing returns a value, parsing will be aborted and highlight will return this value. This can be used for caching, to return cached results before parsing even starts. Otherwise parsing will proceed normally and the return value of finalizeParsing will be returned.
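The short-circuit described above can be sketched with a small in-memory cache. Several things here are assumptions: that an undefined return from initParsing counts as "no value", that all five callbacks need to be supplied, and that joining the lines gives a usable cache key. The result built in finalizeParsing is only a placeholder for whatever output the custom callbacks produce.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Encode qw(encode_utf8);

my %cache;   # in-memory store; a real setup would likely persist this

# Build a key from the list of text lines passed as first parameter.
sub cache_key {
    my ($lines) = @_;
    return md5_hex(encode_utf8(join "\n", @$lines));
}

my $callbacks = {
    # A defined return value makes highlight skip parsing and return it.
    initParsing => sub {
        my ($lines) = @_;
        return $cache{ cache_key($lines) };   # undef when not cached
    },
    addRegion   => sub { },
    enterScheme => sub { },
    leaveScheme => sub { },
    # Reached only when initParsing returned nothing.
    finalizeParsing => sub {
        my ($lines) = @_;
        my $result = scalar(@$lines) . " line(s) parsed";   # placeholder
        return $cache{ cache_key($lines) } = $result;
    },
};
```

The hash would be passed as the third argument to highlight, as in the synopsis.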
addRegion is called whenever a new region inside a line is identified.
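As a small example of a custom addRegion handler, the sketch below counts how often each region name occurs. The position of the region object in the argument list is not given here, so the handler deliberately avoids relying on it and takes the first blessed argument that has a name method; that lookup, and the choice to supply all five callbacks, are assumptions.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

my %region_count;

# First parameter is the reference to the list of text lines; the region
# object is located by duck typing rather than by a fixed position.
sub count_region {
    my ($lines, @rest) = @_;
    my ($region) = grep { blessed($_) && $_->can('name') } @rest;
    $region_count{ $region->name }++ if $region;
    return;
}

my $callbacks = {
    initParsing     => sub { %region_count = (); return },   # no value: keep parsing
    addRegion       => \&count_region,
    enterScheme     => sub { },
    leaveScheme     => sub { },
    finalizeParsing => sub { return { %region_count } },     # copy of the counts
};
```

With these callbacks, highlight would return a hash reference mapping region names to counts instead of (X)HTML.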
enterScheme and leaveScheme are called whenever the start or end of a scheme is found. The parameters are all the same as for addRegion, except:
Colorer defines a large set of regions that are organized hierarchically. Each region represents text elements of a certain type. The region object has the following methods:
Schemes in Colorer describe general context changes. For example, the scheme will change when parsing an interpolated string constant. The current scheme defines the regions that can be found, e.g. you can't have function calls inside a string scheme. Unlike regions, schemes can span multiple lines. The current Colorer version defines only one method for the scheme object:
The default output uses the name of a region and of all its parents as the class name for a block of text. In most cases this allows adding styles only for the generally defined regions while still being able to take language-specific features into account. However, it considerably increases the size of the output.
Solution 1: Use server-side compression, e.g. mod_gzip. The size difference in compressed output is negligible.
Solution 2: You can replace the function used for creating class names to include only one region name of the def:* scheme.
*Syntax::Highlight::Universal::_createClassName = sub {
    my $region = shift;

    # Walk up the hierarchy until a region of the def:* scheme is found.
    while (defined $region && $region->name !~ /^def:/) {
        $region = $region->parent;
    }
    my $class = defined $region ? $region->name : 'unknown';
    $class =~ s/\W/_/g;
    return $class;
};
Note: this approach is not recommended and might stop working in future versions of the module.
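To see what such a replacement function produces, it can be exercised against stand-in objects without loading the module. In the sketch below, FakeRegion and the region names are made up for illustration; only the name and parent methods are taken from the function above, whose logic is repeated as a named sub.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a region object: only name and parent.
package FakeRegion;
sub new {
    my ($class, $name, $parent) = @_;
    return bless { name => $name, parent => $parent }, $class;
}
sub name   { $_[0]{name} }
sub parent { $_[0]{parent} }

package main;

# Same logic as the replacement function above, as a named sub.
sub def_class_name {
    my $region = shift;
    while (defined $region && $region->name !~ /^def:/) {
        $region = $region->parent;
    }
    my $class = defined $region ? $region->name : 'unknown';
    $class =~ s/\W/_/g;
    return $class;
}

my $keyword = FakeRegion->new('def:Keyword');
my $perl_kw = FakeRegion->new('perl:SomeKeyword', $keyword);   # made-up names
print def_class_name($perl_kw), "\n";                          # def_Keyword
print def_class_name(FakeRegion->new('perl:Orphan')), "\n";    # unknown
```

A stylesheet written for this variant would then only need rules for class names such as def_Keyword.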
Colorer was originally meant for desktop applications, where one second to load the configuration files doesn't matter. Unfortunately, it matters a lot for web applications. Furthermore, parsing the text itself also takes some time, though much less than processing the HRC configuration.
Solution 1: If you often have to highlight the same texts, you can use caching. Set up a cache directory where the module can store processed text. The next time the same text needs to be highlighted, the result will be taken from the cache instead of parsing the text all over again (and loading the necessary configuration files in the process).
Solution 2: This module implements a mechanism to store an already parsed Colorer configuration on disk and load it into memory again. The time requirement is 5-10 times less than for loading HRC configurations. See the description of the methods precompile and setPrecompiledConfig for more information on this feature. When installing the module, make test will automatically create a precompiled configuration file precompiled.hrcc (about 2 MB) that can be copied into the module's library directory (that's where the hrc directory is put when installing the module).
The source files belonging to the Colorer library are in the colorer directory of the distribution. It is a subset of files from Colorer-take5 Library beta3. All files are unchanged with the following exceptions:
USE_CHUNK_ALLOC, USE_DL_MALLOC, JAR_INPUT_SOURCE, HTTP_INPUT_SOURCE have been removed (the corresponding files haven't been included). Instead, a constant ALLOW_SERIALIZATION is defined.
MemoryChunks.h any more.
ALLOW_SERIALIZATION is defined. They are used to store additional information in the objects of class SchemeNode so that they can be written to disk and restored without being changed.
ALLOW_SERIALIZATION is defined, some fields are added to the class SchemeNode.
loadFileType() has been made virtual to allow overloading.
Node an enum (MS Visual C++ 6.0 has problems with the current definition).
If you plan to use this module with another Colorer version you should consider repeating these changes.
Furthermore, the HRC files from the Colorer library are included; they are in the directory Syntax/Highlight/Universal. Only the file hrc/inet/php-body.hrc has been added, and hrc/proto.hrc has been changed accordingly. The php-body format is meant for highlighting pure PHP code that isn't embedded in HTML.
The css directory contains color schemes that have been translated from Colorer's HRD files with the hrd2css script.
Wladimir Palant, <palant@cpan.org>
Copyright (C) 2005 by Wladimir Palant
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.1 or, at your option, any later version of Perl 5 you may have available.
Colorer is (C) by Igor Russkih. For information on the license see http://colorer.sf.net/.