YAML - YAML Ain't Markup Language
YAML - YAML Ain't Markup Language (tm)
    use YAML;

    # Load a YAML stream of 3 YAML documents into Perl data structures.
    my ($hashref, $arrayref, $string) = Load(<<'...');
    ---
    name: ingy
    age: old
    weight: heavy
    # I should comment that I also like pink, but don't tell anybody.
    favorite colors:
        - red
        - white
        - blue
    ---
    - Clark Evans
    - Oren Ben-Kiki
    - Brian Ingerson
    --- >
    You probably think YAML stands for "Yet Another Markup Language". It
    ain't! YAML is really a data serialization language. But if you want
    to think of it as a markup, that's OK with me. A lot of people try
    to use XML as a serialization format.

    "YAML" is catchy and fun to say. Try it. "YAML, YAML, YAML!!!"
    ...

    # Dump the Perl data structures back into YAML.
    print Dump($string, $arrayref, $hashref);

    # YAML::Dump is used the same way you'd use Data::Dumper::Dumper
    use Data::Dumper;
    print Dumper($string, $arrayref, $hashref);
The YAML.pm module implements a YAML Loader and Dumper based on the YAML 1.0 specification. http://www.yaml.org/spec/
YAML is a generic data serialization language that is optimized for human readability. It can be used to express the data structures of most modern programming languages. (Including Perl!!!)
For information on the YAML syntax, please refer to the YAML specification.
Modules like Data::Dumper need Perl's eval() built-in to deserialize the data. Somebody could add a snippet of Perl to erase your files. YAML's parser does not need to eval anything.
YAML.pm also has the ability to handle code (subroutine) references and typeglobs. (Still experimental) These features are not found in Perl's other serialization modules.
The following functions are exported by YAML.pm by default when you use YAML.pm like this:
use YAML;
To prevent YAML.pm from exporting functions, say:
use YAML ();
Dump(list-of-Perl-data-structures)
Load(string-containing-a-YAML-stream)
This is comparable to Storable's thaw() function, or to the eval() function in relation to Data::Dumper. It parses a string containing a valid YAML stream into a list of Perl data structures.
Store()
Store() is deprecated. The reason for this deprecation is that the YAML spec talks about programs called Loaders and Dumpers. "Storers" is too hard to say, I guess...
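As a rough illustration of how Dump and Load, described above, fit together, here is a minimal round-trip sketch. The sample data and variable names are made up for the example, and it assumes the YAML module is installed.

```perl
use strict;
use warnings;
use YAML;    # imports Dump and Load

# Hypothetical sample data: a hash reference and an array reference.
my $config  = { name => 'ingy', colors => [ 'red', 'white', 'blue' ] };
my $authors = [ 'Clark Evans', 'Oren Ben-Kiki', 'Brian Ingerson' ];

# Dump takes a list of structures and returns a single YAML stream,
# one document per structure.
my $stream = Dump($config, $authors);

# Load parses the stream back into a list, one item per document.
my ($config_copy, $authors_copy) = Load($stream);

print "name: $config_copy->{name}\n";
print "third author: $authors_copy->[2]\n";
```

Calling Load in list context, as here, keeps every document in the stream.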
LoadFile(filepath)
The optional second argument to Bless may be a class that supports a yaml_dump() function. A yaml_dump() function should take a Perl node and return a yaml node. If no second argument is provided, Bless will create a yaml node. This node is not returned, but can be retrieved with the Blessed() function.
Here's an example of how to use Bless. Say you have a hash containing three keys, but you only want to dump two of them. Furthermore, the keys must be dumped in a certain order. Here's how you do that:
    use YAML qw(Dump Bless);
    $hash = {apple => 'good', banana => 'bad', cauliflower => 'ugly'};
    print Dump $hash;
    Bless($hash)->keys(['banana', 'apple']);
    print Dump $hash;
produces:
    --- #YAML:1.0
    apple: good
    banana: bad
    cauliflower: ugly
    --- #YAML:1.0
    banana: bad
    apple: good
Bless returns the tied part of a yaml-node, so that you can call the YAML::Node methods. This is the same thing that YAML::Node::ynode() returns. So another way to do the above example is:
    use YAML qw(:all);
    use YAML::Node;
    $hash = {apple => 'good', banana => 'bad', cauliflower => 'ugly'};
    print Dump $hash;
    Bless($hash);
    $ynode = ynode(Blessed($hash));
    $ynode->keys(['banana', 'apple']);
    print Dump $hash;
Blessed(perl-node)
Dumper()
freeze() and thaw()
Aliases to Dump() and Load(). For Storable fans.
This will also allow YAML.pm to be plugged directly into modules like POE.pm, that use the freeze/thaw API for internal serialization.
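The freeze/thaw interface mentioned above can be sketched as follows. This assumes the YAML module is installed and that freeze and thaw are imported by name; the session data is hypothetical.

```perl
use strict;
use warnings;
use YAML qw(freeze thaw);    # import the Storable-style names explicitly

my $session = { user => 'ingy', visits => 3 };    # hypothetical data

my $frozen   = freeze($session);    # serializes to a YAML string
my ($thawed) = thaw($frozen);       # parses it back, in list context

print "visits: $thawed->{visits}\n";
```

Because the names match Storable's, code written against that API should need little more than a changed import line, though the serialized form is YAML text rather than a binary image.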
This is a list of the various groups of exported functions that you can import using the following syntax:
use YAML ':groupname';
Bless() and Blessed().
freeze() and thaw().
freeze() and thaw().
YAML can also be used in an object-oriented manner. At this point it offers no real advantage. This interface will be improved in a later release.
new()
    my $y = YAML->new;
    $y->Indent(4);
    $y->dump($foo, $bar);
dump()
load()
YAML options are set using a group of global variables in the YAML namespace. This is similar to how Data::Dumper works.
For example, to change the indentation width, do something like:
local $YAML::Indent = 3;
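A slightly fuller sketch of the same idea, assuming the YAML module is installed: the option is changed with local inside a block, so the old value comes back automatically when the block ends. The data and the width of 6 are arbitrary choices for the example.

```perl
use strict;
use warnings;
use YAML;

my $data = { fruit => { apple => 'good', banana => 'bad' } };    # hypothetical

my $narrow = Dump($data);    # dumped with whatever indentation is in effect
my $wide;
{
    # local restores the previous value when this block exits,
    # so the setting does not leak into unrelated code.
    local $YAML::Indent = 6;
    $wide = Dump($data);
}

print $narrow, $wide;
```

Both strings describe the same data; only the whitespace in front of the nested keys differs.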
The current options are:
By the way, YAML can use any number of characters for indentation at any level. So if you are editing YAML by hand, feel free to do it any way that looks pleasing to you; just be consistent for a given level.
This tells YAML.pm whether to use a separator string for a Dump operation. This only applies to the first document in a stream. Subsequent documents must have a YAML header by definition.
Tells YAML.pm whether to include the YAML version on the separator/header.
The canonical form is:
--- YAML:1.0
Tells YAML.pm whether or not to sort hash keys when storing a document.
YAML::Node objects can have their own sort order, which is usually what you want. To override the YAML::Node order and sort the keys anyway, set SortKeys to 2.
Anchor names are normally numeric. YAML.pm simply starts with '1' and increases by one for each new anchor. This option allows you to specify a string to be prepended to each anchor number.
Using eval() to parse untrusted code is, well, untrustworthy. Safe deserialization is one of the core goals of YAML.
DumpCode can also be set to a subroutine reference so that you can write your own serializing routine. YAML.pm passes you the code ref. You pass back the serialization (as a string) and a format indicator. The format indicator is a simple string like: 'deparse' or 'bytecode'.
With LoadCode enabled, code references are deserialized using eval(). Since this is potentially risky, only use this option if you know where your YAML has been.
LoadCode can also be set to a subroutine reference so that you can write your own deserializing routine. YAML.pm passes the serialization (as a string) and a format indicator. You pass back the code reference.
NOTE: YAML's block style is akin to Perl's here-document.
NOTE: YAML's folded style is akin to the way HTML folds text, except smarter.
Sometimes, when you KNOW that your data is nonrecursive in nature, you may want to serialize such that every node is expressed in full (i.e. as a copy of the original). Setting $YAML::UseAliases to 0 will allow you to do this. This also may result in faster processing because the lookup overhead is bypassed.
THIS OPTION CAN BE DANGEROUS. *If* your data is recursive, this option *will* cause Dump() to run in an endless loop, chewing up your computer's memory. You have been warned.
Compresses the formatting of arrays of hashes:
    -
      foo: bar
    -
      bar: foo
becomes:
    - foo: bar
    - bar: foo
Since this output is usually more desirable, this option is turned on by default.
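To make the aliasing behind the UseAliases option more concrete, here is a small sketch that leaves the option at its default. It assumes the YAML module is installed; the data is invented. One array is reachable through two hash keys, and the check at the end reports whether that sharing survives a round trip.

```perl
use strict;
use warnings;
use YAML;

my $shared = [ 'red', 'white', 'blue' ];                  # hypothetical data
my $data   = { first => $shared, second => $shared };     # same array twice

my $yaml   = Dump($data);    # UseAliases is left untouched here
my ($copy) = Load($yaml);

# Comparing references numerically compares their addresses.
my $still_shared = $copy->{first} == $copy->{second};

print $yaml;
print $still_shared ? "one shared array\n" : "two separate copies\n";
```

With aliases turned off, the array would instead be written out in full under each key, which is only safe when the data contains no cycles.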
YAML is a full featured data serialization language, and thus has its own terminology.
It is important to remember that although YAML is heavily influenced by Perl and Python, it is a language in its own right, not merely a representation of Perl structures.
YAML has three constructs that are conspicuously similar to Perl's hash, array, and scalar. They are called mapping, sequence, and string respectively. By default, they do what you would expect. But each instance may have an explicit or implicit type that makes it behave differently. In this manner, YAML can be extended to represent Perl's Glob or Python's tuple, or Ruby's Bigint.
    ---
    a: mapping
    foo: bar
    ---
    - a
    - sequence

    --- YAML:1.0
    This: top level mapping
    is:
        - a
        - YAML
        - document

    - !perl/Foo::Bar
        foo: 42
        bar: stool

    a mapping:
        foo: bar
        two: times two is 4

    a sequence:
        - one bourbon
        - one scotch
        - one beer
a scalar key: a scalar value
YAML has many styles for representing scalars. This is important because varying data will have varying formatting requirements to retain the optimum human readability.
    - a simple string
    - -42
    - 3.1415
    - 12:34
    - 123 this is an error

    - 'When I say ''\n'' I mean "backslash en"'

    - "This scalar\nhas two lines, and a bell -->\a"

    - >
     This is a multiline scalar which begins on the next line. It is
     indicated by a single caret. It is unescaped like the single
     quoted scalar. Line folding is also performed.

    - |
     QTY  DESC          PRICE  TOTAL
     ---  ----          -----  -----
       1  Foo Fighters  $19.95 $19.95
       2  Bar Belles    $29.95 $59.90
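The scalar styles above all load into ordinary Perl strings; they differ only in how escapes and line breaks are treated. The following sketch, which assumes the YAML module is installed and uses made-up key names, loads one value in each of four styles so the results can be compared.

```perl
use strict;
use warnings;
use YAML;

my ($doc) = Load(<<'...');
---
plain: a simple string
single: 'When I say ''\n'' I mean "backslash en"'
double: "two\nlines"
block: |
  QTY  DESC
    1  Foo Fighters
...

# Single quotes keep the backslash literally; '' collapses to one quote.
print "single: $doc->{single}\n";

# Double quotes interpret \n as a real newline.
my @lines = split /\n/, $doc->{double};
print "double has ", scalar(@lines), " lines\n";

# Block style keeps line breaks and relative indentation as written.
print $doc->{block};
```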
A parser parses a YAML stream. YAML.pm's Load() function contains a parser.
The other part of YAML.pm's Load() function is a loader. This takes the information from the parser and loads it into a Perl data structure.
YAML.pm's Dump() function consists of a dumper and an emitter. The dumper walks through each Perl data structure and gives info to the emitter.
NOTE: In YAML.pm the parser/loader and the dumper/emitter code are currently very closely tied together. When libyaml is written (in C) there will be a definite separation. libyaml will contain a parser and emitter, and YAML.pm (and YAML.py etc) will supply the loader and dumper.
For more information please refer to the immensely helpful YAML specification available at http://www.yaml.org/spec/.
The YAML distribution ships with a script called 'ysh', the YAML shell. ysh provides a simple, interactive way to play with YAML. If you type in Perl code, it displays the result in YAML. If you type in YAML, it turns it into Perl code.
To run ysh (assuming you installed it along with YAML.pm), simply type:
ysh [options]
Please read the ysh documentation for the full details. There are lots of options.
If you find a bug in YAML, please try to recreate it in the YAML Shell with logging turned on ('ysh -L'). When you have successfully reproduced the bug, please mail the LOG file to the author (ingy@cpan.org).
WARNING: This is *ALPHA* code.
BIGGER WARNING: This is *TRIAL1* of the YAML 1.0 specification. The YAML syntax may change before it is finalized. Based on past experience, it probably will change. The authors of this spec have worked for over a year putting together YAML 1.0, and we have flipped it on its syntactical head almost every week. We're a fickle lot, we are. So use this at your own risk!!!
$foo = \$foo;
This serializes fine, but I can't parse it correctly yet. Unfortunately, every wiseguy programmer in the world seems to try this first when you ask them to test your serialization module. Even though it is of almost no real world value. So please don't report this bug unless you have a pure Perl patch to fix it for me.
By the way, similar non-leaf structures Dump and Load just fine:
$foo->[0] = $foo;
You can test these examples using 'ysh -r'. This option makes sure that the example can be deserialized after it is serialized. We call that "roundtripping", thus the '-r'.
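The same kind of round-trip check can be written directly in Perl. This sketch covers only the non-leaf case ($foo->[0] = $foo) and assumes the YAML module is installed; it is an illustration, not a replacement for 'ysh -r'.

```perl
use strict;
use warnings;
use YAML;

my $foo = [];
$foo->[0] = $foo;    # the array's first element points back at the array

my $yaml   = Dump($foo);
my ($copy) = Load($yaml);

# After loading, the first element should again be the array itself.
my $is_recursive = ref $copy->[0] && $copy->[0] == $copy;

print $yaml;
print $is_recursive ? "recursion preserved\n" : "recursion lost\n";
```

Note that a self-referential structure like this is never freed by Perl's reference counting unless the cycle is broken by hand.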
YAML.pm can currently parse structured keys, but their meaning gets lost when they are loaded into a Perl hash. Consider this example using the YAML Shell:
    ysh > ---
    yaml> ?
    yaml>  foo: bar
    yaml> : baz
    yaml> ...
    $VAR1 = {
              'HASH(0x1f1d20)' => 'baz'
            };
    ysh >
YAML.pm will need to be fixed to preserve these keys somehow. Why? Because if YAML.pm gets a YAML document from YAML.py it must be able to return it with the Python data intact.
NOTE: For a (huge) dump of Perl's global guts, try:
perl -MYAML -e '$YAML::UseCode=1; print Dump \%main::'
To limit this to a single namespace try:
perl -MCGI -MYAML -e '$YAML::UseCode=1; print Dump \%CGI::'
Neil Watkiss and Clark Evans are currently developing libyaml, the official C implementation of the YAML parser and emitter. YAML.pm will be refactored to use this library once it is stable. Other languages like Python, Tcl, PHP, Ruby, JavaScript and Java can make use of the same core library.
Please join us on the YAML mailing list if you are interested in implementing something.
An upcoming release will have support for incremental parsing. Incremental dumping is harder. Stay tuned.
Please read the YAML::Node manpage for advanced YAML features.
http://www.yaml.org is the official YAML website.
http://www.yaml.org/spec/ is the YAML 1.0 specification.
http://wiki.yaml.org/spec/ is the official YAML wiki.
YAML has been registered as a SourceForge project (http://www.sourceforge.net). Currently we are only using the mailing list facilities there.
This is the first implementation of YAML functionality based on the 1.0 specification.
The following people have shown an interest in doing implementations. Please contact them if you are also interested in writing an implementation.
    ---
    - name: Neil Watkiss
      project:
        - libyaml
        - YAML mode for the vim editor
      email: nwatkiss@ttul.org

    - name: Brian Ingerson
      project: YAML.pm, libyaml Perl binding
      email: ingy@ttul.org

    - name: Clark Evans
      project: libyaml, Python binding
      email: cce@clarkevans.com

    - name: Oren Ben-Kiki
      project: Java Loader/Dumper
      email: orenbk@richfx.com

    - name: Paul Prescod
      project: YAML Antagonist/Anarchist
      email: paul@prescod.net

    - name: Ryan King
      project: YAML test specialist
      email: rking@panoptic.com

    - name: Steve Howell
      project: Python and Ruby implementations
      email: showell@zipcon.net

    - name: Patrick Leboutillier
      project: Java Loader/Dumper
      email: patrick_leboutillier@hotmail.com

    - name: Shane Caraveo
      project: PHP Loader/Dumper
      email: shanec@activestate.com

    - name: Brian Quinlan
      project: Python Loader/Dumper
      email: brian@sweetapp.com

    - name: Jeff Hobbs
      project: Tcl Loader/Dumper
      email: jeff@hobbs.org

    - name: Claes Jacobsson
      project: JavaScript Loader/Dumper
      email: claes@contiller.se
Brian Ingerson <INGY@cpan.org> is responsible for YAML.pm.
The YAML language is the result of a ton of collaboration between Oren Ben-Kiki, Clark Evans and Brian Ingerson. Several others have added help along the way.
Neil Watkiss is pioneering libyaml. Bless that boy!
Ryan King offered much help on the 0.35 release. The XP advocate extraordinaire helped me refactor my entire test suite into its current form. Regression tests are extremely important to the success of this project.
Copyright (c) 2001, 2002. Brian Ingerson. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
See http://www.perl.com/perl/misc/Artistic.html