SQL::Statement::Syntax - documentation of SQL::Statement's SQL Syntax



NAME

SQL::Statement::Syntax - documentation of SQL::Statement's SQL Syntax


SYNOPSIS

See the SQL::Statement manpage for usage.


DESCRIPTION

The SQL::Statement module can be used either from a DBI drive like DBD::CSV or directly. The syntax below applies to both situations. In the case of DBDs, each DBD can implement its own sub-dialect so be sure to check the DBD documentations also.

SQL::Statement is meant primarly as a base class for DBD drivers and as such concentrates on a small but useful subset of SQL. It does *not* in any way pretend to be a complete SQL parser for all dialects of SQL. The module will continue to add new supported syntax, and users may also extend the syntax (see #Extending the SQL syntax).


USAGE

Default Supported SQL syntax - Summary

SQL Statements

   CALL <function>
   CREATE [TEMP] TABLE <table> <column_def_clause>
   CREATE [TEMP] TABLE <table> AS <select statement>
   CREATE [TEMP] TABLE <table> AS IMPORT()
   CREATE FUNCTION <user_defined_function> [ NAME <perl_subroutine> ]
   CREATE KEYWORD  <user_defined_keyword>  [ NAME <perl_subroutine> ]
   CREATE OPERATOR <user_defined_operator> [ NAME <perl_subroutine> ]
   CREATE TYPE     <user_defined_type>     [ NAME <perl_subroutine> ]
   DELETE FROM <table> [<where_clause>]
   DROP TABLE [IF EXISTS] <table>
   DROP FUNCTION <function>
   DROP KEYWORD  <keyword>
   DROP OPERATOR <operator>
   DROP TYPE     <type>
   INSERT [INTO] <table> [<column_list>] VALUES <value_list>
   LOAD <user_defined_functions_module>
   SELECT <function>
   SELECT <select_clause>
          <from_clause>
          [<where_clause>] 
          [ ORDER BY ocol1 [ASC|DESC], ... ocolN [ASC|DESC]] ]
          [ GROUP BY gcol1 [, ... gcolN] ]
          [ LIMIT [start,] length ]
   UPDATE <table> SET <set_clause> [<where_clause>]

Explict Join Qualifiers

   NATURAL, INNER, OUTER, LEFT, RIGHT, FULL

Built-in Functions

   * Aggregate : MIN, MAX, AVG, SUM, COUNT
   * Date/Time : CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP
   * String    : CHAR_LENGTH, CONCAT, COALESCE, DECODE, LOWER, POSITION,
                 REGEX, REPLACE, SOUNDEX, SUBSTRING, TRIM, UPPER

Special Utility Functions

  * IMPORT  - imports a table from an external RDBMS or perl structure
  * RUN     - prepares & executes statements in a file of SQL statements

Operators and Predicates

   = , <> , < , > , <= , >= , IS [NOT] NULL , LIKE , CLIKE , IN , BETWEEN

Identifiers and Aliases

   * regular identifiers are case insensitive (though see note on table names)
   * delimited identifiers (inside double quotes) are case sensitive
   * column and table aliases are supported

Concatenation

   * use either ANSI SQL || or the CONCAT() function
   * e.g. these are the same:  {foo || bar} {CONCAT(foo,bar)}

Comments

   * comments must occur before or after statements, can't be embedded
   * SQL-style single line -- and C-style multi-line /* */ comments are supported

NULLs

   * currently NULLs and empty strings are identical, but this will change
   * use {col IS NULL} to find NULLs, not {col=''} (though both currently work)

See below for further details.

Syntax - Details

CREATE TABLE

Creates permanenet and in-memory tables.

 CREATE [TEMP] TABLE <table_name> ( <column_definitions> )
 CREATE [TEMP] TABLE <table_name> AS <select statement>
 CREATE [TEMP] TABLE <table_name> AS IMPORT()

Column definitions are standard SQL column names, types, and constraints, see Column Definitions.

  # create a permanent table
  #
  $dbh->do("CREATE TABLE qux (id INT PRIMARY KEY,word VARCHAR(30))");

The ``AS SELECT'' clause creates and populates the new table using the data and column structure specified in the select statement.

  # create and populate a table from a query to two other tables
  #
  $dbh->do("CREATE TABLE qux AS SELECT id,word FROM foo NATURAL JOIN bar");

If the optional keyword TEMP (or its synonym TEMPORARY) is used, the table will be an in-memory table, available for the life of the current database handle or until a DROP TABLE command is issued.

  # create a temporary table
  #
  $dbh->do("CREATE TEMP TABLE qux (id INT PRIMARY KEY,word VARCHAR(30))");

TEMP tables can be modified with SQL commands but the updates are not automatically reflected back to any permanent tables they may be associated with. To save a TEMP table - just use an AS SELECT clause:

 $dbh = DBI->connect( 'dbi:CSV:' );
 $dbh->do("CREATE TEMP TABLE qux_temp AS (id INT, word VARCHAR(30))");
 #
 # ... modify qux_temp with INSERT, UPDATE, DELETE statements, then save it
 #
 $dbh->do("CREATE TABLE qux_permanent AS SELECT * FROM qux_temp");

Tables, both temporary and permanent may also be created directly from perl arrayrefs and from heterogeneous queries to any DBI accessible data source, see the IMPORT() function.

 CREATE [ {LOCAL|GLOBAL} TEMPORARY ] TABLE $table
        (
           $col_1 $col_type1 $col_constraints1,
           ...,
           $col_N $col_typeN $col_constraintsN,
        )
        [ ON COMMIT {DELETE|PRESERVE} ROWS ]
     * col_type must be a valid data type as defined in the
       "valid_data_types" section of the dialect file for the
       current dialect
     * col_constriaints may be "PRIMARY KEY" or one or both of
       "UNIQUE" and/or "NOT NULL"
     * IMPORTANT NOTE: temporary tables, data types and column
       constraints are checked for syntax violations but are
       currently otherwise *IGNORED* -- they are recognized by
       the parser, but not by the execution engine
     * The following valid ANSI SQL92 options are not currently
       supported: table constraints, named constraints, check
       constriants, reference constraints, constraint
       attributes, collations, default clauses, domain names as
       data types

DROP TABLE

 DROP TABLE $table [ RESTRICT | CASCADE ]
     * IMPORTANT NOTE: drop behavior (cascade or restrict) is
       checked for valid syntax but is otherwise *IGNORED* -- it
       is recognized by the parser, but not by the execution
       engine

INSERT INTO

 INSERT INTO $table [ ( $col1, ..., $colN ) ] VALUES ( $val1, ... $valN )
     * default values are not currently supported
     * inserting from a subquery is not currently supported

DELETE FROM

 DELETE FROM $table [ WHERE search_condition ]
     * see "search_condition" below

UPDATE

 UPDATE $table SET $col1 = $val1, ... $colN = $valN [ WHERE search_condition ]
     * default values are not currently supported
     * see "search_condition" below

SELECT

      SELECT select_clause
        FROM from_clause
     [ WHERE search_condition ]
  [ ORDER BY $ocol1 [ASC|DESC], ... $ocolN [ASC|DESC] ]
     [ LIMIT [start,] length ]
      * select clause ::=
              [DISTINCT|ALL] *
           | [DISTINCT|ALL] col1 [,col2, ... colN]
           | set_function1 [,set_function2, ... set_functionN]
      * set function ::=
             COUNT ( [DISTINCT|ALL] * )
           | COUNT | MIN | MAX | AVG | SUM ( [DISTINCT|ALL] col_name )
      * from clause ::=
             table1 [, table2, ... tableN]
           | table1 NATURAL [join_type] JOIN table2
           | table1 [join_type] table2 USING (col1,col2, ... colN)
           | table1 [join_type] JOIN table2 ON table1.colA = table2.colB
      * join type ::=
             INNER
           | [OUTER] LEFT | RIGHT | FULL
      * if join_type is not specified, INNER is the default
      * if DISTINCT or ALL is not specified, ALL is the default
      * if start position is omitted from LIMIT clause, position 0 is
        the default
      * ON clauses may only contain equal comparisons and AND combiners
      * self-joins are not currently supported
      * if implicit joins are used, the WHERE clause must contain
        and equijoin condition for each table

SEARCH CONDITION

       [NOT] $val1 $op1 $val1 [ ... AND|OR $valN $opN $valN ]

OPERATORS

       $op  = |  <> |  < | > | <= | >=
              | IS NULL | IS NOT NULL | LIKE | CLIKE | BETWEEN | IN
  The "CLIKE" operator works exactly the same as the "LIKE"
  operator, but is case insensitive.  For example:
      WHERE foo LIKE 'bar%'   # succeeds if foo is "barbaz"
                              # fails if foo is "BARBAZ" or "Barbaz"
      WHERE foo CLIKE 'bar%'  # succeeds for "barbaz", "Barbaz", and "BARBAZ"

BUILT-IN AND USER-DEFINED FUNCTIONS

There are many built-in functions and you can also create your own new functions from perl subroutines. See the SQL::Statement::Functions manpage for documentation of functions.

Identifiers (table & column names)

Regular identifiers (table and column names *without* quotes around them) are case INSENSITIVE so column foo, fOo, FOO all refer to the same column.

Delimited identifiers (table and column names *with* quotes around them) are case SENSITIVE so column ``foo'', ``fOo'', ``FOO'' each refer to different columns.

A delimited identifier is *never* equal to a regular identifer (so ``foo'' and foo are two different columns). But don't do that :-).

Remember thought that, in DBD::CSV if table names are used directly as file names, the case sensitivity depends on the OS e.g. on Windows files named foo, FOO, and fOo are the same as each other while on Unix they are different.

Special Utility SQL Functions

IMPORT()

Imports the data and structure of a table from an external data source into a permanent or temporary table.

 $dbh->do("CREATE TABLE qux AS IMPORT(?)",{},$oracle_sth);
 $dbh->do("CREATE TABLE qux AS IMPORT(?)",{},$AoA);
 $dbh->do("CREATE TABLE qux AS IMPORT(?)",{},$AoH);

IMPORT() can also be used anywhere that table_names can:

 $sth=$dbh->prepare("
    SELECT * FROM IMPORT(?) AS T1 NATURAL JOIN IMPORT(?) AS T2 WHERE T1.id ...
 ");
 $sth->execute( $pg_sth, $mysql_sth );

The IMPORT() function imports the data and structure of a table from an external data source. The IMPORT() function is always used with a placeholder parameter which may be 1) a prepared and executed statement handle for any DBI accessible data source; or 2) an AoA whose first row is column names and whose succeeding rows are data 3) an AoH.

The IMPORT() function may be used in the AS clause of a CREATE statement, and in the FROM clause of any statement. When used in a FROM clause, it should be used with a column alias e.g. SELECT * FROM IMPORT(?) AS TableA WHERE ...

You can also write your own IMPORT() functions to treat anything as a data source. See User-Defined Function in the SQL::Statement::Functions manpage.

Examples:

 # create a CSV file from an Oracle query
 #
 $dbh = DBI->connect('dbi:CSV:');
 $oracle_sth = $oracle_dbh->prepare($any_oracle_query);
 $oracle_sth->execute(@params);
 $dbh->do("CREATE TABLE qux AS IMPORT(?)",{},$oracle_sth);
 # create an in-memory table from an AoA
 #
 $dbh      = DBI->connect( 'dbi:File:' );
 $arrayref = [['id','word'],[1,'foo'],[2,'bar'],];
 $dbh->do("CREATE TEMP TABLE qux AS IMPORT(?)",{},$arrayref);
 # query a join of a PostgreSQL table and a MySQL table
 #
 $dbh        = DBI->connect( 'dbi:File:' );
 $pg_dbh     = DBI->connect( ... DBD::pg connect params );
 $mysql_dbh  = DBI->connect( ... DBD::mysql connect params );
 $pg_sth     = $pg_dbh->prepare( ... any pg query );
 $pg_sth     = $pg_dbh->prepare( ... any mysql query );
 #
 $sth=$dbh->prepare("
    SELECT * FROM IMPORT(?) AS T1 NATURAL JOIN IMPORT(?) AS T2
 ");
 $sth->execute( $pg_sth, $mysql_sth );

RUN()

Run SQL statements from a user supplied file. <b>Please Note:</b> this function is experimental, please let me know if you have problems.

 RUN( sql_file )

If the file contains non-SELECT statements such as CREATE and INSERT, use the RUN() function with $dbh->do(). For example, this prepares and executes all of the SQL statements in a file called ``populate.sql'':

 $dbh->do(" CALL RUN( 'populate.sql') ");

If the file contains SELECT statements, the RUN() function may be used anywhere a table name may be used, for example, if you have a file called ``query.sql'' containing ``SELECT * FROM Employee'', then these two lines are exactly the same:

 my $sth = $dbh->prepare(" SELECT * FROM Employee ");
 my $sth = $dbh->prepare(" SELECT * FROM RUN( 'query.sql' ) ");

If the file contains a statement with placeholders, the values for the placehoders can be passed in the call to $sth->execute() as normal. If the query.sql file contains ``SELECT id,name FROM x WHERE id=?'', then these two are the same:

 my $sth = $dbh->prepare(" SELECT id,name FROM x WHERE id=?");
 $sth->execute(64);
 my $sth = $dbh->prepare(" SELECT * FROM RUN( 'query.sql' ) ");
 $sth->execute(64);

Note This function assumes that the SQL statements in the file are separated by a semi-colon+newline combination (/;\n/). If you wish to use different separators or import SQL from a different source, just over-ride the RUN() function with your own user-defined-function.

Further details

Integers
Reals
Syntax obvious

Strings
Surrounded by either single or double quotes; some characters need to be escaped with a backslash, in particular the backslash itself (\\), the NUL byte (\0), Line feeds (\n), Carriage return (\r), and the quotes (\' or \``).

Parameters
Parameters represent scalar values, like Integers, Reals and Strings do. However, their values are read inside Execute() and not inside Prepare(). Parameters are represented by question marks (?).

Identifiers
Identifiers are table or column names. Syntactically they consist of alphabetic characters, followed by an arbitrary number of alphanumeric characters. Identifiers like SELECT, INSERT, INTO, ORDER, BY, WHERE, ... are forbidden and reserved for other tokens. Identifiers are always compared case-insenitively, i.e. ``select foo from bar'' will be evaluated the same as ``SELECT FOO FROM BAR''. One exception is that if the module is used in conjunction with a file storage system, the names of tables are case sensitive.


Extending SQL syntax using SQL

The Supported SQL syntax shown above is the default for SQL::Statement but it can be extended (or contracted) either on-the-fly or on a permanent basis. In other words, you can modify the SQL syntax accepted as valid by the parser and accepted as executable by the executer. There are two methods for extending the syntax - 1) with SQL commands that can be issued directly in SQL::Statement or form a DBD or 2) by subclassing SQL::Parser.

The following SQL commands modify the default SQL syntax:

  CREATE/DROP FUNCTION
  CREATE/DROP KEYWORD
  CREATE/DROP TYPE
  CREATE/DROP OPERATOR

A simple example would be a situation in which you have a table named 'TABLE'. Since table is an ANSI reserved key word, by default SQL::Statement will produce an error when you attempt to create or access it. You could put the table name inside double quotes since quoted identifiers can validly be reserved words, or you could rename the table. But if those aren't options, you would do this:

  DROP KEYWORD table

Once that statement is issued, the parser will no longer object to 'table' as a table name. Careful though, if you drop too many keywords you may confuse the parser, especially keywords like FROM and WHERE that are central to parsing the statement.

In the reverse situation, suppose you want to parse some SQL that defines a column as type BIG_BLOB. Since 'BIG_BLOB' isn't a recognized ANSI data type, an error will be produced by default. To make the parser treat it as a valid data type, you'd do this:

 CREATE TYPE big_blob

Keywords and types are case-insensitive, so it doesn't matter what case you define it or use it with.

Suppose you're working with some SQL that contains the cosh() function (an Oracle function for hyperbolic cosine, whatever that is :-). The cosh() function isn't currently implemented in SQL::Statement so the parser would die with an error. But you can easily trick the parser into accepting the function:

 CREATE FUNCTION cosh

Once the parser has read that CREATE FUNCTION statement, it will no longer object to the use of the cosh() function in SQL statements.

If your only interest is in parsing SQL statements, then 'CREATE FUNCTION cosh' is sufficient. But if you actually want to be able to use the cosh() function in executable statements, you need to supply a perl subroutine that performs the cosh() function:

  CREATE FUNCTION cosh AS perl_subroutine_name

The subroutine name can refer to a subroutine in your current script, or to a subroutine in any available package. See the SQL::Statement::Functions manpage for details of how to create and load functions.

Functions can be used as predicates in search clauses, for example:

 SELECT * FROM x WHERE c1=7 AND SOUNDEX(c3,'foo') AND c8='bar'

In the SQL above, the SOUNDEX() function full predicate - it plays the same role as ``c1=7'' or ``c8='bar'''.

Functions can also serve as predicate operators. An operator, unlike a full predicate, has something on the left and right sides. An equal sign is an operator, so is LIKE. If you really want to you can get the parser to not accept LIKE as an operator with

 DROP OPERATOR like

Or, you can invent your own operator. Suppose you have an operator ``REVERSE_OF'' that is true if the string on its left side when reversed is equal to the string on the right side:

  CREATE OPERATOR reverse_of
  SELECT * FROM x WHERE c1=7 AND c3 REVERSE_OF 'foo'

The operator could just as well have been written as a function:

  CREATE FUNCTION reverse_of
  SELECT * FROM x WHERE c1=7 AND REVERSE_OF(c3,'foo')

Like functions, if you want to actually execute a user-defined operator as didistinct form just parsing it, you need to assign the operator to a perl subroutine. This is done exactly like assignin functions:

  CREATE OPERATOR reverse_of AS perl_subroutine_name


Extending SQL syntax using subclasses

In addition to using the SQL shown above to modify the parser's behavior, you can also extend the SQL syntax by subclassing SQL::Parser. See the SQL::Parser manpage for details.


AUTHOR & COPYRIGHT

Copyright (c) 2005, Jeff Zucker <jzuckerATcpan.org>, all rights reserved.

This document may be freely modified and distributed under the same terms as Perl itself.

 SQL::Statement::Syntax - documentation of SQL::Statement's SQL Syntax