SQL::Parser -- validate and parse SQL strings
use SQL::Parser; # CREATE A PARSER OBJECT
my $parser = SQL::Parser->new();
$parser->feature( $class, $name, $value ); # SET OR FIND STATUS OF
my $has_feature = $parser->feature( $class, $name ); # A PARSER FEATURE
$parser->dialect( $dialect_name ); # SET OR FIND STATUS OF
my $current_dialect = $parser->dialect; # A PARSER DIALECT
SQL::Parser is part of the SQL::Statement distribution and, most
interaction with the parser should be done through SQL::Statement.
The methods shown above create and modify a parser object. To use the
parser object to parse SQL and to examine the resulting structure, you
should use SQL::Statement.
Important Note: Previously SQL::Parser had its own hash-based
interface for parsing, but that is now deprecated and will eventually
be phased out in favor of the object-oriented parsing interface of
SQL::Statement. If you are unable to transition some features to the
new interface or have concerns about the phase out, please contact me.
See The Parse Structure for details of the now-deprecated hash
method if you still need them.
Create a new parser object
use SQL::Parser;
my $parser = SQL::Parser->new();
The new() method creates a SQL::Parser object which can then be
used to parse and validate the syntax of SQL strings. It takes two
optional parameters - 1) the name of the SQL dialect that will define
the syntax rules for the parser and 2) a reference to a hash which can
contain additional attributes of the parser. If no dialect is specified,
'AnyData' is the default.
use SQL::Parser;
my $parser = SQL::Parser->new( $dialect_name, \%attrs );
The dialect_name parameter is a string containing any valid
dialect such as 'ANSI', 'AnyData', or 'CSV'. See the section on
the dialect() method below for details.
The attrs parameter is a reference to a hash that can
contain error settings for the PrintError and RaiseError
attributes.
An example:
use SQL::Parser;
my $parser = SQL::Parser->new('AnyData', {RaiseError=>1} );
This creates a new parser that uses the grammar rules
contained in the .../SQL/Dialects/AnyData.pm file and which
sets the RaiseError attribute to true.
$parser->dialect( $dialect_name ); # load a dialect configuration file
my $dialect = $parser->dialect; # get the name of the current dialect
For example:
$parser->dialect('AnyData'); # loads the AnyData config file
print $parser->dialect; # prints 'AnyData'
The $dialect_name parameter may be the name of any dialect
configuration file on your system. Use the
$parser->list('dialects') method to see a list of available
dialects. At a minimum it will include ``ANSI'', ``CSV'', and
``AnyData''. For backwards compatibility 'Ansi' is accepted as a
synonym for 'ANSI', otherwise the names are case sensitive.
Loading a new dialect configuration file erases all current
parser features and resets them to those defined in the
configuration file.
Features define the rules to be used by a specific parser
instance. They are divided into the following classes:
* valid_commands
* valid_options
* valid_comparison_operators
* valid_data_types
* reserved_words
Within each class a feature name is either enabled or
disabled. For example, under ``valid_data_types'' the name ``BLOB''
may be either disabled or enabled. If it is not enabled
(either by being specifically disabled, or simply by not being
specified at all) then any SQL string using ``BLOB'' as a data
type will throw a syntax error ``Invalid data type: 'BLOB'''.
The feature() method allows you to enable, disable, or check the
status of any feature.
$parser->feature( $class, $name, 1 ); # enable a feature
$parser->feature( $class, $name, 0 ); # disable a feature
my $feature = $parser->feature( $class, $name ); # return status of a feature
For example:
$parser->feature('reserved_words','FOO',1); # make 'FOO' a reserved word
$parser->feature('valid_data_types','BLOB',0); # disallow 'BLOB' as a
# data type
# determine if the LIKE
# operator is supported
my $LIKE = $parser->feature('valid_comparison_operators','LIKE');
See the section below on ``Backwards Compatibility'' for use of
the feature() method with SQL::Statement 0.1x style parameters.
The SQL::Statement distribution can be used to either just parse SQL
statements or to execute them against actual data. A broader set of
syntax is supported in the parser than in the executor. For example
the parser allows you to specify column constraints like PRIMARY KEY.
Currently, these are ignored by the execution engine. Likewise syntax
such as RESTRICT and CASCADE on DROP statements or LOCAL GLOBAL TEMPORARY
tables in CREATE are supported by the parser but ignored by the executor.
To see the list of Supported SQL syntax formerly kept in this pod, see
the SQL::Statement manpage.
In the event you need to either extend or modify SQL::Parser's
default behavior, the following methods may be overridden:
- $self->get_btwn($string)
-
Processes the BETWEEN...AND... predicates; default converts to
2 range predicates.
- $self->get_in($string)
-
Process the IN (...list...) predicates; default converts to
a series of OR'd '=' predicate, or AND'd '<>' predicates for
NOT IN.
- $self->transform_syntax($string)
-
Abstract method; default simply returns the original string.
Called after
repl_btwn() and repl_in(), but before any further
predicate processing is applied. Possible uses include converting
other predicate syntax not recognized by SQL::Parser into user-defined
functions.
This section outlines the now-deprecated hash interface to the
parsed structure. It is included for backwards compatibility only.
You should use the SQL::Statement object interface to the structure
instead. See the SQL::Statement manpage.
Parse Structures
Here are some further examples of the data structures returned
by the structure() method after a call to parse(). Only
specific details are shown for each SQL instance, not the entire
structure.
parse()
Once a SQL::Parser object has been created with the new()
method, the parse() method can be used to parse any number of
SQL strings. It takes a single required parameter -- a string
containing a SQL command. The SQL string may optionally be
terminated by a semicolon. The parse() method returns a true
value if the parse is successful and a false value if the parse
finds SQL syntax errors.
Examples:
1) my $success = $parser->parse('SELECT * FROM foo');
2) my $sql = 'SELECT * FROM foo';
my $success = $parser->parse( $sql );
3) my $success = $parser->parse(qq!
SELECT id,phrase
FROM foo
WHERE id < 7
AND phrase <> 'bar'
ORDER BY phrase;
!);
4) my $success = $parser->parse('SELECT * FRoOM foo ');
In examples #1,#2, and #3, the value of $success will be true
because the strings passed to the parse() method are valid SQL
strings.
In example #4, however, the value of $success will be false
because the string contains a SQL syntax error ('FRoOM' instead
of 'FROM').
In addition to checking the return value of parse() with a
variable like $success, you may use the PrintError and
RaiseError attributes as you would in a DBI script:
* If PrintError is true, then SQL syntax errors will be sent as
warnings to STDERR (i.e. to the screen or to a file if STDERR
has been redirected). This is set to true by default which
means that unless you specifically turn it off, all errors
will be reported.
* If RaiseError is true, then SQL syntax errors will cause the
script to die, (i.e. the script will terminate unless wrapped
in an eval). This is set to false by default which means
that unless you specifically turn it on, scripts will
continue to operate even if there are SQL syntax errors.
Basically, you should leave PrintError on or else you will not
be warned when an error occurs. If you are simply validating a
series of strings, you will want to leave RaiseError off so that
the script can check all strings regardless of whether some of
them contain SQL errors. However, if you are going to try to
execute the SQL or need to depend that it is correct, you should
set RaiseError on so that the program will only continue to
operate if all SQL strings use correct syntax.
IMPORTANT NOTE #1: The parse() method only checks syntax, it
does NOT verify if the objects listed actually exist. For
example, given the string ``SELECT model FROM cars'', the parse()
method will report that the string contains valid SQL but that
will not tell you whether there actually is a table called
``cars'' or whether that table contains a column called 'model'.
Those kinds of verifications are performed by the
SQL::Statement module, not by SQL::Parser by itself.
IMPORTANT NOTE #2: The parse() method uses rules as defined by
the selected dialect configuration file and the feature()
method. This means that a statement that is valid in one
dialect may not be valid in another. For example the 'CSV' and
'AnyData' dialects define 'BLOB' as a valid data type but the
'ANSI' dialect does not. Therefore the statement 'CREATE TABLE
foo (picture BLOB)' would be valid in the first two dialects but
would produce a syntax error in the 'ANSI' dialect.
structure()
After a SQL::Parser object has been created and the parse()
method used to parse a SQL string, the structure() method
returns the data structure of that string. This data structure
may be passed on to other modules (e.g. SQL::Statement) or it
may be printed out using, for example, the Data::Dumper module.
The data structure contains all of the information in the SQL
string as parsed into its various components. To take a simple
example:
$parser->parse('SELECT make,model FROM cars');
use Data::Dumper;
print Dumper $parser->structure;
Would produce:
$VAR1 = {
'column_defs' => [
{ 'type' => 'column',
'value' => 'make', },
{ 'type' => 'column',
'value' => 'model', },
],
'command' => 'SELECT',
'table_names' => [
'cars'
]
};
'SELECT make,model, FROM cars'
command => 'SELECT',
table_names => [ 'cars' ],
column_names => [ 'make', 'model' ],
'CREATE TABLE cars ( id INTEGER, model VARCHAR(40) )'
column_defs => {
id => { data_type => INTEGER },
model => { data_type => VARCHAR(40) },
},
'SELECT DISTINCT make FROM cars'
set_quantifier => 'DISTINCT',
'SELECT MAX (model) FROM cars'
set_function => {
name => 'MAX',
arg => 'models',
},
'SELECT * FROM cars LIMIT 5,10'
limit_clause => {
offset => 5,
limit => 10,
},
'SELECT * FROM vars ORDER BY make, model DESC'
sort_spec_list => [
{ make => 'ASC' },
{ model => 'DESC' },
],
"INSERT INTO cars VALUES ( 7, 'Chevy', 'Impala' )"
values => [ 7, 'Chevy', 'Impala' ],
You can find documentation for this module with the perldoc command.
perldoc SQL::Parser
perldoc SQL::Statement
You can also look for information at:
For questions about installation or usage, please ask on the
dbi-users@perl.org mailing list or post a question on PerlMonks
(http://www.perlmonks.org/, where Jeff is known as jZed).
Jens does not visit PerlMonks on a regular basis.
If you have a bug report, a patch or a suggestion, please open a new
report ticket at CPAN (but please check previous reports first in case
your issue has already been addressed). You can mail any of the module
maintainers, but you are more assured of an answer by posting to the
dbi-users list or reporting the issue in RT.
Report tickets should contain a detailed description of the bug or
enhancement request and at least an easily verifiable way of
reproducing the issue or fix. Patches are always welcome, too.
Bugs and feature requests are accepted against the latest version
only. To get patches for earlier versions, you need to get an
agreement with a developer of your choice - who may or not report the
the issue and a suggested fix upstream (depends on the license you
have chosen).
For business support you can contact Jens via his CPAN email
address rehsackATcpan.org. Please keep in mind that business
support is neither available for free nor are you eligible to
receive any support based on the license distributed with this
package.
This module is
copyright (c) 2001,2005 by Jeff Zucker and
copyright (c) 2007-2020 by Jens Rehsack.
All rights reserved.
The module may be freely distributed under the same terms as
Perl itself using either the ``GPL License'' or the ``Artistic
License'' as specified in the Perl README file.
Jeff can be reached at: jzuckerATcpan.org
Jens can be reached at: rehsackATcpan.org or via dbi-devATperl.org
|