Is there a regex which extracts SQL queries from a string? I'm NOT interested to validate any SQL syntax, rather and only extracting a selection of SQL commands. This to parse a given SQL file/string in a flexible manner.
Given is the following SQL file/string example:
SELECT
*
FROM
test_table
WHERE
test_row = 'Testing ; semicolon';
SELECT * FROM another_test_table;
INSERT INTO
table_name
VALUES
(value1,'value which contains semicolon ;;;;',value3,...);
Some pseudocode example would be: ^(UPDATE|SELECT|INSERT INTO)(.*)(;)$
. In the future i'm looking to extend this with all (possible) commands.
any character
(including whitespaces and newlines);
, which delimits the SQL query.Whenever this would be possible via a regex the following java code is able to extract all SQL commands:
final String regex = "LOOKING_FOR_THIS_ONE";
final Pattern p = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = p.matcher(content);
while (matcher.find()) {
// matcher.group() now contains the full SQL command
}
Thanks in advance!
I'll start off by saying that this is not a good way of doing it, and strongly urge you to find another method of doing it, preferrably tagging it properly where the statements are made, so you don't end up in this situation.
That being said, SQL requires it to start with one of the following; DELETE
, SELECT
, WITH
, UPDATE
or INSERT INTO
. It also requires that the input ends with ;
.
We can use this to grab all sequences matching SQL with the following:
final String regex = "^(INSERT INTO|UPDATE|SELECT|WITH|DELETE)(?:[^;']|(?:'[^']+'))+;\\s*$";
final Pattern p = Pattern.compile(regex, Pattern.MULTILINE | Pattern.DOTALL);
Group 1 now holds the operating word, in case you wish to filter valid SQL on UPDATE
or SELECT
.
See the regex in action, as well as a cave-at here: