标签
PostgreSQL , Oracle , Greenplum , LOG ERRORS INTO
背景
Oracle支持DML的log errors,是一个很赞的功能。
https://docs.oracle.com/cd/B19306_01/appdev.102/b14258/d_errlog.htm#CEGEJAAJ
https://oracle-base.com/articles/10g/dml-error-logging-10gr2
支持insert,update,delete,merge的错误日志记录,可以跳过错误的行
INSERT INTO dest
SELECT *
FROM source
LOG ERRORS INTO err$_dest ('INSERT') REJECT LIMIT UNLIMITED;
99998 rows created.
SQL>
COLUMN ora_err_mesg$ FORMAT A70
SELECT ora_err_number$, ora_err_mesg$
FROM err$_dest
WHERE ora_err_tag$ = 'INSERT';
ORA_ERR_NUMBER$ ORA_ERR_MESG$
--------------- ---------------------------------------------------------
1400 ORA-01400: cannot insert NULL into ("TEST"."DEST"."CODE")
1400 ORA-01400: cannot insert NULL into ("TEST"."DEST"."CODE")
2 rows selected.
SQL>
UPDATE dest
SET code = DECODE(id, 9, NULL, 10, NULL, code)
WHERE id BETWEEN 1 AND 10
LOG ERRORS INTO err$_dest ('UPDATE') REJECT LIMIT UNLIMITED;
8 rows updated.
SQL>
COLUMN ora_err_mesg$ FORMAT A70
SELECT ora_err_number$, ora_err_mesg$
FROM err$_dest
WHERE ora_err_tag$ = 'UPDATE';
ORA_ERR_NUMBER$ ORA_ERR_MESG$
--------------- ---------------------------------------------------------
1400 ORA-01400: cannot insert NULL into ("TEST"."DEST"."CODE")
1400 ORA-01400: cannot insert NULL into ("TEST"."DEST"."CODE")
2 rows selected.
SQL>
MERGE INTO dest a
USING source b
ON (a.id = b.id)
WHEN MATCHED THEN
UPDATE SET a.code = b.code,
a.description = b.description
WHEN NOT MATCHED THEN
INSERT (id, code, description)
VALUES (b.id, b.code, b.description)
LOG ERRORS INTO err$_dest ('MERGE') REJECT LIMIT UNLIMITED;
99998 rows merged.
SQL>
COLUMN ora_err_mesg$ FORMAT A70
SELECT ora_err_number$, ora_err_mesg$
FROM err$_dest
WHERE ora_err_tag$ = 'MERGE';
ORA_ERR_NUMBER$ ORA_ERR_MESG$
--------------- ---------------------------------------------------------
1400 ORA-01400: cannot insert NULL into ("TEST"."DEST"."CODE")
1400 ORA-01400: cannot insert NULL into ("TEST"."DEST"."CODE")
2 rows selected.
SQL>
DELETE FROM dest
LOG ERRORS INTO err$_dest ('DELETE') REJECT LIMIT UNLIMITED;
99996 rows deleted.
SQL>
COLUMN ora_err_mesg$ FORMAT A69
SELECT ora_err_number$, ora_err_mesg$
FROM err$_dest
WHERE ora_err_tag$ = 'DELETE';
ORA_ERR_NUMBER$ ORA_ERR_MESG$
--------------- ---------------------------------------------------------------------
2292 ORA-02292: integrity constraint (TEST.DEST_CHILD_DEST_FK) violated -
child record found
2292 ORA-02292: integrity constraint (TEST.DEST_CHILD_DEST_FK) violated -
child record found
2 rows selected.
SQL>
Greenplum copy兼容log errors
Greenplum可以通过COPY支持log errors。暂时未支持insert, merge, update, delete的error log.
COPY table [(column [, ...])] FROM {'file' | STDIN}
[ [WITH]
[OIDS]
[HEADER]
[DELIMITER [ AS ] 'delimiter']
[NULL [ AS ] 'null string']
[ESCAPE [ AS ] 'escape' | 'OFF']
[NEWLINE [ AS ] 'LF' | 'CR' | 'CRLF']
[CSV [QUOTE [ AS ] 'quote']
[FORCE NOT NULL column [, ...]]
[FILL MISSING FIELDS]
[[LOG ERRORS [INTO error_table] [KEEP]
SEGMENT REJECT LIMIT count [ROWS | PERCENT] ]
COPY {table [(column [, ...])] | (query)} TO {'file' | STDOUT}
[ [WITH]
[OIDS]
[HEADER]
[DELIMITER [ AS ] 'delimiter']
[NULL [ AS ] 'null string']
[ESCAPE [ AS ] 'escape' | 'OFF']
[CSV [QUOTE [ AS ] 'quote']
[FORCE QUOTE column [, ...]] ]
[IGNORE EXTERNAL PARTITIONS ]
LOG ERRORS [INTO error_table] [KEEP]
This is an optional clause that can precede a SEGMENT REJECT LIMIT clause to log
information about rows with formatting errors. The INTO error_table clause specifies an
error table where rows with formatting errors will be logged when running in single row error
isolation mode.
If the INTO error_table clause is not specified, the error log information is stored internally
(not in an error table). Error log information that is stored internally is accessed with the
Greenplum Database built-in SQL function gp_read_error_log().
If the error_table specified already exists, it is used. If it does not exist, it is created. If
error_table exists and does not have a random distribution (the DISTRIBUTED RANDOMLY
clause was not specified when creating the table), an error is returned.
If the command generates the error table and no errors are produced, the default is to drop
the error table after the operation completes unless KEEP is specified. If the table is created
and the error limit is exceeded, the entire transaction is rolled back and no error data is
saved. If you want the error table to persist in this case, create the error table prior to running
the COPY.
See Notes for information about the error log information and built-in functions for viewing
and managing error log information.
Note: The optional INTO error_table clause is deprecated and will not be
supported in a future release. Only internal error logs will be supported.
When you specify LOG ERRORS INTO error_table, Greenplum Database creates the table error_table
that contains errors that occur while reading the external table. The table is defined as follows:
CREATE TABLE error_table_name ( cmdtime timestamptz, relname text,
filename text, linenum int, bytenum int, errmsg text,
rawdata text, rawbytes bytea ) DISTRIBUTED RANDOMLY;
You can view the information in the table with SQL commands.
For error log data that is stored internally when the INTO error_table is not specified:
- Use the built-in SQL function gp_read_error_log('table_name'). It requires SELECT privilege on
table_name. This example displays the error log information for data loaded into table ext_expenses
with a COPY command:
SELECT * from gp_read_error_log('ext_expenses');
The error log contains the same columns as the error table.
The function returns FALSE if table_name does not exist.
- If error log data exists for the specified table, the new error log data is appended to existing error log
data. The error log information is not replicated to mirror segments.
- Use the built-in SQL function gp_truncate_error_log('table_name') to delete the error log data
for table_name. It requires the table owner privilege This example deletes the error log information
captured when moving data into the table ext_expenses:
SELECT gp_truncate_error_log('ext_expenses');
The function returns FALSE if table_name does not exist.
Specify the * wildcard character to delete error log information for existing tables in the current
database. Specify the string *.* to delete all database error log information, including error log
information that was not deleted due to previous database issues. If * is specified, database owner
privilege is required. If *.* is specified, operating system super-user privilege is required.
When a Greenplum Database user who is not a superuser runs a COPY command, the command can be
controlled by a resource queue. The resource queue must be configured with the ACTIVE_STATEMENTS
parameter that specifies a maximum limit on the number of queries that can be executed by roles assigned
to that queue. Greenplum Database does not apply a cost value or memory value to a COPY command,
resource queues with only cost or memory limits do not affect the running of COPY commands.
A non-superuser can runs can run these types of COPY commands:
- COPY FROM command where the source is stdin
- COPY TO command where the destination is stdout
For information about resource queues, see "Workload Management with Resource Queues" in the
Greenplum Database Administrator Guide.
参考
https://greenplum.org/docs/570/ref_guide/sql_commands/COPY.html
《PostgreSQL 11 preview - MERGE 语法支持与CTE内支持,兼容SQL:2016 , 兼容 Oracle》