oracle的字符集检查工具CSSCAN(一)

简介:
使用CSSCAN 工具在源库检查数据。
指定扫描的schame,只能指定一个schame
$ csscan \"sys/@ as sysdba\" LOG=/tmp/expcheck USER=scott CAPTURE=Y TOCHAR=AL32UTF8 ARRAY=1024000 PROCESS=3
指定扫描的表,在unix环境下当指定多个表时,' 是必须的,否则会报错:0403-057 Syntax error at line 21 : `(' is not expected. ):
$ csscan \"sys/@ as sysdba\" LOG=/tmp/expcheck TABLE='(SCOTT.DEPT,SCOTT.EMP)' CAPTURE=Y TOCHAR=AL32UTF8 ARRAY=1024000 PROCESS=2
在windows环境下,' 可以省略!
c:\>csscan \"sys/@ as sysdba\" LOG=c:\temp\expcheck TABLE=(SCOTT.DEPT,SCOTT.EMP) CAPTURE=Y TOCHAR=AL32UTF8 ARRAY=1024000 PROCESS=2
如果你有很多表要扫描,或许要避免不同os环境下语法的麻烦。就可以使用参数文件。
Example contents of csscan.par:
LOG=/tmp/expcheck
TABLE=(SCOTT.DEPT,SCOTT.EMP)
FULL=N
CAPTURE=Y
TOCHAR=AL32UTF8
  PROCESS=6
ARRAY=1024000

使用参数文件执行:
$ csscan \"sys/@ as sysdba\" PARFILE=/tmp/csscan.par
or
C:\>csscan \"sys/@ as sysdba\" PARFILE=c:\temp\csscan.par

Csscan will create 3 files :
expcheck.out csscan 输出的日志 a log of the output of csscan
expcheck.txt 数据库扫描概要the Database Scan Summary Report
expcheck.err 扫描时出错的记录,包含了出错的表,字段 (contains the rowid's of the Convertible , Truncation and Lossy rows reported in expcheck.txt)
下面是一些注意点:
You can only specify on user at the time, if you are exporting the majority of a database then use FULL=Y
$ csscan \"sys/@ as sysdba\" FULL=Y TOCHAR=AL32UTF8 LOG=expcheck CAPTURE=Y ARRAY=1000000 PROCESS=2
* Always run Csscan connecting with a 'sysdba' connection/user, do not use the "system" or "csmig" user.
* The PROCESS= parameter influences the load on your system, the higher this is (6 or 8 for example) the faster Csscan will be done, the lower this is the less impact it will have on your system. Adapt if needed.
* Do not specify the TONCHAR or FROMNCHAR csscan parameters , those are to change the NLS_NCHAR_CHARACTERSET. Again they are not needed and should not be specified.
* The csscan SUPPRESS parameter limits the size of the .err file by limiting the amount of information logged / table. Using SUPPRESS=1000 will log max 1000 rows for each table in the .err file. It will not affect the information in the .txt file. It WILL affect the data logged in the .err file. This is mainly useful for the first scan of big databases, if you have no idea how much "Convertible" or "Lossy" there is in a database then this will avoid that the .err file becomes 100's of MB big and it limits also the space used by the csscan tables under the Csmig schema.
note that to have correct result of the following select you should NOT use the Csscan  SUPPRESS option.

Once Csscan has been run you then need to check the .txt file

If there is any "Lossy" data - this is data that CANNOT be converted to the new NLS_CHARACTERSET
- you need to check further and see why this data is "Lossy", discussing "Lossy" is outside the scope of this note。
This select will give all the lossy objects found in the last Cssan run:
conn / AS sysdba
SELECT DISTINCT z.owner_name
|| '.'
|| z.table_name
|| '('
|| z.column_name
|| ') - '
|| z.column_type
|| ' ' LossyColumns
FROM csmig.csmv$errors z
WHERE z.error_type ='DATA_LOSS'
ORDER BY LossyColumns
/


For solving the ORA-01401 / ORA-12899 error you need to know which columns are logged as "Truncation" and what the new size of the data will be after import.

You can find that in the expcheck.err file as "Max Post Conversion Data Size"
For example, check in the expcheck.txt file wich table has "Truncation", let's assume you have there a row that say's:

-- snip from expcheck.txt
[Distribution of Convertible, Truncated and Lossy Data by Table]

USER.TABLE Convertible Truncation Lossy
--------------------- ---------------- ---------------- ----------------
...
SCOTT.TESTUTF8 69 6 0
...

then look in the expcheck.err file for "TESTUTF8" until the "Max Post Conversion Data Size" is bigger then the column size for that table.

-- snip from expcheck.err
User : SCOTT
Table : TESTUTF8
Column: ITEM_NAME
Type : VARCHAR2(80)
Number of Exceptions : 6
Max Post Conversion Data Size: 81
the max size after going to AL32UTF8 will be 81 bytes for this column.

Or you can use this select to have a list of the columns that have "Truncation" and the new size in bytes that is minimally needed:

conn / AS sysdba
SET serveroutput ON
DECLARE
newmaxsz NUMBER;
BEGIN
FOR rec IN
( SELECT DISTINCT u.owner_name,
u.table_name,
u.column_name ,
u.column_type,
u.owner_id,
u.table_id,
u.column_id,
u.column_intid
FROM csmv$errors u
WHERE u.error_type='EXCEED_SIZE'
ORDER BY u.owner_name,
u.table_name,
u.column_name
)
LOOP
SELECT MAX(cnvsize)
INTO newmaxsz
FROM csm$errors
WHERE usr# =rec.owner_id
AND obj# =rec.table_id
AND col# =rec.column_id
AND intcol#=rec.column_intid;

DBMS_OUTPUT.PUT_LINE(rec.owner_name ||'.'|| rec.table_name||' ('|| rec.column_name ||') - '|| rec.column_type ||' - '|| newmaxsz || ' Bytes');
END LOOP;
END;
/       
相关文章
|
2天前
|
存储 Oracle 关系型数据库
|
6天前
|
SQL Oracle 关系型数据库
关系型数据库Oracle备份工具
【7月更文挑战第19天】
23 4
|
18天前
|
Oracle 关系型数据库 Java
Oracle数据库导入工具IMP详解与用法
Oracle数据库导入工具IMP详解与用法
|
1月前
|
SQL Oracle 关系型数据库
探索 Linux 命令 `db_archive`:Oracle 数据库归档日志的工具
探索 Linux 中的 `db_archive`,实际与 Oracle 数据库归档日志管理相关。在 Oracle 中,归档日志用于恢复,当在线重做日志满时自动归档。管理员可使用 SQL*Plus 查看归档模式,通过 `RMAN` 进行备份和恢复操作。管理归档日志需谨慎,避免数据丢失。了解归档管理对 Oracle 管理员至关重要,确保故障时能快速恢复数据库。
|
27天前
|
Oracle 关系型数据库 Java
Oracle数据库导入工具IMP详解与用法
Oracle数据库导入工具IMP详解与用法
|
1月前
|
SQL Oracle 关系型数据库
实时计算 Flink版产品使用问题之同步oracle表时,数据量约800万,检查点异常,该如何排查
实时计算Flink版作为一种强大的流处理和批处理统一的计算框架,广泛应用于各种需要实时数据处理和分析的场景。实时计算Flink版通常结合SQL接口、DataStream API、以及与上下游数据源和存储系统的丰富连接器,提供了一套全面的解决方案,以应对各种实时计算需求。其低延迟、高吞吐、容错性强的特点,使其成为众多企业和组织实时数据处理首选的技术平台。以下是实时计算Flink版的一些典型使用合集。
|
2月前
|
Oracle 关系型数据库 数据库
Oracle 11gR2学习之三(创建用户及表空间、修改字符集和Oracle开机启动)
Oracle 11gR2学习之三(创建用户及表空间、修改字符集和Oracle开机启动)
|
2月前
|
Oracle 关系型数据库
Oracle 管理诊断数据工具ADRCI
Oracle 管理诊断数据工具ADRCI
37 2
|
2月前
|
存储 Oracle 关系型数据库
Oracle系列之三:Oracle字符集
Oracle系列之三:Oracle字符集
|
8月前
|
SQL Oracle 关系型数据库
Oracle常用工具
Oracle常用工具