[20170419]ora-28547.txt

简介: [20170419]ora-28547.txt --//测试环境,莫名奇妙使用toad登陆出现如下错误(使用ezconnect方式)。 ORA-28547: connection to server failed, probable Oracle Net adm...

[20170419]ora-28547.txt

--//测试环境,莫名奇妙使用toad登陆出现如下错误(使用ezconnect方式)。
ORA-28547: connection to server failed, probable Oracle Net admin error

$ oerr ora 28547
28547, 0000, "connection to server failed, probable Oracle Net admin error"
// *Cause:   A failure occurred during initialization of a network connection
//           from a client process to the Oracle server.  The connection
//           was completed but a disconnect occurred while trying to
//           perform protocol-specific initialization, usually due to
//           use of different network protocols by opposite sides
//           of the connection.  This usually is caused by incorrect
//           Oracle Net administrative setup for database links or external
//           procedure calls.   The most frequent specific causes are:
//           --  The connection uses a connect string which refers
//               to a Heterogeneous Services agent instead of
//               an Oracle server.
//           --  The connection uses a connect string which includes
//               an (HS=) specification.
//           --  Hitting Ctrl+C before connection initialization is
//               complete.
//           --  Using DRCP on Windows with SQLNET.AUTHENTICATION_SERVICES
//               set to NTS.
//           --  Connection timeout due to SQLNET.INBOUND_CONNECT_TIMEOUT
//               parameter when all shared servers are busy in
//               shared server configuration.
// *Action:  Check Oracle Net administration in the following ways:
//           --  When using TNSNAMES.ORA or an Oracle Names server, make sure
//               that the client connection to the Oracle server uses
//               the correct service name or SID.
//           --  Check LISTENER.ORA on the connection end point's host machine
//               to assure that this service name or SID refers
//               to the correct server.
//           --  Confirm in TNSNAMES.ORA or the equivalent service definition
//               that the connect string does not contain (HS=).
//           --  Set SQLNET.AUTHENTICATION_SERVICES to NONE when using DRCP
//               on Windows.
//

--//我改用sqlplus测试也一样。
sqlplus scott/book@192.168.100.78:1521/book

ERROR:
ORA-28547: connection to server failed, probable Oracle Net admin error

--//而改用tnsnames.ora 登陆,没有问题。重启监听问题依旧。
sqlplus scott/book@78

--//最后我选择重启数据库看看,在shutdown immediate时,我注意alert文件出现如下信息时:
Waiting for dispatcher 'D000' to shutdown
--//停留了很久,才想起来可能共享服务相关的进程可能出现问题,因为我的配置参数里面:
SYS@book> show parameter dispatchers
NAME             TYPE    VALUE
---------------- ------- --------------------------------------
dispatchers      string  (PROTOCOL=TCP) (SERVICE=book,bookXDB)
max_dispatchers  integer

--//这样如果使用ezconnect连接模式,实际上使用优先使用共享服务模式。链接:http://blog.itpub.net/267265/viewspace-2124172/
$ rlsql scott/book@127.0.0.1:1521/book
SCOTT@127.0.0.1:1521/book> select saddr,sid,paddr,status,server from v$session where sid in (select sid from v$mystat);
SADDR                   SID PADDR            STATUS   SERVER
---------------- ---------- ---------------- -------- ---------
00000000854B75C0        222 000000008591A280 ACTIVE   SHARED

SCOTT@127.0.0.1:1521/book> @ &r/spid
       SID    SERIAL# SPID       PID  P_SERIAL# C50
---------- ---------- ------ ------- ---------- --------------------------------------------------
       222         47 37225       20          1 alter system kill session '222,47' immediate;

--//当然因为重启数据库,现在已经正常使用,如何模拟这个问题呢?我自己做了一个测试。

1.环境:
SCOTT@127.0.0.1:1521/book> @ &r/ver1
PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

$ ps -ef | grep ora_[sd]000
oracle   37223     1  0 Apr18 ?        00:00:00 ora_d000_book
oracle   37225     1  0 Apr18 ?        00:00:02 ora_s000_book

--//共享服务器使用进程ora_d000_book,然后再紧系ora_s000_book进程,注意看前面的spid=37225.通过lsnrctl services也可以看出来。
$ lsnrctl services
LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 19-APR-2017 08:44:18
Copyright (c) 1991, 2013, Oracle.  All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=127.0.0.1)(PORT=1521)))
Services Summary...
Service "book" has 2 instance(s).
  Instance "book", status UNKNOWN, has 1 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:0 refused:0
         LOCAL SERVER
  Instance "book", status READY, has 2 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:7 refused:0 state:ready
         LOCAL SERVER
      "D000" established:12 refused:0 current:1 max:1022 state:ready
         DISPATCHER <machine: gxqyydg4, pid: 37223>
         (ADDRESS=(PROTOCOL=tcp)(HOST=gxqyydg4)(PORT=47060))
Service "book123" has 1 instance(s).
  Instance "book", status UNKNOWN, has 1 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:0 refused:0
         LOCAL SERVER
Service "bookXDB" has 1 instance(s).
  Instance "book", status READY, has 1 handler(s) for this service...
    Handler(s):
      "D000" established:12 refused:0 current:1 max:1022 state:ready
         DISPATCHER <machine: gxqyydg4, pid: 37223>
         (ADDRESS=(PROTOCOL=tcp)(HOST=gxqyydg4)(PORT=47060))
The command completed successfully

2.先停止进程ora_s000_book(进程号37225)看看。
$ kill -l
1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL
5) SIGTRAP      6) SIGABRT      7) SIGBUS       8) SIGFPE
9) SIGKILL     10) SIGUSR1     11) SIGSEGV     12) SIGUSR2
13) SIGPIPE     14) SIGALRM     15) SIGTERM     16) SIGSTKFLT
17) SIGCHLD     18) SIGCONT     19) SIGSTOP     20) SIGTSTP
21) SIGTTIN     22) SIGTTOU     23) SIGURG      24) SIGXCPU
25) SIGXFSZ     26) SIGVTALRM   27) SIGPROF     28) SIGWINCH
29) SIGIO       30) SIGPWR      31) SIGSYS      34) SIGRTMIN
35) SIGRTMIN+1  36) SIGRTMIN+2  37) SIGRTMIN+3  38) SIGRTMIN+4
39) SIGRTMIN+5  40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8
43) SIGRTMIN+9  44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12
47) SIGRTMIN+13 48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14
51) SIGRTMAX-13 52) SIGRTMAX-12 53) SIGRTMAX-11 54) SIGRTMAX-10
55) SIGRTMAX-9  56) SIGRTMAX-8  57) SIGRTMAX-7  58) SIGRTMAX-6
59) SIGRTMAX-5  60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2
63) SIGRTMAX-1  64) SIGRTMAX

--//linux SIGSTOP(19) 是给进程发stop信号。 SIGCONT(18) 是继续运行,注好像其他OS的信号数字不一样,使用kill -l 参考就明白了。

$ kill -19 37225

$  rlsql scott/book@127.0.0.1:1521/book
SQL*Plus: Release 11.2.0.4.0 Production on Wed Apr 19 08:47:52 2017
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
ERROR:
ORA-28547: connection to server failed, probable Oracle Net admin error

--//OK,现在模拟出来问题了,至于为什么hang住,已经无法猜测。这个时候如果看监听日志:
$ tail -f listener.log
19-APR-2017 08:52:17 * (CONNECT_DATA=(SERVICE_NAME=book)(CID=(PROGRAM=sqlplus)(HOST=xxxx)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=7512)) * establish * book * 0

--//这也是我当时感觉奇怪的地方,无法定位问题。

$ kill -18 37225

--//再登陆就ok了。

--//这也是共享服务器模式的弊端,参考链接:http://blog.itpub.net/267265/viewspace-2124172/

3.继续测试,kill 进程ora_d000_book呢?

$  kill -9 37225

$  rlsql scott/book@127.0.0.1:1521/book
SQL*Plus: Release 11.2.0.4.0 Production on Wed Apr 19 08:56:17 2017
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SCOTT@127.0.0.1:1521/book>  select saddr,sid,paddr,status,server from v$session where sid in (select sid from v$mystat);
SADDR                   SID PADDR            STATUS   SERVER
---------------- ---------- ---------------- -------- ---------
00000000854BA6A0        221 000000008591A280 ACTIVE   SHARED

--//要等上1小会,大约40秒后进入。

SCOTT@127.0.0.1:1521/book> @ &r/spid
       SID    SERIAL# SPID       PID  P_SERIAL# C50
---------- ---------- ------ ------- ---------- --------------------------------------------------
       221         17 41328       20          2 alter system kill session '221,17' immediate;

$ ps -ef | grep ora_[sd]000
oracle   37223     1  0 Apr18 ?        00:00:00 ora_d000_book
oracle   41328     1  0 08:56 ?        00:00:00 ora_s000_book

--//可以发现生成新的ora_s000_book进程,进程号=41328.你还可以发现一个细节PADDR与前面一样的(kill -9 前)

4.如果先停止进程ora_d000_book(进程号37223)看看。

$  rlsql scott/book@127.0.0.1:1521/book
SQL*Plus: Release 11.2.0.4.0 Production on Wed Apr 19 09:02:39 2017
Copyright (c) 1982, 2013, Oracle.  All rights reserved.

--等2,3分钟依旧没有反映。

$ kill -18 37223

--//马上出现提示,而且监听log才出现
19-APR-2017 09:05:59 * (CONNECT_DATA=(SERVICE_NAME=book)(CID=(PROGRAM=sqlplus)(HOST=gxqyydg4)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=7678)) * establish * book * 0
19-APR-2017 09:06:01 * service_update * book * 0
--//这也说明sNNN相关进程hang住了。

SCOTT@127.0.0.1:1521/book>  select saddr,sid,paddr,status,server from v$session where sid in (select sid from v$mystat);

SADDR                   SID PADDR            STATUS   SERVER
---------------- ---------- ---------------- -------- ---------
00000000854BA6A0        221 000000008591A280 ACTIVE   SHARED

5.继续测试kill -9 ora_d000_book的情况:

$ ps -ef | grep ora_[sd]000
oracle   37223     1  0 Apr18 ?        00:00:00 ora_d000_book
oracle   41328     1  0 08:56 ?        00:00:00 ora_s000_book

$  rlsql scott/book@127.0.0.1:1521/book
SQL*Plus: Release 11.2.0.4.0 Production on Wed Apr 19 09:09:18 2017
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
ERROR:
ORA-12541: TNS:no listener

--等上一小会,就ok了。

$ ps -ef | grep ora_[sd]000
oracle   41328     1  0 08:56 ?        00:00:00 ora_s000_book
oracle   41424     1  0 09:09 ?        00:00:00 ora_d000_book

--//另外alert里面也有相关记录:
Wed Apr 19 08:56:43 2017
found dead shared server 'S000', pid = (20, 1)
Wed Apr 19 09:03:18 2017
Incremental checkpoint up to RBA [0x2c2.10a1f.0], current log tail at RBA [0x2c2.11415.0]

Wed Apr 19 09:09:19 2017
found dead dispatcher 'D000', pid = (19, 1)

--//oracle后台也在监测,如果发现这些进程被kill,自动建立新的进程,当然需要等1小会....

--//从以上测试看,最怕进程阻塞,导致后续的语句以及登陆出现问题。
--//从另外一个方面说明,如果使用共享模式登陆,要密切注意相关进程的状态。max_dispatchers参数以及sql语句优化。

目录
相关文章
|
Oracle 关系型数据库
[20180502]ORA-01580.txt
[20180502]ORA-01580.txt RMAN> backup current controlfile; Starting backup at 2018-05-02 15:36:03 using channel ORA_DISK_1 using...
1174 0
|
关系型数据库 Oracle Linux
[20180321]ORA-08180.txt
[20180321]ORA-08180.txt $ oerr ora 08180 08180, 00000, "no snapshot found based on specified time" // *Cause: Could not match the time to an SCN from the mapping table.
1297 0
|
Oracle 关系型数据库 Linux
|
Oracle 关系型数据库 数据库
[20171208]ORA-16014(10G).txt
[20171208]ORA-16014(10G).txt --//别人配置dg遇到的问题,当时没有记录.在自己的环境做一个演示.这个问题相对隐藏很深.主要是自己不注意这些细节.
1201 0
|
关系型数据库 Oracle Linux
[20170628]12C ORA-54032.txt
[20170628]12C ORA-54032.txt https://hourim.wordpress.com/2017/06/24/ora-54032-column-to-be-renamed-is-used-in-a-virtual-column-expres...
1460 0
[20160623]ora-00445.txt
[20160623]ora-00445.txt Wed Jun 22 07:38:25 2016 Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smco_3976.
1323 0
|
测试技术
[20160311]ora-01732.txt
[20160311]ora-01732.txt --今天工作出现ora-01732错误,做一个测试例子,做一个记录: 1.环境: SCOTT@test01p> @ ver1 PORT_STRING          VERSION     BANNER    ...
994 0
|
SQL Oracle 关系型数据库
[20151215]ORA-60014.txt
[20151215]ORA-60014: invalid MAXSIZE storage option value.txt SCOTT@book> @ &r/ver1 PORT_STRING                    VERSION        ...
846 0
|
SQL
[20150707]ORA-00932.txt
[20150707]ORA-00932: inconsistent datatypes: expected - got CLOB.txt --自己写一个通过sql_id查看sql语句的脚本: column sqltext format a200 sele...
777 0
|
Oracle 关系型数据库 数据库
[20150529]ORA-16664.txt
[20150529]ORA-16664.txt --今天在例行检查中.执行 dgmgrl 出现ora-16664错误. --检查drcdbcn1.log文件,发现如下错误.
996 0