Error:
ORA-24330: internal OCI error
SIVTMasterProProfList:1002 _getList_ SAVES=0 < "CCS" "PRS" F 1 "29644" "" :0 17000 "Database error: SQL> FETCH SelectCursor USING SQL DESCRIPTOR SelectBindDesc_." < "CCS" "PRS" F 1 "29644" "" :0 17000 "Database error: SELECT siv_t_mp_p0.sy_createddate, siv_t_mp_p0.rfu9, siv_t_mp_p0.mp_profname, siv_t_mp_p0.rfu8, siv_t_mp_p0.rfu7, siv_t_mp_p0.rfu6, siv_t_mp_p0.rfu5, siv_t_mp_p0.rfu4, siv_t_mp_p0.av_values, siv_t_mp_p0.rfu3, siv_t_mp_p0.rfu2, siv_t_mp_p0.rfu1, siv_t_mp_p0.sy_pendoperation, siv_t_mp_p0.rfu10, siv_t_mp_p0.vpn, siv_t_mp_p0.sy_modifieddate, siv_t_mp_p0.sy_committeddate, siv_t_mp_p0.domain, siv_t_mp_p0.sy_objectid, siv_t_mp_p0.metadata, siv_t_mp_p0.sy_profname, siv_t_mp_p0.sy_changeorderid, siv_t_mp_p0.sy_version, siv_t_mp_p0.sy_objectid FROM siv_t_mp_p siv_t_mp_p0 WHERE siv_t_mp_p0.sy_objectid = 'SIVU0000Fe' AND siv_t_mp_p0.sy_pendoperation != 2 AND (siv_t_mp_p0.sy_changeorderid = 'SYxf004M5G' AND siv_t_mp_p0.sy_version = 1 OR siv_t_mp_p0.sy_version = 0 AND NOT (siv_t_mp_p0.sy_objectid IN (SELECT siv_t_mp_p0.sy_objectid FROM siv_t_mp_p siv_t_mp_p0 WHERE siv_t_mp_p0.sy_changeorderid = 'SYxf004M5G' AND siv_t_mp_p0.sy_version = 1)))." < "CCS" "PRS" F 1 "29644" "" :0 17000 "Database error: ORA-03106: fatal two-task communication protocol error. (Code=-3106) (Position=0)." < "CCS" "ATR" P 1 "29644" "" SIVTMasterProProfList:1002 85010 "Getlist failed. [VPN Access Path Master Profile P List #1002 [SIVTMasterProProfList:1002]]" =TOP
PERMITTED (default)
Just before enlisting solution, here are few snaps showing how we came to conclusion that setting parameter PERMITTED will resolve this issue
At DB:
ORA-24330: internal OCI error
ORA-03106: fatal two-task communication
protocol error
ORA-03114: not connected to ORACLE
Where: Error logged on oracle
11.2.0.2 client application server. Application based on c/c++ coding. Solaris 10
SPARC
Background:
ORA-24330: internal OCI error & ORA-03114: not connected to ORACLE errors
caused due to session disconnection by ORA-03106:
fatal two-task communication protocol error
So, if you
receive any of these three errors there could be other error strings relevant
to it.
First time
we observed ORA-24330: internal OCI error
& started hunting for resolution, but did not find anything than contact
oracle support , However debugging further in logs found other two errors i.e. ORA-03114 & ORA-03106
Showing
complete error logs below, where you will find the sql statement & errors
Being DBA It
may take many efforts to get all these errors out of thousands of application logs.
Initially we
found just OCI Error which took us in wrong direction completely until we (application
team) sighted ORA-03106: fatal two-task
communication protocol error
[RSP] 139100
SIVTMasterProProfList:1001 _getList_
SAVES=0 < "CCS" "PRS" F 1 "18512" "":0 17000 "Database error: SQL> FETCH SelectCursor USING SQL
DESCRIPTOR electBindDesc_."
< "CCS" "PRS" F 1 "18512" ""
:0 17000 "Database error: SELECT siv_t_mp_p0.sy_createddate,
siv_t_mp_p0.rfu9, siv_t_mp_p0.mp_profname, siv_t_mp_p0.rfu8, siv_t_mp_p0.rfu7,
siv_t_mp_p0.rfu6, siv_t_mp_p0.rfu5, siv_t_mp_p0.rfu4, siv_t_mp_p0.av_values,
siv_t_mp_p0.rfu3, siv_t_mp_p0.rfu2, siv_t_mp_p0.rfu1,
siv_t_mp_p0.sy_pendoperation, siv_t_mp_p0.rfu10, siv_t_mp_p0.vpn,
siv_t_mp_p0.sy_modifieddate, siv_t_mp_p0.sy_committeddate, siv_t_mp_p0.domain,
siv_t_mp_p0.sy_objectid, siv_t_mp_p0.metadata, siv_t_mp_p0.sy_profname,
siv_t_mp_p0.sy_changeorderid, siv_t_mp_p0.sy_version, siv_t_mp_p0.sy_objectid
FROM siv_t_mp_p siv_t_mp_p0 WHERE siv_t_mp_p0.sy_objectid = 'SIVU0000Fe' AND
siv_t_mp_p0.sy_pendoperation != 2 AND (siv_t_mp_p0.sy_changeorderid =
'SYxf004PZK' AND siv_t_mp_p0.sy_version = 1 OR siv_t_mp_p0.sy_version = 0 AND
NOT (siv_t_mp_p0.sy_objectid IN (SELECT siv_t_mp_p0.sy_objectid FROM siv_t_mp_p
siv_t_mp_p0 WHERE siv_t_mp_p0.sy_changeorderid = 'SYxf004PZK' AND
siv_t_mp_p0.sy_version = 1)))."
< "CCS" "PRS" F 1 "18512" ""
:0 17000 "Database error:
ORA-24330: internal OCI error. (Code=-24330) (Position=0)." <
"CCS"
"ATR" P 1
"18512" "" SIVTMasterProProfList:1001 85010
"Getlist failed. [VPN Access Path Master Profile P List #1001
[SIVTMasterProProfList:1001]]" =TOP [->REQ] 139100 SIVTMasterProProfList:1001 _clearList_SAVES=0 < "CCS" "PRS" F 1 "18512" "" :0 17000 "Database error: SQL> OPEN GetInfoCursor." < "CCS" "PRS" F 1 "18512" "" :0 17000 "Database error: ORA-03114: not connected to ORACLE. (Code=-3114) (Position=0)." < "CCS" "PRS" I 1 "18512" "" :0 17097 "Database server is no longer accessible, forcing NPA shutdown and restart." =TOP [EVT->] 138999 LogMessage:1 systemadmin2 CCS PRS 18512 ":0" 0 17000 F 1 "!nouser!" "Database error: ORA-03114: not connected to ORACLE. (Code=-3114) (Position=0)." [EVT->] 138999 LogMessage:1 systemadmin2 CCS PRS 18512 ":0" 0 17000 F 1 "!nouser!" "Database error: SQL> OPEN GetInfoCursor."
[SIVTMasterProProfList:1001]]" =TOP [->REQ] 139100 SIVTMasterProProfList:1001 _clearList_SAVES=0 < "CCS" "PRS" F 1 "18512" "" :0 17000 "Database error: SQL> OPEN GetInfoCursor." < "CCS" "PRS" F 1 "18512" "" :0 17000 "Database error: ORA-03114: not connected to ORACLE. (Code=-3114) (Position=0)." < "CCS" "PRS" I 1 "18512" "" :0 17097 "Database server is no longer accessible, forcing NPA shutdown and restart." =TOP [EVT->] 138999 LogMessage:1 systemadmin2 CCS PRS 18512 ":0" 0 17000 F 1 "!nouser!" "Database error: ORA-03114: not connected to ORACLE. (Code=-3114) (Position=0)." [EVT->] 138999 LogMessage:1 systemadmin2 CCS PRS 18512 ":0" 0 17000 F 1 "!nouser!" "Database error: SQL> OPEN GetInfoCursor."
SIVTMasterProProfList:1002 _getList_ SAVES=0 < "CCS" "PRS" F 1 "29644" "" :0 17000 "Database error: SQL> FETCH SelectCursor USING SQL DESCRIPTOR SelectBindDesc_." < "CCS" "PRS" F 1 "29644" "" :0 17000 "Database error: SELECT siv_t_mp_p0.sy_createddate, siv_t_mp_p0.rfu9, siv_t_mp_p0.mp_profname, siv_t_mp_p0.rfu8, siv_t_mp_p0.rfu7, siv_t_mp_p0.rfu6, siv_t_mp_p0.rfu5, siv_t_mp_p0.rfu4, siv_t_mp_p0.av_values, siv_t_mp_p0.rfu3, siv_t_mp_p0.rfu2, siv_t_mp_p0.rfu1, siv_t_mp_p0.sy_pendoperation, siv_t_mp_p0.rfu10, siv_t_mp_p0.vpn, siv_t_mp_p0.sy_modifieddate, siv_t_mp_p0.sy_committeddate, siv_t_mp_p0.domain, siv_t_mp_p0.sy_objectid, siv_t_mp_p0.metadata, siv_t_mp_p0.sy_profname, siv_t_mp_p0.sy_changeorderid, siv_t_mp_p0.sy_version, siv_t_mp_p0.sy_objectid FROM siv_t_mp_p siv_t_mp_p0 WHERE siv_t_mp_p0.sy_objectid = 'SIVU0000Fe' AND siv_t_mp_p0.sy_pendoperation != 2 AND (siv_t_mp_p0.sy_changeorderid = 'SYxf004M5G' AND siv_t_mp_p0.sy_version = 1 OR siv_t_mp_p0.sy_version = 0 AND NOT (siv_t_mp_p0.sy_objectid IN (SELECT siv_t_mp_p0.sy_objectid FROM siv_t_mp_p siv_t_mp_p0 WHERE siv_t_mp_p0.sy_changeorderid = 'SYxf004M5G' AND siv_t_mp_p0.sy_version = 1)))." < "CCS" "PRS" F 1 "29644" "" :0 17000 "Database error: ORA-03106: fatal two-task communication protocol error. (Code=-3106) (Position=0)." < "CCS" "ATR" P 1 "29644" "" SIVTMasterProProfList:1002 85010 "Getlist failed. [VPN Access Path Master Profile P List #1002 [SIVTMasterProProfList:1002]]" =TOP
Cause &
Analysis:
As I explained earlier getting exact bunch of errors &
the actual error out of it was the trick to get into correct direction to solve
this problem
ORA-03106: fatal two-task communication protocol was the actual error & rest are
side effects of it causing disconnection of DB session
Here, I would like to show step by step analysis that we did
rather than jumping to solution directly which is at the end. In case the
solution is not relevant to your problem you may get some hint or direction to
resolve your problem based on analysis shown here below …
Day 1:
Application Team reported error ORA-03114
first time appearing within specific time interval 2-4 PM IST
What we did? Of
course pointed finger to network
[EVT->] 138999 LogMessage:1 systemadmin2 CCS PRS 18512
":0" 0 17000 F 1
"!nouser!" "Database
error: ORA-03114: not connected to
ORACLE. (Code=-3114) (Position=0)."[EVT->] 138999 LogMessage:1 systemadmin2 CCS PRS 18512
":0" 0 17000 F 1
"!nouser!" "Databaseerror: SQL> OPEN GetInfoCursor."
Issue went in Network and system admin’s bucket as no error was reported in alert log of DB
Issue went in Network and system admin’s bucket as no error was reported in alert log of DB
Day 2: Network team were ready to capture the log to
identify what’s going wrong between client & server communications, but Nothing
Found suspicious at Network level!
System admin also hold the line of control, they enabled something
like tcpdump, packet snooping and other traces, No luck!
Day 3: Yes, it was DBA’s turn on day 3
We did below to capture what’s going wrong between 2-4 PM IST
I. Ensure
AWR snaps being taken
II. Enabled
10046 trace for every incoming session from application server host
III.
Continuously Monitored TOP cpu/memory/locking session, load on DB & APPS
server
What we observed?
- Load found to be going high on both DB &
Apps server between 2-4 PM IST when error occurs
- Huge COMMIT operation from application as if
some batch job being executed
- RMAN backup was running causing huge IO
- 5 Queries with poor execution plan including a
DML executing almost 50 K time , two of the SELECT queries using /*+
FIRST_ROWS_1*/ which suggests a COMMIT after every row causing high COMMIT rate
- Log file sync , db sequential read , DB CPU were
amongst top 5 wait events
What we did?
- Highlighted these DB observation to application
team for their analysis
- As log file sync observed , when checked the
redo log file it was just 50 MB * 3 redo groups, so we replaced it by 1G * 6
Redo Log group with 2 member each
- RMAN backup scheduled changed to Night
What we got
from application in reply to DB observation is New Error i.e. ORA-24330:
internal OCI error along with previous ORA-03114: not connected
Day 4: ORA-24330:
internal OCI error made this issue interesting for others (application,
network team) as it confirms to be complete oracle issue either at oracle client
or database
What we did /
found?
- Logged Sev 1 SR in My Oracle Support (MOS)
- Relevant to OCI error ORA-24330 found below
oracle notes
Effect of client_result_cache_size On Client
Applications Or OCI Based Applications (Doc ID 1300727.1)
Bug 10427522 - ORA-24330 "internal OCI error" during DML (Doc ID 10427522.8)
Bug 10427522 - ORA-24330 "internal OCI error" during DML (Doc ID 10427522.8)
In above
oracle notes, it was suggested to
disable 11g feature of client result cache by setting client_result_cache_size=0
in parameter file
If value set
for client_result_cache_size is
greater than 0, then during heavy DML buffer size limited to this value causing
OCI error
But client_result_cache_size
was already set to zero on db, still we enforced it by adding parameter client_result_cache_size=0
in pfile as db was running with pfile than spfile with a db reboot
At the end
of day 4 we were completely dependent & waiting for SR updates for recommendation to get breakthrough
At the end
of the day 4, one major challenge was to re-produce the scenario which
application team was unable to do it using a sample code and till now exact
part of the application code was not identified
MOS
expecting us to create a testcase by re-producing the problem using below, but
we were unable to do so
make
-f demo_rdbms.mk build EXE=testcase OBJS=testcase.o
Day 5:
Fortunately last day of Hunt, followed by Happy Weekend
Again error
appeared at 3:46 PM IST & this time application team introduced us to new
ORA error i.e.
ORA-03106: fatal two-task communication
protocol error along with previous ORA-24330
& ORA-03114
It is not
like the new errors appearing every day but as I mentioned in beginning, huge
application logs to be mined (log created for almost every server process id at
application level) took time to identify the ORA- errors which was there right
from the beginning of the issues but we got it on day 5, unfortunately (or say fortunately
on day 5 at least)
SIVTMasterProProfList:1002 _getList_
SAVES=0 < "CCS" "PRS" F 1 "29644" ""
:0 17000 "Database error: SQL> FETCH SelectCursor USING SQL
DESCRIPTOR SelectBindDesc_." < "CCS" "PRS" F 1
"29644" "" :0 17000 "Database error: SELECT
siv_t_mp_p0.sy_createddate, siv_t_mp_p0.rfu9, siv_t_mp_p0.mp_profname,
siv_t_mp_p0.rfu8, siv_t_mp_p0.rfu7, siv_t_mp_p0.rfu6,
siv_t_mp_p0.rfu5, siv_t_mp_p0.rfu4, siv_t_mp_p0.av_values, siv_t_mp_p0.rfu3,
siv_t_mp_p0.rfu2, siv_t_mp_p0.rfu1, siv_t_mp_p0.sy_pendoperation, siv_t_mp_p0.rfu10,
siv_t_mp_p0.vpn, siv_t_mp_p0.sy_modifieddate, siv_t_mp_p0.sy_committeddate,
siv_t_mp_p0.domain, siv_t_mp_p0.sy_objectid, siv_t_mp_p0.metadata, siv_t_mp_p0.sy_profname,
siv_t_mp_p0.sy_changeorderid, siv_t_mp_p0.sy_version, siv_t_mp_p0.sy_objectid
FROM siv_t_mp_p siv_t_mp_p0 WHERE siv_t_mp_p0.sy_objectid = 'SIVU0000Fe' AND
siv_t_mp_p0.sy_pendoperation != 2 AND (siv_t_mp_p0.sy_changeorderid =
'SYxf004M5G' AND siv_t_mp_p0.sy_version = 1 OR siv_t_mp_p0.sy_version = 0 AND NOT (siv_t_mp_p0.sy_objectid IN
(SELECT siv_t_mp_p0.sy_objectid FROM siv_t_mp_p siv_t_mp_p0 WHERE siv_t_mp_p0.sy_changeorderid = 'SYxf004M5G' AND siv_t_mp_p0.sy_version = 1)))."
< "CCS" "PRS" F 1 "29644" ""
:0 17000 "Database error: ORA-03106:
fatal two-task communication protocol error. (Code=-3106)
(Position=0)." < "CCS" "ATR" P 1 "29644" "" SIVTMasterProProfList:1002 85010 "Getlist failed. [VPN Access Path
Master Profile P List #1002 [SIVTMasterProProfList:1002]]" =TOP
Also a hint
that error is appearing while selecting rows from blank LOB datatype column value
of the table, which application team were able to replicate using a sample
pro*C code found on oracle site
Ansidyn1.pc is an Oracle Pro*C program that
implements a dynamic SQL interpreter. Using the ANSI embedded SQL syntax,
Ansidyn1.pc accepts dynamic SQL at the input command prompt (SQL>
We tried to
compile this proc*C code which expects DB Username, Password & db service name
as input
proc
mode=ansi ansidyn1
Pro*C/C++: Release 11.2.0.2.0 - Production
on Sat Mar 23 15:34:31 2013
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
System default option values taken from: /rdbms/11gR2/db_2/precomp/admin/pcscfg.cfg
Error at line 41, column 10 in file ansidyn1.pc
#include "tt_version.h"
.........1
PCC-S-02015, unable to open include file
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
System default option values taken from: /rdbms/11gR2/db_2/precomp/admin/pcscfg.cfg
Error at line 41, column 10 in file ansidyn1.pc
#include "tt_version.h"
.........1
PCC-S-02015, unable to open include file
But got so
many errors, so we need to rely on application team who handed over complied
code file a.out
There were solutions
to this compilation error, but i did not tried as got readily available
compiled file
./a.out CE ***1234 CEPRD
Connecting as CE@CEPRD
Connected
Connected
You are now in a SQL intrepreter shell using
ANSI Dynamic SQL.
Please enter SQL DDL and DML statements, eg
create table, select, commit etc
SQL> select
sraddcmds from sr_cfglet_p;
SRADDCMDS
ANSI sqlstate: 63000:
ORA-03106:
fatal two-task communication protocol error
SQL> exit
In the select
query used above to re-produce the problem sraddcmds was the CLOB column of
table sr_cfglet_p with few empty rows
So the root
cause of the problem was error being thrown for LOB empty column values
Now, Next
question was why is it so ?
With the
help of Oracle support we did get the answer i.e. db_securefile=ALWAYS parameter
ALWAYS
All LOBs
created in the system are created as SecureFile LOBs. If the LOB is not created in an
Automatic Segment Space Managed tablespace, then an error will occur. Any BasicFile LOB storage options are ignored. The SecureFile defaults will be
used for all storage options not specified.
PERMITTED (default)
LOBs are
allowed to be created as SecureFiles.
If the COMPATIBLE initialization parameter
is not set to 11.1 or higher, then LOBs are not treated as SecureFiles.
So action
plan was to make set db_securefile=PERMITTED , but huge setback was just by setting a db
parameter it was not going to resolve the problem completely as LOB tables we
created with ALWAYS options will throw same error if only db parameter
is changed . So post changing db_securefile to PERMITTED it was mandatory to re-create those tables
Changing DB
parameter along with tables re-creation was very much effort consuming task,
as it involves major structure changes at schema as well at code level
To save
these efforts a workaround was suggested to put some data or string "NULL" in
all blank columns along with modifying column to NOT NULL constraint & set the
DB parameter.
So that new
tables will be created under default option of PERMITTED wherein LOBS are
allowed to be created as secure files without restriction of ASSM
Due to some
issues oracle analyst was unable to re-produce the problem in his setup &
he was suspecting one parameter sql92_security=TRUE
which is when set to FALSE could
resolve the issue, but that did not happen
Bug has been
created by MOS as below for this issue which is in progress while I am writing
this article and you may get benefited with the solution/patch once made available
Bug 16536665 - ORA-3106 WHEN CLOB NULL AND SQL92_SECURITY=TRUE AND DB_SECUREFILE=ALWAYS
Just before enlisting solution, here are few snaps showing how we came to conclusion that setting parameter PERMITTED will resolve this issue
Test_1. Tests based on setting db_securefile PERMITTED/ALWAYS
at session level to re-produce the problem
When sample table
created with session level parameter PERMITTED no error observed, but it does
when set to ALWAYS
At DB:
At Apps: test_lob_2 is created when session level ALWAYS value was set for db_securefile
parameter
Test_2. To test the workaround
that blank LOB column row removal or replacing with some value could resolve
the issue
At DB:
At DB:
At App: Yes, it worked. But need to move such
columns to NOT NULL for future
Solution:
1. Set db_securefile to PERMITTED & re-create all LOB data type tables (Workaround)
2. Replace blank row column values of the table by some
value or NULL flag, and enable NOT NULL constraint for those columns
3. Apply Patch
<Patch 14727315> DATABASE PATCH SET UPDATE 11.2.0.2.9 (INCLUDES CPUJAN2013)
since it includes accoding to
11.2.0.2 Patch Set Updates - List of Fixes in each PSU (Doc ID 1340011.1)
the fix for
Bug 11723722ORA-3106 READNG SECUREFILES BY OCI SQLTYPE SQLT_LBI AND SQLT_CHR
Lesson Learned:
As the parameter db_securefile default value is PERMITTED , which got changed to ALWAYS as a part if database hardening as per CSI benchmark 11g
Need to have proper testing & review from application after DB hardening
3. Apply Patch
<Patch 14727315> DATABASE PATCH SET UPDATE 11.2.0.2.9 (INCLUDES CPUJAN2013)
since it includes accoding to
11.2.0.2 Patch Set Updates - List of Fixes in each PSU (Doc ID 1340011.1)
the fix for
Bug 11723722ORA-3106 READNG SECUREFILES BY OCI SQLTYPE SQLT_LBI AND SQLT_CHR
Lesson Learned:
As the parameter db_securefile default value is PERMITTED , which got changed to ALWAYS as a part if database hardening as per CSI benchmark 11g
Need to have proper testing & review from application after DB hardening
Excellent post and wonderful blog, I really like this type of interesting articles keep it up.
ReplyDeleteexcellent...
ReplyDeleteexcelent post , I have similar issues but without LOB column, table was partitioned and advanced compression on db 12.2.0.1.0 !
ReplyDeleteThis is Good information about this topic..I like it.. wordpress database fix ..Keep it Up!
ReplyDelete