Blog dedicated to Oracle Applications (E-Business Suite) Technology; covers Apps Architecture, Administration and third party bolt-ons to Apps

Monday, March 31, 2008

ORA-03136 in alert log

A lot of ORA-3136 started appearing in alert log after the latest 10g upgrade:

Thu Mar 20 02:52:16 2008
WARNING: inbound connection timed out (ORA-3136)
Thu Mar 20 02:58:00 2008

Metalink Note 345197.1, suggests setting the value of INBOUND_CONNECT_TIMEOUT_ TO 120 or higher in listener.ora and in sqlnet.ora to prevent this.

Symptoms

The Oracle Net 10g parameters SQLNET.INBOUND_CONNECT_TIMEOUT and INBOUND_CONNECT_TIMEOUT_listenername default to 0 (indefinite) in 10.1. To address Denial of Service (DOS) issues, the parameters were set to have a default of 60 (seconds) in Oracle 10.2.

If applications are longer than 60 secs to authenticate with the Oracle database, the errors occur.

The following may be seen in the alert log: WARNING: inbound connection timed out (ORA-3136)

SQLNET.INBOUND_CONNECT_TIMEOUT is set to a value in seconds and determines how long a client has to provide the necessary authentication information to a database.

INBOUND_CONNECT_TIMEOUT_listenername is set to a value in seconds and determines how long a client has to complete its connect request to the listener after the network connection has been established.

To protect both the listener and the database server, Oracle Corporation recommends setting INBOUND_CONNECT_TIMEOUT_listenername in combination with the SQLNET.INBOUND_CONNECT_TIMEOUT parameter.

Cause

Whenever default timeouts are assigned to a parameter, there may be cases where this default does not work well with a particular application. However, some type of timeout on the connection establishment is necessary to combat Denial of Service attacks on the database. In this case, SQLNET.INBOUND_CONNECT__TIMEOUT and INBOUND_CONNECT_TIMEOUT_listenername were given default values of 60 seconds in Oracle 10.2. It is these timeout values that can cause the errors described in this note.


Also note that it is possilbe the reason the database is slow to authenticate, may be due to an overloaded Oracle database or node.

Solution

Set the parameters SQLNET.INBOUND_CONNECT_TIMEOUT and INBOUND_CONNECT_TIMEOUT_listenername to 0 (indefinite) or to an approprate value for the application yet still combat DOS attacks (120 for example).

These parameters are set on the SERVER side:
listener.ora: INBOUND_CONNECT_TIMEOUT_listenername
sqlnet.ora: SQLNET.INBOUND_CONNECT_TIMEOUT

Further tuning of these parameters may be needed is the problem persists.

ORA-01986: OPTIMIZER_GOAL is obsolete

I got a call for issue with a form. After the 10g upgrade, a standard form Launch Contracts which takes 1 second to launch was taking 30 minutes to open. A trace on this form showed

Misses in library cache during parse: 1
Optimizer goal: ALL_ROWS
Parsing user id: 173 (APPS) (recursive depth: 2)
unable to set optimizer goal
ORA-01986: OPTIMIZER_GOAL is obsolete

parse error offset: 33

Bug 5854184 describes this issue and advises:

5386974.992 KAT Alt 100 05-May-2006 11.5.10 LAUNCH CONTRACTS IS VERY SLOW
Fix: Check if you have index WF_NOTIFICATIONS_N1 for table WF_NOTIFICATIONS
If you have this index please rebuild the index, analyze the table and the index. This will allow the optimizer to choose the best plan to execute the
Query.

Rebuilding the index didn’t help. Analyzing the table and index did not help. Gathering statistics with DBMS_STATS package on table did the trick:

SQL> exec DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => NULL, TABNAME =>'WF_NOTIFICATIONS');

PL/SQL procedure successfully completed.

SQL> exit

Once I ran this, the form started opening in less than 1 second.

Friday, March 28, 2008

make: Fatal error: Don't know how to make target '$FND_TOP/lib/wfresgcp.o'

While applying patches on a cloned instance the relink of WFRESGEN failed with following error on the web tier:

make -f $APPL_TOP/admin/$TWO_TASK/out/link_fnd_24844.mk $FND_TOP/bin/WFRESGEN

Starting link of fnd executable 'WFRESGEN' on Fri Mar 28 16:21:26 EDT 2008
make: Fatal error: Don't know how to make target `$FND_TOP/lib/wfresgcp.o'
Done with link of fnd executable 'WFRESGEN' on Fri Mar 28 16:21:26 EDT 2008

Relink of module "WFRESGEN" failed.

There was no file called $FND_TOP/lib/wfresgcp.o. To fix this I copied this file from the concurrent tier to the web tier. Relink of WFRESGEN completed successfully after this.

Thursday, March 27, 2008

Interesting post on max_dump_file_size

From Oracle Database reference:

MAX_DUMP_FILE_SIZE

Property Description
Parameter type String
Syntax MAX_DUMP_FILE_SIZE = { integer [K | M] | UNLIMITED }
Default value UNLIMITED
Modifiable ALTER SESSION, ALTER SYSTEM
Range of values 0 to unlimited, or UNLIMITED
Basic No

MAX_DUMP_FILE_SIZE specifies the maximum size of trace files (excluding the alert file). Change this limit if you are concerned that trace files may use too much space.

  • A numerical value for MAX_DUMP_FILE_SIZE specifies the maximum size in operating system blocks.

  • A number followed by a K or M suffix specifies the file size in kilobytes or megabytes.

  • The special value string UNLIMITED means that there is no upper limit on trace file size. Thus, dump files can be as large as the operating system permits.

I found a very interesting post on Doug's Oracle blog about issues faced while setting this parameter.

Wednesday, March 26, 2008

OS Watcher (OSW) and Lite Onboard Monitor (LTOM)

While searching for something else, I came across Metalink Note 370936.1 (Previous Announcements from New in the Knowledge Base). What captured my attention was this:

December 12, 2007 - Oracle's Center of Expertise Releases New Documents

The following four new white papers have just been released by Oracle's Center of Expertise:

I checked the second and third white papers, both of which are written by Roger Snyde from Oracle Support's Center of Expertise. These white papers describe a tool called OSW (OS Watcher) . Oracle Support’s Center of Expertise has developed OSWatcher, a script-based tool for Unix and Linux systems that runs and archives output from a number of operating system monitoring utilities, such as vmstat, top, iostat, mpstat and ps.

OSWatcher is available from Metalink as note 301137.1. It is a shell script tool and will run on Unix and Linux servers. It operates as a background process and runs the native operating system utilities at user-settable intervals, by default 30 seconds, and retains an archive of the output for a user settable period, defaulting to 48 hours. This value may be increased in order to retain more information when evaluating performance, and to capture baseline information during important cycle-end periods.

Oracle recommends customers download and install OSWatcher on all production and test servers that need to be monitored.

While going through 301137.1, I found the mention of another tool called LTOM(The embedded Lite Onboard Monitor): To collect database metrics in addition to OS metrics consider running LTOM. The Lite Onboard Monitor (LTOM) is a java program designed as a real-time diagnostic platform for deployment to a customer site. LTOM differs from other support tools, as it is proactive rather than reactive. LTOM provides real-time automatic problem detection and data collection. LTOM runs on the customer's UNIX server, is tightly integrated with the host operating system and provides an integrated solution for detecting and collecting trace files for system performance issues. The ability to detect problems and collect data in real-time will hopefully reduce the amount of time it takes to solve problems and reduce customer downtime.

Both OSW and LTOM now provide a graphing utility to graph the data collected. This greatly reduces the need to manually inspect all the output files.

Sample graph from OSW:

Sample graphs from LTOM:




Tuesday, March 25, 2008

E-Business Suite Diagnostics XML Files

The latest E-Biz Diagnostic patch is 6459585 (E-Business Suite Diagnostic Pack February 2008). I came across metalink note 357745.1:

This document contains the most recent versions of the xml files used by diagnostics tools within Oracle Diagnostics. XML files are used by Oracle Support to hold data that requires being regularly updated such as maintainance packs, family packs, patches and other supporting data. These XML files are used by diagnostics tools executable in Oracle Diagnostics to verify for example if a particular family pack is installed.

In order for diagnostics tools to check your environment against the latest information available from Oracle, these xml files are updated and released on an ongoing basis. This means that the latest available xml files will always be available from this note while the latest E-Business Suite diagnostics patch will include the latest available xml files at the time of the patch build.

So, it is a good practice to visit this metalink note, if you are applying the latest diagnostic pack. As a post step you can download the latest xml files from this note and unzip them in your $IZU_TOP/html directory.

Monday, March 24, 2008

Xlib: Invalid MIT-MAGIC-COOKIE-1 key

When I executed xhost + command on a Solaris box as root. I got the following error:

root@apps11i # DISPLAY=apps11i:1
root@apps11i # export DISPLAY
root@apps11i # xhost +
Xlib: connection to "apps11i:1.0" refused by server
Xlib: Invalid MIT-MAGIC-COOKIE-1 key
xhost: unable to open display "apps11i:1"

This error is the correct behavior because xhost + command can be executed by that user who has started the X session. Solaris is protecting a particular user's processes by disallowing other users to control it. In my case it was a vnc session started through a different user. Once I logged in as that user

root@apps11i # su - applmgr
applmgr@apps11i # xhost +
access control disabled, clients can connect from any host

Thursday, March 20, 2008

adsrtal.sh gives error: Cannot Complete Applications Logon

Anand pinged me this error:

adapcctl.sh start

adapcctl.sh version 115.54

Cannot complete applications logon. You may have entered an invalid applications password, or there may have been a database connect error.
Apache Web Server Listener is not running.
Starting Apache Web Server Listener (dedicated HTTP) ...
Cannot complete applications logon. You may have entered an invalid applications password, or there may have been a database connect error.
Cannot complete applications logon. You may have entered an invalid applications password, or there may have been a database connect error.

Reason was that GUEST user was end dated. Once that was updated as null, apache started fine.

Metalink Note:363662.1 has the answer:


Symptoms

On 11.5.X
When attempting to start services using adstrtal.sh
the following error occurs.

ERROR
adapcctl.sh version 115.XX


Cannot complete applications logon. You may have entered an invalid applications password, or
there may have been a database connect error.
Stopping Apache Web Server Listener (dedicated HTTP) ...
Cannot complete applications logon. You may have entered an invalid applications password, or
there may have been a database connect error.

adapcctl.sh: exiting with status 0

.end std out.

This error can occur with the following scripts:
adapcctl.sh
adfrmctl.sh
adrepctl.sh
adfmcctl.sh
adfmsctl.sh




STEPS
The issue can be reproduced at will with the following steps:
Running adstrtal.sh or any of the other scripts listed above





Cause

Guest password was changed or end dated




Solution

To implement the solution, please execute the following steps:

1. Use the System Administrator Responsibility with the navigation of Security/User/Define and check to see if the GUEST user has been end dated. If the user has been end dated, clear the ending date field so the GUEST user account is useable again. If the user has not been end dated, then proceed to step 2.

2. Check the system profile option of "Guest User Password" and confirm that it is correct by attempting to login to applications using that username and password.

3. Confirm that the guest password in the '.dbc' files within the $FND_TOP/secure directory matches both the password specified by the system profile option and verified by step 2 above.

To reiterate:
The Guest username and password combination should match in all three locations:

--> .dbc file
--> FND_USER for Guest
--> system profile option %GUEST%

Wednesday, March 19, 2008

FND_INSTALL_PROCESSES correct values for STATUS, CONTROL_CODE

FND_INSTALL_PROCESSES is the table which stores the details of all the adworkers current state. The two columns which store the current state of each worker are:

CONTROL_CODE and STATUS

Valid values for CONTROL_CODE
W - Waiting

Vaid values for STATUS

F - Failed
R - Running
W - Waiting
C - Completed

Will update this post as I dig more.

ksdpec: called for event 13740 prior to event group initialization

Anand pinged me today to ask why the following lines were appearing in alert log:

SYS auditing is disabled
ksdpec: called for event 13740 prior to event group initialization

Answer is given in Metalink Note 342405.1

"Event 13740 is an event id used when tracing ASH. It is detected when mapping the SGA during startup, so this message is only informational. There is no need to worry. There is no way
currently in the code of disabling this message"

ASH is Active Scripting History which is a new feature in 10g. More details about this feature are available in metalink note 243132.1:

Performance tuning and problem diagnosis are the two most challenging and important management tasksthat any database administrator performs. In line with the primary drive of the server manageability effort, the Autometic Database Diagnostic Monitor (ADDM) attempts to make, performing these two tasks, a lot simpler and easier. ADDM employs an iterative top-down approach and drives a rule-based expert system, to identify bottlenecks in a system and suggest relevant recommendations to tackle them.

ASH acquires the information it requires to sample the active session’s activity from the database kernel’s session state objects. The quantity of information sampled by ASH could be quite voluminous, and therefore, ASH maintains a fixed sized circular buffer in the database System Global Area (SGA). The fixed sized circular buffer will be allocated during database start-up time. Since the information collected by the ASH infrastructure compliments the information present in SWRF snapshots, and can be used for drilldown purposes during problem diagnosis or performance tuning, the ASH data is also periodically flushed to disk.

The flushing and purging policies of ASH, including the way ASH respects SWRF baselines, are
completely tied with SWRF policies. Still, flushing the entire content of ASH on to disk could be too populous to be feasible, and therefore, only one out of every ten active session samples will be flushed todisk.

In addition to ADDM using the ASH to achieve its objectives, the ASH contents will also be displayed in the Oracle Enterprise Manager (EM) / Mozart [OEM-SWRF] performance screen. The graph that summarizes the ASH contents in the EM performance screen will be a stacked graph showing the distribution of the elapsed database time onto various wait times and CPU time, during every minute.

The ASH infrastructure will provide a single V$ view (V$ACTIVE_SESSION_HISTORY), to view the in-memory contents of ASH. The ASH infrastructure will also provide a way to dump the in-memory contents of its circular buffer onto an external file in a human readable format. The ASH dump file can be transported to another system, imported onto a user table, and analyzed using the ADDM in that system.

ASH Memory Size
==============
Size of ASH Circular Buffer = Max [Min [ #CPUs * 2 MB, 5% of Shared Pool Size, 30MB ], 1MB ]

Catagorising the Active Session
============================
i. Present inside a user call
ii. Not a recursive session
iii. Not waiting for the ‘IDLE’ wait-event
iv. If it is a background process, not waiting for its usual timer-event
v. If it is a parallel slave, not waiting for the PX_IDLE wait event.

So will not see any info if a process is waitting for "SQL*Net message from client".

Lets analyze one of the case with ASH
==============================

Let start with ONLINE analysis
========================
I have open two session. first one (SID 16) is running the DML like

SQL> delete test where rowid='AAAKB9AAEAAAAAiAAA';

From second session (SID 15) run the same DML, and it is obvious that second
session will wait for first session to commit.

Lets check the info in V$ACTIVE_SESSION_HISTORY.
------------------------------------------------
Run the following script.
SQL> select SESSION_ID,NAME,P1,P2,P3,WAIT_TIME,CURRENT_OBJ#,CURRENT_FILE#,CURRENT_BLOCK#
from v$active_session_history ash, v$event_name enm
where ash.event#=enm.event#
and SESSION_ID=&SID and SAMPLE_TIME>=(sysdate-&minute/(24*60));

Input is

Enter value for sid: 15
Enter value for minute: 1 /* How many minutes activity you want to see */

output is 59 lines as it is wiatting more than 1 minute more than 1 minute

SESSION_ID NAME P1 P2 P3 WAIT_TIME CURRENT_OBJ# CURRENT_FILE# CURRENT_BLOCK#
---------- ------------------------------ ---------- ---------- ---------- ---------- ------------ ------------- --------------
15 enq: TX - row lock contention 1415053318 589825 143 0 41085 4 34
15 enq: TX - row lock contention 1415053318 589825 143 0 41085 4 34
15 enq: TX - row lock contention 1415053318 589825 143 0 41085 4 34
continue .............
15 enq: TX - row lock contention 1415053318 589825 143 0 41085 4 34
15 enq: TX - row lock contention 1415053318 589825 143 0 41085 4 34
15 enq: TX - row lock contention 1415053318 589825 143 0 41085 4 34

So you have object details with problem info.


Lets do the OFFLINE analysis of ASH
==============================
So if your are not able to find the problem online, you can dump the ASH to a trace.

Command would be like below: where level means minute. lets dump for 10 minutes history

1. SQL> alter session set events 'immediate trace name ashdump level 10';
or
2. SQL> alter system set events 'immediate trace name ashdump level 10';
or
3. SQL> oradebug setmypid
SQL> oradebug dump ashdump 10;

So you will get the trace file in udump.

Output of trace would be like

<<>
2594829169,1,161390,"07-18-2003 16:05:21.098717000",13,1,0,"",65535,0,0,2,0,0,0,4294967295,0,0,2,35,100,0,0,10
05855,0,"oracle@usunrat21 (MMNL)","","",""

Oracle has provide an utility under $ORACLE_HOME/rdbms/demo (Location may change), by which you can
upload the ASH trace dump to a database table and do the analysis.

Script Details
============
1. "ashdump_loader" -> Main script to run with one argument. Argument is the the name of tracefile.
2. "ashdump_table.sql" -> It would be called by the main script to create the table
called SYSTEM.ACTIVE_SESSION_HISTORY_DUMP.
3. "ashdump_sqlldr.ctl" -> It is the third script called by main script to load the trace into
table called SYSTEM.ACTIVE_SESSION_HISTORY_DUMP.

You must use the same oracle version to compare the OFFLINE trace with the other view.

Tuesday, March 18, 2008

ORA-7445 in alert log when you open Project Quick Entry form

In one of the 11.5.10.2 instances which was upgraded to 10.2.0.3 Database, the following behavior was observed:

Whenever Project Quick Entry form was accessed, it gave the error:

FRM-40735: ON-ERROR trigger raised unhandled exception ORA-01001.

APP-FND-01564: error 1000 in audfrm

Cause: audfm failed due to ORA-01000 maximum open cursors exceeded
The SQL statement being executed at the time of the error was &SQLSTMT and was executed from the file &ERRFILE.

In alert log the following ORA-7445 was logged:

ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [drurnew()+184] [SIGSEGV] [Address not mapped to object] [0x000000040] [] []
Current SQL statement for this session:
INSERT INTO PA_PROJECT_CTX_SEARCH (PROJECT_ID, CTX_DESCRIPTION, PROGRAM_REQUEST_ID, PROGRAM_APPLICATION_ID, PROGRAM_ID, PROGRAM_UPDATE_DATE, CREATED
_BY, CREATION_DATE, LAST_UPDATED_BY, LAST_UPDATE_DATE, LAST_UPDATE_LOGIN) VALUES
(:B2 , :B1 , NULL, NULL, NULL, NULL, FND_GLOBAL.USER_ID, SYSDATE, FND_GLOBAL.US
ER_ID, SYSDATE, FND_GLOBAL.LOGIN_ID)
----- PL/SQL Call Stack -----
object line object
handle number name
3ec258df8 5 CTXSYS.TEXTINDEXMETHODS
3ec258df8 625 CTXSYS.TEXTINDEXMETHODS
----- Call Stack Trace -----

We logged an SR for this issue and got this response:

CAUSE DETERMINATION
===================
This is unpublished Bug 5587976.

The error is due to a problem with the intermedia functionality.

CAUSE JUSTIFICATION
===================
BUG 5587976 : APPSST: ORA-07445: EXCEPTION ENCOUNTERED: CORE DUMP [DRURNEW()+129] [SIGSEGV]

PROPOSED SOLUTION(S)
====================
Download and install Patch 5587976

Bug has been fixed in 10.2.0.4 patchset

PROPOSED SOLUTION JUSTIFICATION(S)
==================================
SURE 461767.1 : ORA-7445 [Drurnew()] Encountered During Applications Upgrade

SOLUTION / ACTION PLAN
======================
-- To implement the solution, please execute the following steps::
Download and install Patch 5587976

Even after applying this patch the issue was occuring. The DBA team ran gather schema statistics on PA schema from backend.

execute fnd_stats.gather_schema_stats('PA');

After this the issue was resolved. This issue was not occuring in dev/test instances which did not have this patch. It is possible that the error was coming solely due to the absence of statistics.

Monday, March 17, 2008

Blue Screen of Death IRQL_NOT_LESS_OR_EQUAL

For the past few days, I was getting the blue screen of death whenever I shutdown my XP laptop. The error was IRQL_NOT_LESS_OR_EQUAL. After googling and trying to decipher it, I found that the most probable reason is a bad driver. Some more searches later, I found that it is caused by some versions of Logitech Mouseware. So I decided to take a chance and removed Logitech Mouseware from my laptop. After a reboot, the blue screen stopped coming whenever I shutdown the laptop. The MS Knowledge base page which describes this issue is: http://support.microsoft.com/kb/810980/en-us

Friday, March 14, 2008

java.sql.SQLException: ORA-06502: PL/SQL: numeric or value error: character to number conversion error

We got this error in Apache jserv.log file:

[Oct 31, 2007 10:19:12 AMEDT]: 1193840352057:Thread[Thread-1045,10,main]:-1:-1:ebiz11i.justanexample.com:192.168.2.32:8168:16680:UNEXPECTED:[fnd.framework.OAExcept
ion]:Application: FND, Message Name: FND_GENERIC_MESSAGE. Tokens: MESSAGE = java
.sql.SQLException: ORA-20001: Oracle error -6502: ORA-06502: PL/SQL: numeric or
value error: associative array shape is not consistent with session parameters

has been detected in fnd_global.put(PERMISSION_CODE,FND_PERMIT_0001).

ORA-06512: at "APPS.APP_EXCEPTION", line 72
ORA-06512: at "APPS.FND_GLOBAL", line 240
ORA-06512: at "APPS.FND_GLOBAL", line 428
ORA-06512: at "APPS.FND_GLOBAL", line 447
ORA-06512: at "APPS.FND_GLOBAL", line 2376
ORA-06512: at "APPS.FND_SESSION_MANAGEMENT", line 821
ORA-06512: at "APPS.FND_SESSION_MANAGEMENT", line 1127
ORA-06512: at "APPS.FND_AOLJ_UTIL", line 421

ORA-06512: at line 1

We applied DB patch 5890966.

Oracle advised us to follow the steps given in unpublished metalink note 388315.1


Symptoms

When user tries to log in to Applications this does not succeed, but raises the error

You have encountered an unexpected Exception. Please contact system administrator for assistance.

The stack trace reported when selecting Click here for exception details (*) and log entries in the jserv.log report the following error:


java.sql.SQLException: ORA-06502: PL/SQL: numeric or value error: character to number conversion error

ORA-06512: at "APPS.FND_SESSION_MANAGEMENT", line 1800
ORA-06512: at "APPS.FND_SESSION_MANAGEMENT", line 302
ORA-06512: at "APPS.FND_SESSION_MANAGEMENT", line 455
ORA-06512: at line 1

(*) This link is only shown when FND: Diagnostics profile is set to Yes for this user.

Cause
This error is raised when the profile ICX_SESSION_TIMEOUT [ICX:Session Timeout] has a decimal value (e.g. 30.33333333) instead of an integer value (e.g. 30). When an user has languages setting which use ',' instead of '.' as decimal separator (like FR, NL, DE) the STRING to NUMBER conversion for this profile option value fails with the ORA-6502 PL/SQL: numeric or value error: character to number conversion error seen here.

The value for ICX_SESSION_TIMEOUT profile is set when running Autoconfig. In the script afwebprf.sql the following calculation is done to obtain the value.

(%s_sesstimeout%)/(60*1000)

This is to make the ICX_SESSION_TIMEOUT equal to jserv session timeout.

The default value for %s_sesstimeout% is 1800000 which gives a result of '30' which is fine. However when a different value like 2000000 is used here this calculation will result in '33.33333333' which is not correct. The value should be truncated so an integer value is saved here. This behavior has been reported in bug 5478368.

Solution
The error can be resolved by making sure that profile ICX_SESSION_TIMEOUT [ICX:Session Timeout] has an integer value. When it's not possible to run Autoconfig the following work-around can be implemented.

Open the maintenance form for the profile options and retrieve the ICX:Session Timeout profile

Remove the decimal part from the value, like '33.33333' becomes '33'

Save the changes

Bounce Apache server

However since the profile option is calculated during an Autoconfig run the same error will reappear when Autoconfig is run.

The permanent solution is the following

Open the context editor to maintain the .xml file storing the values used by Autoconfig

Change the value for %s_sesstimeout% to the default 1800000 or a value for which the calculation also returns an integer value (like 1860000 = 31, 1920000 = 32, etc)

Run Autoconfig to have this change implemented

Bounce Apache server

Thursday, March 13, 2008

FRM-407530 ORA-06512 after implementing shared APPL_TOP

Recently an instance was upgraded from 11.5.8 to 11.5.10 and shared APPL_TOP was implemented. After the shared appl_top steps, forms wouldn't launch. Every time a form was accessed it would throw up the FRM-40735 ORA-06512 error. If the forms were manually compiled, they would not compile and throw up errors. After a lot of research, it was found that all the forms were calling a CUSTOM binary which was not copied by the merge process of shared APPL_TOP as it was not a part of core Apps. Once this binary was copied, all the forms could be compiled without issues.

Wednesday, March 12, 2008

ORA-7445 in alert log, due to FRM-40735 ORA-01001 in Projects Quick Entry form

Yesterday, I discussed with Akhilesh about an issue being faced in one of the production instances, which has started logging ORA-7445 errors whenever projects quick entry form is accessed which throws FRM-40735: ON-ERROR trigger raised unhandled error: ORA-01001. We analyzed the trace files generated by ORA-7445 :

ORA-07445: exception encountered: core dump [drurnew()+184] [SIGSEGV] [Address n
ot mapped to object] [0x000000040] [] []
Current SQL statement for this session:
INSERT INTO PA_PROJECT_CTX_SEARCH (PROJECT_ID, CTX_DESCRIPTION, PROGRAM_REQUEST_
ID, PROGRAM_APPLICATION_ID, PROGRAM_ID, PROGRAM_UPDATE_DATE, CREATED_BY, CREATIO
N_DATE, LAST_UPDATED_BY, LAST_UPDATE_DATE, LAST_UPDATE_LOGIN) VALUES (:B2 , :B1
, NULL, NULL, NULL, NULL, FND_GLOBAL.USER_ID, SYSDATE, FND_GLOBAL.USER_ID, SYSDA
TE, FND_GLOBAL.LOGIN_ID)
----- PL/SQL Call Stack -----
object line object
handle number name
3ec258df8 5 CTXSYS.TEXTINDEXMETHODS
3ec258df8 625 CTXSYS.TEXTINDEXMETHODS
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------

A query of ORA-7445/ORA-600 lookup utility revealed that this is a known bug # 5587976. This bug is triggered by a bad return from some NLS functions. This fix signals such
problems as an NLS error rather than causing the process to dump.

However we are still exploring why this error is not occuring in Production support instances and have asked Oracle to explain the bug in more detail.

Tuesday, March 11, 2008

The requested URL /oa_servlets/Appslogin was not found on this server.

While browsing through open TARs, I found one raised by the DBAs for an issue after upgrading an 11.5.8 instance to 11.5.10.2 and then implementing shared application file system. Here's the issue:

Go to: http://clone1.justanexample.com:8000
Click on the link: E-Business Home Page.
You get this error:

Not Found

The requested URL /oa_servlets/Appslogin was not found on this server.

HTTP-404 or not found error.

I had solved a similar problem which I have described in a previous post. So I worked on the same lines and found that OAM worked fine without issues. I then checked the value of s_login_page by giving the command:

grep s_login_page $CONTEXT_FILE

It was correctly set to
http://clone1.justanexample.com:8000/oa_servlets/Appslogin

I was wondering what else could be wrong, when I looked again. Appslogin was spelled with a small l. Appslogin should be spelt AppsLogin with capital L. I immediately tested for the URL by replacing small l with capital L:

http://clone1.justanexample.com:8000/oa_servlets/AppsLogin

It worked fine. So solving the problem would require changing the value of s_login_page so that it contains AppsLogin instead of Appslogin and then running autoconfig. A very simple problem, but difficult to trace, unless you train your mind to observe.

Monday, March 10, 2008

Minimize usage of utl_file_dir

I found a very good metalink note 206272.1 about utl_file package. They have recommended using CREATE DIRECTORY feature instead of using utl_file_dir directories which needs the DB to be bounced every time a new directory is added. Here's a copy paste from the note:

PURPOSE
Introduces the Oracle 9i Release 2 way of using the UTL_FILE package.

SCOPE & APPLICATION

This article is intended for programmers who are familiar with the usage of the
UTL_FILE package.

NEW METHOD OF USING UTL_FILE PACKAGE IN 9.2 9iR2

In the past(i.e. pre 9iR2), accessible directories for the UTL_FILE functions were specified in the initialization file using the UTL_FILE_DIR parameter. However ,UTL_FILE_DIR access is not recommended in 9iR2. It is recommended that you use the CREATE DIRECTORY feature, which replaces UTL_FILE_DIR. Directory objects offer more flexibility and granular control to the UTL_FILE application administrator, can be maintained dynamically (that is, without shutting down the database), and are consistent with other Oracle tools.CREATE DIRECTORY
privilege is granted only to SYS and SYSTEM by default.

EXAMPLE
-------
Step 1) Connect as sys

Step 2) Create a directory:

SQL> CREATE DIRECTORY log_dir AS 'C:\NEW';

Directory created.

Step 3) Grant read and write on the directory

SQL> grant read on directory log_dir to public;
SQL> grant write on directory log_dir to public;

Grant succeeded.

Step 4) Create function as a NON-SYS USER

SQL>CONNECT SCOTT/

create or replace function f1 return number is
v_output_file1 utl_file.file_type;
begin
v_output_file1 := utl_file.fopen('LOG_DIR', 'NEW.txt', 'a');
utl_file .put_line(v_output_file1, 'NATURE and Beauty');
utl_file.fclose_all;
return 1;
end;
/

Step 6) Create a bind variable

SQL>var x number

Step 7) Execute the function

SQL> exec :x:=f1;

PL/SQL procedure successfully completed.

NOTE
Care must be taken to use uppercase for the DIRECTORY object in the call to
utl_file.fopen, else you may encounter the ORA-29280: invalid directory path
error.

DBA_DIRECTORIES is the dictionary table which can give you details about all the directory objects created till now.

select directory_name,directory_path
from dba_directories;

The above query will give you the details of all directory object names and the paths associated with them.

There is no command called ALTER DIRECTORY yet. To modify a directory path you need to give the command:

CREATE OR REPLACE DIRECTORY directory_name as 'path'

Post ATG_PF.H RUP6 patches

Following is the list of post ATG_PF.H RUP6 patches which I recommend:


Post ATG_PF.H RUP6 patches

Size in KB

6168660

AOL Patch

694

6318531

AOL Patch

285

6145463

AOL Patch

275

4619025

Patch to fix invalid

22

3865683

AD patch

283

5194357

XML Gateway patch

1242

6496781

AOL Patch

288

6265836

AOL Patch

40888

6620452

AOL Patch

46048

6412999

Workflow

296


The list is based on metalink search for keywords RUP7, post RUP6 and metalink note 459353.1

Friday, March 7, 2008

Apps binaries coring after cloning instance on Solaris 8 to Solaris 10

After cloning an instance which is on Solaris 8 to an instance which is on Solaris 10, various binaries are coredumping and creating huge core files in /var/crash/cores:

# ls -lh
total 22662940
-rw------- 1 root root 11M Feb 29 17:53 1204325626-ARGLTP-10853-66128-54055-devserver-sun4u
-rw------- 1 root root 9.9M Feb 29 17:54 1204325682-ALECDC-14421-66128-54055-devserver-sun4u
-rw------- 1 root root 9.9M Feb 29 18:16 1204326978-ALECDC-22443-66128-54055-devserver-sun4u
-rw------- 1 root root 9.9M Feb 29 18:37 1204328275-ALECDC-1648-66128-54055-devserver-sun4u
-rw------- 1 root root 11M Feb 29 18:53 1204329219-ARGLTP-24665-66128-54055-devserver-sun4u
-rw------- 1 root root 19M Feb 29 18:55 1204329356-PAVDVC-2528-66128-54055-devserver-sun4u
-rw------- 1 root root 9.9M Feb 29 18:59 1204329554-ALECDC-16012-66128-54055-devserver-sun4u
-rw------- 1 root root 14M Feb 29 19:04 1204329851-PARGDR-11673-66128-54055-devserver-sun4u
-rw------- 1 root root 9.9M Feb 29 19:20 1204330841-ALECDC-28963-66128-54055-devserver-sun4u
-rw------- 1 root root 9.9M Feb 29 19:42 1204332128-ALECDC-11910-66128-54055-devserver-sun4u
-rw------- 1 root root 11M Feb 29 19:53 1204332829-ARGLTP-20966-66128-54055-devserver-sun4u
-rw------- 1 root root 9.9M Feb 29 20:03 1204333423-ALECDC-24081-66128-54055-devserver-sun4u
-rw------- 1 root root 9.9M Feb 29 20:24 1204334694-ALECDC-4385-66128-54055-devserver-sun4u
-rw------- 1 root root 9.9M Feb 29 20:46 1204335982-ALECDC-13323-66128-54055-devserver-sun4u
-rw------- 1 root root 11M Feb 29 20:54 1204336441-ARGLTP-6842-66128-54055-devserver-sun4u
-rw------- 1 root root 9.9M Feb 29 21:07 1204337250-ALECDC-20612-66128-54055-devserver-sun4u
-rw------- 1 root root 35M Feb 29 21:21 1204338103-MSCSDW-10817-63904-63988-devserver-sun4u
-rw------- 1 root root 35M Feb 29 21:22 1204338132-MSCSDW-10747-63904-63988-devserver-sun4u
-rw------- 1 root root 35M Feb 29 21:22 1204338177-MSCSLD.new-10840-63904-63988-devserver-sun4u
-rw------- 1 root root 35M Feb 29 21:24 1204338281-MSCSLD.new-21677-63904-63988-devserver-sun4u
-rw------- 1 root root 35M Feb 29 21:24 1204338285-MSCSLD.new-22151-63904-63988-devserver-sun4u
-rw------- 1 root root 35M Feb 29 21:27 1204338427-MSCSLD.new-27500-63904-63988-devserver-sun4u
-rw------- 1 root root 35M Feb 29 21:27 1204338428-MSCSLD.new-28456-63904-63988-devserver-sun4u
-rw------- 1 root root 35M Feb 29 21:28 1204338491-MSCSDW-10741-63904-63988-devserver-sun4u
-rw------- 1 root root 9.9M Feb 29 21:29 1204338539-ALECDC-4860-66128-54055-devserver-sun4u
-rw------- 1 root root 35M Feb 29 21:31 1204338662-MSCSDW-10748-63904-63988-devserver-sun4u
-rw------- 1 root root 9.9M Feb 29 21:50 1204339826-ALECDC-19716-66128-54055-devserver-sun4u
-rw------- 1 root root 11M Feb 29 21:53 1204340029-ARGLTP-236-66128-54055-devserver-sun4u
-rw------- 1 root root 35M Feb 29 22:00 1204340414-MSCSLD.new-22508-63904-63988-devserver-sun4u

I tried executing these binaries from command line to see if they crashed, but they worked fine:

# PASGLT

Usage: Example:
PASGLT PASGLT
applsys/fnd
0 0
Y Y
07-JUN-91
10001
7
18:56:00
12-AUG-91
Y
[] Y

Note:
Batch name if blank ("") means all elegible batches.
All reschedule information can be blank.

ORACLE error 1012 in FDPCLS

Cause: FDPCLS failed due to ORA-01012: not logged on.

The SQL statement being executed at the time of the error was: and was executed from the file .


# $ALR_TOP/bin/ALECDC
You have not provided the required arguments for this program.
Usage: ALECDC 0 Y
appst117@tsgsd1007 # $AR_TOP/bin/ARGLTP
Usage: /ebiz11i/erpapp/appl/ar/11.5.0/bin/ARGLTP oracleid/password 0 Y
start_date(Canonical) post_thru_date(Canonical) gl_posted_date(Canonical) Canonical=YYYY/MM/DD HH24:MI:SS
report_only(Y|N) summary(Y|N) run_je_imp(Y|N) day_interval(POSITIVE INTEGER)
posting_control_id [debug_flag(Y|N|I)]
ARGLTP: Too few arguments.
Program exited with error.
ORACLE error 1012 in FDPCLS

Cause: FDPCLS failed due to ORA-01012: not logged on.

The SQL statement being executed at the time of the error was: and was executed from the file .
appst117@tsgsd1007 # $PA_TOP/bin/PAVDVC
Wrong number of parameters (0) received.

Usage: Example:
applsys/fnd
0 0
Y Y
4323
12-MAR-91
7
18:56:00
12-AUG-91
Y
[] Y

pavini() returned error
/ebiz11i/erpapp/appl/pa/11.5.0/bin/PAVDVC /ebiz11i/erpapp/appl/pa/11.5.0/bin/PAVDVC
Program exited with error.
ORACLE error 1012 in FDPCLS

Cause: FDPCLS failed due to ORA-01012: not logged on.

The SQL statement being executed at the time of the error was: and was executed from the file .

# $PA_TOP/bin/PARGDR
Failed to get arguments
Program exited with error.
ORACLE error 1012 in FDPCLS

Cause: FDPCLS failed due to ORA-01012: not logged on.

The SQL statement being executed at the time of the error was: and was executed from the file .

All the coredumping binaries belong to a single instance. I have suggested to the DBAs to relink all binaries through adadmin and see if the cores stop getting created. 

Metalink Note 456008.1 has information about relinkning ALECDC:

adrelink.sh force=y ranlib=y link_debug='Y' "alr ALECDC"

Metalink Note 467886.1 has information about relinking XDPCTRLS:

adrelink.sh force=y ranlib=y "xdp XDPCTRLS"

Top command CPU column, sum of all rows exceeds 100%

Recently during a performance call, someone asked me why the cpu column in a top command output doesn't total to 100%:

load averages: 16.9, 17.9, 19.3; up 63+02:24:38 18:19:12
2616 processes: 2587 sleeping, 10 zombie, 4 stopped, 15 on cpu
CPU states: 36.6% idle, 42.2% user, 21.2% kernel, 0.0% iowait, 0.0% swap
Memory: 96G phys mem, 29G free mem, 16G swap, 16G free swap

PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
1686 oraerp01 1 0 4 2820M 1532M cpu 539:23 99.65% oracle
5784 oraerp01 1 38 4 2955M 1473M cpu 674:42 99.64% oracle
20496 oraerp01 1 0 4 4353M 1345M cpu 50.8H 98.98% oracle
28909 oraerp01 1 0 4 5056M 1420M cpu 88.3H 98.29% oracle
27438 oraerp01 1 0 4 1222M 1201M cpu 72:33 97.17% oracle
23596 oraerp02 1 0 0 3242M 2236M cpu 10:29 95.91% oracle
2768 oraerp02 1 0 0 3240M 2231M cpu 52:19 95.72% oracle
22037 oraerp02 1 60 0 3239M 2233M sleep 0:38 57.39% oracle
18232 oraerp02 1 0 0 3239M 2234M cpu 1:29 51.90% oracle
18261 oraerp03 1 0 0 3280M 2271M sleep 1:24 51.62% oracle
23563 oraerp02 1 1 0 3256M 2251M cpu 0:21 36.48% oracle
20855 oraerp02 1 21 0 3245M 2239M cpu 0:21 34.23% oracle
24205 oraerp02 1 10 0 3239M 2234M cpu 0:06 23.47% oracle
23536 oraerp02 1 52 0 3275M 2260M sleep 11:21 16.36% oracle
20283 oraerp02 1 53 0 3238M 2228M sleep 14:58 13.82% oracle

The box from which the above is run is a Sun E2900 with 24 cores and 96 GB RAM. I told that person that, top is actually showing the percentage of a single CPU being used by that process. I have tried to verify from various sources whether my answer was correct and haven't come up with anything conclusive. Will keep looking for the real answer, which may involve going into the source code of top command.

Thursday, March 6, 2008

LD_LIBRARY_PATH_64

On Solaris SPARC, the two separate dynamic linker programs for 32-bit applications and for 64-bit applications are: /usr/lib/ld.so.1 and /usr/lib/sparcv9/ld.so.1.

In x86, for the AMD64 architecture, the dynamic linker programs for 32-bit applications and 64-bit applications are: /usr/lib/ld.so.1 and /usr/lib/amd64/ld.so.1.

At runtime, both dynamic linkers search the same list of colon-separated directories specified by the LD_LIBRARY_PATH environment variable. However, the 32-bit dynamic linker binds only to 32-bit libraries, while the 64-bit dynamic linker binds only to 64-bit libraries. So directories containing both 32-bit and 64-bit libraries can be specified via LD_LIBRARY_PATH, if needed.

The 64-bit dynamic linker's search path can be completely overridden using the LD_LIBRARY_PATH_64 environment variable.

If both LD_LIBRARY_PATH and LD_LIBRARY_PATH_64 are defined, 32-bit linking will be done using LD_LIBRARY_PATH, and 64-bit linking with LD_LIBRARY_PATH_64.

Monday, March 3, 2008

ORA-04063: PACKAGE BODY "SYS.DBMS_UTILITY" HAS ERRORS

At 7 AM on 29th February, I got an SOS call for issues on an instance. After talking to the DBAs, I checked the alert log file:

Doing block recovery for file 1325 block 84015
Block recovery from logseq 2834, block 424302 to scn 10103409529653
Fri Feb 29 06:05:45 2008
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2834 Reading mem 0
Mem# 0: /justanexample/redo1/log1a.dbf
Mem# 1: /justanexample/redo1/log1b.dbf
Block recovery completed at rba 2834.424312.16, scn 2352.1646449464
Doing block recovery for file 453 block 55097
Block recovery from logseq 2834, block 396285 to scn 10103409529653
Fri Feb 29 06:05:45 2008
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2834 Reading mem 0
Mem# 0: /justanexample/redo1/log1a.dbf
Mem# 1: /justanexample/redo1/log1b.dbf
Block recovery completed at rba 2834.424312.16, scn 2352.1646449464
Doing block recovery for file 1418 block 35982
Block recovery from logseq 2834, block 420640 to scn 10103409529653
Fri Feb 29 06:05:45 2008
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2834 Reading mem 0
Mem# 0: /justanexample/redo1/log1a.dbf
Mem# 1: /justanexample/redo1/log1b.dbf
Block recovery completed at rba 2834.424312.16, scn 2352.1646449464
Fri Feb 29 06:05:49 2008
Errors in file /justanexample/oracle/10.2.0/admin/justanexample_justanexample/bdump/justanexample_j000_9484.trc:
ORA-12012: error on auto execute of job 3404
ORA-04063: package body "SYS.DBMS_UTILITY" has errors
ORA-06508: PL/SQL: could not find program unit being called: "SYS.DBMS_UTILITY"
ORA-06512: at "SYS.DBMS_IREFRESH", line 62
ORA-06512: at "SYS.DBMS_REFRESH", line 15
ORA-06512: at "SYS.DBMS_REFRESH", line 195
ORA-06512: at line 1
Fri Feb 29 06:05:49 2008
Errors in file /justanexample/oracle/10.2.0/admin/justanexample_justanexample/bdump/justanexample_j001_16186.trc:
ORA-12012: error on auto execute of job 4155
ORA-04063: package body "SYS.DBMS_UTILITY" has errors
ORA-06508: PL/SQL: could not find program unit being called: "SYS.DBMS_UTILITY"
ORA-06512: at "APPS.WF_EVENT", line 1160
ORA-06512: at "APPS.WF_BES_CLEANUP", line 482
ORA-06512: at line 1

When I checked dba_registry, I found that Oracle packages and types had become invalid.

I asked the DBAs to follow Metalink Note 457861.1:

SQL > @?/rdbms/admin/catalog.sql
SQL > @?/rdbms/admin/catproc.sql
SQL > @?/rdbms/admin/utlrp.sql

After running catalog and catproc, which invalidated all dependencies the no. of invalid objects reached 160,000. utlrp.sql took 10 hours to complete with 24 parallel threads. Even after this 16 objects in sys were invalid. These were types:

TYPE AQ$_DESCRIPTOR
TYPE AQ$_DESCRIPTOR
TYPE AQ$_EVENT_MESSAGE
TYPE AQ$_REG_INFO
TYPE AQ$_REG_INFO
TYPE AQ$_REG_INFO_LIST
TYPE AQ$_REG_INFO_LIST
TYPE AQ$_SRVNTFN_MESSAGE
TYPE AQ$_SRVNTFN_MESSAGE
TYPE AQ$_SRVNTFN_MESSAGE
TYPE DBMS_XPLAN_TYPE
TYPE DBMS_XPLAN_TYPE_TABLE
TYPE LCR$_ROW_LIST
TYPE LCR$_ROW_UNIT TYPE MSG_PROP_T
TYPE XMLGENFORMATTYPE

We had to reset these types manually with the command:
Alter type owner.typename reset;

We are still investigating the root cause for this issue.