Oracle Blog

June 19, 2008

Statspack can hang your database

Filed under: Statspack — Vikram @ 8:13 pm
Tags: ,

It is Thursday evening and am already dreaming of the weekend. I would leave office in another 15 min. I began installing statspack on my production database. Its a straight-forward process. Just run the $ORACLE_HOME/rdbms/admin/spcreate.sql. I had done this over several times on several databases. I was sure this will not take more time and definitely will not throw surprises. I was never this wrong.

The statspack got installed smoothly. What I noticed was that there were eight new invalid objects in the SYS schema. All of them were DBMS_ packages. I quickly started compiling them.

SQL> ALTER PACKAGE '&pack_name' COMPILE BODY;

Wow, i see all of them have compiled without errors. I was happy until I re-checked the invalid objects. There were now eighteen invalid objects in the SYS schema. All were again DBMS_* packages. Most of them were DBMS_AQ* related packages. I missed my heartbeat. This was the production database and almost at the end of the business hours. I didn’t see this issue in the test environment. I didn’t want to end up fixing this at this time.

There is a bug in the metalink that says after installing statspack using spcreate.sql there is a possibility of DBMS_UTILITY package getting invalidated. This bug is for 8.1.7.4 version and on some other platform. But I am on Linux and 9.2.0.4 DB and I still see the similar issue. Thereafter all requests that depend on this package will hang. In my case this package was DBMS_AQ. All Workflow Background Process requests were pending. They were engaging all the Standard Managers and any new concurrent request will join the pending queue. Within 2 hours i saw there were 55 pending requests.I knew the database was hanging.

While all the other invalid packages were validated, the DBMS_AQ validation hangs interminably for almost an hour and then times out. I will have to validate this package and only then will the requests like ‘Workflow Background Process’ will progress. And i can’t validate this package until the requests holding the lock on the object releases it, which is unlikely. Catch-22 situation! I decided I will bounce the application and database.

Convincing the client was going to be a difficult job. Obviously he will ask questions.
Q) Why was the object invalidated?
A) Because we had installed Statspack.
Q) But Statspack is supposed to aid us in diagnosing inherent issues, not to create new issues?
A) Yes…but…you see…there is a bug…

Oh damn all!

The client agreed to bounce the database. But guess what? the trouble is not over yet. The concurrent managers decline to go down smoothly. Because the new ‘Shutdown Concurrent Manager’ request itself joins the Pending queue. What to do? Terminate the requests, and kill the managers’ pids? What if there are any important requests running that will error out due to this? There were a lot of EDI jobs running and the Client was hesitant to let me kill it. Ouph, finally i killed (yes killed!) the managers and terminated all the requests. Brought down all the services, bounced the db, validated the invalid DBMS_AQ package, ran cmclean on the db, brought up the services. The application came back to life with new vigour and zest. The client appeared happy to see his baby playing antics again. I am sure the Client doesn’t remember the issue anymore, but i will.

This brings me to the next question. How can i terminate all the managers when they do not go down smoothly? Especially when there are requests in ‘Running’ status? They will end up with ‘Error’ status upon forceful termination of managers. It should be able to restart the requests from where they ended. Is it possible? Let’s see if it is possible in my future articles.

Moral: If things ought to go wrong, they will.

————————————————

Blog at WordPress.com.