Jump to content


Photo
- - - - -

Waitng for database - file - full function


19 replies to this topic

#1 Joseph Bove

Joseph Bove

    ProIV Guru

  • Members
  • PipPipPipPipPip
  • 756 posts
  • Gender:Male
  • Location:Ramsey, United States

Posted 21 July 2004 - 02:08 PM

Gang,

One of our clients had their ProIV 5.5, AIX, Oracle 8.1.7 hang completely last night.

All transactions were committing, however, the app hung after the commit and before returning processing to the end user. It did not appear to be a record lock situation because the commits did occur.

The only message of note was:

WAITING FOR DATABASE RESPONSE - FILE - Full function

This message appeared for multiple users.

The client was able to add rows to tables outside of ProIV and COMMIT changes without any problem.

The log file in bdump showed no errors. Also, no system errors showed up.

The database did not shut down cleanly and needed to be aborted. After bringing the database back up, everything seems fine. Of course, the question looms - what caused this and will it happen again.

Oracle dutifully responded that since the message appeared in ProIV, it could not be an Oracle issue...

Has anyone dealt with this?

Thanks,

Joseph

#2 Cleve Haynes

Cleve Haynes

    Expert

  • Members
  • PipPipPipPip
  • 172 posts
  • Gender:Male

Posted 21 July 2004 - 02:23 PM

Hi

What is the full function mode SQL doing? Just a select?

It sounds like the session was waiting on a record lock, as generally the "WAITING FOR DATABASE RESPONSE" message is because ProIV is doing a "select... for update" (i.e. your file is in change mode) and can't get a lock on the desired record because another process already has that record locked.

The most common cause I have seen where many users have the "WAITING FOR DATABASE RESPONSE" message is because someone is sitting in a maintenance screen in change mode on a record that is regularly opened in change mode by other transactions.

The other is if a really long process is running and has records in change mode without a commit strategy.

You might want to do a query in Oracle if it happens again to list record locks.

Cleve

#3 Neil Hunter

Neil Hunter

    ProIV Guru

  • Members
  • PipPipPipPipPip
  • 414 posts
  • Gender:Male
  • Location:Johannesburg, South Africa

Posted 21 July 2004 - 02:42 PM

Sounds like you have a dead lock on a table or query. Maybe check for dead processes that are running on the AIX side, otherwise a dba should be able to sort you out with figuring out process locks

#4 Joseph Bove

Joseph Bove

    ProIV Guru

  • Members
  • PipPipPipPipPip
  • 756 posts
  • Gender:Male
  • Location:Ramsey, United States

Posted 21 July 2004 - 02:54 PM

Neil,

The problem was occuring all over the place. I went into a table that no one else would ever use an added a row to it. The row added, but the session hung after the commit. I know this because I killed the session and the new row was in the table.

When I did this, I did not get the record lock message, but I did hang.

I suspect the record lock message may be a red herring. However, it is interesting in that it did not return a table name, but instead the words Full function.

Regards,

Joseph

#5 Joseph Bove

Joseph Bove

    ProIV Guru

  • Members
  • PipPipPipPipPip
  • 756 posts
  • Gender:Male
  • Location:Ramsey, United States

Posted 21 July 2004 - 02:57 PM

Cleve,

Everytime I've seen a record lock in Oracle, the message WAITING FOR DATABASE RESPONSE - FILE - is followed by a table name.

The errors were generated for users in multiple places (against different tables) at once.

Regards,

Joseph

#6 Cleve Haynes

Cleve Haynes

    Expert

  • Members
  • PipPipPipPip
  • 172 posts
  • Gender:Male

Posted 21 July 2004 - 03:42 PM

Everytime I've seen a record lock in Oracle, the message WAITING FOR DATABASE RESPONSE - FILE - is followed by a table name.


If you are using embedded full function mode SQL in the function, then it will say "WAITING FOR DATABASE RESPONSE - FILE - Full Function" if it is waiting on a record lock, or if your SQL query is very slow.

Are you using any embedded SQL in your functions?

Cleve

#7 Richard Bassett

Richard Bassett

    ProIV Guru

  • Members
  • PipPipPipPipPip
  • 696 posts
  • Location:Rural France

Posted 22 July 2004 - 10:34 AM

Hmm..

Had a quick search for "commit does not return" on Metalink and Google and there are results but I didn't quickly find anything that seemed clearly relevant to a problem with ProIV.

The following link seemed kind of close but it was with V7.. and the guy didn't seem to have got any answers!
http://aa11.cjb.net/...00/01/0260.html

My (rather obvious) guess would be you are hitting some (configurable) resource limit in Unix or Oracle but I don't know where to suggest you start looking. If that's the case though, I'm kind of surprised it's not more of a "well-known" problem.

Did you get to the bottom of it already?

I'd also suspect the "full function" bit could be a red herring and that ProIV might be showing that because it has a "generic" waiting message that simplistically says "full function" whenever there's no obviously relevant filedef - and there is no particular filedef associated with a commit call..

Regs, Richard

Edited by Richard Bassett, 22 July 2004 - 10:36 AM.

Nothing's as simple as you think

#8 Neil Hunter

Neil Hunter

    ProIV Guru

  • Members
  • PipPipPipPipPip
  • 414 posts
  • Gender:Male
  • Location:Johannesburg, South Africa

Posted 22 July 2004 - 11:25 AM

Oracle will error if tablespaces or rollback segments have reached their limit space wise.

Just did some reading up in my Oracle Admin textbook.

Steps in a commit :

1 Server process generates a SCN number (System Change Number) and is assigned a rollback segment. Then the rollback segment is marked that the transaction is committed

2 The LGWR process writes the redo log buffers to the online redo log files along with the SCN number

3 The server process releases locks held on rows and tables

4 The user is notified that the commit is complete

5 The server process marks the transaction as complete

So i would think either the LGWR process is not running or not working for some unkown reason, your redo log file is inaccesible or corrupt, or a redo log switch is being attempted and can't be done for some reason. Best to probably get a dba in there and sort it out

#9 Richard Bassett

Richard Bassett

    ProIV Guru

  • Members
  • PipPipPipPipPip
  • 696 posts
  • Location:Rural France

Posted 22 July 2004 - 01:58 PM

a redo log switch is being attempted and can't be done for some reason

Is it possible redo log archiving fell behind for some reason and hence the database could not switch online redo logs because the next log group in the cycle still had not been [completely] archived?
This is just a thought - I'm not sure I've ever actually seen what happens in that scenario - I would have hoped you'd see a meaningful message or active warnings somewhere!
Nothing's as simple as you think

#10 Joseph Bove

Joseph Bove

    ProIV Guru

  • Members
  • PipPipPipPipPip
  • 756 posts
  • Gender:Male
  • Location:Ramsey, United States

Posted 22 July 2004 - 02:41 PM

Cleve,

We have very little embedded SQL. Based on the number of places where users were encountering the error, I doubt that the embedded SQL was in play. Also, our use of embedded SQL is largely limited to reports and look ups, so it should not be creating lock problems.

Regards,

Joseph

#11 Joseph Bove

Joseph Bove

    ProIV Guru

  • Members
  • PipPipPipPipPip
  • 756 posts
  • Gender:Male
  • Location:Ramsey, United States

Posted 22 July 2004 - 02:42 PM

Neil,

Thanks for the insight. I've passed that along to the client.

Regards,

Joseph

#12 Joseph Bove

Joseph Bove

    ProIV Guru

  • Members
  • PipPipPipPipPip
  • 756 posts
  • Gender:Male
  • Location:Ramsey, United States

Posted 22 July 2004 - 02:50 PM

Richard,

Is it possible redo log archiving fell behind for some reason and hence the database could not switch online redo logs because the next log group in the cycle still had not been [completely]archived?


It's possible. Of course, the problem (thankfully) has not recurred since they bounced the database. Of course, the concern is that whatever the root cause of the issue is was most likely not addressed by simply bouncing the database.

There were no meaningful errors anywhere (bdump, root's mail, system errors, etc.)

What I may have to check though is the possibility that the SQLTIMEOUT could have been a factor. I saw an issue on SQL Server that forced me to yank SQLTIMEOUT.

The client had a large database (>10 gig). The problem was that some transactions were forcing a table to autoextend. When the table could not autoextend within 3 seconds, ProIV was immediately rolling back the operation - without error.

The user lost their transaction every time. SQL Server never reported an error, because no error had occurred.

Perhaps there is a similar potential for problem with setting this variable on the Oracle side. A hypothetical transaction might force Oracle to cycle its redo logs which it cannot complete within 3 seconds.

Does anyone else set the SQLTIMEOUT for Oracle?

Thanks,

Joseph

#13 Chris Mackenzie

Chris Mackenzie

    ProIV Guru

  • Members
  • PipPipPipPipPip
  • 368 posts
  • Gender:Male
  • Location:Bristol, United Kingdom

Posted 22 July 2004 - 03:06 PM

You mention checking some logs - di you check the oracle alert log?

That would indicate excessive activity around cycling logfiles, etc, etc.
The content and views expressed in this message are those
of the poster and do not represent those of any organisation.

#14 Joseph Bove

Joseph Bove

    ProIV Guru

  • Members
  • PipPipPipPipPip
  • 756 posts
  • Gender:Male
  • Location:Ramsey, United States

Posted 22 July 2004 - 03:37 PM

Chris,

You mention checking some logs - di you check the oracle alert log?



The oracle log I checked was the one in /u01/app/admin/database name/bdump/alert_database_name.log

It did not show anything happening around the time of the hang up. In fact the greatest majority of what was in that log was just the fact that it was cycling through redo logs.

If there are other logs that may be of help, I'd be happy to know what / where they are.

Regards,

Joseph

#15 Neil Hunter

Neil Hunter

    ProIV Guru

  • Members
  • PipPipPipPipPip
  • 414 posts
  • Gender:Male
  • Location:Johannesburg, South Africa

Posted 23 July 2004 - 08:19 AM

Richard

Yes it is possible that its unable to write the redo log, especially if over the network to another machine/backup server. Could also be that the redo log switch is happening to fast for Oracle to catch up. For eg switches to redo log 1 but while writing to redo log 1, redo log 2 needs to be written as well etc etc. This can also happen if archiving of redo logs is setup

Joseph,

Bad idea on the autoextend, resource intensive as you have found out. Rather add more datafiles. With regards to the redo logs, how often are they switching ? Every 5 minutes, once a day ? Also check the alert file for any trace files being written to in udump.

Probably best to get a certified dba in there to do some performance tuning on your database B)



Reply to this topic



  


0 user(s) are reading this topic

0 members, 0 guests, 0 anonymous users