[Help wanted] ORACLE server processes consuming high %sys CPU

Environment:
HP-UX 11.11 + Oracle 9.2.0.5 (HA, two-node cold standby)

Symptoms:
Running sar 1 10 on the host shows %sys at about 90%, with %usr taking the remaining 10% and %idle at 0. I used top -h to find the PIDs with high CPU usage, then looked each PID up with a combined query against v$process, v$session, and v$sqltext. The faulty processes usually all belong to connections from one or two clients; the sessions are in the INACTIVE state, and their SQL is very simple, for example select 1 from dual. Of course, top -h also shows many other Oracle processes that are in a normal state.
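
For anyone who wants to repeat the lookup, here is a minimal sketch of what such a combined query could look like. It is not the exact query I ran. The helper name session_lookup and the choice to fall back to prev_sql_addr / prev_hash_value when the session has no current statement are assumptions for illustration; the SQL is held in a Python string only so the bind value is explicit, and actually running it would need a database driver that is not shown here.

```python
# Illustrative sketch only: session_lookup is a hypothetical helper, not from the post.
# The query maps an OS process ID (v$process.spid) to its session and SQL text.
SESSION_LOOKUP_SQL = """
SELECT s.sid, s.serial#, s.status, s.machine, s.program,
       t.piece, t.sql_text
  FROM v$process p, v$session s, v$sqltext t
 WHERE p.spid = :spid
   AND s.paddr = p.addr
   AND t.address(+)    = DECODE(s.sql_hash_value, 0, s.prev_sql_addr,   s.sql_address)
   AND t.hash_value(+) = DECODE(s.sql_hash_value, 0, s.prev_hash_value, s.sql_hash_value)
 ORDER BY s.sid, t.piece
"""


def session_lookup(os_pid):
    """Return the query text and its bind variables for one OS process ID.

    v$process.spid is a character column, so the PID is bound as a string.
    """
    return SESSION_LOOKUP_SQL, {"spid": str(os_pid)}


if __name__ == "__main__":
    sql, binds = session_lookup(21638)  # PID taken from the tusc trace in this post
    print(sql)
    print(binds)
```

The DECODE fallback is there because an INACTIVE session normally has no current statement, so the last executed statement is the useful one to look at.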


We have this system deployed at multiple sites. Three sites in different cities have hit this fault, all with the same symptoms, and the clients behind the faulty database server processes are different machines in each case.

Stopping the client does not make the faulty server process exit. alter system kill session cannot get rid of it; kill -9 does, but the next day the same thing happens again. The client stays connected and does not reconnect frequently.

In v$session_wait the faulty session is waiting on SQL*Net message from client, but normal processes are also waiting on SQL*Net message from client. I checked the client IP and port: the underlying TCP connection is fine and is in the ESTABLISHED state.

Tracing an abnormal process on the host with tusc shows the SIGALRM-related calls being printed over and over. The process itself seems to be stuck in an endless loop, yet it never dies, and I can't see why.
( Attached to process 21638 ("oracleSZCALLDB (LOCAL=NO)") [64-bit] )
14:10:02 [21638]{5168873} #0 gettimeofday(0x800003ffbfff61d0, NULL) ................................................................... [running]
14:10:02 [21638]{5168873} #0 gettimeofday(0x800003ffbfff61d0, NULL) ................................................................... = 0
14:10:02 [21638]{5168873} #0 setitimer(ITIMER_REAL, 0x800003ffbfff61e0, NULL) ......................................................... [entry]
value.it_interval.tv_sec: 0
value.it_interval.tv_usec: 0
value.it_value.tv_sec: 275
value.it_value.tv_usec: 30000
14:10:02 [21638]{5168873} #0 setitimer(ITIMER_REAL, 0x800003ffbfff61e0, NULL) ......................................................... = 0
value.it_interval.tv_sec: 0
value.it_interval.tv_usec: 0
value.it_value.tv_sec: 275
value.it_value.tv_usec: 30000
14:10:02 [21638]{5168873} #0 sigprocmask(SIG_UNBLOCK, 0x800003ffbfff6130, NULL) ....................................................... [entry]
set: SIGALRM
14:10:02 [21638]{5168873} #0 sigprocmask(SIG_UNBLOCK, 0x800003ffbfff6130, NULL) ....................................................... = 0
set: SIGALRM
oset: NULL
14:10:02 [21638]{5168873} #0 gettimeofday(0x800003ffbfff5f20, NULL) ................................................................... [entry]
14:10:02 [21638]{5168873} #0 gettimeofday(0x800003ffbfff5f20, NULL) ................................................................... = 0
14:10:02 [21638]{5168873} #0 sigprocmask(SIG_BLOCK, 0x800003ffbfff6070, NULL) ......................................................... [entry]
set: SIGALRM
14:10:02 [21638]{5168873} #0 sigprocmask(SIG_BLOCK, 0x800003ffbfff6070, NULL) ......................................................... = 0
set: SIGALRM
oset: NULL
14:10:02 [21638]{5168873} #0 gettimeofday(0x800003ffbfff61d0, NULL) ................................................................... [entry]
14:10:02 [21638]{5168873} #0 gettimeofday(0x800003ffbfff61d0, NULL) ................................................................... = 0
14:10:02 [21638]{5168873} #0 setitimer(ITIMER_REAL, 0x800003ffbfff61e0, NULL) ......................................................... [entry]
value.it_interval.tv_sec: 0
value.it_interval.tv_usec: 0
value.it_value.tv_sec: 275
value.it_value.tv_usec: 10000
14:10:02 [21638]{5168873} #0 setitimer(ITIMER_REAL, 0x800003ffbfff61e0, NULL) ......................................................... = 0
value.it_interval.tv_sec: 0
value.it_interval.tv_usec: 0
value.it_value.tv_sec: 275
value.it_value.tv_usec: 10000
14:10:02 [21638]{5168873} #0 sigprocmask(SIG_UNBLOCK, 0x800003ffbfff6130, NULL) ....................................................... [entry]
set: SIGALRM
14:10:02 [21638]{5168873} #0 sigprocmask(SIG_UNBLOCK, 0x800003ffbfff6130, NULL) ....................................................... = 0
set: SIGALRM
oset: NULL
// I asked the host team: sigprocmask here is blocking and unblocking the SIGALRM timeout signal, and cycling through these system calls consumes a lot of system CPU.
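
To put numbers on the loop, a small script can tally the completed calls in a saved tusc trace. This is only a rough sketch: the regular expression is written against the line layout shown above and may need adjusting for other tusc options, the function names are my own, and SAMPLE is a shortened copy of a few trace lines (dot runs trimmed) rather than new data.

```python
import re
from collections import Counter

# Matches a completed call line (one that ends in "= <return value>"), e.g.
# 14:10:02 [21638]{5168873} #0 gettimeofday(0x..., NULL) ..... = 0
# Lines ending in [entry] or [running], and the indented detail lines, do not match.
CALL_RE = re.compile(r"#\d+\s+(\w+)\(.*\)\s*\.*\s*=\s*(-?\w+)\s*$")


def completed_calls(trace_text):
    """Return the system call names, in order, for every completed call."""
    names = []
    for line in trace_text.splitlines():
        match = CALL_RE.search(line)
        if match:
            names.append(match.group(1))
    return names


def summarize(trace_text):
    """Count how many times each system call completed in the trace."""
    return Counter(completed_calls(trace_text))


# Shortened excerpt of the trace above, used only to exercise the parser.
SAMPLE = """\
14:10:02 [21638]{5168873} #0 gettimeofday(0x800003ffbfff61d0, NULL) .... = 0
14:10:02 [21638]{5168873} #0 setitimer(ITIMER_REAL, 0x800003ffbfff61e0, NULL) .... [entry]
value.it_value.tv_sec: 275
14:10:02 [21638]{5168873} #0 setitimer(ITIMER_REAL, 0x800003ffbfff61e0, NULL) .... = 0
14:10:02 [21638]{5168873} #0 sigprocmask(SIG_UNBLOCK, 0x800003ffbfff6130, NULL) .... = 0
14:10:02 [21638]{5168873} #0 gettimeofday(0x800003ffbfff5f20, NULL) .... = 0
14:10:02 [21638]{5168873} #0 sigprocmask(SIG_BLOCK, 0x800003ffbfff6070, NULL) .... = 0
"""

if __name__ == "__main__":
    for name, count in summarize(SAMPLE).most_common():
        print(name, count)
```

Run over a few seconds of real trace, counts like these show how many times per second the process goes around the gettimeofday / setitimer / sigprocmask cycle.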

I searched the HP website for similar reports but found nothing.


If anyone has good suggestions or relevant experience, please fire away...


Many thanks in advance...

Started by Avery at November 19, 2016 - 1:05 PM

It is more healthy.

Posted by Avery at November 22, 2016 - 1:44 PM

Bump.

Posted by Melody at November 26, 2016 - 2:04 PM

How is the memory usage? Are page-ins and page-outs relatively high?

Posted by Benjamin at December 09, 2016 - 2:56 PM

Memory is not a big problem.

vmstat -dS 5 1


   procs         memory                    page                     faults         cpu
 r  b  w      avm     free  si  so  pi  po  fr  de  sr    in      sy    cs  us  sy  id
13  0  0  3423530  2308811   0   0   0   0   0   0   0  9159  148821  2434   6   1  92

Disk array I/O is not the problem either: avque, avwait, and avserv are all low.

Posted by Avery at December 15, 2016 - 3:14 PM

Bumping again. Can anyone with relevant experience offer some guidance?

Posted by Avery at December 27, 2016 - 3:30 PM

Bump, bump, hoping this gets solved!

Posted by Noel at January 01, 2017 - 3:38 PM