Service Desk Practitioners Forum
HELP! Application Servers using 100% CPU

SOLVED
gavin mills_2
Regular Contributor.

HELP! Application Servers using 100% CPU

Any advice will be gratefully received!

We have:
6 w2k3 servers (1.5 GB memory, 2.8 GHz processor) running the app servers
1 Linux server, same spec as the w2k3 servers, running an Oracle 9.2 DB

Almost every day for the last week, and intermittently for the last few weeks, one or two of the application servers hit over 80% CPU (normal is under 10%). This freezes anyone logged on to that server, stops anyone else logging on, and eventually brings down the entire Service Desk service.

It is the sd_serverservice.exe process that is using all the CPU.

I originally thought that this was just one of those things, as it only happened once a week. Then we experienced an issue with NetBackup and thought that this was the cause. That is now fixed, and the issue with the app servers is getting worse.

Has anyone experienced this themselves?

Is there any way of seeing what is going on with the Service Desk service?

I have turned rule debugging on but this does not show anything. My scheduled tasks number around 5000, which I am led to believe is normal.

I have run the rule debug sent to me a while ago by HP, but its formatting is simply useless.

14 REPLIES
Mark O'Loughlin
Acclaimed Contributor.

Re: HELP! Application Servers using 100% CPU

Hi,

do you have data exchange tasks running?
Does the logserver.txt file show any errors?
How many users do you have accessing the system (approx)?
What is the weighting of the application servers or are they on a 1:1 ratio?
Do you have a lot of users accessing OVSd at the same time e.g. users logging on from one location?
Does it look like memory is also being taken up on the server(s)? The default memory allocations are generally insufficient and can be changed.

Tim Schmitt_4
Honored Contributor.

Re: HELP! Application Servers using 100% CPU

What Service Desk version do you use?
Did you recently upgrade?
gavin mills_2
Regular Contributor.

Re: HELP! Application Servers using 100% CPU

SP9, and we haven't upgraded (ever).

do you have data exchange tasks running?
No

Does the logserver.txt file show any errors?
nothing at all, even with rule debug on

How many users do you have accessing the system (approx)?
during the night time about 10 per server, day time around 50 per server

What is the weighting of the application servers or are they on a 1:1 ratio?
all equal

Do you have a lot of users accessing OVSd at the same time e.g. users logging on from one location?
Yes, that would be 8.00am GMT, but there is no correlation between time of day and the issue.


Does it look like memory is also being taken up on the server(s)? The default memory allocations are generally insufficient and can be changed.
Memory appears to be fine during this time. I have allocated 1 GB to the JVM.
_______________________
Tim Schmitt_4
Honored Contributor.

Re: HELP! Application Servers using 100% CPU

At the time when the process spikes over 80%, can you see that the process (sd_serverservice.exe) is using the full gigabyte of memory?

When this occurs, there are a couple of things you can do to see what happens. You can turn on rule manager logging by executing this from the server/bin directory:
sd_servermanager.bat /monitor [servername] [port] com.hp.ifc.ev.dbrules.AppDBRuleManager setMonitoring true

Setting the last variable to false will turn off the rule manager logging. I think that this probably has the same information in it as turning on rule debugging through the admin panel.

You can also view the status of the threads that are created by the process. The easiest way to do this is to start the server using the batch file, which lets you see the server console. On the server console, if you click the button named "log monitoring information", the thread information will be logged to logserver.txt, so you can see which threads are active.
JaS_4
Acclaimed Contributor.

Re: HELP! Application Servers using 100% CPU

Hi Gavin,

Do those problem app servers have a specific task, e.g. database reporting, service pages or login only? Look for external events that may create additional load on the apps server.
Do you get a lot of "unable to connect", "connection reset by peer" or "cannot connect to mail server" errors, or any other messages that may indicate a problem with networking?
There is a known problem where, if an apps server cannot connect to a mail server within a certain period, it pushes Service Desk to the max and crashes the system, but I can't remember whether it was pre or post SP9. Have a look at self solve.
You can turn on the apps server GUI, which shows the performance of your server, threads, queues, etc. This can be turned on via the admin console, system panel, or via the -monitor parameter. If using -monitor for the service, you will need to turn on the 'interact with desktop' option from services.msc or nothing will happen.
gavin mills_2
Regular Contributor.

Re: HELP! Application Servers using 100% CPU

Thanks for all the help guys!

As far as connections go, the servers that have the problem are all for clients; we have separate servers for service pages and our integrations (none of these have issues).

Two things that have been said may be relevant:

1) Last night one CPU on one of our boxes hit 100% (50% overall). It spiked at 2.40am ish with nothing in the log at that time. However, 1 hour before, the log reports:

The server socket for the ITP service on port 30999 had an invalid request
Wed, 29/11/2006 00:33:14 The request was send by 167.189.28.61 with ip address 167.189.28.61 from port 1361
Wed, 29/11/2006 00:33:14 Invalid ITP connection: Read failed due to: Connection reset by peer: JVM_recv in socket input stream read

2) We have been steadily increasing the number of mails that are sent over the last 3 months, and we do have a lot of mail error messages. None of them directly tie in to the time of the issue, but that does sound similar.
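
To check whether log entries like these line up with a spike, the log can be filtered by timestamp. This is only a minimal sketch, assuming every entry of interest starts with the `Wed, 29/11/2006 00:33:14` style prefix seen in the excerpt above. The function name `entries_before` and the window length are hypothetical, not part of Service Desk.

```python
import re
from datetime import datetime, timedelta

# Timestamp prefix as seen in the excerpt, e.g. "Wed, 29/11/2006 00:33:14 ...".
STAMP = re.compile(r"^\w{3}, (\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s*(.*)$")


def entries_before(lines, spike, window_minutes=150):
    """Return (timestamp, message) pairs logged in the window_minutes before spike."""
    start = spike - timedelta(minutes=window_minutes)
    hits = []
    for line in lines:
        match = STAMP.match(line.strip())
        if not match:
            continue  # lines without a timestamp prefix are skipped
        when = datetime.strptime(match.group(1), "%d/%m/%Y %H:%M:%S")
        if start <= when <= spike:
            hits.append((when, match.group(2)))
    return hits


# The two timestamped lines from the excerpt; in practice, read logserver.txt.
sample = [
    "Wed, 29/11/2006 00:33:14 The request was send by 167.189.28.61 "
    "with ip address 167.189.28.61 from port 1361",
    "Wed, 29/11/2006 00:33:14 Invalid ITP connection: Read failed due to: "
    "Connection reset by peer: JVM_recv in socket input stream read",
]
spike = datetime(2006, 11, 29, 2, 40)  # the 2.40am spike
for when, message in entries_before(sample, spike):
    print(when, message)
```

Running the same filter for each spike time shows whether any one message keeps turning up beforehand.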
JaS_4
Acclaimed Contributor.

Re: HELP! Application Servers using 100% CPU

If there is only 1 connection reset by peer, that is quite normal: a timeout occurs and the connection is shut down because there is no response on the other end.
Do you have integration happening with the problem server? Multiple sd_events coming in perhaps?
You can run the following to see how big your system has grown.
select ent_name, count(*)
from rep_javaobjects,
     ifc_entities
where ent_oid = jav_entity
group by ent_name;
But I would have thought 6 apps servers would be sufficient.
I found the details of some possibilities that may cause your symptoms.
ITSM005939 email server not responding
OV-ENSD42939 oracle tablespace running out of space.
OV-EN016405 load balancing not working from sp6.
If you have ruled out everything else, you might want to consider adding a 7th apps server.
gavin mills_2
Regular Contributor.

Re: HELP! Application Servers using 100% CPU

The servers only deal with clients; no sd_events etc. are going to the affected servers:

ENT_NAME          COUNT(*)
Audit rules             13
Copy Attributes         13
Database Rule          149
Scheduled Tasks       5697
UI Rule                141
View info              865
Tim Schmitt_4
Honored Contributor.

Re: HELP! Application Servers using 100% CPU

The number of scheduled tasks that you are running should not be a problem. We have approximately 3000 tasks per server (4 servers) with no issue and we have similar hardware.

Can you see any correlating events in the log? Since the server seems to disallow any further connections, would it be safe to assume that the last entry in the log was the problem? Maybe it's that email bug?

How big do your log files get? Could the logs be full?

I would also check the load balancing to make sure people are dividing evenly across the servers by looking at who is logged in on the administration panel. You should be able to sort by server and take a rough count. We had to set all of our server weight ratios to 350 (instead of 1) due to a problem in load balancing we encountered.
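
If the list of logged-in users is long, the rough count can be scripted. This is a minimal sketch that assumes the session list has been copied out by hand as (user, server) pairs. That format, the sample names and both function names are hypothetical; nothing here reads from Service Desk itself.

```python
from collections import Counter


def sessions_per_server(sessions):
    """Count logged-in users per application server from (user, server) pairs."""
    return Counter(server for _user, server in sessions)


def spread(counts):
    """Difference between the busiest and the quietest server."""
    if not counts:
        return 0
    return max(counts.values()) - min(counts.values())


# Hypothetical sample data, standing in for the administration panel list.
sessions = [
    ("alice", "app1"), ("bob", "app1"), ("carol", "app2"),
    ("dave", "app3"), ("erin", "app3"), ("frank", "app3"),
]
counts = sessions_per_server(sessions)
for server, n in sorted(counts.items()):
    print(server, n)
print("spread:", spread(counts))
```

A spread that stays small next to the per-server totals suggests the load balancing is dividing users evenly.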

Also, when you look at the sd_serverservice task when it is at 100%, how much memory is it using?

What java version do you use on the server?

I know that these ideas aren't particularly helpful, but they are good things to eliminate and may help uncover the issue.
Tim Schmitt_4
Honored Contributor.

Re: HELP! Application Servers using 100% CPU

Also, not that this is a cause, but Windows Server 2003 isn't officially supported by HP Service Desk Service Pack 9. It becomes supported in one of the service packs higher than 17, if memory serves. I don't think that this would present a problem, but it's something to note.
JaS_4
Acclaimed Contributor.

Re: HELP! Application Servers using 100% CPU

Tim is right, your tasks and rules look reasonable.
The question seems to be why your sd service is working hard at certain times. 2:30am should be a low period, unless this is the time when some user runs a big report, multiple updates, DB maintenance or something.
Is there any pattern to the 100% utilizations?
Have you spoken to the users, if any, around the 2:30am peak as to what they were up to?
I was thinking it might be a Java garbage collection issue, since the CPU does spike when the JVM tries to recover memory, but then you would be seeing more spikes than you currently are.
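
One quick way to look for a pattern is to bucket the spike times by hour of day. This is a minimal sketch. Only the 2.40am spike on 29/11/2006 comes from this thread; the other times are made-up placeholders, and the function name is hypothetical.

```python
from collections import Counter
from datetime import datetime


def spikes_by_hour(spike_times):
    """Count how many spikes started in each hour of the day."""
    return Counter(t.hour for t in spike_times)


# Only the first entry is from the thread; the rest are placeholders.
spikes = [
    datetime(2006, 11, 29, 2, 40),
    datetime(2006, 11, 27, 14, 5),
    datetime(2006, 11, 24, 2, 55),
]
for hour, n in sorted(spikes_by_hour(spikes).items()):
    print(f"{hour:02d}:00  {n}")
```

If one hour dominates, that is the window to compare against scheduled jobs, reports and backups.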
gavin mills_2
Regular Contributor.

Re: HELP! Application Servers using 100% CPU

Thanks for the information guys. HP sent me the hotfix for the SMTP server issue yesterday. Even though there was no evidence that it was the issue, we applied it as a process of elimination, and guess what: no spikes last night! So hopefully we may have the fix!

in answer to your questions though:

Can you see any correlating events in the log?
No, this has been the most frustrating thing: no errors (well, none out of the ordinary) that tie in with the time of the spike, and this is with debug on as well. The only error that is consistent is the "flush error" when the CPU is maxed out.

How big do your log files get? Could the logs be full?
We have Patrol monitoring that interrogates, segments and archives the logs every few hundred K.

I would also check the load balancing:
Never moves out of around a 2 or 3 differential between servers


Also, when you look at the sd_serverservice task when it is at 100%, how much memory is it using?
Barely any. We have had memory issues in the past and this was my first port of call.


What java version do you use on the server?
1.3.1
MarkvL
Acclaimed Contributor.
Solution

Re: HELP! Application Servers using 100% CPU

Hi Gavin,

Glad to see that the hot fix for ITSM005939 solved the issue.

Mark
HP Support
JaS_4
Acclaimed Contributor.

Re: HELP! Application Servers using 100% CPU

Hi Gavin,

Glad to hear ITSM005939 helped. I was running out of ideas. I would have thought you would have more unable to connect to mail server errors in the logs.
Like you say, it's a process of elimination.