I've had a recurring problem: twice in the past three weeks, the database rules on my server have suddenly stopped executing entirely. The app server works fine otherwise -- just the db rules are affected.
I am able to solve the issue by restarting the app server, but of course the db rules that got backed up are then lost.
When the issue was occurring, I checked the "Scheduled Tasks" in the database, and they didn't seem out of control. Still, could a scheduled task get "backed up" and cause ALL db rules to stop in the queue as well? Can anyone think of another reason that this behavior might occur?
I would certainly appreciate any help you can provide.
I have seen this behaviour once on a Solaris app server that got swamped with mail (email loop) and the DB rules stopped working until a restart of the server. How many scheduled tasks are listed? Have a look at CPU usage around the time of the failure to see if there was excessive activity.
Thanks for the reply, Jonathon. I usually see between 3 (on a weekend) and 20 scheduled tasks in the database. There's only one db rule that creates a scheduled task in our system -- I suppose I could live without waiting 24 hours to send a satisfaction survey, but it would be nice if that functionality worked as advertised.
As far as CPU utilization goes, it was strange -- the CPU on the app server was hovering between 1 and 2% at most. Generally there didn't appear to be anything happening, which is the opposite of what I'd expect if it was blocking up.
We have had similar problems recently, and sometimes the logfile showed java.lang.OutOfMemoryError entries. It started after many unused DB rules were deleted, which resulted in rule deserialization problems. HP provided a tool that we used to identify and delete these "hidden" rules that could not be deserialized. SD4.5 SP7
Nothing stranger than usual appears in the server log. The rule manager complains infrequently about one db rule trying to update the assignment with a person not in the current workgroup, so I can at least account for the error messages it's throwing, and there aren't that many.
I haven't deleted any db rules lately either, so I don't think that's a part of the issue. Still I may have to contact HP support again soon to see if there are any recommendations...
I had the same problem when upgrading from SP12 to SP14. The reason: we use some Java programs, and because they did not contain "System.exit()", since SP13 they remained in memory, and after some time the agent stopped working. There were no messages in the logfiles.
After changing the java-programs and recompiling it was ok. wilfried
We are on SP12 and have experienced some intermittent issues with DB rules not firing. What was strange for us was that it just started happening one day. We talked with support and they provided a hotfix file for ITSM007202, which is in SP13. http://openview.hp.com/ecare/getsupportdoc?docid=ITSM007202
Another change related to rules is ITSM006772 which is in SP11.
We are planning on moving to SP16 as well to hopefully make SD a little more stable.
That's very interesting. I was thinking the same thing as far as upgrading to SP16 to hopefully improve stability plus there are some date fixes from SP10 that I'd be glad to see, but I hate not knowing what's the cause of this db rule issue. (I worry it'll just crop up again unless I can explain it!)
I think I'm having a slightly different issue, because when these things hit, NO db rules run, period. I've also seen that issue where some db rules just seem not to run occasionally, but the server going haywire is my larger concern right now.
Chris - Did you ever get this issue resolved? We are still seeing intermittent issues where the rules don't fire. I've had a couple of tickets open with support, but it's very difficult to capture any data on it. We aren't going to upgrade within 4.5; we are waiting on 5.10 before we do anything.
I was eventually forced to open a case with HP a month ago. The engineer was leaning towards Java memory issues with the server, which I thought I had already fixed. But I went ahead with his ideas anyway, and I haven't had any problems for one month now, which is better than the reliability I had been seeing, so hopefully we're on to something.
What I mean by Java memory issues is that we had to go into "installservice.bat" and "sd_server.bat" and update the memory sizes that Java uses when launching. I had already done this, but I may not have run "reinstall_service.bat" to update the service afterward. So far things have been working alright since making that change. If you want more details about the specific settings, let me know.
That sounds good, Chris. I have already done this to allow more memory to the processes. Did support offer information about how much to allocate? That was the part I wasn't clear on. Our servers have 8GB of RAM on them. I've got 5 App Server instances running on each with the following settings: MaxNewSize: 333, NewSize: 167, XMS (Initial Heap): 500, XMX (Max Heap): 1000.
I set the MaxNewSize and NewSize values to 2/3 and 1/3 of the XMS value, respectively. I had read this in another forum post and an HP document. How were yours set? I've got more RAM to spare, so I could increase it if necessary.
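For anyone wanting to check the arithmetic, here is a rough Python sketch of the 1/3 and 2/3 rule of thumb described above. It assumes the values are in megabytes and that they map onto the standard JVM options -Xms, -Xmx, -XX:NewSize and -XX:MaxNewSize; the function names are made up for illustration, and how the .bat files actually pass these values to Java is not shown here.

```python
from typing import List


def jvm_memory_flags(xms_mb: int, xmx_mb: int) -> List[str]:
    """Build JVM memory flags from the initial and maximum heap sizes.

    Rule of thumb from the post (not official guidance):
    NewSize = 1/3 of Xms, MaxNewSize = 2/3 of Xms.
    """
    new_size = round(xms_mb / 3)
    max_new_size = round(xms_mb * 2 / 3)
    return [
        f"-Xms{xms_mb}m",
        f"-Xmx{xmx_mb}m",
        f"-XX:NewSize={new_size}m",
        f"-XX:MaxNewSize={max_new_size}m",
    ]


def worst_case_heap_mb(instances: int, xmx_mb: int) -> int:
    """Total heap if every instance grows to its maximum.

    Counts heap only; each JVM also uses some non-heap memory.
    """
    return instances * xmx_mb


# Values quoted in the post: Xms 500, Xmx 1000, 5 instances, 8GB of RAM.
flags = jvm_memory_flags(xms_mb=500, xmx_mb=1000)
total = worst_case_heap_mb(instances=5, xmx_mb=1000)
print(" ".join(flags))
print(f"worst-case heap: {total} MB of {8 * 1024} MB")
```

With those numbers, five instances at their maximum heap would use about 5000 MB of the 8GB, so there is some headroom, but whether raising the values helps is still a matter of trial and error.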
I have never gotten clear guidance anywhere about how much memory is necessary with those settings. Basically, the only advice has been, "If you run into problems, increase the memory size." Not terribly helpful, I know.
I only have one app server on our box here, and the settings are at:
On top of that, I took it upon myself to schedule a restart of the app server service weekly since it's a short outage and load balances to the backup server if it runs into issues. So far the combo of the two seems to be working.
...And that's the state of my confidence in this application.
This is a huge bug, and there has been no response in the more than one year since I first opened a similar case. Check the scheduled jobs on your SD server. If almost 1000 scheduled jobs exist, the rules start to stop working until you flush all these jobs.