The community will be in read-only from Monday 11:59pm (PT) to Wednesday 7:30am (PT)
Project and Portfolio Management Practitioners Forum

Unable to restart the service node in Production

PPMRam
Collector

Unable to restart the service node in Production

Hi Team,

 

We have a problem with the notification service, so we stopped the service node. But when we try to start the node again, we get an error. Please find the server log and kSupport output attached. Please help ASAP.

 

Thanks,
Nagesh S

11 REPLIES
TurboMan
Member

Re: Unable to restart the service node in Production

Hi,

 

Before starting the node make sure you remove all tmp and work folders for all nodes under <ITG_HOME>/server/<Nodes>.

And run ./kUpdateHtml.sh then start the nodes.

 

Hope it helps.

TM

Jim Esler
Honored Contributor

Re: Unable to restart the service node in Production

TurboMan, please clarify. Are you suggesting deleting the contents of tmp and work directories for nodes that are running? How would this help get the one node restarted? Also, why would it be useful to run kUpdateHtml.sh if no changes have been made to server.conf?

 

PPMRam, the log file seems to be indicating there are problems connecting to the database. Database connection issues can also cause problems with services. Make sure the database is healthy and configured to support the load you are placing on it.
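
A quick way to gauge how often the log reports database trouble is to count the matching lines. This is only a sketch: `count_log_matches` is a made-up helper name, and the search text is whatever message you see in your own log (the text quoted later in this thread is "Failed to obtain DB connection").

```shell
#!/bin/sh
# Print how many lines of a log file contain the given text.
# The text is matched as a fixed string (-F), not as a regular expression.
count_log_matches() {
    log="$1"    # path to the log file, chosen by the caller
    text="$2"   # message text to look for

    [ -r "$log" ] || { echo "cannot read: $log" >&2; return 2; }

    # grep -c prints the count. It exits with 1 when the count is 0,
    # which is not an error here, so that status is ignored.
    grep -cF -- "$text" "$log" || true
}
```

For example, `count_log_matches serverLog.txt "Failed to obtain DB connection"` prints the number of matching lines. A count that keeps growing while the node starts points toward the database rather than the node's files.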

Niraj Prabhu
Frequent Visitor

Re: Unable to restart the service node in Production

Jim,

 

How many user nodes and service nodes are there per application server? And are the services configured for failover?

 

Did you stop all the PPM nodes, then start the services node, followed by the user node?

 

Niraj P.
AlexSavencu
Honored Contributor

Re: Unable to restart the service node in Production

Hi,

 

first of all, the logs you uploaded contain some confidential information - you may not want to post such information here.

 

Second, you should upload a log file which contains the startup sequence of the problematic server.

 

cheers

alex


--remember to kudos people who helped solve your problem
TurboMan
Member

Re: Unable to restart the service node in Production

Jim,

I am suggesting removing two directories (tmp and work) under the server/<nodes> folder. These folders are recreated when the server starts.
I always do that after stopping the server (for example, the tmp and work directories under server/node1).
I am just trying to find out whether the clustering is configured properly.
That's why I am suggesting running kUpdateHtml; it will find out if something went wrong while configuring the clustering.
Did I make myself clearer?

Govardhan07
Regular Collector

Re: Unable to restart the service node in Production

I am not sure but you can try this.

 

Restart the whole server once.

 

Regards,

Govardhan

philipwood
Regular Collector

Re: Unable to restart the service node in Production

Hi Jim,

 

If you are feeling uncertain or unsafe, log a call.


The logs show the service node not even managing to deploy the basic PPM web applications.


If I were faced with this situation I would suspect that the node was somehow corrupted and try to recreate it.

Under normal conditions all the nodes are identical in configuration (and most important information is kept in the database and not the filesystem).


What I'm proposing would be:

1) Make sure the damaged node is off.

2) Copy its entire filesystem somewhere for backup and maybe autopsy.

i.e. move [base]/server/[DAMAGED_NODE_NAME] somewhere.
NOTE: be careful of the user this is done as - to prevent issues with filesystem permission use the same account that the node is run under (if started from a windows service verify the account that the service executes as).

3) Then switch another node off to ensure none of its files are locked.
i.e. something like:
sh ./kStop.sh -now -name [DONOR_NODE_NAME] -user [adminuser]

4) Now copy the entire filesystem of the "donor" node to the same name as the damaged node
i.e. copy [base]/server/[DONOR_NODE_NAME] to [base]/server/[DAMAGED_NODE_NAME]
NOTE: be careful of the user this is done as - to prevent issues with filesystem permission use the same account that the node is run under (if started from a windows service verify the account that the service executes as).

5) Delete everything under [base]/server/[DAMAGED_NODE_NAME]/tmp and [base]/server/[DAMAGED_NODE_NAME]/work
These are JBoss files that are generated dynamically and can be deleted *WHEN A NODE IS NOT RUNNING*
The biggest side effect of this is that the node will start a bit slower and run a bit more slowly while some of the deleted files (e.g. precompiled JSPs) are regenerated.
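
Steps 2, 4 and 5 could be scripted roughly as below for a Unix install. This is a sketch under assumptions, not a tested procedure: `recreate_node` is a hypothetical helper, all four arguments are placeholders you supply, and it expects both nodes to be stopped already (steps 1 and 3). Run it as the account the node runs under.

```shell
#!/bin/sh
# Sketch of steps 2, 4 and 5. It does NOT stop any node; do steps 1 and 3 first.
recreate_node() {
    base="$1"      # install root, the [base] in the steps above
    damaged="$2"   # name of the damaged node
    donor="$3"     # name of the healthy donor node (stopped)
    backup="$4"    # existing directory that receives the damaged node's files

    src="$base/server/$donor"
    dst="$base/server/$damaged"

    [ "$damaged" != "$donor" ] || { echo "donor and damaged node are the same" >&2; return 1; }
    [ -d "$src" ] || { echo "donor not found: $src" >&2; return 1; }
    [ -d "$dst" ] || { echo "damaged node not found: $dst" >&2; return 1; }
    [ -d "$backup" ] || { echo "backup directory not found: $backup" >&2; return 1; }
    [ ! -e "$backup/$damaged" ] || { echo "backup already exists: $backup/$damaged" >&2; return 1; }

    # Step 2: move the damaged node aside so it can be restored or examined later.
    mv "$dst" "$backup/$damaged" || return 1

    # Step 4: copy the donor under the damaged node's name.
    # -R copies the whole tree, -p preserves permissions and timestamps.
    cp -Rp "$src" "$dst" || return 1

    # Step 5: empty tmp and work in the copy, keeping the directories themselves.
    for sub in tmp work; do
        if [ -d "$dst/$sub" ]; then
            find "$dst/$sub" -mindepth 1 -delete || return 1
        fi
    done
}
```

The function refuses to run if a backup with the same name already exists, so a second attempt cannot overwrite the copy kept for reverting.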

 

Then try starting up the node.

 

If this does not solve the problem then node corruption is not the problem.
If required you could revert back to the previous filesystem.

 

Kind Regards

Philip Wood

AlexSavencu
Honored Contributor

Re: Unable to restart the service node in Production

Hi, Philip,

 

I still believe that the first thing to do is to find the root cause of the error.

 

cheers

alex


--remember to kudos people who helped solve your problem
philipwood
Regular Collector

Re: Unable to restart the service node in Production

Hi Alex,

I agree in principle.

But I usually timebox the root cause analysis - usually there is some uptime SLA in place that affects my client champion or principal's performance KPIs.

So while understanding the cause of the problem is crucial to preventing its recurrence, if after an initial analysis I don't find an obvious cause I will attempt to restore the highest possible degree of service first and then continue with detailed analysis.

E.g. if the database connectivity goes down and the DBA can't resolve in 5-10 minutes, I'll request that he/she preserves all logs and then attempt a database restart.

Regards

Philip Wood
AlexSavencu
Honored Contributor

Re: Unable to restart the service node in Production

Hi, Philip,

 

in principle I also agree with you.

 

My bad - I did not scroll down in the server log long enough to notice the startup sequence logging.

 

Indeed, it looks like there is something corrupted in this node.

 

You should bring down the node, clean the tmp and work directories, run kUpdateHtml and then restart. If this does not work, you should recreate the node or call HP Support, whichever is faster.

 

cheers

alex


--remember to kudos people who helped solve your problem
Mohit_Agrawal
Frequent Visitor

Re: Unable to restart the service node in Production

Hello Nagesh,

 

I can see in serverLog.txt that there is an error relating to the DB connection (Failed to obtain DB connection from data source). You should check the parameters below in server.conf:

 

com.kintana.core.server.JDBC_URL

com.kintana.core.server.DB_CONNECTION_STRING: If the JDBC_URL parameter is specified, then the security identifier (SID) of the database on which the PPM Center schema resides is requested. It is assumed that the connect string for this database is the same as the SID. However, this is not always the case.

 

You should also check in the workers.properties file that every parameter for that node is correct, such as port_number, host, type, connection_pool_timeout, etc.

 

You could also check that the @node directive in server.conf has all the server configuration parameters set correctly for that specific node in the cluster.
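
To review these settings without opening the whole file, something like the following could list the relevant lines with their line numbers. It is a sketch only: `show_db_params` is a made-up helper, and it assumes each parameter name and each @node directive begins its own line in server.conf.

```shell
#!/bin/sh
# List the JDBC_URL and DB_CONNECTION_STRING lines plus every @node line,
# so it is visible which node section each value sits under.
show_db_params() {
    conf="$1"   # path to server.conf, chosen by the caller

    [ -r "$conf" ] || { echo "cannot read: $conf" >&2; return 1; }

    # -n prefixes each match with its line number; -E allows the alternation.
    grep -nE '^[[:space:]]*(@node|com\.kintana\.core\.server\.(JDBC_URL|DB_CONNECTION_STRING))' "$conf"
}
```

Running it against server.conf and comparing the output with a working environment makes a mistyped or misplaced value easier to spot.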

 

Thanks!!

Mohit Agrawal

 

 

 
