HPE Content Manager (HPE RM) Discussion, Information and News

Any way to speed up the Content Indexing?

Ralph_Furino
Honored Contributor.

Any way to speed up the Content Indexing?

I did a massive import of over 600k electronic documents. In the import there is one title field that needs to have context indexing. When I finished the import, the context indexing queue held about 580k items (that was 6 days ago). Since then, the queue has processed only 130k. At this rate, it's going to take over a month for the queue to catch up, and that is only if I do not import any more electronic documents that need context indexing.

 

As you can see, I have one dedicated server to handle Content Indexing--It's a brand new box with 32GB of memory, 2 processors with 6 cores each. I am running Windows Server 2008 R2 Enterprise. The CPU usage doesn't go above 10%!

 

Any way to speed this up??

17 REPLIES
Sander Hoogwerf
Outstanding Contributor.

Re: Any way to speed up the Content Indexing?

Content indexing is a serial process and won't use more than 1 core (until ISYS optimizes their engine).

You can configure the content indexing to process larger batches (default 10). This way the queue should be processed much quicker. Also try to keep the DCI size relatively small (1 or 2GB), because larger sizes tend to slow processing down as well.

 


(Any opinions expressed in this forum are my own personal opinion and should not be construed as an official statement by DXC Technology.)

Analytics & Data Management
Application & Business Services
DXC Technology
Damitha
Regular Collector

Re: Any way to speed up the Content Indexing?

Try this registry key

HKEY_LOCAL_MACHINE\SOFTWARE\Hewlett-Packard\HP TRIM\WorkgroupServers\DatabaseEventBatchSize 

Regards,

Damitha Bogahawatta
Principal Consultant
Talent Consulting Services
Ralph_Furino
Honored Contributor.

Re: Any way to speed up the Content Indexing?

Damitha,

 

Thanks for the reply, but I am running 7.1. That setting doesn't exist anymore. :(

Grundy
Acclaimed Contributor.

Re: Any way to speed up the Content Indexing?

These keys won't exist by default, Ralph; you will need to create them. :) Damitha recently confirmed the keys with support/R&D, so this should be the correct one.


::::::::::::::::::::::
NOT A HP EMPLOYEE
::::::::::::::::::::::

INFORMOTION.com.au
Sander Hoogwerf
Outstanding Contributor.

Re: Any way to speed up the Content Indexing?

But be careful when using that key if you're running an Oracle database and TRIM 7.x. In that case the event server will (almost certainly) loop over the same records over and over again!

If you insist on using that setting with an Oracle database, you must make sure that the number of event processors running on one machine multiplied by the batch size doesn't exceed 1000 items, or you'll be in trouble.
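The rule above can be sketched as a quick check. This is only an illustration: the function names are made up for this example, and how many event processors actually run on a given machine is something you would have to confirm for your own setup.

```python
# Limit quoted in the post above; Oracle reports ORA-01795 beyond it.
ORACLE_IN_LIST_LIMIT = 1000


def batch_size_is_safe(event_processors: int, batch_size: int,
                       limit: int = ORACLE_IN_LIST_LIMIT) -> bool:
    """Return True when processors * batch size does not exceed the limit."""
    return event_processors * batch_size <= limit


def max_safe_batch_size(event_processors: int,
                        limit: int = ORACLE_IN_LIST_LIMIT) -> int:
    """Largest batch size that keeps the product within the limit."""
    if event_processors < 1:
        raise ValueError("need at least one event processor")
    return limit // event_processors


if __name__ == "__main__":
    # Hypothetical example: 4 event processors on one workgroup server.
    print(batch_size_is_safe(4, 250))   # product is exactly 1000
    print(batch_size_is_safe(4, 251))   # product is 1004
    print(max_safe_batch_size(4))
```

With four event processors, for example, a batch size of 250 is the largest value that still satisfies the rule.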


Ralph_Furino
Honored Contributor.

Re: Any way to speed up the Content Indexing?

Sander,

 

Could you please explain your comment in more detail? I would like to try the suggested solution but don't want the content indexing to run in a loop. Would I need to look at the number of rows in the tseventdat table? Or do you mean take the number in the context indexing queue and divide that by the number of rows in the tseventdat table, then put that number in the registry entry as a decimal value?

 

Thanks in advance!

 

 

Ralph_Furino
Honored Contributor.

Re: Any way to speed up the Content Indexing?

Also, what type of key would I be entering under 'WorkgroupServers'? Would it be a QWORD (64-bit), since I am running 64-bit? A Binary Value? A DWORD (32-bit)?

 

See the picture that I attached.

 

Thanks in advance!

Sander Hoogwerf
Outstanding Contributor.

Re: Any way to speed up the Content Indexing?

Ralph,

 

With Oracle, TRIM attempts to delete the processed events from the table immediately after processing them, which results in a SQL statement like "DELETE FROM TSEVENTDAT WHERE URI IN (1,2,3, ..., 1000) ..." (unlike 6.2 and before, where a delete based on the last processed URI and timestamp ran once a day).

 

Oracle will fail when there are more than 1000 items in the IN-list (ORA-01795). TRIM will then ignore the error and happily get the "next batch" of items. Unfortunately, the delete failed, so the next batch of events is identical to the previous batch. Effectively the event processor is looping over the same events again and again.

SQL Server, unlike Oracle, will accept many more items in the IN-list, so it won't fail.
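To make the looping behaviour concrete, here is a toy simulation in Python. It is not TRIM code: the queue is just a list of integers standing in for event URIs, and `delete_events` and `run_passes` are hypothetical helpers that mimic a delete which is rejected, and whose failure is ignored, once the IN-list grows past 1000 items.

```python
IN_LIST_LIMIT = 1000  # Oracle rejects longer IN-lists with ORA-01795


def delete_events(queue, uris, enforce_limit):
    """Remove processed URIs from the queue.

    Returns False and leaves the queue untouched when the IN-list
    would be too long, imitating the failed DELETE.
    """
    if enforce_limit and len(uris) > IN_LIST_LIMIT:
        return False
    processed = set(uris)
    queue[:] = [uri for uri in queue if uri not in processed]
    return True


def run_passes(queue_size, batch_size, passes, enforce_limit=True):
    """Run fetch/process/delete passes; return the first URI of each batch."""
    queue = list(range(1, queue_size + 1))
    first_uris = []
    for _ in range(passes):
        batch = queue[:batch_size]
        if not batch:
            break
        first_uris.append(batch[0])
        # The result is deliberately ignored, as described above.
        delete_events(queue, batch, enforce_limit)
    return first_uris


if __name__ == "__main__":
    print(run_passes(5000, 1500, 3))                        # [1, 1, 1]
    print(run_passes(5000, 500, 3))                         # [1, 501, 1001]
    print(run_passes(5000, 1500, 3, enforce_limit=False))   # [1, 1501, 3001]
```

With the limit enforced and a batch of 1500, every pass starts at URI 1 again, which is the loop described above. A batch of 500, or a database without the limit, moves forward through the queue.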

 

 

DWORD (32-bit) should work. Just remember to restart the workgroupserver after changing this registry key to see any effect.
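For anyone who prefers to script the change, the following Python sketch only builds the text of a .reg file for a DWORD value. The key path is taken from this thread and is assumed to be the place where the value lives; the value 100 is purely illustrative, not a recommendation, and `dword_reg_file` is a made-up helper name. Saving and importing the file is left out on purpose.

```python
def dword_reg_file(key_path: str, value_name: str, value: int) -> str:
    """Return .reg file text that sets one DWORD (32-bit) value."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 bits")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key_path}]",
        # .reg files store DWORDs as 8 hexadecimal digits.
        f'"{value_name}"=dword:{value:08x}',
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Key path as quoted earlier in this thread (assumed location of the value).
    key = r"HKEY_LOCAL_MACHINE\SOFTWARE\Hewlett-Packard\HP TRIM\WorkgroupServers"
    # 100 is an illustrative value only.
    print(dword_reg_file(key, "DatabaseEventBatchSize", 100))
```

Note that a decimal 100 appears as 00000064 in the file, because the .reg format writes DWORD data in hexadecimal.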


Ralph_Furino
Honored Contributor.

Re: Any way to speed up the Content Indexing?

OK, I gave it a shot with the registry entry that was suggested. I hope that I did it correctly. (See attached image.) I suspended context indexing, restarted the Workgroup server (I even rebooted just to play it safe), and resumed context indexing. I don't see how this is going to speed up the context indexing. It's still only processing 5 to 6 items at each 30-second update.

 

Do I have the correct registry entry?? Should I have used a QWORD instead of a DWORD because I am running the 64bit version of Enterprise Studio??

 

Thanks in advance!

Grundy
Acclaimed Contributor.

Re: Any way to speed up the Content Indexing?

DWORD is correct.

Just a small thing, don't think it will affect behaviour, but the key is named DatabaseEventBatchSize. (No caps on Base)

There is also another key you can try:

 

DciProcessingIntervalMs

 

You could possibly try using this key. I don't know what the default value is or the value range.
I think the default might be 300000 (5 minutes).

Maybe you could try lowering this as well, try 60000 (1 minute).

 

 

There's also:

 

DatabaseEventPollIntervalMs

 

I think the default is 60000 (1 minute) and you could try something lower, e.g. 30000; even 15000 should still be pretty safe on a fast database server.

 

Let us know how this goes!



Ralph_Furino
Honored Contributor.

Re: Any way to speed up the Content Indexing?

I've made the registry entries that you suggested--I even made the intervals shorter but the content indexing is still processing 5-6 records every 30 seconds. (See pictures)

 

Is there a setting to increase how many records are processed? I have plenty of processing power on the Workgroup and Database servers to handle the load. The DB is running Oracle 11g with 32GB of RAM and is just sitting idle. The processor on the Workgroup server hits 10% at most. (See picture)

 

 

Sander Hoogwerf
Outstanding Contributor.

Re: Any way to speed up the Content Indexing?

Have you:

  • looked at your DCI logs? Anything weird there?
  • checked the Resource Monitor on your server? Any high values on your Disk response times?
  • set the "Number of transactions in an index update file" on the DCI properties dialog to anything higher than the default (1)?

Grundy
Acclaimed Contributor.

Re: Any way to speed up the Content Indexing?

As Sander suggested, what is the value in the 'Number of transactions in index update' on your DCI properties?

I assumed this was checked as I came into the thread a bit late. :)

With that volume of DCI events, I would suggest using a value of 100 to start off with.



Ralph_Furino
Honored Contributor.

Re: Any way to speed up the Content Indexing?

I changed the DCI properties to 100 then 500 and finally left it at 1000. In 3 hours I have taken over 30000 off of the Content Indexing queue.
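For a rough sense of the difference, the figures quoted in this thread can be compared as items per hour. This is back-of-the-envelope arithmetic only; it assumes the rates were steady over the stated periods, and `items_per_hour` is just a helper for this example.

```python
def items_per_hour(items: int, hours: float) -> float:
    """Average throughput over a period."""
    return items / hours


# Before the change: about 130k items in 6 days (from the first post).
before = items_per_hour(130_000, 6 * 24)
# After the change: about 30,000 items in 3 hours (from the post above).
after = items_per_hour(30_000, 3)

if __name__ == "__main__":
    print(f"before:  {before:.0f} items/hour")   # before:  903 items/hour
    print(f"after:   {after:.0f} items/hour")    # after:   10000 items/hour
    print(f"speedup: {after / before:.1f}x")     # speedup: 11.1x
```

On these numbers, raising the transactions setting made the queue drain roughly eleven times faster.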

 

Thank you!! :)

 

PS - I would like to know more about registry modifications that are not documented.

TRIMGuru
Acclaimed Contributor.

Re: Any way to speed up the Content Indexing?

"I did a massive import of over 600k electronic documents. In the import there is one title field that needs to have context indexing."

 

Ralph, do you mean Word Indexing for that one particular field (title field)?

Grundy
Acclaimed Contributor.

Re: Any way to speed up the Content Indexing?


Ralph_Furino wrote:

PS - I would like to know more about registry modifications that are not documented.



There are plenty of keys that aren't documented because they can cause problems for the environment if not implemented and set correctly.

They are known internally, but the documentation is not publicly available.

If something comes up and a key might be useful or needed, support should be able to help.



Ralph_Furino
Honored Contributor.

Re: Any way to speed up the Content Indexing?

Greg,

 

It's Content Indexing, since it shows up in the queue of the Content Indexing bucket in TRIM Enterprise Studio. There are only a few in the Word Indexing queue.
