allow to control Suggester rebuilds #3933

@aerofeev2k

Description

I've noticed that pulling the webapp configuration and immediately pushing it back unchanged triggers the Suggester, which in turn drives CPU utilization up to 2000% (I have rebuildThreadPoolSizeInNcpuPercent set to 10, and the machine has 40 CPU cores).
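
For reference, the arithmetic behind that setting can be sketched as below. This assumes the rebuild pool is sized as a percentage of the available cores with a floor of one thread; the class and method names are hypothetical and this is not OpenGrok's actual code.

```java
// Hypothetical helper illustrating the percentage-to-threads arithmetic.
// Assumption: pool size = cores * percent / 100, rounded down, never below 1.
final class RebuildPoolSize {

    static int fromPercent(int cpuCores, int percent) {
        if (cpuCores < 1 || percent < 0 || percent > 100) {
            throw new IllegalArgumentException("cores must be >= 1 and percent within 0..100");
        }
        int threads = (int) ((long) cpuCores * percent / 100);
        return Math.max(1, threads);
    }

    public static void main(String[] args) {
        // The values reported above: 40 cores, setting of 10.
        System.out.println(fromPercent(40, 10)); // prints 4
    }
}
```

Under that assumption the pool would hold 4 threads, and four fully busy threads would account for roughly 400% CPU, well below the 2000% observed.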

The strange thing is that the main indices have not been updated since the Suggester last ran. I thought the Suggester kept these version.txt files with the last seen index generation commit number, so couldn't it have used that as a hint that no rescanning is necessary?
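
The kind of check I have in mind could look roughly like the sketch below. It assumes the marker file holds a single number and that the caller can obtain the current index version (with Lucene that might come from something like DirectoryReader.getVersion(), but that mapping is my assumption). All names here are hypothetical, not OpenGrok's actual code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: skip the rebuild when the recorded version matches the index.
final class SuggesterVersionCheck {

    /** Returns true when the marker is missing, unparsable, or differs from the index version. */
    static boolean needsRebuild(Path versionFile, long currentIndexVersion) throws IOException {
        if (!Files.isRegularFile(versionFile)) {
            return true; // never built, or marker was removed
        }
        String recorded = Files.readString(versionFile, StandardCharsets.UTF_8).trim();
        try {
            return Long.parseLong(recorded) != currentIndexVersion;
        } catch (NumberFormatException e) {
            return true; // corrupt marker: rebuild to be safe
        }
    }

    /** Records the index version the suggester data was built from. */
    static void recordVersion(Path versionFile, long indexVersion) throws IOException {
        Files.writeString(versionFile, Long.toString(indexVersion), StandardCharsets.UTF_8);
    }
}
```

With a check like this, a config push that leaves the indices untouched would find the recorded version equal to the current one and return without rescanning.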

While the Suggester is spinning, I see four threads sitting in roughly this stack:

"ForkJoinPool-1-worker-3" #552 daemon prio=5 os_prio=0 cpu=66057.03ms elapsed=136.58s tid=0x00007fdbe8004000 nid=0xa961 runnable  [0x00007fdbe3ffd000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.lucene.codecs.blocktree.SegmentTermsEnum.pushFrame(SegmentTermsEnum.java:254)
        at org.apache.lucene.codecs.blocktree.SegmentTermsEnum.pushFrame(SegmentTermsEnum.java:246)
        at org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:540)
        at org.apache.lucene.index.LeafReader.docFreq(LeafReader.java:82)
        at org.apache.lucene.index.BaseCompositeReader.docFreq(BaseCompositeReader.java:149)
        at org.opengrok.suggest.SuggesterUtils.computeNormalizedDocumentFrequency(SuggesterUtils.java:108)
        at org.opengrok.suggest.SuggesterUtils.computeScore(SuggesterUtils.java:97)
        at org.opengrok.suggest.SuggesterProjectData$WFSTInputIterator.weight(SuggesterProjectData.java:606)
        at org.apache.lucene.search.suggest.SortedInputIterator.sort(SortedInputIterator.java:184)
        at org.apache.lucene.search.suggest.SortedInputIterator.<init>(SortedInputIterator.java:76)
        at org.apache.lucene.search.suggest.SortedInputIterator.<init>(SortedInputIterator.java:62)
        at org.apache.lucene.search.suggest.fst.WFSTCompletionLookup$WFSTInputIterator.<init>(WFSTCompletionLookup.java:273)
        at org.apache.lucene.search.suggest.fst.WFSTCompletionLookup.build(WFSTCompletionLookup.java:115)
        at org.opengrok.suggest.SuggesterProjectData.build(SuggesterProjectData.java:266)
        at org.opengrok.suggest.SuggesterProjectData.build(SuggesterProjectData.java:253)
        at org.opengrok.suggest.SuggesterProjectData.init(SuggesterProjectData.java:157)
        at org.opengrok.suggest.Suggester.lambda$getInitRunnable$1(Suggester.java:231)
        at org.opengrok.suggest.Suggester$$Lambda$458/0x0000000800422040.run(Unknown Source)
        at java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(java.base@11.0.14/ForkJoinTask.java:1407)
        at java.util.concurrent.ForkJoinTask.doExec(java.base@11.0.14/ForkJoinTask.java:290)
        at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(java.base@11.0.14/ForkJoinPool.java:1020)
        at java.util.concurrent.ForkJoinPool.scan(java.base@11.0.14/ForkJoinPool.java:1656)
        at java.util.concurrent.ForkJoinPool.runWorker(java.base@11.0.14/ForkJoinPool.java:1594)
        at java.util.concurrent.ForkJoinWorkerThread.run(java.base@11.0.14/ForkJoinWorkerThread.java:183)

I noticed this because after one such configuration update Tomcat became completely unresponsive, even running out of its 8 GB of memory:

10-Apr-2022 10:48:13.479 INFO [configuration-3-thread-1] org.opengrok.indexer.configuration.RuntimeEnvironment.applyConfig Done applying configuration
10-Apr-2022 10:48:13.587 INFO [Thread-3677] org.opengrok.suggest.Suggester.init Initializing suggester
10-Apr-2022 10:48:13.858 WARNING [ForkJoinPool-102-worker-63] org.opengrok.suggest.SuggesterProjectData.initFields Fields [hist] will be ignored because they were not found in index directory MMapDirectory@/opengrok/data/index/solaris lockFactory=org.apache.lucene.store.NativeFSLockFactory@903a800
10-Apr-2022 10:48:13.539 SEVERE [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-5] org.apache.coyote.AbstractProtocol$ConnectionHandler.process Failed to complete processing of a request
        java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "ajp-nio-0:0:0:0:0:0:0:1-8009-exec-10" java.lang.OutOfMemoryError: GC overhead limit exceeded
10-Apr-2022 11:06:06.037 SEVERE [http-nio-8080-exec-3] org.apache.coyote.AbstractProtocol$ConnectionHandler.process Failed to complete processing of a request
        java.lang.OutOfMemoryError: GC overhead limit exceeded
10-Apr-2022 11:06:45.230 SEVERE [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-4] org.apache.coyote.AbstractProtocol$ConnectionHandler.process Failed to complete processing of a request
        java.lang.OutOfMemoryError: GC overhead limit exceeded

Activity

aerofeev2k commented on Apr 11, 2022

Author

Btw, given the unpredictable Suggester activity I'm seeing, e.g. here and in the other report where it makes two passes over the same project, can Suggester rebuilds be driven completely manually? I know that I can use REST to kick one off, but is there an option to prevent it from deciding automatically when to rescan/reload/rebuild?

I mean, if I'm scheduling rebuilds of the main indices manually, at the time that I think is right for my workflow, it only makes sense for me to have similarly manual control over when the Suggester uses those indices to rebuild its own DB.

Title changed from "Suggester taking all resources after a NOOP config update" to "allow to control Suggester rebuilds" on Apr 26, 2022

vladak commented on Apr 26, 2022

Member

Currently not possible. A tunable would have to be added to avoid all the suggester rebuild() calls.
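
A minimal sketch of what such a tunable might look like, assuming every rebuild goes through one gate that knows what triggered it. The class, the enum, and the flag name are all hypothetical; nothing like this exists in OpenGrok today.

```java
// Hypothetical sketch of a tunable that suppresses automatic suggester rebuilds.
final class RebuildGate {

    /** What asked for the rebuild; only MANUAL_REQUEST bypasses the tunable. */
    enum Trigger { CONFIG_UPDATE, SCHEDULED, MANUAL_REQUEST }

    private final boolean automaticRebuildsAllowed;
    private final Runnable rebuildAction;

    RebuildGate(boolean automaticRebuildsAllowed, Runnable rebuildAction) {
        this.automaticRebuildsAllowed = automaticRebuildsAllowed;
        this.rebuildAction = rebuildAction;
    }

    /** Runs the rebuild if the trigger is permitted; returns whether it ran. */
    boolean requestRebuild(Trigger trigger) {
        if (trigger != Trigger.MANUAL_REQUEST && !automaticRebuildsAllowed) {
            return false; // automatic rebuilds are switched off
        }
        rebuildAction.run();
        return true;
    }
}
```

With the flag set to false, a configuration push or a timer would be ignored, while an explicit request (for example one arriving over REST) would still rebuild.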

          allow to control Suggester rebuilds · Issue #3933 · oracle/opengrok