amazon web services - Elasticsearch cluster CPU maxing out at 100 requests per second of search queries


The problem: I'm running an Elasticsearch cluster with 4 indices. One has 4.5 million documents and another has 13 million. The other indices (Marvel, Kibana) are small.

Whenever I run 150 queries per second with JMeter (a testing framework), the CPU maxes out. The more I turn the load up, the more the CPU maxes out.

From what I have read online, performance tuning usually has to do with memory issues, but our boxes are running out of CPU way before memory, causing 6-second response times in the test.

the setup:

3 x client nodes   AWS m3.xlarge    4 cores   16 GB
3 x master nodes   AWS m3.medium    1 core    4 GB   <- I believe
3 x data nodes     AWS c3.2xlarge   8 cores   30 GB

plugins:

  aws
  marvel

document count

account-index-v1.0   4.5 m
entry-index-v1.0     13.1 m

Stats at 160 queries per second, using JMeter to execute the query shown under "queries":

               cpu    load(1m)  mem    %freed    iops
client nodes
hidden:9300    0.0    0.0       7.3    n/a       n/a
hidden:9300    0.0    0.0       4.3    n/a       n/a
hidden:9300    0.0    0.1       8.3    n/a       n/a

data nodes
hidden:9300    99.0   10.2      11.7   69.7 gb   1.2
hidden:9300    71.0   3.0       15.0   69.6 gb   3.9
hidden:9300    16.7   0.3       12.7   69.8 gb   0.1

master nodes
hidden:9300    0.3    0.0       3.0    73.0 gb   0.2
hidden:9300    0.3    0.0       7.0    73.0 gb   0.1
hidden:9300    0.3    0.0       5.0    73.0 gb   0.1
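The CPU figures for the three data nodes are noticeably uneven. As a rough illustration, the sketch below uses only the data-node CPU numbers from the stats to make the spread explicit. The node labels are placeholders, since the real hosts are hidden in the post.

```python
from statistics import mean

# CPU % per data node, copied from the stats in the post.
# The labels are placeholders; the real host names are hidden.
data_node_cpu = {"data-1": 99.0, "data-2": 71.0, "data-3": 16.7}

avg = mean(data_node_cpu.values())
spread = max(data_node_cpu.values()) - min(data_node_cpu.values())

# List the nodes from busiest to least busy, with their distance from the mean.
for node, cpu in sorted(data_node_cpu.items(), key=lambda kv: -kv[1]):
    print(f"{node}: {cpu:5.1f}%  ({cpu - avg:+.1f} vs. mean)")
print(f"mean={avg:.1f}%  spread={spread:.1f} points")
```

A large gap between the busiest and the idlest data node suggests the search load is not evenly distributed, which may be worth checking alongside the raw CPU ceiling.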

queries

{
  "match": {"event_id": "10000"},
  "match": {"race_id_indexed": "10000"},
  "match": {"is_test": "f"},
  "match": {"status": "conf"},
  "must_not": [{"match": {"type": "team"}}],
  "query": {"match_all": {}}
}

config:

marvel.agent.enabled: true
cluster.name: vision
bootstrap.mlockall: true
http.enabled: true
index.number_of_shards: 3
index.number_of_replicas: 1

<%if node.has_key?("ec2") %>
plugin.mandatory: "cloud-aws"
discovery.type: "ec2"
discovery.ec2.groups: "<%= node["ec2"]["security_groups"][0] %>"
discovery.ec2.ping_timeout: "120s"
discovery.zen.ping.multicast.enabled: false
<% else %>
discovery.zen.ping.multicast.enabled: true
<% end %>
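For context, the shard settings in the config (index.number_of_shards: 3, index.number_of_replicas: 1) determine how much shard-level work each search generates. The arithmetic below is a back-of-the-envelope sketch. It assumes that these settings apply to the index being queried, that each search targets a single index, and that shard copies and requests are spread evenly over the 3 data nodes. None of this is confirmed in the post.

```python
# Values taken from the config and setup sections of the post.
number_of_shards = 3
number_of_replicas = 1
data_nodes = 3
queries_per_second = 160  # JMeter load used in the test

# Each primary shard has `number_of_replicas` extra copies.
shard_copies = number_of_shards * (1 + number_of_replicas)
copies_per_node = shard_copies / data_nodes

# Assumption: one search on one index queries one copy of every shard.
shard_requests_per_second = queries_per_second * number_of_shards
shard_requests_per_node = shard_requests_per_second / data_nodes

print(f"shard copies per index: {shard_copies}")
print(f"shard copies per data node: {copies_per_node:.0f}")
print(f"shard-level requests/s (cluster): {shard_requests_per_second}")
print(f"shard-level requests/s (per data node, if balanced): {shard_requests_per_node:.0f}")
```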

As per the comment, I changed the query to use filters. CPU usage went down, but it is still at 60-70% at 160 rps.

Here is the query:

{
    "search": {
        "filter": {
            "terms": {
                "event_id": ["10000"],
                "race_id_indexed": ["100000"],
                "is_test": ["f"],
                "status": ["conf"]
            }
        }
    }
}

The optimized query helped performance, but I'm still getting high CPU usage according to Marvel.
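For comparison, here is a hedged sketch of how the same conditions could be written as non-scoring term filters. It assumes the Elasticsearch 1.x query DSL (a "filtered" query wrapping a "bool" filter, a form that later versions removed) and that the fields are indexed so that exact term matches work. The helper name build_filtered_query is hypothetical. The values come from the first query in the post, including its must_not on type.

```python
import json


def build_filtered_query(event_id, race_id, is_test, status, excluded_type):
    """Build a search body whose conditions are all term filters (no scoring)."""
    return {
        "query": {
            "filtered": {
                "filter": {
                    "bool": {
                        "must": [
                            {"term": {"event_id": event_id}},
                            {"term": {"race_id_indexed": race_id}},
                            {"term": {"is_test": is_test}},
                            {"term": {"status": status}},
                        ],
                        "must_not": [{"term": {"type": excluded_type}}],
                    }
                }
            }
        }
    }


# Values copied from the first query in the post.
body = build_filtered_query("10000", "10000", "f", "conf", "team")
print(json.dumps(body, indent=2))
```

The printed JSON is only the request body. How it is sent (JMeter, curl, a client library) is left open.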

hot_threads:

::: [gazelle][rj0rbskiqsep5ozdb_u9oq][esn2-a828d755-dev][inet[/10.11.180.189:9300]]{master=false}
   23.3% (116.4ms out of 500ms) cpu usage thread 'elasticsearch[gazelle][search][t#6]'
     9/10 snapshots sharing following 10 elements
       sun.misc.unsafe.park(native method)
       java.util.concurrent.locks.locksupport.park(locksupport.java:186)
       java.util.concurrent.linkedtransferqueue.awaitmatch(linkedtransferqueue.java:735)
       java.util.concurrent.linkedtransferqueue.xfer(linkedtransferqueue.java:644)
       java.util.concurrent.linkedtransferqueue.take(linkedtransferqueue.java:1137)
       org.elasticsearch.common.util.concurrent.sizeblockingqueue.take(sizeblockingqueue.java:162)
       java.util.concurrent.threadpoolexecutor.gettask(threadpoolexecutor.java:1068)
       java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1130)
       java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615)
       java.lang.thread.run(thread.java:745)
     unique snapshot
       java.util.hashtable.get(hashtable.java:433)
       org.apache.log4j.hierarchy.getlogger(hierarchy.java:273)
       org.apache.log4j.hierarchy.getlogger(hierarchy.java:247)
       org.apache.log4j.logmanager.getlogger(logmanager.java:228)
       org.apache.log4j.logger.getlogger(logger.java:104)
       org.elasticsearch.common.logging.log4j.log4jesloggerfactory.newinstance(log4jesloggerfactory.java:39)
       org.elasticsearch.common.logging.esloggerfactory.getlogger(esloggerfactory.java:62)
       org.elasticsearch.common.logging.loggers.getlogger(loggers.java:147)
       org.elasticsearch.common.logging.loggers.getlogger(loggers.java:110)
       org.elasticsearch.common.logging.loggers.getlogger(loggers.java:84)
       org.elasticsearch.common.component.abstractcomponent.<init>(abstractcomponent.java:38)
       org.elasticsearch.common.util.bigarrays.<init>(bigarrays.java:377)
       org.elasticsearch.common.util.bigarrays.withcircuitbreaking(bigarrays.java:420)
       org.elasticsearch.search.internal.defaultsearchcontext.<init>(defaultsearchcontext.java:201)
       org.elasticsearch.search.searchservice.createcontext(searchservice.java:536)
       org.elasticsearch.search.searchservice.createandputcontext(searchservice.java:515)
       org.elasticsearch.search.searchservice.executequeryphase(searchservice.java:277)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:776)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:767)
       org.elasticsearch.transport.netty.messagechannelhandler$requesthandler.run(messagechannelhandler.java:275)
       java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1145)
       java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615)
       java.lang.thread.run(thread.java:745)
   19.6% (98ms out of 500ms) cpu usage thread 'elasticsearch[gazelle][search][t#9]'
     3/10 snapshots sharing following 19 elements
       org.apache.lucene.search.constantscorequery$constantscorer.nextdoc(constantscorequery.java:257)
       org.apache.lucene.search.weight$defaultbulkscorer.scoreall(weight.java:192)
       org.apache.lucene.search.weight$defaultbulkscorer.score(weight.java:163)
       org.apache.lucene.search.bulkscorer.score(bulkscorer.java:35)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:621)
       org.elasticsearch.search.internal.contextindexsearcher.search(contextindexsearcher.java:191)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:491)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:448)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:281)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:269)
       org.elasticsearch.search.query.queryphase.execute(queryphase.java:157)
       org.elasticsearch.search.searchservice.loadorexecutequeryphase(searchservice.java:272)
       org.elasticsearch.search.searchservice.executequeryphase(searchservice.java:283)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:776)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:767)
       org.elasticsearch.transport.netty.messagechannelhandler$requesthandler.run(messagechannelhandler.java:275)
       java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1145)
       java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615)
       java.lang.thread.run(thread.java:745)
     7/10 snapshots sharing following 2 elements
       java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615)
       java.lang.thread.run(thread.java:745)
   19.5% (97.4ms out of 500ms) cpu usage thread 'elasticsearch[gazelle][search][t#7]'
     3/10 snapshots sharing following 19 elements
       org.apache.lucene.search.constantscorequery$constantscorer.nextdoc(constantscorequery.java:257)
       org.apache.lucene.search.weight$defaultbulkscorer.scoreall(weight.java:192)
       org.apache.lucene.search.weight$defaultbulkscorer.score(weight.java:163)
       org.apache.lucene.search.bulkscorer.score(bulkscorer.java:35)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:621)
       org.elasticsearch.search.internal.contextindexsearcher.search(contextindexsearcher.java:191)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:491)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:448)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:281)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:269)
       org.elasticsearch.search.query.queryphase.execute(queryphase.java:157)
       org.elasticsearch.search.searchservice.loadorexecutequeryphase(searchservice.java:272)
       org.elasticsearch.search.searchservice.executequeryphase(searchservice.java:283)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:776)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:767)
       org.elasticsearch.transport.netty.messagechannelhandler$requesthandler.run(messagechannelhandler.java:275)
       java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1145)
       java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615)
       java.lang.thread.run(thread.java:745)
     7/10 snapshots sharing following 10 elements
       sun.misc.unsafe.park(native method)
       java.util.concurrent.locks.locksupport.park(locksupport.java:186)
       java.util.concurrent.linkedtransferqueue.awaitmatch(linkedtransferqueue.java:735)
       java.util.concurrent.linkedtransferqueue.xfer(linkedtransferqueue.java:644)
       java.util.concurrent.linkedtransferqueue.take(linkedtransferqueue.java:1137)
       org.elasticsearch.common.util.concurrent.sizeblockingqueue.take(sizeblockingqueue.java:162)
       java.util.concurrent.threadpoolexecutor.gettask(threadpoolexecutor.java:1068)
       java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1130)
       java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615)
       java.lang.thread.run(thread.java:745)

::: [shiva][otf4wq2mry6iuixnkjr8hw][esn3-c2c5423e-dev][inet[/10.111.181.149:9300]]{master=false}
   37.9% (189.3ms out of 500ms) cpu usage thread 'elasticsearch[shiva][search][t#9]'
     6/10 snapshots sharing following 18 elements
       org.apache.lucene.search.weight$defaultbulkscorer.scoreall(weight.java:193)
       org.apache.lucene.search.weight$defaultbulkscorer.score(weight.java:163)
       org.apache.lucene.search.bulkscorer.score(bulkscorer.java:35)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:621)
       org.elasticsearch.search.internal.contextindexsearcher.search(contextindexsearcher.java:191)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:491)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:448)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:281)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:269)
       org.elasticsearch.search.query.queryphase.execute(queryphase.java:157)
       org.elasticsearch.search.searchservice.loadorexecutequeryphase(searchservice.java:272)
       org.elasticsearch.search.searchservice.executequeryphase(searchservice.java:283)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:776)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:767)
       org.elasticsearch.transport.netty.messagechannelhandler$requesthandler.run(messagechannelhandler.java:275)
       java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1145)
       java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615)
       java.lang.thread.run(thread.java:745)
     4/10 snapshots sharing following 19 elements
       org.apache.lucene.search.constantscorequery$constantscorer.nextdoc(constantscorequery.java:257)
       org.apache.lucene.search.weight$defaultbulkscorer.scoreall(weight.java:192)
       org.apache.lucene.search.weight$defaultbulkscorer.score(weight.java:163)
       org.apache.lucene.search.bulkscorer.score(bulkscorer.java:35)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:621)
       org.elasticsearch.search.internal.contextindexsearcher.search(contextindexsearcher.java:191)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:491)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:448)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:281)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:269)
       org.elasticsearch.search.query.queryphase.execute(queryphase.java:157)
       org.elasticsearch.search.searchservice.loadorexecutequeryphase(searchservice.java:272)
       org.elasticsearch.search.searchservice.executequeryphase(searchservice.java:283)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:776)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:767)
       org.elasticsearch.transport.netty.messagechannelhandler$requesthandler.run(messagechannelhandler.java:275)
       java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1145)
       java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615)
       java.lang.thread.run(thread.java:745)
   32.2% (161ms out of 500ms) cpu usage thread 'elasticsearch[shiva][search][t#4]'
     5/10 snapshots sharing following 18 elements
       org.apache.lucene.search.weight$defaultbulkscorer.scoreall(weight.java:193)
       org.apache.lucene.search.weight$defaultbulkscorer.score(weight.java:163)
       org.apache.lucene.search.bulkscorer.score(bulkscorer.java:35)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:621)
       org.elasticsearch.search.internal.contextindexsearcher.search(contextindexsearcher.java:191)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:491)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:448)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:281)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:269)
       org.elasticsearch.search.query.queryphase.execute(queryphase.java:157)
       org.elasticsearch.search.searchservice.loadorexecutequeryphase(searchservice.java:272)
       org.elasticsearch.search.searchservice.executequeryphase(searchservice.java:283)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:776)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:767)
       org.elasticsearch.transport.netty.messagechannelhandler$requesthandler.run(messagechannelhandler.java:275)
       java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1145)
       java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615)
       java.lang.thread.run(thread.java:745)
     5/10 snapshots sharing following 19 elements
       org.apache.lucene.search.constantscorequery$constantscorer.nextdoc(constantscorequery.java:257)
       org.apache.lucene.search.weight$defaultbulkscorer.scoreall(weight.java:192)
       org.apache.lucene.search.weight$defaultbulkscorer.score(weight.java:163)
       org.apache.lucene.search.bulkscorer.score(bulkscorer.java:35)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:621)
       org.elasticsearch.search.internal.contextindexsearcher.search(contextindexsearcher.java:191)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:491)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:448)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:281)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:269)
       org.elasticsearch.search.query.queryphase.execute(queryphase.java:157)
       org.elasticsearch.search.searchservice.loadorexecutequeryphase(searchservice.java:272)
       org.elasticsearch.search.searchservice.executequeryphase(searchservice.java:283)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:776)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:767)
       org.elasticsearch.transport.netty.messagechannelhandler$requesthandler.run(messagechannelhandler.java:275)
       java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1145)
       java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615)
       java.lang.thread.run(thread.java:745)
   31.5% (157.6ms out of 500ms) cpu usage thread 'elasticsearch[shiva][search][t#11]'
     5/10 snapshots sharing following 18 elements
       org.apache.lucene.search.weight$defaultbulkscorer.scoreall(weight.java:193)
       org.apache.lucene.search.weight$defaultbulkscorer.score(weight.java:163)
       org.apache.lucene.search.bulkscorer.score(bulkscorer.java:35)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:621)
       org.elasticsearch.search.internal.contextindexsearcher.search(contextindexsearcher.java:191)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:491)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:448)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:281)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:269)
       org.elasticsearch.search.query.queryphase.execute(queryphase.java:157)
       org.elasticsearch.search.searchservice.loadorexecutequeryphase(searchservice.java:272)
       org.elasticsearch.search.searchservice.executequeryphase(searchservice.java:283)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:776)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:767)
       org.elasticsearch.transport.netty.messagechannelhandler$requesthandler.run(messagechannelhandler.java:275)
       java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1145)
       java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615)
       java.lang.thread.run(thread.java:745)
     5/10 snapshots sharing following 19 elements
       org.apache.lucene.search.constantscorequery$constantscorer.nextdoc(constantscorequery.java:257)
       org.apache.lucene.search.weight$defaultbulkscorer.scoreall(weight.java:192)
       org.apache.lucene.search.weight$defaultbulkscorer.score(weight.java:163)
       org.apache.lucene.search.bulkscorer.score(bulkscorer.java:35)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:621)
       org.elasticsearch.search.internal.contextindexsearcher.search(contextindexsearcher.java:191)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:491)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:448)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:281)
       org.apache.lucene.search.indexsearcher.search(indexsearcher.java:269)
       org.elasticsearch.search.query.queryphase.execute(queryphase.java:157)
       org.elasticsearch.search.searchservice.loadorexecutequeryphase(searchservice.java:272)
       org.elasticsearch.search.searchservice.executequeryphase(searchservice.java:283)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:776)
       org.elasticsearch.search.action.searchservicetransportaction$searchquerytransporthandler.messagereceived(searchservicetransportaction.java:767)
       org.elasticsearch.transport.netty.messagechannelhandler$requesthandler.run(messagechannelhandler.java:275)
       java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1145)
       java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615)
       java.lang.thread.run(thread.java:745)

::: [rachel summers][fbcqmcl8qiuyabddt9tvqg][esn1-38caed17-dev][inet[/10.145.222.206:9300]]{master=false}
   15.3% (76.3ms out of 500ms) cpu usage thread 'elasticsearch[rachel summers][search][t#1]'
     10/10 snapshots sharing following 2 elements
       java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615)
       java.lang.thread.run(thread.java:745)
   12.9% (64.6ms out of 500ms) cpu usage thread 'elasticsearch[rachel summers][search][t#9]'
     10/10 snapshots sharing following 10 elements
       sun.misc.unsafe.park(native method)
       java.util.concurrent.locks.locksupport.park(locksupport.java:186)
       java.util.concurrent.linkedtransferqueue.awaitmatch(linkedtransferqueue.java:735)
       java.util.concurrent.linkedtransferqueue.xfer(linkedtransferqueue.java:644)
       java.util.concurrent.linkedtransferqueue.take(linkedtransferqueue.java:1137)
       org.elasticsearch.common.util.concurrent.sizeblockingqueue.take(sizeblockingqueue.java:162)
       java.util.concurrent.threadpoolexecutor.gettask(threadpoolexecutor.java:1068)
       java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1130)
       java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615)
       java.lang.thread.run(thread.java:745)
   11.9% (59.6ms out of 500ms) cpu usage thread 'elasticsearch[rachel summers][search][t#10]'
     10/10 snapshots sharing following 10 elements
       sun.misc.unsafe.park(native method)
       java.util.concurrent.locks.locksupport.park(locksupport.java:186)
       java.util.concurrent.linkedtransferqueue.awaitmatch(linkedtransferqueue.java:735)
       java.util.concurrent.linkedtransferqueue.xfer(linkedtransferqueue.java:644)
       java.util.concurrent.linkedtransferqueue.take(linkedtransferqueue.java:1137)
       org.elasticsearch.common.util.concurrent.sizeblockingqueue.take(sizeblockingqueue.java:162)
       java.util.concurrent.threadpoolexecutor.gettask(threadpoolexecutor.java:1068)
       java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1130)
       java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:615)
       java.lang.thread.run(thread.java:745)
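A dump this long is easier to compare across nodes when reduced to one line per node. The sketch below is a minimal, hypothetical helper (summarize_hot_threads is not part of Elasticsearch) that assumes the dump has one header per line, as in raw hot_threads output. It accepts both "cpu usage thread" as pasted here and "cpu usage by thread". The sample input is a shortened excerpt of the dump from the post.

```python
import re
from collections import defaultdict

# Node header, e.g. "::: [gazelle][...]"; captures the node name.
NODE_RE = re.compile(r"^::: \[([^\]]+)\]")
# Thread header, e.g. "23.3% (116.4ms out of 500ms) cpu usage thread '...'".
THREAD_RE = re.compile(
    r"^\s*(\d+(?:\.\d+)?)% \(.*?\) cpu usage (?:by )?thread '([^']+)'"
)


def summarize_hot_threads(text):
    """Return {node_name: [(cpu_percent, thread_name), ...]} for a dump."""
    summary = defaultdict(list)
    node = None
    for line in text.splitlines():
        node_match = NODE_RE.match(line)
        if node_match:
            node = node_match.group(1)
            continue
        thread_match = THREAD_RE.match(line)
        if thread_match and node is not None:
            cpu = float(thread_match.group(1))
            summary[node].append((cpu, thread_match.group(2)))
    return dict(summary)


# Shortened excerpt of the dump from the post (stack frames omitted).
sample = """\
::: [gazelle][rj0rbskiqsep5ozdb_u9oq][esn2-a828d755-dev][inet[/10.11.180.189:9300]]{master=false}
   23.3% (116.4ms out of 500ms) cpu usage thread 'elasticsearch[gazelle][search][t#6]'
   19.6% (98ms out of 500ms) cpu usage thread 'elasticsearch[gazelle][search][t#9]'
::: [shiva][otf4wq2mry6iuixnkjr8hw][esn3-c2c5423e-dev][inet[/10.111.181.149:9300]]{master=false}
   37.9% (189.3ms out of 500ms) cpu usage thread 'elasticsearch[shiva][search][t#9]'
"""

result = summarize_hot_threads(sample)
for node, threads in result.items():
    total = sum(cpu for cpu, _ in threads)
    print(f"{node}: {len(threads)} hot threads, {total:.1f}% combined")
```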

