
RBKB-2015-0001: No flows/monitors/events after a manager cluster update

ID: RBKB-2015-0001
Name: No flows/monitors/events after a manager cluster update
Date: 10/17/2015

Issue

After an update is applied to redBorder's manager cluster, new events of any type (flows, monitors, IPS events, etc.) stop appearing.

Environment

  • redBorder 3.1.38-XX (all releases)

Main Cause

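Several causes can explain this behavior.

One of them is that the Samza topology did not load properly after the update. To check, list the Samza applications with the command below. In this output the application count is 0, which means that no topology is loaded: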

# rb_samza.sh -l
15/10/15 07:39:06 INFO client.RMProxy: Connecting to ResourceManager at RBe2ecp4.e2e01.domain.eg/192.168.11.73:8032
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):0
                Application-Id      Application-Name        Application-Type          User       Queue               State         Final-State         Progress                        Tracking-URL
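
The check above can be scripted. The following is a minimal sketch, assuming the listing always contains a "Total number of applications (...):N" line in the form shown in this article. The function name samza_app_count is hypothetical and is not part of redBorder.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with redBorder): read the output of
# `rb_samza.sh -l` on stdin and print the application count N taken from
# the "Total number of applications (...):N" line.
# Prints nothing if no such line is found.
samza_app_count() {
    sed -n 's/^Total number of applications.*:\([0-9][0-9]*\)[[:space:]]*$/\1/p' | head -n 1
}

# Usage on a manager node (left as a comment so the sketch has no side effects):
#   count=$(rb_samza.sh -l 2>/dev/null | samza_app_count)
#   [ "$count" = "0" ] && echo "No Samza applications are loaded"
```

A count of 0 matches the faulty state shown above. An empty result means the summary line was not found, so the output should be inspected by hand.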

 

Resolution

To fix the Samza topology problem, reload the topology by running this command:

# rb_samza.sh -e
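
To avoid reloading a topology that is already present, the check and the reload can be combined. This is only a sketch: it uses the two rb_samza.sh options shown in this article (-l and -e) and nothing else. The RB_SAMZA variable and the reload_if_empty function are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# RB_SAMZA is an assumption of this sketch: the command to invoke,
# overridable so the logic can be tried without touching a real cluster.
RB_SAMZA=${RB_SAMZA:-rb_samza.sh}

# Hypothetical helper: run the reload (-e) only when the listing (-l)
# reports zero applications.
reload_if_empty() {
    # Extract N from the "Total number of applications (...):N" line.
    count=$("$RB_SAMZA" -l 2>/dev/null |
        sed -n 's/^Total number of applications.*:\([0-9][0-9]*\)[[:space:]]*$/\1/p' |
        head -n 1)
    case "$count" in
        '') echo "Could not read the application count" >&2
            return 2 ;;
        0)  "$RB_SAMZA" -e ;;
        *)  echo "Samza already reports $count application(s); nothing to do" ;;
    esac
}

# Usage on a manager node (left as a comment so the sketch has no side effects):
#   reload_if_empty
```

When the count cannot be read, the function returns without reloading, so a failed listing never triggers a reload by accident.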

For instance:

# rb_samza.sh -e
15/10/15 07:39:13 INFO client.RMProxy: Connecting to ResourceManager at RBe2ecp4.e2e01.domain.eg/192.168.11.73:8032
java version "1.7.0_03"
Java(TM) SE Runtime Environment (build 1.7.0_03-b04)
Java HotSpot(TM) 64-Bit Server VM (build 22.1-b02, mixed mode)
/opt/chef-server/embedded/jre/bin/java -Dlog4j.configuration=file:/opt/rb/var/rb-samza-bi/bin/log4j-console.xml -Dsamza.log.dir=/opt/rb/var/rb-samza-bi -Djava.io.tmpdir=/opt/rb/var/rb-samza-bi/tmp -Xmx768M -XX:+PrintGCDateStamps -Xloggc:/opt/rb/var/rb-samza-bi/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10241024 -d64 -cp /opt/rb/etc/hadoop:/opt/rb/var/rb-samza-bi/lib/RoaringBitmap-0.4.5.jar:/opt/rb/var/rb-samza-bi/lib/activation-1.1.jar:/opt/rb/var/rb-samza-bi/lib/aether-api-0.9.0.M2.jar:/opt/rb/var/rb-samza-bi/lib/aether-connector-file-0.9.0.M2.jar:/opt/rb/var/rb-samza-bi/lib/aether-connector-okhttp-0.0.9.jar:/opt/rb/var/rb-samza-bi/lib/aether-impl-0.9.0.M2.jar:/opt/rb/var/rb-samza-bi/lib/aether-spi-0.9.0.M2.jar:/opt/rb/var/rb-samza-bi/lib/aether-util-0.9.0.M2.jar:/opt/rb/var/rb-samza-bi/lib/airline-0.5.jar:/opt/rb/var/rb-samza-bi/lib/akka-actor_2.10-2.1.2.jar:/opt/rb/var/rb-samza-bi/lib/antlr4-runtime-4.0.jar:/opt/rb/var/rb-samza-bi/lib/aopalliance-1.0.jar:/opt/rb/var/rb-samza-bi/lib/asm-3.1.jar:/opt/rb/var/rb-samza-bi/lib/avro-1.7.4.jar:/opt/rb/var/rb-samza-bi/lib/aws-java-sdk-1.8.11.jar:/opt/rb/var/rb-samza-bi/lib/aws-java-sdk-core-1.8.11.jar:/opt/rb/var/rb-samza-bi/lib/bcprov-jdk15on-1.52.jar:/opt/rb/var/rb-samza-bi/lib/bytebuffer-collections-0.1.5.jar:/opt/rb/var/rb-samza-bi/lib/c3p0-0.9.1.2.jar:/opt/rb/var/rb-samza-bi/lib/classmate-1.0.0.jar:/opt/rb/var/rb-samza-bi/lib/commons-beanutils-1.7.0.jar:/opt/rb/var/rb-samza-bi/lib/commons-beanutils-core-1.8.0.jar:/opt/rb/var/rb-samza-bi/lib/commons-cli-1.2.jar:/opt/rb/var/rb-samza-bi/lib/commons-codec-1.4.jar:/opt/rb/var/rb-samza-bi/lib/commons-collections-3.2.1.jar:/opt/rb/var/rb-samza-bi/lib/commons-compress-1.4.1.jar:/opt/rb/var/rb-samza-bi/lib/commons-configuration-1.6.jar:/opt/rb/var/rb-samza-bi/lib/commons-dbcp2-2.0.1.jar:/opt/rb/var/rb-samza-bi/lib/commons-digester-1.8.jar.
...
uid_monitor.beam.factory\":\"net.redborder.samza.indexing.tranquility.MonitorBeamFactory\",\"systems.druid_monitor.beam.batchSize\":\"2000\",\"task.inputs\":\"kafka.rb_flow_post,kafka.rb_state_post,kafka.rb_monitor,kafka.rb_social,kafka.rb_hashtag\",\"serializers.registry.metrics.class\":\"org.apache.samza.serializers.MetricsSnapshotSerdeFactory\",\"task.checkpoint.replication.factor\":\"1\",\"systems.druid_flow.beam.maxPendingBatches\":\"5\",\"systems.kafka.samza.msg.serde\":\"json\",\"systems.kafka.consumer.zookeeper.connect\":\"RBe2ecp4.e2e01.domain.eg:2181,RBe2ecp5.e2e01.domain.eg:2181,RBe2ecp1.e2e01.domain.eg:2181\",\"systems.druid_state.beam.indexGranularity\":\"300000\",\"systems.druid_social.beam.maxPendingBatches\":\"5\",\"redborder.kafka.rb_flow_post.partitions\":\"2\",\"systems.druid_social.beam.batchSize\":\"2000\",\"systems.druid_hashtag.beam.factory\":\"net.redborder.samza.indexing.tranquility.HashtagBeamFactory\",\"systems.druid_state.samza.factory\":\"com.metamx.tranquility.samza.BeamSystemFactory\"}, JAVA_OPTS -> ) for application_1444869365157_0002
2015-10-15 07:39:18 ClientHelper [INFO] set package url to scheme: "file" port: -1 file: "/opt/rb/var/rb-samza-bi/app/rb-samza-bi.tar.gz" for application_1444869365157_0002
2015-10-15 07:39:18 ClientHelper [INFO] set package size to 129772467 for application_1444869365157_0002
2015-10-15 07:39:18 ClientHelper [INFO] set memory request to 1024 for application_1444869365157_0002
2015-10-15 07:39:18 ClientHelper [INFO] set cpu core request to 1 for application_1444869365157_0002
2015-10-15 07:39:18 ClientHelper [INFO] set command to List(export SAMZA_LOG_DIR= && ln -sfn  logs && exec ./__package/bin/run-am.sh 1>logs/stdout 2>logs/stderr) for application_1444869365157_0002
2015-10-15 07:39:18 ClientHelper [INFO] set app ID to application_1444869365157_0002
2015-10-15 07:39:18 ClientHelper [INFO] submitting application request for application_1444869365157_0002
2015-10-15 07:39:18 YarnClientImpl [INFO] Submitted application application_1444869365157_0002
2015-10-15 07:39:18 JobRunner [INFO] waiting for job to start
2015-10-15 07:39:18 JobRunner [INFO] job started successfully - Running
2015-10-15 07:39:18 JobRunner [INFO] exiting

After reloading, check again that the topology is now loaded:

# rb_samza.sh  -l
15/10/15 07:39:22 INFO client.RMProxy: Connecting to ResourceManager at RBe2ecp4.e2e01.domain.eg/192.168.11.73:8032
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):2
                Application-Id      Application-Name        Application-Type          User       Queue               State         Final-State         Progress                        Tracking-URL
application_1444869365157_0001          enrichment_1                   Samza          root     default             RUNNING           UNDEFINED               0%      RBe2ecp2.e2e01.domain.eg:56811
application_1444869365157_0002            indexing_1                   Samza          root     default            ACCEPTED           UNDEFINED               0%                                 N/A
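
The listing above can be reduced to one line per application. This is a minimal sketch, assuming the column layout shown in this article, where the application name is the second whitespace-separated field and the state is the sixth. The function name samza_app_states is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with redBorder): read the output of
# `rb_samza.sh -l` on stdin and print "<Application-Name> <State>" for
# every row whose first field is an application ID.
samza_app_states() {
    awk '/^application_/ { print $2, $6 }'
}

# Usage on a manager node (left as a comment so the sketch has no side effects):
#   rb_samza.sh -l 2>/dev/null | samza_app_states
```

For the listing above this yields one line for enrichment_1 and one for indexing_1, which makes it easy to see at a glance which applications are listed and in what state.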

Events should now appear again.
