RBKB-2015-0002: Problems updating or reading S3 contents

ID: RBKB-2015-0002
Name: Problems updating or reading S3 contents
Date: 10/17/2015

Issue

On rare occasions, data stored in S3 may be inaccessible or may not be updated correctly. This affects several services, such as the transformation of real-time data into historical data and the retrieval of images for the dashboards, and it degrades the service.

Environment

  • redBorder 3.1.38-XX (all releases)

Main Cause

The root cause appears to lie in the S3 service itself, which in our case is Riak. To check its status, run the following:

# rb_riak_status.rb
================================= Membership ==================================
Status Ring Pending Node
-------------------------------------------------------------------------------
joining 0.0% 0.0% 'riak@192.168.105.104'
joining 0.0% 0.0% 'riak@192.168.105.105'
valid 93.8% 34.4% 'riak@192.168.105.101'
valid 4.7% 32.8% 'riak@192.168.105.103'
valid 1.6% 32.8% 'riak@192.168.105.106'
-------------------------------------------------------------------------------
Valid:3 / Leaving:0 / Exiting:0 / Joining:2 / Down:0

The output shows that Riak is in an anomalous state: two nodes have not completed their join, and the other three are valid but share the ring very unevenly (93.8%, 4.7% and 1.6%).
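
The same check can be automated. The following is a minimal sketch in Python that only parses text in the Membership format shown above; it does not run rb_riak_status.rb itself. The function names and the max_spread threshold are hypothetical, chosen for illustration, and are not part of redBorder or Riak.

```python
import re

# Matches rows such as: valid 93.8% 34.4% 'riak@192.168.105.101'
ROW = re.compile(r"^\s*(\w+)\s+([\d.]+)%\s+([\d.]+)%\s+'([^']+)'\s*$")

def parse_membership(text):
    """Return a list of (status, ring_pct, pending_pct, node) tuples."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((m.group(1), float(m.group(2)), float(m.group(3)), m.group(4)))
    return rows

def find_problems(rows, max_spread=10.0):
    """Report nodes that are not 'valid' and a lopsided ring distribution.

    max_spread is a hypothetical threshold in percentage points,
    not a Riak default.
    """
    problems = [f"{node} is {status}" for status, _, _, node in rows if status != "valid"]
    ring = [ring_pct for status, ring_pct, _, _ in rows if status == "valid"]
    if ring and max(ring) - min(ring) > max_spread:
        problems.append(f"valid nodes own between {min(ring)}% and {max(ring)}% of the ring")
    return problems

# Membership rows copied from the status output above.
SAMPLE = """
joining 0.0% 0.0% 'riak@192.168.105.104'
joining 0.0% 0.0% 'riak@192.168.105.105'
valid 93.8% 34.4% 'riak@192.168.105.101'
valid 4.7% 32.8% 'riak@192.168.105.103'
valid 1.6% 32.8% 'riak@192.168.105.106'
"""

for problem in find_problems(parse_membership(SAMPLE)):
    print(problem)
```

For the sample rows, the sketch reports the two joining nodes and the uneven ring ownership among the valid ones.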

Resolution

First, run riak-admin to check which changes are pending and how they will be applied. Use the "cluster plan" option for this check:

 

# riak-admin cluster plan
=============================== Staged Changes ================================
Action         Details(s)
-------------------------------------------------------------------------------
join           'riak@192.168.105.104'
join           'riak@192.168.105.105'
-------------------------------------------------------------------------------

NOTE: Applying these changes will result in 2 cluster transitions

###############################################################################
                         After cluster transition 1/2
###############################################################################

================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid      34.4%     34.4%    'riak@192.168.105.101'
valid      32.8%     32.8%    'riak@192.168.105.103'
valid       0.0%      0.0%    'riak@192.168.105.104'
valid       0.0%      0.0%    'riak@192.168.105.105'
valid      32.8%     32.8%    'riak@192.168.105.106'
-------------------------------------------------------------------------------
Valid:5 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

WARNING: Not all replicas will be on distinct nodes

Partitions reassigned from cluster changes: 38
  18 reassigned from 'riak@192.168.105.101' to 'riak@192.168.105.103'
  20 reassigned from 'riak@192.168.105.101' to 'riak@192.168.105.106'

Transfers resulting from cluster changes: 38
  18 transfers from 'riak@192.168.105.101' to 'riak@192.168.105.103'
  20 transfers from 'riak@192.168.105.101' to 'riak@192.168.105.106'

###############################################################################
                         After cluster transition 2/2
###############################################################################

================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid      34.4%     20.3%    'riak@192.168.105.101'
valid      32.8%     20.3%    'riak@192.168.105.103'
valid       0.0%     20.3%    'riak@192.168.105.104'
valid       0.0%     20.3%    'riak@192.168.105.105'
valid      32.8%     18.8%    'riak@192.168.105.106'
-------------------------------------------------------------------------------
Valid:5 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

Transfers resulting from cluster changes: 50
  4 transfers from 'riak@192.168.105.103' to 'riak@192.168.105.101'
  4 transfers from 'riak@192.168.105.103' to 'riak@192.168.105.104'
  4 transfers from 'riak@192.168.105.106' to 'riak@192.168.105.101'
  4 transfers from 'riak@192.168.105.101' to 'riak@192.168.105.103'
  5 transfers from 'riak@192.168.105.106' to 'riak@192.168.105.104'
  4 transfers from 'riak@192.168.105.101' to 'riak@192.168.105.106'
  4 transfers from 'riak@192.168.105.103' to 'riak@192.168.105.106'
  4 transfers from 'riak@192.168.105.106' to 'riak@192.168.105.103'
  5 transfers from 'riak@192.168.105.101' to 'riak@192.168.105.105'
  4 transfers from 'riak@192.168.105.103' to 'riak@192.168.105.105'
  4 transfers from 'riak@192.168.105.101' to 'riak@192.168.105.104'
  4 transfers from 'riak@192.168.105.106' to 'riak@192.168.105.105'
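
To see what the plan means for each node, the transfer lines can be summed into a net gain or loss of partitions. The following Python sketch assumes the "N transfers from 'A' to 'B'" line format shown above; the function name is hypothetical and the code only reads text, it does not call riak-admin.

```python
import re
from collections import Counter

# Matches lines such as: 4 transfers from 'riak@...103' to 'riak@...101'
TRANSFER = re.compile(r"^\s*(\d+) transfers from '([^']+)' to '([^']+)'\s*$")

def net_transfers(text):
    """Return {node: partitions received minus partitions given away}."""
    net = Counter()
    for line in text.splitlines():
        m = TRANSFER.match(line)
        if m:
            count, src, dst = int(m.group(1)), m.group(2), m.group(3)
            net[src] -= count
            net[dst] += count
    return dict(net)

# Transfer lines of cluster transition 2/2, copied from the plan above.
PLAN_EXCERPT = """
  4 transfers from 'riak@192.168.105.103' to 'riak@192.168.105.101'
  4 transfers from 'riak@192.168.105.103' to 'riak@192.168.105.104'
  4 transfers from 'riak@192.168.105.106' to 'riak@192.168.105.101'
  4 transfers from 'riak@192.168.105.101' to 'riak@192.168.105.103'
  5 transfers from 'riak@192.168.105.106' to 'riak@192.168.105.104'
  4 transfers from 'riak@192.168.105.101' to 'riak@192.168.105.106'
  4 transfers from 'riak@192.168.105.103' to 'riak@192.168.105.106'
  4 transfers from 'riak@192.168.105.106' to 'riak@192.168.105.103'
  5 transfers from 'riak@192.168.105.101' to 'riak@192.168.105.105'
  4 transfers from 'riak@192.168.105.103' to 'riak@192.168.105.105'
  4 transfers from 'riak@192.168.105.101' to 'riak@192.168.105.104'
  4 transfers from 'riak@192.168.105.106' to 'riak@192.168.105.105'
"""

for node, delta in sorted(net_transfers(PLAN_EXCERPT).items()):
    print(f"{node}: {delta:+d}")
```

For this transition, the two joining nodes (.104 and .105) each gain 13 partitions, while .101, .103 and .106 give up 9, 8 and 9 respectively. A gain of 13 would agree with the 20.3% Pending value if the ring has 64 partitions; that ring size is inferred from the percentages and is not stated in the output.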
 

Next, commit the currently staged cluster changes:

# riak-admin cluster commit
Cluster changes committed
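
The plan and commit steps can be wrapped so that a commit never happens without someone reviewing the plan first. This is only a sketch: review_and_commit, run_riak_admin and the confirm callback are hypothetical names, and it assumes that a plan with staged changes contains the "Staged Changes" header seen in the output above.

```python
import subprocess

def run_riak_admin(*args):
    """Run riak-admin with the given arguments and return its standard output."""
    result = subprocess.run(
        ["riak-admin", *args], capture_output=True, text=True, check=True
    )
    return result.stdout

def review_and_commit(confirm, run=run_riak_admin):
    """Pass the staged plan to `confirm`; commit only if it returns True.

    Assumption: a plan with staged changes contains the 'Staged Changes'
    header, as in the sample output in this article.
    """
    plan = run("cluster", "plan")
    if "Staged Changes" not in plan:
        return "nothing staged"
    if not confirm(plan):
        return "aborted"
    return run("cluster", "commit").strip()

def ask_operator(plan):
    """Print the plan and ask for an explicit 'yes' on the terminal."""
    print(plan)
    return input("Commit these changes? [yes/no] ").strip().lower() == "yes"

# On a Riak node: print(review_and_commit(ask_operator))
```

The run parameter exists so the workflow can be exercised with a stand-in function on a machine where riak-admin is not installed.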
 

After that, check Riak's status again:

# rb_riak_status.rb
================================= Membership ==================================
Status Ring Pending Node
-------------------------------------------------------------------------------
valid 34.4% 20.3% 'riak@192.168.105.101'
valid 32.8% 20.3% 'riak@192.168.105.103'
valid 0.0% 20.3% 'riak@192.168.105.104'
valid 0.0% 20.3% 'riak@192.168.105.105'
valid 32.8% 18.8% 'riak@192.168.105.106'
-------------------------------------------------------------------------------
Valid:5 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

================================== Claimant ===================================
Claimant: 'riak@192.168.105.101'
Status: up
Ring Ready: true

============================== Ownership Handoff ==============================
Owner: riak@192.168.105.101
Next Owner: riak@192.168.105.103

Index: 365375409332725729550921208179070754913983135744
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 822094670998632891489572718402909198556462055424
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 1164634117248063262943561351070788031288321245184
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 1278813932664540053428224228626747642198940975104
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

-------------------------------------------------------------------------------
Owner: riak@192.168.105.101
Next Owner: riak@192.168.105.104

Index: 274031556999544297163190906134303066185487351808
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 730750818665451459101842416358141509827966271488
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 1073290264914881830555831049026020342559825461248
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 1187470080331358621040493926581979953470445191168
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

-------------------------------------------------------------------------------
Owner: riak@192.168.105.101
Next Owner: riak@192.168.105.105

Index: 182687704666362864775460604089535377456991567872
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

Index: 639406966332270026714112114313373821099470487552
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 981946412581700398168100746981252653831329677312
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 1096126227998177188652763624537212264741949407232
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 1438665674247607560106752257205091097473808596992
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

-------------------------------------------------------------------------------
Owner: riak@192.168.105.101
Next Owner: riak@192.168.105.106

Index: 91343852333181432387730302044767688728495783936
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

Index: 548063113999088594326381812268606132370974703616
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 1004782375664995756265033322492444576013453623296
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 1347321821914426127719021955160323408745312813056
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

-------------------------------------------------------------------------------
Owner: riak@192.168.105.103
Next Owner: riak@192.168.105.101

Index: 114179815416476790484662877555959610910619729920
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

Index: 228359630832953580969325755111919221821239459840
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

Index: 570899077082383952423314387779798054553098649600
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

Index: 1027618338748291114361965898003636498195577569280
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

-------------------------------------------------------------------------------
Owner: riak@192.168.105.103
Next Owner: riak@192.168.105.104

Index: 45671926166590716193865151022383844364247891968
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

Index: 388211372416021087647853783690262677096107081728
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

Index: 844930634081928249586505293914101120738586001408
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

Index: 1301649895747835411525156804137939564381064921088
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

-------------------------------------------------------------------------------
Owner: riak@192.168.105.103
Next Owner: riak@192.168.105.105

Index: 296867520082839655260123481645494988367611297792
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 411047335499316445744786359201454599278231027712
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

Index: 753586781748746817198774991869333432010090217472
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

Index: 1210306043414653979137426502093171875652569137152
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

-------------------------------------------------------------------------------
Owner: riak@192.168.105.103
Next Owner: riak@192.168.105.106

Index: 205523667749658222872393179600727299639115513856
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

Index: 319703483166135013357056057156686910549735243776
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 662242929415565384811044689824565743281594433536
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

Index: 1118962191081472546749696200048404186924073353216
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

-------------------------------------------------------------------------------
Owner: riak@192.168.105.106
Next Owner: riak@192.168.105.101

Index: 342539446249430371453988632667878832731859189760
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 685078892498860742907977265335757665463718379520
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 799258707915337533392640142891717276374338109440
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 1141798154164767904846628775559596109106197299200
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

-------------------------------------------------------------------------------
Owner: riak@192.168.105.106
Next Owner: riak@192.168.105.103

Index: 251195593916248939066258330623111144003363405824
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 593735040165679310520246963290989976735222595584
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 707914855582156101004909840846949587645842325504
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 1050454301831586472458898473514828420377701515264
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

-------------------------------------------------------------------------------
Owner: riak@192.168.105.106
Next Owner: riak@192.168.105.104

Index: 159851741583067506678528028578343455274867621888
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

Index: 502391187832497878132516661246222288006726811648
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 616571003248974668617179538802181898917346541568
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 959110449498405040071168171470060731649205731328
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 1415829711164312202009819681693899175291684651008
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

-------------------------------------------------------------------------------
Owner: riak@192.168.105.106
Next Owner: riak@192.168.105.105

Index: 68507889249886074290797726533575766546371837952
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

Index: 525227150915793236229449236757414210188850757632
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 867766597165223607683437869425293042920709947392
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 1324485858831130769622089379649131486563188867072
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

-------------------------------------------------------------------------------

============================== Unreachable Nodes ==============================
All nodes are up and reachable 
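
The Ownership Handoff section is long, so a per-pair summary can be easier to read. The following Python sketch counts the partitions still waiting for handoff for each owner and next-owner pair. It assumes the "Owner:", "Next Owner:" and "Waiting on:" layout shown above, and the function name is hypothetical.

```python
from collections import Counter

def pending_handoffs(text):
    """Count partitions still waiting for handoff, keyed by (owner, next_owner)."""
    counts = Counter()
    owner = next_owner = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Owner:"):
            owner = line.split(":", 1)[1].strip()
        elif line.startswith("Next Owner:"):
            next_owner = line.split(":", 1)[1].strip()
        elif line.startswith("Waiting on:"):
            # Each index block has one "Waiting on:" line while it is pending.
            counts[(owner, next_owner)] += 1
    return counts

# Abridged excerpt of the Ownership Handoff section above.
HANDOFF_EXCERPT = """
Owner: riak@192.168.105.101
Next Owner: riak@192.168.105.103

Index: 365375409332725729550921208179070754913983135744
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

Index: 822094670998632891489572718402909198556462055424
Waiting on: [riak_kv_vnode,riak_pipe_vnode]

-------------------------------------------------------------------------------
Owner: riak@192.168.105.101
Next Owner: riak@192.168.105.105

Index: 182687704666362864775460604089535377456991567872
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]
"""

for (owner, nxt), n in pending_handoffs(HANDOFF_EXCERPT).items():
    print(f"{owner} -> {nxt}: {n} pending")
```

In the full output above, the index blocks add up to 50 pending handoffs, which matches the 50 transfers announced for cluster transition 2/2 in the plan.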

Once all nodes appear to be OK, check the progress of the partition transfers:

# riak-admin transfers
'riak@192.168.105.106' waiting to handoff 19 partitions
'riak@192.168.105.105' waiting to handoff 32 partitions
'riak@192.168.105.104' waiting to handoff 32 partitions
'riak@192.168.105.103' waiting to handoff 21 partitions
'riak@192.168.105.101' waiting to handoff 31 partitions

Active Transfers:

transfer type: ownership_transfer
vnode type: riak_kv_vnode
partition: 274031556999544297163190906134303066185487351808
started: 2015-10-13 07:41:39 [1.76 min ago]
last update: 2015-10-13 07:43:24 [804.64 ms ago]
total size: unknown
objects transferred: 247234

                         2364 Objs/s
  riak@192.168.105.101     =======>    riak@192.168.105.104
        |===========================================| N/A%
                          11.00 MB/s

transfer type: ownership_transfer
vnode type: riak_kv_vnode
partition: 548063113999088594326381812268606132370974703616
started: 2015-10-13 07:43:19 [5.49 s ago]
last update: 2015-10-13 07:43:23 [919.67 ms ago]
total size: unknown
objects transferred: 23220

                         5076 Objs/s
  riak@192.168.105.101     =======>    riak@192.168.105.106
        |===========================================| N/A%
                          2.21 MB/s

transfer type: ownership_transfer
vnode type: riak_kv_vnode
partition: 753586781748746817198774991869333432010090217472
started: 2015-10-13 07:40:03 [3.36 min ago]
last update: 2015-10-13 07:43:24 [700.04 ms ago]
total size: unknown
objects transferred: 247390

                         1230 Objs/s
  riak@192.168.105.103     =======>    riak@192.168.105.105
        |===========================================| N/A%
                          5.47 MB/s

transfer type: ownership_transfer
vnode type: riak_kv_vnode
partition: 411047335499316445744786359201454599278231027712
started: 2015-10-13 07:41:33 [1.85 min ago]
last update: 2015-10-13 07:43:24 [607.71 ms ago]
total size: unknown
objects transferred: 245496

                         2224 Objs/s
  riak@192.168.105.103     =======>    riak@192.168.105.105
        |===========================================| N/A%
                          1.61 MB/s

transfer type: ownership_transfer
vnode type: riak_kv_vnode
partition: 1415829711164312202009819681693899175291684651008
started: 2015-10-13 07:42:11 [1.23 min ago]
last update: 2015-10-13 07:43:24 [761.57 ms ago]
total size: unknown
objects transferred: 245323

                         3366 Objs/s
  riak@192.168.105.106     =======>    riak@192.168.105.104
        |===========================================| N/A%
                          1.29 MB/s
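
To follow the progress over time, the "waiting to handoff" lines can be reduced to one number per node. The following Python sketch parses text in the format shown above; the function name is hypothetical, and the code does not run riak-admin itself.

```python
import re

# Matches lines such as: 'riak@192.168.105.106' waiting to handoff 19 partitions
WAITING = re.compile(r"^'([^']+)' waiting to handoff (\d+) partitions?$")

def partitions_waiting(text):
    """Return {node: number of partitions it is still waiting to hand off}."""
    waiting = {}
    for raw in text.splitlines():
        m = WAITING.match(raw.strip())
        if m:
            waiting[m.group(1)] = int(m.group(2))
    return waiting

# Summary lines copied from the riak-admin transfers output above.
TRANSFERS_SAMPLE = """
'riak@192.168.105.106' waiting to handoff 19 partitions
'riak@192.168.105.105' waiting to handoff 32 partitions
'riak@192.168.105.104' waiting to handoff 32 partitions
'riak@192.168.105.103' waiting to handoff 21 partitions
'riak@192.168.105.101' waiting to handoff 31 partitions
"""

waiting = partitions_waiting(TRANSFERS_SAMPLE)
print(sum(waiting.values()), "partitions still waiting in total")
```

Running the command again later and comparing the totals shows whether the handoffs are advancing. An empty result would mean that no node reported partitions waiting in the parsed text.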

Once the transfers complete, the Riak service (S3) should be available again and ready to serve and store data.

 
