Dashu's Troubleshooting Notes (49): HBase master fails to initialize after a cluster restart

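After the HBase cluster was restarted it stopped working normally. The cause turned out to be a failed master initialization, and the master startup log showed the following: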

2022-05-26 14:06:15,645 WARN org.apache.hadoop.hbase.master.HMaster: hbase:namespace,,1607716627354.56dafb9f3eadaae9e95d5b05f3142a34. is NOT online; state={56dafb9f3eadaae9e95d5b05f3142a34 state=OPEN, ts=1637906648217, server=hadoop-server1,16020,1637905629938}; ServerCrashProcedures=false. Master startup cannot progress, in holding-pattern until region onlined.

The problem is that the region of hbase:namespace, a table that is critical to HBase, cannot come online. The region's row in hbase:meta looks like this:

hbase(main):003:0> get 'hbase:meta', 'hbase:namespace,,1607716627354.56dafb9f3eadaae9e95d5b05f3142a34.'
COLUMN                                                 CELL
 info:regioninfo                                       timestamp=1637905641937, value={ENCODED => 56dafb9f3eadaae9e95d5b05f3142a34, NAME => 'hbase:namespace,,1607716627354.56dafb9f3eadaae9e95d5b05f3142a34.', STARTKEY => '', ENDKEY => ''}
 info:seqnumDuringOpen                                 timestamp=1637905641937, value=\x00\x00\x00\x00\x00\x00\x00o
 info:server                                           timestamp=1637905641937, value=hadoop-server1:16020
 info:serverstartcode                                  timestamp=1637905641937, value=1637905629938
 info:sn                                               timestamp=1637905640873, value=hadoop-server1,16020,1637905629938
 info:state                                            timestamp=1637905641937, value=OPEN
1 row(s)
Took 0.0737 seconds
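One detail worth checking in this output is the start code. In an HBase server name such as hadoop-server1,16020,1637905629938, the last field is the region server's start time in epoch milliseconds. The helper below is a minimal sketch for converting it to a readable date; the function name startcode_to_utc is made up for illustration, and it assumes GNU date (for the -d option). If the result is much older than the restart seen in the master log, that suggests, though it does not prove, that hbase:meta still points at a region server instance from before the restart.

```shell
# Hypothetical helper: convert an HBase start code (epoch milliseconds)
# into a UTC timestamp. Assumes GNU date, which supports -d "@<seconds>".
startcode_to_utc() {
  date -u -d "@$(( $1 / 1000 ))" '+%Y-%m-%d %H:%M:%S'
}

# Start code taken from info:serverstartcode above.
startcode_to_utc 1637905629938
```

Here the start code corresponds to late November 2021, while the master log line is dated 2022-05-26.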

Try to bring the region online manually:

hbase(main):034:0> assign '56dafb9f3eadaae9e95d5b05f3142a34'
ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
    at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2998)
    at org.apache.hadoop.hbase.master.MasterRpcServices.assignRegion(MasterRpcServices.java:564)
    at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
For usage try 'help "assign"'

This is a deadlock. Assigning the region requires the master to finish initializing, but the master cannot finish initializing until the region is online.

There are two ways to break the cycle.

  • Back up the namespace directory /user/hbase/data/hbase/namespace, delete it, and then restore hbase:namespace from hbase:meta.

  • Use the HBCK2 tool.

The second approach was used here. Run the following command:

hbase hbck -j hbase-hbck2-1.1.0.jar assigns 56dafb9f3eadaae9e95d5b05f3142a34
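The argument passed to assigns is the encoded region name, that is, the hash between the last two dots of the full region name shown in the master log and in hbase:meta. As a small sketch, the hypothetical helper below (encoded_region_name is not part of HBase or HBCK2) extracts it from the full name and prints the command rather than running it, so the result can be reviewed first. It assumes bash and that the jar file name matches the one used above.

```shell
# Hypothetical helper: extract the encoded region name from a full
# region name of the form '<table>,<startkey>,<timestamp>.<encoded>.'
encoded_region_name() {
  local name="${1%.}"           # drop the trailing dot
  printf '%s\n' "${name##*.}"   # keep what follows the last remaining dot
}

region='hbase:namespace,,1607716627354.56dafb9f3eadaae9e95d5b05f3142a34.'

# Print the HBCK2 command for review instead of executing it.
echo "hbase hbck -j hbase-hbck2-1.1.0.jar assigns $(encoded_region_name "$region")"
```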

After the command completes, hbase:namespace comes back online and the master starts successfully.

Original: https://www.cnblogs.com/barneywill/p/16380977.html
Author: 匠人先生
Title: 大叔问题定位分享(49)hbase集群重启后master初始化失败
