Uncle's Troubleshooting Notes (50): An HBase Region Stuck in RIT State (Without Timing Out)

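In the Regions In Transition section of the HMaster web UI, one region stayed in transition indefinitely. It never timed out; instead the assignment was retried continuously, about 4 to 5 times per second. The region was: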

NS1:TB1,4120J5402AAD3N76TRTffUlocation1618464157000,1637905603483.47f541c30ccfd046c5366274fdf56e7d.

The master logged the following error:

2022-05-26 17:58:18,934 WARN org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: IOException during HDFSBlocksDistribution computation. for region = 47f541c30ccfd046c5366274fdf56e7d
java.io.FileNotFoundException: File does not exist: hdfs://nameservice1/user/hbase/data/NS1/TB1/e4da96749cbc1d574a78365b77590a25/cf/0b4c33a2ff4440ecb5b67005d33dfd12
    at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1499)
    at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1492)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1507)
    at org.apache.hadoop.hbase.regionserver.StoreFileInfo.getReferencedFileStatus(StoreFileInfo.java:352)
    at org.apache.hadoop.hbase.regionserver.StoreFileInfo.computeHDFSBlocksDistributionInternal(StoreFileInfo.java:321)
    at org.apache.hadoop.hbase.regionserver.StoreFileInfo.computeHDFSBlocksDistribution(StoreFileInfo.java:315)
    at org.apache.hadoop.hbase.regionserver.HRegion.computeHDFSBlocksDistribution(HRegion.java:1221)
    at org.apache.hadoop.hbase.regionserver.HRegion.computeHDFSBlocksDistribution(HRegion.java:1189)
    at org.apache.hadoop.hbase.master.balancer.RegionLocationFinder.internalGetTopBlockLocation(RegionLocationFinder.java:198)
    at org.apache.hadoop.hbase.master.balancer.RegionLocationFinder$1$1.call(RegionLocationFinder.java:81)
    at org.apache.hadoop.hbase.master.balancer.RegionLocationFinder$1$1.call(RegionLocationFinder.java:78)
    at org.apache.hbase.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)
    at org.apache.hbase.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)
    at org.apache.hbase.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
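
Because the assignment is retried several times per second, the same warning repeats many times in the log. A minimal sketch for collecting the distinct paths reported as missing is shown here. The function name `missing_hfiles` is made up for illustration, and it assumes the log text is supplied on standard input.

```shell
#!/usr/bin/env bash
# Print each distinct path that appears after "File does not exist: "
# in the log text read from stdin.
missing_hfiles() {
  grep -o 'File does not exist: [^ ]*' \
    | sed 's/^File does not exist: //' \
    | LC_ALL=C sort -u
}
```

It could be fed the master log, for example `missing_hfiles < hbase-master.log`, where the log file name is only a placeholder for wherever the master log lives on your cluster.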

Checking HDFS confirms that the target file does not exist:

hdfs dfs -ls /user/hbase/data/NS1/TB1/e4da96749cbc1d574a78365b77590a25/cf/
-rw-r--r-- 3 hbase hbase 8686520819 2021-11-26 17:57 /user/hbase/data/NS1/TB1/e4da96749cbc1d574a78365b77590a25/cf/187f2a9fd2354e71a5fb916a6a7d40f8

As a first attempt, recreate the missing file as an empty file:

hdfs dfs -touch /user/hbase/data/NS1/TB1/e4da96749cbc1d574a78365b77590a25/cf/0b4c33a2ff4440ecb5b67005d33dfd12

This fails with a different error: a zero-length file is not a valid HFile, so the region server reports a corrupt-file error when it tries to read the HFile trailer.

2022-05-26 22:10:49,499 WARN org.apache.hadoop.hbase.regionserver.HRegion: Failed initialize of region= NS1:TB1,4120J5402AAD3N76TRTffUlocation1618464157000,1637905603483.47f541c30ccfd046c5366274fdf56e7d., starting to roll back memstore
java.io.IOException: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://nameservice1/user/hbase/data/NS1/TB1/47f541c30ccfd046c5366274fdf56e7d/cf/0b4c33a2ff4440ecb5b67005d33dfd12.e4da96749cbc1d574a78365b77590a25
    at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1079)
    at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:940)
    at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:896)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7221)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7180)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7152)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7110)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7061)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:283)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://nameservice1/user/hbase/data/NS1/TB1/47f541c30ccfd046c5366274fdf56e7d/cf/0b4c33a2ff4440ecb5b67005d33dfd12.e4da96749cbc1d574a78365b77590a25
    at org.apache.hadoop.hbase.regionserver.HStore.openStoreFiles(HStore.java:590)
    at org.apache.hadoop.hbase.regionserver.HStore.loadStoreFiles(HStore.java:557)
    at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:303)
    at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:5708)
    at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1043)
    at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1040)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    ... 3 more
Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://nameservice1/user/hbase/data/NS1/TB1/47f541c30ccfd046c5366274fdf56e7d/cf/0b4c33a2ff4440ecb5b67005d33dfd12.e4da96749cbc1d574a78365b77590a25
    at org.apache.hadoop.hbase.io.hfile.HFile.openReader(HFile.java:545)
    at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:579)
    at org.apache.hadoop.hbase.regionserver.StoreFileReader.<init>(StoreFileReader.java:108)
    at org.apache.hadoop.hbase.io.HalfStoreFileReader.<init>(HalfStoreFileReader.java:108)
    at org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:282)
    at org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:368)
    at org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:476)
    at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:703)
    at org.apache.hadoop.hbase.regionserver.HStore.lambda$openStoreFiles$1(HStore.java:573)
    ... 6 more
Caused by: java.lang.IllegalArgumentException
    at java.nio.Buffer.position(Buffer.java:244)
    at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:405)
    at org.apache.hadoop.hbase.io.hfile.HFile.openReader(HFile.java:532)
    ... 14 more
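
The file named in this trace, `0b4c33a2ff4440ecb5b67005d33dfd12.e4da96749cbc1d574a78365b77590a25`, follows HBase's reference-file naming of `<hfile name>.<parent region>`. Such files are typically left in a daughter region after a split and point at the HFile of that name in the parent region's directory, which is why opening region 47f541c30ccfd046c5366274fdf56e7d needs a file under e4da96749cbc1d574a78365b77590a25. The mapping can be sketched in shell as follows; the function name `resolve_reference` is hypothetical and the code only manipulates path strings, it does not touch HDFS.

```shell
#!/usr/bin/env bash
# Map a reference file path to the parent-region HFile it points at.
# Input : <table dir>/<daughter region>/<family>/<hfile>.<parent region>
# Output: <table dir>/<parent region>/<family>/<hfile>
resolve_reference() {
  local ref_path="$1"
  local name="${ref_path##*/}"         # <hfile>.<parent region>
  local family_dir="${ref_path%/*}"    # .../<daughter region>/<family>
  local family="${family_dir##*/}"     # column family directory name
  local region_dir="${family_dir%/*}"  # .../<daughter region>
  local table_dir="${region_dir%/*}"   # table directory
  local hfile="${name%%.*}"            # part before the first dot
  local parent="${name#*.}"            # part after the first dot
  if [ "$name" = "$hfile" ]; then
    # No dot in the file name, so this is a plain HFile, not a reference.
    echo "not a reference file: $ref_path" >&2
    return 1
  fi
  printf '%s/%s/%s/%s\n' "$table_dir" "$parent" "$family" "$hfile"
}
```

Applied to the path in the trace above, this yields the same path that the master reported as missing.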

Next, I looked at the HBase source code and tried writing a valid but empty HFile:

import junit.framework.TestCase
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.fs.HFileSystem
import org.apache.hadoop.hbase.io.hfile.{CacheConfig, HFile, HFileContextBuilder}

// Writes an HFile that contains no cells but still has a valid trailer.
class HFileGenerator extends TestCase {
  val conf = HBaseConfiguration.create()
  val fs = HFileSystem.get(conf)

  def testGenerate(): Unit = {
    val cacheConf = new CacheConfig(conf)
    // Output file: /tmp/test on the file system configured in conf
    val f = new Path("/tmp", "test")
    val context = new HFileContextBuilder().withIncludesTags(false).build()
    val w = HFile.getWriterFactory(conf, cacheConf).withPath(fs, f).withFileContext(context).create()
    // Closing the writer without appending any cells still writes the trailer
    w.close()
  }
}

Write the empty HFile to the missing path:

/user/hbase/data/NS1/TB1/e4da96749cbc1d574a78365b77590a25/cf/0b4c33a2ff4440ecb5b67005d33dfd12

A new error is then reported, this time for a different missing file:

java.io.IOException: java.io.IOException: java.io.FileNotFoundException: File does not exist: /user/hbase/data/NS1/TB1/e4da96749cbc1d574a78365b77590a25/cf/1d6a7331097b40268c214d3b1260cb68

Repeating the same process for each missing file that is reported eventually lets the region initialize successfully, and the RIT state clears.
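
The repetition can be sketched as a dry run that only prints the copy commands, one per missing path, so they can be reviewed before anything is run. The function name `plan_placeholders` is made up for illustration, and the sketch assumes the empty HFile generated earlier (`/tmp/test`) is already on HDFS so that `hdfs dfs -cp` applies. Note that a placeholder only unblocks the region; whatever the missing HFile held is not recovered.

```shell
#!/usr/bin/env bash
# Dry run: for every missing path read from stdin, print the command that
# would copy a placeholder HFile there. Nothing is executed against HDFS.
plan_placeholders() {
  local template="$1"   # an already generated empty HFile (assumed to be on HDFS)
  local path
  while IFS= read -r path; do
    [ -n "$path" ] || continue   # skip blank lines
    printf 'hdfs dfs -cp %s %s\n' "$template" "$path"
  done
}
```

For example, `echo /user/hbase/data/NS1/TB1/e4da96749cbc1d574a78365b77590a25/cf/1d6a7331097b40268c214d3b1260cb68 | plan_placeholders /tmp/test` prints the single copy command for the second missing file reported above.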

Original: https://www.cnblogs.com/barneywill/p/16381726.html
Author: 匠人先生
Title: Uncle's Troubleshooting Notes (50): An HBase Region Stuck in RIT State (Without Timing Out)
