
Split after flush and split after compact

Published: 2012-11-06 14:07:00 Author: rapoo

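When does a split happen?
A region splits when the total size of all files in one of its stores exceeds a threshold. Note that the check is on the store's total size, not on the size of any single storefile.
When does a compact happen?
HBase checks whether any store in the region holds too many storefiles (more than hbase.hstore.blockingStoreFiles, which is 7 here). If so, a compact runs first.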

A flush iterates over all stores of the region and flushes them one by one.
A compact iterates over the region and compacts only the stores that meet the condition.
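
The two triggers described above can be sketched as follows. This is a minimal model rather than HBase code: the class, the method names, and the map of store name to storefile sizes are all hypothetical, and the limit of 7 is taken from the text.

```java
import java.util.List;
import java.util.Map;

final class StoreTriggerSketch {
    // Hypothetical stand-in for hbase.hstore.blockingStoreFiles (7 in the text).
    static final int BLOCKING_STORE_FILES = 7;

    /** Split trigger: compare the SUM of a store's file sizes with the threshold. */
    static boolean anyStoreTooBig(Map<String, List<Long>> storeFiles, long sizeToCheck) {
        for (List<Long> files : storeFiles.values()) {
            long total = 0;
            for (long size : files) {
                total += size;
            }
            if (total > sizeToCheck) {
                return true;
            }
        }
        return false;
    }

    /** Compact trigger: does any store hold more storefiles than the limit? */
    static boolean anyStoreHasTooManyFiles(Map<String, List<Long>> storeFiles) {
        for (List<Long> files : storeFiles.values()) {
            if (files.size() > BLOCKING_STORE_FILES) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Store name -> storefile sizes (arbitrary units, made up for illustration).
        Map<String, List<Long>> region = Map.of(
            "cf1", List.of(40L, 40L, 40L),
            "cf2", List.of(10L));
        System.out.println("split needed:   " + anyStoreTooBig(region, 100L));
        System.out.println("compact needed: " + anyStoreHasTooManyFiles(region));
    }
}
```

In this made-up region no single file exceeds 100, but the three files of cf1 add up to 120, so the split check fires while the compact check does not.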


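1. After a flush, check whether a split or a compact is needed
The split check works as follows. First compute tableRegionsCount, the number of online regions of this table on the regionserver.
Then loop over all stores of the region and check whether any of them is too large. The threshold is the size returned by getSizeToCheck; if a store's total size is greater than that value, the region needs to split.

getSizeToCheck first checks whether tableRegionsCount equals 0. If it does, it returns hbase.hregion.max.filesize. Otherwise it returns
Math.min(getDesiredMaxFileSize(), this.flushSize * (tableRegionsCount * tableRegionsCount)).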

      boolean shouldCompact = region.flushcache();
      // We just want to check the size
      boolean shouldSplit = region.checkSplit() != null;
      if (shouldSplit) {
        this.server.compactSplitThread.requestSplit(region);
      } else if (shouldCompact) {
        server.compactSplitThread.requestCompaction(region, getName());
      }



  private long flushSize;

  @Override
  protected void configureForRegion(HRegion region) {
    super.configureForRegion(region);
    this.flushSize = region.getTableDesc() != null?
      region.getTableDesc().getMemStoreFlushSize():
      getConf().getLong(HConstants.HREGION_MEMSTORE_FLUSH_SIZE,
        HTableDescriptor.DEFAULT_MEMSTORE_FLUSH_SIZE);
  }

  @Override
  protected boolean shouldSplit() {
    if (region.shouldForceSplit()) return true;
    boolean foundABigStore = false;
    // Get count of regions that have the same common table as this.region
    int tableRegionsCount = getCountOfCommonTableRegions();
    // Get size to check
    long sizeToCheck = getSizeToCheck(tableRegionsCount);
    for (Store store : region.getStores().values()) {
      // If any of the stores is unable to split (eg they contain reference files)
      // then don't split
      if ((!store.canSplit())) {
        return false;
      }
      // Mark if any store is big enough
      long size = store.getSize();
      if (size > sizeToCheck) {
        LOG.debug("ShouldSplit because " + store.getColumnFamilyName() +
          " size=" + size + ", sizeToCheck=" + sizeToCheck +
          ", regionsWithCommonTable=" + tableRegionsCount);
        foundABigStore = true;
        break;
      }
    }
    return foundABigStore;
  }

  /**
   * @return Region max size or <code>count of regions squared * flushsize</code>, which ever is
   * smaller; guard against there being zero regions on this server.
   */
  long getSizeToCheck(final int tableRegionsCount) {
    return tableRegionsCount == 0? getDesiredMaxFileSize():
      Math.min(getDesiredMaxFileSize(),
        this.flushSize * (tableRegionsCount * tableRegionsCount));
  }

  /**
   * @return Count of regions on this server that share the table this.region
   * belongs to
   */
  private int getCountOfCommonTableRegions() {
    RegionServerServices rss = this.region.getRegionServerServices();
    // Can be null in tests
    if (rss == null) return 0;
    byte [] tablename = this.region.getTableDesc().getName();
    int tableRegionsCount = 0;
    try {
      List<HRegion> hri = rss.getOnlineRegions(tablename);
      tableRegionsCount = hri == null || hri.isEmpty()? 0: hri.size();
    } catch (IOException e) {
      LOG.debug("Failed getOnlineRegions " + Bytes.toString(tablename), e);
    }
    return tableRegionsCount;
  }
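
To get a feel for how the threshold from getSizeToCheck grows with the number of regions, the arithmetic can be reproduced in isolation. The sketch below is standalone and not part of HBase. The 10240 MB maximum file size and the 128 MB flush size are assumed values chosen for illustration; a real cluster may be configured differently.

```java
final class SizeToCheckSketch {
    static final long MB = 1024L * 1024L;

    /**
     * Same arithmetic as getSizeToCheck in the snippet above; the long cast
     * is added here only to keep the squared count from overflowing an int.
     */
    static long sizeToCheck(int tableRegionsCount, long maxFileSize, long flushSize) {
        return tableRegionsCount == 0
            ? maxFileSize
            : Math.min(maxFileSize,
                flushSize * ((long) tableRegionsCount * tableRegionsCount));
    }

    public static void main(String[] args) {
        long maxFileSize = 10240 * MB; // assumed value for hbase.hregion.max.filesize
        long flushSize = 128 * MB;     // assumed memstore flush size
        for (int count = 0; count <= 10; count++) {
            System.out.println(count + " region(s) -> "
                + sizeToCheck(count, maxFileSize, flushSize) / MB + " MB");
        }
    }
}
```

Under these assumptions the threshold is 128 MB with one region, 512 MB with two, 1152 MB with three, and from nine regions on (9 * 9 * 128 MB = 10368 MB) it is capped at the 10240 MB maximum. A table with few regions on a server therefore splits early, and the threshold rises as regions accumulate.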


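2. Split after compact
In CompactionRequest.run, once a compact has completed, the code checks whether another compact is needed. The test is if (s.getCompactPriority() <= 0), i.e. whether 7 minus the store's current number of storefiles is <= 0, which means the store still has many files waiting to be compacted.
Otherwise a split is requested. CompactSplitThread.requestSplit tests if (shouldSplitRegion() && r.getCompactPriority() >= PRIORITY_USER). The first part compares hbase.regionserver.regionSplitLimit (a parameter that can cap the number of regions) with the number of regions currently online: a split is allowed only while the limit is greater than the online region count. The second part takes the minimum of (7 - storefile count) over all stores of the region and requires it to be >= PRIORITY_USER, read here as >= 1. The region is a split candidate only if both conditions hold.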

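One open question: shouldn't the file size be checked before splitting? Judging from the requestSplit code below, it appears to be: requestSplit calls r.checkSplit() and only proceeds when the returned midKey is not null, which is the same check used after a flush.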

        boolean completed = r.compact(this);
        long now = EnvironmentEdgeManager.currentTimeMillis();
        LOG.info(((completed) ? "completed" : "aborted") + " compaction: " +
              this + "; duration=" + StringUtils.formatTimeDiff(now, start));
        if (completed) {
          server.getMetrics().addCompaction(now - start, this.totalSize);
          // degenerate case: blocked regions require recursive enqueues
          if (s.getCompactPriority() <= 0) {
            server.compactSplitThread
              .requestCompaction(r, s, "Recursive enqueue");
          } else {
            // see if the compaction has caused us to exceed max region size
            server.compactSplitThread.requestSplit(r);
          }
        }


  public synchronized boolean requestSplit(final HRegion r) {
    // don't split regions that are blocking
    if (shouldSplitRegion() && r.getCompactPriority() >= PRIORITY_USER) {
      byte[] midKey = r.checkSplit();
      if (midKey != null) {
        requestSplit(r, midKey);
        return true;
      }
    }
    return false;
  }


  private boolean shouldSplitRegion() {
    return (regionSplitLimit > server.getNumberOfOnlineRegions());
  }

    this.regionSplitLimit = conf.getInt("hbase.regionserver.regionSplitLimit",
        Integer.MAX_VALUE);


  public int getCompactPriority() {
    int count = Integer.MAX_VALUE;
    for(Store store : stores.values()) {
      count = Math.min(count, store.getCompactPriority());
    }
    return count;
  }
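
The decision made after a compaction can be condensed into a small standalone model. The sketch below is not HBase code: the class, the enum, and the method names are hypothetical. It assumes a store's priority is 7 minus its storefile count and that PRIORITY_USER is 1, as the description above reads it. TRY_SPLIT only means requestSplit would go on to call checkSplit(); whether the region actually splits still depends on that check.

```java
import java.util.List;

final class PostCompactionSketch {
    static final int BLOCKING_STORE_FILES = 7; // assumed, per the text
    static final int PRIORITY_USER = 1;        // assumed value

    enum Next { RECOMPACT, TRY_SPLIT, NOTHING }

    /** Store priority as described in the text: 7 minus the storefile count. */
    static int storePriority(int storeFileCount) {
        return BLOCKING_STORE_FILES - storeFileCount;
    }

    /** Region priority: the minimum priority over all of its stores. */
    static int regionPriority(List<Integer> storeFileCounts) {
        int min = Integer.MAX_VALUE;
        for (int count : storeFileCounts) {
            min = Math.min(min, storePriority(count));
        }
        return min;
    }

    /** Models the branch taken once a compaction of one store has completed. */
    static Next afterCompaction(int compactedStoreFiles, List<Integer> allStoreFileCounts,
                                int regionSplitLimit, int onlineRegions) {
        if (storePriority(compactedStoreFiles) <= 0) {
            return Next.RECOMPACT; // store is still blocked: enqueue another compaction
        }
        boolean underLimit = regionSplitLimit > onlineRegions;
        if (underLimit && regionPriority(allStoreFileCounts) >= PRIORITY_USER) {
            return Next.TRY_SPLIT; // checkSplit() would be consulted next
        }
        return Next.NOTHING;
    }

    public static void main(String[] args) {
        int noLimit = Integer.MAX_VALUE;
        System.out.println(afterCompaction(8, List.of(8, 2), noLimit, 10));
        System.out.println(afterCompaction(2, List.of(2, 3), noLimit, 10));
        System.out.println(afterCompaction(2, List.of(2, 7), noLimit, 10));
        System.out.println(afterCompaction(2, List.of(2, 3), 10, 10));
    }
}
```

The four calls in main cover the cases discussed above: a store that is still blocked, a region eligible for the split check, a region held back by another store with 7 files, and a server that has already reached its region limit.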


