1. http://www.slideshare.net/cloudera/hbase-hug-presentation
2. http://hbase.apache.org/book.html#hbase.hregion.memstore.mslab.enabled
hbase.hregion.memstore.mslab.enabled
Enables the MemStore-Local Allocation Buffer, a feature which works to prevent heap fragmentation under heavy write loads. This can reduce the frequency of stop-the-world GC pauses on large heaps.
Default: true
3. http://hbase.apache.org/book/upgrade0.92.html
3.4.2. MSLAB is ON by default
In 0.92.0, the hbase.hregion.memstore.mslab.enabled flag is set to true (see Section 11.3.1.1, “Long GC pauses”). In 0.90.x it was false. When it is enabled, memstores allocate memory in 2MB MSLAB chunks even if the memstore has zero or just a few small elements. This is usually fine, but if you had lots of regions per regionserver in a 0.90.x cluster (and MSLAB was off), you may find yourself OOME'ing on upgrade because thousands of regions * number of column families * 2MB MSLAB (at a minimum) puts your heap over the top. Set hbase.hregion.memstore.mslab.enabled to false, or set the MSLAB size down from 2MB by setting hbase.hregion.memstore.mslab.chunksize to something less.
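The "regions * column families * 2MB" rule of thumb above can be turned into a quick estimate. The following is only a rough sketch: the function name and the cluster numbers (2000 regions, 3 column families) are made up for illustration, and the result is a lower bound, since a busy memstore can hold more than one chunk.

```python
MB = 1024 * 1024


def min_mslab_heap_bytes(regions, column_families, chunk_size=2 * MB):
    """Lower bound on heap held by MSLAB on one regionserver.

    Assumes one MemStore per (region, column family) pair and at least
    one chunk per MemStore, as described in the upgrade note above.
    """
    return regions * column_families * chunk_size


if __name__ == "__main__":
    # Hypothetical regionserver: 2000 regions, 3 column families each.
    default_chunks = min_mslab_heap_bytes(2000, 3)
    smaller_chunks = min_mslab_heap_bytes(2000, 3, chunk_size=512 * 1024)
    print(f"2MB chunks:   {default_chunks / MB:.0f} MB")
    print(f"512KB chunks: {smaller_chunks / MB:.0f} MB")
```

With these assumed numbers the minimum is about 12000 MB with the default chunk size and 3000 MB with 512KB chunks, which shows why lowering hbase.hregion.memstore.mslab.chunksize or reducing the region count helps.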
4. http://hbase.apache.org/book/jvm.html#mslab
11.3.1. The Garbage Collector and Apache HBase
In his presentation, Avoiding Full GCs with MemStore-Local Allocation Buffers, Todd Lipcon describes two cases of stop-the-world garbage collections common in HBase, especially during loading: CMS failure modes and old generation heap fragmentation. To address the first, start the CMS earlier than the default by adding -XX:CMSInitiatingOccupancyFraction and setting it down from the default. Start at 60 or 70 percent (the lower you bring the threshold, the more GCing is done and the more CPU is used). To address the second issue, fragmentation, Todd added an experimental facility, MSLAB, that must be explicitly enabled in Apache HBase 0.90.x (it is on by default in Apache HBase 0.92.x). To enable it, set hbase.hregion.memstore.mslab.enabled to true in your Configuration. See the cited slides for background and detail[25]. Be aware that when enabled, each MemStore instance will occupy at least one MSLAB chunk of memory. If you have thousands of regions, or lots of regions each with many column families, this allocation of MSLAB may be responsible for a good portion of your heap allocation and in an extreme case cause you to OOME. Disable MSLAB in this case, lower the amount of memory it uses, or float fewer regions per server.
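As a rough illustration of the CMS advice above, the sketch below builds the JVM option string one might append to HBASE_OPTS in hbase-env.sh. The helper name is hypothetical, and pairing the threshold with -XX:+UseConcMarkSweepGC is an assumption about the setup; both flags only apply to JVMs that still ship the CMS collector.

```python
def cms_gc_opts(occupancy_percent=70):
    """Return JVM flags that start CMS once the old generation is this full.

    occupancy_percent is the value for -XX:CMSInitiatingOccupancyFraction;
    the excerpt above suggests starting at 60 or 70.
    """
    if not 0 < occupancy_percent < 100:
        raise ValueError("occupancy_percent must be between 1 and 99")
    return " ".join([
        "-XX:+UseConcMarkSweepGC",  # assumption: CMS is the collector in use
        f"-XX:CMSInitiatingOccupancyFraction={occupancy_percent}",
    ])


if __name__ == "__main__":
    # Line one could place in hbase-env.sh (illustrative only).
    print(f'export HBASE_OPTS="$HBASE_OPTS {cms_gc_opts(70)}"')
```

A lower value such as 60 starts collection earlier at the cost of more CPU, so it is probably best to begin at 70 and adjust after watching the GC logs.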
5.
The Best News of All
After producing the above graph, I let the insert workload run overnight, and then continued for several days. In all of this time, there was not a single GC pause that lasted longer than a second. The fragmentation problem was completely solved for this workload!
How to try it
The MSLAB allocation scheme is available in Apache HBase 0.90.1, and part of CDH3 Beta 4 released last week. Since it is relatively new, it is not yet enabled by default, but it can be configured using the following flags:
Configuration | Description |
---|---|
hbase.hregion.memstore.mslab.enabled | Set to true to enable this feature |
hbase.hregion.memstore.mslab.chunksize | The size of the chunks allocated by MSLAB, in bytes (default 2MB) |
hbase.hregion.memstore.mslab.max.allocation | The maximum size byte array that should come from the MSLAB, in bytes (default 256KB) |
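The three flags in the table are normally set as property entries in hbase-site.xml. The sketch below renders such a fragment; the function name is hypothetical, and the default values simply restate the table (2MB chunks, 256KB maximum allocation), so they should be checked against the HBase version in use.

```python
import xml.etree.ElementTree as ET


def mslab_site_xml(enabled=True, chunk_size=2 * 1024 * 1024,
                   max_allocation=256 * 1024):
    """Render the MSLAB settings as an hbase-site.xml style fragment."""
    settings = {
        "hbase.hregion.memstore.mslab.enabled": str(enabled).lower(),
        "hbase.hregion.memstore.mslab.chunksize": str(chunk_size),
        "hbase.hregion.memstore.mslab.max.allocation": str(max_allocation),
    }
    root = ET.Element("configuration")
    for name, value in settings.items():
        # Each setting becomes <property><name>..</name><value>..</value></property>
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(mslab_site_xml(enabled=True))
```

Both sizes are given in bytes, matching the descriptions in the table.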
6. http://inking007.tistory.com/entry/Performance-Tuning-Hbase