org.apache.hadoop.zebra.io
Class KeyDistribution

java.lang.Object
  extended by org.apache.hadoop.zebra.io.KeyDistribution

public class KeyDistribution
extends Object

Describes how on-disk data is distributed among key-partitioned buckets. The MapReduce layer uses this class to calculate intelligent splits.
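
To illustrate how a per-bucket byte distribution could drive split calculation, here is a minimal, self-contained sketch. It does not use the Zebra API: the class SplitPlanSketch, the method planSplits, and the greedy grouping strategy are all assumptions made for illustration, and the bucket sizes are made-up inputs standing in for whatever byte counts a caller derives from the key-partitioned buckets.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not part of org.apache.hadoop.zebra.io. */
public class SplitPlanSketch {

    /**
     * Greedily groups consecutive buckets into splits of roughly targetBytes each.
     * Returns, for each split, the index of its last bucket (inclusive).
     */
    public static List<Integer> planSplits(long[] bucketBytes, long targetBytes) {
        if (targetBytes <= 0) {
            throw new IllegalArgumentException("targetBytes must be positive");
        }
        List<Integer> lastBucketOfSplit = new ArrayList<>();
        long accumulated = 0;
        int lastClosed = -1; // index of the last bucket already assigned to a split
        for (int i = 0; i < bucketBytes.length; i++) {
            accumulated += bucketBytes[i];
            if (accumulated >= targetBytes) {
                // Enough bytes gathered: close the current split at bucket i.
                lastBucketOfSplit.add(i);
                lastClosed = i;
                accumulated = 0;
            }
        }
        // Any trailing buckets form one final, possibly smaller, split.
        if (lastClosed < bucketBytes.length - 1) {
            lastBucketOfSplit.add(bucketBytes.length - 1);
        }
        return lastBucketOfSplit;
    }

    public static void main(String[] args) {
        long[] bucketBytes = {40, 70, 10, 30, 90, 20}; // made-up bucket sizes
        System.out.println(planSplits(bucketBytes, 100)); // prints [1, 4, 5]
    }
}
```

In this sketch each returned index marks a bucket boundary, which corresponds to a sampling key at which one split would end and the next begin.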


Method Summary
 BlockDistribution getBlockDistribution(RawComparable key)
           
 RawComparable[] getKeys()
          Get the list of sampling keys.
 long getMinStepSize()
          Get the minimum split step size across all tables in the union.
 long length()
          Get the total unique bytes contained in the key-partitioned buckets.
static KeyDistribution merge(KeyDistribution[] sourceKeys)
          Merge the key samples. Algorithm: select the smallest key from all clean source ranges and from the ranges subsequent to the respective dirty ranges.
 int resize(BlockDistribution lastBd)
           
 int size()
          Get the size of the key sampling.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

length

public long length()
Get the total unique bytes contained in the key-partitioned buckets.

Returns:
The total number of unique bytes contained in the key-partitioned buckets.

size

public int size()
Get the size of the key sampling.

Returns:
Number of key samples.

getMinStepSize

public long getMinStepSize()
Get the minimum split step size across all tables in the union.


getKeys

public RawComparable[] getKeys()
Get the list of sampling keys.

Returns:
A list of sampling keys.

getBlockDistribution

public BlockDistribution getBlockDistribution(RawComparable key)

merge

public static KeyDistribution merge(KeyDistribution[] sourceKeys)
                             throws IOException
Merge the key samples. Algorithm: select the smallest key from all clean source ranges and from the ranges subsequent to the respective dirty ranges. A dirty range is a range that has been partially needed by one or more of the previous final ranges.

Parameters:
sourceKeys - key samples to be merged
Returns:
the merged key samples
Throws:
IOException
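
The "select the smallest key" step resembles a k-way merge of sorted key lists. The following sketch shows only that simplified idea: it merges sorted byte-array keys from several sources into one sorted, de-duplicated list. It is not Zebra's implementation and omits the clean/dirty range bookkeeping entirely. The class KeySampleMergeSketch, the method mergeSortedKeys, and the use of unsigned lexicographic byte order in place of a RawComparable comparator are assumptions. It requires Java 9 or later for Arrays.compareUnsigned.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration; not part of org.apache.hadoop.zebra.io. */
public class KeySampleMergeSketch {

    /** Tracks the current position inside one sorted source array. */
    private static final class Cursor {
        final byte[][] keys;
        int index;

        Cursor(byte[][] keys) {
            this.keys = keys;
        }

        byte[] current() {
            return keys[index];
        }
    }

    /** Merges sorted key arrays into one sorted list without duplicates. */
    public static List<byte[]> mergeSortedKeys(List<byte[][]> sources) {
        // The heap always exposes the source whose current key is smallest.
        PriorityQueue<Cursor> heap = new PriorityQueue<>(
                (a, b) -> Arrays.compareUnsigned(a.current(), b.current()));
        for (byte[][] source : sources) {
            if (source.length > 0) {
                heap.add(new Cursor(source));
            }
        }
        List<byte[]> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor smallest = heap.poll();
            byte[] key = smallest.current();
            // Skip a key equal to the one emitted last.
            if (merged.isEmpty() || !Arrays.equals(merged.get(merged.size() - 1), key)) {
                merged.add(key);
            }
            // Advance this source and re-queue it if keys remain.
            if (++smallest.index < smallest.keys.length) {
                heap.add(smallest);
            }
        }
        return merged;
    }

    private static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<byte[][]> sources = Arrays.asList(
                new byte[][] {key("a"), key("c")},
                new byte[][] {key("b"), key("c"), key("d")});
        StringBuilder out = new StringBuilder();
        for (byte[] k : mergeSortedKeys(sources)) {
            out.append(new String(k, StandardCharsets.UTF_8)).append(' ');
        }
        System.out.println(out.toString().trim()); // prints a b c d
    }
}
```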

resize

public int resize(BlockDistribution lastBd)


Copyright © 2012 The Apache Software Foundation