org.apache.pig.builtin
Class Distinct
java.lang.Object
org.apache.pig.EvalFunc<DataBag>
org.apache.pig.builtin.Distinct
- All Implemented Interfaces:
- Algebraic
public class Distinct
- extends EvalFunc<DataBag>
- implements Algebraic
Find the distinct set of tuples in a bag.
This is a blocking operator: all of the input is first collected into the hash set
implemented by DistinctDataBag, which also provides the other DataBag interfaces.
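The blocking, hash-set-based behavior described above can be illustrated with a minimal sketch. It deliberately avoids the Pig API: plain Java lists stand in for Tuple and DataBag, and the class name DistinctSketch and its distinct method are hypothetical, not part of org.apache.pig. A LinkedHashSet is used only to make the printed order deterministic; no ordering guarantee is implied for the real DistinctDataBag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: plain Java collections stand in for Pig's Tuple and DataBag.
final class DistinctSketch {

    // Returns the distinct tuples of a bag. Every input tuple is added to a
    // hash-based set before the result is built, which is why the operation
    // is blocking: nothing can be emitted until all input has been seen.
    static List<List<Object>> distinct(List<List<Object>> bag) {
        Set<List<Object>> seen = new LinkedHashSet<>();
        for (List<Object> tuple : bag) {
            seen.add(tuple); // duplicates are dropped via equals/hashCode
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<List<Object>> bag = Arrays.asList(
            Arrays.<Object>asList("a", 1),
            Arrays.<Object>asList("b", 2),
            Arrays.<Object>asList("a", 1));
        System.out.println(distinct(bag)); // prints [[a, 1], [b, 2]]
    }
}
```

The memory cost of this approach grows with the number of distinct tuples, not with the total input size.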
Methods inherited from class org.apache.pig.EvalFunc
finish, getArgToFuncMapping, getCacheFiles, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, isAsynchronous, outputSchema, progress, setPigLogger, setReporter, warn
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Distinct
public Distinct()
exec
public DataBag exec(Tuple input)
throws IOException
- Description copied from class:
EvalFunc
- This callback method must be implemented by all subclasses. It
is invoked on every Tuple of a given dataset.
Because the dataset may be divided up in a variety of ways, the programmer
should not make assumptions about state that is maintained between
invocations of this method.
- Specified by:
exec
in class EvalFunc<DataBag>
- Parameters:
input
- the Tuple to be processed.
- Returns:
- the result, of type T (DataBag for this class).
- Throws:
IOException
getFinal
public String getFinal()
- Description copied from interface:
Algebraic
- Get the final function.
- Specified by:
getFinal
in interface Algebraic
- Returns:
- A function name of f_final. f_final should be an eval func parametrized by
the same datum as the eval func implementing this interface.
getInitial
public String getInitial()
- Description copied from interface:
Algebraic
- Get the initial function.
- Specified by:
getInitial
in interface Algebraic
- Returns:
- A function name of f_init. f_init should be an eval func.
getIntermed
public String getIntermed()
- Description copied from interface:
Algebraic
- Get the intermediate function.
- Specified by:
getIntermed
in interface Algebraic
- Returns:
- A function name of f_intermed. f_intermed should be an eval func.
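The three functions named by getInitial, getIntermed and getFinal suggest a staged computation. The following is a rough sketch of why a distinct operation fits such a decomposition, under the assumption that the initial stage runs on chunks of raw input and the later stages merge partial results. It does not use the Pig API and does not reproduce how Distinct implements these stages; AlgebraicDistinctSketch and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: hypothetical names, plain Java sets instead of Pig types.
final class AlgebraicDistinctSketch {

    // Initial stage: deduplicate one chunk of raw input.
    static Set<String> initialStage(List<String> chunk) {
        return new LinkedHashSet<>(chunk);
    }

    // Intermediate stage: merge partial results. Because a union of sets is
    // itself a set, this stage can be applied any number of times.
    static Set<String> intermedStage(List<Set<String>> partials) {
        Set<String> merged = new LinkedHashSet<>();
        for (Set<String> partial : partials) {
            merged.addAll(partial);
        }
        return merged;
    }

    // Final stage: merge whatever partial results remain into the answer.
    static Set<String> finalStage(List<Set<String>> partials) {
        return intermedStage(partials);
    }

    public static void main(String[] args) {
        Set<String> p1 = initialStage(Arrays.asList("a", "b", "a"));
        Set<String> p2 = initialStage(Arrays.asList("b", "c"));
        Set<String> p3 = initialStage(Arrays.asList("c", "d"));
        Set<String> merged = intermedStage(Arrays.asList(p1, p2));
        System.out.println(finalStage(Arrays.asList(merged, p3))); // prints [a, b, c, d]
    }
}
```

The result is the same however the input is chunked, which is the property that makes a staged evaluation safe for this kind of function.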
getDistinct
protected DataBag getDistinct(Tuple input)
throws IOException
- Throws:
IOException
Copyright © 2012 The Apache Software Foundation