org.apache.pig.builtin
Class TOBAG
java.lang.Object
org.apache.pig.EvalFunc<DataBag>
org.apache.pig.builtin.TOBAG
public class TOBAG
- extends EvalFunc<DataBag>
This class takes a list of items and puts them into a bag, one tuple per item. For example:
T = foreach U generate TOBAG($0, $1, $2);
It's like saying this:
T = foreach U generate {($0), ($1), ($2)}
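The wrapping described above can be sketched in plain Java. This is only a model of the idea, not the Pig implementation: the class ToBagSketch and the helper toBag are hypothetical names, tuples and bags are modeled as java.util.List rather than Pig's Tuple and DataBag, and any special handling of tuple-typed arguments is not modeled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Plain-Java model of the TOBAG idea; it does not use the Pig API.
class ToBagSketch {

    // Hypothetical helper: wraps each argument in its own one-field "tuple"
    // (modeled as a List) and collects those tuples in a "bag" (also a List).
    static List<List<Object>> toBag(Object... fields) {
        List<List<Object>> bag = new ArrayList<>();
        for (Object field : fields) {
            bag.add(Collections.singletonList(field));
        }
        return bag;
    }

    public static void main(String[] args) {
        // Mirrors TOBAG($0, $1, $2) on a row whose first three fields are 1, 2, 3.
        System.out.println(toBag(1, 2, 3)); // prints [[1], [2], [3]]
    }
}
```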
Output schema:
The output schema for this udf depends on the schema of its arguments.
If all the arguments have the same type and, for bag/tuple columns, the
same inner schema, the udf output schema is a bag of tuples with a single
column of that type and inner schema (if any).
If the arguments are of type tuple/bag, their inner schemas, including
the alias names, must match.
If these conditions are not met, the output schema is a bag with a null
inner schema.
example 1
grunt> describe a;
a: {a0: int,a1: int}
grunt> b = foreach a generate TOBAG(a0,a1);
grunt> describe b;
b: {{int}}
example 2
grunt> describe a;
a: {a0: (x: int),a1: (x: int)}
grunt> b = foreach a generate TOBAG(a0,a1);
grunt> describe b;
b: {{(x: int)}}
example 3
grunt> describe a;
a: {a0: (x: int),a1: (y: int)}
-- note that the inner schemas differ because the aliases (x and y) are different
grunt> b = foreach a generate TOBAG(a0,a1);
grunt> describe b;
b: {{NULL}}
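The rule behind the three examples can be modeled in a few lines of plain Java. This is a sketch under stated assumptions, not the logic of outputSchema itself: ToBagSchemaSketch, innerSchema and describe are hypothetical names, each argument's type and inner schema is represented as a plain string instead of Pig's Schema classes, and the result for an empty argument list is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Plain-Java model of the output-schema rule; it does not use Pig's Schema classes.
class ToBagSchemaSketch {

    // Hypothetical helper. Each argument is described by a string holding its
    // type and, for tuples/bags, its inner schema with aliases, e.g. "int" or
    // "(x: int)". Returns the shared description if all arguments agree,
    // otherwise empty (standing in for a bag with a null inner schema).
    static Optional<String> innerSchema(List<String> argSchemas) {
        if (argSchemas.isEmpty()) {
            return Optional.empty(); // assumption: no arguments means no known schema
        }
        String first = argSchemas.get(0);
        for (String schema : argSchemas) {
            if (!first.equals(schema)) {
                return Optional.empty();
            }
        }
        return Optional.of(first);
    }

    // Renders the result the way the describe output above shows it.
    static String describe(List<String> argSchemas) {
        return "{{" + innerSchema(argSchemas).orElse("NULL") + "}}";
    }

    public static void main(String[] args) {
        System.out.println(describe(Arrays.asList("int", "int")));           // example 1
        System.out.println(describe(Arrays.asList("(x: int)", "(x: int)"))); // example 2
        System.out.println(describe(Arrays.asList("(x: int)", "(y: int)"))); // example 3
    }
}
```

In this model the top-level aliases (a0, a1) are deliberately left out of the strings, since only the type and the inner schema take part in the comparison.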
Constructor Summary
TOBAG()
Methods inherited from class org.apache.pig.EvalFunc
finish, getArgToFuncMapping, getCacheFiles, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, isAsynchronous, progress, setPigLogger, setReporter, warn
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
TOBAG
public TOBAG()
exec
public DataBag exec(Tuple input)
throws IOException
- Description copied from class: EvalFunc
- This callback method must be implemented by all subclasses. It is
the method that is invoked on every Tuple of a given dataset.
Since the dataset may be divided up in a variety of ways, the programmer
should not make assumptions about state that is maintained between
invocations of this method.
- Specified by: exec in class EvalFunc<DataBag>
- Parameters:
input - the Tuple to be processed.
- Returns:
- result, of type T (DataBag for this class).
- Throws:
IOException
outputSchema
public Schema outputSchema(Schema inputSch)
- Description copied from class: EvalFunc
- Report the schema of the output of this UDF. Pig will make use of
this in error checking, optimization, and planning. The schema
of input data to this UDF is provided.
- Overrides: outputSchema in class EvalFunc<DataBag>
- Parameters:
inputSch - Schema of the input
- Returns:
- Schema of the output
Copyright © 2012 The Apache Software Foundation