
UDF (User Defined Functions):
There are times when Pig's built-in operators and functions will not suffice.
Pig provides the ability to implement your own functions of the following types:
1.Filter:
   Ex: res = FILTER bag BY udfFilter(post);
2.Load Function:
   Ex: res = LOAD 'file.txt' USING udfload();
3.Eval:
   Ex: res = FOREACH bag GENERATE udfEval($1);

Implement Custom Eval Function:
Eval is the most common type of function. Its base class looks like this:
public abstract class EvalFunc<T> {
            public abstract T exec(Tuple input) throws IOException;
}

Input:


The goal is to convert the input data to upper case.

Code:
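The original listing is not available here, so the following is only a minimal sketch of what an upper-casing eval function could look like. To keep it compilable without the Pig jar, it declares small stand-ins for `Tuple` and `EvalFunc` (both assumptions of this sketch, and `Tuple.of` is a hypothetical helper that is not part of Pig). In a real project you would remove the stand-ins, import `org.apache.pig.EvalFunc` and `org.apache.pig.data.Tuple`, and declare the UDF class `public`.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Stand-in for org.apache.pig.data.Tuple (assumption: only the methods used below).
interface Tuple {
    int size();
    Object get(int fieldNum) throws IOException;

    // Hypothetical helper for trying the sketch out; not part of Pig's API.
    static Tuple of(Object... fields) {
        final List<Object> values = Arrays.asList(fields);
        return new Tuple() {
            public int size() { return values.size(); }
            public Object get(int fieldNum) { return values.get(fieldNum); }
        };
    }
}

// Stand-in for org.apache.pig.EvalFunc, mirroring the signature shown earlier.
abstract class EvalFunc<T> {
    public abstract T exec(Tuple input) throws IOException;
}

// Eval function that upper-cases its first argument.
class UPPER extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        // Pig passes the UDF arguments as a tuple; return null for missing input.
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        return input.get(0).toString().toUpperCase();
    }
}
```

Returning null for an empty or null input lets Pig carry the missing value through instead of failing the whole job on one bad record.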


Compile the Java code above and package it into a jar.

Write a Pig script myscript.pig as shown below:
REGISTER myUDF.jar
A = LOAD 'student_data' AS (name:chararray,age:int,gpa:float);
B = FOREACH A GENERATE UPPER(name);
DUMP B;

Implement Custom Filter Function:
Write a custom filter function that removes records whose given field is longer than 15 characters.
Filtered = FILTER posts BY isShort(post);

Steps to implement a custom filter:
1. Extend the FilterFunc class and implement the exec method
2. Register the jar with the Pig script
3. Use the custom filter function in the Pig script

Java Code:
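The original listing is not available here, so what follows is only a sketch of a filter that keeps values of at most 15 characters. It declares simplified stand-ins for `Tuple` and `FilterFunc` so that it compiles without the Pig jar (assumptions of this sketch; `Tuple.of` is a hypothetical helper, not part of Pig). In a real project you would import `org.apache.pig.FilterFunc` and `org.apache.pig.data.Tuple` instead, declare the class `public`, and place it in the `pig` package so that it matches the `pig.IsShort` name used in the script.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Stand-in for org.apache.pig.data.Tuple (assumption: only the methods used below).
interface Tuple {
    int size();
    Object get(int fieldNum) throws IOException;

    // Hypothetical helper for trying the sketch out; not part of Pig's API.
    static Tuple of(Object... fields) {
        final List<Object> values = Arrays.asList(fields);
        return new Tuple() {
            public int size() { return values.size(); }
            public Object get(int fieldNum) { return values.get(fieldNum); }
        };
    }
}

// Simplified stand-in for org.apache.pig.FilterFunc, whose exec returns a Boolean.
abstract class FilterFunc {
    public abstract Boolean exec(Tuple input) throws IOException;
}

// Filter that accepts a record only if its first field is a short string.
class IsShort extends FilterFunc {
    private static final int MAX_CHARS = 15;

    @Override
    public Boolean exec(Tuple input) throws IOException {
        // Reject records with no usable argument.
        if (input == null || input.size() == 0) {
            return false;
        }
        Object field = input.get(0);
        if (field instanceof String) {
            // Keep the record when the value has at most MAX_CHARS characters.
            return ((String) field).length() <= MAX_CHARS;
        }
        // Null or non-string values are filtered out.
        return false;
    }
}
```

Treating null and non-string values as "not short" is a design choice of this sketch; it means such records are dropped by the FILTER statement.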


Compile the Java code with the filter function and package it into a jar file.
Register the jar file in the Pig script:
REGISTER Hadoopsamples.jar
The path of the jar file can be either absolute or relative to the execution path.
The path must not be wrapped in quotes.
REGISTER adds the jar file to Java's classpath.
Pig locates functions by looking on the classpath for the fully qualified class name:
Filtered = FILTER posts BY pig.IsShort(post);
An alias can be given to the function name using the DEFINE operator:
DEFINE isShort pig.IsShort();

Pig Script:
--customfilter.pig
REGISTER Hadoopsamples.jar
DEFINE isShort pig.IsShort();
posts = LOAD '/training/data/user-post.txt' USING PigStorage(',')
               AS (user:chararray,post:chararray,date:long);
Filtered = FILTER posts BY isShort(post);
DUMP Filtered;