
Example: Pig Aggregation





Input:



Step 1: Enter the grunt shell
pig -x local


Step 2: Load data
log = LOAD '/var/lib/hadoop-0.20/inputs/pigfile1' AS (user, id, welcome);

After the data is loaded, executing the dump command on log displays the data as it is stored, shown below.
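A minimal sketch of the load-and-dump step, assuming the input file is tab-separated. Tab is the default delimiter of PigStorage, so the USING clause here only makes that assumption explicit. DESCRIBE is added to show the schema Pig assigned; because no types are declared, each field defaults to bytearray.

```pig
-- Assumption: fields in pigfile1 are separated by tabs.
log = LOAD '/var/lib/hadoop-0.20/inputs/pigfile1' USING PigStorage('\t') AS (user, id, welcome);
DESCRIBE log;  -- prints the schema of the relation
DUMP log;      -- runs the load and prints every tuple to the console
```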


Step 3: Group the log by user
grpd = GROUP log BY user;
On dumping grpd, it contains the content below.
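The grouped relation has two fields: group, which holds the grouping key (here the user value), and a bag named after the original relation, log, which holds every tuple that shares that key. A short sketch for inspecting it, repeating the load so that it runs on its own:

```pig
log  = LOAD '/var/lib/hadoop-0.20/inputs/pigfile1' AS (user, id, welcome);
grpd = GROUP log BY user;
DESCRIBE grpd;  -- shows the group key plus the nested bag called log
DUMP grpd;      -- prints one tuple per distinct user
```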


Step 4: Count the records for each user
cntd = FOREACH grpd GENERATE group, COUNT(log);
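COUNT(log) counts the tuples in each group's bag, skipping tuples whose first field is null; COUNT_STAR includes those as well. The sketch below is a variant of the statement above that names the output fields. The alias hits is a hypothetical name chosen for illustration and is not part of the original script.

```pig
log  = LOAD '/var/lib/hadoop-0.20/inputs/pigfile1' AS (user, id, welcome);
grpd = GROUP log BY user;
-- 'hits' is an illustrative alias; the original statement leaves the fields unnamed.
cntd = FOREACH grpd GENERATE group AS user, COUNT(log) AS hits;
DUMP cntd;  -- prints one (user, hits) tuple per distinct user
```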


Step 5: Store the output to a file
STORE cntd INTO '/var/lib/hadoop-0.20/inputs/pigfile1output2';



The above is the final output.
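STORE writes a directory of part files at the given path, not a single file, and the job fails if that directory already exists. As a sketch, the whole example can be run from the grunt shell and the stored result printed with the cat command, which concatenates the files inside a directory:

```pig
log  = LOAD '/var/lib/hadoop-0.20/inputs/pigfile1' AS (user, id, welcome);
grpd = GROUP log BY user;
cntd = FOREACH grpd GENERATE group, COUNT(log);
-- The output directory must not exist before this statement runs.
STORE cntd INTO '/var/lib/hadoop-0.20/inputs/pigfile1output2';
-- Grunt shell command: print the contents of the stored part files.
cat /var/lib/hadoop-0.20/inputs/pigfile1output2;
```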


