Step 1: Enter the Grunt shell in local mode
pig -x local
Step 2: Load data
log = LOAD '/var/lib/hadoop-0.20/inputs/pigfile1' AS (user, id, welcome);
After loading the data, run DUMP log; to print the tuples stored in the log relation.
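The LOAD statement above declares no delimiter or field types, so Pig falls back to its defaults (tab-delimited input, untyped fields). A minimal sketch of a more explicit load follows. The tab delimiter and the field types are assumptions, since the original does not say what pigfile1 contains.

```pig
-- Assumption: the input is tab-delimited. The field types below are guesses;
-- the original LOAD statement does not declare them.
log = LOAD '/var/lib/hadoop-0.20/inputs/pigfile1'
      USING PigStorage('\t')
      AS (user:chararray, id:int, welcome:chararray);

DESCRIBE log;  -- prints the schema of the relation
DUMP log;      -- prints every tuple of the relation to the console
```

DESCRIBE is a quick way to check the schema before running the more expensive DUMP.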
Step 3: Group the log by user
grpd = GROUP log BY user;
Dumping grpd shows one tuple per user: the group key followed by a bag of that user's log records.
Step 4: Count the records for each user
cntd = FOREACH grpd GENERATE group, COUNT(log);
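The grouping and counting can also be written with named output fields, which makes the result easier to read in later statements. This is only a sketch: the relation name cntd_named and the field name record_count are hypothetical and do not appear in the original steps.

```pig
-- Assumes the relation grpd from the grouping step already exists.
DESCRIBE grpd;  -- shows the schema: the group key plus a bag named log

-- Hypothetical names: cntd_named and record_count are for illustration only.
cntd_named = FOREACH grpd GENERATE group AS user, COUNT(log) AS record_count;
DUMP cntd_named;
```

Note that COUNT skips tuples whose first field is null. If those records should be counted as well, COUNT_STAR can be used instead.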
Step 5: Store the output to a file
STORE cntd INTO '/var/lib/hadoop-0.20/inputs/pigfile1output2';
The files written to this output directory hold the final result: one line per user with that user's record count.
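STORE fails if the output directory already exists, so rerunning the step needs the old output removed first. The sketch below, meant to be typed in the Grunt shell, assumes the previous output may be discarded and that a comma-separated result is wanted. Neither is stated in the original, which uses the default tab delimiter.

```pig
-- Assumption: any earlier output at this path can be deleted.
-- rmf removes the path and does not complain if it is missing.
rmf /var/lib/hadoop-0.20/inputs/pigfile1output2

-- Assumption: comma-separated output; omit the USING clause to keep the default tab.
STORE cntd INTO '/var/lib/hadoop-0.20/inputs/pigfile1output2' USING PigStorage(',');

-- Print the stored files from inside the Grunt shell.
cat /var/lib/hadoop-0.20/inputs/pigfile1output2
```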