Pig Functions Examples


Below is one of the good collection of examples for most frequently used functions in Pig. Pig Functions Examples.

Contents

LOAD

DESCRIEBE/EXPLAIN/ILLUSTRATE

FOREACH

GROUP

STORE

LIMIT

ORDER

DISTINCT

JOIN

JOIN USING MULTIPLE KEYS

OUTER JOINS

SELF JOIN

COUNT NUMBER OF ROWS IN SELF JOIN’S OUTPUT

SAMPLE

PARALLEL

UDF:REGISTER

UDF:DEFINE

CALLING JAVA STATIC FUNCTIONS

FLATTEN

REPLACE EMPTY BAG WITH CONSTANT BAG

NESTED FOREACH

ORDER BY THE group

SAMPLE SCRIPTS

EXEXUTE PIG SCRIPT COMMAND LINE

PIG VERSION

LOAD

DESCRIEBE/EXPLAIN/ILLUSTRATE

FOREACH

GROUP

STORE

LIMIT

ORDER

DISTINCT

JOIN

JOIN USING MULTIPLE KEYS

OUTER JOINS:

SELF JOIN:

COUNT NUMBER OF ROWS IN SELF JOIN’S OUTPUT:

SAMPLE:

PARALLEL:

UDF:REGISTER

UDF:DEFINE

CALLING JAVA STATIC FUNCTIONS:

FLATTEN

REPLACE EMPTY BAG WITH CONSTANT BAG:

NESTED FOREACH

–NOTE: DOESN’T SEEM TO WORK; top3 spits > 3 records

ORDER BY THE group:

SAMPLE SCRIPTS:

[cloudera@localhost ~]$ pig -e ‘illustrate -script scripts/illustrate.pig’;

ERROR:

OTHER:

EXEXUTE PIG SCRIPT COMMAND LINE:

PIG VERSION

–HADOOP VERSION

Note: above command lists the HDFS directory; results are same as executing the command: hadoop fs -ls

Thank You Ganesh Pillai for such a good documentation of these commands with examples.


Profile photo of Siva

About Siva

Senior Hadoop developer with 4 years of experience in designing and architecture solutions for the Big Data domain and has been involved with several complex engagements. Technical strengths include Hadoop, YARN, Mapreduce, Hive, Sqoop, Flume, Pig, HBase, Phoenix, Oozie, Falcon, Kafka, Storm, Spark, MySQL and Java.

Leave a comment

Your email address will not be published. Required fields are marked *


Review Comments
default image

I am a plsql developer. Intrested to move into bigdata.

Neetika Singh ITA Hadoop in Dec/2016 December 22, 2016

.