
Pig Data Types




Data Models:
• Pig supports 4 basic types
Atom: a simple atomic value (int, long, double, chararray)
              ex: 'Edureka'
Tuple: a sequence of fields, each of which can be any of the data types
              ex: ('Edureka', 'Bangalore')
Bag: a collection of tuples of potentially varying structures
              ex: {('Educomp'), ('Edureka', 'Bangalore')}

A bag is one of the data models in Pig: an unordered collection of tuples that may contain duplicates. Bags are used to hold the collections produced by grouping. A bag does not have to fit entirely in memory. When a bag grows too large, Pig spills it to local disk and keeps only part of it in memory, so the size of a bag is limited by the size of the local disk rather than by memory. Bags are written with "{}".
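
The spilling behaviour described above can be pictured with a small Python sketch. This is only an analogy, not Pig's implementation: the class name `SpillableBag`, the `max_in_memory` threshold, and the pickle-based temporary file are all assumptions made for illustration.

```python
import pickle
import tempfile


class SpillableBag:
    """Toy bag that keeps at most `max_in_memory` tuples in RAM.

    The threshold and the pickle spill file are illustrative assumptions,
    not Pig's real limits or on-disk format.
    """

    def __init__(self, max_in_memory=2):
        self.max_in_memory = max_in_memory
        self.in_memory = []                          # tuples currently in RAM
        self.spill_file = tempfile.TemporaryFile()   # stands in for local disk
        self.spilled = 0                             # tuples written to disk

    def add(self, tup):
        self.in_memory.append(tup)
        if len(self.in_memory) > self.max_in_memory:
            self._spill()

    def _spill(self):
        # Append everything held in memory to the end of the spill file.
        self.spill_file.seek(0, 2)
        for tup in self.in_memory:
            pickle.dump(tup, self.spill_file)
            self.spilled += 1
        self.in_memory = []

    def __iter__(self):
        # Yield the spilled tuples first, then the ones still in memory.
        self.spill_file.seek(0)
        for _ in range(self.spilled):
            yield pickle.load(self.spill_file)
        yield from self.in_memory


items = [("a",), ("b",), ("a",), ("c", 1), ("d",)]
bag = SpillableBag(max_in_memory=2)
for item in items:
    bag.add(item)
```

Every tuple, duplicates included, is still reachable by iterating over the bag, even though only part of it is held in memory at any moment.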

Map: an associative array; the key must be a chararray, but the value can be any type
             ex: ['Name'#'Edureka']
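
For readers who think in a general-purpose language, the four types can be loosely mirrored in Python. This is a rough analogy rather than Pig Latin: the variable names and the helper `is_valid_map` are hypothetical, and a `Counter` stands in for a bag only because it ignores order and counts duplicates.

```python
from collections import Counter

# Atom: a single scalar value (a Pig chararray is modelled as a Python str).
atom = "Edureka"

# Tuple: an ordered sequence of fields; each field may be any type.
tup = ("Edureka", "Bangalore")

# Bag: an unordered collection of tuples that may hold duplicates and
# tuples of different lengths. A Counter ignores insertion order and
# records how many times each tuple occurs.
bag = Counter([("Educomp",), ("Edureka", "Bangalore"), ("Educomp",)])

# Map: string keys, values of any type (here an atom, a tuple and a bag).
pig_map = {"Name": atom, "Location": tup, "Rows": bag}


def is_valid_map(m):
    """Check the one rule stated for maps: every key must be a string."""
    return all(isinstance(key, str) for key in m)
```

Two bags built from the same tuples in a different order compare equal here, which matches the "unordered, duplicates allowed" description of a bag.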

Pig Data Types:

Does Pig give any warning when there is a type mismatch or missing field?

No. Pig does not show a clear warning when a field is missing or its type does not match. Even if you assume that Pig emits such a warning, it is difficult to find in the log file. When a mismatch is found, Pig simply treats the value as null.
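
The null-on-mismatch behaviour can be sketched in Python as follows. This is an illustration of the idea only, not Pig's loader: the functions `cast_field` and `apply_schema` and the sample schema are hypothetical, and Python's `None` stands in for Pig's null.

```python
def cast_field(raw, target_type):
    """Return `raw` converted to `target_type`, or None if that fails."""
    try:
        return target_type(raw)
    except (TypeError, ValueError):
        return None  # a mismatch becomes null instead of raising an error


def apply_schema(fields, schema):
    """Cast each field to its declared type; missing fields become None."""
    row = []
    for i, (_name, target_type) in enumerate(schema):
        raw = fields[i] if i < len(fields) else None
        row.append(None if raw is None else cast_field(raw, target_type))
    return tuple(row)


schema = [("name", str), ("age", int), ("city", str)]
good = apply_schema(["Edureka", "7", "Bangalore"], schema)
bad = apply_schema(["Edureka", "seven"], schema)  # bad age, missing city
```

In this sketch the second row loads without any error, and both the unparseable `age` and the absent `city` silently come back as `None`, which is why such problems are easy to miss.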

