Top "Apache-pig" questions

Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs.

Conditional Filter in GROUP BY in Pig

I have the following dataset in which I need to merge multiple rows into one if they have the same …

hadoop apache-pig hadoop-streaming
Storing results of UNION in PIG in a single file

I have a Pig script which produces four results, and I want to store all of them in a single file. …

hadoop apache-pig hdfs
pig to hadoop issue: Server IPC version 7 cannot communicate with client version 4

I am trying to get pig started and failing: $ pig 2013-05-10 18:03:22,972 [main] INFO org.apache.pig.Main - Apache …

hadoop apache-pig
How can I add a header row to files created from Pig (Hadoop)?

I'm writing a pig latin script similar to the following: A = load 'data' using PigStorage('\t'); store A into …

hadoop apache-pig
Storing data to SequenceFile from Apache Pig

Apache Pig can load data from Hadoop sequence files using the PiggyBank SequenceFileLoader: REGISTER /home/hadoop/pig/contrib/piggybank/java/…

hadoop apache-pig
In Apache Pig, select DISTINCT rows based on a single column

Let's say I have a table such as the one below, that may or may not contain duplicates for a …

group-by apache-pig distinct
Pig - ERROR 1045: AVG as multiple or none of them fit. Please use an explicit cast

I have a comma-separated .txt file, and I want to DUMP the AVG age of all Males. records = LOAD 'file:/…

hadoop mapreduce apache-pig bigdata
How do I suppress the bloat of useless information when using the DUMP command while using grunt via 'pig -x local'?

I'm working with Pig Latin, using grunt, and every time I 'dump' something, my console gets clobbered with blah blah, blah …

dump apache-pig gruntjs verbosity
Pig keeps trying to connect to job history server (and fails)

I'm running a Pig job that fails to connect to the Hadoop job history server. The task (usually any task …

hadoop apache-pig
Installing PIG on single node

I installed Hadoop (1.0.2) for a single node on Windows 7 with Cygwin, and it is working. However, I cannot get PIG (0.10.0) …

hadoop apache-pig