Could not infer COUNT function

JasonA · Mar 22, 2012 · Viewed 13k times

I'm trying to write a Pig Latin script to get the count of a dataset that I've filtered.

Here's the script so far:

/* scans by title */

scans           = LOAD '/hive/scans/*' USING PigStorage(',') AS (thetime:long,product_id:long,lat:double,lon:double,user:chararray,category:chararray,title:chararray);
productscans    = FILTER scans BY (title MATCHES 'proactiv');
scancount       = FOREACH productscans GENERATE COUNT($0);
DUMP scancount;

For some reason, I get the error:

Could not infer the matching function for org.apache.pig.builtin.COUNT as multiple or none of them fit. Please use an explicit cast.

What am I doing wrong here? I'm assuming it has something to do with the type of the field I'm passing in, but I can't seem to resolve this.

TIA, Jason

Answer

Chris White · Mar 23, 2012

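COUNT expects a bag, but $0 inside that FOREACH is a single long field (thetime), so Pig cannot find a matching COUNT signature. Is this what you're looking for? Group by ALL to bring everything into one bag, then count the items: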

scans           = LOAD '/hive/scans/*' USING PigStorage(',') AS (thetime:long,product_id:long,lat:double,lon:double,user:chararray,category:chararray,title:chararray);
productscans    = FILTER scans BY (title MATCHES 'proactiv');
grouped         = GROUP productscans ALL;
count           = FOREACH grouped GENERATE COUNT(productscans);
dump count;
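
If a count per category is wanted rather than a single total, the same group-then-count pattern should carry over. This is a minimal sketch, assuming the schema from the question; the aliases by_category, category_counts and n are made-up names for illustration, not anything from the original script.

```pig
-- Sketch only: by_category, category_counts and n are made-up names.
scans        = LOAD '/hive/scans/*' USING PigStorage(',') AS (thetime:long,product_id:long,lat:double,lon:double,user:chararray,category:chararray,title:chararray);
productscans = FILTER scans BY (title MATCHES 'proactiv');

-- GROUP ... BY yields one tuple per distinct category; its second field is a
-- bag named productscans holding the matching tuples.
by_category     = GROUP productscans BY category;

-- COUNT takes that bag. It skips tuples whose first field (thetime) is null;
-- COUNT_STAR(productscans) would count those as well.
category_counts = FOREACH by_category GENERATE group AS category, COUNT(productscans) AS n;
DUMP category_counts;
```

In both versions COUNT is given the bag produced by GROUP, never a scalar field, which is what the original FOREACH was missing.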