How to read a Parquet file in standalone Java code?

teddy teddy · Feb 19, 2015 · Viewed 26.1k times

The Parquet docs from Cloudera show examples of integration with Pig/Hive/Impala, but in many cases I want to read the Parquet file itself for debugging purposes.

Is there a straightforward Java reader API for reading a Parquet file?

Thanks, Yang

Answer

rishiehari · Jan 21, 2017

Old method (deprecated):

// file is an org.apache.hadoop.fs.Path pointing at the Parquet file
AvroParquetReader<GenericRecord> reader = new AvroParquetReader<GenericRecord>(file);
GenericRecord nextRecord = reader.read();

New method:

// file is an org.apache.hadoop.fs.Path; read() returns null once every record has been consumed
ParquetReader<GenericRecord> reader = AvroParquetReader.<GenericRecord>builder(file).build();
GenericRecord nextRecord = reader.read();

I got this from here and have used this in my test cases successfully.
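
For debugging, it can also help to confirm that a file is a well-formed Parquet container before handing it to a reader. The sketch below is not from the answer above. It uses only the JDK and relies on one property of the Parquet file format: a file starts and ends with the 4-byte magic PAR1, and the 4 bytes just before the trailing magic hold the footer length as a little-endian integer. The class name ParquetMagicCheck and the method footerLength are hypothetical names chosen for illustration. It does not decode any records; use the AvroParquetReader for that.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: checks the PAR1 markers of a file without any Parquet library.
class ParquetMagicCheck {
    private static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);

    /** Returns the footer metadata length in bytes, or -1 if the PAR1 markers are missing. */
    static long footerLength(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long size = raf.length();
            // Smallest layout: 4-byte leading magic + 4-byte footer length + 4-byte trailing magic.
            if (size < 12) {
                return -1;
            }
            byte[] head = new byte[4];
            raf.readFully(head);

            // The last 8 bytes are the footer length followed by the trailing magic.
            byte[] tail = new byte[8];
            raf.seek(size - 8);
            raf.readFully(tail);

            boolean headOk = Arrays.equals(head, MAGIC);
            boolean tailOk = Arrays.equals(Arrays.copyOfRange(tail, 4, 8), MAGIC);
            if (!headOk || !tailOk) {
                return -1;
            }
            // Footer length is stored little-endian; mask to treat it as unsigned.
            int raw = ByteBuffer.wrap(tail, 0, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
            return raw & 0xFFFFFFFFL;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ParquetMagicCheck <file>");
            return;
        }
        long len = footerLength(args[0]);
        System.out.println(len < 0
                ? "Not a plain Parquet file (PAR1 markers missing)"
                : "Parquet markers found, footer length = " + len + " bytes");
    }
}
```

If this check fails, the problem is with the file itself (truncated upload, wrong format) rather than with the reader code, which narrows down where to look.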