Json to avro conversion

shree11 picture shree11 · Apr 17, 2015 · Viewed 32.1k times · Source

I'm converting Json to avro. I have json data in JSONArray. So while converting it into byte array i'm facing the problem.

below is my code:

static byte [] fromJsonToAvro(JSONArray json, String schemastr) throws Exception {

ExcelToJson ejj = new ExcelToJson();
List<String> list = new ArrayList<String>();


if (json != null) { 
    int len = json.length();
    for (int i=0;i<len;i++){ 
        list.add(json.get(i).toString());
    } 
}


InputStream input = new ByteArrayInputStream(list.getBytes()); //json.toString().getBytes()

 DataInputStream din = new DataInputStream(input); 
                  .
                  . 
                  .//rest of the logic

So how can i do it? How to convert JsonArray object to bytes(i.e., how to use getBytes() method for JsonArray objects). The above code giving an error at list.getBytes() and saying getBytes() is undifined for list.

Answer

Keegan picture Keegan · May 12, 2015

Avro works at the record level, bound to a schema. I don't think there's such a concept as "convert this JSON fragment to bytes for an Avro field independent of any schema or record".

Assuming the array is part of a larger JSON record, if you're starting with a string of the record, you could do

public static byte[] jsonToAvro(String json, String schemaStr) throws IOException {
    InputStream input = null;
    DataFileWriter<GenericRecord> writer = null;
    Encoder encoder = null;
    ByteArrayOutputStream output = null;
    try {
        Schema schema = new Schema.Parser().parse(schemaStr);
        DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>(schema);
        input = new ByteArrayInputStream(json.getBytes());
        output = new ByteArrayOutputStream();
        DataInputStream din = new DataInputStream(input);
        writer = new DataFileWriter<GenericRecord>(new GenericDatumWriter<GenericRecord>());
        writer.create(schema, output);
        Decoder decoder = DecoderFactory.get().jsonDecoder(schema, din);
        GenericRecord datum;
        while (true) {
            try {
                datum = reader.read(null, decoder);
            } catch (EOFException eofe) {
                break;
            }
            writer.append(datum);
        }
        writer.flush();
        return output.toByteArray();
    } finally {
        try { input.close(); } catch (Exception e) { }
    }
}