Question: Populating nested records in Avro using a GenericRecord

user770215 · Mar 30, 2011 · Viewed 7.7k times

Suppose I’ve got the following schema:

{
  "name" : "Profile",
  "type" : "record",
  "fields" : [
    { "name" : "firstName", "type" : "string" },
    { "name" : "address", "type" : {
        "type" : "record",
        "name" : "AddressUSRecord",
        "fields" : [
          { "name" : "address1", "type" : "string" },
          { "name" : "address2", "type" : "string" },
          { "name" : "city", "type" : "string" },
          { "name" : "state", "type" : "string" },
          { "name" : "zip", "type" : "int" },
          { "name" : "zip4", "type" : "int" }
        ]
      }
    }
  ]
}

I’m using a GenericRecord to represent each Profile that gets created. To add a firstName, it’s easy to do the following:

// Parse the schema and open an Avro data file for writing
Schema sch = Schema.parse(schemaFile);
DataFileWriter<GenericRecord> fw =
    new DataFileWriter<GenericRecord>(new GenericDatumWriter<GenericRecord>())
        .create(sch, new File(outFile));
GenericRecord r = new GenericData.Record(sch);
r.put("firstName", "John");
fw.append(r);

But how would I set the city, for example? How do I represent the key as a string that the r.put method can understand?

Thanks

Answer

user770215 · Mar 31, 2011

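For the schema above, r.put has no key syntax for reaching into a nested field. Instead, create a second GenericRecord from the schema of the address field, set city on it, and put that record into the parent under "address":

// Build the nested record from the schema of the "address" field
GenericRecord t = new GenericData.Record(sch.getField("address").schema());
t.put("city", "beijing");
// Attach the nested record to the parent Profile record
r.put("address", t);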
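
For anyone who wants to try this end to end, here is a minimal sketch of the same approach with every field filled in. It assumes Avro 1.5 or later (for Schema.Parser) on the classpath. The class name NestedRecordExample, the method buildProfile, and all the address values other than "beijing" are made up for illustration; the schema is the one from the question, inlined as a string. Filling every field matters because none of the fields are nullable, so a record with unset fields will most likely be rejected when it is appended to the writer.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

public class NestedRecordExample {

    // The Profile schema from the question, inlined so the sketch is self-contained.
    static final String SCHEMA_JSON =
        "{\"name\":\"Profile\",\"type\":\"record\",\"fields\":["
      + "{\"name\":\"firstName\",\"type\":\"string\"},"
      + "{\"name\":\"address\",\"type\":{\"type\":\"record\","
      + "\"name\":\"AddressUSRecord\",\"fields\":["
      + "{\"name\":\"address1\",\"type\":\"string\"},"
      + "{\"name\":\"address2\",\"type\":\"string\"},"
      + "{\"name\":\"city\",\"type\":\"string\"},"
      + "{\"name\":\"state\",\"type\":\"string\"},"
      + "{\"name\":\"zip\",\"type\":\"int\"},"
      + "{\"name\":\"zip4\",\"type\":\"int\"}]}}]}";

    // Hypothetical helper: builds one fully populated Profile record.
    static GenericRecord buildProfile() {
        Schema profileSchema = new Schema.Parser().parse(SCHEMA_JSON);
        // The nested record needs its own schema, taken from the "address" field.
        Schema addressSchema = profileSchema.getField("address").schema();

        GenericRecord address = new GenericData.Record(addressSchema);
        address.put("address1", "1 Example Street"); // placeholder value
        address.put("address2", "");                 // placeholder value
        address.put("city", "beijing");
        address.put("state", "NA");                  // placeholder value
        address.put("zip", 12345);                   // placeholder value
        address.put("zip4", 6789);                   // placeholder value

        GenericRecord profile = new GenericData.Record(profileSchema);
        profile.put("firstName", "John");
        profile.put("address", address);
        return profile;
    }

    public static void main(String[] args) {
        GenericRecord profile = buildProfile();
        // Reading back works the same way: get the nested record, then its field.
        GenericRecord address = (GenericRecord) profile.get("address");
        System.out.println("city = " + address.get("city"));
        // Check the record against its schema before handing it to a writer.
        System.out.println("valid = "
            + GenericData.get().validate(profile.getSchema(), profile));
    }
}
```

The record returned by buildProfile() can be passed to fw.append(...) exactly like r in the question. If some address fields are genuinely optional, one option is to declare them in the schema as a union with null, for example ["null", "string"], rather than leaving them unset.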