Avro field default values

Kesh · Apr 8, 2014

I am running into some issues setting up default values for Avro fields. I have a simple schema as given below:

data.avsc:

{
 "namespace":"test",
 "type":"record",
 "name":"Data",
 "fields":[
    { "name": "id", "type": [ "long", "null" ] },
    { "name": "value", "type": [ "string", "null" ] },
    { "name": "raw", "type": [ "bytes", "null" ] }
 ]
}

I am using the avro-maven-plugin v1.7.6 to generate the Java model.

When I create an instance of the model using Data data = Data.newBuilder().build(); it fails with an exception:

org.apache.avro.AvroRuntimeException: org.apache.avro.AvroRuntimeException: Field id type:UNION pos:0 not set and has no default value.

But if I specify the "default" property,

{ "name": "id", "type": [ "long", "null" ], "default": "null" },

I do not get this error. I read in the documentation that the first schema in the union becomes the default schema. So my question is, why do I still need to specify the "default" property? How else do I make a field optional?

And if I do need to specify the default values, how does that work for a union; do I need to specify default values for each schema in the union and how does that work in terms of order/syntax?

Thanks.

Answer

Y.H. · Apr 30, 2014

According to the Avro specification, the default value of a union must correspond to the first schema of the union. Your union is defined as ["long", "null"], therefore the default value must be a long. null is not a long, which is why you are getting an error.

If you still want null as the default value, put the null schema first, i.e. change the union to ["null", "long"], and set "default": null (the JSON null, not the string "null").
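
As a rough sketch, here is how data.avsc from the question might look with that change applied. Reordering all three unions (not just id) and giving each field a JSON null default is an assumption about what the asker wants, namely that every field be optional:

```json
{
  "namespace": "test",
  "type": "record",
  "name": "Data",
  "fields": [
    { "name": "id", "type": [ "null", "long" ], "default": null },
    { "name": "value", "type": [ "null", "string" ], "default": null },
    { "name": "raw", "type": [ "null", "bytes" ], "default": null }
  ]
}
```

With a schema like this, regenerating the model and calling Data.newBuilder().build() should no longer throw, and any field that was not set should come back as null. A field that has no "default" property at all still has to be set explicitly on the builder, regardless of the order of the union.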