Apache Avro: map uses CharSequence as key

Mellon · Nov 1, 2013 · Viewed 8.5k times

I am using Apache Avro.

My schema has map type:

{"name": "MyData",
  "type": {"type": "map",
           "values": {
               "type": "record",
               "name": "Person",
               "fields": [
                   {"name": "name", "type": "string"},
                   {"name": "age", "type": "int"}
               ]
           }
          }
}

After compiling the schema, the generated Java class uses CharSequence as the key type for the map MyData.

Using CharSequence as a map key is very inconvenient. Is there a way to make Apache Avro generate String keys for the map?

P.S.

The problem is that, for example, dataMap.containsKey("SOME_KEY") returns false even though the key is present, just because the stored key is a CharSequence rather than a String. In addition, putting a map entry with an existing key doesn't replace the old one. That's why I say it is inconvenient to use CharSequence as the key.
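The behaviour described here can be reproduced without Avro on the classpath. The sketch below is only an illustration under stated assumptions: Utf8Like is a hypothetical stand-in for Avro's Utf8 class (a CharSequence whose equals() never matches a java.lang.String), and CharSequenceKeyDemo, sampleMap and toStringKeys are names made up for this example. It shows why the String lookup misses, why put adds a second entry, and one possible way to copy such a map into a String-keyed one.

```java
import java.util.HashMap;
import java.util.Map;

class CharSequenceKeyDemo {

    // Hypothetical stand-in for org.apache.avro.util.Utf8 (an assumption for this sketch):
    // a CharSequence whose equals() only matches its own type, never java.lang.String.
    static final class Utf8Like implements CharSequence {
        private final String value;

        Utf8Like(String value) { this.value = value; }

        @Override public int length() { return value.length(); }
        @Override public char charAt(int index) { return value.charAt(index); }
        @Override public CharSequence subSequence(int start, int end) {
            return new Utf8Like(value.substring(start, end));
        }
        @Override public String toString() { return value; }
        @Override public boolean equals(Object other) {
            return other instanceof Utf8Like && ((Utf8Like) other).value.equals(value);
        }
        @Override public int hashCode() { return value.hashCode(); }
    }

    // Builds a map whose single key is a non-String CharSequence.
    static Map<CharSequence, Integer> sampleMap() {
        Map<CharSequence, Integer> map = new HashMap<>();
        map.put(new Utf8Like("SOME_KEY"), 1);
        return map;
    }

    // Copies a CharSequence-keyed map into a String-keyed map via toString().
    static <V> Map<String, V> toStringKeys(Map<? extends CharSequence, V> source) {
        Map<String, V> result = new HashMap<>();
        for (Map.Entry<? extends CharSequence, V> entry : source.entrySet()) {
            result.put(entry.getKey().toString(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<CharSequence, Integer> dataMap = sampleMap();

        // String.equals(Utf8Like) is false, so the lookup misses.
        System.out.println(dataMap.containsKey("SOME_KEY")); // prints false

        // For the same reason, put() adds a second entry instead of replacing.
        dataMap.put("SOME_KEY", 2);
        System.out.println(dataMap.size()); // prints 2

        // After normalizing the keys, String lookups behave as expected.
        Map<String, Integer> fixed = toStringKeys(sampleMap());
        System.out.println(fixed.containsKey("SOME_KEY")); // prints true
    }
}
```

Copying the keys like this is only a fallback for code that cannot regenerate its classes. Having Avro produce String keys in the first place avoids the extra pass over the map.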

Answer

Alex A. · Nov 8, 2013

This JIRA discussion is relevant. The main reason CharSequence is still used is backwards compatibility.

And as Charles Forsythe pointed out, a workaround has been added for when String is necessary: set the string property in the schema.

 { "type": "string", "avro.java.string": "String" }

The default type here is Avro's own Utf8 class. In addition to manual specification and the pom.xml setting, there is also an avro-tools compile option for it, the -string option:

java -jar avro-tools-1.7.5.jar compile -string schema /path/to/schema .