Groovy: validate JSON string

ewok picture ewok · Jan 26, 2018 · Viewed 10.2k times · Source

I need to check that a string is valid JSON in Groovy. My first thought was just to send it through new JsonSlurper().parseText(myString) and, if there was no exception, assume it was correct.

However, I discovered that Groovy will happily accept trailing commas with JsonSlurper, but JSON doesn't allow trailing commas. Is there a simple way to validate JSON in Groovy that adhere's to the official JSON spec?

Answer

Szymon Stepniak picture Szymon Stepniak · Jan 26, 2018

JsonSlurper class uses JsonParser interface implementations (with JsonParserCharArray being a default one). Those parsers check char by char what is the current character and what kind of token type it represents. If you take a look at JsonParserCharArray.decodeJsonObject() method at line 139 you will see that if parser sees } character, it breaks the loop and finishes decoding JSON object and ignores anything that exists after }.

That's why if you put any unrecognizable character(s) in front of your JSON object, JsonSlurper will throw an exception. But if you end your JSON string with any incorrect characters after }, it will pass, because parser does not even take those characters into account.

Solution

You may consider using JsonOutput.prettyPrint(String json) method that is more restrict if it comes to JSON it tries to print (it uses JsonLexer to read JSON tokens in a streaming fashion). If you do:

def jsonString = '{"name": "John", "data": [{"id": 1},{"id": 2}]}...'

JsonOutput.prettyPrint(jsonString)

it will throw an exception like:

Exception in thread "main" groovy.json.JsonException: Lexing failed on line: 1, column: 48, while reading '.', no possible valid JSON value or punctuation could be recognized.
    at groovy.json.JsonLexer.nextToken(JsonLexer.java:83)
    at groovy.json.JsonLexer.hasNext(JsonLexer.java:233)
    at groovy.json.JsonOutput.prettyPrint(JsonOutput.java:501)
    at groovy.json.JsonOutput$prettyPrint.call(Unknown Source)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
    at app.JsonTest.main(JsonTest.groovy:13)

But if we pass a valid JSON document like:

def jsonString = '{"name": "John", "data": [{"id": 1},{"id": 2}]}'

JsonOutput.prettyPrint(jsonString)

it will pass successfully.

The good thing is that you don't need any additional dependency to validate your JSON.

UPDATE: solution for multiple different cases

I did some more investigation and run tests with 3 different solutions:

  • JsonOutput.prettyJson(String json)
  • JsonSlurper.parseText(String json)
  • ObjectMapper.readValue(String json, Class<> type) (it requires adding jackson-databind:2.9.3 dependency)

I have used following JSONs as an input:

def json1 = '{"name": "John", "data": [{"id": 1},{"id": 2},]}'
def json2 = '{"name": "John", "data": [{"id": 1},{"id": 2}],}'
def json3 = '{"name": "John", "data": [{"id": 1},{"id": 2}]},'
def json4 = '{"name": "John", "data": [{"id": 1},{"id": 2}]}... abc'
def json5 = '{"name": "John", "data": [{"id": 1},{"id": 2}]}'

Expected result is that first 4 JSONs fail validation and only 5th one is correct. To test it out I have created this Groovy script:

@Grab(group='com.fasterxml.jackson.core', module='jackson-databind', version='2.9.3')

import groovy.json.JsonOutput
import groovy.json.JsonSlurper
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.DeserializationFeature

def json1 = '{"name": "John", "data": [{"id": 1},{"id": 2},]}'
def json2 = '{"name": "John", "data": [{"id": 1},{"id": 2}],}'
def json3 = '{"name": "John", "data": [{"id": 1},{"id": 2}]},'
def json4 = '{"name": "John", "data": [{"id": 1},{"id": 2}]}... abc'
def json5 = '{"name": "John", "data": [{"id": 1},{"id": 2}]}'

def test1 = { String json ->
    try {
        JsonOutput.prettyPrint(json)
        return "VALID"
    } catch (ignored) {
        return "INVALID"
    }
}

def test2 = { String json ->
    try {
        new JsonSlurper().parseText(json)
        return "VALID"
    } catch (ignored) {
        return "INVALID"
    }
}

ObjectMapper mapper = new ObjectMapper()
mapper.configure(DeserializationFeature.FAIL_ON_TRAILING_TOKENS, true)

def test3 = { String json ->
    try {
        mapper.readValue(json, Map)
        return "VALID"
    } catch (ignored) {
        return "INVALID"
    }
}

def jsons = [json1, json2, json3, json4, json5]
def tests = ['JsonOutput': test1, 'JsonSlurper': test2, 'ObjectMapper': test3]

def result = tests.collectEntries { name, test ->
    [(name): jsons.collect { json ->
        [json: json, status: test(json)]
    }]
}

result.each {
    println "${it.key}:"
    it.value.each {
        println " ${it.status}: ${it.json}"
    }
    println ""
}

And here is the result:

JsonOutput:
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2},]}
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}],}
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]},
 INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}... abc
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}

JsonSlurper:
 INVALID: {"name": "John", "data": [{"id": 1},{"id": 2},]}
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}],}
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]},
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}... abc
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}

ObjectMapper:
 INVALID: {"name": "John", "data": [{"id": 1},{"id": 2},]}
 INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}],}
 INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}]},
 INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}... abc
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}

As you can see the winner is Jackson's ObjectMapper.readValue() method. What's important - it works with jackson-databind >= 2.9.0. In this version they introduced DeserializationFeature.FAIL_ON_TRAILING_TOKENS which makes JSON parser working as expected. If we wont set this configuration feature to true as in the above script, ObjectMapper produces incorrect result:

ObjectMapper:
 INVALID: {"name": "John", "data": [{"id": 1},{"id": 2},]}
 INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}],}
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]},
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}... abc
 VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}

I was surprised that Groovy's standard library fails in this test. Luckily it can be done with jackson-databind:2.9.x dependency. Hope it helps.