I need to check that a string is valid JSON in Groovy. My first thought was just to send it through new JsonSlurper().parseText(myString)
and, if there was no exception, assume it was correct.
However, I discovered that Groovy will happily accept trailing commas with JsonSlurper
, but JSON doesn't allow trailing commas. Is there a simple way to validate JSON in Groovy that adhere's to the official JSON spec?
JsonSlurper
class uses JsonParser
interface implementations (with JsonParserCharArray
being a default one). Those parsers check char by char what is the current character and what kind of token type it represents. If you take a look at JsonParserCharArray.decodeJsonObject()
method at line 139 you will see that if parser sees }
character, it breaks the loop and finishes decoding JSON object and ignores anything that exists after }
.
That's why if you put any unrecognizable character(s) in front of your JSON object, JsonSlurper
will throw an exception. But if you end your JSON string with any incorrect characters after }
, it will pass, because parser does not even take those characters into account.
You may consider using JsonOutput.prettyPrint(String json)
method that is more restrict if it comes to JSON it tries to print (it uses JsonLexer
to read JSON tokens in a streaming fashion). If you do:
def jsonString = '{"name": "John", "data": [{"id": 1},{"id": 2}]}...'
JsonOutput.prettyPrint(jsonString)
it will throw an exception like:
Exception in thread "main" groovy.json.JsonException: Lexing failed on line: 1, column: 48, while reading '.', no possible valid JSON value or punctuation could be recognized.
at groovy.json.JsonLexer.nextToken(JsonLexer.java:83)
at groovy.json.JsonLexer.hasNext(JsonLexer.java:233)
at groovy.json.JsonOutput.prettyPrint(JsonOutput.java:501)
at groovy.json.JsonOutput$prettyPrint.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
at app.JsonTest.main(JsonTest.groovy:13)
But if we pass a valid JSON document like:
def jsonString = '{"name": "John", "data": [{"id": 1},{"id": 2}]}'
JsonOutput.prettyPrint(jsonString)
it will pass successfully.
The good thing is that you don't need any additional dependency to validate your JSON.
I did some more investigation and run tests with 3 different solutions:
JsonOutput.prettyJson(String json)
JsonSlurper.parseText(String json)
ObjectMapper.readValue(String json, Class<> type)
(it requires adding jackson-databind:2.9.3
dependency)I have used following JSONs as an input:
def json1 = '{"name": "John", "data": [{"id": 1},{"id": 2},]}'
def json2 = '{"name": "John", "data": [{"id": 1},{"id": 2}],}'
def json3 = '{"name": "John", "data": [{"id": 1},{"id": 2}]},'
def json4 = '{"name": "John", "data": [{"id": 1},{"id": 2}]}... abc'
def json5 = '{"name": "John", "data": [{"id": 1},{"id": 2}]}'
Expected result is that first 4 JSONs fail validation and only 5th one is correct. To test it out I have created this Groovy script:
@Grab(group='com.fasterxml.jackson.core', module='jackson-databind', version='2.9.3')
import groovy.json.JsonOutput
import groovy.json.JsonSlurper
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.DeserializationFeature
def json1 = '{"name": "John", "data": [{"id": 1},{"id": 2},]}'
def json2 = '{"name": "John", "data": [{"id": 1},{"id": 2}],}'
def json3 = '{"name": "John", "data": [{"id": 1},{"id": 2}]},'
def json4 = '{"name": "John", "data": [{"id": 1},{"id": 2}]}... abc'
def json5 = '{"name": "John", "data": [{"id": 1},{"id": 2}]}'
def test1 = { String json ->
try {
JsonOutput.prettyPrint(json)
return "VALID"
} catch (ignored) {
return "INVALID"
}
}
def test2 = { String json ->
try {
new JsonSlurper().parseText(json)
return "VALID"
} catch (ignored) {
return "INVALID"
}
}
ObjectMapper mapper = new ObjectMapper()
mapper.configure(DeserializationFeature.FAIL_ON_TRAILING_TOKENS, true)
def test3 = { String json ->
try {
mapper.readValue(json, Map)
return "VALID"
} catch (ignored) {
return "INVALID"
}
}
def jsons = [json1, json2, json3, json4, json5]
def tests = ['JsonOutput': test1, 'JsonSlurper': test2, 'ObjectMapper': test3]
def result = tests.collectEntries { name, test ->
[(name): jsons.collect { json ->
[json: json, status: test(json)]
}]
}
result.each {
println "${it.key}:"
it.value.each {
println " ${it.status}: ${it.json}"
}
println ""
}
And here is the result:
JsonOutput:
VALID: {"name": "John", "data": [{"id": 1},{"id": 2},]}
VALID: {"name": "John", "data": [{"id": 1},{"id": 2}],}
VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]},
INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}... abc
VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}
JsonSlurper:
INVALID: {"name": "John", "data": [{"id": 1},{"id": 2},]}
VALID: {"name": "John", "data": [{"id": 1},{"id": 2}],}
VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]},
VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}... abc
VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}
ObjectMapper:
INVALID: {"name": "John", "data": [{"id": 1},{"id": 2},]}
INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}],}
INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}]},
INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}... abc
VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}
As you can see the winner is Jackson's ObjectMapper.readValue()
method. What's important - it works with jackson-databind
>= 2.9.0
. In this version they introduced DeserializationFeature.FAIL_ON_TRAILING_TOKENS
which makes JSON parser working as expected. If we wont set this configuration feature to true
as in the above script, ObjectMapper produces incorrect result:
ObjectMapper:
INVALID: {"name": "John", "data": [{"id": 1},{"id": 2},]}
INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}],}
VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]},
VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}... abc
VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}
I was surprised that Groovy's standard library fails in this test. Luckily it can be done with jackson-databind:2.9.x
dependency. Hope it helps.