I have used the "JsonConvert.Deserialize(json)" method of Json.NET so far which worked quite well and to be honest, I didn't need anything more than this.
I am working on a background (console) application which constantly downloads the JSON content from different URLs, then deserializes the result into a list of .NET objects.
using (WebClient client = new WebClient())
{
string json = client.DownloadString(stringUrl);
var result = JsonConvert.DeserializeObject<List<Contact>>(json);
}
The simple code snippet above doesn't probably seem perfect, but it does the job. When the file is large (15,000 contacts - 48 MB file), JsonConvert.DeserializeObject isn't the solution and the line throws an exception type of JsonReaderException.
The downloaded JSON content is an array and this is how a sample looks like. Contact is a container class for the deserialized JSON object.
[
{
"firstname": "sometext",
"lastname": "sometext"
},
{
"firstname": "sometext",
"lastname": "sometext"
},
{
"firstname": "sometext",
"lastname": "sometext"
},
{
"firstname": "sometext",
"lastname": "sometext"
}
]
My initial guess is it runs out of memory. Just out of curiosity, I tried to parse it as JArray which caused the same exception too.
I have started to dive into Json.NET documentation and read similar threads. As I haven't managed to produce a working solution yet, I decided to post a question here.
UPDATE: While deserializing line by line, I got the same error: " [. Path '', line 600003, position 1." So downloaded two of them and checked them in Notepad++. I noticed that if the array length is more than 12,000, after 12000th element, the "[" is closed and another array starts. In other words, the JSON looks exactly like this:
[
{
"firstname": "sometext",
"lastname": "sometext"
},
{
"firstname": "sometext",
"lastname": "sometext"
},
{
"firstname": "sometext",
"lastname": "sometext"
},
{
"firstname": "sometext",
"lastname": "sometext"
}
]
[
{
"firstname": "sometext",
"lastname": "sometext"
},
{
"firstname": "sometext",
"lastname": "sometext"
},
{
"firstname": "sometext",
"lastname": "sometext"
},
{
"firstname": "sometext",
"lastname": "sometext"
}
]
As you've correctly diagnosed in your update, the issue is that the JSON has a closing ]
followed immediately by an opening [
to start the next set. This format makes the JSON invalid when taken as a whole, and that is why Json.NET throws an error.
Fortunately, this problem seems to come up often enough that Json.NET actually has a special setting to deal with it. If you use a JsonTextReader
directly to read the JSON, you can set the SupportMultipleContent
flag to true
, and then use a loop to deserialize each item individually.
This should allow you to process the non-standard JSON successfully and in a memory efficient manner, regardless of how many arrays there are or how many items in each array.
using (WebClient client = new WebClient())
using (Stream stream = client.OpenRead(stringUrl))
using (StreamReader streamReader = new StreamReader(stream))
using (JsonTextReader reader = new JsonTextReader(streamReader))
{
reader.SupportMultipleContent = true;
var serializer = new JsonSerializer();
while (reader.Read())
{
if (reader.TokenType == JsonToken.StartObject)
{
Contact c = serializer.Deserialize<Contact>(reader);
Console.WriteLine(c.FirstName + " " + c.LastName);
}
}
}
Full demo here: https://dotnetfiddle.net/2TQa8p