When integrating Large Language Models (LLMs) into real-time applications, streaming structured JSON responses can be tricky. LLMs generate content progressively, meaning responses are streamed in chunks. However, this often results in incomplete or malformed JSON, making it difficult to parse and use the data reliably.
I faced this challenge while building a real-time chatbot API that returned structured responses, such as:
- A suggested response for the user.
- A list of recommended actions based on the conversation context.
However, since JSON arrived in fragments, parsing failed until the entire response was complete. To solve this, I implemented a solution in C# that streams data safely, repairs malformed JSON, and ensures clients receive a fully correct response.

The Problem: Handling Malformed JSON in Streaming Data
Imagine you’re building a real-time chatbot for customer support. The bot generates responses in JSON format and streams them as they become available.
However, due to network latency and chunked responses, the JSON often arrives incomplete, like this:
{
  "suggested_response": "Hello! How can I assi
and, a few chunks later:
{
  "suggested_response": "Hello! How can I assist you today?",
  "recommended_actions": [
Neither fragment is valid JSON, and attempting to deserialize either one immediately results in an error.
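To make the failure concrete, here is a minimal repro. It uses System.Text.Json so it is self-contained; the same kind of exception occurs with Newtonsoft.Json, which is used later in this post. The class and method names are mine, for illustration only:

```csharp
using System;
using System.Text.Json;

public static class TruncatedJsonDemo
{
    /// <summary>
    /// Returns true only if the input is complete, valid JSON.
    /// </summary>
    public static bool CanParse(string json)
    {
        try
        {
            using var doc = JsonDocument.Parse(json);
            return true;
        }
        catch (JsonException)
        {
            // Thrown when the buffer ends mid-token, e.g. inside a string literal
            return false;
        }
    }
}
```

Feeding the truncated chunk above into `CanParse` returns false until the closing quote and brace finally arrive.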
Key Challenges
- Handling incomplete JSON: Since LLMs stream responses in small pieces, we must process the JSON incrementally without breaking the parser.
- Fixing malformed JSON in real time: As JSON arrives in fragments, we need a way to repair it dynamically before attempting to parse it.
- Preserving data integrity: Some JSON repair libraries alter the content inside strings, which can corrupt key data. Content such as LaTeX expressions or escape sequences is especially vulnerable to automated repair tools, so we must ensure the final data matches what the LLM originally generated.
- Sending a final verified response: Since streaming can be unreliable, we need a fallback mechanism that sends a complete, corrected JSON object at the end.
The Solution: Using JSON Repair with a Fallback Mechanism
To solve these challenges, I implemented a streaming JSON repair system in C# using the JsonRepairSharp library. This system:
- Streams JSON responses incrementally, allowing the client to get updates in real-time.
- Repairs malformed JSON as new data arrives.
- Prevents data corruption by extracting only the new content from each chunk.
- Sends a final complete response at the end to ensure accuracy.
public async Task ProcessChatResponse(ChatRequest req, CancellationToken cancellation)
{
    Response.Headers.Add(HeaderNames.ContentType, "text/event-stream");
    Response.Headers.Add("Cache-Control", "no-cache");

    StringBuilder finalJson = new();
    string previousResponse = "";

    var responseStream = _ChatService.GetChatStream(req, cancellation);

    await foreach (var chunk in responseStream)
    {
        if (cancellation.IsCancellationRequested)
            break;

        finalJson.Append(chunk);

        // Attempt to repair JSON dynamically
        JsonRepairSharp.JsonRepair.Context = JsonRepairSharp.JsonRepair.InputType.LLM;
        string repairedJson = JsonRepairSharp.JsonRepair.RepairJson(finalJson.ToString());
        if (string.IsNullOrWhiteSpace(repairedJson))
            continue;

        var chatResponse = ChatResponseModel.Deserialize(repairedJson);
        if (chatResponse?.SuggestedResponse == null)
            continue;

        // Extract only the newly generated text to avoid duplication
        string newContent = ExtractNewContent(previousResponse, chatResponse.SuggestedResponse);
        if (!string.IsNullOrEmpty(newContent))
        {
            string jsonResponse = new PartialResponseDto { Type = "chat_update", Content = newContent }.GetJSON();
            await Response.WriteAsync($"data: {jsonResponse}\n\n", cancellation);
            await Response.Body.FlushAsync(cancellation);
        }

        previousResponse = chatResponse.SuggestedResponse;
    }

    // Send the final, complete JSON to ensure correctness
    string completeJson = CleanJsonString(finalJson.ToString());
    var finalChatResponse = ChatResponseModel.Deserialize(completeJson);
    if (finalChatResponse?.SuggestedResponse != null)
    {
        string jsonResponse = new PartialResponseDto { Type = "final_response", Content = finalChatResponse.SuggestedResponse }.GetJSON();
        await Response.WriteAsync($"data: {jsonResponse}\n\n", cancellation);
        await Response.Body.FlushAsync(cancellation);
    }

    // Signal completion
    await Response.WriteAsync("data: [DONE]\n\n", cancellation);
    await Response.Body.FlushAsync(cancellation);
}
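The helpers ExtractNewContent and CleanJsonString are referenced above but not shown. Here is a minimal sketch of how they could look; the prefix assumption in ExtractNewContent and the markdown-fence stripping in CleanJsonString are my reading of the intent, not the exact implementation:

```csharp
using System;

public static class StreamingHelpers
{
    /// <summary>
    /// Returns only the text appended since the previous snapshot.
    /// Assumes each repaired response extends the previous one as a prefix;
    /// if the repaired text diverges, the full current text is returned.
    /// </summary>
    public static string ExtractNewContent(string previous, string current)
    {
        if (string.IsNullOrEmpty(previous))
            return current;

        return current.StartsWith(previous, StringComparison.Ordinal)
            ? current.Substring(previous.Length)
            : current;
    }

    /// <summary>
    /// Trims whitespace and strips markdown code fences that LLMs
    /// sometimes wrap around their JSON output.
    /// </summary>
    public static string CleanJsonString(string raw)
    {
        string s = raw.Trim();
        if (s.StartsWith("```", StringComparison.Ordinal))
        {
            int firstNewline = s.IndexOf('\n');
            if (firstNewline >= 0)
                s = s.Substring(firstNewline + 1);
            if (s.EndsWith("```", StringComparison.Ordinal))
                s = s.Substring(0, s.Length - 3);
        }
        return s.Trim();
    }
}
```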
To ensure structured JSON responses, I used a simple DTO class:
public class PartialResponseDto
{
    /// <summary>
    /// Specifies the type of response, e.g., "chat_update" for intermediate
    /// updates or "final_response" for the complete response.
    /// </summary>
    public string Type { get; set; } = "";

    /// <summary>
    /// Contains the chatbot-generated content or message.
    /// </summary>
    public string Content { get; set; } = "";

    /// <summary>
    /// Converts the object to a JSON-formatted string.
    /// </summary>
    public string GetJSON()
    {
        return Newtonsoft.Json.JsonConvert.SerializeObject(this);
    }
}
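For context, what the client actually sees on the wire is a plain server-sent-events stream: one "data: ..." line per frame, terminated by the "[DONE]" sentinel. A minimal sketch of a client-side frame parser (the class and frame strings are illustrative, not part of the API above):

```csharp
using System;

public static class SseFrameParser
{
    /// <summary>
    /// Extracts the payload of a single SSE frame of the form "data: {...}\n\n".
    /// Returns null for the "[DONE]" sentinel that signals end of stream.
    /// </summary>
    public static string? ParseFrame(string frame)
    {
        string payload = frame.Trim();
        if (payload.StartsWith("data: ", StringComparison.Ordinal))
            payload = payload.Substring("data: ".Length);
        return payload == "[DONE]" ? null : payload;
    }
}
```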
How This Solution Works
1. Streaming JSON in Chunks
- The API streams JSON responses incrementally instead of waiting for the full response.
2. Repairing Malformed JSON Dynamically
- As new data arrives, JsonRepairSharp.JsonRepair.RepairJson() fixes incomplete JSON structures to make them parseable.
3. Extracting Only New Content
- ExtractNewContent() ensures that only new text is streamed to the client, preventing redundant updates.
4. Sending a Final Verified Response
- Once the LLM completes its response, a final, fully repaired JSON is sent to ensure correctness.
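On the client side, the two event types map naturally onto append-then-replace semantics: "chat_update" deltas are appended as they arrive, and "final_response" overwrites the accumulated text with the verified version. A sketch (the class name is mine, for illustration):

```csharp
using System.Text;

public class ChatTranscript
{
    private readonly StringBuilder _buffer = new();

    /// <summary>
    /// Applies a streamed event: "chat_update" appends the delta,
    /// while "final_response" replaces everything with the verified text.
    /// </summary>
    public void Apply(string type, string content)
    {
        if (type == "final_response")
        {
            _buffer.Clear();
            _buffer.Append(content);
        }
        else if (type == "chat_update")
        {
            _buffer.Append(content);
        }
    }

    public string Text => _buffer.ToString();
}
```

Because the final event replaces rather than appends, any repair artifacts that slipped into intermediate updates are corrected automatically at the end.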
Key Benefits of This Approach
✅ Real-time updates: Clients receive chatbot responses instantly without waiting.
✅ Resilient JSON handling: The repair mechanism prevents parsing errors due to incomplete JSON.
✅ Data integrity: The final JSON ensures the full response is correct and unaltered.
✅ Improved user experience: Chatbots feel more responsive and interactive.
Conclusion
Streaming JSON from LLMs in C# comes with challenges, but by implementing incremental JSON repair and structured streaming, we can ensure that:
✔️ Data is delivered in real-time.
✔️ Incomplete JSON is handled gracefully.
✔️ Clients always receive a complete, valid response.
This solution has significantly improved how our chatbot API handles JSON streaming. If you’re working with real-time AI applications, I hope this guide helps you build robust and error-free JSON streaming systems!
Have you encountered JSON streaming issues in your projects? Share your thoughts in the comments!

Amit Mittal is a seasoned tech entrepreneur and software developer specializing in SaaS, web hosting, and cloud solutions. As the founder of Bitss Techniques, he brings over 15 years of experience in delivering robust digital solutions, including CMS development, custom plugins, and enterprise-grade services for the education and e-commerce sectors.