Claude Imagine: The Streaming Architecture That Changed AI Interfaces Forever
A few days ago, Anthropic introduced an interesting feature inside Claude called Claude Imagine. It lets Claude turn plain AI answers into live, interactive widgets right inside the chat.
The magical part is the end-user experience: it feels like a live application assembling itself while Claude is still thinking.
I wanted to build something similar myself to replicate this behavior, but things went wrong quickly. The UI flickered, widgets flashed white, and sliders I added reset every few milliseconds.
On paper the architecture looked straightforward, but when I tried to build it, it turned into a mess.
The First Mistake Most Developers Make
My first assumption was the same one almost everyone makes: that Claude was simply rendering HTML inside its Markdown response. In reality, Claude calls a tool named show_widget and passes the UI as structured data.
The payload looks roughly like this:
```json
{
  "i_have_seen_read_me": true,
  "title": "compound_interest",
  "loading_messages": ["Calculating rates...", "Building chart..."],
  "widget_code": "<style>...</style><div>...</div><script>...</script>"
}
```
There are four key fields:
- i_have_seen_read_me – ensures Claude first loads design guidelines
- title – a snake_case identifier for the widget
- loading_messages – progress messages shown while the widget renders
- widget_code – the actual HTML fragment
One interesting thing to note: the HTML fragment does not include <html>, <head>, or <body>. It's just the content that should appear inside the page, because the UI isn't rendered in an iframe. It's injected directly into the DOM of the chat interface itself, which leads to the next critical piece of the system.
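To make the flow concrete, here is a minimal sketch of what a client-side handler for a show_widget call might look like. The handler name and the container argument are my assumptions, not Claude's actual client code; the point is only that the fragment lands straight in the host page's DOM, not in an iframe:

```javascript
// Hypothetical client-side handler for a show_widget tool call.
function handleShowWidget(payload, container) {
  // The model is expected to have called read_me first.
  if (!payload.i_have_seen_read_me) {
    throw new Error("read_me must be loaded before show_widget");
  }
  // widget_code is a bare HTML fragment (no <html>/<head>/<body>),
  // written directly into the chat page's DOM.
  container.innerHTML = payload.widget_code;
  return payload.title;
}
```

In a real client, `container` would be a DOM element inside the chat transcript; anything with an `innerHTML` slot works for illustration.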
The Hidden Step: read_me
Before Claude can generate a widget, it must first call another tool: read_me. This tool loads design guidelines and UI rules for the type of widget Claude wants to generate.
Example:
```json
{
  "modules": ["interactive", "chart"]
}
```
Each module injects a different portion of the internal design system:
| Module | Purpose |
|---|---|
| interactive | sliders, inputs, calculators |
| chart | Chart.js configuration |
| diagram | architecture diagrams |
| mockup | UI components |
| art | SVG illustrations |
This pattern is actually a type of lazy loading. Instead of including thousands of tokens of UI rules in every prompt, Claude loads only the modules it needs. Only after this step can it call show_widget.
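The lazy-loading pattern is easy to sketch. The guideline strings below are hypothetical stand-ins for Anthropic's internal design system; only the module names come from the table above:

```javascript
// Hypothetical guideline text per module; the real rules are internal to Anthropic.
const GUIDELINE_MODULES = {
  interactive: "Use labeled sliders and inputs; debounce recalculation.",
  chart: "Configure Chart.js charts as responsive with fixed heights.",
  diagram: "Lay out architecture diagrams left to right.",
  mockup: "Compose UI components from a shared spacing scale.",
  art: "Draw SVG illustrations with a limited palette.",
};

// Return only the requested slices of the design system,
// instead of shipping every rule in every prompt.
function readMe(modules) {
  return modules
    .filter((m) => m in GUIDELINE_MODULES)
    .map((m) => GUIDELINE_MODULES[m])
    .join("\n");
}
```

A request for `["interactive", "chart"]` pulls in slider and Chart.js rules and nothing else, which is exactly the token-saving point.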
Where Most Replications Break
Once the widget starts streaming, the client receives partial JSON fragments containing pieces of the HTML. A naive implementation does this:
```javascript
container.innerHTML = htmlFragment
```
And that’s exactly where everything breaks. Every token update replaces the entire DOM.
It results in:
- white flashes
- reflows
- lost slider state
- constant flicker
The correct solution is incremental DOM diffing. Instead of replacing the DOM, the client compares the current tree with the new HTML and patches only the changes. Libraries like morphdom are commonly used for this purpose.
This improves the user experience by ensuring that:
- unchanged elements stay stable
- new elements fade in smoothly
- user interactions persist
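The incremental-patching idea can be sketched like this. This is a deliberately simplified node model, not morphdom's real DOM-based API; the point is that matching nodes are mutated in place so their object identity (and any attached state) survives:

```javascript
// Toy tree patcher illustrating the morphdom idea on plain objects.
// A node is { tag, attrs, text, children }.
function patch(current, next) {
  // Different element type: replace the node wholesale.
  if (current.tag !== next.tag) return next;
  // Same element: update attributes and text in place.
  current.attrs = { ...next.attrs };
  current.text = next.text;
  // Recurse over children, reusing existing nodes where tags match.
  const patched = [];
  for (let i = 0; i < next.children.length; i++) {
    patched.push(
      i < current.children.length
        ? patch(current.children[i], next.children[i]) // reuse + update
        : next.children[i] // newly streamed node: append as-is
    );
  }
  current.children = patched;
  return current; // same object as before → no replacement, no flicker
}
```

Because `patch` returns the same node objects when tags match, a slider that already exists is updated rather than recreated, which is why its state survives each streamed update.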
The magical part here is that the interface doesn't re-render; it evolves. That single architectural choice is what makes Claude's generative UI feel so smooth.
Why the System Feels Magical
Claude’s generative UI isn’t actually a new rendering engine. It’s a clever combination of three existing ideas:
- Tool calls as UI instructions
- Streaming partial JSON
- Incremental DOM patching
Remove any one of these pieces and the user experience breaks down. Together, they make it possible to build interfaces that materialize while the model is still generating them.
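The streaming piece can be sketched too. A real client would use a proper incremental JSON parser, but as a rough assumption-laden illustration, you can pull whatever portion of widget_code has arrived so far out of the growing buffer and hand it to the patcher:

```javascript
// Sketch: extract the partial widget_code value from an incomplete JSON buffer.
// Assumes widget_code streams as a JSON string; not a real streaming parser.
function extractPartialWidgetCode(buffer) {
  const marker = buffer.indexOf('"widget_code"');
  if (marker === -1) return "";
  // Find the opening quote of the value after the key's colon.
  const start = buffer.indexOf('"', buffer.indexOf(":", marker) + 1);
  if (start === -1) return "";
  let out = "";
  for (let i = start + 1; i < buffer.length; i++) {
    const ch = buffer[i];
    if (ch === "\\" && i + 1 < buffer.length) {
      const next = buffer[++i];
      out += next === "n" ? "\n" : next === "t" ? "\t" : next; // minimal unescape
    } else if (ch === '"') {
      break; // value is complete
    } else {
      out += ch;
    }
  }
  return out;
}
```

Calling this on every streamed chunk and feeding the result through an incremental patcher is, in essence, the whole loop: tool call in, partial HTML out, DOM evolved in place.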