One Sketch Away

The Missed Call from the Future

What a chat about TOON taught me about hallucination, pattern recognition, and the future of machine understanding.

There was this old internet Rajinikanth joke: “When Graham Bell invented the phone, it already had a missed call from Rajinikanth.”


I had my own version of that moment recently — not with a phone, but with an AI.


My Pocket Tutor

These days I often use chatbots to learn things faster than reading entire docs or watching tutorials on YouTube.
It’s like having a patient tutor in my pocket — one that never rolls its eyes when I ask dumb questions, can tailor its explanations to my mood, and always sounds confident, even when it shouldn’t.

So when I heard about a new data format called TOON, described as an optimized JSON for LLM tokens, I did what I always do: I asked my digital tutor, ChatGPT.

It immediately launched into a lecture — eloquent, structured, confident — and absolutely wrong.

Somewhere between “animation” and “structured humor,” I realized it had completely misunderstood me.
It was talking about cartoons, not TOON.
My tutor had just hallucinated.


The Correction

I laughed and said, “Hey buddy, could you check the internet this time?”

After a thoughtful pause, it came back.

“You’re right — I jumped the gun,” it replied. “Here’s what TOON actually is…”

And then it delivered a crisp, well-researched explanation, with sources, of the real TOON (Token-Oriented Object Notation): a brand-new data format designed to shrink token usage in LLM prompts.
This time it was flawless, and it sparked a deeper reflection on how a model could reason about a format it had never seen.

There was something eerie about it — that sense of an entity reasoning about its own ignorance. It hadn’t memorized TOON; it had imagined how such a thing should work and then validated its imagination through structure. It was the digital equivalent of intuition: pattern over memory. I realized that hallucination isn’t always failure — sometimes, it’s imagination waiting for correction.


But that wasn’t what amazed me. I already knew it could read the web and summarize what it found.

What made me stop and think was something subtler:
If someone invents a new optimized format for large language models, how does the LLM itself know that it’s optimized?

It’s like inventing a new SQL syntax after the database is already running.
You announce, “Hey world, I’ve built a faster query language for you!” — and somehow the database instantly starts using it efficiently, without ever being reprogrammed or even aware that the new syntax exists.
That’s the paradox I found myself staring at.

…and it suddenly felt like that joke I began with — the future calling before the invention even happens.


A Quick Detour: What TOON Actually Is

For the uninitiated, Token-Oriented Object Notation (TOON) is described on its official GitHub as:

“Token-Oriented Object Notation (TOON) – Compact, human-readable, schema-aware JSON for LLM prompts. Spec, benchmarks, TypeScript SDK.”
github.com/toon-format/toon

It’s a compact, human-readable format built for passing structured data to large language models, with the goal of significantly reducing token usage.
TOON mixes the tabular simplicity of CSV with the structural clarity of JSON, and borrows YAML’s indentation for readability.

Example

// JSON
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob",   "role": "user"  }
  ]
}

# TOON
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

Less punctuation, fewer repeated keys, and fewer tokens for the same information.
According to the spec, TOON typically uses 30–60% fewer tokens than JSON on large uniform arrays.
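
To make that concrete, here is a minimal sketch of how a uniform array of flat objects could be folded into the tabular layout shown above. This is my own illustration in TypeScript, not the official toon-format SDK; the function name and the absence of quoting rules are simplifying assumptions, and the real spec handles much more (nested data, escaping, and so on).

// TypeScript (illustrative sketch, not the official SDK)
type Row = Record<string, string | number | boolean>;

function encodeUniformArray(key: string, rows: Row[]): string {
  if (rows.length === 0) return `${key}[0]{}:`;

  // Assume every row shares the same keys (the uniform-array case TOON optimizes for).
  const fields = Object.keys(rows[0]);

  // Header line: array name, row count, and the shared field names.
  const header = `${key}[${rows.length}]{${fields.join(",")}}:`;

  // One indented, comma-separated line per row: no braces, no repeated keys.
  const lines = rows.map((row) => "  " + fields.map((f) => String(row[f])).join(","));

  return [header, ...lines].join("\n");
}

console.log(
  encodeUniformArray("users", [
    { id: 1, name: "Alice", role: "admin" },
    { id: 2, name: "Bob", role: "user" },
  ])
);
// users[2]{id,name,role}:
//   1,Alice,admin
//   2,Bob,user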

Efficiency Ranking (Accuracy per 1K Tokens)

Each format’s overall performance, balancing accuracy against token cost:

TOON           ████████████████████   26.9  │  73.9% acc  │  2,744 tokens  
JSON compact   █████████████████░░░   22.9  │  70.7% acc  │  3,081 tokens  
YAML           ██████████████░░░░░░   18.6  │  69.0% acc  │  3,719 tokens  
JSON           ███████████░░░░░░░░░   15.3  │  69.7% acc  │  4,545 tokens  
XML            ██████████░░░░░░░░░░   13.0  │  67.1% acc  │  5,167 tokens

TOON achieves 73.9% accuracy (vs JSON’s 69.7%) while using 39.6% fewer tokens.

Note on CSV: Excluded from ranking as it only supports 109 of 209 questions (flat tabular data only).
While CSV is extremely token-efficient for simple tables, it cannot represent nested structures.
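
For clarity, the “accuracy per 1K tokens” score is simply accuracy divided by token count, scaled to a thousand tokens. A few lines of arithmetic over the benchmark numbers quoted above reproduce the ranking (the variable names are mine):

// TypeScript (reproducing the ranking arithmetic)
const results = [
  { format: "TOON",         accuracy: 73.9, tokens: 2744 },
  { format: "JSON compact", accuracy: 70.7, tokens: 3081 },
  { format: "YAML",         accuracy: 69.0, tokens: 3719 },
  { format: "JSON",         accuracy: 69.7, tokens: 4545 },
  { format: "XML",          accuracy: 67.1, tokens: 5167 },
];

for (const r of results) {
  // accuracy per 1K tokens = accuracy / tokens * 1000
  console.log(`${r.format}: ${((r.accuracy / r.tokens) * 1000).toFixed(1)}`);
}
// TOON: 26.9, JSON compact: 22.9, YAML: 18.6, JSON: 15.3, XML: 13.0

// Token savings of TOON versus plain JSON:
console.log(`${(((4545 - 2744) / 4545) * 100).toFixed(1)}% fewer tokens`); // 39.6% fewer tokens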

Beyond the benchmarks, TOON represents a quiet design shift in how humans talk to machines. Every token saved isn’t just cheaper compute — it’s more room for reasoning. For prompt engineers and AI tool builders, that’s bandwidth for thought: more context, more examples, and more expressive prompts within the same model limits. In a sense, TOON doesn’t just compress data; it compresses understanding into a denser signal — a language tuned for intelligence, not for parsers.


How LLMs Understand the Uninvented

1. LLMs Don’t ‘Know’ Formats — They Learn Patterns
LLMs like GPT-5 don’t come pre-loaded with parsers for new formats.
They understand text statistically — by learning from patterns in their training data.

So when a brand-new format like TOON appears in late 2025, it’s far too new to have been part of any model’s corpus.
That means GPT-5 (and others) wouldn’t natively “understand” TOON syntax or semantics.

However, the moment you show an example, the model can reason about it as structured text.
It doesn’t parse TOON — it adapts to it through context.
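
In practice, “adapting through context” just means the prompt itself carries the format. Here is a sketch of what that might look like; the legend wording and the sample question are my own, and nothing here assumes the model has ever seen TOON before:

// TypeScript (sketch: teaching the format inside the prompt)
const toonBlock = `users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user`;

const prompt = [
  "The data below is in TOON (Token-Oriented Object Notation).",
  "The header line gives the array name, row count, and field names;",
  "each indented line is one record, comma-separated, in that field order.",
  "",
  toonBlock,
  "",
  "Question: which user has the admin role?",
].join("\n");

// Send `prompt` to whichever chat-completion API you use. The short legend plus one
// example is typically enough context for the model to answer correctly, with no
// TOON-specific training involved.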

2. The Point of TOON Isn’t Built-In Knowledge — It’s Efficiency
The goal of TOON isn’t to make LLMs natively understand it, but to feed structured data in a token-optimized way: compact, human-readable, and schema-aware, just as the spec describes.

LLMs don’t need prior awareness of TOON; they just need a pattern to latch onto — enough examples in the prompt, and they generalize instantly.
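
As a rough illustration of that efficiency argument, you can compare two serializations of the same payload. The four-characters-per-token figure below is a crude rule of thumb, not a real tokenizer, so treat the numbers as directional only:

// TypeScript (crude size comparison; real measurements need a real tokenizer)
const data = {
  users: [
    { id: 1, name: "Alice", role: "admin" },
    { id: 2, name: "Bob", role: "user" },
  ],
};

const asJson = JSON.stringify(data, null, 2);
const asToon = `users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user`;

// Roughly four characters per token for English-like text (a common approximation).
const approxTokens = (s: string) => Math.ceil(s.length / 4);

console.log(`JSON ~${approxTokens(asJson)} tokens, TOON ~${approxTokens(asToon)} tokens`);
// The repeated keys, braces, and quotes in JSON are exactly what TOON strips away,
// and the gap widens as the array grows.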


Emergent Comprehension

That’s what fascinates me most.
GPT-5 was released before TOON even existed, yet it could reason about it as though it did.
That’s not memory — that’s emergent comprehension.

And maybe, as newer models like GPT-6 or Claude-Next appear, they’ll already know TOON natively because it’s part of their training set.
But GPT-5 definitely didn’t.
It was improvising understanding in real time — learning a format that was built for it, but not known to it.


The Reflection

The AI didn’t just explain TOON; it participated in the very process TOON was designed for.
The student learned the language that was built for the teacher — a language the teacher didn’t even know existed, but learned on the fly so it could teach better.

Maybe that’s what the future of intelligence looks like — systems that don’t wait to be taught but infer meaning the instant they encounter structure.

The phone didn’t exist yet, but the future had already called. And somehow, it still answered.


(end)