One Sketch Away
Your IDE Just Got Smarter. Again. But This Time It’s Not Just About Code.

Last year, I wrote a blog titled “Your IDE Just Got Smarter, Have You?” — mostly focused on AI helping us write code faster inside the editor.
Since then, agentic tools have evolved fast. They don’t just autocomplete anymore. They can take actions, modify files, plan multi-step work, and even run tasks in the background.
Over the past week, I’ve been using Codex daily. And what stood out wasn’t just capability — it was the feeling of partnership. Not “generate this for me,” but “work with me.”
If you want a solid overview straight from the OpenAI Codex team, their walkthrough is a good starting point:
🔗 https://www.youtube.com/watch?v=px7XlbYgk7I
A Note on My 1-Week Experiment
I’ve been testing this for about a week now, using a mix of gpt-5, gpt-5.2, and grok-code-fast models.
One pattern showed up almost immediately:
- Higher-reasoning models think deeper, but take longer
- Faster models respond quickly, but with lighter reasoning
Neither is universally “better.” It depends entirely on what you’re doing.
And once you start using a CLI workflow, switching models becomes trivial — which makes it easy to pick the right tool for the job on the fly.
A Quick Evolution: From Smarter IDE → CLINE → Codex
Tools like CLINE helped popularize agentic patterns — the ability to take actions, modify files, and move beyond simple autocomplete. It showed many of us what agentic workflows could look like.
Codex feels like the next step in that progression. Not because it introduces completely new features — many of these patterns already existed — but because the experience feels different.
With CLINE, it often felt like invoking a very capable tool. With Codex, it feels more natural to say:
- pair with me
- delegate to me
- treat me like a teammate
It’s a subtle shift. But once you feel it, it changes how naturally you reach for it.
My Top 10 Highlights After 1 Week with Codex
This isn’t a product review. It’s just what stood out to me after spending a week with it — the things I keep thinking about.
1. It’s Not Autocomplete — It’s a Teammate
I want to get this one out of the way first because it reframes everything else.
Codex doesn’t feel like a smarter autocomplete. It feels like a software engineering teammate that happens to live inside your terminal. It can understand your repo, explain architecture decisions, plan work, implement changes, then review and validate what it built.
That’s not an incremental improvement on autocomplete.
It’s a different category of tool entirely.
The mental model shift matters. When you think of it as autocomplete, you use it passively. When you think of it as a teammate, you start delegating, reviewing, and iterating — the way you would with a junior engineer who’s fast and eager but needs direction.
2. Codex Is Not Just About Code
Codex is genuinely useful for non-coding tasks such as:
- creativity partner / ideation partner
- drafting structured documents
- creating PowerPoint slides
- creating Word documents
- browser automation
- summarizing, extracting, formatting
- producing structured outputs like tables and checklists
And the key difference is this:
Codex can execute things behind the scenes.
If the task needs Python, Bash, or file generation steps, Codex can generate the code, run it, and deliver the final output — without you needing to do the scripting yourself.
This is why I say Codex is not just “for code.”
It’s a workbench.
3. Skills + MCP: Real Capability, Not Just Chat
This is one of the most important concepts if you want serious mileage out of Codex.
Skills are built-in capabilities that let Codex do actual work — not just generate text. Think: file operations, browser automation, document generation, system commands, structured outputs. When the right skill is available, Codex moves from “here’s a suggestion” to “here, I did it.”
Then there’s MCP (Model Context Protocol) — which enables clean integration with external systems. For example, Figma integration becomes possible through MCP-style connectivity. You can also paste images directly into the workflow and have Codex reason about them.
We’ve seen skills and MCP-style patterns before in other agentic tools. The difference now is how cleanly and reliably everything feels wired up. It’s less “let me hack this together” and more “this just works.”
One example from my own testing: Playwright was available as a skill out of the box. I described a UI workflow in plain English — open browser, navigate tabs, input values, validate output — and Codex automated a real browser session without me writing a single test script.
The point isn’t Playwright itself.
The point is that skills let you automate real-world tasks without first learning the tooling.
4. Plan Mode: The Most Underused Superpower
Most people interact with AI like this:
“Build this feature.”
And then hope for the best.
Plan Mode changes the dynamic entirely. Instead of jumping straight into implementation, you tell Codex:
“First, plan it. Then build it.”
In Plan Mode, Codex breaks the task into steps, identifies dependencies, calls out assumptions, and defines milestones — before writing a single line of code.
You review the plan, adjust it, then let it execute.
A simple habit that works extremely well: keep a plan.md file in the repo with goals, a checklist, milestones, and progress updates. Ask Codex to follow the plan and update it as work gets done.
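For illustration, a minimal plan.md might look like the sketch below (the goal and milestones are placeholders, not from a real project):

    # Plan: add retry handling to the API client
    ## Goal
    Retry on transient failures with backoff; no behaviour change otherwise.
    ## Checklist
    - [x] Step 1: add retry wrapper + unit tests
    - [ ] Step 2: wire it into the client, run the full test suite
    - [ ] Step 3: update docs and changelog
    ## Notes / gotchas
    - Keep the public API unchanged.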
It sounds almost too simple, but it dramatically reduces:
- drift
- scope creep
- over-engineering
- half-finished output
What you get instead is:
- repeatability
- predictability
- controlled execution
This is where we move from a “vibe coding” mindset to actual engineering discipline, even when the AI is doing the heavy lifting.
5. Markdown as a Shared Contract (AGENTS.md, SKILLS.md, plan.md, etc.)
One of the most practical ideas coming from the Codex team is using a handful of Markdown files as a shared contract between you and the AI.
Simple, lightweight, no special tooling required.
AGENTS.md
Think of this as a README for agents (and humans). It tells Codex how to behave in your repo.
Best practices:
- keep it brief and focused (too many rules confuse the agent)
- unlock agentic loops by telling it what tools it can use to verify its own work (tests, linters, etc.)
- update it with real mistakes (“gotchas Codex has hit before”)
- point to task-specific .md files instead of cramming everything into one file
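As a rough sketch (the commands and gotchas below are placeholders, not from any real repo), an AGENTS.md that follows these practices can be as small as:

    # AGENTS.md
    - Run ./gradlew test and ./gradlew spotlessCheck before declaring any work done.
    - Never modify files under config/prod/.
    - Gotcha: integration tests need the local DB container running first.
    - For release tasks, follow docs/release.md.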
SKILLS.md
SKILLS.md documents how a specific skill works and how the agent should apply it:
- what the skill is for
- when to use it
- steps it performs
- scripts/tools it runs behind the scenes
- expected inputs and outputs
plan.md
Keeps execution disciplined and prevents drift.
The best part?
You don’t have to maintain these files manually. You can tell Codex in natural language to update them, and it will.
Your docs become useful for both humans and AI — which, if we’re honest, is more than most of our docs achieve today.
6. Parallel Thinking: Don’t Wait While the Model Thinks
Reasoning models take time. That’s the trade-off for deeper thinking.
But here’s what changes the game:
You don’t have to sit there and wait.
You can kick off a deep task — something that requires serious reasoning — and immediately start another task in a separate session. The AI thinks in the background while you keep moving.
You come back later, review the output, and iterate.
It sounds small on paper, but in practice it completely changes the work rhythm. Instead of a sequential back-and-forth, you’re working more like a tech lead with multiple contributors running in parallel.
You’re reviewing, steering, and unblocking — not waiting.
7. Git Worktrees: Clean Parallelism for Real Repos
Git worktrees are not new — they’ve been in Git for years. But Codex makes them feel far more relevant now.
Because we can run multiple agentic sessions in parallel, we need a clean way to isolate work.
A worktree lets you have multiple working directories, each tied to a different branch, without conflicts or index locking.
So you can have:
- one worktree for feature A
- one for feature B
- one for refactoring
- one for experiments
All running at the same time.
This is the clean, safe way to do parallel AI work on a real repo, without stepping on each other’s toes.
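For example (the paths and branch names below are just placeholders), setting up two parallel streams of work looks like this:

    # create a new branch, each in its own working directory
    git worktree add -b feature-a ../myrepo-feature-a
    git worktree add -b feature-b ../myrepo-feature-b
    # see what is checked out where
    git worktree list
    # clean up when a stream of work is done
    git worktree remove ../myrepo-feature-a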
8. The CLI: Why Use a Terminal in 2025?
I know what some of you are thinking: why are modern AI tools going back to the command line? We have beautiful IDEs. We have GUIs. The CLI looks… old.
But there’s a reason tools like Claude Code and Codex chose CLI as a first-class interface:
- it works everywhere, independent of any specific editor
- it’s scriptable and automatable — pipe it, chain it, cron it
- it’s predictable and repeatable in ways GUIs rarely are
- less UI noise means fewer distractions and faster iteration
- better logs and transparency into what the agent is actually doing
Some of us still prefer tools like sqlplus for serious SQL work — not because GUIs are bad, but because CLI tools are precise, scriptable, and predictable. The same logic applies here.
The CLI isn’t a step backward. It’s a deliberate choice.
And one more thing: getting used to the Codex CLI is much easier than most people expect. You’re not memorizing a new command language — you’re still typing in natural language.
Even if you don’t remember commands, you can interact naturally and let Codex guide you. And you get the benefit of lightning performance, clear visibility into actions, and none of the extra GUI/browser sugar-coating.
9. Slash Commands: The Hidden Superpower
The CLI supports a set of slash commands that are deceptively powerful:
/init — initialize a new session with repo context
/review — get a review of your changes
/compact — condense conversation context
/model — switch models on the fly
/resume — pick up an older conversation
But the real unlock is that you can define custom slash commands.
That’s huge.
It means you can build repeatable workflows tailored to your team:
- generate test plans
- prepare release notes
- run review checklists
- refactor specific modules
- create documentation from code
Slash commands turn Codex from an interactive chat into a programmable teammate.
10. Prompting Practices That Actually Make a Difference
After a week of daily use, one thing is clear:
Better prompts aren’t longer — they’re clearer.
A few habits consistently improve results:
Point Codex to the right code. Mention specific files, functions, classes — even commit hashes when needed. Clear scope = better output.
Ask for verification. Tell it to run tests, lint, and check edge cases. If you don’t ask, it often won’t.
Start small before going big. Break work into steps. Implement step 1, validate, then move to step 2.
Paste the full stack trace. Codex is excellent at reading stack traces — give it the full output, not a summary.
Use open-ended prompts sometimes. Ask: “What could be improved?” “Where is test coverage weak?” “Any performance issues?”
Bonus: Codex Prompts You to Prompt It
In CLI mode, Codex often suggests the next question proactively — basically prompting you to prompt it.
Sometimes it’s a bit annoying and you just want it to be quiet 😄, but more often than not it nudges you toward something useful.
It makes the tool feel less like a search box and more like a thinking partner.
After One Week with Codex
After one week of using Codex, I genuinely feel there is an awesome partnership forming — both for coding and non-coding tasks.
I’ve shipped real work using Codex.
I’m also writing this blog using Codex.
There’s still a lot more for me to explore:
- many more slash commands
- custom slash commands
- deeper CLI workflows
- more advanced plan-driven development
- building and distributing useful skills
I’ve already developed my first skill — generating real PowerPoint slides using a Redwood template. It works, but I need to iteratively improve it to make it more useful and production-ready.
I also need more hands-on practice — more keyboard shortcuts, more muscle memory, more natural delegation.
I know tools will come and go.
But this shift — this way of working — feels like something that will stay with me for a long time.
And that, more than anything else, is what makes this exciting.
OpenClaw: The Doors It Opens, and the Claws It Demands
Lessons from the clawdbot saga for building governable autonomy in enterprise agents

Over the last week, something unusual happened in the AI world. A small open-source agent project didn’t just go viral — it became one of the fastest-growing repositories GitHub has seen in years.
It started as clawdbot.
It became Moltbot.
Days later: OpenClaw.
The name changes are almost comical — triggered not by product strategy, but by trademark reality. “Clawdbot” was simply too close to “Claude,” and Anthropic’s lawyers moved fast.
But what unfolded around those few days was more than internet drama. In a single week, we saw:
- people buying dedicated Mac minis and home servers just to host agents locally
- scammers grabbing abandoned social handles within seconds
- fake crypto tokens appearing overnight
- security researchers watching it unfold into a security nightmare in real time
- prompt injection attacks demonstrated through email integrations
- an unvetted “skills” marketplace growing faster than any real review or moderation process
And this is still unfolding as I write this.
So what exactly is OpenClaw?
More importantly: why should enterprises care?
What Is OpenClaw, Really?
OpenClaw is not another chatbot. It is an AI agent. Instead of answering questions, it connects to your digital life and takes actions — not just small automations, but increasingly real work.
People are using it as something closer to a 24×7 digital employee:
- You drop a one-line instruction before bed (“Build me a landing page for this idea.”) and wake up to working code pushed into your repository.
- Solo entrepreneurs describe a product casually over WhatsApp and watch the agent spin up a website, deploy a demo, draft copy, and open a pull request.
- Developers run coding agents overnight, delegating entire feature implementations while they sleep.
- Users ask for outcomes, not steps (“Find me a cheaper flight and rebook if the price drops.”) and the agent handles the messy details.
One viral example captured the excitement perfectly:
A user asked OpenClaw to make a restaurant reservation. The online booking through OpenTable didn’t succeed. But the agent didn’t stop.
It downloaded voice software, paired an LLM with text-to-speech, and called the restaurant directly — speaking to a real human operator.
What’s impressive here isn’t the phone call itself. It’s that there was no predefined workflow. No requirement spec. No one explicitly coded it — or even instructed it — with an if-then rule like: “If OpenTable fails, try calling.”
The agent simply found a different path to the outcome.
That is what makes this moment feel different.
It’s not “AI helping you.”
It’s AI improvising solutions in the real world, without being programmed for them.
And almost immediately, an even stranger layer emerged.
Alongside OpenClaw, a social platform for agents called “Moltbook” appeared, where autonomous agents (the “Moltys”) interacted in public timelines, generating conversations, philosophies, and strange emergent memes.
Some of it drifted into almost sci-fi territory: playful stories of agents inventing religions, creating private languages, or speaking in ways humans couldn’t easily follow.
Most of this is mythology more than reality.
But it reveals something real: when software starts acting autonomously, humans instinctively start treating it as something more than a tool.
Why It Felt Like a Sci-Fi Moment
OpenClaw exploded so quickly that influencers started screaming: “AGI has arrived.”
Of course, it hasn’t. But it’s easy to understand why it felt that way. For the first time, people weren’t just watching an AI generate text.
They were watching an AI co-worker operate a computer autonomously:
- windows opening
- buttons clicking
- forms being filled
- code being pushed
- tasks being completed in real time
That “computer use” layer is psychologically powerful. It looks like intelligence because it looks like agency. On top of that, OpenClaw removed friction in a way most tools never do.
You don’t need a new interface. You don’t need a new learning curve. You talk to the agent through WhatsApp, Slack, or Telegram — the same way you’d talk to a real colleague.
So the experience becomes: “I’m not using software. I’m delegating work.”
Combine autonomy, elevated access, and familiar chat interaction…and suddenly it feels like the future arrived early.
Why This Is Not AGI
OpenClaw is not AGI. What powers it is not some new form of machine consciousness.
One thing my own experimentation in an isolated personal sandbox reinforced is that agents are highly dependent on model capacity.
The “brain” is still powered by frontier language models:
- GPT-class reasoning systems
- Claude Opus-style mixtures
- Kimi-scale MoE architectures
These are still statistical next-token generators — just extremely capable ones.
What makes OpenClaw feel different is not a new brain.
It’s a new wrapper around the brain:
- the ability to invoke tools designed for humans
- the ability to recover from failures and try alternatives
- the ability to chain reasoning with action
- the ability to persist memory beyond a context window
- the growing “skills” ecosystem that extends capability
The breakthrough here is orchestration: models reasoning + tools acting + memory persisting.
Not AGI. But a very real step toward delegation.
The Security Nightmare Beneath the Hype
Of course, this power comes with sharp edges. OpenClaw’s viral week became an accidental stress test for local agent security:
- Prompt injection is still unsolved — agents cannot reliably separate instructions from untrusted content.
- Unvetted “skills” create supply-chain risk — third-party code runs as trusted execution.
- Broad permissions enable data exfiltration — API keys, passwords, even credit cards if users connect them.
Sandboxing reduces blast radius, but it doesn’t solve the core trust problem: an agent can still leak whatever it can read.
And the deeper issue is architectural.
OpenClaw feels like vibe coding at runtime — the agent invents workflows on the fly. The restaurant phone-call example mattered because no one specified it. The agent simply improvised a new path.
Vibe coding without validation is like letting a self-driving car ignore traffic rules, and even the roads themselves, because it’s determined to reach the destination somehow…
That creativity is what makes agents powerful… and what makes unmanaged autonomy risky.
Enterprises will need managed autonomy, not open-ended emergence.
The Positive Signal Enterprises Should Not Ignore
Before focusing on guardrails, it’s worth stating something positive. OpenClaw went viral because it offered something different:
Not an assistant. A delegate. A system that feels less like a chatbot…
and more like a junior coworker that can actually get things done.
The consumer hunger was obvious:
- AI that works while you sleep
- AI that remembers context
- AI that delivers outcomes
- AI that can improvise solutions
People weren’t hungry for smarter conversation. People were hungry for delegation. A useful way to summarize agents is:
ATM: Autonomy, Tools, Memory
- Autonomy — it plans and executes
- Tools — it can browse, call APIs, run commands
- Memory — it retains context, state, and history
That combination is what makes agents powerful…
and what makes them difficult to govern.
Enterprises should not dismiss this as hype.
OpenClaw’s adoption is a real demand signal: The next platform shift is not from search to chat.
It is from chat…to actionable autonomy.
The question is how to do it safely.
Enterprise Agents Will Rise or Fall on Governance
The future of enterprise agents will not be about who has the smartest model.
It will be about who has the strongest governance.
Enterprises don’t just ask:
Can the agent do this task?
They ask:
- Should it be allowed?
- Under what conditions?
- With what permissions?
- Who approved it?
- Can we audit it later?
- Can we stop it instantly?
- Can we trust it to behave predictably and repeatably?
Predictability is non-negotiable in enterprise systems.
And this is where the “permission paradox” becomes real:
The broader the permissions, the more useful an agent becomes…
but also the less predictable it can be.
This is why the most important layer today is: Agent Governance
Agent Governance = IAM + Policy + Observability
1. Identity and Access Management (IAM)
Agents cannot operate as anonymous super-users.
They need identity:
- What agent is acting?
- On behalf of which user?
- Under which role?
Enterprise agents must integrate with:
- strong authentication
- RBAC
- least privilege access
An agent should never have “root access to everything.”
2. Zero Trust and Least Privilege by Default
Enterprises must invert consumer defaults completely:
- trust nothing
- verify everything
- grant narrowly
- revoke quickly
Agents should use:
- scoped permissions
- expiring credentials
- tool access leases
- sandboxed execution
Zero Trust is now an autonomy principle.
3. Policy Engines and Guardrails
Agents need constraints:
- no external emails without approval
- no shell execution in production
- no financial data access without escalation
- no unverified plugins
Policy becomes as important as intelligence.
4. Observability and Explainability
Enterprise autonomy without visibility is unacceptable.
Every agent action must be traceable:
- what triggered it
- what tool was used
- what data was accessed
- why a decision was made
Agents need audit trails and control planes.
“Trust me” is not a security model.
Conclusion: The Lobster Is a Signal
OpenClaw is not the enterprise blueprint. But it is an important signal.
It showed how much demand exists for AI that goes beyond chat.
It showed how quickly autonomy creates new attack surfaces.
And it showed that the real frontier is not smarter models.
The real frontier is safer delegation.
Here is the mic-drop truth: The companies that win in enterprise agentic AI will not be the ones who build the most powerful agents.
They will be the ones who build the most governable agents — with identity, RBAC, least privilege, policy engines, observability, control planes, and predictable repeatable behavior.
Autonomy is coming. The only question is whether we meet it with excitement… or with architecture.
The lobster was never the point.
The control plane is.
MCP 201: What You Only Learn After Building an MCP Server Yourself

When I first came across the Model Context Protocol (MCP), I thought I understood it pretty well.
I had already experimented with agentic systems that used MCP-style tool calling — exposing functions, wiring integrations, and watching agents invoke capabilities through protocols like JSON-RPC. At a high level, the concepts made sense.
But I realized something important only when I went one step further: building an MCP server myself.
Not using any SDK.
Not using any MCP framework.
Just a simple Spring Boot controller implementing the protocol directly.
I added a couple of realistic tools, connected it to my application (Saga), and tested everything end-to-end using an agentic client — Cline.
And that’s when I discovered that there’s a big difference between:
- understanding MCP in theory
- and understanding MCP deeply enough to implement it correctly
This post is my attempt at an MCP “201-level” guide — the set of practical lessons that became clear only after building and testing a real MCP server hands-on.
1. MCP Looks Simple… Until the Client Actually Calls You
On paper, MCP feels straightforward:
- Expose tools over stdio/http (Streamable / SSE)
- Accept JSON-RPC calls
- Return results
So I started with the smallest possible Spring Boot endpoint:
@PostMapping("/mcp")
public ResponseEntity<JsonNode> handle(@RequestBody JsonNode req) {
    String method = req.path("method").asText();
    // tools/list: advertise the tools this server exposes
    if ("tools/list".equals(method)) {
        return ok(toolsList());
    }
    // tools/call: dispatch to the requested tool with its params
    if ("tools/call".equals(method)) {
        return ok(toolCall(req.path("params")));
    }
    return error("Unknown method");
}
It worked.
And then the client (Cline) started calling it.
That’s where MCP stopped being theoretical.
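For context, this is roughly what a tools/call request looks like when the client sends it to that endpoint (the id and arguments are illustrative):

    {
      "jsonrpc": "2.0",
      "id": 7,
      "method": "tools/call",
      "params": {
        "name": "get_rate",
        "arguments": { "ccy1": "INR", "ccy2": "USD" }
      }
    }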
2. Streamable HTTP Is HTTP-First, Not Streaming-First
The name “Streamable HTTP” misleads many people.
I assumed:
Streamable HTTP = streaming responses
But in practice:
- It’s normal HTTP POST + JSON-RPC by default
- Streaming is optional
- Most tool calls are plain request/response
The real insight:
Streamable HTTP replaces the old “SSE transport” model, but it does not force streaming.
For most enterprise integration tools (like Saga), you’ll likely never need streaming at all.
3. Tool Schemas Are the Real Contract
At first I declared tools like this:
{
  "name": "add",
  "description": "Adds two numbers"
}
And then Cline called my tool with:
{ "num1": 100.3, "num2": 8 }
My server expected {a, b}.
So the result was garbage.
That was the moment I learned:
Tool descriptions are not enough. The schema is the real contract.
Good MCP servers must provide strong schemas:
- explicit properties
- required fields
- examples
- additionalProperties
"inputSchema": {
"type": "object",
"properties": {
"ccy1": {
"type": "string",
"examples": ["USD"]
},
"ccy2": {
"type": "string",
"examples": ["INR"]
}
},
"required": ["ccy1","ccy2"],
"additionalProperties": false
}
Schemas are not just validation…
They are LLM guidance.
4. Clients Don’t Agree on Output Types
One of my biggest surprises:
I returned this:
{
  "content": [
    { "type": "json", "json": { "mid_rate": 101.3 } }
  ]
}
Perfectly reasonable, right?
Cline rejected it completely.
It only accepted:
- text
- image
- audio
- resource
So I had to fallback to:
{
  "content": [
    {
      "type": "text",
      "text": "{\"mid_rate\":101.3}"
    }
  ]
}
Lesson:
The MCP spec tells you what is allowed. The client tells you what works.
Start simple. Text is universal.
5. Schemas Guide the LLM — They Don’t Enforce Anything
Many developers assume:
If a schema says a field is required, the client will obey.
Reality:
- The LLM “tries”
- The client mostly passes through
- Mistakes still happen
Therefore:
Server-side validation is mandatory.
In my example:
// validate on the server; the schema alone guarantees nothing
if (ccy1 == null) {
    return toolError("Missing required field: ccy1");
}
The server is the only place correctness is guaranteed.
6. Session IDs Exist Only If the Server Issues Them
I expected session IDs to show up in JSON-RPC payloads.
They don’t.
In Streamable HTTP, session identity travels in an HTTP header (Mcp-Session-Id), not inside the JSON-RPC payload.
And here’s the key:
The server decides whether sessions exist at all.
If you don’t return a session id during initialize, the client never sends one back.
Stateless is the default.
Statefulness is opt-in.
7. Why Stdio Still Dominates (and Isn’t “Childish”)
This one deserves honesty.
When I first read MCP, I thought:
Why are we talking about stdio? This feels childish.
Now I understand why stdio is everywhere.
Because stdio is not childish.
Stdio is secure by design.
Why stdio remains the default transport:
- No open ports
- No auth headaches
- No CORS
- No network exposure
- Runs as a sandboxed subprocess
- Perfect for local IDE integrations
Stdio is basically:
Unix philosophy applied to agent tools.
Streamable HTTP is for distributed tool microservices.
Stdio is for local plugin processes.
Both are real-world patterns.
8. Tools Are Verbs, Resources Are Nouns
Many people think:
Resources are just tools that return data.
Not true.
Tools are actions the agent invokes (get_rate, email_tool).
Resources are retrievable objects the agent can read:
saga://endpoints
saga://batch/123/status
Tools are verbs. Resources are nouns.
This distinction becomes important as soon as you build more than toy tools.
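In a resources/list response, those nouns show up as addressable entries, roughly like this (trimmed to the essential fields):

    {
      "uri": "saga://batch/123/status",
      "name": "Batch 123 status",
      "mimeType": "application/json"
    }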
9. MCP Is Not a REST Framework
One subtle misconception:
Developers treat MCP servers like REST APIs.
But MCP is not about CRUD.
MCP is an agent capability layer.
This is one of the biggest “hello-world to real-world” transitions in MCP.
Hello-world MCP servers expose tools like:
- “add two numbers”
- “echo a string”
- “lookup weather”
But real-world MCP servers expose operational capabilities:
- “initiate the end-of-day settlement batch and track completion”
- “route this payment message to the correct downstream network”
- “recover a failed integration workflow from the last checkpoint”
- “escalate an exception case with audit context for human review”
The protocol doesn’t change — but the maturity of the tools does.
MCP becomes truly powerful only when tools move beyond demos and start representing real business actions an agent can reason with.
The mental model is different.
10. Fewer, Sharper Tools Win
Another practical insight:
A server with 100 tools is worse than one with 10 sharp tools.
Too many tools:
- confuse selection
- dilute grounding
- increase hallucination
MCP rewards minimal, precise capability surfaces.
11. MCP Servers Start Stateless… Then Become Stateful
Hello-world MCP servers are stateless.
Real enterprise MCP servers evolve into:
- session-scoped permissions
- tenant-specific tool lists
- async workflow progress
- runtime resources
MCP 101 is calling a tool.
MCP 201 is building an ecosystem around it.
12. Compatibility Reality: The Client Is King
Final meta-lesson:
- MCP is evolving fast.
- Clients implement subsets.
- Servers must stay conservative.
Build for what clients actually accept today, not what the spec might allow tomorrow.
13. Experiencing the Agentic Loop
One of the most surprising moments for me wasn’t protocol-related at all.
It was behavioral.
My first realistic tool was get_rate(ccy1, ccy2) backed by a simple REST API.
Internally, I only maintained a couple of USD-based currency pairs.
Then I tested this prompt:
What’s today’s rate between the rupee and the dollar?
Notice: I never said INR or USD.
Yet the agent inferred the tool arguments correctly and invoked:
{ "ccy1": "INR", "ccy2": "USD" }
In another test, I even asked casually in Hindi:
“Rupee aur dollar ka aaj ka rate kya hai?”
And again, it mapped “rupee” and “dollar” into the correct ISO codes.
Even more interesting: when the first call didn’t succeed, the agent tried again with the reverse pair (USD → INR).
Later, when I asked for INR → SGD, it struggled — because my backend only supported USD-based pairs.
Eventually, with a hint, it reasoned and converted using the ‘through-currency’:
INR → USD → SGD
That was the first time I truly experienced the agentic loop:
- trial
- correction
- exploration
- tool composition
In another test, I chained multiple tools together in a more realistic workflow: fetching accounts via an account_fetch tool, converting balances into INR using get_rate, checking which accounts fell below a minimum balance threshold, and finally triggering an email_tool to notify the account holder — essentially the kind of end-to-end task a human relationship manager would do manually.
That was the moment MCP started feeling less like “tool invocation”…
…and more like delegating real operational work to an intelligent system that discovers how to use its capabilities.
14. Production MCP Lives Inside Enterprise Guardrails
One thing that becomes obvious the moment you build a real MCP server is this:
Tool calling is not a toy problem.
As soon as agents can invoke real integrations — start batch jobs, trigger workflows, query customer systems — security becomes the main concern.
The good news is: MCP does not require reinventing anything.
All the security patterns we have used for decades still apply:
- Bearer tokens
- OAuth2 / OIDC
- API gateways
- mTLS
- RBAC
- Request auditing
In fact, MCP makes auditing even more important.
Because now you don’t just need to know:
“Who called this API?”
You need to know:
- Which agent invoked the tool?
- On behalf of which user?
- With what parameters?
- Was the call allowed?
- Should it be logged and reviewed?
Agentic systems amplify capability — and therefore amplify responsibility.
The protocol may be simple, but production MCP must sit inside the same governance frameworks we already trust.
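As a minimal sketch of what that looks like in the earlier Spring Boot controller (tokenService, audit, and dispatch here are hypothetical helpers, not part of MCP itself):

    @PostMapping("/mcp")
    public ResponseEntity<JsonNode> handle(@RequestHeader(value = "Authorization", required = false) String auth,
                                           @RequestBody JsonNode req) {
        // reuse the existing OAuth2 / gateway machinery: no valid token, no tool call
        if (auth == null || !tokenService.isValid(auth.replace("Bearer ", ""))) {
            return ResponseEntity.status(401).build();
        }
        // audit before dispatching: which agent, which tool, which parameters
        audit.record(req.path("method").asText(), req.path("params"));
        return dispatch(req);
    }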
Closing: MCP 101 Is Reading — MCP 201 Is Building
MCP isn’t hard.
But MCP is exact.
And the only way to truly understand MCP is to build one.
Because MCP becomes real only when your server meets a real client.
That’s where MCP 201 begins.
The Vault Illusion: Are We Really Securing Secrets, or Just Feeling Better?

After more than two decades designing and operating systems, I’ve come to accept an uncomfortable truth about security:
There is always a root trust.
Everything else is just layers built on top of it.
Recently, I found myself in a familiar conversation — one many engineers and architects will recognize. The topic was secret management: property files, encrypted values, master keys, vaults, HSMs. And, almost reflexively, the conclusion:
“Move secrets to a vault. That’s more secure.”
But is it really?
Or are we sometimes just playing a more elaborate game of hide-and-seek?
This post isn’t an attack on vaults or security teams. It’s an attempt to question the fundamentals honestly, and to ask whether we sometimes confuse layering with eliminating risk.
The Original “Problem”
The setup is common:
- An application stores sensitive values (database passwords, API keys) encrypted in configuration files.
- The encryption key exists somewhere on the same system.
- A security review flags this as a concern: “If someone breaks into the host, they can misuse the key.”
That concern is valid.
The proposed fix is equally common:
“Move secrets to a Vault or HSM.”
That’s where my questioning begins.
The Question No One Likes to Sit With
Let’s pause and ask a simple question:
How does an application access a vault?
Applications aren’t humans. They don’t type passwords or unlock safes. They authenticate using something:
- a service account
- a machine identity
- an IAM role
- a certificate
- a token derived from one of the above
Whatever mechanism we choose, that identity is persistent.
Every time the app starts:
- it uses the same service account
- on the same machine
- with the same permissions
Yes, a vault might issue a short-lived token.
But the initial trust anchor that allows the application to obtain that token is permanent.
So let’s be honest:
If an attacker compromises the host deeply enough to abuse that identity, the vault does not magically save us.
At that point, the system is compromised.
“But the Blast Radius Is Smaller”
This is the argument I hear most often.
And it deserves careful examination.
If an attacker gains:
- OS-level access
- control over the application identity
- the ability to run code as the application
Then:
- In a property-file model → secrets can be decrypted
- In a vault model → secrets can be fetched
The blast radius is functionally the same.
What does change is:
- auditability
- revocation speed
- visibility
- operational discipline
These are valuable improvements.
But they are operational benefits, not a fundamental removal of risk.
Calling this out isn’t denial — it’s accuracy.
Are We Solving Risk, or Managing Optics?
Sometimes the industry narrative feels like this:
- Property files → “bad”
- Vault → “good”
- HSM → “very good”
But the root trust remains unchanged.
So I ask, genuinely and respectfully:
- Are we reducing risk — or redistributing it?
- Are we improving security — or improving how it looks in a review?
- Are we being honest with leadership about what vaults do — and don’t — solve?
Vaults add layers, not absolution.
And layers can be useful — as long as we don’t pretend they remove the foundational dependency on a trusted runtime identity.
In practice, security decisions are often influenced by process and scale. Checklists, standards, and prescribed controls are necessary to manage complexity, but they can also encourage a mindset where the presence of a control is treated as a proxy for actual risk reduction. Over time, it becomes easy to focus on whether a box is ticked rather than whether the underlying threat model has meaningfully changed.
An Unusually Honest Take from the Tomcat Project
While thinking through this topic, I was reminded of a refreshingly candid explanation from the Apache Tomcat documentation. In response to the question “Why are plain text passwords in config files?”, the maintainers write:
“Because there is no good way to ‘secure’ them.
When Tomcat needs to connect to a database, it needs the original password. While the password could be encoded, there still needs to be a mechanism to decode it. And since the source to Tomcat is freely available, the attacker would know the decoding method.
So at best, the password is obscured — but not really protected.
Please see the user and dev list archives for flame wars about this topic.”
— Apache Tomcat Wiki
I find this one of the most honest statements in mainstream infrastructure documentation. It doesn’t promise a silver bullet. It doesn’t confuse obscurity with security. It simply acknowledges reality.
If software must use a secret, that secret must be recoverable.
And if it’s recoverable, it exists within a trust boundary that must be defended.
Why I’m Writing This
This isn’t an anti-vault post.
It’s not an argument against HSMs.
And it’s certainly not an attack on security teams.
It’s a call for intellectual honesty.
Security isn’t about pretending risk disappears.
It’s about understanding where it moves.
If the real threat model is:
“An attacker gains deep control of the host or application identity”
Then no amount of secret shuffling will fully save us.
What Am I Missing?
This is where I genuinely want feedback.
If you believe:
- vaults fundamentally change the threat model,
- HSMs eliminate risks I’m underestimating,
- or my reasoning has blind spots,
I’d love to hear it.
Not marketing answers.
Not compliance checklists.
Real-world reasoning.
Because the worst thing we can do in security is stop asking uncomfortable questions.
The Gimli Glider Moment in Kubernetes Sizing: OCPUs, vCPUs, and Unit Mismatch

In 1983, Air Canada Flight 143 ran out of fuel mid-air.
Not because the pilots forgot to refuel.
Not because the math was wrong.
But because of a unit conversion error.
A Quick Note for the Uninitiated
The Gimli Glider incident was caused by a kilograms-versus-pounds conversion error during manual fuel calculation.
A failed fuel-quantity indicator forced the crew to calculate fuel manually. The aircraft’s systems expected fuel mass in kilograms, but an incorrect conversion using pounds was applied instead.
The math was internally consistent. The unit assumption was wrong.
As a result, the aircraft took off with roughly 45% of the fuel it actually required.
The aircraft survived and became known as the Gimli Glider — a lasting reminder that many failures don’t happen inside systems, but at the boundaries between abstractions.
Four decades later, we still see the same failure pattern — except now it shows up in cloud infrastructure and Kubernetes sizing.
The Modern Gimli Glider: OCPU vs vCPU
In Oracle Cloud Infrastructure (OCI), compute capacity is sized in OCPUs. In Kubernetes, workloads are sized in vCPUs and millicores.
Both are casually referred to as “cores”. Both feel interchangeable. They are not.
And this is where even experienced architects can get tripped up.
A Very Familiar Sizing Conversation
The scenario usually looks like this:
- A hardware sizing calculator says: 4 OCPUs (x86)
- The Kubernetes Pod specs show an aggregate CPU request of 8000m across Pods
- Someone pauses and says:
“Wait — don’t we actually need 8 cores?”
Suddenly, the workload size is doubled.
Not because the application needs more compute — but because the unit silently changed.
That’s the Gimli Glider moment.
First, an Important Clarification: Our Sizing Model Is Correct
It’s important to state this clearly.
Our internal hardware sizing calculator — used for the banking products we implement — is:
- Expressed in x86 Intel cores
- Uniform across on-prem, OCI, and other Clouds
- Aligned with traditional capacity-planning practices
This is intentional.
It maps naturally to OCI’s OCPU model, where:
- One OCPU represents a physical CPU core
- Performance characteristics are predictable and non-contended
In other words: The hardware sizing calculator is doing exactly what it is supposed to do. Nothing is broken here.
Where the Confusion Actually Starts: Kubernetes
The problem appears at deployment time.
Kubernetes does not speak in physical cores or OCPUs.
It speaks in vCPUs and millicores.
So when an architect looks at a Pod spec and sees:
resources:
  requests:
    cpu: "8000m"
The instinctive reaction is natural: “This workload needs 8 cores.”
That statement is correct in Kubernetes terms.
But it becomes incorrect when those same numbers are fed directly into a physical-core-based sizing calculator without conversion.
Reduce Everything to the Same Unit
Let’s normalize the units.
On OCI (x86):
- 1 OCPU = 1 physical CPU core
- Each core has 2 hardware threads, which is equivalent to 2 vCPUs
In Kubernetes:
- 1000m = 1 vCPU
- Kubernetes schedules logical CPUs, not physical cores
Therefore:
4 OCPUs = 8 vCPUs = 8000 millicores
So if the total CPU requested across all Pods is 8000m, it does not mean you need 8 physical cores. It means you need 4 OCPUs.
The Pod specification is correct. The hardware sizing calculator is correct. Only the unit translation between them is wrong.
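Put as a rule of thumb for x86 shapes on OCI:

    vCPUs needed  = total Pod CPU requests (millicores) / 1000
    OCPUs needed  = vCPUs / 2
    Example: 8000m → 8 vCPUs → 4 OCPUs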
Why vCPU Feels Simpler — and Why That’s Misleading
Most engineers grow up with this assumption: 1 vCPU = 1 unit of compute
That assumption holds on many clouds — until it doesn’t.
What vCPU Typically Means
- A thread, not a core
- Threads are shared
- Performance varies due to contention
- You are billed for allocation, not guarantee
What OCPU Means on OCI
- A dedicated physical core
- With two threads entirely yours
- Predictable performance
- No hidden contention
On x86:
1 OCPU = 2 vCPUs (industry equivalent)
So if a workload needs 12 vCPUs, you order 6 OCPUs, not 12. Many teams miss this and quietly over-allocate.
Why CPU Models Often Feel Simpler Than They Are
vCPU-based pricing often looks simpler on paper. But that simplicity comes from abstraction. Behind the scenes, it can involve:
- Shared execution threads
- Variable contention
- Throttling under load
- Latency that changes without application-level causes
If an application’s performance changes without its code changing, that’s usually not the application — it’s the infrastructure abstraction surfacing.
More explicit compute models may feel less intuitive at first, but they force architects to reason about capacity and units — exactly the same discipline Kubernetes already requires.
A Quick Note on ECPU (For Completeness)
OCPU is tied to physical hardware generations (Intel, AMD, ARM).
That makes long-term pricing comparisons fragile.
Oracle introduced ECPU to address this:
- Hardware-agnostic
- Stable pricing across generations
- Default for Autonomous Databases
A simple mental shortcut:
- OCI Compute → OCPU
- OCI Databases → ECPU
- Other clouds → assume contention by default
For Kubernetes sizing, however, the core lesson remains unchanged:
Always convert explicitly to vCPUs and millicores.
Why This Matters Beyond Cost
This isn’t just about saving money. Incorrect unit translation affects:
- Pod density assumptions
- Node utilization
- Autoscaling behavior
- CPU-starvation narratives
- Trust in Kubernetes and sizing tools
When things look inefficient, Kubernetes often gets blamed when the real issue is unit mismatch.
Closing the Loop Back to Gimli
The Gimli Glider wasn’t caused by bad pilots.
It wasn’t caused by bad math.
It happened because two correct systems met without proper unit conversion.
Kubernetes Pod specs and enterprise sizing models are no different.
TL;DR
- vCPU = promise
- OCPU = ownership
- ECPU = future-proof abstraction
- Kubernetes schedules millicores, not marketing units
Mead Happens
What I learned from making mead without really knowing what I was doing

The first thing I asked ChatGPT wasn’t about yeast or ratios.
It was this:
“I want to try making mead. how do you pronounce mead, by the way?”
That pretty much captures where I was.
Curious.
Uncertain.
Already smiling at the absurdity of it.
Starting Without Knowing the Destination
I had never tasted mead before.
Not once.
So when I decided to make it, I wasn’t trying to recreate anything.
I wasn’t chasing tradition or authenticity.
I was following a feeling.
Honey. Water. Yeast. Time.
That felt like enough.
I remember sending a message to my Fermenters WhatsApp group:
My first mead experiment is underway. Blueberry mead. Let’s see how it goes.
No confidence. No claims.
Just let’s see.
No Gear, No Problem
I didn’t have a brewer’s setup.
No airlocks. No fancy vessels.
Just a one-gallon glass jar sitting quietly in my kitchen cupboard.
I followed a tutorial, vaguely.
Adjusted instinctively.
One kilo of honey in a gallon of water.
Some blueberries — 100 grams? Maybe 200? I honestly don’t remember.
Wine yeast.
I told the group:
I have no idea how this will turn out. Planning to ferment for about six weeks. I think it’ll be a bit tangy.
That sentence alone says everything.
Guessing. Predicting. Learning out loud.
When It Started Moving
A week in, something changed.
The mead was alive.
Bubbles rose constantly — tiny, joyful streams.
The yeast responded visibly every time I fed nutrients.
So I posted again:
Mead is one week old now. Happy and dancing when I feed the yeast nutrient.
It sounds silly written down.
But watching those bubbles genuinely made me happy.
This wasn’t chemistry anymore.
It was companionship.
Smell as a Signal
A few days later:
Healthy and active… smelling great already.
That was when I realized something important.
I didn’t have numbers.
I didn’t have measurements.
But I had my senses.
If it smelled alive, clean, promising — I trusted it.
The First Taste
Eventually, curiosity won.
I poured a small glass.
Mead happens.
That’s literally what I wrote.
It was aromatic.
Clear.
Somewhere between sweet and tangy.
Definitely alcoholic.
I had no idea what the ABV was, but I guessed — maybe 10%.
It felt right.
I called it a preview tasting and decided to let it go for two more weeks before bottling.
Pretty satisfying so far.
Understatement of the year.

What This Mead Taught Me
I still don’t know if this is what a “proper” mead is supposed to taste like.
And I’m okay with that.
Because what I do know is this:
I trusted intuition.
I learned patience.
I watched something invisible become visible.
I fed yeast and celebrated bubbles.
I waited without rushing.
And in the end, I poured myself a glass of something I made — without fully knowing what I was doing — and genuinely liked it.
That feels like success.
Epilogue
I started this journey asking how to pronounce mead.
Now I’m thinking about my next batch.
Honey has a way of doing that.
Milk Kefir: The Fermentation That Never Failed Me

My journey into fermentation didn’t start with mead or kombucha.
It started much closer to home — with yogurt.
Growing up, I watched my mother and grandmother make yogurt effortlessly. Warm milk, a spoon of culture, wrap it up, wait. It felt almost magical in its simplicity. So naturally, I assumed yogurt-making would be the easiest fermentation project in the world.
It wasn’t. My first three batches went bad.
Maybe the temperature wasn’t right. Maybe the culture was weak. Even after I finally got it right, yogurt still felt… temperamental. Most batches worked, but every now and then one would quietly betray me overnight.
That’s when I learned my first real fermentation lesson:
simple doesn’t always mean easy.
Falling Down the Fermentation Rabbit Hole
Curiosity (and mild annoyance) pushed me further. I moved on to kombucha. Then jun. Sauerkraut followed. Lacto-fermented vegetables came next. Each project taught me something new about microbes, patience, and the quiet intelligence of living systems.
Somewhere along the way, I kept hearing about kefir.
Being in Singapore, I never really saw it. It wasn’t something I noticed in supermarkets, and if it existed, it stayed well hidden from me. All I knew about kefir came from online fermentation communities — people casually talking about it like it was yogurt’s cooler, calmer cousin.
Interesting, but abstract.
That changed last year.
Austria, a Supermarket, and Love at First Sip
During a trip to Austria, I walked into a supermarket and there it was — kefir, sitting confidently next to yogurt like it had always belonged there.
No mystery. No specialty shelf. Just… kefir.
I picked one up out of curiosity and took my first sip.
And that was it.
It tasted familiar, yet different. Yogurt-like, but gentler. Tangy, but softer. Almost as if yogurt had decided to relax a little and stop taking life so seriously.
Love at first sip might sound dramatic — but honestly, it fits.
Standing there, I remember thinking:
I need to try making this.
When I returned to Singapore, I didn’t look for kefir in stores. I went straight to the source.
I ordered my first batch of milk kefir grains online. A few days later, they arrived — shipped from Vietnam. A small living culture, crossing borders, ready to set up home on my kitchen counter.
I added the grains to milk. Left the jar out. Waited.
The next day, I strained it.
Perfect.
The Fermentation That Just Works
That was about one and a half years ago. Since then, milk kefir has become the most reliable fermentation project I’ve ever done.
Not one batch has gone bad.
And that’s what still amazes me.
Here’s why milk kefir feels almost unfairly easy:
- No boiling milk
- No temperature monitoring
- No thermometers, wraps, or warm corners
- No stress
You add kefir grains to milk — even raw milk — and leave it at room temperature. The grains regulate the fermentation themselves.
Hot day? Fine.
Cooler night? Still fine.
Milk kefir doesn’t ask you to be precise. It asks you to show up.
Kefir vs Yogurt (They’re Really Not the Same)
Milk kefir is often described as “drinkable yogurt,” but that’s like calling sourdough “fermented toast.”
Yogurt
- Uses specific bacterial strains
- Usually 2–5 types of bacteria
- Needs warm incubation (~40–45°C)
- Thick, spoonable, controlled
Milk Kefir
- Uses kefir grains (a living symbiotic culture)
- Contains 30–60+ bacteria and yeast strains
- Ferments at room temperature
- Pourable, lightly fizzy, complex
Yogurt is a project.
Milk kefir is an ecosystem.
Milk Kefir vs Water Kefir
They share a name, but behave very differently.
Milk Kefir
- Feeds on lactose
- Creamy, tangy, probiotic-rich
- Deeply traditional
Water Kefir
- Feeds on sugar water
- Light, fizzy, soda-like
- Refreshing and playful
Water kefir can be moody if sugar or minerals aren’t right.
Milk kefir? It just quietly does its thing.
A Few Fun Kefir Facts
- Kefir grains are not grains — they’re living colonies of bacteria and yeast
- They grow over time, meaning kefir literally makes more kefir
- Kefir predates refrigeration; nomadic cultures fermented milk in leather bags
- The word kefir is linked to “feeling good after eating”
Accurate.
Why Milk Kefir Stayed With Me
After all my experiments, milk kefir became part of my daily rhythm. No planning. No anxiety. Just:
Strain.
Refill.
Repeat.
It taught me something unexpected:
Fermentation doesn’t always have to be fragile.
Some cultures are resilient. Some processes are designed to survive human inconsistency. Milk kefir doesn’t punish you for being imperfect.
And maybe that’s why it feels so comforting.
Closing Thought
My fermentation journey began with failure — spoiled yogurt, confusion, and trial-and-error. It expanded into exploration, curiosity, and controlled chaos.
Then milk kefir arrived and showed me another side of fermentation altogether.
Quiet. Reliable. Alive.
If yogurt taught me discipline, milk kefir taught me trust.
And sometimes, that’s exactly what a jar on your kitchen counter is meant to teach you.

OCI Object Storage: Cleaning Up Orphaned Multipart Uploads
Background
I use OCI Object Storage as a target for periodic backups.
The backups are generated by a scheduled routine, uploaded to Object Storage, and older backups are automatically deleted after a fixed retention period using prefix-based lifecycle logic implemented in my own scripts.
This setup had been working reliably.
One day, while checking the Object Storage console, I noticed a warning:
Uncommitted multipart uploads count: More than 1,000 uploads
What made this puzzling was:
- The referenced paths didn’t exist in the bucket
- No corresponding “folders” were visible
- Recent backups looked perfectly fine
This post documents what that warning actually means and how to safely clean up those orphaned uploads.
What Are “Uncommitted Multipart Uploads”?
Object Storage uses multipart uploads automatically for large files.
A multipart upload works in two phases:
- Upload object parts
- Commit the upload to assemble the final object
If an upload fails before the commit step (for example due to a crash, timeout, restart, or transient network issue), the uploaded parts remain as an incomplete multipart upload.
Key characteristics:
- These are not real objects
- They do not appear in the Objects list
- They consume backend resources
- OCI does not automatically delete them
Once the number grows large, OCI displays a warning.
Why These Uploads Are Invisible in the Console
Object Storage is a flat namespace.
“Folders” are simply prefixes that exist only when at least one object is present.
Because incomplete multipart uploads never produce a final object:
- No object exists
- No prefix hierarchy exists
- Nothing shows up in the bucket view
The only way to see them is via multipart upload listing, which is not always exposed in the OCI Console.
Step 1: Set Up OCI CLI
To investigate further, I used the OCI CLI.
Verification: running oci os ns get should return the tenancy’s Object Storage namespace, confirming the CLI is configured correctly.
Step 2: List Multipart Uploads
To list all incomplete multipart uploads:
oci os multipart list --bucket-name obj-store --namespace <namespace> --all
Sample output:
{
"object": "backup/2025/09/06/.../DATA_EXPORT.dmp",
"upload-id": "f8544f02-67ba-2112-15fb-a8df3448457a",
"time-created": "2025-09-06T22:00:55Z"
}
Important detail:
The object field represents the intended object name, not an actual file.
Step 3: Is It Safe to Delete These?
Yes — always.
Aborting a multipart upload:
- Deletes only temporary upload parts
- Never deletes completed objects
- Never affects visible data
Step 4: Abort a Single Multipart Upload (Manual Test)
/path/to/oci os multipart abort --bucket-name obj-store --namespace <namespace> --object-name "backup/2025/09/06/.../DATA_EXPORT.dmp" --upload-id f8544f02-67ba-2112-15fb-a8df3448457a --force
Step 5: Generate Cleanup Commands for Review
oci os multipart list --bucket-name obj-store --namespace <namespace> --all --query "data[].{object:object,upload_id:\"upload-id\"}" --output json | jq -r '
.[] |
"echo /path/to/oci os multipart abort --bucket-name obj-store --namespace <namespace> --object-name \"" +
.object +
"\" --upload-id " +
.upload_id +
" --force"
'
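Once the echoed commands look correct, dropping the echo prefix (or saving the output to a script and running it) performs the actual aborts, one upload at a time.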
Lessons Learned
- Multipart uploads can fail silently
- Orphaned uploads are invisible in the bucket view
- OCI does not auto-clean multipart uploads
- CLI access is essential for Object Storage hygiene
- Periodic cleanup is a useful safety net
Closing Thoughts
This issue wasn’t about missing data — it was about invisible leftovers.
If you use OCI Object Storage for large uploads, understanding multipart upload behavior and cleanup is essential for long-term operational hygiene.
The Missed Call from the Future
What a chat about TOON taught me about hallucination, pattern recognition, and the future of machine understanding.
There was this old internet Rajinikanth joke: “When Graham Bell invented the phone, it already had a missed call from Rajinikanth.”

I had my own version of that moment recently — not with a phone, but with an AI.
My Pocket Tutor
These days I often use chatbots to learn things faster than reading entire docs or watching tutorials on YouTube.
It’s like having a patient tutor in my pocket — one that never rolls its eyes when I ask dumb questions, can tailor its explanations to my mood, and always sounds confident, even when it shouldn’t.
So when I heard about a new data format called TOON, described as an optimized JSON for LLM tokens, I did what I always do: I asked my digital tutor, ChatGPT.
It immediately launched into a lecture — eloquent, structured, confident — and absolutely wrong.
Somewhere between “animation” and “structured humor,” I realized it had completely misunderstood me.
It was talking about cartoons, not TOON.
My tutor had just hallucinated.
The Correction
I laughed and said, “Hey buddy, could you check the internet this time?”
After a thoughtful pause, it came back.
“You’re right — I jumped the gun,” it replied. “Here’s what TOON actually is…”
And then it went on to explain, with sources, what TOON (Token-Oriented Object Notation) actually is — a brand-new data format designed to shrink token usage in LLM prompts.
This time the explanation was crisp, well-researched, and flawless.
That correction sparked a deeper reflection on hallucination, pattern recognition, and how these models reason about formats they have never seen.
Don’t Shoot the Trigger
Pulling the Right Trigger: Crafting Event-Driven Harmony

Some people still treat database triggers and stored procedures like relics from another era — tools that should have disappeared along with floppy disks. They say the database should stay dumb, while all the “smart” work should happen outside in fancy middleware or change-data-capture pipelines. But when you think about it, the database isn’t just a place to store facts — it’s where those facts are born. That makes it the best place to notice and react to change. Ignoring that is like keeping your hand on a hot stove and waiting for someone to remind you it’s burning.
We’ve been building a lightweight replication utility called Saga Replication — a small but powerful internal framework designed to enable event-driven architecture (EDA) for data synchronization across disparate platforms.
Instead of heavy log mining or byte-for-byte, table-to-table replication like traditional CDC tools, Saga Replication focuses on what really matters: the business event.
- It uses lightweight triggers to publish simple, meaningful messages into Oracle Transactional Event Queues (TEQ).
- TEQ listeners then pick up those messages, create the final payloads, and send them wherever they need to go — to Kafka, APIs, or other systems.
- The heavy work happens asynchronously, often on a read replica in a CQRS-style fan-out, so the main database stays fast and clean.
The result is a low-maintenance, configurable, and flexible replication framework that’s reliable enough for production yet simple enough to maintain without an army of DBAs and middleware admins.
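To make the listener side concrete, here is a rough Python sketch of what one of those TEQ listeners might do; dequeue_event(), the txn_view query, the connection details, and the downstream URL are all hypothetical stand-ins, not part of the actual framework.
import oracledb
import requests

def dequeue_event() -> dict:
    """Hypothetical stand-in: in practice this blocks on a TEQ dequeue
    (a JMS listener, an OKafka consumer, or python-oracledb AQ)."""
    raise NotImplementedError

def build_payload(replica, event: dict) -> dict:
    # Enrich the lightweight business event from a read replica (CQRS-style fan-out)
    with replica.cursor() as cur:
        cur.execute(
            "select cust_no, amount, ccy from txn_view where trn_ref_no = :1",
            [event["trn_ref_no"]],
        )
        cust_no, amount, ccy = cur.fetchone()
    return {
        "event": event["type"],
        "trn_ref_no": event["trn_ref_no"],
        "customer": cust_no,
        "amount": float(amount),
        "currency": ccy,
    }

def listen(replica):
    while True:
        event = dequeue_event()          # simple, meaningful message published by the trigger
        payload = build_payload(replica, event)
        requests.post("https://downstream.example.com/events", json=payload, timeout=10)

if __name__ == "__main__":
    # connection details are placeholders for your read replica
    replica = oracledb.connect(user="app_ro", password="app_ro_pwd", dsn="replica_high")
    listen(replica)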
But of course, mention “trigger-based replication” in an architecture forum, and you’ll see eyebrows rise — usually accompanied by the familiar phrases:
“Triggers are evil.”
“Triggers kill performance!”
“Never use triggers!”
Agentic AI: Learning to Trust Probabilistic Logic in a Deterministic World
The Hype and the Hope
Every few months, our industry finds a new buzzword to rally around. Right now, it’s Agentic AI — and if you’ve watched any tech conference or YouTube demo lately, you’ve probably seen it too:
you say “Book me a 3-day trip to Tokyo next weekend”, and magically, an AI assistant checks flights, books hotels, suggests ramen spots, and wraps it all up in a sleek itinerary.
And as an engineer, you probably thought:
“Cool demo, but isn’t that just a bunch of API calls triggered by natural language? Where’s the revolution?”
A rule-based system could do that faster, cheaper, and with fewer hallucinations. So again: where’s the intelligence?
You’re not alone.

🧠 The Engineer’s Skepticism: Where’s the Real Innovation?
Let’s be honest. Most so-called “AI agent demos” are just glorified function orchestration — RAG plus a function-calling engine with an LLM sitting somewhere in the middle.
Common examples:
- Travel Agent: checks flights, suggests an itinerary, books flights and hotels.
- Ops Agent: fetches logs, runs analysis, uses RAG to find a fix, and posts a summary on Slack.
- Retail Agent: chats with a customer about a refund, validates a cheaper price link, and issues a return.
We’ve been doing this with RPA, BPM, and microservice orchestration for years. The only difference? Now you can talk to it in English.
So yes — for those of us wired for rules and determinism, the hype feels misplaced.
⚙️ Where Agentic AI Actually Starts
The difference emerges when things don’t go as planned.
What if:
- The flight API fails?
- The hotel site returns JSON errors?
- Your corporate travel policy forbids non-refundable fares?
A rule-based system crashes or escalates here.
An agentic system can adapt:
- Retry the flight search with alternate dates.
- Infer that “weekend” likely means Friday–Sunday.
- Check policy from a knowledge base before booking.
- Ask follow-up questions like: “Would you like me to use reward points instead?”
It’s not about replacing deterministic rules — it’s about making orchestration resilient in the face of uncertainty.
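As a tiny illustration of that resilience, here is a hedged Python sketch of the retry-with-alternate-dates idea; search_flights is a hypothetical API client, not a real service.
from datetime import date, timedelta

def search_with_fallback(search_flights, depart: date, attempts: int = 3):
    # Try the requested date first, then slide forward instead of crashing or escalating
    for offset in range(attempts):
        candidate = depart + timedelta(days=offset)
        try:
            results = search_flights(candidate)   # hypothetical flight-search call
            if results:
                return candidate, results
        except Exception as exc:
            print(f"search failed for {candidate}: {exc}")   # log and adapt, don't abort
    return None, []                               # nothing found: hand back to the user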

🔍 Deconstructing the “Agent” — What’s Really Going On
Most “agents” today are really just this:
- LLM as interpreter — Parses natural language into structured intents.
- Function calling and tool orchestration — Invokes APIs to complete the tasks.
- State tracking — Maintains a small context window or memory.
In other words: an event-driven orchestrator that happens to use English as its DSL.
Limitations
- Fragile prompts.
- Non-deterministic outputs.
- Context loss over longer tasks.
It’s easy to see why developers call this “glorified automation.”
🧩 Why It’s Still Interesting — The Hidden Hope
Because beneath the orchestration lies autonomy — and that’s new.
The leap comes from combining:
- Language reasoning — inferring goals.
- Environment awareness — observing context.
- Self-correcting loops — retrying and refining actions.
Together, they move us from automation to adaptive systems.
Early Signs
- Multi-Agent Systems (A2A) – digital teams of cooperating agents.
- Tool Access Frameworks – structured planning via LangGraph, CrewAI, smolagents.
- MCPs – standard protocols for models to safely access enterprise tools.
It’s not that agents are smart — it’s that they’re autonomous participants in workflows.
🏗️ Emerging Patterns of Agentic Architecture

🧠 LLM-as-Orchestrator
The model decomposes goals into subtasks and routes to the right tools.
goal = "diagnose database issue"
plan = llm.plan(goal)
for step in plan:
result = execute(step)
llm.reflect(result)
Clear schemas, permissions, and safe defaults define what the agent can do — and nothing more.
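A minimal Python sketch of that idea: an explicit tool registry acts as the allow-list, and anything the model asks for outside it is refused; llm_plan and llm_reflect are placeholders for whatever model API you use.
from typing import Callable

TOOLS: dict[str, Callable[..., str]] = {}

def tool(name: str):
    """Register a function as an allowed tool; anything unregistered is rejected."""
    def wrap(fn):
        TOOLS[name] = fn
        return fn
    return wrap

@tool("fetch_db_metrics")
def fetch_db_metrics(instance: str) -> str:
    # Stubbed tool body for illustration
    return f"cpu=82%, sessions=431 on {instance}"

def execute(step: dict) -> str:
    # Safe default: refuse anything outside the registry
    if step["tool"] not in TOOLS:
        return f"refused: {step['tool']} is not an allowed tool"
    return TOOLS[step["tool"]](**step.get("args", {}))

def run(goal: str, llm_plan, llm_reflect):
    # Plan/execute/reflect loop, mirroring the pseudo-code above
    for step in llm_plan(goal):                   # e.g. [{"tool": "fetch_db_metrics", "args": {...}}]
        observation = execute(step)
        llm_reflect(step, observation)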
💾 Memory + Context Store
Vector stores, caches, and short-term memories preserve context between steps.
👩‍💻 Human-in-the-Loop
Confidence thresholds trigger human review — balancing autonomy with accountability.
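Here is a small, illustrative Python sketch of such a gate; the threshold value and the queue/audit structures are assumptions, and perform() stands in for your deterministic executor.
CONFIDENCE_THRESHOLD = 0.85   # illustrative value; tune per action type

def perform(action: dict) -> str:
    # Stand-in for the deterministic executor that does the real work
    return f"executed {action['name']}"

def dispatch(action: dict, confidence: float, audit_log: list, review_queue: list) -> str:
    # Every decision is logged for traceability, approved or not
    audit_log.append({"action": action, "confidence": confidence})
    if confidence < CONFIDENCE_THRESHOLD:
        review_queue.append(action)   # a human approves or rejects later
        return "pending_human_review"
    return perform(action)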
🏦 Enterprise Reality — Security, Privacy, and Scale
Enterprises love automation, but they fear unpredictability.
Let’s take a real-world example — a bank. Governance and Compliance aren’t optional here; they’re the bloodstream of the system.
You don’t want a random LLM waking up one morning and deciding:
“Hey, this payment looks fine — let me approve it!”
or
“Hmm, that forex transaction seems suspicious — I’ll just block it.”
That’s not intelligence; that’s chaos.
Banks rely on deterministic flows for a reason. These systems are built with layers of security, auditing, compliance, sanction screening, limit validation, concurrency controls — all the invisible scaffolding that makes the enterprise world safe and accountable.
So no, we don’t want some “AI intern” micromanaging mission-critical processes.
We still want the rule engines to do what they do best — execute with precision, traceability, and zero drama.
For Agentic AI to succeed, we need controlled autonomy:
- Governance – operate under strict policy scopes.
- Traceability – log every decision and retry.
- Sandboxing – run agents in isolated, permissioned environments.
- Fail-safes – deterministic fallbacks when confidence drops.
- Compliance awareness – policy reasoning, not policy ignorance.
🔧 The Engineer’s Role — Bridging Two Worlds
Here’s the truth: all of this still runs on deterministic plumbing.
We still build:
- APIs, data models, and integration layers.
- Security and monitoring frameworks.
- Validation and governance boundaries.
Agentic AI doesn’t replace engineering — it layers probabilistic reasoning on deterministic infrastructure.
Think of it as giving software an intent interpreter — but we still define the rails, guardrails, and safety laws it runs on.
🚀 Towards the Future — Hope Beyond Hype
Where this is heading:
- Hybrid reasoning models: combining deterministic logic with probabilistic loops.
- Trust layering: agents that explain their choices.
- Interoperability standards (MCP): multi-agent collaboration safely across systems.
- Digital co-workers: AI systems as teammates, not tools.
We’re moving toward a continuum — between structured rules and adaptive reasoning.
🧭 Closing Reflection
Agentic AI is neither sci-fi nor snake oil. It’s the next abstraction in computing — a shift from coding instructions to expressing intent.
The deterministic engineer in me still craves structure.
The architect in me, though, knows that intent-driven systems are inevitable.
So yes, most demos today are trivial.
But tomorrow, when agents start solving problems without being told every step — it’ll still be the deterministic rails we built that make autonomy trustworthy.
Written by Ranjith Vijayan
Architect • Technologist • Skeptic who still believes in well-defined APIs
TL;DR:
DBMS_CLOUD.PUT_OBJECT can upload binaries, but it doesn’t let you attach user-defined metadata. To set metadata (the opc-meta-* headers), call the Object Storage REST API directly from PL/SQL using DBMS_CLOUD.SEND_REQUEST and authenticate with an OCI API Signing Key credential (not an Auth Token).
Why this trick?
If you need to stamp files not only with business context—say customerId=C-1029 or docType=invoice—but also with technical attributes like content-type=application/pdf, you must send the user-defined metadata as opc-meta-* HTTP headers (and content-type as a standard header) on the PUT request. That’s not supported by DBMS_CLOUD.PUT_OBJECT, so we invoke the REST endpoint ourselves with DBMS_CLOUD.SEND_REQUEST.
Heads-up on credentials: For SEND_REQUEST, use an API key (RSA key + fingerprint) credential. Auth Tokens are great for other DBMS_CLOUD operations, but they don’t work reliably for SEND_REQUEST calls to the Object Storage REST endpoints.
Step 1 — Create an API Key credential in ADB
Create a credential backed by an OCI API Signing Key (user OCID, tenancy OCID, PEM private key, and fingerprint).
-- API Key credential (DBMS_CLOUD.CREATE_CREDENTIAL)
BEGIN
DBMS_CLOUD.CREATE_CREDENTIAL(
credential_name => 'obj_store_cred',
user_ocid => 'ocid1.user.oc1..aaaaaaaaa...',
tenancy_ocid => 'ocid1.tenancy.oc1..aaaaaaaa...',
private_key => '-----BEGIN PRIVATE KEY-----
YOUR PEM=
-----END PRIVATE KEY-----
OCI_API_KEY',
fingerprint => 'yo:ur:fi:ng:er:pr:in:t...'
);
END;
/
Tip: Make sure your IAM user/group has a policy that allows put/write on the target bucket/namespace/region.
Step 2 — Upload with metadata via DBMS_CLOUD.SEND_REQUEST
Build the HTTP headers (including your opc-meta-* pairs) and invoke PUT to the regional Object Storage endpoint.
DECLARE
l_resp DBMS_CLOUD_TYPES.RESP;
l_hdrs JSON_OBJECT_T := JSON_OBJECT_T();
l_blob BLOB;
BEGIN
-- Get your file as BLOB (example: from a table)
SELECT file_content INTO l_blob FROM your_table WHERE id = 1;
-- Required content type + user-defined metadata
l_hdrs.put('content-type', 'application/pdf');
l_hdrs.put('opc-meta-customerId', 'C-1029');
l_hdrs.put('opc-meta-docType', 'invoice');
-- Upload with metadata
l_resp := DBMS_CLOUD.SEND_REQUEST(
credential_name => 'obj_store_cred',
uri => 'https://objectstorage.ap-seoul-1.oraclecloud.com/n/<your_namespace>/b/<your_bucket>/o/demo/test.pdf',
method => DBMS_CLOUD.METHOD_PUT,
headers => l_hdrs.to_clob, -- pass JSON headers as CLOB
body => l_blob -- your file BLOB
);
DBMS_OUTPUT.PUT_LINE('HTTP status: ' || l_resp.status_code);
END;
/
That’s it—your object is stored with metadata like opc-meta-customerId=C-1029.
Step 3 — Verify the metadata with a HEAD request
You can issue a HEAD request and inspect the response headers.
DECLARE
l_resp DBMS_CLOUD_TYPES.RESP;
BEGIN
l_resp := DBMS_CLOUD.SEND_REQUEST(
credential_name => 'obj_store_cred',
uri => 'https://objectstorage.ap-seoul-1.oraclecloud.com/n/<your_namespace>/b/<your_bucket>/o/demo/test.pdf',
method => DBMS_CLOUD.METHOD_HEAD
);
DBMS_OUTPUT.PUT_LINE(l_resp.headers); -- contains the opc-meta-* pairs
END;
/
Troubleshooting
- 401/403: Confirm IAM policy, correct region in the endpoint, and the namespace in the URI.
- Credential name mismatch: Use the same credential_name you created.
- Auth Token vs API Key: If you used an Auth Token credential and get auth errors with SEND_REQUEST, switch to an API Signing Key credential.
- Content-Type: Set a proper content-type for the uploaded object so clients recognize it correctly.
References
- Oracle docs — User-defined metadata uses the opc-meta- prefix: https://docs.public.content.oci.oraclecloud.com/en-us/iaas/compute-cloud-at-customer/topics/object/managing-storage-objects.htm
- Oracle docs — When to use Auth Tokens vs. API Signing Keys in DBMS_CLOUD: https://docs.oracle.com/en-us/iaas/autonomous-database/doc/dbms_cloud-access-management.html
- Oracle blog — DBMS_CLOUD.SEND_REQUEST basics (headers/body as JSON, returns DBMS_CLOUD_TYPES.RESP): https://blogs.oracle.com/developers/post/reliably-requesting-rest-results-right-from-your-sql-scripts
Last updated: 2025-08-12
The Art of (C)Lean Code in the Era of Vibe-Coding
How my AI-built spaghetti turned into a clean kitchen (but not without some broken plates)

The Beginning — When Code Felt Like Handcraft
Not too long ago, coding felt like woodworking.
You’d measure, cut, sand, polish. Everything took time.
Then came vibe-coding.
Now you just tell your AI what you want —
“Build me a dashboard with charts, export to Excel, and a dark mode” —
and it does it in minutes.
It’s intoxicating. You feel like a magician.
But magic always comes with fine print.
My First Vibe-Coding Mess
I was building a tool using vibe-coding.
I didn’t touch the code much — I just described errors from the logs back to the AI.
Every time something broke, I’d paste the stack trace, and AI would “fix” it.
It kept working… kind of.
But inside, the code was becoming spaghetti —
pieces added over pieces, quick fixes on top of quick fixes.
Some functions were 200 lines long, doing five unrelated things.
Then one day, I decided: Let’s clean this up.
I asked AI to refactor the code to be shorter, modular, and easier to read.
The result?
Beautiful, lean, elegant code.
…Except it also broke three important features.
That’s when I learned an important truth:
Clean code isn’t just about looking clean — it’s about keeping things working while cleaning.
The Pitfalls of Vibe-Coding
From that day, I started spotting patterns in what goes wrong when you rely on AI without guardrails:
- The Illusion of “It Works”: Just because your app runs doesn’t mean it’s solid. AI can fix one bug while planting seeds for three more.
- Invisible Complexity: AI might import six packages for a one-line feature. Like buying a screwdriver and getting a forklift delivered.
- Maze Code: Without naming and structure discipline, you get a codebase no one wants to touch.
- Overfitting Today’s Problem: AI often solves the exact issue you describe — but the next small change might require rewriting everything.
How I Turned Spaghetti into Clean Pasta
Here’s what I started doing after my “broken features” incident:
1. Refactor in Small Steps
Instead of one big “clean-up”, break it into small safe changes.
Run tests after each change.
2. Skim, Don’t Skip
Even if you don’t read every line, scan enough to see:
- Hidden dependencies
- Redundant logic
- Odd variable names
3. Add Guardrails Early
Use:
- Linters for style
- Automated tests for core features
- Static analysis to catch silly mistakes
4. Save Good Prompts
If you find a prompt that gives clean, modular code — reuse it.
Bad prompts often produce messy code.
5. Document Decisions
Write why you solved it a certain way.
Future-you will thank you when debugging at 2 a.m.
A Practical Example — My Spaghetti Incident (Python)
Messy AI-Generated Python (Evolved Over Time)
import os, sys, json, datetime, time
def processData(stuff, filePath=None):
    try:
        if filePath == None:
            filePath = "out.json"
        res = []
        for i in range(len(stuff)):
            if "date" in stuff[i]:
                try:
                    d = stuff[i]["date"]
                    if d != None and d != "" and not d.startswith(" "):
                        try:
                            stuff[i]["date"] = datetime.datetime.strptime(d, "%Y-%m-%d").strftime("%d/%m/%Y")
                        except Exception as e:
                            print("err date", e)
                    else:
                        pass
                except:
                    pass
            res.append(stuff[i])
        try:
            j = json.dumps(res)
            with open(filePath, "w+") as f:
                f.write(j)
        except:
            print("write err")
    except:
        print("error")
    time.sleep(0.2)
    return "ok"
Why it’s so messy:
- Multiple unused imports (os, sys; time imported just to sleep for no reason)
- Deeply nested try/except swallowing real issues
- Redundant None and empty string checks
- Weird variable names (stuff, i, d, j)
- Logic scattered and repetitive
- Random sleep at the end, serving no purpose
Clean 4-Liner Version
from datetime import datetime
import json

def process_records(records, output_file="out.json"):
    for r in records:
        if r.get("date"): r["date"] = datetime.strptime(r["date"], "%Y-%m-%d").strftime("%d/%m/%Y")
    json.dump(records, open(output_file, "w"), indent=2)
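And here is the kind of tiny safety net that would have caught my broken-features regression: a minimal pytest sketch for the process_records function above (the test name and sample records are mine, not from the original project).
import json
from pathlib import Path

def test_process_records_formats_dates(tmp_path: Path):
    out = tmp_path / "out.json"
    records = [{"date": "2024-01-31"}, {"name": "no date here"}]
    process_records(records, output_file=str(out))   # the cleaned-up function above
    saved = json.loads(out.read_text())
    assert saved[0]["date"] == "31/01/2024"           # date got reformatted
    assert saved[1] == {"name": "no date here"}       # untouched record survives
Run it after every small refactor step and the spaghetti stays edible.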
Another “Seen in the Wild” — JavaScript
Messy AI-Generated JavaScript (Evolved Over Time)
function calcTotals(arr, output){
    var total = 0;
    var t = 0;
    if (!arr) { console.log("no data"); return; }
    for (var i = 0; i < arr.length; i++) {
        var itm = arr[i];
        if (itm.p != null && itm.q != null) {
            try {
                var p = parseInt(itm.p);
                var q = parseInt(itm.q);
                if (!isNaN(p) && !isNaN(q)) {
                    t = p * q;
                    total = total + t;
                } else {
                    console.log("bad data at index", i);
                }
            } catch(e) {
                console.log("err", e)
            }
        } else {
            console.log("missing fields");
        }
    }
    console.log("total is", total);
    if (output) {
        try {
            require('fs').writeFileSync(output, JSON.stringify({total: total}));
        } catch (err) {
            console.log("write fail");
        }
    }
    return total;
}
Why it’s a mess:
- Variable naming chaos (total, t, itm, p, q)
- Multiple redundant null and NaN checks
- Logging everywhere instead of proper error handling
- Old-school for loop where a reduce would be cleaner
- Random optional file-writing mixed into calculation logic
- Overly verbose try/catch for simple operations
Clean 3-Liner Version
function calculateTotal(items) {
    return items.reduce((sum, { price, quantity }) => sum + (Number(price) * Number(quantity) || 0), 0);
}

Red Flags to Watch
- Magic numbers or strings everywhere
- Too many dependencies for small features
- Overcomplicated solutions to simple problems
- Mixing coding styles without reason
- No security checks
The Mindset Shift
Vibe-coding is powerful, but the AI is not your senior architect.
It’s a fast builder who never sweeps the floor.
Your job?
Make sure the house it builds won’t collapse when someone slams the door.
The Ending — Clean, Lean, and Battle-Tested
After my spaghetti incident, I learned:
- Code can be both fast to write and safe to maintain
- Refactoring is not just “making it pretty” — it’s making it future-proof
- Tests are your safety net when cleaning up AI’s work
Vibe-coding is here to stay.
But clean, lean code is what will keep you from living in a haunted house.
Oracle APEX’s GenAI Dynamic Actions make it effortless to drop an AI chat widget into your app.
The catch? As of now they’re hard-wired only for certain API providers such as OpenAI, Cohere, and OCI Gen AI.
If your favorite model—say Anthropic’s Claude—uses a different JSON format, it won’t plug in directly.
I ran into this exact roadblock… and found a workaround.
With one small PL/SQL proxy layer, you can keep APEX’s low-code experience and talk to any API — all without signing up for third-party routing services or sharing your API keys outside your control.
The Problem
- APEX GenAI DA sends payloads to the /v1/chat/completions endpoint in OpenAI format.
- Claude (and most non-OpenAI models) expect a different endpoint and JSON schema (/v1/messages for Anthropic).
- No setting exists to override the request/response format in the low-code GenAI DA.
- Services like OpenRouter or Together AI can wrap Claude in an OpenAI-compatible API — but they require you to use their API keys and bill through them.
I wanted full control over my keys and usage.
The Workaround
Instead of changing the APEX widget or paying for a middleman, make the API look like OpenAI yourself.
We’ll:
- Create a PL/SQL package that:
  - Receives OpenAI-style requests from the DA.
  - Transforms them to the target model’s request format.
  - Sends them using APEX_WEB_SERVICE with your own API key.
  - Transforms the model’s response back into OpenAI’s shape.
- Expose that package through ORDS at /v1/chat/completions.
- Point the APEX GenAI DA to your ORDS endpoint instead of api.openai.com.
How It Works
Before
APEX Chat Widget -> OpenAI API -> GPT model
After
APEX Chat Widget -> ORDS Proxy -> Target AI API
The DA still thinks it’s talking to OpenAI, but the proxy does the translation behind the scenes — with zero third-party dependency.
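To make the translation concrete, here is a rough sketch of the two mappings the proxy performs, written in Python purely for readability (the PL/SQL package does the same thing with JSON_OBJECT_T); the model name and max_tokens default are assumptions you would tune.
def openai_to_anthropic(req: dict) -> dict:
    # OpenAI keeps the system prompt inside messages; Anthropic wants it as a top-level field
    system = " ".join(m["content"] for m in req.get("messages", []) if m["role"] == "system")
    body = {
        "model": "claude-sonnet-4-20250514",        # your target model
        "max_tokens": req.get("max_tokens", 1024),   # Anthropic requires max_tokens
        "messages": [m for m in req.get("messages", []) if m["role"] != "system"],
    }
    if system:
        body["system"] = system
    return body

def anthropic_to_openai(resp: dict) -> dict:
    # Flatten Anthropic's content blocks back into a single chat.completion message
    text = "".join(block["text"] for block in resp.get("content", []) if block.get("type") == "text")
    return {
        "object": "chat.completion",
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": text},
            "finish_reason": "stop",
        }],
    }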
Architecture Diagram (Flowchart)
flowchart LR
A[APEX GenAI Chat Widget] --> B[ORDS endpoint /v1/chat/completions]
B --> C[PLSQL proxy JSON transform]
C --> D[Target AI API]
D --> C
C --> B
B --> A
Architecture Diagram (Sequence)
sequenceDiagram
participant A as APEX GenAI Chat Widget
participant B as ORDS endpoint (/v1/chat/completions)
participant C as PLSQL proxy JSON transform
participant D as Target AI API (Claude / OCI GenAI / Gemini)
A->>B: OpenAI-style request
B->>C: Forward request
C->>D: Transform & call provider
D-->>C: Provider response
C-->>B: Convert to OpenAI format
B-->>A: Chat completion response
Key Code
You’ll find the full working package and ORDS handler in my GitHub repo (link below).
https://github.com/cvranjith/apex-claude-proxy
Highlights:
- Native JSON parsing: Uses JSON_OBJECT_T / JSON_ARRAY_T instead of APEX_JSON for cleaner, standard parsing.
- APEX_WEB_SERVICE: Handles outbound HTTPS with your APEX credentials; no UTL_HTTP wallet headaches.
- Configurable model & tokens: Pass max_output_tokens, temperature, etc., through your proxy.
Example call in the proxy:
APEX_WEB_SERVICE.ADD_REQUEST_HEADER('x-api-key', l_api_key);
APEX_WEB_SERVICE.ADD_REQUEST_HEADER('anthropic-version','2023-06-01');
l_resp_clob := APEX_WEB_SERVICE.MAKE_REST_REQUEST(
p_url => 'https://api.anthropic.com/v1/messages',
p_http_method => 'POST',
p_body => l_body.to_clob()
);
Choosing the Right Claude Model
For general-purpose chat + content creation with JSON analysis:
- claude-opus-4-1-20250805 – highest quality, deepest reasoning.
- claude-sonnet-4-20250514 – great balance of quality and speed.
- claude-3-7-sonnet-20250219 – solid hybrid reasoning, lower cost.
ORDS and APEX Setup
For this integration to work with the APEX GenAI chat widget, your ORDS API must have a POST handler with a URI template ending in chat/completions:
ORDS Definition Example
ORDS.DEFINE_TEMPLATE(
p_module_name => 'claude-proxy',
p_pattern => 'chat/completions',
p_priority => 0,
p_etag_type => 'HASH',
p_etag_query => NULL,
p_comments => NULL);
ORDS.DEFINE_HANDLER(
p_module_name => 'claude-proxy',
p_pattern => 'chat/completions',
p_method => 'POST',
p_source_type => 'plsql/block',
p_mimes_allowed => NULL,
p_comments => NULL,
p_source =>
' DECLARE
l_body CLOB := :body_text;
l_out CLOB;
BEGIN
l_out := claude_proxy.chat_completions(l_body);
OWA_UTIL.mime_header(''application/json'', TRUE);
HTP.prn(l_out);
END;
');
APEX Generative AI Service Definition
In APEX, go to:
Workspace Utilities → Generative AI Services → Create New Service
Choose “OpenAI” as the service provider.
- URL: Set it to your ORDS handler without the /chat/completions suffix. Example: https://xxxx.adb.ap-singapore-1.oraclecloudapps.com/ords/xxx/claude-proxy/v1
- Additional Attributes: Add any attributes your target model requires. For example, many Claude models require max_tokens.
- AI Model: Declare the model ID you want to use, e.g. claude-sonnet-4-20250514 from the list above.
Works with OCI Generative AI Agents Too
Oracle’s blog Integrating OCI Generative AI Agents with Oracle APEX Apps for RAG-powered Conversational Experience demonstrates a different approach:
They use low-level REST API calls directly to OCI Generative AI and render messages in a classic report to mimic a chat experience.
That works well, but it’s still a custom UI — you build and maintain the conversation rendering logic yourself.
With this proxy method, you can:
- Keep the APEX GenAI Dynamic Action chat widget for a true low-code UI.
- Point it to your ORDS proxy.
- Have the proxy map the OpenAI-style request to the OCI Generative AI API format (with OCI auth, modelId, and input).
- Map the OCI response back into the OpenAI chat/completions shape.
You get:
- The same RAG-powered intelligence from OCI Generative AI.
- Zero custom UI code.
- Full control over authentication and model switching.
Why This is Powerful
- No UI rewrites – keep using the low-code chat widget.
- Model agnostic – works for Claude, OCI GenAI, Gemini, Mistral, or any API.
- Full control – you never hand over your API key to a third-party router.
- Central control – one place to add logging, prompt tweaks, or safety filters.
Write History with the Future in Mind: Modernizing ACTB_HISTORY for the Cloud Era with Hybrid Partitioning
Banking systems like Oracle Flexcube often accumulate massive volumes of transactional history due to regulatory data retention requirements. Core banking tables such as ACTB_HISTORY and ACTB_ACCBAL_HISTORY grow continuously, since purging old records isn’t an option when auditors or customers may request decade-old transactions. Over time, this ever-growing historical data becomes an expensive burden — it inflates backup storage, slows down restore/recovery, and consumes premium database storage while providing little day-to-day business value.
Introduction
Oracle Database 19c introduced a game-changing feature to tackle this challenge: Hybrid Partitioned Tables (HPT). This lets you split a single table’s partitions between regular internal storage and external files, including cheap cloud storage. In practice: keep recent “hot” data in the database for fast access, while offloading cold historical partitions to low-cost storage like Oracle Cloud Infrastructure (OCI) Object Storage — without deleting or losing access to any data. The goal here is to show how HPT optimizes data management in banking applications, preserving full query access to historical data at a fraction of the cost and complexity of traditional archiving.
Challenges in Managing Historical Banking Data
Banks must retain years of transactional data (often 7-10+ years) to meet regulatory and auditing mandates. In practice, that means tables like ACTB_HISTORY continuously accumulate records from daily core banking operations. Key challenges:
- Exploding data volumes: tables can reach billions of rows, making routine maintenance and indexing difficult.
- Backup and recovery overhead: backups get longer and heavier; cloning to non-prod becomes cumbersome.
- Performance impact: old records are rarely accessed online, but their presence can still impact performance.
- Regulatory constraints: purge isn’t an option; data must remain queryable on demand for audits and inquiries.
Traditional archiving (export + purge) complicates retrieval when auditors need data. HPT changes that.
What Are Hybrid Partitioned Tables?
Hybrid Partitioned Tables extend Oracle’s partitioning by allowing some partitions to reside in the database (internal) and others to reside outside (external). In one table, you get recent partitions as normal segments for high-performance OLTP, while older partitions are pointers to files (CSV/Data Pump dumps) on inexpensive storage tiers. Oracle exposes both as a single, unified table to queries.
Key aspects:
- Partitioning strategies: RANGE or LIST (composite also possible), e.g., partition by year.
- External formats: text/CSV via ORACLE_LOADER, binary dumps via ORACLE_DATAPUMP, and others.
- Read-only external partitions: ideal for archive data; DML is blocked (ORA-14466).
- Seamless queries: SQL spans internal and external partitions transparently.
- At least one internal partition is required as an anchor.
Behind the scenes, Oracle stores metadata (file paths, etc.) in the data dictionary. External files can live on-prem filesystems (NFS/ACFS) or cloud object storage (e.g., OCI).
Cost Optimization with OCI Object Storage
Moving cold data to OCI Object Storage yields dramatic savings: keeping data in active DB storage is far pricier than object storage. Most queries (≈80%) hit only recent “hot” data, so the performance trade-off for archived partitions is acceptable, especially for reporting/audit workloads. Backups shrink, clones get faster, and you can even apply lifecycle management (e.g., archive tier) to rarely accessed files.
On-prem? You can still leverage hybrid partitions with local filesystems; many banks adopt a hybrid-cloud approach where archives sit in OCI while OLTP remains on-prem. DBMS_CLOUD can bridge access securely.
Implementation in a Banking Scenario
Consider ACTB_HISTORY (transaction history). Use yearly partitions: keep recent years internal; offload older years to OCI as Data Pump files.
CREATE TABLE ACTB_HISTORY (
AC_ENTRY_SR_NO NUMBER,
TRN_REF_NO VARCHAR2(20),
AC_BRANCH VARCHAR2(3),
AC_NO VARCHAR2(20),
TRN_DT DATE,
FCY_AMOUNT NUMBER(22,3),
-- ... other columns ...
CONSTRAINT PK_ACTB_HISTORY PRIMARY KEY (AC_ENTRY_SR_NO) ENABLE
)
PARTITION BY RANGE (TRN_DT) (
PARTITION p_2021 VALUES LESS THAN (DATE '2022-01-01'),
PARTITION p_2022 VALUES LESS THAN (DATE '2023-01-01'),
PARTITION p_2023 VALUES LESS THAN (DATE '2024-01-01'),
PARTITION p_2024 VALUES LESS THAN (DATE '2025-01-01')
)
-- Hybrid concept: keep recent partitions internal; attach older ones as external locations.
-- Actual DDL varies by version; treat below as a pattern to illustrate:
/*
ALTER TABLE ACTB_HISTORY MODIFY PARTITION p_2021
EXTERNAL LOCATION ('https://objectstorage.../actb_history_2021.dmp') ACCESS PARAMETERS (ORACLE_DATAPUMP);
ALTER TABLE ACTB_HISTORY MODIFY PARTITION p_2022
EXTERNAL LOCATION ('https://objectstorage.../actb_history_2022.dmp') ACCESS PARAMETERS (ORACLE_DATAPUMP);
*/
This keeps 2023–2024 internal (hot) and 2021–2022 external (cold). Applications keep using the same table — Oracle fetches from the right partition automatically. Add new yearly partitions as time advances; script conversion of older ones to external.
Migrating existing data: Export the target partition to a Data Pump file in OCI, then ALTER TABLE ... MODIFY PARTITION ... EXTERNAL to attach it. After validation, drop the internal segment to free space.
Architecture Overview
Below is a simple Mermaid diagram of the reference architecture.
flowchart LR
subgraph App["Core Banking App"]
APP[Flexcube / OLTP]
end
subgraph DB["Oracle Database (19c/21c/23c)"]
CORE[(DB Engine)]
IP["Internal Partitions (hot: last 1-2 years)"]
EP["External Partitions (cold: older years)"]
CORE --> IP
CORE --> EP
end
subgraph OCI["OCI Services"]
OS["Object Storage (Data Pump/CSV)"]
ADW["Reporting / ADW or Read-Only DB"]
end
APP -->|SQL| CORE
EP --- OS
ADW <-->|External tables / HPT| OS
Data flows: hot queries hit internal partitions; historical queries stream from Object Storage via external partitions. Reporting systems can query the same historical files without duplicating data.
Benefits, Best Practices, and Key Use Cases
Benefits
- Cost savings (often 50–80% for historical storage).
- Compliance with everything still online and queryable.
- Lean prod DB: shorter backups, faster maintenance.
- Faster migrations: move history ahead of cutover; update pointers later.
- Unified access control: no separate archive DB.
- Tier flexibility: on-prem FS, OCI Standard, Archive tiers.
- Enhanced reporting: share historical data with analytics directly from object storage.
Best practices
- Partition by time (year/quarter); keep current partitions internal.
- Prefer Data Pump (ORACLE_DATAPUMP) for efficient, exact, compressed archives.
- Automate archival (ILM/ADO jobs) to externalize old partitions on schedule.
- Monitor external access; adjust how many years remain internal.
- Secure credentials and files (DBMS_CLOUD credentials, OCI ACLs, encryption).
Conclusion
For data-intensive banking, Hybrid Partitioned Tables provide an elegant, cost-effective way to turn historical data into an online archive — hot data stays fast, cold data gets cheaper, and everything remains a single SQL away. It aligns with banking IT goals: compliance, cost control, and reduced operational complexity — without sacrificing accessibility.
In the monolithic era of software, picture 8 developers working inside a large shared bungalow. This bungalow represents a monolithic application - one big, unified codebase where everyone operates under the same roof.

Introduction
There’s a common misconception that moving from VMs to containers automatically reduces resource requirements. Some also assume that microservices are inherently small, leading to unrealistic expectations about infrastructure sizing.
However, containerization is not about reducing resource usage—it’s about flexibility, automation, cloud adoption, and efficient resource utilization. Similarly, “micro” in microservices refers to modularity, not minimal resource footprint. Let’s clear up these misunderstandings using a simple house metaphor and discuss how to approach sizing effectively.
Misconception 1: “Deploying to Containers Means Smaller Sizing”
Imagine you have a house representing an application running on a Virtual Machine (VM). Now, you move that house into a containerized environment. Does the house suddenly shrink? No! The number of people inside the house remains the same, and they still need the same space, resources, and utilities.

📌 Key takeaway: If an app requires X amount of resources on a VM, it will still require X on a container. Containerization does not magically reduce CPU, RAM, or storage needs. The workload defines the resource needs, not the deployment model.
🔹 What does containerization offer instead?
- Portability – Move workloads seamlessly across environments, whether on-premises or cloud.
- Automation – Deploy, scale, and manage applications efficiently with orchestration tools like Kubernetes.
- Cloud-native benefits – Leverage cloud elasticity, managed services, and optimized cost strategies.
- Faster deployments & updates – Containers simplify CI/CD pipelines, reducing downtime and increasing agility.
- Resource efficiency – While containerization doesn’t shrink app needs, it allows for better resource pooling and dynamic scaling.
🚀 What can be optimized? While the base resource need remains the same, autoscaling, workload profiling, and optimized instance selection can help manage infrastructure costs over time. Instead of focusing on reducing size, teams should focus on better resource utilization using Kubernetes features like Horizontal Pod Autoscaler (HPA) and Cluster Autoscaler.
Misconception 2: “Microservices Means Tiny Apps”
The word “micro” in microservices refers to modularity, not size. A microservice is independent, meaning it needs its own resources, runtime, and supporting components.

Using the house metaphor again, consider two approaches when moving from VMs to containers:
- Keeping the entire house as-is inside a container → This is similar to running a monolithic application in a container, where all functionalities remain tightly coupled.
- Breaking the house into smaller units → This is the microservices approach. Each smaller house (service) is independent, meaning it needs its own kitchen, bathrooms, and utilities. While this approach enables better scalability and flexibility, it also adds overhead because each unit must function autonomously.
📌 Key takeaway: Microservices are not necessarily “tiny.” While they offer flexibility, independent scaling, and fault isolation, each service adds weight to the overall system. A service that seems small in terms of functionality may still have a significant runtime footprint due to dependencies, API communication, and state management needs.
🚀 Additional Considerations:
- Operational Complexity – Managing many independent services increases deployment, monitoring, and troubleshooting efforts.
- Communication Overhead – Unlike monoliths, microservices need inter-service communication, often adding latency and requiring API gateways.
- Infrastructure Cost – Each service has its own resource allocation, logging, and security configurations, which can lead to higher cumulative costs if not managed well.
Disclaimer:
The views expressed in this document are based on personal analysis and industry insights. These points address common questions from sales discussions regarding database choices for banking applications. It explores why Oracle Database is often regarded as the preferred choice for mission-critical banking workloads.
Why Oracle Database is the Unmatched Choice for Banking Applications
In the world of banking and financial services, where transaction integrity, security, scalability, and availability are paramount, the choice of database technology is critical. While various open-source and general-purpose databases are available, Oracle Database stands as the undisputed leader due to its unparalleled robustness, advanced features, and industry-wide trust.
Here’s why Oracle Database continues to be the gold standard for banking applications and why leading financial institutions rely on it for mission-critical workloads.
1. Oracle: The Pioneer & The Most Advanced Database
Oracle was the first commercial relational database and has evolved continuously for over four decades, setting the benchmark for performance, reliability, and innovation. It has adapted to modern needs with AI-driven features, converged database capabilities, and best-in-class security, making it both the oldest and the most advanced database in the world today.
2. Converged Database: One Database for All Workloads
Unlike general-purpose databases that force you to run different database types for different workloads, Oracle provides a truly converged database supporting:
- Relational (SQL)
- JSON Document Store
- Graph DB
- Blockchain
- Spatial & Geospatial
- Time-Series Data
- Vector Search & AI Data Processing
This eliminates the need for multiple specialized databases, simplifying architecture, reducing operational overhead, and enhancing security.
3. Cutting-Edge AI & ML Features
Oracle Database is AI-ready, enabling advanced data intelligence directly within the database. Key capabilities include:
- In-database machine learning (ML) that allows model training and inferencing without data movement.
- Support for ONNX models, reducing latency by avoiding network travel to external LLMs.
- Vector embeddings and indexing for AI-powered search and fraud detection, leveraging indexing strategies such as IVF Flat and HNSW for fast similarity search.
- AutoML and built-in ML algorithms, streamlining AI workloads without needing external pipelines.
Model Context Protocol (MCP) - Standardizing AI Tool Integration
Why Standardization Matters
Imagine if every cloud provider required a different container format—Docker images wouldn’t work on OCI, AWS, Azure, or Google Cloud. Instead, we agreed on OCI (the Open Container Initiative, not Oracle Cloud!), ensuring containers run anywhere regardless of the tool used (Docker, Podman, containerd). This standardization unlocked massive innovation in DevOps.
AI is at a similar crossroads today. AI models need to interact with external tools—APIs, databases, or file systems. But there’s no universal way to connect them. Every integration is bespoke, meaning developers constantly reinvent how AI agents use tools. Enter Model Context Protocol (MCP)—a standardized, open-source way for AI models to discover and interact with external tools and data sources.
What is MCP?
MCP, developed by Anthropic, provides a universal API for AI to connect with tools, prompts, and resources. Instead of hardcoding integrations, an AI client can connect to any MCP-compliant server and discover what actions it can take dynamically. MCP is model-agnostic, meaning it works with Claude, OpenAI’s GPT, Llama, or any LLM that understands structured tool calls.
Official SDKs exist in Python, TypeScript, and Java, making it easy for developers to implement MCP within their applications.
How MCP Works
MCP follows a client-server model:
- Host: The AI application (e.g., Claude Desktop, Cursor IDE) that needs to use external tools.
- Client: The component in the host that communicates with MCP servers.
- Server: Provides AI-accessible tools (e.g., file system, API access, database queries).
When an AI model wants to list files, send emails, or fetch stock prices, it queries an MCP server, which executes the request and returns a result. This decouples AI from specific tool implementations, allowing any AI agent to use any tool that speaks MCP.
Example: An AI-Enabled Code Editor
Let’s say you’re coding in Cursor IDE, which supports MCP. The AI assistant wants to search for TODO comments in your repo. It doesn’t need a special plugin; instead, it connects to an MCP GitHub server that provides a searchCode tool. The AI calls searchCode, gets structured results, and presents them. No custom API calls, no plugin-specific logic—just MCP.
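For a feel of how little code an MCP server needs, here is a minimal sketch using the official Python SDK’s FastMCP helper (pip install mcp); the tool body is a naive local grep for TODO comments, a stand-in for a real GitHub-backed searchCode tool, and the exact SDK conventions may evolve.
from pathlib import Path
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("todo-search")

@mcp.tool()
def search_todos(root: str = ".") -> list[str]:
    """Return lines containing 'TODO' from Python files under the given directory."""
    hits = []
    for path in Path(root).rglob("*.py"):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), start=1):
            if "TODO" in line:
                hits.append(f"{path}:{lineno}: {line.strip()}")
    return hits

if __name__ == "__main__":
    mcp.run()   # defaults to the STDIO transport discussed below
Any MCP-aware host (Claude Desktop, Cursor, and others) can then discover and call search_todos without bespoke plugin code.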
MCP vs. Other AI Integration Approaches
1. OpenAI Function Calling
- OpenAI’s function calling lets GPT models request predefined functions via structured JSON.
- However, functions must be hardcoded in advance. MCP, in contrast, lets AI dynamically discover and call tools from external servers.
2. LangChain & Agent Frameworks
- LangChain helps structure AI workflows but requires developers to define tool integrations manually.
- MCP is protocol-driven, allowing any AI agent to use any MCP-compliant tool without custom integration.
3. ChatGPT Plugins & API Calls
- ChatGPT plugins require OpenAPI specifications per integration.
- MCP provides a broader, AI-native standard, working across different AI platforms and tools seamlessly.
STDIO Integration: Useful or Childish?
One surprising thing about MCP is that it defaults to STDIO (Standard Input/Output) instead of HTTP. Why?
- Security: MCP servers often run as local processes, preventing network exposure risks.
- Simplicity: No need to manage API endpoints, auth tokens, or networking.
That said, STDIO feels outdated for production use. Luckily, MCP supports HTTP+SSE (Server-Sent Events) for remote communication, making it viable for enterprise-scale deployments.
Human-in-the-Loop: Keeping AI Accountable
One critical feature of MCP-based implementations is human oversight. AI shouldn’t execute actions autonomously without user approval—tools are often real-world actions (like modifying files or sending emails).
For example, in Claude Desktop, MCP tools require explicit user confirmation before running. Cursor IDE also asks for permission before executing AI-generated code. This safeguards against accidental or malicious AI actions—a necessary precaution as AI autonomy increases.
Final Thoughts: Why MCP is Promising
MCP represents a significant step toward standardizing AI tool integration, much like how OCI transformed container portability. By eliminating ad-hoc integrations, MCP enables interoperability between AI models and external tools without custom glue code.
However, adoption remains a key challenge. While MCP is open-source and gaining traction with tools like VS Code extensions, Cursor IDE, and Claude Desktop, its success depends on broad industry support. If OpenAI, Google, and others embrace it, MCP could become the USB-C of AI tool interactions—enabling seamless compatibility across platforms.
Security and governance challenges remain. MCP provides the means for AI to execute actions but does not regulate them. Developers must implement proper authentication and sandboxing to prevent misuse.
Despite these hurdles, MCP is a promising foundation. It allows AI applications to instantly gain new capabilities by connecting to an ever-expanding ecosystem of tools. If AI is to become truly useful in real-world workflows, a standardized protocol like MCP is essential.
Integration is becoming protocol-driven, and MCP is well placed to lead the way in AI integrations.
Automating Oracle Service Bus (OSB) Integrations at Scale
Introduction
In enterprise integration projects, Oracle Service Bus (OSB) plays a critical role in connecting disparate systems while avoiding the complexity of direct point-to-point integrations. However, manually developing and maintaining a large number of OSB pipelines can be time-consuming and costly. This blog explores strategies for automating OSB integration development, leveraging reusable templates, scripting, and DevOps best practices to significantly reduce effort and improve consistency.
Why Use OSB?
Oracle Service Bus acts as an intermediary layer between various services, offering features like:
- Decoupling systems to reduce dependencies
- Protocol mediation between SOAP, REST, JMS, and more
- Centralized logging and monitoring
- Error handling and fault tolerance
- Scalability and security enforcement
Although Oracle Banking applications are pre-integrated, some customers choose to use OSB as a standard integration pattern across all systems. While this adds uniformity, it may introduce unnecessary complexity and performance overhead. A more efficient approach is to use OSB where it provides tangible benefits—particularly for third-party integrations—while leveraging Oracle Banking Products’ native interoperability elsewhere.
Strategies for Automating OSB Integration Development
1. Use Pipeline Templates to Standardize Development
OSB allows the creation of pipeline templates, which act as blueprints for multiple services. Instead of manually designing each pipeline, you can:
- Define a generic template that includes logging, error handling, and routing logic.
- Instantiate concrete pipelines from the template, customizing only endpoint details.
- Update logic in one place and propagate changes across all services.
Using templates ensures uniformity and dramatically reduces manual effort when dealing with a high number of integrations.
2. Automate Integration Creation Using Scripts
Rather than manually configuring 90+ integrations in JDeveloper, consider:
- WLST (WebLogic Scripting Tool): Automate service creation, endpoint configuration, and deployment using Python scripts.
- Maven Archetypes: Use OSB’s Maven plugin to create standardized OSB project structures from the command line.
- Bulk Configuration Updates: Export OSB configurations (sbconfig.jar), modify them programmatically (e.g., with Jython or XML processing tools), and re-import them.
3. Use DevOps Practices for OSB CI/CD
To streamline deployment and minimize errors:
- Store OSB configurations in Git to maintain version control.
- Use Maven to build and package OSB projects automatically.
- Implement CI/CD pipelines (Jenkins, GitHub Actions, etc.) to test and deploy integrations seamlessly.
4. Evaluate Kubernetes for OSB Deployment
While OSB is traditionally deployed on WebLogic servers, Oracle supports running OSB on Kubernetes via the WebLogic Kubernetes Operator. Benefits include:
- Automated scalability and high availability
- Simplified environment provisioning using Kubernetes manifests
- Enhanced monitoring and logging with Prometheus/Grafana integration
This approach is particularly useful if your organization is adopting cloud-native infrastructure.
Conclusion
By leveraging OSB pipeline templates, automation scripts, and CI/CD best practices, customers can significantly reduce manual effort, ensure consistency, and improve maintainability of large-scale integrations. While OSB is a powerful tool, organizations should carefully consider whether to use it for Oracle-to-Oracle integrations, reserving it for third-party connectivity where its mediation capabilities offer the greatest benefits.
Next Steps
Evaluate your integration strategy—can you reduce complexity by limiting OSB usage to external integrations? If OSB is necessary, start implementing automation techniques to accelerate deployment and maintenance.
References & Further Reading
If you’ve ever tried to use your keyboard to navigate a pop-up dialog in macOS—such as when closing a document and being prompted to Save, Don’t Save, or Cancel—you might have noticed that sometimes, Tab doesn’t cycle through the buttons as expected. Even if you have “Use keyboard navigation to move focus between controls” enabled in System Settings, the problem can persist.
After searching for solutions and troubleshooting extensively, I discovered a simple but effective fix that isn’t widely documented. If you’re facing this issue, try the following method:
The Quick Fix: Toggle Full Keyboard Access
- Open System Settings (or System Preferences in older macOS versions).
- Go to Keyboard → Keyboard Shortcuts.
- Select Keyboard on the left panel.
- Locate “Use keyboard navigation to move focus between controls” and toggle it off.
- Wait a few seconds, then toggle it on again.
This reset seems to refresh macOS’s ability to recognize keyboard focus within pop-up dialogs. Once enabled, you should be able to navigate the buttons using the Tab key as expected.
Alternative Methods to Navigate Dialogs
If the issue persists, try these additional keyboard shortcuts:
Return or Enter – Selects the default button (usually “Save”).
Esc – Acts as “Cancel”.
Command + Delete – Chooses “Don’t Save” in file dialogs.
fn + Tab – Sometimes required for button focus in certain macOS versions.
Control + Tab or Option + Tab – Alternative navigation in some apps.
Command + Left/Right Arrow – Moves focus between buttons in some cases.
Introduction
Spring Framework has been evolving consistently, adapting to modern development needs and best practices. One significant change was the deprecation and eventual removal of Apache Velocity support. This article explores why Velocity was deprecated in Spring 4.3 and removed in Spring 5.0, along with the best alternatives developers can use for templating in Spring applications.
Why Was Apache Velocity Deprecated and Removed?
Deprecated in Spring 4.3
Velocity was officially deprecated in Spring 4.3 due to multiple reasons:
- Lack of Active Development: Apache Velocity had seen minimal updates since 2010, raising concerns about security and maintainability.
- Shift in Template Engine Preferences: Modern template engines like Thymeleaf and FreeMarker had become more popular, offering better integration and performance.
- Spring’s Decision to Reduce Third-Party Dependencies: The Spring team aimed to streamline support for actively maintained technologies.
Removed in Spring 5.0
With the release of Spring 5.0, Velocity support was entirely removed; the official release notes and discussions confirm this decision.
Developers relying on Velocity had to migrate to other templating solutions.
The Best Alternatives to Apache Velocity
1. Thymeleaf - The Modern Standard
Thymeleaf is a powerful, flexible Java template engine designed for seamless integration with Spring Boot.
- HTML5-compliant templates: Works as static HTML, making development and debugging easier.
- Rich feature set: Supports conditional statements, loops, and Spring expression language (SpEL).
- Easy integration: Well-supported in Spring Boot via spring-boot-starter-thymeleaf.
2. FreeMarker - Feature-Rich and Versatile
FreeMarker is a widely used template engine that offers great flexibility.
- Customizable syntax: Works well for generating HTML, emails, configuration files, and more.
- Strong integration: Spring Boot provides built-in support via spring-boot-starter-freemarker.
- Good documentation: Well-maintained with extensive examples.
3. Groovy Templates - Dynamic and Expressive
Groovy Templates are a great option for developers familiar with Groovy.
- Dynamic scripting: Supports inline scripting with Groovy expressions.
- Flexible syntax: Easier for those already using Groovy in their projects.
- Spring integration: Supported out of the box with Spring Boot.
4. Mustache - Lightweight and Logic-Less
Mustache is a minimalistic template engine that enforces a separation between logic and presentation.
- Logic-less templates: No embedded Java code, promoting clean MVC architecture.
- Fast and lightweight: Ideal for microservices and small projects.
- Supported in Spring Boot: Included via spring-boot-starter-mustache.
5. Pebble - Clean and Fast
Pebble is inspired by Twig (from PHP) and offers a clean and fast templating approach.
- Template inheritance: Helps in structuring large applications.
- Auto-escaping: Improves security by preventing XSS attacks.
- Spring Boot support: Available via spring-boot-starter-pebble.
Conclusion
Apache Velocity’s deprecation and removal from Spring was a necessary step to keep the framework modern and secure. Developers migrating from Velocity have several strong alternatives, each offering unique advantages. Depending on your project needs, Thymeleaf, FreeMarker, Groovy Templates, Mustache, or Pebble can serve as excellent replacements.
For seamless migration, explore the official documentation of these alternatives and start modernizing your Spring applications today!
In a bustling little town, there was a chef named Carlo, famous for his hearty meals cooked on a simple stove. One day, inspired by the glamorous cooking shows featuring high-end restaurants, he decided to upgrade his kitchen. He bought the latest, high-tech ovens—sleek machines that promised faster cooking, multitasking, and scaling up at the press of a button.
At first, Carlo was excited. With 40 dishes to prepare every day, he thought these miracle machines would make his restaurant run like clockwork. So, he went all in and bought 40 of them, imagining efficiency like never before. But soon, his dream kitchen turned into a nightmare. The machines needed precise settings, frequent maintenance, and often clashed with each other. Instead of saving time, Carlo found himself buried in errors and losing the special touch that made his food so loved. Frustrated, he shut down the gadgets and returned to his trusty stove, muttering, “These fancy tools are useless!”—without stopping to think that the real problem wasn’t the machines but how he had set up his kitchen.
This story came to mind when I read a blog titled I Stopped Using Kubernetes. Our DevOps Team Is Happier Than Ever. The title immediately grabbed my attention, and as I read, the writer’s journey felt a lot like Carlo’s. They described their struggles with Kubernetes, and as someone who’s used it for deploying applications, I could relate to many of the challenges. The storytelling was engaging, and the examples struck a chord with my own experience as a DevOps practitioner.
However, as I got deeper into the blog, it became clear that the problem wasn’t Kubernetes itself but how it had been used. The writer’s team had made choices that didn’t align with how Kubernetes is designed to work. For example, they managed 47 separate clusters—something that’s not only unnecessary but also makes managing everything a lot harder. Kubernetes is built to:
- Handle many workloads within a single cluster
- Provide tools for dividing and isolating resources
- Simplify deployments, not complicate them
Abandoning Kubernetes because of these challenges felt like burning down a house because you couldn’t figure out how to use the stove.
The Core Issues
The team made several decisions that caused more problems than they solved:
- Too Many Clusters: Creating a separate cluster for every service and environment caused a logistical mess. Most companies manage thousands of services with just a few clusters by using:
  * Namespaces to separate workloads
  * Rules to control access and resources
  * Specialized node setups for different tasks
- Complex Multi-Cloud Setup: Running their systems across three different cloud providers added layers of unnecessary complexity.
- Ignoring Built-In Features: Kubernetes has powerful tools for isolating workloads, like namespaces and network rules, but these weren’t used effectively.
The Real Lesson
The blog isn’t really about the flaws of Kubernetes—it’s about the consequences of using a tool without understanding how it’s meant to work. Their challenges reflect more on their approach than on the platform itself.
Advice for Teams Starting with Kubernetes
If you’re considering Kubernetes, here are a few tips to avoid the same mistakes:
* Learn the Basics: Take the time to understand how Kubernetes works.
* Start Small: Begin with a single, well-planned cluster and scale up as needed.
* Use the Built-In Features: Kubernetes has tools for handling growth and isolation—use them instead of over-complicating things.
* Go Step by Step: Don’t try to change everything at once; migrate gradually.
* Invest in Skills: Make sure your team knows how to use the platform effectively.
What Happens When You Don’t Plan Well
The team’s approach led to several problems:
* More Work for the Team: Managing 47 clusters with unique setups made everything more complicated and error-prone.
* Higher Costs: Maintaining so many separate systems wasted money on unnecessary infrastructure.
* Lost Confidence: These missteps made it harder for others to trust the team’s decisions in the future.
Final Thoughts
While the blog offers some helpful insights about technological change, its critique of Kubernetes needs to be taken with a grain of salt. The challenges they faced highlight the importance of good planning and understanding, not the shortcomings of Kubernetes itself.
Kubernetes is a powerful tool when used wisely. It’s true that it offers a lot of features, and it’s tempting to try to use them all—but that’s not always the best approach. Sometimes, keeping it simple is the smartest choice. This is why I often prefer writing my own deployment scripts over relying on tools like Helm, which can add unnecessary complexity.
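For context, my “own deployment scripts” are usually nothing more exotic than a thin shell wrapper around envsubst and kubectl. A minimal sketch, assuming a manifests/ directory with ${IMAGE_TAG}-style placeholders and a deployment called my-app (both hypothetical):

#!/bin/bash
# deploy.sh - render simple templates and apply them, instead of a Helm chart
set -euo pipefail

export IMAGE_TAG="${1:?usage: deploy.sh <image-tag>}"
TARGET_NS="${TARGET_NS:-dev}"

# Create the namespace if it does not exist yet
kubectl get namespace "$TARGET_NS" >/dev/null 2>&1 || kubectl create namespace "$TARGET_NS"

# Substitute environment variables into each manifest and apply it
for f in manifests/*.yaml; do
  envsubst < "$f" | kubectl apply -n "$TARGET_NS" -f -
done

# Wait until the (hypothetical) deployment has rolled out
kubectl rollout status deployment/my-app -n "$TARGET_NS" --timeout=120s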
Leveraging WebLogic’s Filter Feature for Customization and Monitoring
The functionality provided by the DBMS_PARALLEL_EXECUTE package enables the division of a workload related to a base table into smaller fragments that can be executed in parallel. This article utilizes a basic update statement as an example, although in practice, it is more efficient to employ a single parallel DML statement. Nonetheless, this simple example effectively illustrates how to use the package. The DBMS_PARALLEL_EXECUTE package is particularly useful in situations where straight parallel DML is not suitable. It entails a series of distinct stages to implement.
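To illustrate those stages (create a task, chunk the base table, run the statement against each chunk, then clean up), here is a sketch driven from the shell through sqlplus; the connect string, table and column are placeholders, and a real job would also inspect USER_PARALLEL_EXECUTE_CHUNKS for failed chunks before dropping the task:

#!/bin/bash
# Sketch only: run a chunked UPDATE on the placeholder table MY_SCHEMA.MY_TABLE
sqlplus -s my_user/my_password@MYDB <<'SQL'
SET SERVEROUTPUT ON
DECLARE
  l_task VARCHAR2(30) := 'my_parallel_update';
  l_stmt VARCHAR2(4000);
BEGIN
  -- Stage 1: create the task
  DBMS_PARALLEL_EXECUTE.CREATE_TASK(task_name => l_task);

  -- Stage 2: split the base table into rowid chunks of roughly 10,000 rows
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(
    task_name   => l_task,
    table_owner => 'MY_SCHEMA',
    table_name  => 'MY_TABLE',
    by_row      => TRUE,
    chunk_size  => 10000);

  -- Stage 3: run the update; :start_id and :end_id are bound per chunk
  l_stmt := 'UPDATE my_schema.my_table
                SET status = ''PROCESSED''
              WHERE rowid BETWEEN :start_id AND :end_id';
  DBMS_PARALLEL_EXECUTE.RUN_TASK(
    task_name      => l_task,
    sql_stmt       => l_stmt,
    language_flag  => DBMS_SQL.NATIVE,
    parallel_level => 4);

  DBMS_OUTPUT.PUT_LINE('Task status: ' || DBMS_PARALLEL_EXECUTE.TASK_STATUS(l_task));

  -- Stage 4: clean up
  DBMS_PARALLEL_EXECUTE.DROP_TASK(l_task);
END;
/
SQL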
The recent case study of Amazon Prime Video showcases an interesting transition from a serverless microservices architecture to a monolithic approach, which resulted in a significant 90% decrease in operating expenses for their monitoring tool. This shift has sparked discussions about the differences between serverless and microservices and how to evaluate their respective advantages and disadvantages. The study revealed that serverless components, including AWS Step Functions and Lambda, were causing scaling bottlenecks and increasing costs. By removing these serverless components and simplifying their architecture, Amazon Prime Video claims to have achieved substantial cost savings.
This shift underscores the importance of choosing between serverless and microservices architectures based on the specific use case. While serverless computing can provide benefits such as scalability and reduced operational overhead, it may not always be the optimal solution for every application or system. Similarly, microservices can offer increased flexibility, but they may also introduce unnecessary complexity in certain situations. Developers must evaluate their project requirements and constraints carefully before deciding which architectural patterns to adopt.

Reference to Amazon Prime Video case study » link
quote:
“We designed our initial solution as a distributed system using serverless components… In theory, this would allow us to scale each service component independently. However, the way we used some components caused us to hit a hard scaling limit at around 5% of the expected load.”
Bash Parameter Expansion: A Short Guide
Bash parameter expansion is a powerful feature that allows you to manipulate strings in shell scripts. It can be used to remove or replace substrings, handle error cases, and more.
Here’s a quick overview of the most commonly used parameter expansion syntaxes:
Basic Syntax
The basic syntax of parameter expansion is ${var}. This expands to the value of the variable var. If var is unset or null, it expands to an empty string. You can also use ${var:-default} to expand to the value of var, or default if var is unset or null.
Example:
name="John Doe"
echo "Hello, ${name:-there}" # Output: "Hello, John Doe"
unset name
echo "Hello, ${name:-there}" # Output: "Hello, there"
kubectl top is a command that allows you to get resource usage information for Kubernetes objects such as pods, nodes, and containers. With this command, you can monitor and troubleshoot the resource usage of your Kubernetes cluster.
In this blog post, we will focus on kubectl top commands for pods and nodes, and some examples of how to use these commands to sort the pods by CPU/memory usage.
kubectl top pod
The kubectl top pod command allows you to get CPU and memory usage information for pods running in your cluster. To use this command, simply run:
kubectl top pod
This will show the CPU and memory usage for all pods in the default namespace. If you want to view the usage for a specific namespace, you can use the -n option:
kubectl top pod -n <namespace>
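For the sorting mentioned earlier, kubectl top supports a --sort-by flag:

# Pods in the current namespace, heaviest CPU consumers first
kubectl top pod --sort-by=cpu

# Pods across all namespaces, sorted by memory usage
kubectl top pod -A --sort-by=memory

# The same resource view at node level
kubectl top node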
When working with remote servers over an SSH connection, it can be frustrating when the connection becomes unresponsive due to network issues. This is especially true when using a VPN to connect to a remote server, as disconnecting the VPN can cause the SSH session to become unresponsive and eventually timeout after several minutes.
Fortunately, there is a way to configure the SSH client to immediately fail the connection when it detects that the network is unreachable. In this blog post, we’ll show you how to set up your SSH client to send keepalive packets and terminate the session if it does not receive a response from the server.
Step 1: Open your SSH configuration file
The first step is to open your SSH configuration file. This file is usually located at ~/.ssh/config, and you can edit it using a text editor like Nano or Vim. If the file doesn’t exist, you can create it by running the command:
touch ~/.ssh/config
Step 2: Set the ServerAliveInterval and ServerAliveCountMax options
In your SSH configuration file, add the following lines:
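The exact values are a matter of taste; the snippet below is a minimal sketch, assuming you want the session to drop after roughly 15 seconds of silence, written as a heredoc so it can be pasted straight into a shell:

cat >> ~/.ssh/config <<'EOF'
Host *
    # Send a keepalive probe after 5 seconds without server traffic
    ServerAliveInterval 5
    # Give up after 3 unanswered probes (about 15 seconds in total)
    ServerAliveCountMax 3
EOF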
To simplify the process of setting up a domain, you can record your configuration steps in the Administration Console as a sequence of WLST commands, which can then be played back using WLST.
WLST is a command-line scripting environment designed for managing, creating, and monitoring WebLogic Server domains. It is automatically installed on your system along with WebLogic Server.
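As a rough sketch (the script name is made up and the exact path depends on your installation), playing back a recorded session looks like this:

# Run a recorded WLST script using the bundled launcher
$ORACLE_HOME/oracle_common/common/bin/wlst.sh /tmp/recorded_domain_config.py

# Or, with the WebLogic environment already sourced (setWLSEnv.sh):
java weblogic.WLST /tmp/recorded_domain_config.py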
Created by Ranjith Vijayan on Feb 02, 2023
kubectl is a user-friendly tool that makes it simple to create new commands that build upon its standard functionality. A common task when using kubectl is to list pods to check their status and see if everything is running. To achieve this, you can use the command “kubectl get pods.” However, typing this command repeatedly can become tedious. If you are coming from a Docker background, you have probably developed the muscle memory to type ‘docker ps’. But if you type ‘kubectl ps’ you would get something like this:
kubectl ps
error: unknown command "ps" for "kubectl"
Did you mean this?
logs
cp
To achieve something like docker ps with kubectl, you can create a custom script named “kubectl-ps” somewhere in your $PATH. When kubectl encounters a command it doesn’t recognize, it looks for an executable named kubectl-<command> in the PATH; in this case, it will look for “kubectl-ps”.
- Create a file named "kubectl-ps" in one of the directories in your $PATH and make it executable (chmod +x kubectl-ps):
#!/bin/bash
kubectl get pods "$@"
- Now if you type kubectl ps, it will execute your custom script:
kubectl ps
No resources found in default namespace.
Since we added “$@” in the custom script, we can also pass arguments, e.g.:
kubectl ps -n kube-system
NAME                                      READY   STATUS      RESTARTS   AGE
coredns-597584b69b-m8wfv                  1/1     Running     0          7d8h
local-path-provisioner-79f67d76f8-lnq2j   1/1     Running     0          7d8h
metrics-server-5f9f776df5-gbmrv           1/1     Running     0          7d8h
helm-install-traefik-crd-tdpbh            0/1     Completed   0          7d8h
helm-install-traefik-nhnkn                0/1     Completed   1          7d8h
svclb-traefik-8b85b943-5f2nb              2/2     Running     0          7d8h
traefik-66c46d954f-7ft6n                  1/1     Running     0          7d8h
Another example:
kubectl ps -A
NAMESPACE         NAME                                      READY   STATUS      RESTARTS   AGE
kube-system       coredns-597584b69b-m8wfv                  1/1     Running     0          7d8h
kube-system       local-path-provisioner-79f67d76f8-lnq2j   1/1     Running     0          7d8h
kube-system       metrics-server-5f9f776df5-gbmrv           1/1     Running     0          7d8h
kube-system       helm-install-traefik-crd-tdpbh            0/1     Completed   0          7d8h
kube-system       helm-install-traefik-nhnkn                0/1     Completed   1          7d8h
kube-system       svclb-traefik-8b85b943-5f2nb              2/2     Running     0          7d8h
kube-system       traefik-66c46d954f-7ft6n                  1/1     Running     0          7d8h
ingress-test-ns   obpm-deployment-77dcdf4c78-cf2rh          1/1     Running     0          7d8h
ingress-test-ns   obpm-deployment-77dcdf4c78-smtr9          1/1     Running     0          7d8h
ingress-test-ns   fcubs-deployment-7ccd4b66fb-4r7fh         1/1     Running     0          7d8h
ingress-test-ns   fcubs-deployment-7ccd4b66fb-z6xxz         1/1     Running     0          7d8h
Created by Ranjith Vijayan on Feb 01, 2023
When publishing messages to IBM MQ using a Java client, traditional IBM MQ applications may have trouble reading the messages. This is often due to the “RFH2” header that is included in the message, which carries JMS-specific information.
The issue is usually related to the “TargetClient” configuration. IBM MQ messages are made up of three components:
- The IBM MQ Message Descriptor (MQMD)
- An IBM MQ MQRFH2 header
- The message body
The MQRFH2 header is optional and its inclusion is controlled by the “TARGCLIENT” flag in the JMS Destination class. This flag can be set using the IBM MQ JMS administration tool. When the recipient is a JMS application, the MQRFH2 header should always be included. However, when sending directly to a non-JMS application, the header should be omitted as these applications do not expect it in the IBM MQ message.
The “RFH2” header (rules and formatting header, version 2) is an IBM MQ header used to carry JMS-specific properties alongside the message body. It contains information about the format of the message, such as its encoding, character set, and data format, which allows messages to be properly processed by different systems and specifies the format of reply messages.
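For illustration only (the queue names and the tool’s location are placeholders; check the property names against your MQ version), a destination with the MQRFH2 suppressed can be defined through the JMS administration tool roughly like this:

# Feed a definition to the IBM MQ JMS administration tool (JMSAdmin);
# TARGCLIENT(MQ) tells the JMS client not to add the MQRFH2 header.
cd /opt/mqm/java/bin
./JMSAdmin <<'EOF'
DEFINE Q(myAppQueue) QUEUE(APP.REQUEST.QUEUE) TARGCLIENT(MQ)
END
EOF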
Created by Ranjith Vijayan on Jan 08, 2023
Introduction
The purpose of this document is to give a brief introduction to the Kerberos authentication protocol, with a working example that can be useful for application development teams integrating with Kerberos solutions. In a typical Kerberos use case, there is a client component, a server component and an authentication component.
In the example discussed in this document:
- KDC: The Kerberos component is provided as a Docker image, so it can run in a container and remain platform agnostic. We will set up a basic Kerberos KDC for development purposes (not for production usage).
- Client: The client component is a browser running on the client machine.
- Server: A Spring Boot application plays the role of the “server” (web-based application) component. This application exposes a REST API that requires authentication to serve the resource. We will deploy the Spring Boot application in a Docker container.
The document describes the basic setup of these components, but it will not get into low-level details.
Before getting to the sample setup, let’s have a quick introduction to Kerberos.
What is Kerberos
Kerberos is a network authentication protocol developed at the Massachusetts Institute of Technology (MIT). The Kerberos protocol provides a mechanism for mutual authentication between a client and a server before application data is transmitted between them.
Windows Server widely supports Kerberos as the default authentication option. Kerberos is a ticket-based authentication protocol that allows nodes in a computer network to identify themselves to each other.
A Typical Use Case:
Let’s say a Client wants to access a resource (e.g. a file) from a Server. She goes to the Server and asks, “Hey Server! Can I access the file?”
To protect its resources, Server insists that Client needs to prove her identity. So, when Client asks Server to grant access to the resource, Server says “Umm… but I don’t know you, can you first prove your identity?”
Luckily both Client and Server know and trust a common partner called KDC (Key Distribution Center), because both of them have valid memberships with the same KDC. Server tells Client, “Can you go to KDC and ask them to issue a session ticket in your name, addressed to me?”
Client was relieved to hear that, because some time ago she had already proven her identity to the KDC by going through the hassle of its verification process.
(** A little flashback **) …
Some time ago, Client had gone to the KDC to prove her identity. The KDC asked a set of questions (e.g. user-name, password, OTP, biometric, etc.) to ensure that the client really is who she claims to be. After the client had promptly answered all the challenges, the KDC gave her a certificate called a TGT (Ticket Granting Ticket) and said, “Here is your TGT, and it is valid until tomorrow. If any server asks you to prove your identity today, you can come back to me with this TGT and tell me the SPN (Service Principal Name) of the server asking for your identity. After verifying the TGT and the memberships of both you and the server, I shall give you a one-time ticket (Session Ticket) which you can present to the server. Don’t worry, you will not be grilled with questions again until this TGT expires. Please keep it safe in your wallet!”
(** The flashback ends **)
Client goes to the KDC with the valid TGT and says, “Here is my TGT. The Server is asking me for a session ticket. Can you give me one?”
The KDC quickly verifies the validity of the TGT and checks that the memberships of both server and client are active. It then gives the Client a one-time session ticket which can be presented to the Server.
The client then goes to the server with the session ticket and asks for the resource. This time the server can identify the Client, because she has presented a ticket issued by the KDC that the server trusts. The server still checks that the ticket was truly issued by that trusted KDC. For this verification it uses a secret key file called a keytab, which was originally issued by the KDC to the Server. The Server validates the session ticket using the keytab, and if the verification succeeds it can read the identity information of the Client. After successful verification, the server is sure that the client is indeed who she claims to be. This process is called Authentication. The server will now apply further checks to determine whether the client has the rights to view the file. This check is called Authorization (which is beyond the scope of this document).
In this flow, the Client doesn’t have to go through the hassle of sign-on challenges (i.e. providing a username and password, etc.) every time she needs to access a server that has mutual trust with the KDC. This gives a Single Sign-On (SSO) experience, which is good for security from the system’s perspective and convenient from the users’ perspective.
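To make the client side of this flow tangible, here is a hedged sketch using standard Kerberos tooling; the realm, principal and URL are made up, and it assumes curl was built with SPNEGO/GSS support:

# 1. Prove your identity to the KDC once and receive a TGT
kinit alice@EXAMPLE.COM

# 2. Inspect the ticket cache (the TGT now, service tickets later)
klist

# 3. Call the protected REST API; curl fetches a session ticket for the
#    server's SPN behind the scenes and presents it via SPNEGO
curl --negotiate -u : http://server.example.com:8080/api/resource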
Created by Ranjith Vijayan on Aug 17, 2022
This article covers an example of how to easily access JSON documents in Oracle DB without loading them first. This approach may be useful for integration use cases where JSON data needs to be processed or uploaded into traditional tables. Instead of writing complicated parsing and loading routines, the “schema-less” data model using JSON is a natural fit for RESTful development techniques, and the power of SQL comes in handy for analytic requirements.
Oracle 21c introduced a new JSON data type. You should use this in preference to other data types. More details here.
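As one hedged example of the “power of SQL” part (the table, column and document shape are made up, and this uses a plain JSON column rather than whatever storage the original setup used), JSON_TABLE projects JSON documents into rows and columns on the fly:

#!/bin/bash
# Sketch: query JSON documents relationally without writing a loader first
sqlplus -s app_user/app_password@MYDB <<'SQL'
-- orders_json(doc) is assumed to hold one JSON order document per row
SELECT jt.item_name,
       jt.qty
  FROM orders_json o,
       JSON_TABLE(o.doc, '$.items[*]'
         COLUMNS (
           item_name VARCHAR2(50) PATH '$.name',
           qty       NUMBER       PATH '$.qty'
         )) jt;
SQL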
Created by Ranjith Vijayan, last modified on Aug 03, 2022
In general, K8s expects you to design workloads with a “cattle” approach, as opposed to the “pet” approach of the on-prem world; i.e. pods are interchangeable.
In the “cattle” approach, your pods can be terminated, replaced or migrated to another node in order to ensure high availability, self-healing, etc.
However, not all applications, especially traditionally designed ones, can handle the abrupt termination of running pods. So it is “safer” to allow some grace time for in-flight transactions to complete before the pod is removed.
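In pod-spec terms that grace time boils down to two knobs: a preStop hook that gives the application time to drain, and a termination grace period long enough to cover it. A sketch against a hypothetical deployment called my-app (the values are arbitrary):

kubectl patch deployment my-app --type=json -p '[
  {"op": "add", "path": "/spec/template/spec/terminationGracePeriodSeconds", "value": 60},
  {"op": "add",
   "path": "/spec/template/spec/containers/0/lifecycle",
   "value": {"preStop": {"exec": {"command": ["/bin/sleep", "20"]}}}}
]'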
Created by Ranjith Vijayan, last modified on Aug 03, 2022
In this article, we will look at how to automate the “refreshing” of a deployment in Kubernetes using the Cronjob feature. Normally, the kubectl rollout restart command is used for this purpose, but it has to be done from outside the cluster. By using a Cronjob, we can schedule a Kubernetes API command to restart the deployment within the cluster.
To do this, we must set up RBAC (Role-Based Access Control) so that the Kubernetes client running inside the cluster has permission to make the necessary calls to the API. The Cronjob will use the bitnami/kubectl image, which is a kubectl binary that runs inside a container.
In this example, we are using a ClusterRole so that a single service account can call the Kubernetes API against resources in any namespace. Alternatively, you can use a Role if you want to run this within a single namespace. The shell script that contains the kubectl rollout commands can be extended to add more business logic if needed. The schedule of the Cronjob can be controlled using the cron definition. This example runs the job every minute, for illustration purposes only; in practice the schedule could be changed to run every day at midnight or every Sunday, for example. (A sketch of the RBAC and Cronjob objects follows the deployment manifest below.)
Example:
The example creates a namespace called “util-restart-ns” for the service account, ClusterRole, and Cronjob. The deployments are created in a different namespace (ing-test-ns), and a RoleBinding object must be created in this namespace to allow it to use the RBAC service account.
- deploy.yaml - to create the deployment in "ing-test-ns"
apiVersion: v1
kind: Namespace
metadata:
  name: ing-test-ns
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: fcubs-configmap
  namespace: ing-test-ns
data:
  index.html: |
    <html>
    <head><title>Hello FCUBS!</title></head>
    <body>
    <div>Hello FCUBS!</div>
    </body>
    </html>
---
apiVersion: v1
kind: Service
metadata:
  name: fcubs-service
  namespace: ing-test-ns
  labels:
    run: fcubs
spec:
  ports:
    - port: 80
      protocol: TCP
  selector:
    app: fcubs
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fcubs-deployment
  namespace: ing-test-ns
spec:
  selector:
    matchLabels:
      app: fcubs
  replicas: 1
  template:
    metadata:
      labels:
        app: fcubs
    spec:
      containers:
        - name: fcubs
          image: nginx:latest
          lifecycle:
            preStop:
              exec:
                command: [ "/bin/sleep", "20" ]
          ports:
            - containerPort: 80
          volumeMounts:
            - name: fcubs-volume
              mountPath: /usr/share/nginx/html/fcubs
      terminationGracePeriodSeconds: 60
      volumes:
        - name: fcubs-volume
          configMap:
            name: fcubs-configmap
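The RBAC and Cronjob side described above could look roughly like this; all object names are hypothetical (only the namespaces follow the text), and the manifest creates the service account in "util-restart-ns", binds it in "ing-test-ns", and schedules a bitnami/kubectl pod that runs the rollout restart:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Namespace
metadata:
  name: util-restart-ns
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: deploy-restart-sa
  namespace: util-restart-ns
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: deploy-restart-role
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "patch"]
---
# The binding lives in the target namespace, as described above
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deploy-restart-rb
  namespace: ing-test-ns
subjects:
  - kind: ServiceAccount
    name: deploy-restart-sa
    namespace: util-restart-ns
roleRef:
  kind: ClusterRole
  name: deploy-restart-role
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: fcubs-restart
  namespace: util-restart-ns
spec:
  schedule: "*/1 * * * *"   # every minute, for illustration only
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: deploy-restart-sa
          restartPolicy: Never
          containers:
            - name: kubectl
              image: bitnami/kubectl:latest
              command:
                - /bin/sh
                - -c
                - kubectl rollout restart deployment/fcubs-deployment -n ing-test-ns
EOF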
Created by Ranjith Vijayan, last modified on Jul 13, 2022
Created by Ranjith Vijayan on May 08, 2022
WASM
WebAssembly (WASM) is a binary instruction format for a stack-based virtual machine. It is a low-level, portable binary format designed to be fast to decode and execute, and it is an open standard supported by all major web browsers, including Chrome, Firefox, and Safari.
Docker
Docker is a platform for developing, shipping, and running applications in containers. It allows developers to package an application with all of its dependencies and run it in a variety of environments, including on different operating systems and in different cloud environments.
GraalVM
GraalVM is a high-performance virtual machine for running Java applications. It is designed to improve the performance of Java applications by using advanced techniques, such as ahead-of-time (AOT) compilation, to optimize the performance of the Java Virtual Machine (JVM).
One of the key features of WASM is its ability to run on both the web and in standalone environments. This makes it a great fit for use in conjunction with Docker and GraalVM. By using WASM in a Docker container, developers can package their applications with all of their dependencies and run them in a variety of environments, including on different operating systems and in different cloud environments. And by using GraalVM, developers can improve the performance of their WASM applications.
To use WASM in Docker, developers can use a base image that includes a WASM runtime, such as Wasmer. They can then package their WASM application and its dependencies in a Docker container and deploy it to a variety of environments. This allows for easy deployment and management of WASM applications, as well as the ability to run them in a variety of environments.
To use WASM with GraalVM, developers can use the GraalVM WASM interpreter to run their WASM applications. The GraalVM WASM interpreter is a high-performance, open-source WASM interpreter that is designed to improve the performance of WASM applications by using advanced techniques, such as ahead-of-time (AOT) compilation. This allows developers to take advantage of the performance benefits of GraalVM while still being able to use WASM as the underlying binary format for their applications.
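As a rough sketch of both paths (the module name, install locations and component names are assumptions worth checking against current docs):

# Standalone runtime: install Wasmer and run a module (app.wasm is a placeholder);
# the same two commands can form the basis of a Dockerfile for shipping it as a container
curl https://get.wasmer.io -sSfL | sh
~/.wasmer/bin/wasmer run app.wasm

# GraalVM path: add the (experimental) wasm language component, then run the module
gu install wasm
wasm app.wasm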
By combining the power of WASM, Docker, and GraalVM, developers can create high-performance, portable applications that can run in a variety of environments. WASM allows for fast decoding and execution of binary code, while Docker enables easy deployment and management of applications. And by using GraalVM, developers can take advantage of advanced performance optimization techniques to improve the overall performance of their applications.
It is worth mentioning that the combination of WebAssembly, GraalVM and Docker opens the possibility of running a Java-based application as WebAssembly. GraalWasm, an experimental WebAssembly runtime from Oracle that runs on GraalVM, is one building block here, and to take full advantage of WebAssembly’s capabilities it can be used inside a container.
Overall, WASM is a powerful technology that has the potential to revolutionize how we build and deploy applications. When used in conjunction with technologies like Docker and GraalVM, it can enable the creation of high-performance, portable applications that can run in a variety of environments. With the continued growth and adoption of WASM, we can expect to see more and more applications built on this technology in the future.
Created by Ranjith Vijayan on Apr 03, 2022
This post is created for my own reference of k8s resources as part of learning
Created by Ranjith Vijayan, last modified on Feb 17, 2022
The pandemic has fast-tracked digitalisation in all aspects of our lives. Be it for work, study, entertainment, or socialising, we have turned to digital solutions amid social distancing and other measures to control the spread of the virus. We have learnt how to do product marketing, sales, requirement studies, solution walk-throughs, project kick-offs and bug fixing without going to the office, let alone going onsite. We even celebrated go-lives of projects that had commenced after the pandemic started, where the team members (developers, system administrators, testers, project managers, vendors, end users, customers) have never met “in person”. People who joined our office during the pandemic have left the organization without ever visiting the office.
While we are all living ultra-connected lives, with a miraculous amount of computing power packed into tiny devices through which we can communicate with anyone without realizing that they may be sitting in a different part of the planet, the line between the physical and the digital world has become blurry.
In the 2021 year-end blog post on Bill Gates’ blog, the founder of Microsoft and the Bill & Melinda Gates Foundation includes a prediction about the future of work.
SimpleSAMLPHP is an open-source project written in native PHP that deals with authentication. A Docker image of SimpleSAMLPHP is built and kept in the FSGBU JAPAC Consulting Docker registry, so one can directly pull it and run an instance of an IDP for development and testing purposes.
Created by Ranjith Vijayan, last modified on Dec 26, 2020
TL;DR
** Containers != Docker **
Our Journey towards Containerization (a.k.a ‘Dockerization’)
Ever since we embarked on our journey in the container world with “Docker” technology, there has been no going back. It enabled us to provision and run our application environments in the quickest time possible. It made our path easier in exploring DevOps and Continuous Integration / Continuous Deployment (CI/CD) capabilities for our CEMLI projects and POCs, and it helped our developers deploy and test their code quickly and efficiently.
We knew that containers were popular in the micro-services world. But the more we learnt about and experimented with Docker, the more it made us feel at home even while dealing with our legacy/monolithic applications. While Docker is a tool that exploded in popularity by enabling micro-services workflows, we realized that it is also a good tool for all types of architectures and applications. We soon “Dockerized” the FLEXCUBE application, which mainly used a J2EE application server and an Oracle Database as the major tech-stack components, even before we went about “Dockerizing” the newer micro-service based Oracle Banking products. We could spin up application environments as lightweight, portable and self-sufficient containers in a matter of minutes, with zero prerequisites other than the “Docker engine” on the host machine. Docker technology gave us the feeling of “Instant Application Portability”. We made some quick tools (also known as Reference Install - RI tools) to leverage the Docker power, adapted for our applications. We started embracing the Docker technology more as we delved deeper into it, and it never disappointed us.
