
IBM Think 2025 Recap, Advanced RAG Techniques and Unlisted Stock to Buy

He’s already IPO’d once – this time’s different

Spencer Rascoff grew Zillow from seed to IPO. But everyday investors couldn’t join until then, missing early gains. So he did things differently with Pacaso. They’ve made $110M+ in gross profits disrupting a $1.3T market. And after reserving the Nasdaq ticker PCSO, you can join for $2.80/share until 5/29.

This is a paid advertisement for Pacaso’s Regulation A offering. Please read the offering circular at invest.pacaso.com. Reserving a ticker symbol is not a guarantee that the company will go public. Listing on the NASDAQ is subject to approvals. Under Regulation A+, a company has the ability to change its share price by up to 20%, without requalifying the offering with the SEC.

If you’ve been working with Retrieval-Augmented Generation (RAG), you probably know the basic setup: An LLM retrieves documents based on a query and uses them to generate better, grounded responses.

But as use cases get more complex, we need more advanced retrieval strategies—and that’s where these four techniques come in:

Self-Query Retriever
Instead of relying on static prompts, the model translates the user's question into a structured query over document metadata.
Let’s say a user asks: “What are the reviews with a score greater than 7 that say bad things about the movie?”

This technique breaks that question into a semantic query plus a metadata filter (score > 7), letting the model query a vector store with metadata support (such as Chroma) using the right filters.
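The mechanics can be sketched in plain Python. In a real pipeline (e.g. LangChain's SelfQueryRetriever over Chroma) an LLM produces the structured query; here it is hard-coded, and keyword overlap stands in for vector similarity:

```python
# Toy "documents" with metadata, standing in for a vector store.
docs = [
    {"text": "Terrible pacing and a weak script.", "score": 8},
    {"text": "A masterpiece from start to finish.", "score": 9},
    {"text": "Boring and predictable.", "score": 5},
]

def self_query_retrieve(structured_query, docs):
    # Stage 1: apply the metadata filter the LLM produced (score > 7).
    flt = structured_query["filter"]
    filtered = [d for d in docs if d["score"] > flt["score"]["$gt"]]
    # Stage 2: naive keyword overlap stands in for semantic search.
    terms = set(structured_query["query"].lower().split())
    return [d for d in filtered
            if terms & set(d["text"].lower().split())]

# The structured query an LLM might emit for the example question.
structured = {"query": "terrible weak boring",
              "filter": {"score": {"$gt": 7}}}
results = self_query_retrieve(structured, docs)
```

The key idea is the split: the filter runs against metadata, the query runs against content, and only documents passing both come back.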

Parent Document Retriever

Here, retrieval happens in two stages:
1. Identify the most relevant chunks
2. Pull in their parent documents for full context
This ensures you don’t lose meaning just because information was split across small segments.
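A minimal sketch of the two stages, with keyword overlap standing in for embedding similarity (real implementations, such as LangChain's ParentDocumentRetriever, index chunk embeddings and keep a chunk-to-parent mapping):

```python
# Parent documents, each split into small chunks at indexing time.
parents = {
    "doc1": "Q3 revenue grew 12%. Growth was driven by cloud services. Margins held steady.",
    "doc2": "The new office opened in Austin. Hiring will double next year.",
}

# Indexing: small chunks remember which parent they came from.
chunks = [
    {"parent": pid, "text": sentence.strip()}
    for pid, text in parents.items()
    for sentence in text.split(".") if sentence.strip()
]

def retrieve_parents(query, chunks, parents, k=2):
    terms = set(query.lower().split())
    # Stage 1: rank chunks by overlap with the query.
    scored = sorted(chunks,
                    key=lambda c: len(terms & set(c["text"].lower().split())),
                    reverse=True)
    # Stage 2: return each top chunk's full parent document, deduplicated.
    seen, out = set(), []
    for c in scored[:k]:
        if c["parent"] not in seen:
            seen.add(c["parent"])
            out.append(parents[c["parent"]])
    return out

results = retrieve_parents("cloud revenue growth", chunks, parents)
```

Both top-matching chunks come from doc1, so the retriever returns that full document once, context intact.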

Contextual Compression Retriever (Reranker)

Sometimes the top retrieved documents are… close, but not quite right.

This technique first pulls the top K (say 4) documents from the base retriever, then uses a cross-encoder reranker (such as Cohere Rerank) to compress and re-score the results against the query, keeping only the most relevant bits.
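A sketch of the rerank step, with a toy term-overlap function standing in for the cross-encoder (in production you would call a hosted reranker such as Cohere Rerank on each query-document pair):

```python
# Top-K candidates from a fast first-pass retriever.
candidates = [
    "The Eiffel Tower is in Paris.",
    "Paris is the capital of France.",
    "France exports wine and cheese.",
    "The Louvre is a museum in Paris.",
]

def cross_encoder_score(query, doc):
    # Stand-in for a transformer cross-encoder: fraction of query
    # terms that appear in the document.
    q = set(query.lower().split())
    d = set(doc.lower().rstrip(".").split())
    return len(q & d) / len(q)

def rerank(query, docs, top_n=2):
    # Score every (query, document) pair, keep only the best N.
    scored = sorted(docs, key=lambda d: cross_encoder_score(query, d),
                    reverse=True)
    return scored[:top_n]

best = rerank("capital of France", candidates)
```

The first pass optimizes for recall; the reranker trades a little latency for precision, which is why it only runs on a handful of candidates.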

Multi-Vector Retrieval Architecture

Instead of matching a single vector per document, this method breaks both queries and documents into multiple token-level vectors using models like ColBERT.

The retrieval happens across all vectors—giving you higher recall and more precise results for dense, knowledge-rich tasks.
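ColBERT's late-interaction scoring (MaxSim) can be shown with toy token vectors: for each query token, take its best match among the document's token vectors, then sum. The 2-d "embeddings" below are illustrative stand-ins for real ColBERT outputs:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_vecs, doc_vecs):
    # For each query token vector, keep only its single best match
    # among the document's token vectors, then sum over query tokens.
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

# Toy token embeddings: one 2-d vector per token.
query = [(1.0, 0.0), (0.0, 1.0)]              # two query tokens
doc_a = [(0.9, 0.1), (0.1, 0.9), (0.5, 0.5)]  # covers both query tokens
doc_b = [(1.0, 0.0), (0.2, 0.1)]              # covers only the first

score_a = maxsim_score(query, doc_a)  # each query token finds a match
score_b = maxsim_score(query, doc_b)  # second query token goes unmatched
```

Because every query token must find its own best match, doc_a outscores doc_b even though doc_b contains an exact match for one token; this is what buys the higher recall on dense, knowledge-rich content.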

These aren’t just fancy tricks.

They solve real-world problems like:
• “My agent’s answer missed part of the doc.”
• “Why is the model returning irrelevant data?”
• “How can I ground this LLM more effectively in enterprise knowledge?”

As RAG continues to scale, these kinds of techniques are becoming foundational.

So if you’re building search-heavy or knowledge-aware AI systems, it’s time to level up beyond basic retrieval.

Which of these approaches are you most excited to experiment with?

What happened at IBM THINK 2025?

Met Arvind Krishna. Saw the future of enterprise AI.

“A billion new apps by 2028. The only way to manage them? AI agents, built on your enterprise data.” – Arvind Krishna, CEO, IBM

Ravit Jain with Arvind Krishna, CEO, IBM

Last week at IBM #Think2025 in Boston, I didn’t just sit in the audience—I had a front-row seat to one of the most important conversations around enterprise AI happening today. And I got to meet the man leading that charge—Arvind Krishna, Chairman & CEO of IBM. Visionary doesn’t begin to describe him.

What I experienced wasn’t just a conference—it was a shift.

A shift from experimentation to implementation.

From GenAI as hype... to GenAI as infrastructure.

Here’s what stood out to me, and why it matters:

1. Build Your Own Agent in Minutes

IBM launched Agent Builder in IBM watsonx Orchestrate, making it possible to go from an idea to a deployable AI agent in under 5 minutes. These aren’t just chatbots. They’re:
• Powered by enterprise-grade models
• Integrated with APIs
• Governed by IBM watsonx.governance
• Able to perform actual tasks, not just generate text

And it’s model-agnostic. Whether you're using open-source models, partner models, or your own—IBM is giving you the freedom to choose.

Arvind put it best: “You shouldn’t have to move your data to move your business forward.”

This isn’t “AI assistant in a box.” This is domain-specific automation built by the people who know the domain best.

The genius of it? It doesn’t require replatforming, vendor lock-in, or massive change management. Just plug it in—and go.

2. API Agents: Talk Less, Do More

IBM took it further with API Agents—AI that not only fetches info, but executes actions securely through your internal APIs. Example: An agent that checks warehouse inventory, places an order, updates your CRM, and notifies the customer—all autonomously. This moves GenAI from suggestion engines to decision engines.

This isn’t just automation—it’s context-aware orchestration. It’s how GenAI will eventually replace robotic process automation (RPA) in enterprise stacks.
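To make the warehouse example concrete, here is a hypothetical sketch of an agent's action loop. The endpoint names and payloads are illustrative placeholders, not watsonx Orchestrate APIs; the point is that each step is an authenticated call to an internal service, not generated text:

```python
def call_api(endpoint, payload):
    # Stand-in for an authenticated request to an internal service;
    # a real agent would route this through governed API connectors.
    return {"endpoint": endpoint, "payload": payload, "status": "ok"}

def fulfill_order(sku, qty, customer):
    # The agent's plan: check stock, order, update CRM, notify.
    log = []
    log.append(call_api("/inventory/check", {"sku": sku}))
    log.append(call_api("/orders/create", {"sku": sku, "qty": qty}))
    log.append(call_api("/crm/update", {"customer": customer, "sku": sku}))
    log.append(call_api("/notify", {"customer": customer,
                                    "msg": "order placed"}))
    return log

steps = fulfill_order("SKU-42", 3, "acme")
```

Each action leaves an auditable log entry, which is what makes this kind of autonomy governable rather than a black box.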

3. AI That Modernizes Code for You

IBM watsonx Code Assistant is coming—and it’s designed to help enterprises migrate from RPG to Java using GenAI. This is a huge win for companies running on legacy systems that still drive core business processes but are hard to maintain.

No full rewrites. No throwing out your core. Just a smart, AI-powered upgrade path.

It’s not flashy—but it’s exactly what enterprise tech leaders are asking for. GenAI that upgrades infrastructure, without rewriting history.

4. IBM + Salesforce = AI-Powered Mainframe Data

IBM and Salesforce are teaming up to activate agentic AI for IBM Z.

Imagine Einstein Copilot (Salesforce’s AI assistant) querying real-time mainframe data, pulling insights, and making decisions—without ever moving the data out.

This is a textbook case of GenAI meeting the enterprise where it is.

The old narrative was: move the data to the AI. The new reality is: move the AI to the data.

This is the kind of architectural thinking we need more of—AI meeting governance, not the other way around.

Favorite Keynote Moments

Arvind Krishna reminded us: “We’re looking at 1 billion new apps by 2028. You can’t build or manage that with yesterday’s architecture.”

He emphasized that AI must be governed, explainable, and secure—especially when it touches sensitive enterprise data.

Ritika Gunnar delivered one of the most memorable lines: “Accuracy is good. Trust is better.”

She shared that IBM’s updated watsonx.data can improve model accuracy by up to 40%, and highlighted the platform’s ability to handle 450 billion inferences daily—safely, at enterprise scale.

She also spotlighted real-world applications in banking, healthcare, and logistics using agents to automate and optimize workflows.

These aren’t future-state scenarios—they’re proof points that GenAI, when embedded into governed infrastructure, can already outperform traditional systems in both scale and trust.

My Takeaway

If 2023 was the “demo” year for GenAI, and 2024 was “POC season” … IBM Think 2025 made one thing clear: It’s time to go live.

IBM isn’t just theorizing about AI agents. They’ve built the tools—pre-built agents, agent builders, API integrations, governance, and hybrid deployment—for enterprises to use AI in production. You can learn more here: https://obvs.ly/ravit-jain7

Big thanks to the IBM team for having me there. I’ll be sharing more interviews and behind-the-scenes moments from the conference all week.

Until then, if you're thinking about enterprise AI—you might want to start thinking about IBM.

We’re proud partners with IBM and are already looking forward to what’s coming at Think 2026.

🔍 Stay Ahead in AI & Data! Join 137K+ Data & AI professionals who stay updated with the latest trends, insights, and innovations.

📢 Want to sponsor or support this newsletter? Reach out and let's collaborate! 🚀

Best,

Ravit Jain

Founder & Host of The Ravit Show