Surendra Singh · Staff AI Engineer · SF

I build production multi-agent AI
for regulated financial services.

For 3+ years I've been shipping conversational AI platforms where the hard problems aren't model selection — they're routing layers, multi-turn state, guardrails that compliance will actually sign off on, and eval infrastructure that catches failure modes single-turn benchmarks miss. Before that, 15 years on Wall Street: Merrill Lynch building intraday risk and trade systems for institutional desks, Fidelity architecting distributed backends for the online trading platform. I know the data flows, the regulatory surface, and the operational reality of these firms from the inside.

Engineering memo → LinkedIn

What I Ship

Five surfaces.
Each load-bearing.

→ Multi-agent routing architectures: ML-first hybrid (DeBERTa-v3 classifiers + session-context Transformer embeddings + LLM fallback), not pure-LLM routing.
→ Multi-turn trajectory evaluation: persona-driven simulated users exercising the full N×M intent × persona matrix; trajectory scoring, not just turn-by-turn.
→ LLM evaluation pipelines: 18+ metrics, CI/CD-integrated, self-hosted for compliance; regressions block merge.
→ Memory-augmented agent systems: bitemporal, deterministic, auditable. See Attestor below.
→ Guardrail architectures for fiduciary contexts: input filtering, retrieval boundaries, response gate, full audit trail. Compliance as a co-author, not a reviewer.
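
The ML-first routing idea in the first item can be sketched as a confidence gate: the classifier decides when it is confident, and the LLM is consulted only when it is not. This is a minimal illustration, not the production system. The 0.85 threshold, the function names, and the toy classifier and LLM stand-ins are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class RouteDecision:
    agent: str
    source: str        # "classifier" or "llm_fallback"
    confidence: float  # top classifier probability


def route(
    text: str,
    classify: Callable[[str], Dict[str, float]],  # hypothetical: intent -> probability
    llm_route: Callable[[str], str],              # hypothetical: returns an intent name
    threshold: float = 0.85,                      # assumed value, tune per deployment
) -> RouteDecision:
    scores = classify(text)
    intent, confidence = max(scores.items(), key=lambda kv: kv[1])
    if confidence >= threshold:
        return RouteDecision(intent, "classifier", confidence)
    # Classifier is unsure: defer to the slower LLM router.
    return RouteDecision(llm_route(text), "llm_fallback", confidence)


# Toy stand-ins so the sketch runs without any model.
def toy_classifier(text: str) -> Dict[str, float]:
    if "budget" in text.lower():
        return {"budgeting": 0.95, "investment": 0.03, "credit": 0.02}
    return {"budgeting": 0.40, "investment": 0.35, "credit": 0.25}


def toy_llm(text: str) -> str:
    return "planning"


confident = route("Help me set a monthly budget", toy_classifier, toy_llm)
ambiguous = route("What should I do with my bonus?", toy_classifier, toy_llm)
```

Recording which path produced each decision makes it possible to track how often the fallback fires.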

Open-Source Work

Three projects, each one a public artifact.

01 · Active

Finance

MCP-native financial analytics on Claude. 25+ tools and 3 slash commands across equity research, portfolio analysis, options, cross-asset, fundamentals, and ML pipelines.

Stack: Python · MCP · sklearn · yfinance · Massive · Plaid (v1.5)

Browse 8 category pages →

02 · Active

Attestor

Auditable memory for agent teams. Deterministic. Bitemporal. Self-hosted, with no LLM in the critical path.

Stack: PostgreSQL · pgvector · Neo4j · Voyage AI embeddings

The memory layer most agent stacks pretend they don't need until the regulator asks where a number came from.
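
To make "bitemporal" concrete: each fact carries two dates, when it became true and when the system learned it, so you can ask what was known at any point in the past. The sketch below is an in-memory illustration only. The class, field names, and account key are hypothetical and are not Attestor's schema or API.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Fact:
    key: str
    value: float
    valid_from: date   # when the fact became true in the world
    recorded_on: date  # when the system learned it


class BitemporalStore:
    """Append-only store: corrections add rows, nothing is overwritten."""

    def __init__(self) -> None:
        self._facts: List[Fact] = []

    def record(self, fact: Fact) -> None:
        self._facts.append(fact)

    def as_of(self, key: str, valid_at: date, known_at: date) -> Optional[Fact]:
        candidates = [
            f for f in self._facts
            if f.key == key and f.valid_from <= valid_at and f.recorded_on <= known_at
        ]
        if not candidates:
            return None
        # Latest validity wins; ties are broken by the most recent recording.
        return max(candidates, key=lambda f: (f.valid_from, f.recorded_on))


store = BitemporalStore()
store.record(Fact("acct-1/balance", 1000.0, date(2024, 1, 1), date(2024, 1, 2)))
# A correction for the same valid date arrives six weeks later.
store.record(Fact("acct-1/balance", 1050.0, date(2024, 1, 1), date(2024, 2, 15)))

before = store.as_of("acct-1/balance", date(2024, 1, 31), known_at=date(2024, 2, 1))
after = store.as_of("acct-1/balance", date(2024, 1, 31), known_at=date(2024, 3, 1))
```

The same question returns 1000.0 or 1050.0 depending on `known_at`, which is the property an auditor needs: the answer the system gave on a given day can be reproduced exactly.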

I write about multi-agent architecture, LLM evaluation, and applied AI in regulated industries. The eight pages below are a deep dive into Finance specifically — open source, in production, with the math, the source, and the design notes.

Why I Built This

I've had the same conversation hundreds of times.

Over 15 years in financial services (Merrill Lynch, Fidelity, and beyond), I kept meeting the same person. A VP. An MD. A senior analyst. Brilliant at their craft. Standing by a whiteboard or hunched over a Bloomberg terminal.

And they'd always say some version of this:

"I know exactly what analysis I need. I just can't get it done fast enough."

Then they'd show me the spreadsheet. Always a spreadsheet.

A $100 trillion industry (portfolios, models, forecasts, reconciliations, board decks), all flowing through .xlsx files held together by VLOOKUP and prayer.

The smartest people in the room weren't bottlenecked by ideas. They were bottlenecked by tooling. Waiting on engineering tickets. Waiting on the quant team. Waiting on a Python script someone wrote in 2019 that nobody remembers how to run.

So I stopped waiting and built the thing they actually needed.

MERRILL LYNCH

Risk Management Systems

Built distributed platforms for institutional risk analytics

FIDELITY INVESTMENTS

Trading Infrastructure

Order routing systems and distributed trading platforms

STEALTH FINTECH · SEP 2021 – PRESENT · SF

Staff AI Engineer · Multi-Agent AI in Production

Member-facing financial product. Multi-agent routing (planning, budgeting, investment, credit) → guardrail layer → response, multi-turn state across concurrent sessions. DeBERTa-v3 intent classifiers + session-level Transformer embeddings, LLM fallback only. 18-metric eval harness with multi-turn trajectory simulation, self-hosted for compliance. Guardrail architecture for fiduciary contexts — input filter, retrieval boundaries, response gate, full audit trail.
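
The guardrail flow described above (input filter, retrieval boundary, response gate, audit trail) can be sketched as three checks that each write to an audit log. This is a simplified illustration under stated assumptions: the blocked terms, allowed sources, and function names are hypothetical, and a real system would use classifiers and policy engines rather than substring matching.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical policy lists, for illustration only.
BLOCKED_INPUT_TERMS = ("guaranteed return",)
ALLOWED_SOURCES = {"product_docs", "member_account"}
BLOCKED_OUTPUT_PHRASES = ("you should buy",)
REFUSAL = "I can't help with that request."


@dataclass
class AuditTrail:
    events: List[Dict[str, object]] = field(default_factory=list)

    def log(self, stage: str, passed: bool, detail: str) -> None:
        self.events.append({"stage": stage, "passed": passed, "detail": detail})


def guarded_answer(
    question: str,
    retrieve: Callable[[str], List[Dict[str, str]]],
    generate: Callable[[str, List[Dict[str, str]]], str],
    audit: AuditTrail,
) -> str:
    # Stage 1: input filter.
    if any(term in question.lower() for term in BLOCKED_INPUT_TERMS):
        audit.log("input_filter", False, "blocked term in question")
        return REFUSAL
    audit.log("input_filter", True, "ok")

    # Stage 2: retrieval boundary, keeping only documents from allowed sources.
    docs = [d for d in retrieve(question) if d["source"] in ALLOWED_SOURCES]
    audit.log("retrieval_boundary", True, f"{len(docs)} documents in bounds")

    # Stage 3: response gate on the drafted answer.
    draft = generate(question, docs)
    if any(phrase in draft.lower() for phrase in BLOCKED_OUTPUT_PHRASES):
        audit.log("response_gate", False, "draft contained blocked phrase")
        return REFUSAL
    audit.log("response_gate", True, "ok")
    return draft


# Toy stand-ins so the sketch runs without a retriever or model.
def toy_retrieve(question: str) -> List[Dict[str, str]]:
    return [
        {"source": "product_docs", "text": "Start with fixed expenses."},
        {"source": "web_scrape", "text": "Hot stock tips."},
    ]


def toy_generate(question: str, docs: List[Dict[str, str]]) -> str:
    return f"Based on {len(docs)} source(s): start with fixed expenses."


trail = AuditTrail()
reply = guarded_answer("How do I set a budget?", toy_retrieve, toy_generate, trail)
```

Every stage logs whether it passed, so the trail shows which check produced a refusal.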

Read the engineering memo →

The Conversation That Kept Repeating

Every firm. Every desk. Same story.

VP, Equity Research

"I need a Sharpe ratio comparison across 5 tickers. IT says 3 weeks."

MD, Investment Banking

"I signed up for that Python course. Got through week 2. Then Q3 close happened."

Director, FP&A

"My team spends 40% of their time just cleaning data before they can analyze anything."

PM, Hedge Fund

"I know the analysis I want. I just can't express it in code."

What if you could express it in English?

That's Finance. You describe the analysis. Claude pulls the data, runs the computation, generates charts, and interprets the results. 30 seconds. One sentence.

The Industry's Answer Was Wrong

They said "learn Python."
I watched what happened next.

In 2023, JPMorgan told investors every new analyst would be trained in Python. Training companies charged $200+ per seat. Thousands enrolled.

I watched from the inside. The same cycle played out everywhere: enthusiasm, frustration, abandonment. Deadlines don't wait for your learning curve.

Python is powerful. I've built production systems in it. But asking finance professionals to become software engineers just to run a Sharpe ratio was always the wrong answer.

The right answer: give them tools that speak their language.

Week 1

Excited. Installed Python. "This is going to change everything."

Week 3

Debugging import errors. Stack Overflow tabs multiplying. Deadline looming.

Week 6

Back in Excel. The model ships. The Python course gathers dust.

✓

With Finance

30 seconds. One sentence. Full analysis with charts and interpretation.

Core MCP Tools

Three categories. Zero code.

The foundation layer — market analysis, ML workflows, and environment checks.

Market Analysis (6 tools)

📈

Stock Price Analysis

analyze_stock

Price chart with trend summary. "Show me AAPL's price chart for the last 6 months"

📊

Returns Analysis

get_returns

Daily and cumulative return charts. "What are NVDA's returns since January?"
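
For readers who want the arithmetic behind daily and cumulative returns, here is a minimal sketch using simple (not log) returns. It illustrates the standard definitions and is not the source of get_returns. The function names and prices are made up.

```python
from typing import List


def daily_returns(prices: List[float]) -> List[float]:
    """Simple return between consecutive closes: p_t / p_(t-1) - 1."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


def cumulative_returns(daily: List[float]) -> List[float]:
    """Compound the daily returns into a running total return."""
    out: List[float] = []
    growth = 1.0
    for r in daily:
        growth *= 1.0 + r
        out.append(growth - 1.0)
    return out


prices = [100.0, 110.0, 99.0]  # made-up closing prices
daily = daily_returns(prices)
cumulative = cumulative_returns(daily)
```

A 10% gain followed by a 10% loss leaves the position about 1% down, which is why the cumulative series compounds rather than sums.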

🌊

Volatility Analysis

get_volatility

Annualized and 21-day rolling volatility. "How volatile has TSLA been this quarter?"
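
Annualized and rolling volatility are usually computed as the sample standard deviation of daily returns scaled by the square root of trading days per year. The sketch below assumes the common 252-day convention and is not taken from the get_volatility source. The function names are illustrative.

```python
import math
import statistics
from typing import List

TRADING_DAYS = 252  # assumed annualization convention


def annualized_vol(returns: List[float]) -> float:
    """Sample standard deviation of daily returns, scaled to one year."""
    return statistics.stdev(returns) * math.sqrt(TRADING_DAYS)


def rolling_vol(returns: List[float], window: int = 21) -> List[float]:
    """Annualized volatility over each trailing window of `window` days."""
    return [
        annualized_vol(returns[end - window:end])
        for end in range(window, len(returns) + 1)
    ]


sample = [0.01, -0.01] * 15  # 30 made-up daily returns
full_period = annualized_vol(sample)
rolling = rolling_vol(sample)
```

With 30 observations and a 21-day window, the rolling series has 10 points, one for each day that has a full window behind it.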

⚡

Risk Metrics

get_risk_metrics

Sharpe ratio, max drawdown, beta vs S&P 500. "Get risk metrics for GOOGL over the last year"
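
The three risk metrics have compact textbook definitions. The sketch below is an illustration under assumptions (daily data, 252 trading days, a zero risk-free rate by default). It is not the get_risk_metrics implementation, and the function names are hypothetical.

```python
import math
import statistics
from typing import List


def sharpe_ratio(returns: List[float], risk_free_daily: float = 0.0, periods: int = 252) -> float:
    """Annualized mean excess return divided by its standard deviation."""
    excess = [r - risk_free_daily for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods)


def max_drawdown(prices: List[float]) -> float:
    """Largest peak-to-trough decline, as a negative fraction."""
    peak = prices[0]
    worst = 0.0
    for price in prices:
        peak = max(peak, price)
        worst = min(worst, price / peak - 1.0)
    return worst


def beta(asset: List[float], benchmark: List[float]) -> float:
    """Covariance of asset and benchmark returns over benchmark variance."""
    mean_a = statistics.mean(asset)
    mean_b = statistics.mean(benchmark)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(asset, benchmark))
    var = sum((b - mean_b) ** 2 for b in benchmark)
    return cov / var


drawdown = max_drawdown([100.0, 120.0, 90.0, 130.0])  # made-up prices
benchmark_returns = [0.01, -0.02, 0.03]
asset_returns = [0.02, -0.04, 0.06]                   # exactly 2x the benchmark
asset_beta = beta(asset_returns, benchmark_returns)
asset_sharpe = sharpe_ratio(asset_returns)
```

The drawdown here is -25% (120 down to 90), and an asset that moves exactly twice as much as the benchmark has a beta of 2.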

📉

Ticker Comparison

compare_tickers

Normalized performance chart for 2-5 tickers. "Compare AAPL, MSFT, and GOOGL over 90 days"
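
"Normalized" here typically means rebasing every series to the same starting value so tickers at different price levels can share one chart. A minimal sketch, assuming a base of 100 and mirroring the 2-5 ticker limit stated above. The function names and prices are illustrative.

```python
from typing import Dict, List


def normalize(prices: List[float], base: float = 100.0) -> List[float]:
    """Rebase a price series so its first value equals `base`."""
    first = prices[0]
    return [base * p / first for p in prices]


def normalize_all(series: Dict[str, List[float]]) -> Dict[str, List[float]]:
    if not 2 <= len(series) <= 5:
        raise ValueError("expected between 2 and 5 tickers")
    return {ticker: normalize(prices) for ticker, prices in series.items()}


rebased = normalize_all({
    "AAA": [50.0, 55.0, 60.0],     # made-up prices
    "BBB": [200.0, 190.0, 210.0],
})
```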

🔗

Correlation Heatmap

correlation_map

Return correlation for 2-10 tickers. "Show correlation between AAPL, JPM, JNJ, and XOM"
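
A return-correlation heatmap is a grid of pairwise Pearson correlations. The sketch below computes that grid in plain Python as an illustration. It is not the correlation_map source, and the tickers and returns are made up.

```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(returns: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Pairwise correlations, keyed by ticker on both axes."""
    return {
        a: {b: pearson(returns[a], returns[b]) for b in returns}
        for a in returns
    }


matrix = correlation_matrix({
    "AAA": [0.01, -0.02, 0.03, 0.00],
    "BBB": [0.02, -0.04, 0.06, 0.00],    # moves with AAA
    "CCC": [-0.01, 0.02, -0.03, 0.00],   # moves against AAA
})
```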

ML Workflows (3 tools)

🔍

CSV Data Ingestion

ingest_csv

Auto-profile any CSV: column detection, outlier removal, EDA charts.
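
One way to picture auto-profiling: detect which columns are numeric, then flag values outside 1.5 times the interquartile range. The IQR rule is an assumption chosen for this sketch; the page does not say which outlier method ingest_csv uses. The function name and data are illustrative.

```python
import csv
import io
import statistics
from typing import Dict


def profile_csv(text: str) -> Dict[str, Dict[str, object]]:
    rows = list(csv.DictReader(io.StringIO(text)))
    profile: Dict[str, Dict[str, object]] = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        try:
            numbers = [float(v) for v in values]
        except ValueError:
            # Any non-numeric value makes this a text column.
            profile[column] = {"type": "text", "unique": len(set(values))}
            continue
        q1, _, q3 = statistics.quantiles(numbers, n=4)
        spread = 1.5 * (q3 - q1)
        kept = [x for x in numbers if q1 - spread <= x <= q3 + spread]
        profile[column] = {
            "type": "numeric",
            "outliers": len(numbers) - len(kept),
            "mean_without_outliers": statistics.mean(kept),
        }
    return profile


sample_csv = "client,balance\na,10\nb,11\nc,12\nd,13\ne,14\nf,15\ng,16\nh,1000\n"
report = profile_csv(sample_csv)
```

In this made-up sample the 1000 balance is flagged and the mean of the remaining values is 13.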

🎯

Liquidity Risk Model

liquidity_predictor + predict_liquidity

Train regression model, then score clients with LOW / MODERATE / HIGH risk ratings.

🤖

Investor Classifier

investor_classifier + classify_investor

RandomForest classification for investor segmentation by profile attributes.

Environment (2 tools)

🏓

Ping

ping

Confirm the MCP server is running and ready.

✅

Validate Environment

validate_environment

Check that all 7 required packages are installed and report their version numbers.
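
A check like this can be built on the standard library's importlib.metadata. The sketch below is illustrative and not the validate_environment source. The package list is a hypothetical example, since the page does not name the 7 required packages.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if missing."""
    report: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


# Hypothetical list for illustration; the real required set may differ.
example_packages = ["scikit-learn", "yfinance", "this-package-does-not-exist-12345"]
versions = installed_versions(example_packages)
missing = [name for name, version in versions.items() if version is None]
```

Returning None for missing packages, rather than raising, lets a single call report every gap at once.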

3 Slash Commands

Type a slash. Get institutional output.

Three finance personas. One auto-router and two role-specific lenses (analyst and PM).

Finance Personas (3)

/finance

General-purpose analysis, routes to any of the 25+ tools

/finance-analyst

Equity analyst lens: Sharpe first, single-stock focus

/finance-pm

Portfolio manager lens: drawdown first, portfolio risk

Private Equity

Looking for the PE workflows — DX decision diagnostic, BX cross-portco benchmarking, IC memos, DD checklists, value-creation plans? They moved to their own focused repo.

bolnet/private-equity → · 5-lender BX demo → · Lending Club DX demo →

2 Professional Personas

Same data. Different lens.

Both use the same tools. The difference is framing, priority, and audience.

/finance-analyst

Equity Analyst

Frames every ticker as a security under research coverage. Optimized for buy-side and sell-side consumers.

Leads with: Sharpe ratio (risk-adjusted return quality)

Beta means: Stock-level market sensitivity

Next step: Compare to sector peers

❯ /finance-analyst initiate coverage on NVDA

/finance-pm

Portfolio Manager

Frames every ticker as a holding in the portfolio book. Optimized for internal risk committee and LP reporting.

Leads with: Max drawdown (worst-case portfolio loss)

Beta means: Portfolio-level systematic risk exposure

Next step: Check correlation with other holdings

❯ /finance-pm review risk on my holdings: AAPL, NVDA, JPM

See It In Action

You describe it. Claude delivers it.

claude-code ~/portfolio-analysis

❯ /finance-pm check diversification across AAPL, JPM, JNJ, XOM

⠋ Running compare_tickers + correlation_map...
✓ Retrieved 252 trading days per ticker

Portfolio Risk Summary (PM Lens)
┌──────────┬────────────┬──────────────┬──────────┐
│ Holding │ Sharpe     │ Max Drawdown │ Beta     │
├──────────┼────────────┼──────────────┼──────────┤
│ AAPL     │ 1.42       │ -14.2%       │ 1.12     │
│ JPM      │ 1.18       │ -9.7%        │ 1.08     │
│ JNJ      │ 0.64       │ -7.1%        │ 0.55     │
│ XOM      │ 0.91       │ -11.4%       │ 0.72     │
└──────────┴────────────┴──────────────┴──────────┘

✓ Correlation heatmap saved
✓ Normalized performance chart saved

Real diversification detected. JNJ (beta 0.55) and XOM (0.72) provide meaningful downside protection against the tech-heavy AAPL position. Cross-correlation between JNJ and AAPL is only 0.31...

5 Guided Walkthroughs

Built for how you actually work.

Role-specific workflows, because I've sat across the table from every one of these roles.

Equity Research

Coverage Initiation

analyze_stock · get_risk_metrics · get_returns

Price charts, risk metrics, and cumulative performance. The three data points that feed every research note.

Hedge Fund PM

Diversification Audit

compare_tickers · correlation_map · get_volatility

Real diversification or correlated bets in different names? Vol regime detection and pair trading signals.

Investment Banking

Comparable Analysis

compare_tickers · correlation_map

Normalized performance comparisons and relative positioning for deal pitches and supporting materials.

FP&A

Data Profiling

ingest_csv · liquidity_predictor

Automated CSV profiling, forecasting model inputs, and variance analysis prep.

Accounting

Anomaly Detection

ingest_csv

Transaction data profiling, outlier detection prep, and ERP export cleanup.

The New Finance AI Stack

Three layers. One workflow.

AI finally came to where finance lives. Finance adds the analytical layer that was missing.

🔷

Copilot for Finance

Microsoft

Operational finance: reconciliation, variance analysis, collections, ERP integration.

Reconciliation · Variance · Collections · ERP

🟣

Claude in Excel

Anthropic

Model intelligence: formula tracing, scenario testing, error debugging with cell-level citations.

Formula Trace · Debugging · Scenarios · Skills

You Are Here

⚡

Finance

Open Source

Analytical horsepower: 25+ tools, ML models, institutional-grade research.

Sharpe / Beta · ML Models · Options Greeks · Personas

Get Started

Three ways to install.

MCP server for Claude Code CLI, plugin for slash commands, or web connection for claude.ai in your browser.

Recommended

Method 1

MCP Server

Claude Code CLI. Local stdio transport. Full 11 tools.

git clone https://github.com/bolnet/finance.git
cd finance
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

Method 2

Claude Code Plugin

MCP server + 18 slash commands + 16 skill definitions bundled together.

cd finance
claude

# All 18 commands auto-discovered
❯ /finance analyze AAPL

Method 3

Web (claude.ai)

HTTP tunnel via ngrok or Cloudflare. Use from browser, no CLI needed.

bash scripts/start_web.sh

# Paste URL in claude.ai
# Settings > Connectors > Add
