Every analytical tool calls provider.fetch_price_history(). The provider is a duck-typed object — it can be yfinance, Massive, or Plaid, and the tool code is identical. This is the boring abstraction that makes everything else possible.
Notes from Surendra Singh.
from typing import Protocol
from pandas import DataFrame, Series

class DataProvider(Protocol):
    def fetch_price_history(self, ticker: str, start: str, end: str | None) -> DataFrame: ...
    def get_adjusted_prices(self, df: DataFrame) -> Series: ...
    def get_options_chain(self, ticker: str, expiry: str) -> DataFrame: ...
    def get_news(self, ticker: str | None, limit: int) -> list[dict]: ...
    # ... and so on, one method per capability
Python's Protocol is structural typing: any object with the right methods satisfies the contract, no inheritance required. yfinance, Massive, and Plaid are independent classes. None of them inherits from DataProvider. Each one still counts as a DataProvider, because each one has the methods.
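A minimal sketch of that structural check, using only the standard library. PriceSource, FakeYahoo, FakeVendor, and last_price are hypothetical names invented for illustration, and prices are plain lists of floats rather than pandas objects so the example runs without dependencies.

```python
from __future__ import annotations

from typing import Protocol, runtime_checkable


@runtime_checkable
class PriceSource(Protocol):
    """Hypothetical one-method stand-in for DataProvider."""

    def fetch_price_history(self, ticker: str, start: str, end: str | None) -> list[float]: ...


class FakeYahoo:  # no base class at all
    def fetch_price_history(self, ticker: str, start: str, end: str | None) -> list[float]:
        return [100.0, 101.5, 99.8]


class FakeVendor:  # unrelated class with different internals (needs a key)
    def __init__(self, api_key: str) -> None:
        self._api_key = api_key

    def fetch_price_history(self, ticker: str, start: str, end: str | None) -> list[float]:
        return [100.0, 101.4, 99.9]


def last_price(provider: PriceSource, ticker: str) -> float:
    """Tool code: depends on the Protocol only, never on a concrete class."""
    return provider.fetch_price_history(ticker, "2024-01-01", None)[-1]


# Both classes are accepted, although neither inherits from PriceSource.
print(last_price(FakeYahoo(), "AAPL"))
print(last_price(FakeVendor("demo-key"), "AAPL"))
print(isinstance(FakeYahoo(), PriceSource))
```

Note that runtime_checkable makes isinstance confirm only that the method exists; matching signatures are verified by a static type checker, not at runtime.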
Inheritance would have been a mistake. yfinance and Massive have nothing in common at the implementation level — different auth, different serialization, different error semantics. Protocol-based duck typing lets each provider be optimized for its own internals while still being substitutable. This is what "good interface" actually means.
yfinance: zero-config. Public Yahoo endpoints. Adequate for individual research, slow for cross-sectional work, rate-limited under load.
Massive: 57 endpoints behind the same Protocol, covering stocks, options + Greeks, forex, crypto, indices, news, SEC filings, technicals, fundamentals, and movers.
Plaid: connect a brokerage in 20 seconds. Positions and cost basis from Fidelity, Schwab, E*TRADE, Vanguard, IBKR, Robinhood, and 12,000+ institutions.
# swap via environment variable, no code change
DATA_PROVIDER=yfinance   # default
DATA_PROVIDER=massive    # requires MASSIVE_API_KEY
DATA_PROVIDER=plaid      # v1.5, requires PLAID_CLIENT_ID + PLAID_SECRET
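One way the env-var switch could be wired up is a small factory keyed on DATA_PROVIDER. This is a sketch under assumptions, not the project's actual loader: load_provider, the stub provider classes, and the _FACTORIES table are all hypothetical, and only the variable names DATA_PROVIDER and MASSIVE_API_KEY come from the text above.

```python
from __future__ import annotations

import os
from typing import Callable, Mapping


class YFinanceProvider:
    """Hypothetical stub; needs no credentials."""
    name = "yfinance"


class MassiveProvider:
    """Hypothetical stub; requires an API key."""
    name = "massive"

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key


def _build_massive(env: Mapping[str, str]) -> MassiveProvider:
    key = env.get("MASSIVE_API_KEY")
    if not key:
        raise RuntimeError("DATA_PROVIDER=massive requires MASSIVE_API_KEY")
    return MassiveProvider(key)


# One entry per provider; adding a provider means adding one line here.
_FACTORIES: dict[str, Callable[[Mapping[str, str]], object]] = {
    "yfinance": lambda env: YFinanceProvider(),
    "massive": _build_massive,
}


def load_provider(env: Mapping[str, str] | None = None) -> object:
    """Pick a provider from DATA_PROVIDER, defaulting to yfinance."""
    env = os.environ if env is None else env
    name = env.get("DATA_PROVIDER", "yfinance").strip().lower()
    if name not in _FACTORIES:
        raise ValueError(f"Unknown DATA_PROVIDER {name!r}; expected one of {sorted(_FACTORIES)}")
    return _FACTORIES[name](env)
```

Passing the environment as an argument keeps the function testable without mutating os.environ; failing fast on a missing key surfaces misconfiguration at startup rather than on the first request.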
yfinance is not the most reliable data source. It is rate-limited, occasionally serves stale data, and depends on Yahoo's web endpoints, which Yahoo can break at any time without notice. It is the right default anyway, because it needs zero configuration, costs nothing, and works for a user trying the tool for the first time. The bar for "default" is "does the README example work without an API key in 90 seconds." yfinance clears it. Nothing else does.
The consequence: yfinance is for evaluation and individual analyst work. The moment the workload becomes cross-sectional or production, the recommendation is to switch to Massive. The Protocol means switching is one env var.
Massive is wrapped via a thin client (finance_mcp/providers/massive/client.py) and a set of mappers that translate Massive's response shape into the Protocol's expected types. The split matters:
client.py: HTTP, auth, retries, JSON parsing. Knows about Massive.
mappers.py: translates Massive's payloads into the canonical pandas shapes. Knows about both.
provider.py: implements the Protocol methods by composing client + mappers. Knows only about the Protocol.

If Massive changes their schema tomorrow, only the mappers change. The tools and the Protocol are insulated. This is what "vendor risk mitigation" looks like at the code level.
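The three-layer split can be sketched in a few lines. Everything here is illustrative: FakeMassiveClient, get_aggregates, the payload keys "results", "t", and "c", and the Bar dataclass are invented for the example and are not Massive's real API or schema. The canonical shape is a list of dataclasses rather than a pandas object so the sketch stays dependency-free.

```python
from __future__ import annotations

from dataclasses import dataclass


# client layer: knows the vendor, nothing else
class FakeMassiveClient:
    """Hypothetical stand-in for client.py; a real one does HTTP, auth, retries."""

    def get_aggregates(self, ticker: str) -> dict:
        # Invented payload shape, for illustration only.
        return {"results": [{"t": "2024-01-02", "c": 185.6},
                            {"t": "2024-01-03", "c": 184.3}]}


# mapper layer: knows the vendor shape AND the canonical shape
@dataclass(frozen=True)
class Bar:
    date: str
    close: float


def map_aggregates(payload: dict) -> list[Bar]:
    """The only function that must change if the vendor renames a field."""
    return [Bar(date=row["t"], close=row["c"]) for row in payload.get("results", [])]


# provider layer: composes client + mapper, exposes the Protocol method
class SketchProvider:
    def __init__(self, client: FakeMassiveClient) -> None:
        self._client = client

    def fetch_price_history(self, ticker: str, start: str, end: str | None = None) -> list[Bar]:
        bars = map_aggregates(self._client.get_aggregates(ticker))
        # ISO-8601 date strings sort lexicographically, so string comparison works.
        return [b for b in bars if b.date >= start and (end is None or b.date <= end)]
```

If the hypothetical vendor renamed "c" to "close", the fix would be confined to map_aggregates; SketchProvider and every caller of fetch_price_history would stay untouched.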
yfinance and Massive give the user a way to type tickers and get math. Plaid changes the top of the funnel: the user authenticates a brokerage, and the book arrives. Same downstream tools. New audience — the analyst who never typed tickers because that wasn't the bottleneck.
The Plaid integration adds two Protocol methods:
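from dataclasses import dataclass
from typing import Protocol

@dataclass(frozen=True)
class Holding:
    ticker: str
    shares: float
    cost_basis: float
    account: str

class DataProvider(Protocol):
    # ... existing methods unchanged ...
    def get_positions(self, account_id: str) -> list[Holding]: ...
    def get_cost_basis(self, account_id: str, ticker: str) -> float: ...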
Read-only. No order entry, no execution, no custody. The integration is an account-linking convenience layer — not a brokerage.
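To make the read-only contract concrete, here is a sketch of an in-memory object that satisfies the two position methods. InMemoryHoldings and the sample book are hypothetical and involve no Plaid calls. The source does not say whether cost_basis is per share or per position, so the example simply returns the stored value without interpreting it.

```python
from __future__ import annotations

from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Holding:
    ticker: str
    shares: float
    cost_basis: float
    account: str


class InMemoryHoldings:
    """Hypothetical read-only provider backed by a fixed tuple, not by Plaid."""

    def __init__(self, holdings: list[Holding]) -> None:
        self._holdings = tuple(holdings)  # immutable snapshot

    def get_positions(self, account_id: str) -> list[Holding]:
        return [h for h in self._holdings if h.account == account_id]

    def get_cost_basis(self, account_id: str, ticker: str) -> float:
        for h in self.get_positions(account_id):
            if h.ticker == ticker:
                return h.cost_basis
        raise KeyError(f"{ticker} not held in account {account_id}")


# Made-up sample data.
book = InMemoryHoldings([
    Holding("AAPL", 10.0, 1500.0, "ira"),
    Holding("MSFT", 5.0, 1700.0, "ira"),
    Holding("AAPL", 2.0, 380.0, "taxable"),
])

print([h.ticker for h in book.get_positions("ira")])

# frozen=True means downstream tools cannot mutate a holding by accident.
try:
    book.get_positions("ira")[0].shares = 0.0
except FrozenInstanceError:
    print("holdings are immutable")
```

The frozen dataclass mirrors the read-only stance at the type level: the interface has no method that writes, and the returned records reject attribute assignment.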