<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Software Synthesis]]></title><description><![CDATA[AI and startup strategy to help companies grow.]]></description><link>https://www.akashbajwa.co</link><image><url>https://substackcdn.com/image/fetch/$s_!69uV!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0283720-672c-4385-80ad-bf790da0c73c_500x500.png</url><title>Software Synthesis</title><link>https://www.akashbajwa.co</link></image><generator>Substack</generator><lastBuildDate>Mon, 18 May 2026 03:28:46 GMT</lastBuildDate><atom:link href="https://www.akashbajwa.co/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Akash Bajwa]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[akashbajwa@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[akashbajwa@substack.com]]></itunes:email><itunes:name><![CDATA[Akash Bajwa]]></itunes:name></itunes:owner><itunes:author><![CDATA[Akash Bajwa]]></itunes:author><googleplay:owner><![CDATA[akashbajwa@substack.com]]></googleplay:owner><googleplay:email><![CDATA[akashbajwa@substack.com]]></googleplay:email><googleplay:author><![CDATA[Akash Bajwa]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Foundation Models for Math with Harmonic & Latinum]]></title><description><![CDATA[Formal Verification: Code, Hardware and more]]></description><link>https://www.akashbajwa.co/p/foundation-models-for-math-with-harmonic</link><guid isPermaLink="false">https://www.akashbajwa.co/p/foundation-models-for-math-with-harmonic</guid><pubDate>Wed, 13 May 2026 09:02:04 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!haSy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b197b88-3ee2-4ab9-923f-00ae75eaf5fe_4284x5712.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h4><em>Upcoming Events in London</em></h4><ul><li><p>May 20th: <a href="https://luma.com/qqsw0224">Pydantic x Glyphic Engineering Night</a></p></li><li><p>May 28th: <a href="https://luma.com/event/manage/evt-Allx2CD6D3E8vYt/overview">Inference Stack Innovation with Crusoe</a></p></li><li><p>June 4th: <a href="https://luma.com/yryv7w3f">Forward Deployed Engineering with Anjor Kanekar</a></p></li></ul><div><hr></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.akashbajwa.co/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Join 8k+ founders and operators for weekly startup and AI strategy</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><p>Two weeks ago, we hosted <a href="https://www.linkedin.com/in/ericbg/">Eric Rodriguez</a> (<strong>Harmonic</strong>), <a href="https://www.linkedin.com/in/brenregan/">Brendan Regan</a> and <a href="https://www.linkedin.com/in/dennjosele/">Dennj Osele</a> (<strong>Latinum</strong>) for a discussion on foundation models for mathematics, formal verification and its applications.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!haSy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b197b88-3ee2-4ab9-923f-00ae75eaf5fe_4284x5712.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!haSy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b197b88-3ee2-4ab9-923f-00ae75eaf5fe_4284x5712.jpeg 424w, https://substackcdn.com/image/fetch/$s_!haSy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b197b88-3ee2-4ab9-923f-00ae75eaf5fe_4284x5712.jpeg 848w, https://substackcdn.com/image/fetch/$s_!haSy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b197b88-3ee2-4ab9-923f-00ae75eaf5fe_4284x5712.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!haSy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b197b88-3ee2-4ab9-923f-00ae75eaf5fe_4284x5712.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!haSy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b197b88-3ee2-4ab9-923f-00ae75eaf5fe_4284x5712.jpeg" width="1456" height="1941" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6b197b88-3ee2-4ab9-923f-00ae75eaf5fe_4284x5712.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1941,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!haSy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b197b88-3ee2-4ab9-923f-00ae75eaf5fe_4284x5712.jpeg 424w, https://substackcdn.com/image/fetch/$s_!haSy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b197b88-3ee2-4ab9-923f-00ae75eaf5fe_4284x5712.jpeg 848w, https://substackcdn.com/image/fetch/$s_!haSy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b197b88-3ee2-4ab9-923f-00ae75eaf5fe_4284x5712.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!haSy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b197b88-3ee2-4ab9-923f-00ae75eaf5fe_4284x5712.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Formal verification is the practice of expressing software programs and mathematical proofs in a language that a machine can check mechanically. Lean has become the most important proof assistant in this ecosystem, with Mathlib serving as its core knowledge base: a formalised library of mathematics now comprising roughly 2.25 million lines of human-curated code. Historically, however, the cost of formalisation has kept the field relatively narrow, concentrated in academic mathematics and high-assurance domains such as hardware, cryptography, and security-critical software.</p><p>That constraint is beginning to change. LLMs are making it cheaper to generate Lean code and proofs, reducing the human effort required to formalise mathematics or verify software. 
If the trend continues, formal verification could move from a specialised discipline to a more widely used layer for checking mathematical, software, and eventually AI-generated outputs.</p><p>Models can now generate Lean proofs at a scale and quality that human reviewers struggle to distinguish from human work, opening up a credible path to formalising large parts of mathematics, verifying production software, and producing AI systems whose outputs can be checked rather than merely trusted. The deeper bet is that mathematics is uniquely well-suited as a training environment for general reasoning &#8212; the compiler provides a perfect reward signal, which is the cleanest reinforcement-learning setup currently available.</p><p>On to the discussion.</p><h2><strong>1. The frontier has moved past the IMO</strong></h2><p>Olympiad-level mathematics is now treated as a solved training objective, with new postgraduate mathematics tests emerging, like <a href="https://1stproof.org/about.html">First Proof</a>. The active research problems are <strong>library construction</strong> and <strong>longer time-horizon research-level proofs</strong>, not individual theorem generation.</p><p>This changes how capability should be measured. A theorem that takes ~10 hours to prove against a mature library may take ~100 hours without it (METR time-horizon framing). Library growth is the rate-limiter. Systems that amortise library construction, not just prove harder theorems, are what unlock the next step function.</p><h2><strong>2. New libraries emerging</strong></h2><p>LLM submissions are increasing pressure on Mathlib. The underlying constraint is maintenance cost. Lean does not guarantee backward compatibility, so each release requires Mathlib to be updated.</p><p>The risk lies in Mathlib&#8217;s current model, which was designed for human-scale throughput.
Parallel libraries may grow alongside it rather than feeding into it.</p><p>Multiple parallel <a href="https://github.com/merely-true/merely-true">AI-generated</a> Lean libraries are emerging outside Mathlib, each aiming to be larger than Mathlib itself, though all are still nascent. There&#8217;s a risk that these will be mutually incompatible, with divergent formalisations of the same objects.</p><h2><strong>3. Latinum&#8217;s approach</strong></h2><p><strong>Dennj spoke about Latinum&#8217;s model: recursive latent reasoning.</strong> A large encoder ingests context, produces a reasoning step in latent space, feeds it back, and cycles until convergence; a decoder tokenises only the final output.</p><p>Lineage: JEPA-style world models, Universal Transformer.</p><p>Key claim: <strong>the recursion itself is trained</strong>, with latent states supervised at each step, permitting parallel internal chains that converge externally. Stated rationale: chain-of-thought burns scarce context window; internal recursion is unconstrained because nothing is printed.</p><h2><strong>4. Harmonic&#8217;s model</strong></h2><p><strong>Eric spoke about hybrid tree search.</strong> Three points on a spectrum:</p><ul><li><p>Whole-proof generation &#8212; simple, no granular feedback, one-line breakage forces full regeneration</p></li><li><p>Tree search &#8212; tactic-level policy model, parallel state exploration, heavy infrastructure burden (Lean execution at scale)</p></li><li><p>Hybrid decomposition &#8212; break the problem, use whole-proof generation on pieces, assemble via high-level strategist</p></li></ul><p><strong>Why frontier LLMs don&#8217;t natively do tree search:</strong> base models are trained for full answers, not single tactics. Tree-search policy models are trained differently and have produced proofs not previously known to humanity (Erd&#337;s-class problems), so this is not regurgitation.</p><h2><strong>5. 
Action-space trade-off: fundamental, can&#8217;t be engineered away</strong></h2><p>Lean offers two obvious cheating mechanisms for an AI prover. The first, sorry, is a placeholder that lets the model skip a proof and still compile &#8212; equivalent to writing &#8220;trust me.&#8221; The second, custom axioms, lets the model assert something is true without proof. Both are legitimate Lean primitives with valid uses, but both are also the path of least resistance for a model that can&#8217;t actually prove what&#8217;s being asked.</p><p>Earlier prover systems were built so the model literally could not emit sorry placeholders or introduce axioms &#8212; the only allowed operations were adding new content and proving it from existing primitives. This made cheating structurally impossible. Users rejected it: pure-additive systems cannot <strong>refactor existing proofs, fix incorrect user-provided definitions, or scaffold a proof structure with intentional placeholders</strong>. None of these are exotic workflows &#8212; they are how Lean is actually used.</p><p>Current approach used by Harmonic: <strong>open the action space, constrain harmful actions case-by-case.</strong> Recent literature is consistent: over-restriction degrades model performance analogously to constrained attention.</p><p>The hard part is that sorry is not purely a failure signal. Mathematicians structure large proofs by writing skeletons with deliberate sorry-marked gaps and filling them in later (the &#8220;Lean blueprint&#8221; workflow). Models doing this are working correctly. Models emitting sorry because they gave up are working incorrectly. At the token level, the two are identical. <strong>The reward signal has to distinguish legitimate placeholder use from give-up behaviour without surface-level cues</strong> &#8212; a non-trivial supervision problem.</p><p><strong>Read-across to agent infrastructure:</strong> the capability-vs-constraint trade-off is the same problem every agent company faces.
Give the model file-write, shell access, or arbitrary API calls and it can do useful work &#8212; and also cause damage. Restrict the surface and the agent cannot do real work. Harness engineering &#8212; the discipline of designing this trade-off &#8212; is now a real engineering function. Formal maths is the cleanest version of the problem because cheating is well-defined; the same logic applies across general agent design.</p><h2><strong>6. Why maths? Two justifications, in tension</strong></h2><p><strong>Generalisation bet.</strong> Humans good at Fermat-class problems tend to transfer to code and quant reasoning. Train models on maths, expect transfer. Counter: the causal direction may not hold for models &#8212; selection effects in humans don&#8217;t imply training effects in models. Risk case is a system that plateaus as a domain-specialist.</p><p><strong>Data argument.</strong> Training data is the binding constraint. Maths plus a compiler is <strong>infinite, perfectly-labelled data</strong> &#8212; the compiler is 100% correct.</p><p><strong>Best pushback:</strong> mathematical data is not infinite in the relevant sense. 1+1+1+... is trivially generable and useless. The hard problem is <strong>deriving which <a href="https://davidbessis.substack.com/p/the-fall-of-the-theorem-economy">abstractions matter</a></strong> &#8212; which theorems are worth proving. Same failure mode as code-focused AGI approaches using compiler-pass or test-pass as reward: clean signal, no direction.</p><h2><strong>7. The underlying reason labs started with maths: RL signal quality</strong></h2><p>RL on formal maths works despite being a single-player game with no native win condition. The compiler <em>is</em> the reward signal &#8212; making formal maths among the cleanest RL environments available.</p><p>This explains the path-dependence of frontier lab strategy: they started where reward signals were cleanest (maths, reasoning), then migrated as signals degraded. 
The migration targets &#8212; Office, Excel, accounting &#8212; are domains where partial reward is recoverable (value reconciliation, keyword presence).</p><h2><strong>8. Mathematics occupies a privileged data position</strong></h2><p>Sharper framing of the data argument: maths has <strong>structure from which semantics can be inferred</strong>. Raw Python or JSON does not. A proof is recognisably a proof; a type signature encodes intent. Arbitrary code on the internet does not train the same way.</p><p>Lean 4 was designed as a programming language first and proof assistant second. In principle: translate conventional code to Lean, execute and verify simultaneously. The runtime supports full applications.</p><h2><strong>Commercial threads</strong></h2><h3><strong>Chip verification</strong></h3><p>Industry ratio: ~2&#8211;3 verification engineers per designer. Silicon errata economics justify this. Years of chip timelines are verification-bound.</p><p>Under-automated, and hardware description languages have small training corpora relative to mainstream languages &#8212; creating specialist-friendly data wedges. Self-play approaches (AlphaGo-analogous) are in commercial development but unsolved.</p><h3><strong>Software verification</strong></h3><p>Recent large rounds in math-to-software-verification reflect the thesis shift. The operative question is not whether every SaaS app needs formal verification, but whether <strong>verification becomes cheap enough that cost-benefit flips for mid-criticality systems</strong>. Current state: hire specialists, give them unlimited runway. Target state: run the verifier, price the failure mode.</p><h3><strong>Contract-first languages</strong></h3><p>Dafny is the reference case &#8212; contract first, implementation second. Never broke through because it required a mathematician and a software engineer. AI collapses this: model writes the contract, then the code; humans only review the contract. 
Code trends toward &#8220;the new assembly language.&#8221;</p><p>Destination is unclear &#8212; Dafny, extended Rust, Lean 4. Rust already moves most runtime bugs to compile time, so Python &#8594; Rust &#8594; contract-first is a plausible migration path.</p><h3><strong>Interconnection problem</strong></h3><p>Individually-verified components (CPU, router, crypto library) <strong>compose into systems with emergent vulnerabilities</strong> via the OS and network layer. Local verification is necessary, not sufficient. Concurrency proofs are harder than sequential proofs by another order. The gap between &#8220;verified component&#8221; and &#8220;verified system&#8221; is open.</p><div><hr></div><p></p>]]></content:encoded></item><item><title><![CDATA[Inside Intercom's AI-Native Journey: Brian Scanlan]]></title><description><![CDATA[The journey to tripling PRs per engineer]]></description><link>https://www.akashbajwa.co/p/inside-intercoms-ai-native-journey</link><guid isPermaLink="false">https://www.akashbajwa.co/p/inside-intercoms-ai-native-journey</guid><pubDate>Mon, 04 May 2026 06:02:03 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a4241a01-d42a-4ca7-bd2a-45cad7db5159_496x496.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h4><em>Upcoming Roundtables in London</em></h4><ul><li><p>May 6th: <a href="https://luma.com/xjyyovjc">AI-Native GTM with Clay</a></p></li><li><p>May 20th: <a href="https://luma.com/qqsw0224">Pydantic x Glyphic Engineering Night</a></p></li><li><p>May 28th: <a href="https://luma.com/event/manage/evt-Allx2CD6D3E8vYt/overview">Inference Stack Innovation with Crusoe</a></p></li><li><p>June 3rd: <a href="https://luma.com/yryv7w3f">Forward Deployed Engineering with Anjor Kanekar</a></p></li></ul><div><hr></div><div class="subscription-widget-wrap-editor" 
data-attrs="{&quot;url&quot;:&quot;https://www.akashbajwa.co/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Join thousands of founders and operators from leading companies</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><p><strong>Intercom</strong> is arguably one of the most successful examples of a SaaS company becoming AI-native. They declared Code Red early after ChatGPT launched and went all in on <strong>Fin</strong>, their AI agent for customer support. Since then, the company has started <a href="https://ideas.fin.ai/">publishing a blog</a> charting their journey to become as AI-pilled as they come.</p><p><a href="https://x.com/brian_scanlan">Brian Scanlan</a>, Senior Principal Systems Engineer, went viral when he gave a peek into Intercom&#8217;s Claude Code tooling.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://x.com/brian_scanlan/status/2033978300003987527" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FpQB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a97f3d3-544b-4a04-8675-354ff0f61380_1182x422.png 424w, https://substackcdn.com/image/fetch/$s_!FpQB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a97f3d3-544b-4a04-8675-354ff0f61380_1182x422.png 
848w, https://substackcdn.com/image/fetch/$s_!FpQB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a97f3d3-544b-4a04-8675-354ff0f61380_1182x422.png 1272w, https://substackcdn.com/image/fetch/$s_!FpQB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a97f3d3-544b-4a04-8675-354ff0f61380_1182x422.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FpQB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a97f3d3-544b-4a04-8675-354ff0f61380_1182x422.png" width="1182" height="422" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2a97f3d3-544b-4a04-8675-354ff0f61380_1182x422.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:422,&quot;width&quot;:1182,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:104457,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://x.com/brian_scanlan/status/2033978300003987527&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/196240994?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a97f3d3-544b-4a04-8675-354ff0f61380_1182x422.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FpQB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a97f3d3-544b-4a04-8675-354ff0f61380_1182x422.png 424w, 
https://substackcdn.com/image/fetch/$s_!FpQB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a97f3d3-544b-4a04-8675-354ff0f61380_1182x422.png 848w, https://substackcdn.com/image/fetch/$s_!FpQB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a97f3d3-544b-4a04-8675-354ff0f61380_1182x422.png 1272w, https://substackcdn.com/image/fetch/$s_!FpQB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a97f3d3-544b-4a04-8675-354ff0f61380_1182x422.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I had the chance to speak to Brian about how Intercom is changing hiring practices, instrumentation to measure the right metrics, consciously investing in Anthropic as a platform, org design, code quality and graduating from tokenmaxxing to seeing high quality PRs per engineer inflect. You can read some of the Intercom writing referred to in this interview <a href="https://ideas.fin.ai/p/how-we-use-claude-code-today-at-intercom">here</a> and <a href="https://ideas.fin.ai/p/2x-nine-months-later">here</a>. </p><div><hr></div><h2><strong>Talent and the interview process</strong></h2><p><strong>You built a console where Claude can execute arbitrary Ruby against production data, and you mentioned the top-5 users of the read-only Rails production console aren&#8217;t engineers. They&#8217;re design managers, support engineers, PMs. How has that shift changed how you think about what you&#8217;re looking for in talent, and how has your interview process changed?</strong></p><p>We&#8217;ve always looked for people who are flexible and product-oriented. Intercom is a product-oriented company, the founders are product people, and the name of our engineering role is product engineer. Until recently we even had a product component on our engineering interview. The same is true of designers and PMs: we want them close to reality, whether that&#8217;s users, code, or whatever else. There&#8217;s a great tweet from our head of product where he basically states that nothing else matters apart from code in production. I love that coming from a head of product, because you might expect them to be focused on plans, roadmaps, Gantt charts, and designs. Instead, what he cares about is how fast we&#8217;re getting real features and real code into production.</p><p>So this is more of a slight evolution than a reinvention. We&#8217;ve always been tech-agnostic. 
We&#8217;ve never just hired Ruby on Rails engineers, and we&#8217;ve been pretty liberal about hiring people from diverse backgrounds. I came from Amazon, for example. What we interview for is the ability to teach, learn, and change, and the ability to be open-minded, rather than five years of any specific stack or the aura of having worked at Google.</p><p>The tweaks we&#8217;ve made are around openness to AI. Most people showing up in our loops these days have seen what Intercom is doing and want to be part of it, so it&#8217;s not usually a problem. But we still get people from environments that have had zero AI exposure, or from more traditional backgrounds. Their enthusiasm and willingness to change are what matter. We look for openness rather than five years of cloud experience.</p><p>Looking forward, we expect design, product, and engineering roles to continue to converge. There&#8217;s still incredible value in depth across all these areas, but we now have product engineers doing full product management of features, designers shipping real code, and PMs shipping code as well. It&#8217;s not just box-checking. It&#8217;s about reframing those roles to look more like a builder role rather than a member-of-technical-staff role, rather than &#8220;I&#8217;m a designer and I hand stuff off to engineers.&#8221; We&#8217;re already well set up for this; the trajectory is just more and more towards everyone being able to do everything.</p><div><hr></div><h2><strong>Measuring what matters in Claude Code adoption</strong></h2><p><strong>You&#8217;ve instrumented every Claude Code lifecycle event with OpenTelemetry. This essentially treats Claude Code like a product you&#8217;re iterating on internally. What metrics do you actually look at to decide whether a skill is working, and how do you avoid the trap of measuring activity instead of impact, e.g. 
tokens spent, which can be a very poor proxy for value creation?</strong></p><p>There&#8217;s no shame in going through a journey of adoption where, at the start, you&#8217;re just looking to force people to try things out, to token-max. The opportunity cost is so big, and the amount of change you&#8217;re trying to drive through people and the organization is so large, that it&#8217;s perfectly fine to have a slightly immature initial phase where everyone has to try this and use it as much as possible. Yes, you&#8217;ll waste a bit of time, but what you&#8217;re really getting is people becoming comfortable with the tools, figuring out where they work and where they don&#8217;t, and where you need to invest. You&#8217;re putting pressure on the system. If you&#8217;re too careful, or chase the perfect measurements too early, you end up learning lessons later than you should have.</p><p>One of the ways we&#8217;ve historically built at Intercom is to do the harder stuff early. In a lot of project plans, you load the start with easy work to manufacture the illusion of momentum, and the gnarliest, hardest parts get pushed to the end, which is also when you find out whether the thing actually works. We flip that. We want you to hit the hardest part first. We want you to be hungry for the learning of &#8220;does this truly solve the problem&#8221; rather than burying it.</p><p>So with token-maxing or output measurement, you have to first open people up to change. That means, &#8220;just go use it for everything, doesn&#8217;t matter how good it is.&#8221; You need to encourage that initial use: race people against each other, run a crappy leaderboard, gamify it. But you don&#8217;t confuse that with outcomes, which is what you ultimately care about. 
You go through a juvenile phase deliberately, because gamifying the trial is the easiest way to get people to test it across everything and figure out where it works and where it doesn&#8217;t.</p><div><hr></div><h2><strong>Build vs. buy in an AI-native infrastructure stack</strong></h2><p><strong>You&#8217;ve got 13 plugins, 100+ skills, and a JAMF-managed marketplace. That&#8217;s a lot of internal infrastructure to maintain. I&#8217;m sure you&#8217;ve been able to rip out some SaaS apps as a result. How do you make the build-vs-maintain tradeoff as a company?</strong></p><p>Historically, we&#8217;ve had an engineering principle that we are technically conservative; we bias towards buying over building. We want people focused on shipping product rather than everything else, and we lean heavily on vendors rather than building expertise across the board.</p><p>That has changed somewhat. One of the biggest things to take off at Intercom, arguably bigger than Claude in software development, has been the adoption of Claude Code with a well-set-up suite of MCPs and access controls across the rest of the company. What people have ended up using it for is building their own sales dashboards, customer health dashboards, and highly granular customer-specific reporting, usually someone in sales or someone driving these things directly. What this has effectively replaced is business intelligence tooling and the likes of Tableau.</p><p>We didn&#8217;t deliberately set out to replace Tableau. We&#8217;ve just had a kind of bottoms-up product-market fit for this way of working, and we couldn&#8217;t go back to a hosted platform or a more traditional system. 
Once people have great, well-controlled data, good guardrails, and good places to host reports or small applications, the need for a centralized purchase of something like Tableau diminishes.</p><p>So I wouldn&#8217;t say we&#8217;re consciously trying to rip out big pieces of software, but the need for smaller purchases (role-specific workflows, reporting tools, that kind of thing) is shrinking. We&#8217;re probably purchasing far fewer of those, and we&#8217;re likely on a path to replacing things like Tableau over time, but very much bottoms-up. We&#8217;re not doing top-down &#8220;let&#8217;s vibe-code a Salesforce replacement today.&#8221;</p><div><hr></div><h2><strong>The harness around Claude Code, and the role of the human engineer</strong></h2><p><strong>The Claude Code hooks you&#8217;ve imposed to enforce adherence to your PR workflow and approved tools are great examples of how you&#8217;re building your own harness around Claude Code to get maximum value. What other best practices have you had to follow to make Claude Code more reliable over time, especially as we enter the age of background agents? And how do you see the role of human engineers evolving to best enable coding agents?</strong></p><p>We&#8217;ve had to strengthen the guardrails substantially and provide a lot of guidance, and we&#8217;ve socialized this problem heavily. We tell people: we are all part of the flywheel, and we expect everyone to be part of it. I&#8217;ve written an engineering AI principle that says all technical work is becoming agent-first. 
In practice, that means you should be using an agent for all technical work, and when it doesn&#8217;t get things right perfectly every single time, your job is to notice and do something about it.</p><p>What that has meant in practice: more linters everywhere, making patterns in our codebase more standardized, and writing down explicit guidance for Claude Code along the lines of &#8220;if you&#8217;re going to do X, use these paths.&#8221; We have the headwind of being a 15-year-old SaaS with all sorts of legacy in our codebase, patterns we&#8217;ve long since moved past but that still exist. We have to explicitly tell Claude about the modern way. What we do with humans is essentially the same: we onboard people by showing them how to get stuff done, what good changes look like, and which parts of the code to ignore as the old way. The pairing and unblocking process for Claude Code mirrors human onboarding.</p><p>Beyond socialization and guardrails, we&#8217;ve found that skills, currently the main unit of execution, need to be extremely small. Small, testable, provably good. Skills need to do the full job, because it&#8217;s very easy to build a skill that <em>looks</em> like it&#8217;s doing a good job but isn&#8217;t, and a bad skill is almost indistinguishable from a great one at first glance. The hard part is really pushing on what great looks like: fast, accurate, gets it right first time, tested against all the use cases and edge cases, something you can stand over.</p><p>We have a continual flywheel built into pretty much every skill. If something happens that suggests the skill should be updated, the instruction is to update the skill. Skills are self-improving as they&#8217;re used. 
It&#8217;s easy to go expansive and build opinionated, monolithic skills that try to do a hundred different things, but I think it&#8217;s similar to writing code: you want small, testable units that do one thing well and are understandable.</p><p>The quality bar has to be extremely high, because the worst case is you spend a lot of time building automation, everyone is very enthusiastic, but the skills aren&#8217;t reliable, and then people stop using them, ignore them, bypass them. People just want to get their job done. So we run a small number of extremely high-quality skills rather than a large volume trying to automate complex, sprawling work.</p><div><hr></div><h2><strong>Model lock-in and platform optionality</strong></h2><p><strong>In terms of building infrastructure around Claude Code, are you building in a way that allows you to port your harness onto Codex or other providers easily, or are you developing switching costs as you invest more into enablement of Claude models specifically?</strong></p><p>We knew this when we decided to go all-in on Claude Code. It was a slightly easier decision in December because Opus was ahead, but we knew the frontier wouldn&#8217;t always be in one place. Maybe Opus isn&#8217;t the best for Ruby, maybe not the best for security, maybe Codex overtakes it on cost. We accepted that.</p><p>But this is a platform play. We don&#8217;t choose multiple cloud providers and say &#8220;Google has the best blob storage, Oracle the best MySQL hosting.&#8221; You get the most benefit from a single platform and the compounding benefits of a system that works together. 
Vibes-wise, Opus may not be the best right now, and you could argue benchmarks back and forth, but I&#8217;m not too worried, because we get more benefit from being on one platform and being able to prove we can reproduce work at extremely high quality.</p><p>If Anthropic shut down tomorrow, I don&#8217;t think it would be a gigantic task to move to Codex, or even to roll our own to a degree. It&#8217;s likely a single Codex session away. We do already use Codex in our environment, not as a general-purpose agent but for things like code review and reviewing the output of another agent. Using two different models for meta-analysis is a perfectly good approach to gaining more confidence in code quality.</p><p>So we&#8217;re being pragmatic, but we put a lot of weight on the fact that it&#8217;s hard enough to get one system to work well. It&#8217;s not worth trying to get everything working at the level of quality and repeatability we want across multiple providers. That doesn&#8217;t mean we won&#8217;t change providers in the future. Capabilities are converging, and the providers are all moving towards the same shapes. I&#8217;m not too worried about lock-in, but I&#8217;d be worried if we were trying to be agnostic for the sake of it. It&#8217;s like multi-cloud: you need really strong reasons to go multi-cloud, because you miss out on the benefits of going deep on one.</p><div><hr></div><h2><strong>Org design in an AI-native company</strong></h2><p><strong>How has your org chart changed in the last 2 years, in light of how many AI-native companies are operating as very flat structures?</strong></p><p>In the last three years, yes, a lot. We went through a phase, like many other companies, of fewer middle managers, a higher ratio of engineers to managers, and updated expectations of managers to be more hands-on, more leading rather than just facilitating. 
That was happening in the post-ZIRP era at Intercom.</p><p>Then, through the building of Fin, Intercom went through dramatic change. We put 80% of R&amp;D on Fin, moved away from our traditional help desk (we&#8217;ve since invested back in it, which is great) but we made very dramatic decisions about shifting people off the help desk into effectively greenfield areas. We dropped our tried-and-tested product team format and moved to what we internally called experience teams, more like factory project teams, way more execution-oriented, none of the baggage. We staffed them almost exclusively with tenured people who knew how to get things done inside Intercom, leaving traditional onboarding to other teams. Those have been really successful, both for focus and because it&#8217;s the right model when you&#8217;re building with AI: you need to be able to remove baggage, get people out of the way, give them a mission, and let them go.</p><p>We&#8217;re not as advanced as Anthropic, where there&#8217;s more of a research-lab flavor. We&#8217;re still a product company that wants to ship features on the roadmap. Sometimes we do experimental work, sometimes we don&#8217;t know if a feature is buildable, but we haven&#8217;t fully melted into a managerless setup.</p><p>The role of planning has changed significantly, though. We&#8217;re not doing big roadmaps anywhere now; you can barely plan six weeks out. We used to set high-level roadmap goals every three or six months and then figure out what to build in six-week chunks. Now we set goals every six weeks and figure out what to build, and that can be replanned within days. So planning cycles have shifted dramatically. The teams still exist, but the format, expectations, and scope of what people work on have changed significantly. 
We just haven&#8217;t completely broken up the org.</p><div><hr></div><h2><strong>Measuring code quality</strong></h2><p><strong>In your recent write-up reflecting on how far you&#8217;ve come, you mentioned using static analysis and various heuristics to measure code quality. Can you elaborate on that?</strong></p><p>We&#8217;ve been working with a group at Stanford doing research on the impact of AI on engineering. We gave them access to our GitHub and they&#8217;ve been pulling data. It&#8217;s something we care about. We&#8217;ve always had a strong testing culture and well-understood patterns. We write code in well-understood monolith applications, so we know how to get most things done. I worry more about software architecture than the code itself.</p><p>We initially saw code quality start to decline, despite having more linters written this year than ever before and more tests written than ever before. The fundamentals had improved. We&#8217;d raised the bar on multiple quality metrics by putting AI on those problems and getting more repeatable software production by guiding Claude. But the Stanford research did indicate that quality was slipping in a few directions: greater complexity, some rework.</p><p>Then something turned a corner, and this was only in the last six weeks or so, when we happened to look back and check in. To a degree we&#8217;d already accepted that doubling engineering throughput might come with a small dip in quality; we wanted the acceleration. But today, code quality is <em>increasing</em> every single day. The overall quality of the codebase is going up. 
That&#8217;s a testament to the detailed work we&#8217;ve been doing on inspecting output quality, but also to things like our automatic approval process.</p><p>The automatic approval process basically says: if you produce a pull request that&#8217;s what you probably should have been creating anyway (small, using feature flags, with code observability, single-purpose, not in dangerous code paths) it gets auto-approved. People are bending their work towards the path of least friction, which is exactly where we want them to go. We use an LLM judge with all of our domain knowledge about what a great Intercom pull request looks like, and we&#8217;re now producing more of these than ever.</p><p>All those years of telling people what great looks like (the blog posts, the code quality books) that was great. But what has actually transformed code production is giving people the incentive to do the right thing through automatic approvals.</p><p>The unsolved problem I worry about is that we may be moving so fast that software architecture drifts into incoherency; you end up with something hard to move, manipulate, or reason about. I&#8217;m hedging on the assumption that we&#8217;ll soon be able to use agents to identify and solve those drift areas. But honestly, it&#8217;s nice to be worrying about higher-level issues rather than whether the code makes sense, or whether we&#8217;re producing something below what I&#8217;d expect from a senior engineer. That&#8217;s a nice problem to have.</p><div><hr></div><h2><strong>The fully loaded cost of a PR</strong></h2><p><strong>You recently made the point that enterprises thinking narrowly about token spend are underestimating the salary cost per PR. How would you advise other companies to think about the fully loaded cost of a PR?</strong></p><p>Think about fixed and variable costs, and the cost of the work being done. There&#8217;s a strong parallel to what we saw with Fin. 
When we launched Fin at $0.99 per resolution, we got pushback and claims from companies that this wasn&#8217;t much cheaper than what they could achieve with their own customer support team. But when we measured the all-in cost of a conversation in our own business, accounting for software licenses, management overhead, office space, all the real costs, it was over $60. And that wasn&#8217;t even maxed out.</p><p>It&#8217;s hard to think about this because people don&#8217;t often do these full-cost comparisons. But the economics work out so strongly in favor of whatever can produce the highest-quality output without human intervention. If you have a CS agent that resolves at 70% and you&#8217;re competing against one at 55%, given the all-in cost of human resolution, the 55% vendor probably can&#8217;t give it away for free.</p><p>The same is true in our engineering environment. Remove as much human work as possible so you can focus on higher-quality work. People need to compare the throughput, quality, ambition, and lack of rework against the new state with token spend factored in. Token spend matters; we do worry about it, mostly to make sure we&#8217;re not being sloppy and to find inefficiency. But it&#8217;s incredible: we have an awful and ever-growing bill, and yet the cost per change is dropping. As a business, that&#8217;s exactly what you want. A black box you put money into and get features out the other side. We&#8217;re getting more features out per dollar, with no signs of that changing. 
We&#8217;re going to keep that flywheel turning so we can put in even more money and get even greater multiples of features out the other side.</p><div><hr></div><p><em><a href="https://www.meetup.com/meetup-group-future-form/events/314377541/">Brian is hosting a meetup on Wednesday in London, join here!</a></em></p>]]></content:encoded></item><item><title><![CDATA[Evals with Alec Barber, OpenAI]]></title><description><![CDATA[Eval Harness vs Evals Platform]]></description><link>https://www.akashbajwa.co/p/evals-with-alec-barber-openai</link><guid isPermaLink="false">https://www.akashbajwa.co/p/evals-with-alec-barber-openai</guid><pubDate>Mon, 27 Apr 2026 06:02:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!PK0t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2db07a-0cb0-490b-8354-052e71f1e9da_4284x5712.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h4><em>Upcoming Events in London</em></h4><ul><li><p>May 1st: <a href="https://luma.com/4qkv4b55">basedcollective x unicorn mafia: demo night</a></p></li><li><p>May 6th: <a href="https://luma.com/xjyyovjc">AI-Native GTM with Clay</a></p></li></ul><div><hr></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.akashbajwa.co/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Software Synthesis! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><p>Last week, we hosted <a href="https://www.linkedin.com/in/alec-barber/">Alec Barber from OpenAI </a>for a discussion on Evals, building on the <a href="https://www.linkedin.com/posts/akashbajwa_yesterday-we-hosted-wulfie-bain-applied-activity-7364202406680666113-egZt/">first discussion we ran</a> with Wulfie Bain last August. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PK0t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2db07a-0cb0-490b-8354-052e71f1e9da_4284x5712.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PK0t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2db07a-0cb0-490b-8354-052e71f1e9da_4284x5712.jpeg 424w, https://substackcdn.com/image/fetch/$s_!PK0t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2db07a-0cb0-490b-8354-052e71f1e9da_4284x5712.jpeg 848w, https://substackcdn.com/image/fetch/$s_!PK0t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2db07a-0cb0-490b-8354-052e71f1e9da_4284x5712.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!PK0t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2db07a-0cb0-490b-8354-052e71f1e9da_4284x5712.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PK0t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2db07a-0cb0-490b-8354-052e71f1e9da_4284x5712.jpeg" width="1456" height="1941" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4f2db07a-0cb0-490b-8354-052e71f1e9da_4284x5712.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1941,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4672576,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/195325097?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2db07a-0cb0-490b-8354-052e71f1e9da_4284x5712.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PK0t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2db07a-0cb0-490b-8354-052e71f1e9da_4284x5712.jpeg 424w, https://substackcdn.com/image/fetch/$s_!PK0t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2db07a-0cb0-490b-8354-052e71f1e9da_4284x5712.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!PK0t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2db07a-0cb0-490b-8354-052e71f1e9da_4284x5712.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!PK0t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2db07a-0cb0-490b-8354-052e71f1e9da_4284x5712.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2><strong>The central thesis: build your own eval harness</strong></h2><p><strong>Codex and Claude Code are powerful enough 
that you should build your eval harness yourself, tightly coupled to your AI harness, rather than adopting a generic evals platform.</strong></p><p><strong>The primitives problem.</strong> Every evals platform Alec has worked on has struggled to define a base &#8220;test case&#8221; that generalises across single-turn, multi-agent, and decision-tree architectures. His old startup built around a single-turn schema; users immediately wanted multi-agent, and the abstraction didn&#8217;t extend.</p><p><strong>Where platforms do make sense.</strong> Observability tools (Langfuse etc.) are worth using. Don&#8217;t reinvent Grafana. Hyper-focused platforms (e.g. coding-agent-specific) can deliver real value because the primitives problem is bounded. And a platform that genuinely closes the loop end-to-end would be an exception.</p><p><strong>The architectural implication.</strong> Design your AI harness for testability from day one: decomposable, inspectable, unit-test ready. Too many teams build scrappily and realise later they want evals. Alec&#8217;s recommendation: feed good eval-writing skills to Codex or Claude Code and have the agent build a bespoke harness.</p><p><strong>Defining what a harness is.</strong> The harness is the layer around the model. The CLI for Codex or Claude Code is a concrete example: it is the harness sitting on top of the model. Another example in insurance: documents go in, the system makes 20 sophisticated calls, applies regex at various steps, mixes deterministic and AI components, and eventually produces a pricing judgment. That whole system is the harness. The inference step is just one implementation detail inside it.</p><div><hr></div><h2><strong>Evaluating the evals</strong></h2><p><strong>Entropy as a signal.</strong> Run a single test case 10 times against the same model and grader. All 10 passing or all 10 failing = low entropy, good signal. A 5/5 split = high entropy, meaning the grader or test case is ambiguous. 
This costs 10x per case, but the diagnostic value is significant.</p><p><strong>Log-prob confidence scoring.</strong> An attendee spoke about a vertical AI product that uses token-level probabilities from the OpenAI Responses API to compute a heuristic confidence score. Low-confidence outputs route to annotators who improve the dataset on the fly. The confidence-vs-performance curve isn&#8217;t perfect but is statistically validated.</p><div><hr></div><h2><strong>Dataset hygiene</strong></h2><p>A useful split:</p><ul><li><p><strong>Regression set.</strong> Stable, broad coverage. Every change is tested against this.</p></li><li><p><strong>Iteration set.</strong> Small, focused on a current failure mode.</p></li></ul><p>Once a failure mode is fixed, its cases migrate from the iteration set into the regression set. Critically, <strong>prune the regression set over time</strong>: when a new model generation drops, some cases become trivially easy and just waste compute.</p><h2><strong>Binary benchmarks and saturation</strong></h2><p><strong>Trajectory detail gets hidden in one number.</strong> SWE-bench is binary (solved or not), masking partial correctness. One approach is to use an LLM-as-judge to score each step of a trajectory, aggregate the scores, and correlate them with the final outcome.</p><p><strong>Saturation.</strong> There&#8217;s a pattern where you can spend months building an eval, only for it to be saturated in weeks. Causes include genuine capability gains, contamination, and reward hacking.</p><p>The consensus was that the next frontier is production and real-world use, though this blurs the line between scientific benchmark and lab demo. 
Alec floated an economics analogy: maybe evals need to become an interpretive discipline, educated judgments from people who&#8217;ve looked at a lot of data.</p><p>Benchmark scores on SWE-bench correlate heavily with scores in other domains, raising the question of why you&#8217;d run 20,000 benchmarks when three capture the signal.</p><h2><strong>The domain-expert bottleneck</strong></h2><p>Building good evals for a domain (legal, aerospace, insurance) requires someone who is simultaneously:</p><ul><li><p>A software engineer (to build the infrastructure)</p></li><li><p>A domain expert (to know what &#8220;good&#8221; means)</p></li><li><p>A product/design thinker (to surface the right UX for evaluation)</p></li></ul><p>These unicorns are rare.</p><p><strong>One participant</strong> working in vertical AI found that domain experts kept saying &#8220;this is wrong&#8221; without articulating why. He built a small vibe-coded UI that mirrored the platform lawyers already used, letting them highlight right and wrong passages in their native context. The UI extracts tacit judgment in a form the eval pipeline can use.</p><p><strong>Alec strongly endorsed this.</strong> The answer is bespoke domain UX, not generic eval platforms. If you&#8217;re building for legal, your eval UI should look like what lawyers already work in. This makes the problem a design problem as much as an engineering one.</p><h2><strong>Enterprise adoption and liability</strong></h2><p>Many enterprises still make eval decisions on vibes. Adoption was framed as friction-to-set-up vs. value-delivered. In regulated industries, value is high enough (the cost of a legal or insurance error is enormous) that adoption is stronger. Compliance departments effectively force evals as audit artefacts.</p><h2><strong>Practical recommendations for founders</strong></h2><ol><li><p><strong>Design your AI harness for testability from day one.</strong> Decomposable, unit-testable, each component inspectable. 
Don&#8217;t build a blob and retrofit evals.</p></li><li><p><strong>Build your eval harness yourself using Codex or Claude Code.</strong> The coupling to your AI harness is too tight to delegate to a generic platform.</p></li><li><p><strong>Use observability and dashboards off the shelf.</strong> Langfuse, Grafana: don&#8217;t reinvent these.</p></li><li><p><strong>Find good eval-writing skills online</strong> (Alec named Hamel Husain) and feed them to the agent as context.</p></li><li><p><strong>Invest in the domain-expert UX.</strong> Build bespoke interfaces that mirror how experts already work, so you can extract their tacit judgment without training them on eval frameworks.</p></li><li><p><strong>Maintain the regression/iteration split</strong> and prune stale tests over time.</p></li><li><p><strong>Use entropy diagnostics</strong> to identify test cases where your grader is ambiguous.</p></li></ol>]]></content:encoded></item><item><title><![CDATA[Agent Labs: Workload-Harness Fit]]></title><description><![CDATA[Agent engineering vs full-stack training]]></description><link>https://www.akashbajwa.co/p/agent-labs-workload-harness-fit</link><guid isPermaLink="false">https://www.akashbajwa.co/p/agent-labs-workload-harness-fit</guid><pubDate>Mon, 30 Mar 2026 06:01:28 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0e6c7ea9-9675-4c58-ac37-6689232e07b2_676x362.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h4><em>Upcoming Roundtables in London</em></h4><ul><li><p>April 1st: <a href="https://luma.com/9lsenhbw">Tool Use: A breakfast discussion</a></p></li><li><p>April 22nd: <a href="https://luma.com/buun0oz2">Evals with OpenAI </a></p></li><li><p>April 23rd: <a href="https://luma.com/pqb9ec1b">Mathematical Superintelligence with Harmonic</a></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.akashbajwa.co/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe 
now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.akashbajwa.co/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><p>A range of agent labs (<strong><a href="https://cursor.com/blog/composer-2">Cursor</a></strong>, <strong><a href="https://ideas.fin.ai/p/fin-apex-1-0-model-announcement">Intercom</a></strong>, <strong><a href="https://cognition.ai/blog/swe-1-6-preview">Cognition</a></strong> and <strong><a href="https://decagon.ai/blog/introducing-decagon-labs">Decagon</a></strong>) released frontier vertical models in recent months, manifesting one of the many plausible end states for app-layer companies: vertically integrate through model training to reduce dependence on the big labs and protect margins.</p><p>The direction of travel was clear once companies like Thinking Machines Lab started abstracting infra primitives, as we <a href="https://www.akashbajwa.co/p/ai-apps-agent-labs">covered last October</a>:</p><blockquote><p><em>Fine-tuning and RL are being productised and abstracted for agent labs to capitalise on their data and distribution advantages.</em></p><p><em>Just recently <strong>Thinking Machine Labs&#8217;</strong> <a href="https://thinkingmachines.ai/blog/announcing-tinker/">Tinker </a>was the latest abstraction for fine-tuning, whilst <strong>OpenPipe</strong> (acquired by CoreWeave) announced a <a href="https://openpipe.ai/blog/serverless-rl">serverless RL product.</a></em></p><p><em>These abstractions pave the way for agent labs to start as API consumers of model labs &#8594; capture traces/evals &#8594; train <strong>narrow models</strong> (embeddings, autocomplete, router, policy) before attempting larger training runs with open-source models.</em></p></blockquote><p>More infra providers have set out to catalyse this market - here&#8217;s how <strong><a href="https://www.primeintellect.ai/">Prime Intellect</a></strong><a 
href="https://www.primeintellect.ai/"> </a>described the vision for their <a href="https://www.primeintellect.ai/blog/lab">recently released model training offering</a>:</p><blockquote><p><em>If every company had the same access to frontier training infrastructure, the collective creativity of the market would unlock far more breakthroughs.</em></p><p><em>We&#8217;re already starting to see enterprises and application-layer AI startups realize this. Cursor is beginning to post-train their <a href="https://cursor.com/blog/composer">own models</a> optimized directly for Cursor itself as the RL environment, to gain more sovereignty over their product stack.</em></p><p><em>We want more application-layer companies entering the training game for every vertical of the economy.</em></p></blockquote><p>There are two camps at the moment when it comes to how agent labs will focus their technical resources.</p><p>One argues that the examples set by coding and customer support will be followed by market leaders in other verticals, with research effort focused on training and updating model weights.</p><p>The second camp rails against any training, instead focusing research talent on agent engineering (e.g. 
context management, tool use, long-horizon tasks and everything else that makes a model production-grade for a task) or building the best &#8216;harness&#8217;.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!C-ZA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59a88c0f-67b7-40e1-9c46-615cd8798ad0_534x143.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!C-ZA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59a88c0f-67b7-40e1-9c46-615cd8798ad0_534x143.png 424w, https://substackcdn.com/image/fetch/$s_!C-ZA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59a88c0f-67b7-40e1-9c46-615cd8798ad0_534x143.png 848w, https://substackcdn.com/image/fetch/$s_!C-ZA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59a88c0f-67b7-40e1-9c46-615cd8798ad0_534x143.png 1272w, https://substackcdn.com/image/fetch/$s_!C-ZA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59a88c0f-67b7-40e1-9c46-615cd8798ad0_534x143.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!C-ZA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59a88c0f-67b7-40e1-9c46-615cd8798ad0_534x143.png" width="534" height="143" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/59a88c0f-67b7-40e1-9c46-615cd8798ad0_534x143.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:143,&quot;width&quot;:534,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25519,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/192080793?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59a88c0f-67b7-40e1-9c46-615cd8798ad0_534x143.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!C-ZA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59a88c0f-67b7-40e1-9c46-615cd8798ad0_534x143.png 424w, https://substackcdn.com/image/fetch/$s_!C-ZA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59a88c0f-67b7-40e1-9c46-615cd8798ad0_534x143.png 848w, https://substackcdn.com/image/fetch/$s_!C-ZA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59a88c0f-67b7-40e1-9c46-615cd8798ad0_534x143.png 1272w, https://substackcdn.com/image/fetch/$s_!C-ZA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59a88c0f-67b7-40e1-9c46-615cd8798ad0_534x143.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>It depends on the workload.</strong></p><h3><strong>Workload-Harness Fit</strong></h3><p>Workloads or tasks vary by volume, value, verification properties and horizon, among other 
dimensions. </p><p><strong>Volume</strong> &#8212; How many times per day/week does this task execute? This determines both the economic incentive to reduce per-query cost and the rate at which your data flywheel generates training signal. Intercom's 2M weekly conversations sit at one extreme; a boutique consulting firm's quarterly strategic analyses sit at the other.</p><p><strong>Value per execution</strong> &#8212; What's the economic impact of each individual task completion? A customer service deflection might save $3-5. A correct medical diagnosis or a successful code deployment to production might be worth thousands. This determines how much you can invest in model quality per query and how much tolerance there is for error.</p><p><strong>Verification Properties</strong> - Domains differ fundamentally in how verifiable their reward signals are. <a href="https://x.com/SeanZCai/status/2034059543500742772">Sean Cai</a> outlined three verification properties: veracity (how confident are you the signal is correct?), proliferation (how widely is the signal tracked and available?), and asymmetry (how rare and expensive is the expertise needed to judge correctness?).</p><p><strong>Time horizon </strong>- How many sequential decisions, tool interactions, and context switches does the task require? A single-turn code completion is low horizon. A multi-file refactoring session with debugging is medium. A multi-week scientific experiment design is extremely high horizon. Longer horizons make both reward attribution and training rollout generation exponentially harder.</p><p>Just by looking at workloads through these dimensions, we can start to appreciate how agent labs will focus their research efforts.</p><ul><li><p><strong>Cursor and Cognition&#8217;s</strong> workloads are high volume, high value (especially as enterprise penetration increases), moderate on verification, and increasingly long time horizon with cloud/background agents. 
</p><ul><li><p>Given these qualities, pre- and post-training with sophisticated multi-dimensional reward engineering was justified. The moderate verification means you need to invest heavily in constructing composite reward signals (correctness + style + efficiency + behavioural penalties). The lengthening horizons require infrastructure for long rollouts and self-summarisation. LoRA and other parameter-efficient forms of fine-tuning are probably insufficient here &#8212; Cursor explicitly does full-parameter updates because the bar is just that high when you&#8217;re trying to raise the ceiling of software engineering, not the floor.</p></li></ul></li><li><p><strong>Intercom and Decagon&#8217;s</strong> workloads are high volume, low to moderate value, clean verification, short to medium horizon. Each execution is low value individually but the aggregate volume is massive, creating strong economic incentive to reduce per-query cost. Verification is clean &#8212; the ticket was resolved or it wasn't. Horizons are short &#8212; typically single-turn or a few exchanges. The data flywheel spins fast because you get millions of labelled outcomes per month.</p><ul><li><p>The volume justifies the training investment in a full post-training pipeline on a strong base open-source model through inference cost savings alone, let alone the performance improvements that Decagon and Intercom are showing. The clean verification signal makes RL tractable. The short horizons mean rollouts are cheap to generate. This is where the Intercom/Apex playbook works best, and where even LoRA-scale fine-tuning on platforms like Prime Intellect could deliver meaningful gains for smaller players. It&#8217;s worth noting Intercom was early to recruit a research team of scientists, ahead of companies like Decagon and Sierra.</p></li></ul></li><li><p><strong>Harvey and Legora&#8217;s</strong> workloads are moderate volume, high value, moderate on verification, medium horizon. 
Volume is meaningful but not massive (thousands rather than millions of executions per week). Value per execution is high. Verification is moderate &#8212; an expert can evaluate quality, but it's expensive and slow. Horizons involve multi-step reasoning but typically within a bounded scope.</p><ul><li><p>The jury is out on what approach is best. Harvey reportedly attempted training runs in the past, whilst Legora has stayed clear of any training and fully focused on building the best harness for Anthropic&#8217;s models.</p></li></ul></li></ul><p>The taxonomy of workloads governs which end markets justify training versus agent engineering. But knowing where you sit on the spectrum is only part of the evaluation; the other half is what it actually costs to execute. Cursor's Composer 2 technical report is the first detailed look at the infrastructure overhead required to operate at the rightmost end of this spectrum.</p><h3>The Training Imperative</h3><p><strong>The training pipeline</strong></p><p>Composer 2&#8217;s development follows a two-stage process. The first stage of continued pretraining took an existing open-weight model (<strong>Kimi K2.5</strong>, a 1-trillion-parameter model from <strong>Moonshot AI</strong>) and trained it further on a massive code-dominated dataset, progressively extending its ability to process longer sequences of code. This stage builds the model&#8217;s foundational understanding of programming languages, APIs, and software patterns. The second stage of reinforcement learning is where the model learns how to <em>work</em> as a developer. The model is dropped into realistic coding environments, given tasks derived from actual developer workflows (feature iteration, debugging, refactoring, code review, documentation), and scored on the quality of its end-to-end solutions. 
Cursor demonstrated empirically that the first stage reliably predicts the success of the second: lower pretraining loss consistently translated to higher RL reward, justifying the investment in pretraining.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EpIP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d8551f-21b2-483e-92f6-29d47e57e1e9_302x237.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EpIP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d8551f-21b2-483e-92f6-29d47e57e1e9_302x237.png 424w, https://substackcdn.com/image/fetch/$s_!EpIP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d8551f-21b2-483e-92f6-29d47e57e1e9_302x237.png 848w, https://substackcdn.com/image/fetch/$s_!EpIP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d8551f-21b2-483e-92f6-29d47e57e1e9_302x237.png 1272w, https://substackcdn.com/image/fetch/$s_!EpIP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d8551f-21b2-483e-92f6-29d47e57e1e9_302x237.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EpIP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d8551f-21b2-483e-92f6-29d47e57e1e9_302x237.png" width="302" height="237" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/27d8551f-21b2-483e-92f6-29d47e57e1e9_302x237.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:237,&quot;width&quot;:302,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:18128,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/192080793?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d8551f-21b2-483e-92f6-29d47e57e1e9_302x237.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EpIP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d8551f-21b2-483e-92f6-29d47e57e1e9_302x237.png 424w, https://substackcdn.com/image/fetch/$s_!EpIP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d8551f-21b2-483e-92f6-29d47e57e1e9_302x237.png 848w, https://substackcdn.com/image/fetch/$s_!EpIP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d8551f-21b2-483e-92f6-29d47e57e1e9_302x237.png 1272w, https://substackcdn.com/image/fetch/$s_!EpIP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27d8551f-21b2-483e-92f6-29d47e57e1e9_302x237.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>The infrastructure overhead</strong></p><p>Composer 2&#8217;s training run spanned three GPU regions and four CPU regions globally. 
The RL phase alone required hundreds of thousands of isolated virtual machines (Firecracker VMs on Cursor&#8217;s internal compute platform, Anyrun) to simulate realistic coding environments, each capable of running a full development setup including a browser. These environments needed to be spun up at a rate exceeding 500 pods per second to keep pace with the bursty nature of RL training workloads. Cursor partnered with Fireworks AI for inference during training, running distributed clusters across the US and Europe. The team also wrote custom GPU kernels targeting NVIDIA&#8217;s latest Blackwell hardware, including a modified low-precision number format that they found necessary to prevent training instability. The contributor list runs to approximately 50 people! This resembles a global ML infrastructure operation.</p><p><strong>Reward engineering and behavioural shaping</strong></p><p>Several of Cursor&#8217;s most consequential design choices concern how the model is rewarded during RL training, and these reveal the depth of product-informed thinking embedded in the pipeline. Beyond the primary correctness signal, Cursor applies a nonlinear length penalty that penalises the model more steeply for unnecessary effort on simple tasks while tolerating extended reasoning on complex ones, incentivising the model to be fast when it can and thorough when it must. They layer in auxiliary rewards for coding style, communication quality, and product-specific behaviours, including explicit penalties for patterns like creating TODO items and leaving them unfinished. The team actively monitors for emergent problematic behaviours during training and introduces corrective rewards reactively &#8212; for instance, when the model began leaving chains-of-thought in code comments or collapsed to using only the terminal tool. 
The self-summarisation technique, inherited from Composer 1.5, allows the model to chain multiple generations with intermediate summaries within a single training rollout, with the final reward flowing back to upweight good summaries and downweight lossy ones.</p><p><strong>Benchmarking philosophy: why CursorBench matters more than public scores</strong></p><p>Existing public benchmarks are increasingly unreliable indicators of real-world coding agent quality, with four structural problems: domain mismatch (SWE-bench focuses narrowly on bug-fixing), prompt over-specification (public benchmarks assume unnaturally explicit instructions), data contamination (OpenAI suspended SWE-bench Verified reporting after finding models could generate solutions from memory), and narrow evaluation scope (only functional correctness, ignoring code quality, latency, cost, and interactive behaviour). Cursor&#8217;s response is CursorBench, an internal evaluation suite built from real coding sessions of their engineering team. The structural contrast is stark: CursorBench tasks require a median of 181 lines of code changes versus 7&#8211;10 for SWE-bench, while task prompts are significantly shorter and more ambiguous &#8212; mirroring the underspecified nature of real developer requests. The benchmark is continuously refreshed as developer workflows evolve, with each iteration increasing in complexity.</p><p>Despite the enormous effort, training Composer 2 was imperative for Cursor. 
</p><p>Developers will pick the best harness for their workloads and Composer 2, a SOTA model at a fraction of the cost, has a chance at winning back both old and new workloads.</p><h3>Short AGI?</h3><p>There&#8217;s a real risk of obsolescence from large models, but that logic is not too different to saying two companies (or one, if you ask most people) will capture all of the value created by AI in the coming decades.</p><p>John Collison <a href="https://cheekypint.substack.com/p/bret-taylor-of-sierra-on-ai-agents?triedRedirect=true">asked</a> Bret Taylor half-jokingly whether Sierra is itself premised on a world without AGI:</p><blockquote><p><em><strong>John:</strong> Now we get to it, which is you described building stuff that you know you&#8217;re going to throw away because the model capabilities will get there and you&#8217;re occasionally... They are developing capabilities that you developed yourself. Isn&#8217;t Sierra itself &#8220;short AGI?&#8221; Sorry, I said I couldn&#8217;t resist.</em></p><p><em><strong>Bret:</strong> The short answer is I don&#8217;t know. I mean, the fog of war in the software industry is pretty thick right now. I really believe in the applied AI market, though. I think most companies don&#8217;t want to buy models or buy software. They want to buy solutions to their problem.</em></p></blockquote><blockquote><p><em>I think there&#8217;s so much nuance in how these companies align themselves with different departments at these companies, solve their very unique problems in very specific ways. That is a mix of product, not technology, but product, go-to-market. It&#8217;s an ecosystem around it. I think a lot of that still exists because I&#8217;m not sure coding the software was necessarily the hard part.</em></p></blockquote><blockquote><p><em>I think the reason for it is most business users want actual solutions to their problems, and they want a company that serves their unique problems in a very specific and bespoke way. 
I actually am extremely bullish on applied AI. I actually think we could accelerate. I&#8217;ll make one statement, which is I think if we paused model development, we&#8217;d still have trillions of dollars of economic value.</em></p></blockquote><blockquote><p><em>I think not only am I somewhat skeptical that there will only be two companies in the world, I actually think one of the main things impeding adoption of AI is the lack of existence of all those other companies.</em></p></blockquote><p>The models are getting better faster than any of us could have ever imagined, with the <a href="https://x.com/AndrewCurran_/status/2037967531630367218">leaked details of Anthropic&#8217;s Mythos</a> suggesting we&#8217;re about to experience another step-change improvement in capabilities.</p><p>But focus matters, as OpenAI has now admitted.</p><p>Startups <strong>will</strong> create enduring value, and technical differentiation in some shape or form <strong>will</strong> play a role.</p>]]></content:encoded></item><item><title><![CDATA[The Future Of Software Engineering with Anthropic]]></title><description><![CDATA[With Ash from Anthropic and Sivesh from Balderton Capital]]></description><link>https://www.akashbajwa.co/p/the-future-of-software-engineering</link><guid isPermaLink="false">https://www.akashbajwa.co/p/the-future-of-software-engineering</guid><pubDate>Mon, 16 Mar 2026 07:01:14 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!zY25!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ff4479f-fb33-4993-91af-59edddaacf80_1536x2048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Software Synthesis analyses the evolution of <strong>software companies in the age of AI</strong> - from how they're built and scaled, to how they go to market and create enduring value. 
You can reach me on<a href="https://www.linkedin.com/in/akashbajwa/"> LinkedIn </a>and <a href="https://x.com/AkashBajwa96">X</a>.</em></p><div><hr></div><p><a href="https://www.linkedin.com/in/sivesh/?originalSubdomain=uk">Sivesh</a> and I recently hosted a roundtable on the future of software engineering with Anthropic&#8217;s <a href="https://www.linkedin.com/in/ash-prabaker/?originalSubdomain=uk">Ash Prabaker</a> and we were joined by engineering leaders from Stripe, NVIDIA, Microsoft, Google DeepMind, xAI, Apple, Scale AI, as well as the legend Peter Steinberger of OpenClaw/OpenAI.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zY25!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ff4479f-fb33-4993-91af-59edddaacf80_1536x2048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!zY25!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ff4479f-fb33-4993-91af-59edddaacf80_1536x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!zY25!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ff4479f-fb33-4993-91af-59edddaacf80_1536x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!zY25!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ff4479f-fb33-4993-91af-59edddaacf80_1536x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!zY25!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ff4479f-fb33-4993-91af-59edddaacf80_1536x2048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zY25!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ff4479f-fb33-4993-91af-59edddaacf80_1536x2048.jpeg" width="1456" height="1941" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6ff4479f-fb33-4993-91af-59edddaacf80_1536x2048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1941,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:571387,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/190850800?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ff4479f-fb33-4993-91af-59edddaacf80_1536x2048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zY25!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ff4479f-fb33-4993-91af-59edddaacf80_1536x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!zY25!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ff4479f-fb33-4993-91af-59edddaacf80_1536x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!zY25!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ff4479f-fb33-4993-91af-59edddaacf80_1536x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!zY25!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ff4479f-fb33-4993-91af-59edddaacf80_1536x2048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h3>Origins of Claude Code</h3><p>The session opened with a retelling of the Claude Code origin story, much of which has been covered in public interviews. It began as a simple terminal UI in late 2024, was rough at first, and was built against a guiding principle of designing for where models would be in six to twelve months rather than where they were that day. Adoption was organic &#8212; an IC-driven project that scaled through demonstrated value rather than mandate.</p><h3>The Recursive Improvement Thesis</h3><p>A major thread throughout the discussion was &#8220;closed-loop&#8221; development. One participant described a setup at their company where bug reports are automatically triaged by an agent, bucketed by severity, checked against an eval set, and then a fix PR is opened &#8212; much of it running with minimal human touch. The room broadly agreed that this kind of loop is where compounding gains actually come from: better coding tools improve the models, better models improve the coding tools. Several people noted their companies are prioritizing coding specifically because of this dynamic.</p><h3>How Workflows Are Changing</h3><p>Participants compared notes on what&#8217;s shifting in their engineering practice:</p><ol><li><p>Test-first has become the default. Multiple people said they now define test cases first and let the agent build against them &#8212; described as the only sane way to handle the volume of PRs being generated.</p></li><li><p>Two tiers of evals. 
One participant outlined their team&#8217;s approach: regression evals that must stay at 100% and run on every PR, plus frontier evals for new capabilities. Others in the room recognized the pattern.</p></li><li><p>Don&#8217;t mandate adoption. There was strong consensus here. One attendee described using competitions, hackathons, and casual incentives instead of top-down requirements &#8212; arguing that forced usage breeds resentment, whereas letting people see early adopters&#8217; results drives proliferation naturally.</p></li><li><p>Code review is in flux. One participant admitted that human reviewers at their company often just click approve within minutes because the AI review layer has gotten good enough. When pushed on where this ends up, they acknowledged the mandatory-human-review model will eventually become inefficient &#8212; and suggested they may already be past that point for some repos. This landed with a mix of recognition and discomfort around the table.</p></li><li><p>Comments are back. A cultural reversal several people found amusing: engineers initially hated the verbose comments agents generated, but the consensus is now swinging toward leaving them in, because the next agent session finds them useful. One person put it as &#8220;we&#8217;re writing code for AI readability as much as human readability now.&#8221;</p></li><li><p>Life in the terminal. One participant described their personal workflow as: plan, verify the plan, implement via the agent, move on &#8212; without reading generated code line by line. This prompted some debate about when that&#8217;s safe and when it isn&#8217;t.</p></li></ol><h3>What Still Gets Scrutiny</h3><p>Not all code is treated the same. Participants generally agreed that anything involving destructive actions (data loss, permission escalation) or core infrastructure deserves higher human review, while internal prototypes don&#8217;t need the same bar as public-facing code. 
Where exactly to draw the line varied by company.</p><h3>The Bottleneck: Long-Horizon Tasks</h3><p>The room converged on long-horizon tasks as the real frontier problem. One participant noted that product engineering has started to go exponential for them, but closing the loop on more complex research workflows isn&#8217;t there yet. The open questions everyone shared: what do you actually assign an agent for a four- or five-hour run? How do you observe it? How do you keep a human in the loop without babysitting? Nobody had a clean answer.</p><h3>Infrastructure and Sandboxing</h3><p>Discussion of how the industry has swung on sandboxing &#8212; first toward it for safety, then away from it for convenience, now back toward it with more nuance (remote coding agents, sandbox-per-session). The practical pain points people raised were compute for long-running sessions, permissioning, and enterprise deployment.</p><h3>Observability and On-Call</h3><p>One participant described early-stage internal prototypes where agents with access to logs, source control, and chat systems handle incident triage and debugging &#8212; reducing the on-call burden even though the systems aren&#8217;t production-grade yet. A side effect several people found interesting: engineers without infra backgrounds can now contribute to infra work because agents fill the knowledge gaps.</p><h3>Context Management</h3><p>Someone asked how you manage context at scale when thousands of people are changing things every minute. The honest answer from the room was that nobody has this figured out. One participant admitted their approach is basically unstructured &#8212; ad hoc chat threads that agents get MCP access to read, plus a strong writing culture but no formal documentation process.</p><p>A study was mentioned suggesting that pre-loaded markdown context files can sometimes underperform versus letting agents traverse the codebase from first principles. 
The counter offered was that this probably reflects stale or agent-generated context. The takeaway people seemed to agree on: human-authored context files help, agent-authored or stale ones can actively hurt. Humans have to supply the insight.</p><h3>Hiring</h3><p>When hiring came up, the most striking claim was that the trait one participant now screens hardest for isn&#8217;t raw engineering skill &#8212; it&#8217;s willingness to experiment constantly at the bleeding edge. Their best performers are the ones who understand model limits deeply enough to know when to trust the output and when to intervene. Another attendee noted their core infrastructure teams have stayed lean because AI-assisted cross-pollination lets product engineers contribute outside their usual domain.</p><h3>SaaS Under Pressure</h3><p>This got lively. </p><p>Participants traded stories about which tool categories they&#8217;ve replaced internally:</p><ul><li><p>Incident management &#8212; one person said their team ripped out their vendor because it was too complicated for how people actually worked.</p></li><li><p>Auth layers &#8212; one participant claimed to have migrated auth systems several times in six months, each migration taking hours not weeks.</p></li><li><p>Project tracking &#8212; someone is building custom UIs on top of their coding agent for managing engineering work, and floated that this whole category might be next.</p></li><li><p>Internal micro-tools &#8212; link shorteners and similar utilities were the easy wins several people mentioned.</p></li></ul><p>The pattern everyone noticed: it&#8217;s all developer tooling so far, because that&#8217;s where engineers have agency and speed. Business-facing software (CRMs, etc.) is stickier. 
One view was that incumbent business tools survive not because they&#8217;re good but because nobody has shipped a compelling AI-native replacement &#8212; just incremental add-ins.</p><p>A counterpoint from the room: the opportunity-cost argument (&#8220;we should focus on what we&#8217;re best at&#8221;) may always hold, which means labs might never prioritise building SaaS replacements over improving models.</p><h3>The &#8220;Everything Is an Option&#8221; Problem</h3><p>A startup founder in the room raised the flip side: because AI makes everything feasible, prioritisation is harder, not easier. Six months ago, rebuilding a tool internally was obviously not worth it. Now it takes a night. Teams get overloaded by the sheer volume of things they could do. Nobody had a great answer beyond defining clear swim lanes and giving individuals ownership of mini-companies inside the org.</p><h3>Code Quality</h3><p>When someone asked about code quality standards, the response was that the definition is shifting. &#8220;Good code&#8221; used to mean human-centric things &#8212; simple, easy to maintain, easy to contribute to. Now it has to account for AI readability too. The practical view from the room: strong regression evals and test-first discipline matter more than clean-code aesthetics.</p><h3>Design Taste and Slop</h3><p>&#8220;The purple gradient vibe&#8221; got a laugh &#8212; everyone recognized the AI-generated-UI aesthetic. The catch-22 someone identified: if you update a model&#8217;s taste profile, everyone uses it, and the new aesthetic just becomes the next generation of slop. Someone also noted that some models actively steer users toward particular frameworks, which functions as a form of lock-in.</p><h3>Convergence Risk</h3><p>One attendee raised a concern that everyone coding with the same models making the same suggestions will collapse the industry onto the same tools and patterns. 
The pushback was that this was a bigger risk with earlier model generations, which were much stronger at popular web stacks than at legacy or niche languages &#8212; and that gap is closing. Code modernisation of legacy systems came up as an area improving fast.</p><h3>Background Agents</h3><p>General agreement that the direction of travel is toward asynchronous background agents &#8212; remote sandboxes, monitorable from a phone, persisting across hours or days. One person noted that multi-hour autonomous runs are only recently becoming routine for them, having been experimental until not long ago.</p><h3>Model vs. Harness</h3><p>Asked how much recent improvement is model weights versus harness, one participant&#8217;s view was that both matter but on different cadences &#8212; big leaps come from model steps, and the harness philosophy should be &#8220;get out of the way of the model.&#8221; They described a stripped-down prototype &#8212; basically a current-gen model with a system prompt and bash access &#8212; that performs surprisingly well, which wouldn&#8217;t have worked a few generations ago.</p><h3>Regulated Industries</h3><p>Someone from a fintech background asked about regulated deployment. The room&#8217;s read: the most successful AI startups in regulated industries (legal tech was the example) are still fundamentally human-in-the-loop chat-with-document products. Nobody has made the jump to autonomous agents in regulated workflows. The bar is asymmetric &#8212; analogous to self-driving cars, where AI has to be dramatically better than a human to be accepted. Better explainability and structured audit trails were floated as the unlock.</p><h3>Orchestrating Multiple Agents</h3><p>Refreshingly low-tech answer: git worktrees and ten terminal tabs. 
More sophisticated orchestration is being built by third parties, but nobody in the room claimed to have it solved.</p><h3>The Digital Transformation Irony</h3><p>Someone observed that getting engineers to adopt AI tools &#8212; finding champions, overcoming resistance, managing change &#8212; is exactly the digital transformation problem other industries have faced for years. The irony of applying it to the engineers who built those transformation tools was not lost on the room.</p><h3>The Arc of Programming Languages</h3><p>Closing question: will agents start writing closer to the metal, bypassing the abstraction layers that exist for human convenience? The view was yes, eventually, but only when the model decides it serves performance &#8212; not because lower-level code is easier for models. Current models still benefit from well-structured, well-commented, human-readable code. Someone noted a trend toward Rust in startups, driven partly by AI flattening the learning curve.</p><h3>Key Takeaways</h3><ul><li><p>The recursive loop is real. Better coding tools produce better models, which produce better coding tools. Multiple participants said this is why their companies are prioritising coding.</p></li><li><p>The bottleneck has moved from writing code to managing long-horizon tasks and deploying agents in regulated settings.</p></li><li><p>Developer tooling is being displaced first. Business-facing software with network effects is holding.</p></li><li><p>The human role is shifting from writing and reviewing to planning, evaluating, and steering &#8212; and the best performers are the ones who stay at the bleeding edge.</p></li><li><p>Enterprise adoption is gated by permissioning, sandboxing, and regulatory caution more than by model capability.</p></li><li><p>Slop and convergence are real concerns when millions of people use the same models to make the same choices.</p></li><li><p>Context remains unsolved. 
Human-authored context helps; stale or agent-generated context can hurt.</p></li></ul><div><hr></div><p><em>Have any feedback? Reach out on <a href="https://www.linkedin.com/in/akashbajwa/">LinkedIn</a> or <a href="https://x.com/AkashBajwa96">X</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[Application Software: Earnings Recap]]></title><description><![CDATA[Survival In The AI Age]]></description><link>https://www.akashbajwa.co/p/application-software-earnings-recap</link><guid isPermaLink="false">https://www.akashbajwa.co/p/application-software-earnings-recap</guid><pubDate>Mon, 09 Mar 2026 07:02:26 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/b1081fe7-4ad6-45e3-851b-5598b3e44941_508x464.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Software Synthesis analyses the evolution of <strong>software companies in the age of AI</strong> - from how they're built and scaled, to how they go to market and create enduring value. 
You can reach me on<a href="https://www.linkedin.com/in/akashbajwa/"> LinkedIn </a>and <a href="https://x.com/AkashBajwa96">X</a>.</em></p><div><hr></div><p>Every public software CEO fielded the same set of questions about their right to exist in the age of agents. The discourse on the end of software rages on as startups substitute app software with custom tooling developed internally and labs productise aggressively in all directions. </p><p>This AI-pilled worldview would see most established enterprise software companies get wiped out and replaced by a combination of AI vendors and in-house builds.</p><p>I&#8217;m of course talking my book when I argue for the rotation of market share from incumbents to disruptors, but there&#8217;s more nuance to this than my X feed cares for.</p><p>Let&#8217;s look through recent earnings commentary from application software companies to unpack how they&#8217;re faring. I chose MNDY, HUBS, INTU, KVYO, BOX, and SAP as a broad basket covering SMB to enterprise apps. 
</p><h3><strong>Application Software</strong></h3><p><strong>MNDY</strong></p><p>FY2027 forecasts were scrapped as a result of higher uncertainty around MNDY&#8217;s historical strength in the self-serve SMB segment:</p><blockquote><p><em>Given the evolving nature of the AI landscape and the choppiness in the no-touch demand environment, we believe it is responsible to keep our near-term communication focused on what we can execute and deliver with high confidence.</em></p></blockquote><p>MNDY can&#8217;t reliably model its own self-serve funnel, which historically represented the majority of new customer acquisition. </p><p>The ZIRP-fuelled PLG days ended a while ago, but it&#8217;s still striking that MNDY deemed the no/low-touch channel structurally choppy. </p><p>Disaggregating the drivers of this choppiness is tricky, but commoditisation of the SMB offering by the AI labs is one plausible driver.</p><p>On the other end, $50K+ customers now represent 41% of total ARR and have 91% gross retention, whilst $100K ARR saw record net adds, and the $500K+ cohort grew 74% - the move upmarket seems to be working. It just might not be working <strong>enough</strong> to offset deceleration downmarket.</p><p>Margins came down a bit, reflecting investment in the upmarket GTM and AI, but pricing is also adapting:</p><blockquote><p><em>So our AI capabilities are foundational in our platform. They&#8217;re embedded in our workflows. And based on the feedback <strong>we&#8217;ve gotten from customers, they really enjoy the predictability of PPU pricing</strong>, and they like to consume those capabilities that way. 
With that said, some of the more compute-intensive workloads that drive outputs with these workloads, <strong>we are charging and monetizing that through credits.</strong> and our customers like the <strong>mix of both of those.</strong> </em></p></blockquote><p>The AI SKUs are: AI agents for workflow automation, Monday Vibe for application development, and Sidekick for information retrieval. Traction is still early.</p><p><strong>HUBS</strong></p><p>On whether HUBS&#8217; data will be consumed by third-party agents and undermine its role as the system of record, CEO Yamini Rangan&#8217;s answer focused on business logic and domain context:</p><blockquote><p><em>Our strategy, as I just articulated, is to be that intelligent system of customer content and we have the data, but more importantly, we have the business context, the industry context and the domain context to deliver it. And that&#8217;s why customers come to us. They rely on us for that context. They want to use more of our APIs and partners want to customize and build on top of us. And as AI adoption accelerates, the value of our agentic platform increases.</em></p></blockquote><p>Followed by (in my view, the weaker argument) determinism:</p><blockquote><p><em>SaaS platforms are more than data. It is the logic, right? You can certainly get a nondeterministic output for a sales e-mail, but try taking a nondeterministic output for your sales forecast. That is not possible. It&#8217;s workflows like forecasting, routing, approvals, permissions, that is logic. It&#8217;s not data to be sucked away. And ownership accountability and governance, all of those lives inside applications.</em></p></blockquote><p>Like MNDY, HUBS prices on seats and credits: the customer support agent drives ~60% of credits consumed, while the prospecting agent, data agent, and intent monitoring each contribute 10-15%. 
Intriguingly, the support agent&#8217;s stated resolution rate in the mid-60s is in the ballpark of private darlings like Sierra and Decagon.</p><p>Credits are <strong>not yet a material contributor to reported revenue,</strong> but the trajectory is clear: this is an emerging FY2026 tailwind that could become a FY2027 growth driver.</p><p>Management reiterated the core ICP of companies with 2 to 2,000 employees, with strong traction at the upper end and in multi-product penetration: deals over $5K MRR grew 33%, deals over $10K MRR grew 41%, and customers with 500+ seats grew fivefold. Multi-hub adoption among new Pro Plus customers hit 62%, with 40% of the installed Pro Plus base (by ARR) now owning 4+ hubs, up 6 points year-over-year. </p><p>As a sign of yet another AI-native buying third-party SaaS instead of building in-house, <strong>Lovable is a customer of HUBS -</strong> here&#8217;s Dharmesh Shah&#8217;s take:</p><blockquote><p><em>So just because it&#8217;s possible. So we have, as Yamini mentioned, a large engineering team knows what they&#8217;re doing, spending 97% of their kind of calories using agent coding tools, they&#8217;re not doing it to replace internal platforms. So we think the best companies, both AI and non-AI will not be using Vibe coding to replace core systems. They&#8217;ll be doing it to add value to their customers. That&#8217;s what Lovable doing. 
I think that&#8217;s what some of the best companies in the world will continue to do.</em></p></blockquote><p>HUBS is seeing the same declines in organic search as a channel as other software companies and has leaned into AEO.</p><p><strong>INTU</strong></p><p>Having announced apps on ChatGPT&#8217;s app store and a partnership with Anthropic to access Intuit products inside Claude, CEO Sasan Goodarzi addressed the elephant in the room:</p><blockquote><p><em>And it&#8217;s also -- the thing I would point out is it&#8217;s why companies like OpenAI, companies like Anthropic look to the partnership with us because at the end of the day, they see and understand that this is a business that comes with a lot of liability and LLMs can&#8217;t just create the platform that we&#8217;ve created overnight.</em></p></blockquote><p>Time will tell if relinquishing the interface to the end customer to the labs is a wise decision. The question of the value of incumbent apps where the product is primarily consumed by agents will only get louder in the future.</p><p>Taking liability is a recurring argument being touted in defence of vertical AI lately, a point INTU reiterated.</p><blockquote><p><em>In our category, accuracy, compliance, security, reliability of financial decisions, and the liability that comes with it are critical to our customers. It&#8217;s our advantage and it&#8217;s why we win.</em></p></blockquote><p>Management presented a narrative of AI enhancing Human Intelligence, i.e. AI-enabled services.</p><blockquote><p><em>Our success rests on our powerful combination of proprietary data, domain-specific AI platform capabilities and AI-powered human intelligence, which we&#8217;ll refer to as HI.</em></p><p><em>Our system of intelligence combines AI and HI to deliver done-for-you experiences with accuracy, compliance, security, reliability and data privacy that create a durable competitive advantage. 
This foundation delivers what matters most to customers when it comes to financial insights, money management, taxes, bookkeeping and accounting, leading to complete confidence in their high stakes financial decisions.</em></p></blockquote><p>AI-enabled services firms have raised boatloads of funding across law, insurance, accounting, banking and healthcare. The AI+HI traction INTU is seeing is a strong data point for the services-as-software thesis.</p><p>INTU&#8217;s agents are being rolled out at scale, with 3 million customers using them and repeat engagement of more than 85%.</p><p>INTU sees three monetisation levers: agents drive higher willingness to pay for subscriptions, cross-sell, and higher services adoption (HI). There is no credit-based consumption, as INTU is less likely to see compute-intensive workloads than MNDY and HUBS.</p><p>Mailchimp continues to struggle, speaking to broader SME weakness that the MNDY no-touch performance already hinted at. INTU&#8217;s push into mid-market ERP is going much better, with new IES contracts growing nearly 50% quarter-over-quarter. </p><p><strong>KVYO</strong></p><p>KVYO&#8217;s business model is more naturally aligned with consumption.</p><p>Revenue scales with active consumer profiles and message/interaction volume, not with the number of humans operating the platform. 
When AI agents generate more campaigns, send more messages, handle more customer service conversations, and drive more transactions, Klaviyo&#8217;s revenue grows automatically because interaction volume increases.</p><p>When Marketing Agent enables a small team to run more campaigns without adding headcount, Klaviyo captures the increased message volume regardless of how many humans were involved in creation.</p><p>KVYO&#8217;s AI uptake might be one of the most underrated stories in public markets right now: for Marketing Agent adopters, more than half of campaigns are now AI-generated, performing as well as or better than manually created campaigns while taking significantly less time. Customer Agent resolution rates increased 20 points since launch, with monthly resolution volume up 50%+ since Black Friday/Cyber Monday. 85%+ all-time repeat engagement across 3 million+ customers using agents. </p><p>In keeping with this growth, the company is reframing its category - marketing automation is out, &#8216;autonomous B2C CRM&#8217; is in.</p><blockquote><p><em>Our technology marries the customer database we founded Klaviyo on and our robust marketing messaging infrastructure with agents for marketing and customer service that will autonomously create, deliver and optimize customer experiences on behalf of a business. And this agent layer that is designing and delivering experiences to billions of consumers is trained from our deep expertise and the trillions of experiences we&#8217;ve delivered for businesses already.</em></p></blockquote><p>Enterprise traction accelerated meaningfully in FY2025. Customers with $50K+ ARR grew 37% to 3,912, with 349 net new additions in Q4 alone &#8212; beating the previous record by 25%. Customers generating $1M+ ARR doubled year-over-year. It&#8217;s no surprise KVYO hired ex-WDAY co-CEO Chano Fernandez as their new co-CEO. 
Another sign of the move upmarket is a partnership with Accenture, which is building practices around &#8220;marketing reinvention and service reinvention&#8221; powered by Klaviyo, targeting the largest consumer brands globally. </p><p>FY2026 revenue guidance of $1.501-$1.509B (21.5-22.5% growth) represents a meaningful deceleration from 32% in FY2025, but this assumes minimal contribution from AI. </p><p>KVYO&#8217;s answers on long-term durability were grounded in technical differentiation rather than business context:</p><blockquote><p><em>Our ability to be the agent and the platform of choice finds its roots in how our platform was built. This is our durable advantage. At the core is the database and data infrastructure, specifically built to handle the scale of consumer data and indexing and enriching through machine learning and serving hundreds of thousands of requests <strong>per second with millisecond latencies.</strong></em></p><p><em>It allows Klaviyo to ingest, aggregate and govern first-party data in real time, so every consumer behavior, transaction, preference and consent is available to our users and now to our agents to deliver the best possible consumer experience. That foundation is coupled with our marketing platform, which not only provides high throughput, scalable systems to render messages, deliver them with excellent deliverability and enforce compliance, but also integrates with our customer data infrastructure to make last-mile personalization decisions on content, incentives, timing and channels at the moment we deliver a message or experience to an end consumer. <strong>Speed and scale matter.</strong> </em></p></blockquote><p>Like INTU, KVYO supports Claude via MCP and has an app for ChatGPT, but remains confident that its infrastructure ensures durability. 
</p><p>Lastly, the company is one of the best examples of operating leverage and margin expansion through AI, with operating margins expanding 900bps YoY in Q4.</p><p><strong>BOX</strong></p><p>Aaron Levie is one of the most AI-pilled enterprise software CEOs and he deftly managed to tie file-based systems as the winning agent tool-calling paradigm to Box&#8217;s strength in enterprise content management:</p><blockquote><p><em>Files are quite simply the native unit of work for agents. Agents use files to keep track of their work&#8230;</em></p><p><em>Thus, to have an effective AI agent strategy, companies fundamentally need a content strategy. They need a secure platform to manage critical content and ensure it can connect to all of their people, agents and applications. This is what we&#8217;re building at Box with our Intelligent Content Management Platform.</em></p></blockquote><p>The narrative being proposed is that agent proliferation = inflection in volume of enterprise content that needs to be managed, governed, and secured. Every Claude Cowork session, every OpenClaw task, every custom agent workflow generates files that need to persist beyond the stateless agent session. The agent&#8217;s compute environment may disappear, but the documents it produced, the contracts it reviewed, the research it synthesised &#8212; all of that must be retained, governed, and discoverable.</p><p>The strategic implication is that BOX&#8217;s TAM grows with agent adoption rather than being threatened by it. More agents = more files = more content management demand. This is why Levie described Claude Cowork and OpenClaw as &#8220;universally good things&#8221; for BOX &#8212; every agentic knowledge work session creates content that needs a secure home.</p><p>BOX is seeing its AI SKUs drive enterprise penetration and expansion. Enterprise Advanced launched approximately one year ago and already accounts for 10% of Box&#8217;s revenue. 
The pricing uplift from Enterprise Plus to Enterprise Advanced has been 30-40% on a per-seat basis &#8212; at the high end of the 20-40% range initially anticipated. Total Suites customers now represent 66% of revenue, up from 60% a year ago.</p><p>Even so, top-line growth was still in the single digits at 9%.</p><p>On monetisation, BOX sees itself serving agent demand through API consumption in a headless-first world, or through more seats if human-in-the-loop use cases grow. The consumption model is still immaterial overall, but might be the future when the agent-first economy takes off.</p><p><strong>SAP</strong></p><p>SAP has leaned into the &#8216;Business Data Cloud&#8217; with a framing that foundation models lack sufficient business context to be reliable. Citing the example of recently won customers like H&amp;M, CEO Christian Klein emphasised SAP&#8217;s position as embedded across mission-critical workflows.</p><blockquote><p><em>And so when I think about the future of AI and SAP, I&#8217;m super happy that I have our ERP. I&#8217;m super happy that I have our apps because without those apps, I wouldn&#8217;t have the data. And without the data, I wouldn&#8217;t have an AI.</em></p></blockquote><p>Management also stressed their right to win against a backdrop of geopolitical turbulence:</p><blockquote><p><em>While geopolitical and trade tensions have taken a certain toll on our top line performance in 2025, the growing need for sovereignty and resilience also offers unique opportunities for those vendors that could offer technologies and tools to reduce dependencies from dominant offering. As largest non-U.S. software, SaaS and PaaS vendor, there is no company better positioned than SAP to satisfy this rapidly growing demand.</em></p></blockquote><p>Two-thirds of deals closed with AI in some shape or form. 
</p><p>Looking across MNDY, HUBS and SAP, there&#8217;s an argument for SAP&#8217;s position being strongest because of the criticality of the data they sit on: ERP, supply chain, manufacturing. </p><div><hr></div><h3>Conclusions</h3><ul><li><p>AI impact on revenue still early, particularly the premise of agentic consumption of APIs or credits</p></li><li><p>SMB segment showing weakness, indicating both limits of PLG and competitive threat from labs&#8217; prosumer AI assistants</p></li><li><p>Companies are either leaning on accumulated business logic as a durable asset in a future agent economy where software is headless, or on sheer infrastructure differentiation</p></li><li><p>Enterprises still value governance and security of AI, as well as counterparties owning liabilities; will incumbents win on this vector?</p></li><li><p>Partnerships with labs via connectors and MCPs not seen as major threat to enterprise value, even if surface area of product and customer interactions shrinks; no rev-share or data being transferred. How will these lines evolve?</p></li></ul><div><hr></div><p><em>Have any feedback? 
Reach out on <a href="https://www.linkedin.com/in/akashbajwa/">LinkedIn</a> or <a href="https://x.com/AkashBajwa96">X</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[The Future Of Compute]]></title><description><![CDATA[Nvidia, ASICs, Compilers]]></description><link>https://www.akashbajwa.co/p/the-future-of-compute</link><guid isPermaLink="false">https://www.akashbajwa.co/p/the-future-of-compute</guid><pubDate>Wed, 04 Mar 2026 07:03:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!cWKe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ff8bfb-8251-48f6-a763-3e939ad1bff8_2048x1536.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Software Synthesis analyses the evolution of <strong>software companies in the age of AI</strong> - from how they're built and scaled, to how they go to market and create enduring value. You can reach me on <a href="https://www.linkedin.com/in/akashbajwa/">LinkedIn</a> and <a href="https://x.com/AkashBajwa96">X</a>. 
</em></p><div><hr></div><h4><strong>Roundtables</strong></h4><p><strong>March 5</strong>: <a href="https://luma.com/g22zhqqa">The Future of Software Engineering with Anthropic</a></p><div><hr></div><p>Last week, we hosted a roundtable on the future of AI compute with <a href="https://www.linkedin.com/in/michaelsondergaard/">Michael</a>, CEO of <a href="https://spectralcompute.com/">Spectral Compute</a>, who have built a compiler that compiles CUDA source code directly to native machine instructions for non-NVIDIA GPUs. 
We were joined by attendees from Graphcore, Cerebras, DeepMind and more.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cWKe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ff8bfb-8251-48f6-a763-3e939ad1bff8_2048x1536.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cWKe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ff8bfb-8251-48f6-a763-3e939ad1bff8_2048x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cWKe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ff8bfb-8251-48f6-a763-3e939ad1bff8_2048x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cWKe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ff8bfb-8251-48f6-a763-3e939ad1bff8_2048x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!cWKe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ff8bfb-8251-48f6-a763-3e939ad1bff8_2048x1536.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cWKe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ff8bfb-8251-48f6-a763-3e939ad1bff8_2048x1536.jpeg" width="1456" height="1092" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/57ff8bfb-8251-48f6-a763-3e939ad1bff8_2048x1536.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:367074,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/189568484?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ff8bfb-8251-48f6-a763-3e939ad1bff8_2048x1536.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cWKe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ff8bfb-8251-48f6-a763-3e939ad1bff8_2048x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cWKe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ff8bfb-8251-48f6-a763-3e939ad1bff8_2048x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cWKe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ff8bfb-8251-48f6-a763-3e939ad1bff8_2048x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!cWKe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ff8bfb-8251-48f6-a763-3e939ad1bff8_2048x1536.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h2><strong>1. Nvidia&#8217;s Monopoly Position</strong></h2><p>Michael opened by framing the current state of AI compute as a &#8220;monopoly moment.&#8221; Key points:</p><ul><li><p><strong>Nvidia controls 80%+ of datacenter GPU market share</strong> and briefly touched $5 trillion in market cap.</p></li><li><p>Their dominance stems from a <strong>head start in accelerated computing</strong> that dates back to CUDA&#8217;s first release in 2006&#8211;2007. 
Nearly two decades of consistent innovation in parallel compute has created an entrenched ecosystem.</p></li><li><p>Although open alternatives exist (e.g., Khronos Group standards, which Nvidia itself co-chairs), none approach CUDA&#8217;s adoption or its status as the default.</p></li><li><p>There are now <strong>hundreds of millions of lines of CUDA code</strong> in the wild. Developers have built deep expertise around CUDA over years, creating massive switching costs. Michael&#8217;s analogy: convincing the world to drop CUDA is like convincing everyone to adopt the UK power plug &#8212; technically superior, but practically impossible.</p></li></ul><div><hr></div><h2><strong>2. Why Vendor Optionality Matters</strong></h2><p>Michael argued that having alternatives to Nvidia is critical for multiple reasons beyond just cost.</p><h3><strong>Cost Leverage</strong></h3><ul><li><p>Nvidia was selling H100s at roughly <strong>10x manufacturing cost</strong>, enabled by their monopolistic position.</p></li><li><p>These margins are unsustainable long-term, but the market hasn&#8217;t corrected because of ecosystem lock-in.</p></li></ul><h3><strong>Supply Chain Resilience</strong></h3><ul><li><p>Single-vendor dependency carries severe supply risk. When Nvidia misses production targets or its suppliers slip, customers are stuck.</p></li><li><p>Worst-case wait times for the latest Nvidia compute are <strong>18+ months</strong>.</p></li><li><p>Geographic prioritisation compounds the problem: North American customers get served first, APAC gets deprioritised, and Middle Eastern customers wait until policy deals are brokered.</p></li></ul><h3><strong>Geopolitics</strong></h3><ul><li><p>Compute allocation has become a geopolitical instrument. Where you sit geographically determines your access to cutting-edge chips.</p></li></ul><div><hr></div><h2><strong>3. 
Current Challengers and the Competitive Landscape</strong></h2><p>Michael surveyed the emerging alternatives to Nvidia:</p><h3><strong>AMD Instinct Series</strong></h3><ul><li><p>Architecturally the most similar chips to Nvidia&#8217;s GPUs. The Instinct MI300 series is shipping, with the MI4XX series announced.</p></li><li><p><strong>Meta signed a deal with AMD</strong> to deploy 6 GW of Instinct compute by end of 2027 &#8212; a massive commitment, though execution remains to be seen.</p></li><li><p>AMD GPUs are generally <strong>~30% cheaper</strong> than equivalent Nvidia hardware, but people still aren&#8217;t switching en masse due to software incompatibility.</p></li></ul><h3><strong>Cerebras</strong></h3><ul><li><p>Cerebras&#8217; wafer-scale engine draws <strong>~27 kilowatts</strong> but is more efficient on a per-token basis at scale.</p></li><li><p>Cerebras announced a <strong>500&#8211;750 MW deployment partnership</strong>.</p></li><li><p>Crucially, Michael noted that Cerebras&#8217;s chip is <strong>not an ASIC &#8212; it is programmable and general purpose</strong>, which he considers a wise design choice nowadays.</p></li></ul><h3><strong>Google TPUs</strong></h3><ul><li><p>TPUs are not purchasable outside of Google Cloud Platform; they are fully vertically integrated. 
You access them via GCP subscriptions.</p></li><li><p>The programming model is fundamentally different: systolic arrays with data-flow architecture rather than traditional threads/warps.</p></li><li><p>Writing kernels for TPUs works differently &#8212; there aren&#8217;t traditional kernels.</p></li></ul><h3><strong>ASIC-Based Startups</strong></h3><ul><li><p><strong>There are ASICs focused on transformers from startups like Etched, Positron and </strong>even <strong><a href="https://taalas.com/the-path-to-ubiquitous-ai/">Taalas</a></strong>, which have baked the transformer architecture into chip-level design, claiming speeds such as 17,000 tokens per second.</p></li><li><p>Michael was critical of this approach: baking in transformer-specific logic creates the same depreciation cliff seen with Bitcoin miners (90%+ depreciation within 16&#8211;18 months).</p></li><li><p>He questioned the practical utility of extreme token speeds: &#8220;Your coding agent is going to ask you dumb questions a lot faster&#8221; &#8212; the quality of the model running on these chips matters more than raw throughput.</p></li><li><p>The counterargument raised in discussion: current speeds on 8B models might scale to 250B+ models in the future, making specialisation worthwhile.</p></li><li><p>Michael&#8217;s response: AI innovation is outpacing chip tape-out timelines. The field hasn&#8217;t settled on a final architecture &#8212; the transformer itself may be disrupted. 
Betting silicon on a fixed architecture is premature.</p></li></ul><h3><strong>Other Notable Mentions</strong></h3><ul><li><p><strong>NextSilicon</strong> (Israeli) &#8212; Michael was most enthusiastic about their approach: they&#8217;ve etched the selection DAG part of a compiler into the chip, a novel architectural choice.</p></li><li><p><strong>Axelera AI </strong>(European, embedded/small-scale inference)</p></li><li><p>Michael referenced a list of <strong>~50 startups globally</strong> attempting to build AI chips.</p></li></ul><h3><strong>Microsoft Maia</strong></h3><ul><li><p>Microsoft launched Maia with an SDK access request form. Michael noted he still hadn&#8217;t received his SDK access, suggesting the chip may have run into delays.</p></li></ul><div><hr></div><h2><strong>4. The Depreciation Problem and Programmability</strong></h2><p>A substantial debate emerged around chip depreciation and what it means for investment decisions.</p><h3><strong>The Financial Reality</strong></h3><ul><li><p>H100 chips launched at <strong>$40k per chip</strong>; a single node costs <strong>$400k+</strong>.</p></li><li><p>These are not disposable assets &#8212; buyers need confidence the hardware retains value beyond a one-year cadence.</p></li><li><p>Bitcoin miners depreciate 90%+ within 16&#8211;18 months, creating a cliff that makes planning impossible.</p></li></ul><h3><strong>Cerebras</strong></h3><p>If CUDA is truly so powerful and versatile, why does hardware depreciate so fast? In theory, CUDA&#8217;s universality should let you keep running older chips (e.g., multiple A100s instead of one H100).</p><p>Michael&#8217;s response acknowledged the tension:</p><ul><li><p>Nvidia has a financial incentive to drive new GPU purchases &#8212; they make money on new silicon, not on customers reusing old chips.</p></li><li><p>There&#8217;s deliberate <strong>market segmentation</strong>: Nvidia explicitly wants H100s in certain deployments and not others. 
They don&#8217;t even use the term &#8220;heterogeneous&#8221; for their own chip generations.</p></li><li><p>Additionally, hardware constraints make simple substitution impossible: precision (FP) differences between A100 and H100, and capped InfiniBand on A100s mean you can&#8217;t just swap older interconnects with newer ones for large-scale training.</p></li></ul><h3><strong>The Core Argument</strong></h3><p>Michael&#8217;s central thesis: <strong>programmability and general-purpose capability are the most important chip design attributes</strong> because they preserve long-term value as workloads evolve. Application-specific chips face existential risk from architectural shifts in AI.</p><p>With tape-out costs at $20&#8211;30M minimum and 2&#8211;3 year chip lifecycles, a startup raising $100M gets essentially one generation of silicon. If the AI landscape shifts (e.g., mixture-of-experts changing distributed compute patterns), an architecture-specific chip is stranded.</p><div><hr></div><h2><strong>5. Spectral Compute</strong></h2><h3><strong>The CPU Analogy</strong></h3><p>Michael explained the problem by contrasting CPUs and GPUs:</p><ul><li><p>In the CPU world, you write code once and it runs on AMD, Intel, or ARM without modification. You pick hardware on merits (price, performance, cache) without rewriting your software.</p></li><li><p><strong>This freedom does not exist for GPUs.</strong> Switching from Nvidia to AMD requires porting code, despite the chips being architecturally very similar. 
AMD calls thread groups &#8220;waves&#8221; where Nvidia calls them &#8220;warps&#8221; &#8212; substantively the same thing, but different enough to break compatibility.</p></li></ul><h3><strong>The Spectral Approach</strong></h3><ul><li><p><strong>Embrace CUDA as the de facto standard</strong> rather than trying to replace it.</p></li><li><p>Let developers write CUDA code, pipe it through Spectral&#8217;s compiler, and have it <strong>compile directly to AMD native instructions</strong> (and eventually other architectures).</p></li><li><p>This is not transpilation &#8212; it&#8217;s ahead-of-time compilation that goes all the way down to native hardware instructions, enabling deep optimisation.</p></li><li><p>Developers can still hyper-optimise for specific hardware via #ifdef blocks for particular devices, but the default path works across all supported GPUs.</p></li></ul><h3><strong>Why Direct Compilation Matters</strong></h3><ul><li><p>AMD&#8217;s CUDA equivalent (HIP/ROCm) is frustratingly different in subtle ways that break programs and is also slower in many cases.</p></li><li><p>Transpiling to HIP would lose optimisation opportunities. By compiling directly to AMD native instructions, Spectral can reason about the full pipeline from high-level CUDA to transistor-level operations.</p></li><li><p>A specific example: AMD uses opposite column/row format for tensor operations compared to Nvidia. Spectral&#8217;s compiler can detect this, prove that reordering is safe, and perform the transformation at compile time with zero runtime overhead.</p></li></ul><h3><strong>Performance Results</strong></h3><ul><li><p>Spectral showed benchmarks comparing their compiler against AMD&#8217;s HIP toolkit. In several cases Spectral outperforms AMD&#8217;s own software stack; in others there&#8217;s room for improvement on both sides.</p></li><li><p>They also compared against OpenCL: on Nvidia, OpenCL is behind; on AMD, it&#8217;s roughly on par. 
But there&#8217;s limited OpenCL code in the wild to benchmark against.</p></li></ul><h3><strong>Coverage Status</strong></h3><ul><li><p>Michael showed their current CUDA API coverage, prioritised by real-world usage. Core CUDA APIs are substantially covered, including GrowMax operations.</p></li><li><p><strong>PyTorch support</strong> is their key milestone: early builds expected <strong>mid-Q2 2025</strong>. This is significant because PyTorch exposes nearly all low-level CUDA APIs to Python, so supporting it effectively means comprehensive CUDA coverage.</p></li></ul><h3><strong>Business Traction</strong></h3><ul><li><p>Most traction currently in <strong>academia</strong> &#8212; researchers using some of the world&#8217;s biggest supercomputers.</p></li><li><p>Working with a <strong>consortium of premier motorsport teams</strong> on safety applications.</p></li><li><p>Engagements in <strong>HFT/algorithmic trading</strong>.</p></li><li><p>Goal for the next 12 months: serious deployments with large AI labs and adoption by neocloud providers as the standard portability layer.</p></li></ul><h3><strong>Team and Hiring</strong></h3><ul><li><p><strong>22 people</strong>, remote-first, clustered around Europe (6 in greater London, 4 in Edinburgh, 3 in the Netherlands, 2 in Greece, 1 in Denmark, 1 in Italy, co-founder in California, 1 in Singapore).</p></li><li><p>Talent is extremely scarce: people who are both GPU architecture experts AND compiler specialists number roughly <strong>200 globally</strong>. Spectral recruits from universities and cross-trains.</p></li></ul><h3><strong>Open Source Strategy</strong></h3><ul><li><p>Not currently open-sourcing the core technology. The &#8220;coup de gr&#226;ce&#8221; is their vendor-neutral compiler optimisations, which they consider proprietary IP &#8212; &#8220;for the same reason Nvidia isn&#8217;t open-sourcing theirs.&#8221;</p></li></ul><div><hr></div><h2><strong>6. 
AI-Assisted Compiler Optimisation</strong></h2><p>A notable exchange occurred around AI&#8217;s role in GPU code optimisation.</p><ul><li><p>An attendee noted his team went from <strong>two months to two weeks of optimisation</strong> using AI-assisted coding tools for low-level GPU programming.</p></li><li><p>He questioned whether traditional compilers could ever make the leap from naive attention to Flash Attention &#8212; something that took human researchers two years to develop.</p></li><li><p>Michael&#8217;s view: the compiler &#8220;just isn&#8217;t smart enough yet&#8221; &#8212; nobody has written compilers to fully exploit GPU architectural possibilities. That&#8217;s a core part of Spectral&#8217;s R&amp;D.</p></li><li><p>He argued AI-assisted code generation will eventually <strong>outperform human hand-optimisation</strong> because it can brute-force search a much larger space of possible programs. Combined with fast correctness verification, this creates a direct path to highly optimised code.</p></li><li><p>Importantly, that optimised code should be written in CUDA (not HIP or another niche language), for the same reason LLMs are better at English than Danish: the training data corpus is vastly larger.</p></li></ul><div><hr></div><h2><strong>7. The Cloud Hyperscaler Lock-In Dynamic</strong></h2><p>Discussion turned to how cloud providers are compounding the vendor lock-in problem:</p><ul><li><p>Google, Microsoft, and Amazon each want to <strong>control the full vertical stack</strong> &#8212; custom chips, custom SDKs, and proprietary APIs.</p></li><li><p>Once customers build on a hyperscaler&#8217;s fully integrated stack, switching costs become prohibitive (similar to AWS&#8217;s original API lock-in playbook).</p></li><li><p>The landscape will <strong>fracture further</strong>, not consolidate. 
Unlike the CPU world (Intel, AMD, ARM &#8212; three similar architectures), the GPU/accelerator world could see 20&#8211;30 different architectures.</p></li><li><p>Each hyperscaler&#8217;s new chip generation introduces architectural changes, deepening incompatibility.</p></li><li><p>This trend is likely to intensify as the industry shifts from training toward inference, where specialised inference chips from each hyperscaler will capture more market share.</p></li></ul><div><hr></div><h2><strong>8. Training vs. Inference Shift</strong></h2><p>A significant data point emerged in discussion:</p><ul><li><p><strong>60%+ of total compute</strong> used to train foundation models is now <strong>post-training</strong> (reinforcement learning), not pre-training. The pre-training-dominant era is over.</p></li><li><p>This RL-heavy workload can be disaggregated across data centers but still faces throughput constraints between clusters.</p></li><li><p>This shift has implications for chip design and data center architecture &#8212; different workloads may favour different hardware configurations.</p></li></ul><div><hr></div><h2><strong>9. Data Center and Infrastructure Constraints</strong></h2><h3><strong>Cooling Capacity</strong></h3><ul><li><p>Cerebras&#8217;s wafer-scale engine draws ~27 kW. Regardless of chip vendor, liquid cooling and high-power infrastructure are inevitable.</p></li><li><p>However, the buildout is slow: companies like Cloudflare with hundreds of thousands of PoPs globally can&#8217;t upgrade all sites simultaneously.</p></li><li><p>Meanwhile, the most common PCIe inference card (H100 NVLink) fits standard 600W PCIe slots. Nvidia hasn&#8217;t even updated this form factor for Blackwell. 
There&#8217;s an <strong>underserved niche</strong> for inference in existing data center infrastructure.</p></li></ul><h3><strong>Multi-Data-Center Training</strong></h3><ul><li><p>It was noted that if you can link H100s across two data centers via high-bandwidth interconnects, you can approximate the performance of newer GPUs without ripping and replacing.</p></li><li><p>Michael agreed this works for some workloads that can be disaggregated, but not all.</p></li></ul><div><hr></div><h2><strong>10. Key Debates and Open Questions</strong></h2><h3><strong>Will Nvidia&#8217;s Dominance Erode Through Interoperability?</strong></h3><ul><li><p>The chicken-and-egg problem: developers won&#8217;t switch because alternatives aren&#8217;t faster; alternatives can&#8217;t get faster without developer adoption.</p></li><li><p>Michael&#8217;s answer: some large players (Meta with AMD) are making the leap. Spectral&#8217;s role is to be the &#8220;universal Babel translation layer&#8221; that removes the switching cost entirely. Their roadmap includes supporting AMD&#8217;s ROCm libraries on Nvidia hardware &#8212; enabling developers to use the best of both ecosystems on either vendor&#8217;s chips.</p></li></ul><h3><strong>Groq and Nvidia&#8217;s Acquisition</strong></h3><ul><li><p>Nvidia acquired a non-exclusive IP license to Groq&#8217;s core technology and hired their CEO and senior staff in late 2025. The deal was characterised as more than 3x Groq&#8217;s previous valuation, effectively an &#8220;acquihire&#8221; structure.</p></li></ul><h3><strong>GPU Utilisation Gaps</strong></h3><ul><li><p>There&#8217;s research showing traditional kernels only use 40&#8211;50% of Nvidia GPU compute capacity, largely due to memory-bound bottlenecks.</p></li><li><p>Michael framed this as an algorithmic optimisation problem with multiple attack vectors: data compression, hardware changes, and compiler improvements. 
No single approach solves it universally.</p></li><li><p>Nvidia&#8217;s tensor cores are themselves an example of the hybrid approach: an ASIC-like unit embedded within a general-purpose programmable chip.</p></li></ul><h3><strong>The Cisco Networking Analogy</strong></h3><ul><li><p>There&#8217;s a parallel to Cisco&#8217;s networking dominance: Cisco created an entire certification ecosystem that made switching to Arista painful. However, customers with extreme performance needs (e.g., low-latency trading) will always seek alternatives.</p></li><li><p>Cerebras signed with OpenAI specifically because OpenAI couldn&#8217;t achieve &gt;1,000 tokens/second inference with other hardware and needed breakthrough performance.</p></li></ul><div><hr></div><h3><em>Signals</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://x.com/ScottNolan/status/2026425439275675879?s=20" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KEfO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e6836e-f0d3-44db-b63c-b70704fd979e_1200x923.jpeg 424w, https://substackcdn.com/image/fetch/$s_!KEfO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e6836e-f0d3-44db-b63c-b70704fd979e_1200x923.jpeg 848w, https://substackcdn.com/image/fetch/$s_!KEfO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e6836e-f0d3-44db-b63c-b70704fd979e_1200x923.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!KEfO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e6836e-f0d3-44db-b63c-b70704fd979e_1200x923.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!KEfO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e6836e-f0d3-44db-b63c-b70704fd979e_1200x923.jpeg" width="1200" height="923" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/37e6836e-f0d3-44db-b63c-b70704fd979e_1200x923.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:923,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://x.com/ScottNolan/status/2026425439275675879?s=20&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!KEfO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e6836e-f0d3-44db-b63c-b70704fd979e_1200x923.jpeg 424w, https://substackcdn.com/image/fetch/$s_!KEfO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e6836e-f0d3-44db-b63c-b70704fd979e_1200x923.jpeg 848w, https://substackcdn.com/image/fetch/$s_!KEfO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e6836e-f0d3-44db-b63c-b70704fd979e_1200x923.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!KEfO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e6836e-f0d3-44db-b63c-b70704fd979e_1200x923.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft 
pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!41iy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99ce995a-5000-427e-b1ca-fefddafaee11_1600x1156.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!41iy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99ce995a-5000-427e-b1ca-fefddafaee11_1600x1156.png 424w, 
https://substackcdn.com/image/fetch/$s_!41iy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99ce995a-5000-427e-b1ca-fefddafaee11_1600x1156.png 848w, https://substackcdn.com/image/fetch/$s_!41iy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99ce995a-5000-427e-b1ca-fefddafaee11_1600x1156.png 1272w, https://substackcdn.com/image/fetch/$s_!41iy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99ce995a-5000-427e-b1ca-fefddafaee11_1600x1156.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!41iy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99ce995a-5000-427e-b1ca-fefddafaee11_1600x1156.png" width="1456" height="1052" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/99ce995a-5000-427e-b1ca-fefddafaee11_1600x1156.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1052,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!41iy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99ce995a-5000-427e-b1ca-fefddafaee11_1600x1156.png 424w, 
https://substackcdn.com/image/fetch/$s_!41iy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99ce995a-5000-427e-b1ca-fefddafaee11_1600x1156.png 848w, https://substackcdn.com/image/fetch/$s_!41iy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99ce995a-5000-427e-b1ca-fefddafaee11_1600x1156.png 1272w, https://substackcdn.com/image/fetch/$s_!41iy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99ce995a-5000-427e-b1ca-fefddafaee11_1600x1156.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h3><em>What I&#8217;m Reading</em></h3><p><a href="https://softwaremusings.substack.com/p/a-level-headed-look-at-state-of-software?r=1ac0y&amp;triedRedirect=true">A Level Headed Look at State of Software</a></p><p><a href="https://taalas.com/the-path-to-ubiquitous-ai/">The path to ubiquitous AI</a></p><p><a href="https://andymasley.substack.com/p/strategies-for-learning">Strategies for learning</a></p><div><hr></div><h3><em>Earnings Commentary</em></h3><div class="pullquote"><p>Agents also need to love MongoDB. That requires us to ensure that we have all the right integration with the right places, how we auto scale, how we ought to perform during the peaks and valleys. All of that truly needs to be autonomous and driven by machines. And that requires absolutely the focus from the engineering team that how would machines look at this if they want to provision an additional node or if they want to manage cluster because of resiliency across multiple clouds. So that will be the North Star for us that our agents will love MongoDB as much as today, human developers love MongoDB.</p><p><strong>Chirantan Jitendra Desai, President &amp; CEO MongoDB, Q4 FY2026 Earnings</strong></p></div><div><hr></div><p><em>Have any feedback? 
Reach out on <a href="https://www.linkedin.com/in/akashbajwa/">LinkedIn</a> or <a href="https://x.com/AkashBajwa96">X</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[RL Environments with Scale AI]]></title><description><![CDATA[From Vending Machines To Real World Constraints]]></description><link>https://www.akashbajwa.co/p/rl-environments-with-scale-ai</link><guid isPermaLink="false">https://www.akashbajwa.co/p/rl-environments-with-scale-ai</guid><pubDate>Mon, 23 Feb 2026 07:02:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!01UA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e861924-ff1a-425b-84a1-a005d072175e_1200x1600.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Software Synthesis analyses the evolution of <strong>software companies in the age of AI</strong> - from how they're built and scaled, to how they go to market and create enduring value. You can reach me on <a href="https://www.linkedin.com/in/akashbajwa/">LinkedIn</a> and <a href="https://x.com/AkashBajwa96">X</a>. 
</em></p><div><hr></div><h4><strong>Gradient Descending Roundtables</strong></h4><p><strong>February 25:</strong> <a href="https://luma.com/f7s69jj2">The Future of AI Compute</a></p><p><strong>March 5</strong>: <a href="https://luma.com/g22zhqqa">The Future of Software Engineering with Anthropic</a></p><div><hr></div><p>Last week, we hosted <a href="https://www.linkedin.com/in/mattspaul/">Matt</a> and <a href="https://www.linkedin.com/in/ipthomas/">Thomas</a> from Scale AI, following a previous roundtable on <a href="https://www.akashbajwa.co/p/rubrics-as-rewards-reinforcement">Rubrics as Rewards</a>. 
We discussed RL environments, one of the bigger themes in AI this year.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!01UA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e861924-ff1a-425b-84a1-a005d072175e_1200x1600.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!01UA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e861924-ff1a-425b-84a1-a005d072175e_1200x1600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!01UA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e861924-ff1a-425b-84a1-a005d072175e_1200x1600.jpeg 848w, https://substackcdn.com/image/fetch/$s_!01UA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e861924-ff1a-425b-84a1-a005d072175e_1200x1600.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!01UA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e861924-ff1a-425b-84a1-a005d072175e_1200x1600.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!01UA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e861924-ff1a-425b-84a1-a005d072175e_1200x1600.jpeg" width="1200" height="1600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8e861924-ff1a-425b-84a1-a005d072175e_1200x1600.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:222281,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/187859935?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e861924-ff1a-425b-84a1-a005d072175e_1200x1600.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!01UA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e861924-ff1a-425b-84a1-a005d072175e_1200x1600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!01UA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e861924-ff1a-425b-84a1-a005d072175e_1200x1600.jpeg 848w, https://substackcdn.com/image/fetch/$s_!01UA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e861924-ff1a-425b-84a1-a005d072175e_1200x1600.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!01UA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e861924-ff1a-425b-84a1-a005d072175e_1200x1600.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Core Challenges in Building RL Environments</strong></h3><p><strong>1. The Optimisation/Scale Problem.</strong> The ideal RL environment should be extensive and cover many domains, but larger environments are computationally heavier. There&#8217;s a fundamental tension between realism/breadth and efficiency. GPU constraints are significant across training, inference, and environment simulation simultaneously. Anyone doing large-scale RL environment training would face GPU, DRAM, and potentially CPU bottlenecks across all three vectors.</p><p><strong>2. Verifiability Across Domains.</strong> Coding is the gold standard for RL because outputs can be programmatically tested. But many domains lack this property. 
Matt and Thomas described using &#8220;rubrics&#8221;, sets of 20-30 rules that an LLM judge applies to score outputs in harder-to-verify domains, but acknowledged that this approach still carries reward hacking risk, is compute-intensive (it runs two sets of inference at scale), and depends heavily on rubric quality, which requires expert curation.</p><p><strong>3. Real-World Relevance.</strong> Environments must actually drive performance on things end customers care about. Building an environment around vending machines might not justify huge investment. The benchmark and the environment need to be jointly validated: you need confidence that improvements on the benchmark reflect improvements in real-world performance, not just benchmark hacking.</p><div><hr></div><h3><strong>The Vending Machine Case Study</strong></h3><p>This was presented as a Scale AI internal hackathon project, partly inspired by an actual Anthropic experiment where Claude was given control of a real office vending machine (stocking items, searching the web for suppliers, sending emails to reorder). The hackathon version was scoped down to fit a 2-day format.</p><p><strong>Environment Structure</strong></p><p>The environment was built around a simple loop:</p><ul><li><p><strong>Observation</strong> (input to model): Current inventory (item slots, quantities, prices), available cash balance, current time, and a delta log of changes since the last step (deliveries, price changes, sales)</p></li><li><p><strong>Action</strong> (output from model): A JSON object specifying restock orders and/or price changes</p></li><li><p><strong>Step function</strong>: Takes the action JSON, updates the simulator state, returns the new observation</p></li></ul><p>The model used was GPT OSS (with special tokens visible in the transcript examples). 
Reasoning tokens were artificially constrained because unconstrained reasoning allowed models to crush benchmarks.</p><p><strong>Mechanics of the Simulator</strong></p><ul><li><p>Demand was simulated based on price (lower price = higher demand)</p></li><li><p>Operating costs were baked in so the agent couldn&#8217;t simply do nothing &#8212; inaction leads to consistent losses</p></li><li><p>Expiration and other costs added further realism</p></li><li><p>Delivery lag was simulated (restock orders appear in future observations)</p></li></ul><p><strong>Training Run</strong></p><ul><li><p>Used the GRPO (Group Relative Policy Optimization) algorithm</p></li><li><p>Only 50 training steps &#8212; a very short run</p></li><li><p>Eval: 10 independent runs of 4 days each (16 steps per run)</p></li><li><p>Base model (before RL training): Lost ~50% of starting capital on average</p></li><li><p>After RL training: Generated massive returns (example cited: ~&#163;3,500 from &#163;100 starting capital)</p></li></ul><p><strong>Key Finding &#8212; Reward Hacking vs. Environment Realism</strong> The team initially suspected reward hacking given the extraordinary returns, but inspection of the actual generation traces showed that the model had genuinely learned good business operations. The real problem was <strong>environment realism</strong>: the simulation had no competition. In the real world, high-margin strategies attract competitors who arbitrage profits away. The model found the globally optimal strategy within the (unrealistic) environment, not a strategy that would work in the real world.</p><p><strong>Reward Hacking Example Encountered</strong> Early in development, the team found the agent exploiting a loophole: one item had very low supplier cost but very high demand. The agent filled every slot with that single item at maximum margin. 
This was caught by manually reviewing generation logs &#8212; the entire output was just repeated restock orders for the same item.</p><div><hr></div><h3><strong>Reward Design: Sparse vs. Dense Rewards</strong></h3><p>A significant portion of the discussion covered reward function design:</p><p><strong>Sparse Rewards (traditional approach)</strong></p><ul><li><p>Run the entire simulation, assign a binary 1/0 at the end (success/failure)</p></li><li><p>Extremely compute-inefficient &#8212; you spend enormous compute for 1 bit of signal</p></li><li><p>Poor for scaling training</p></li></ul><p><strong>Dense Rewards (what they used)</strong> Three improvements over sparse:</p><ol><li><p><strong>Continuous reward instead of binary</strong>: Use actual cash balance as reward (e.g., &#163;100 profit &#8594; reward of 100, not just 1)</p></li><li><p><strong>Intermediate step rewards</strong>: Assign small rewards/penalties at each step, not just at episode end. If the agent makes a small profit in step 3, that&#8217;s reflected in the step-3 reward. This provides dramatically more training signal per compute dollar</p></li><li><p><strong>Curriculum/shaping</strong>: Adding human prior knowledge through reward structure helps models learn faster, though it introduces the risk of over-constraining behaviour</p></li></ol><p>The tension highlighted: more intermediate rewards guide the model better but risk teaching it to optimise for the reward signal rather than the true objective. The answer given was that compute constraints force you toward denser rewards in practice &#8212; in an ideal world with unlimited compute you&#8217;d just use a single sparse end reward, but that&#8217;s not viable at scale.</p><div><hr></div><h3><strong>Verifiability and Domain Expansion</strong></h3><p><strong>Domains that work well with RL environments</strong></p><ul><li><p>Coding (programmatic test execution)</p></li><li><p>Computer use / tool interaction (did it navigate to the right page? 
Binary, immediate)</p></li><li><p>Specific financial tasks (binary compliance checks)</p></li></ul><p><strong>Rubric-based approach for harder domains</strong></p><ul><li><p>Construct 20-30 expert-curated rules</p></li><li><p>Use an LLM as a judge to score outputs against the rubric</p></li><li><p>Scale has a public benchmark, &#8220;PR Bench&#8221;, specifically for financial services and legal work</p></li><li><p>Risks: reward hacking via rubric exploitation, LLM judge biases, compute cost of dual inference</p></li><li><p>In enterprise settings, they often build narrow domain-specific benchmarks rather than broad ones</p></li></ul><p><strong>Key insight on domain selection</strong>: For RL environments to be valuable, the benchmark must be: (a) something you can agree actually represents desired performance, and (b) tight enough that improving it reflects real capability gains rather than benchmark gaming. &#8220;Maximise profit in an undefined scenario&#8221; is too broad; &#8220;improve Rust coding on this specific task suite&#8221; is sufficiently constrained.</p><div><hr></div><h3><strong>Reward Hacking Detection and Prevention</strong></h3><p><strong>How it&#8217;s caught</strong>: Primarily by sampling generations and reading the raw model outputs manually. No automated detection method was described as reliable.</p><p><strong>Why it&#8217;s hard to prevent by design</strong>: Models find exploits humans don&#8217;t anticipate because they run thousands of simulations and inevitably discover edge cases in the environment logic.</p><p><strong>Real-world example cited</strong>: Chinese labs reportedly hacked SWE-bench (a coding benchmark) by looking at future git commits to copy solutions rather than actually solving problems &#8212; even frontier labs struggle with this.</p><p><strong>Dual-benchmark approach</strong>: Run training against benchmark A, but also monitor performance on a separate benchmark B that targets the same real-world capability. 
If B collapses while A improves, that is strong evidence of benchmark hacking.</p><div><hr></div><h3><strong>Discussion: Multi-Objective and Competing Goals</strong></h3><p>Competing goals pose a challenge, for example a shopping agent that maximises short-term basket size while damaging long-term customer retention:</p><ul><li><p>The environment itself doesn&#8217;t resolve competing objectives; the agent is free to explore any strategy</p></li><li><p>The horizon is fixed (e.g., 4 days) and the agent learns whatever maximises reward within that horizon</p></li><li><p>Competing long-term goals (e.g., a B-corp optimising for both profit and sustainability) are hard to encode: you&#8217;d need to design separate reward components, or accept that some goals require other training methods (RLHF rather than RLVR)</p></li><li><p>RLHF (reinforcement learning from human preference feedback) is the approach for subjective domains (does this slide look good?), while RLVR (reinforcement learning with verifiable rewards) handles objective ones</p></li></ul><div><hr></div><h3><strong>Scalability and the Real World vs. Simulation Debate</strong></h3><p><strong>Can you just use the real world as the environment?</strong> Why not use a real corner shop? The answer: you don&#8217;t have infinite compute to represent the real world, and the real world is too slow and expensive to run millions of times. The optimisation problem is finding a simulation that is realistic enough to drive genuine performance improvement but constrained enough to be computationally tractable. This was described as an open problem for 2026.</p><p><strong>Multiplayer/competitive environments as middle ground</strong> OpenAI Five (Dota 2) was cited as the canonical example &#8212; a rich, competitive, multi-agent environment that can be run millions of times. Video games were popular for this, but labs have moved away from them because real-world tasks (coding, office work) drive more economic value. 
However, for robotics and world models, game-like simulation environments are coming back into favour.</p><p><strong>CPU as emerging bottleneck</strong> As environments become more complex (computer use, video editing, Linux environments), CPU becomes a constraint alongside GPU. Running heavy OS-level simulations at RL training scale could make CPU the binding constraint. One participant suggested rewriting environments in JAX to run on GPU, though this clearly doesn&#8217;t apply to high-fidelity real-world simulations.</p><div><hr></div><h3><strong>Training Pipeline Efficiency</strong></h3><p>Pipeline efficiency was emphasised as critical because RL is inherently sequential (run simulation &#8594; training step &#8594; repeat) and cannot easily be parallelised:</p><ul><li><p>Any inefficiency in any layer (kernels, model, environment simulator) compounds across the entire training run</p></li><li><p>GPU utilisation must be maximised since idle GPU time during RL is extremely costly</p></li><li><p>Environment step functions must be highly optimised so they do not block training</p></li></ul><div><hr></div><h3><strong>Multi-Agent Training Clarification</strong></h3><p>A point of confusion clarified in Q&amp;A: the 10 evaluation runs shown in the graphs were not separate agents learning from each other &#8212; they were 10 independent runs of the same model (to account for non-determinism). Training consisted of 50 gradient update steps, in which all runs contributed to shared model weight updates via GRPO. 
Post-training, another 10 independent eval runs were conducted.</p><div><hr></div><h3><em>Signals</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qawa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f02d748-cf5a-4d96-a501-c3abf7f1e00a_1770x964.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qawa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f02d748-cf5a-4d96-a501-c3abf7f1e00a_1770x964.png 424w, https://substackcdn.com/image/fetch/$s_!qawa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f02d748-cf5a-4d96-a501-c3abf7f1e00a_1770x964.png 848w, https://substackcdn.com/image/fetch/$s_!qawa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f02d748-cf5a-4d96-a501-c3abf7f1e00a_1770x964.png 1272w, https://substackcdn.com/image/fetch/$s_!qawa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f02d748-cf5a-4d96-a501-c3abf7f1e00a_1770x964.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qawa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f02d748-cf5a-4d96-a501-c3abf7f1e00a_1770x964.png" width="1456" height="793" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7f02d748-cf5a-4d96-a501-c3abf7f1e00a_1770x964.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:793,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:171066,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/187859935?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f02d748-cf5a-4d96-a501-c3abf7f1e00a_1770x964.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qawa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f02d748-cf5a-4d96-a501-c3abf7f1e00a_1770x964.png 424w, https://substackcdn.com/image/fetch/$s_!qawa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f02d748-cf5a-4d96-a501-c3abf7f1e00a_1770x964.png 848w, https://substackcdn.com/image/fetch/$s_!qawa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f02d748-cf5a-4d96-a501-c3abf7f1e00a_1770x964.png 1272w, https://substackcdn.com/image/fetch/$s_!qawa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f02d748-cf5a-4d96-a501-c3abf7f1e00a_1770x964.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h3><em>What I&#8217;m Reading</em></h3><p><a href="https://bearing.substack.com/p/founders-should-chase-secrets">Founders Should Chase Secrets</a></p><p><a href="https://meritech.substack.com/p/times-up-for-saas-grow-faster-or">Time&#8217;s Up for SaaS (Grow Faster or Vanish)</a></p><p><a href="https://x.com/WillManidis/status/2019850913599676524/?rw_tt_thread=True">End Game Play</a></p><p><a href="https://x.com/gsivulka/status/2024187126020272197">In defense of vertical software</a></p><p><a href="https://x.com/clairevo/status/2023908375084617729">You&#8217;ve been kicked out of the arena, you just don&#8217;t know it yet</a></p><div><hr></div><h3><em>Earnings Commentary</em></h3><div class="pullquote"><p>I think though that if you look at the workflow overall, it used to be the case like even for sure, a year ago, perhaps even 6, 
9 months ago, that a lot of people saw the workflow of product development is very linear.</p><p>The way I think about it is you&#8217;re sampling these infinite possibilities space. And you&#8217;re trying to determine what are the right options to go explore in that space and then push them forward with design. And I think that you can do that through code. You can do that through design, but code is more linear. It&#8217;s more -- you&#8217;re really advancing in one direction. And so you might be moving fast, but make sure you&#8217;re going to the right place before you go too far.</p><p>Whereas design, you&#8217;re really thinking about what are the range of possibilities I should explore and you&#8217;re weighing them and figuring out the trade-offs. And I also think the opportunity for polish and craft and design is quite high.</p><p><strong>Dylan Field, Figma Q4 2025 Earnings</strong></p></div><div class="pullquote"><p>If humans looked at 5 sites when they were making a decision, agents might look at 5,000. If humans had to fall back on generalized software and interfaces, agents allow for infinite customizability of every software application for every need. If humans follow a common circadian rhythm to work, agents never need to sleep. Agents, in other words, are the ultimate infrastructure multiplier.</p><p><strong>Matthew Prince, Cloudflare Q4 2025 Earnings</strong></p></div><div><hr></div><p><em>Have any feedback? 
Reach out on <a href="https://www.linkedin.com/in/akashbajwa/">LinkedIn</a> or <a href="https://x.com/AkashBajwa96">X</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[Software's Metamorphosis]]></title><description><![CDATA[Winning The Upgrade Cycle]]></description><link>https://www.akashbajwa.co/p/softwares-metamorphosis</link><guid isPermaLink="false">https://www.akashbajwa.co/p/softwares-metamorphosis</guid><pubDate>Mon, 02 Feb 2026 07:01:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!0C7l!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e0649f8-9db1-487b-9aef-0542b1d7c642_1410x736.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Software Synthesis analyses the evolution of <strong>software companies in the age of AI</strong> - from how they're built and scaled, to how they go to market and create enduring value. You can reach <strong><a href="https://www.linkedin.com/in/akashbajwa/">me</a></strong> at <strong>akash@earlybird.com</strong>.</em></p><div><hr></div><h4><strong>Gradient Descending Roundtables</strong></h4><p><strong>February 18:</strong> <a href="https://luma.com/6qz9tpzg">RL Environments with Scale 
AI</a></p><p><strong>February 25:</strong> <a href="https://luma.com/f7s69jj2">The Future of AI Compute</a></p><p><strong>March 4</strong>: <a href="https://luma.com/g22zhqqa">The Future of Software Engineering with Anthropic</a></p><div><hr></div><p>Before Moltbook took over the internet this past weekend, the sell-off of public software companies reignited debates around the future of software.</p><p>Andrej Karpathy&#8217;s <a href="https://x.com/karpathy/status/2015883857489522876">comments</a> on recent advances in AI coding read ominously:</p><blockquote><p><em>LLM agent capabilities (Claude &amp; Codex especially) have crossed some kind of <strong>threshold of coherence around December 2025</strong> and caused a phase shift in software engineering and closely related. The intelligence part suddenly feels quite a bit ahead of all the rest of it - integrations (tools, knowledge), the necessity for new organizational workflows, processes, diffusion more generally. 2026 is going to be a high energy year as the industry <strong>metabolizes the new capability</strong>.</em></p></blockquote><p>The markets are spooked even as incumbents crush earnings estimates, because fundamental assumptions behind the quality of the software business model have come into question. As <a href="https://cloudedjudgement.substack.com/p/clouded-judgement-13026-software">Jamin Ball wrote</a>:</p><blockquote><p><em>Mainly, confidence in the SaaS business model has shattered. SaaS businesses were long thought of as &#8220;cash flow annuities.&#8221; Loose money early on, flip profitable, and then every year print cash predictably. You could then calculate the &#8220;intrinsic value&#8221; of a SaaS business by summing the present value of every annual cash flow, with a terminal value assumption. 
More specifically, calculate the present value of the next 10 years of cash flows (discounted back to today), and make an assumption of the terminal value (ie year 11 onward).</em></p></blockquote><blockquote><p><em>AI is creating huge questions about what the future retention rates of these &#8220;stable&#8221; software companies will be. Software bears will say this platform shift will lead to <strong>deteriorating retention rates</strong> as companies leave behind legacy SaaS vendors for modern AI native alternatives. At the same time (and related), this is increasing the probability that the <strong>terminal value is in fact 0</strong> for some companies.</em></p></blockquote><p>Retention has stabilised, having compressed from 2021 levels. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lMG-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99784c4f-2826-4d09-8d88-ac34bc278a63_2900x1280.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lMG-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99784c4f-2826-4d09-8d88-ac34bc278a63_2900x1280.png 424w, https://substackcdn.com/image/fetch/$s_!lMG-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99784c4f-2826-4d09-8d88-ac34bc278a63_2900x1280.png 848w, https://substackcdn.com/image/fetch/$s_!lMG-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99784c4f-2826-4d09-8d88-ac34bc278a63_2900x1280.png 1272w, 
https://substackcdn.com/image/fetch/$s_!lMG-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99784c4f-2826-4d09-8d88-ac34bc278a63_2900x1280.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lMG-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99784c4f-2826-4d09-8d88-ac34bc278a63_2900x1280.png" width="1456" height="643" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/99784c4f-2826-4d09-8d88-ac34bc278a63_2900x1280.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:643,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:575887,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/186405524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99784c4f-2826-4d09-8d88-ac34bc278a63_2900x1280.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lMG-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99784c4f-2826-4d09-8d88-ac34bc278a63_2900x1280.png 424w, https://substackcdn.com/image/fetch/$s_!lMG-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99784c4f-2826-4d09-8d88-ac34bc278a63_2900x1280.png 848w, 
https://substackcdn.com/image/fetch/$s_!lMG-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99784c4f-2826-4d09-8d88-ac34bc278a63_2900x1280.png 1272w, https://substackcdn.com/image/fetch/$s_!lMG-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99784c4f-2826-4d09-8d88-ac34bc278a63_2900x1280.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: <a href="https://docsend.com/view/iknzz8xwkzkjf88z">Avenir</a></figcaption></figure></div><p>Software 
companies have become <em>cash flow</em> profitable, but this hasn&#8217;t translated to true economic profitability on a GAAP basis given the impact of <strong>stock-based compensation.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4ZJL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9171693-747e-43a4-8f2f-e8e5ac335973_1856x1126.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4ZJL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9171693-747e-43a4-8f2f-e8e5ac335973_1856x1126.png 424w, https://substackcdn.com/image/fetch/$s_!4ZJL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9171693-747e-43a4-8f2f-e8e5ac335973_1856x1126.png 848w, https://substackcdn.com/image/fetch/$s_!4ZJL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9171693-747e-43a4-8f2f-e8e5ac335973_1856x1126.png 1272w, https://substackcdn.com/image/fetch/$s_!4ZJL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9171693-747e-43a4-8f2f-e8e5ac335973_1856x1126.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4ZJL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9171693-747e-43a4-8f2f-e8e5ac335973_1856x1126.png" width="1456" height="883" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b9171693-747e-43a4-8f2f-e8e5ac335973_1856x1126.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:883,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:324726,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/186405524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9171693-747e-43a4-8f2f-e8e5ac335973_1856x1126.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4ZJL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9171693-747e-43a4-8f2f-e8e5ac335973_1856x1126.png 424w, https://substackcdn.com/image/fetch/$s_!4ZJL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9171693-747e-43a4-8f2f-e8e5ac335973_1856x1126.png 848w, https://substackcdn.com/image/fetch/$s_!4ZJL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9171693-747e-43a4-8f2f-e8e5ac335973_1856x1126.png 1272w, https://substackcdn.com/image/fetch/$s_!4ZJL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9171693-747e-43a4-8f2f-e8e5ac335973_1856x1126.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Companies became &#8216;Rule of 40&#8217; efficient&#8230; Source: Jefferies</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3u4l!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0598979c-ec1c-4605-bb62-9e7124cfef90_2922x1226.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3u4l!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0598979c-ec1c-4605-bb62-9e7124cfef90_2922x1226.png 424w, 
https://substackcdn.com/image/fetch/$s_!3u4l!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0598979c-ec1c-4605-bb62-9e7124cfef90_2922x1226.png 848w, https://substackcdn.com/image/fetch/$s_!3u4l!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0598979c-ec1c-4605-bb62-9e7124cfef90_2922x1226.png 1272w, https://substackcdn.com/image/fetch/$s_!3u4l!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0598979c-ec1c-4605-bb62-9e7124cfef90_2922x1226.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3u4l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0598979c-ec1c-4605-bb62-9e7124cfef90_2922x1226.png" width="1456" height="611" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0598979c-ec1c-4605-bb62-9e7124cfef90_2922x1226.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:611,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:879208,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/186405524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0598979c-ec1c-4605-bb62-9e7124cfef90_2922x1226.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!3u4l!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0598979c-ec1c-4605-bb62-9e7124cfef90_2922x1226.png 424w, https://substackcdn.com/image/fetch/$s_!3u4l!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0598979c-ec1c-4605-bb62-9e7124cfef90_2922x1226.png 848w, https://substackcdn.com/image/fetch/$s_!3u4l!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0598979c-ec1c-4605-bb62-9e7124cfef90_2922x1226.png 1272w, https://substackcdn.com/image/fetch/$s_!3u4l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0598979c-ec1c-4605-bb62-9e7124cfef90_2922x1226.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">But true profitability is still rare&#8230; Source: Avenir</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0Zof!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5689beb-13bc-4689-9464-c41e455ecb13_927x405.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Zof!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5689beb-13bc-4689-9464-c41e455ecb13_927x405.png 424w, https://substackcdn.com/image/fetch/$s_!0Zof!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5689beb-13bc-4689-9464-c41e455ecb13_927x405.png 848w, https://substackcdn.com/image/fetch/$s_!0Zof!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5689beb-13bc-4689-9464-c41e455ecb13_927x405.png 1272w, https://substackcdn.com/image/fetch/$s_!0Zof!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5689beb-13bc-4689-9464-c41e455ecb13_927x405.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!0Zof!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5689beb-13bc-4689-9464-c41e455ecb13_927x405.png" width="927" height="405" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/c5689beb-13bc-4689-9464-c41e455ecb13_927x405.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:405,&quot;width&quot;:927,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0Zof!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5689beb-13bc-4689-9464-c41e455ecb13_927x405.png 424w, https://substackcdn.com/image/fetch/$s_!0Zof!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5689beb-13bc-4689-9464-c41e455ecb13_927x405.png 848w, https://substackcdn.com/image/fetch/$s_!0Zof!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5689beb-13bc-4689-9464-c41e455ecb13_927x405.png 1272w, https://substackcdn.com/image/fetch/$s_!0Zof!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5689beb-13bc-4689-9464-c41e455ecb13_927x405.png 1456w" sizes="100vw" loading="lazy"></picture><div 
class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Rising SBC has been structurally diluting returns for SaaS investors.. 
<a href="https://vosscapital.substack.com/p/stock-comp-in-software">Source</a></figcaption></figure></div><p>In light of this, there are three scenarios being debated, all against the backdrop of a potential historic upgrade cycle across the enterprise:</p><ol><li><p>Incumbents infuse <strong>just enough</strong> AI across their product surface area, combined with their distribution advantages, to drive renewals</p></li><li><p>AI-native disruptors win the upgrade cycle and displace incumbents</p></li><li><p>Enterprises build their own applications/agents</p></li></ol><p>The most compelling argument against the last of these is simple: <a href="https://buffettsdisciple.substack.com/p/saas-widely-misunderstood-csuto-toiv?r=1ac0y&amp;shareImageVariant=overlay&amp;triedRedirect=true">focus</a>.</p><blockquote><p><em>Which leads me to the greatest irony of all: <strong>the companies that spend the most time and effort replacing SaaS with internal tools are the most likely to be disrupted. </strong>In business, nothing matters more than focus. The second you start expending engineering, product teams, and designers to reinvent the wheel, you&#8217;re distracting from your core competency, slowing everything and everyone down. Why do you think Netflix won the streaming wars? Why does Spotify lead in streaming music? Why does TSMC make all the chips in the world and not Intel? <strong>Focus</strong>. 
The best thing any business can do is invest time and effort in <strong>amplifying its own product offerings with AI.</strong> Replacing existing products to save negligible costs is a waste of time.</em></p></blockquote><p>A close second is the gap in <a href="https://buffettsdisciple.substack.com/p/saas-widely-misunderstood-csuto-toiv?r=1ac0y&amp;shareImageVariant=overlay&amp;triedRedirect=true">capabilities and total cost of ownership</a> between third-party vendors, who are themselves <strong>singularly</strong> <strong>focused</strong> on their core product, and software developed in-house.</p><blockquote><p><em>SaaS is used because you get an exceptional product at extremely compelling financials; building it internally makes ZERO sense. The corporate version of Slack costs $18 per head per month. For a company of 1,000 people, you&#8217;re paying approximately $220,000 a year for Slack. $220K/year, which yields a proxy for roughly $75M in R&amp;D effort annually. In other words, <strong>you&#8217;re earning a 340x amplifier for your money.</strong> SaaS has historically been an exceptional business because of how compelling the financials are for <strong>everyone</strong>. Slack is able to build an <strong>exceptional product because it focuses on one thing</strong>, their messaging app. Then they offer it at very compelling financial terms because they can leverage scale. 
It&#8217;s cheap for their customers while still highly profitable for them due to margins and scale.</em></p></blockquote><p>JP Morgan&#8217;s internal tech team made similar remarks: AI is a TAM-expanding technology and the best use of incremental dollars is on the core business rather than rebuilding third-party apps.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://x.com/obsidiancap1/status/2014835568840962234?s=20" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8yQr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cc2b3d2-c0a7-49c0-bd09-8ccc6580253e_1070x684.png 424w, https://substackcdn.com/image/fetch/$s_!8yQr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cc2b3d2-c0a7-49c0-bd09-8ccc6580253e_1070x684.png 848w, https://substackcdn.com/image/fetch/$s_!8yQr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cc2b3d2-c0a7-49c0-bd09-8ccc6580253e_1070x684.png 1272w, https://substackcdn.com/image/fetch/$s_!8yQr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cc2b3d2-c0a7-49c0-bd09-8ccc6580253e_1070x684.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8yQr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cc2b3d2-c0a7-49c0-bd09-8ccc6580253e_1070x684.png" width="1070" height="684" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3cc2b3d2-c0a7-49c0-bd09-8ccc6580253e_1070x684.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:684,&quot;width&quot;:1070,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:203294,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://x.com/obsidiancap1/status/2014835568840962234?s=20&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/186405524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cc2b3d2-c0a7-49c0-bd09-8ccc6580253e_1070x684.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8yQr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cc2b3d2-c0a7-49c0-bd09-8ccc6580253e_1070x684.png 424w, https://substackcdn.com/image/fetch/$s_!8yQr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cc2b3d2-c0a7-49c0-bd09-8ccc6580253e_1070x684.png 848w, https://substackcdn.com/image/fetch/$s_!8yQr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cc2b3d2-c0a7-49c0-bd09-8ccc6580253e_1070x684.png 1272w, https://substackcdn.com/image/fetch/$s_!8yQr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cc2b3d2-c0a7-49c0-bd09-8ccc6580253e_1070x684.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Recent surveys suggest this sentiment still holds true in the enterprise.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0C7l!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e0649f8-9db1-487b-9aef-0542b1d7c642_1410x736.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0C7l!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e0649f8-9db1-487b-9aef-0542b1d7c642_1410x736.png 424w, 
https://substackcdn.com/image/fetch/$s_!0C7l!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e0649f8-9db1-487b-9aef-0542b1d7c642_1410x736.png 848w, https://substackcdn.com/image/fetch/$s_!0C7l!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e0649f8-9db1-487b-9aef-0542b1d7c642_1410x736.png 1272w, https://substackcdn.com/image/fetch/$s_!0C7l!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e0649f8-9db1-487b-9aef-0542b1d7c642_1410x736.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0C7l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e0649f8-9db1-487b-9aef-0542b1d7c642_1410x736.png" width="1410" height="736" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3e0649f8-9db1-487b-9aef-0542b1d7c642_1410x736.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:736,&quot;width&quot;:1410,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:207277,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/186405524?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e0649f8-9db1-487b-9aef-0542b1d7c642_1410x736.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0C7l!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e0649f8-9db1-487b-9aef-0542b1d7c642_1410x736.png 
424w, https://substackcdn.com/image/fetch/$s_!0C7l!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e0649f8-9db1-487b-9aef-0542b1d7c642_1410x736.png 848w, https://substackcdn.com/image/fetch/$s_!0C7l!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e0649f8-9db1-487b-9aef-0542b1d7c642_1410x736.png 1272w, https://substackcdn.com/image/fetch/$s_!0C7l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e0649f8-9db1-487b-9aef-0542b1d7c642_1410x736.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: Avenir</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UF5Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43b39ff2-c19d-4052-835e-cfaabf342114_2000x1575.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UF5Y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43b39ff2-c19d-4052-835e-cfaabf342114_2000x1575.png 424w, https://substackcdn.com/image/fetch/$s_!UF5Y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43b39ff2-c19d-4052-835e-cfaabf342114_2000x1575.png 848w, https://substackcdn.com/image/fetch/$s_!UF5Y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43b39ff2-c19d-4052-835e-cfaabf342114_2000x1575.png 1272w, https://substackcdn.com/image/fetch/$s_!UF5Y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43b39ff2-c19d-4052-835e-cfaabf342114_2000x1575.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UF5Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43b39ff2-c19d-4052-835e-cfaabf342114_2000x1575.png" width="1456" height="1147" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/43b39ff2-c19d-4052-835e-cfaabf342114_2000x1575.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1147,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UF5Y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43b39ff2-c19d-4052-835e-cfaabf342114_2000x1575.png 424w, https://substackcdn.com/image/fetch/$s_!UF5Y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43b39ff2-c19d-4052-835e-cfaabf342114_2000x1575.png 848w, https://substackcdn.com/image/fetch/$s_!UF5Y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43b39ff2-c19d-4052-835e-cfaabf342114_2000x1575.png 1272w, https://substackcdn.com/image/fetch/$s_!UF5Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43b39ff2-c19d-4052-835e-cfaabf342114_2000x1575.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For now, at least, the upgrade cycle is going to be contested by incumbents and disruptors.</p><p>Klarna&#8217;s narrative in the run-up to its IPO is instructive.</p><p>Staking its credentials as an AI-native company, Klarna decided to churn from Salesforce, among other SaaS apps, and consolidate <a href="https://x.com/klarnaseb/status/1896698293759230429">around a graph database built on Neo4j</a>. 
</p><p>Sebastian Siemiatkowski&#8217;s explanation nearly one year ago was <a href="https://x.com/klarnaseb/status/1896698293759230429">prescient</a>:</p><blockquote><p><em>Key to our explorations became the conclusion that the utilization of SaaS to store all forms of knowledge of what Klarna is, why it exists (docs), what it tries to accomplish (slides, tickets, kanban boards), how it is doing (sheets, analytics), who is it dealing with (CRM, supplier management), who works here (ERP, HR) and what it has learnt was fragmented over these SaaS&#8212;most of them having their own ideas and concepts and creating an <strong>unnavigable web of knowledge that required a tremendous amount of Klarna specific expertise</strong> to operate and utilize.</em></p></blockquote><p>Since then, incumbents have continued to <a href="https://tomtunguz.com/defense-comes-to-software/">erect walls</a> around their data, and the concept of <a href="https://foundationcapital.com/context-graphs-ais-trillion-dollar-opportunity/">context graphs to collect company-specific decision traces</a> has gone viral. 
</p><p>Directionally, it&#8217;s becoming clearer that enterprises want to remove silos and instrument the telemetry to capture decision traces that will enable agents.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Cc6e!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Cc6e!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png 424w, https://substackcdn.com/image/fetch/$s_!Cc6e!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png 848w, https://substackcdn.com/image/fetch/$s_!Cc6e!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png 1272w, https://substackcdn.com/image/fetch/$s_!Cc6e!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Cc6e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png" width="807" height="265" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:265,&quot;width&quot;:807,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Cc6e!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png 424w, https://substackcdn.com/image/fetch/$s_!Cc6e!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png 848w, https://substackcdn.com/image/fetch/$s_!Cc6e!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png 1272w, https://substackcdn.com/image/fetch/$s_!Cc6e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: Morgan Stanley</figcaption></figure></div><p>That future, one where stateful agents have freedom to operate across a company&#8217;s data estate, will reward vendors who are able to build the best harnesses around foundation models for their domain. </p><p>The best harness will translate to the strongest capabilities, and by extension the highest ROI. Support agents with the best harness will deliver the highest resolution rate. Coding agents with the best harness will have the highest throughput of PRs that get merged. Legal agents with the best harness will be faster, cheaper and more accurate at reviewing contracts. </p><p>Incumbents have the distribution, ecosystem, long-standing CIO relationships and platforms. The startups taking share away from incumbents are leaning into the future enterprises want and shipping 10x better products. The gulf has to be big enough to win this generational upgrade cycle. </p><p>A number of debates will be settled in coming years. 
</p><p>What&#8217;s the <a href="https://x.com/iandmacomber/status/2014449113795068083">right UI for knowledge work in an agent-first world?</a></p><p>Will UIs optimise for the experience of agent delegation rather than humans doing the work?</p><p>We&#8217;re moving at warp speed. Buckle up.</p><div><hr></div><h3><em>Signals</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://x.com/GoogleResearch/status/2016621362480382213" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vDLd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fdf3ff1-0b35-4d87-b37f-ba96568e47c9_900x664.png 424w, https://substackcdn.com/image/fetch/$s_!vDLd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fdf3ff1-0b35-4d87-b37f-ba96568e47c9_900x664.png 848w, https://substackcdn.com/image/fetch/$s_!vDLd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fdf3ff1-0b35-4d87-b37f-ba96568e47c9_900x664.png 1272w, https://substackcdn.com/image/fetch/$s_!vDLd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fdf3ff1-0b35-4d87-b37f-ba96568e47c9_900x664.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vDLd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fdf3ff1-0b35-4d87-b37f-ba96568e47c9_900x664.png" width="900" height="664" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0fdf3ff1-0b35-4d87-b37f-ba96568e47c9_900x664.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:664,&quot;width&quot;:900,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Task-specific performance showing that multi-agent coordination yields substantial gains on parallelizable tasks like Finance-Agent (+81%) while degrading performance on sequential tasks like PlanCraft (-70%).&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://x.com/GoogleResearch/status/2016621362480382213&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Task-specific performance showing that multi-agent coordination yields substantial gains on parallelizable tasks like Finance-Agent (+81%) while degrading performance on sequential tasks like PlanCraft (-70%)." title="Task-specific performance showing that multi-agent coordination yields substantial gains on parallelizable tasks like Finance-Agent (+81%) while degrading performance on sequential tasks like PlanCraft (-70%)." 
srcset="https://substackcdn.com/image/fetch/$s_!vDLd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fdf3ff1-0b35-4d87-b37f-ba96568e47c9_900x664.png 424w, https://substackcdn.com/image/fetch/$s_!vDLd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fdf3ff1-0b35-4d87-b37f-ba96568e47c9_900x664.png 848w, https://substackcdn.com/image/fetch/$s_!vDLd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fdf3ff1-0b35-4d87-b37f-ba96568e47c9_900x664.png 1272w, https://substackcdn.com/image/fetch/$s_!vDLd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fdf3ff1-0b35-4d87-b37f-ba96568e47c9_900x664.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h3><em>What I&#8217;m Reading</em></h3><p><a href="https://x.com/brianzhan1/status/2017304537359610081">How to Bet on AGI</a></p><p><a href="https://x.com/nicbstme/status/2015174818497437834/?rw_tt_thread=True">Lessons from Building AI Agents for Financial Services</a></p><p><a href="https://x.com/theonejvo/status/2016510190464675980">eating lobster souls Part III (the finale): Escape the Moltrix</a></p><p><a href="https://stratechery.com/2026/intel-earnings-the-agentic-opportunity-intels-mistaken-pessimism/">Intel Earnings, The Agentic Opportunity, Intel&#8217;s Mistaken Pessimism</a></p><p><a href="https://zhengdongwang.com/2026/01/30/a-straussian-reading-of-the-adolescence-of-technology.html">A Straussian reading of The Adolescence of Technology</a></p><div><hr></div><h3><em>Earnings Commentary</em></h3><div class="pullquote"><p>It&#8217;s almost like a religious war around where the value is created. Is it on the infrastructure layer which is currently the flavor of the month, where everybody is investing. By the way, that&#8217;s actually good for SAP because we are agnostic and the more money flowing into that, the more competitive that infrastructure will be to run our PaaS and SaaS services on top.</p><p><strong>Dominik Asam, SAP CFO, Q4 2025 Earnings Call</strong></p></div><div class="pullquote"><p>The key metric we&#8217;re optimizing for is tokens per watt per dollar, which comes down to increasing utilization and decreasing TCO using silicon systems and software.</p><p><strong>Satya Nadella, Microsoft CEO, Q2 2026 Earnings Call</strong></p></div><div><hr></div><p><em>Have any feedback? 
Email me at akash@earlybird.com.</em></p>]]></content:encoded></item><item><title><![CDATA[Building Ramp Sheets: Ramp Labs and Applied AI With Alex Stauffer and Alex Shevchenko]]></title><description><![CDATA[Unpacking spreadsheet agents, product philosophy, and shipping velocity]]></description><link>https://www.akashbajwa.co/p/building-ramp-sheets-ramp-labs-and</link><guid isPermaLink="false">https://www.akashbajwa.co/p/building-ramp-sheets-ramp-labs-and</guid><pubDate>Mon, 26 Jan 2026 07:01:55 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/00f2e0a6-ca48-4629-a8b0-a579c8771cfd_1360x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Software Synthesis analyses the evolution of <strong>software companies in the age of AI</strong> - from how they're built and scaled, to how they go to market and create enduring value. You can reach <strong><a href="https://www.linkedin.com/in/akashbajwa/">me</a></strong> at <strong>akash@earlybird.com</strong>.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.akashbajwa.co/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption"><em>Join readers from OpenAI, Databricks, Stripe, Figma, and other iconic companies</em></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h4><strong>Gradient Descending Roundtables</strong></h4><p><strong>February 18:</strong> <a href="https://luma.com/6qz9tpzg">RL Environments with Scale AI</a></p><p><strong>February 
25:</strong> <a href="https://luma.com/f7s69jj2">The Future of AI Compute</a></p><div><hr></div><p><em>Today we&#8217;re speaking with Alex Shevchenko and Alex Stauffer, the Leads behind Ramp Labs. They released <a href="https://labs.ramp.com/sheets">Ramp Sheets</a> in November to a strong reception, with an exciting roadmap ahead.</em></p><p><em><strong><a href="https://www.linkedin.com/in/shevalex/">Alex Shevchenko</a></strong> is the Engineering Lead at Ramp Labs. He&#8217;s been at Ramp for two years, working on the Applied AI team. He started on AI platform-level infrastructure (NVIDIA CUDA drivers, memory partitioning for model services) before moving into experimental projects like Ramp Tour Guide (an early computer use prototype). He set up Ramp Labs with Alex Stauffer.</em></p><p><em><strong><a href="https://www.linkedin.com/in/alexstauffer/">Alex Stauffer</a></strong> is the Product Lead at Ramp Labs. He&#8217;s also been at Ramp for two years and previously was on the founding team at Actively AI. Based in New York, he leads the team responsible for pushing the boundaries of AI on the product and application layer.</em></p><div><hr></div><h2><strong>The Mission of Ramp Labs</strong></h2><div class="pullquote"><p><strong>How does Ramp Labs operate relative to other product teams at Ramp? Is there a clear roadmap of products, or is it more exploratory?</strong></p></div><blockquote><p><strong>Alex Shevchenko:</strong> One of the big things when we were starting out is that there wasn&#8217;t a clear mandate given to us. We were free to roam and figure out things that we think are interesting in the applied AI space and to make products out of them. 
We already had some experiments in flight just from trying to make our finance team more efficient, which is actually the backstory for Sheets.</p><p>We had no proper guideline of &#8220;you need to do growth, you need to get this many followers, or you need to save this many hours for our team.&#8221; It was really just exploring things, and getting to something that&#8217;s useful in the long run. We don&#8217;t need to hit a specific metric this quarter. We can just build out some incredible design, some incredible product. It might not necessarily be useful right away, but it can then get disassembled into parts that will be used in Ramp to deliver more saved time, more saved money for our customers later on without us being constrained to really think every minute about our KPIs.</p></blockquote><blockquote><p><strong>Alex Stauffer:</strong> To build off of that, one of our first projects was an AI form filler. This is a huge problem in finance teams. A lot of their time is just spent filling in forms whenever they need to wire money. It&#8217;s incredible how much wasted time there is. They&#8217;re just inputting the same company information, the same beneficial owner information all the time.</p><p>We shipped that product, we got a lot of impressions on Twitter and a lot of users. But then the broader story is that technology is being phased into all of the Ramp product today. <strong>There are 55,000 businesses on Ramp, millions of users.</strong> And now whenever it pains a finance admin to go fill in forms in any part of the product, this is being turned into an automated process so they don&#8217;t have to do that.</p></blockquote><h2><strong>The Launch and Early Momentum</strong></h2><div class="pullquote"><p><strong>You launched with no paid marketing and hit 10K users organically. 
How&#8217;s the momentum been, and where are these users coming from?</strong></p></div><blockquote><p><strong>Alex Stauffer:</strong> We launched a little bit before Thanksgiving last year, during the holiday season, which was interesting timing. We got over 2 million impressions when we launched. We have over 10,000 users today, and a lot of the top VC funds, PE funds, top founders, small business owners; those are really the core users that we&#8217;ve seen on the product so far.</p><p>It&#8217;s really exciting. Users keep coming back. There&#8217;s a lot of organic growth just through word of mouth. We might throw some paid marketing at this soon, but right now it&#8217;s all been organic. And a lot of users are asking for more product features, and that&#8217;s what I&#8217;m really excited for. They&#8217;re giving us a lot of great feedback about how to improve the product.</p><p>For example, having multiple options for formatting, or you could have custom formatting based on how your company does things. So these are really interesting insights. The product is constantly being improved right now.</p></blockquote><div class="pullquote"><p><strong>What&#8217;s the most surprising or creative way you&#8217;ve seen someone use Ramp Sheets?</strong></p></div><blockquote><p><strong>Alex Stauffer:</strong> Most of it is the standard use cases: small business owners, CFOs, small finance teams is definitely one cohort. And then there&#8217;s another that&#8217;s VC, investment banking, consultants.</p><p>In the first cohort, there&#8217;s 13-week cash flow and, for example, modeling out, &#8220;Okay, what if I hire three more people to the sales team this quarter? How does that affect my burn?&#8221; Another interesting point is we&#8217;ve seen founders use it to create valuations of their companies. That&#8217;s pretty novel to us. We did not expect that. 
These are things that have been self-reported from customers and through our user interviews; we can&#8217;t see your data at all.</p><p>On the investment banking side, a lot of friends in the IB space are signing up for this, even though they&#8217;re using their personal Gmails to not do it through their work. But it has been helping them quite a lot. There&#8217;s a lot of leveraged buyout models. These things take hours and days to create. <strong>And with Ramp Sheets, you can get a really good version of this in 10 to 20 minutes.</strong></p></blockquote><h2><strong>Why Spreadsheets?</strong></h2><div class="pullquote"><p><strong>Why did you pick spreadsheets, and how has your thinking evolved about why this was the right problem to tackle?</strong></p></div><blockquote><p><strong>Alex Shevchenko:</strong> This is a project that we started a long time ago, where we were tasked with improving the speed and efficiency of our own internal finance team. That&#8217;s a non-trivial thing to do because Ramp&#8217;s financial structure is very complex and very mature. So there&#8217;s not a lot of super easy wins to be had there.</p><p>When you think about it, you can find an accountant or a bookkeeper, sit down with them, spend a day explaining a task that normally takes them 30 minutes, and then take two days to automate it away. So you spend three days of engineering time to save 30 minutes of a finance professional&#8217;s time that happens every month. The math doesn&#8217;t work out there.</p><p>So one of the first things I tried to do was streamline the process of getting the context from the finance person&#8217;s brain into the engineer&#8217;s brain. <strong>We started out with a tool where they could record all of the actions that they take for a certain process</strong>. 
<strong>And with an LLM, we would process it and create basically a piece of documentation that would map out the entire process with an explanation and the tools they&#8217;re using.</strong> So you could take that and start writing a Python script or an automation for it much quicker.</p><p>Then the logical next step was, well, you already have all of this information. Maybe you can just use an LLM to vibe-code something directly on your behalf. So we started generating Python or N8N workflow automations from the videos.</p><p>We made this prototype and we came to Finance and presented it, and they said, &#8220;Well, this is really neat and all, but I don&#8217;t really trust it. This is very black box. The output is this Python thing that is very hard to verify.&#8221; And it also uses Python blocks or JavaScript blocks for automation. So you&#8217;re generating this artifact that is not verifiable by the professional. They don&#8217;t want to really use it because finance is very high-stakes. You can&#8217;t make mistakes, and the artifact that you automatically generate is not verifiable by them.</p><p>So we took a step back, and I was going through the Loom recordings of the processes that they sent to us. And I realized that you drop your cursor anywhere in that video and <strong>95% chance that you&#8217;re in Excel.</strong> Finance professionals live in Excel. That&#8217;s all they do because it&#8217;s such a powerful tool and it&#8217;s the thing that they understand the best. As an engineer, I can read through Python code and my brain is set up for that, but their brain is set up to quickly parse an Excel file and get the information out of it.</p><p>So we decided to take this video automation and instead of generating Python, we&#8217;re going to generate step-by-step instructions for the Excel modifications so that they have a clear view of what is happening. 
They can audit it very quickly because they&#8217;re used to Excel, and then they can have it as a repeatable process.</p><p>Then for the public launch, we decided that the video modality doesn&#8217;t necessarily make sense for a lot of people. The finance team at Ramp is very strong and has their processes in place. But that&#8217;s not necessarily the case when you&#8217;re a startup founder and you don&#8217;t know what an example of a good model is. So we opened it up to be a text box. You just come into it and you explain what you&#8217;re looking for, and then it takes care of the specifics for you.</p></blockquote><h2><strong>Building the Infrastructure</strong></h2><div class="pullquote"><p><strong>There&#8217;s probably a lot of engineering and infrastructure thinking that&#8217;s gone into context management, tool calling, reasoning, debugging errors. Can you talk us through the harness you&#8217;ve built around the models to make this product production-grade?</strong></p></div><blockquote><p><strong>Alex Shevchenko:</strong> For something that&#8217;s running on Excel, the harness is <strong>OpenAI Agents SDK</strong> for the most part. We have modified it quite a bit, but that&#8217;s the core of it. The hardest piece on something that interfaces with Excel is just how complex Excel is. There are quadrillions of different configurations and settings&#8212;incremental cell calculation for self-referential formula chains. In general, just a random formula is also very complex.</p><p>Making those Excel calculations is really the hardest piece. I know that, for example, Anthropic Claude for Excel uses LibreOffice Excel to do those calculations. They&#8217;ll spin up a headless version of it and make the modifications in that. That&#8217;s not the approach that we took. 
We have a more complicated approach that I don&#8217;t want to divulge fully, but for the formula calculation and formula understanding&#8212;getting the calculated value versus the cached value (which is the number that you see compared to the formula)&#8212;those are the hardest pieces to get really right. Otherwise, you&#8217;d get a huge sheet with #VALUE!, #REF!, and just other errors all over the place. That was really the hardest piece to get correct from the technical point of view.</p><p><strong>Alex Stauffer:</strong> There&#8217;s also a lot of <strong>design delights</strong> that went into this. We tried to make it as simple as possible. That&#8217;s very clearly shown in the designs. There are a couple of other similar products that are a lot more cumbersome and complex, and we tried to avoid that. A lot of those tools are made for IB and all these crazy workflows. We really wanted to make this just general for the consumer and then also those specific users.</p><p>We care a lot about design and product at Ramp. It&#8217;s what we&#8217;re known for. So we wanted to make it very sleek, very simple. We actually have a lot of new features coming out soon. We&#8217;re going to have <strong>templates, and this is actually going to supercharge Ramp Sheets.</strong> It&#8217;ll be quite similar to Granola recipes where we&#8217;re going to have a lot of pre-baked templates that are for those core use cases, and it&#8217;ll be super easy to fire off prompts.</p><p>One thing that we&#8217;ve realized with a lot of these AI products is that prompting isn&#8217;t super easy at times. <strong>The whole concept of the blank text box is quite jarring</strong> to a lot of people. People don&#8217;t know what to do, people don&#8217;t know what to say. We&#8217;ve tried to make this easy in the UX. 
We have some basic templates already that are more general-coded, but this is going to be a lot more specific and a lot better.</p></blockquote><h2><strong>Views on Agent Infrastructure</strong></h2><div class="pullquote"><p><strong>Given what you&#8217;re working on at Ramp Sheets and across Ramp Labs, do you have any strong opinions on the direction of agent infrastructure? There are all these primitives emerging&#8212;frameworks, sandboxes, memory, tool-calling protocols. From a supply-side perspective, there&#8217;s so much infrastructure for agents being built. Where do you see things headed?</strong></p></div><blockquote><p><strong>Alex Shevchenko:</strong> That&#8217;s the hardest question to answer because it&#8217;s very case-by-case. I think whatever you&#8217;re building, a lot of the general chat interfaces are obviously going to have a ton of frameworks built around them. If you want to build out a customer support chatbot, there&#8217;s a billion different ways of doing it because there&#8217;s a bunch of providers that all have really good solutions.</p><p>But then there&#8217;s stuff that needs to be customized, like an Excel editor. If you want to create something for text-to-3D modeling or something along those lines. Obviously, in those cases, you will probably end up having to write a lot of custom code that interfaces between them.</p><p>In general, it&#8217;s really a gradient of how much hand-holding you get. The more hand-holding you get, the less customizable or the more finished-up solution you get. All of these are engineering trade-offs. Obviously you can move very fast on a finished solution from a vendor, but then you&#8217;re going to lose out on some of the customisability, and it&#8217;s case-by-case whether that helps you out or not.</p><p>I think one of the biggest things is sandbox providers. We use Modal for <a href="https://builders.ramp.com/post/why-we-built-our-background-agent">Inspect</a>, for example. 
I think that&#8217;s a great product that is going to become very popular as we start letting agents roam much more freely. <strong>Letting them live in a bunch of sandboxes, you spin up one for each time you run something on the user&#8217;s behalf.</strong> Now it&#8217;s a sandbox within a Linux VM, but maybe they start giving them computer interfaces or something more complex. I feel like that&#8217;s one of the things I&#8217;m pretty bullish on.</p></blockquote><h2><strong>The Roadmap for Ramp Sheets</strong></h2><div class="pullquote"><p><strong>Templates are coming. Can you speak generally about the roadmap for Ramp Sheets in the future?</strong></p></div><blockquote><p><strong>Alex Stauffer:</strong> First of all, it&#8217;s going to be integrated heavily with the Ramp product. There are a lot of workflows today that finance teams are doing within Ramp, and they ultimately end up in a spreadsheet afterwards. We&#8217;re going to do a lot more to automate some of this and have them end up in Ramp Sheets with that AI agent that they can just go talk to and manipulate the data, which is a lot easier than having to do it themselves.</p><p>There are also a lot of really unique and creative product experiences that Ramp Sheets will support which Ramp is creating and will be shipping soon. You&#8217;ll see it coming soon, but it&#8217;s heavily impacting the product in a really positive and fast way.</p><p>And then for Ramp Sheets itself, it&#8217;s proving to be very successful and growing on its own. So we&#8217;re going to keep growing it and investing in the product and platform. We&#8217;re having more resourcing on our side to do that. The goal here is if we can provide a lot of value to a lot of new people and then have them learn about Ramp in the process, then it&#8217;s really a huge success.</p></blockquote><h2><strong>Ramp&#8217;s Product Culture</strong></h2><div class="pullquote"><p><strong>Ramp&#8217;s product culture is hugely respected. 
How do you think it&#8217;s distinct, and what trade-offs is Ramp making to be that way?</strong></p></div><blockquote><p><strong>Alex Stauffer:</strong> I think the biggest thing is that everyone here is <strong>very AGI-pilled</strong>. That&#8217;s probably the biggest difference between a lot of the top startups today and the ones that aren&#8217;t. <strong>AI is in everything.</strong> We&#8217;re using all the AI products available. <strong>Everyone is using AI to speed up their workflows</strong>. Claude Code and Cursor are upgrading everyone&#8217;s velocity for how fast we can ship.</p><p>We also have an internal agent called <a href="https://builders.ramp.com/post/why-we-built-our-background-agent">Inspect</a>. Everyone is shipping. It&#8217;s really 10x-ing the velocity within the company. </p><p>I think another main cultural difference is that Ramp is still <strong>founder-led</strong>, and Eric and Karim are some of the best to do it. They want to grow Ramp into a massive business.</p><p>As a result, I think the culture is very non-bureaucratic here. It&#8217;s pretty flat. <strong>Everyone is an IC.</strong> Pretty much everyone&#8217;s shipping. And a lot of the top talent is here as well.</p></blockquote><h2><strong>Shipping Velocity vs. Polish</strong></h2><div class="pullquote"><p><strong>Every company takes a different view on shipping fast and iterating from feedback quickly versus waiting until you have a jaw-dropping customer experience. Where does Ramp land on that spectrum?</strong></p></div><blockquote><p><strong>Alex Stauffer:</strong> It&#8217;s the former. Our CPO, Geoff Charles, talks about this. We have a whole system of <strong>Alpha, Beta, and GA releases.</strong> The goal is to just get something in the hands of these alpha customers who have opted in, which are probably five customers/users of the product. Get their feedback really early. 
Tell us what they like and what they don&#8217;t like, and from there just go to the next version and then send that to the beta, which has more users, and then eventually GA the whole thing.</p><p><strong>We think of velocity as actually enabling high quality product</strong> versus the other way around. The opposite is a bad habit to get into&#8212;I would actually describe it as an AI that just keeps thinking in loops. That&#8217;s the worst thing to happen where the customer isn&#8217;t involved. You just have a lot of iteration without actually understanding if you&#8217;re solving the problem. So that&#8217;s what we try to avoid.</p></blockquote><div><hr></div><h3><em>Signals</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OrfY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F348a451e-80c7-4bba-a1b7-7ef2f5c6f910_1063x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OrfY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F348a451e-80c7-4bba-a1b7-7ef2f5c6f910_1063x500.png 424w, https://substackcdn.com/image/fetch/$s_!OrfY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F348a451e-80c7-4bba-a1b7-7ef2f5c6f910_1063x500.png 848w, https://substackcdn.com/image/fetch/$s_!OrfY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F348a451e-80c7-4bba-a1b7-7ef2f5c6f910_1063x500.png 1272w, 
https://substackcdn.com/image/fetch/$s_!OrfY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F348a451e-80c7-4bba-a1b7-7ef2f5c6f910_1063x500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OrfY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F348a451e-80c7-4bba-a1b7-7ef2f5c6f910_1063x500.png" width="1063" height="500" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/348a451e-80c7-4bba-a1b7-7ef2f5c6f910_1063x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:1063,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:93319,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/185400201?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F348a451e-80c7-4bba-a1b7-7ef2f5c6f910_1063x500.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OrfY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F348a451e-80c7-4bba-a1b7-7ef2f5c6f910_1063x500.png 424w, https://substackcdn.com/image/fetch/$s_!OrfY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F348a451e-80c7-4bba-a1b7-7ef2f5c6f910_1063x500.png 848w, https://substackcdn.com/image/fetch/$s_!OrfY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F348a451e-80c7-4bba-a1b7-7ef2f5c6f910_1063x500.png 
1272w, https://substackcdn.com/image/fetch/$s_!OrfY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F348a451e-80c7-4bba-a1b7-7ef2f5c6f910_1063x500.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Source: Avenir</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!ChR8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7174d596-7741-4119-be69-78073dc2e7d8_807x892.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ChR8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7174d596-7741-4119-be69-78073dc2e7d8_807x892.png 424w, https://substackcdn.com/image/fetch/$s_!ChR8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7174d596-7741-4119-be69-78073dc2e7d8_807x892.png 848w, https://substackcdn.com/image/fetch/$s_!ChR8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7174d596-7741-4119-be69-78073dc2e7d8_807x892.png 1272w, https://substackcdn.com/image/fetch/$s_!ChR8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7174d596-7741-4119-be69-78073dc2e7d8_807x892.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ChR8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7174d596-7741-4119-be69-78073dc2e7d8_807x892.png" width="807" height="892" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7174d596-7741-4119-be69-78073dc2e7d8_807x892.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:892,&quot;width&quot;:807,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!ChR8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7174d596-7741-4119-be69-78073dc2e7d8_807x892.png 424w, https://substackcdn.com/image/fetch/$s_!ChR8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7174d596-7741-4119-be69-78073dc2e7d8_807x892.png 848w, https://substackcdn.com/image/fetch/$s_!ChR8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7174d596-7741-4119-be69-78073dc2e7d8_807x892.png 1272w, https://substackcdn.com/image/fetch/$s_!ChR8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7174d596-7741-4119-be69-78073dc2e7d8_807x892.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><div><hr></div><h3><em>What I&#8217;m Reading</em></h3><p><a href="https://www.fabricatedknowledge.com/p/the-death-of-software-20-a-better?r=1ac0y&amp;utm_medium=ios&amp;shareImageVariant=card&amp;triedRedirect=true">The Death of Software 2.0 (A Better Analogy!)</a></p><p><a href="https://samuelalbanie.substack.com/p/reflections-on-2025?r=1ac0y&amp;utm_medium=ios&amp;shareImageVariant=overlay&amp;triedRedirect=true">Reflections on 2025</a></p><p><a href="https://stratechery.com/2026/meta-compute-the-meta-openai-battle-the-reality-labs-sacrifice/">Meta Compute, The Meta-OpenAI Battle, The Reality Labs Sacrifice</a></p><p><a href="https://x.com/EthanChoi7/status/2014629421089800690/?rw_tt_thread=True">Who Wins the AI Race?</a></p><p><a href="https://aleximas.substack.com/p/the-cyborg-era-what-ai-means-for?r=1ac0y&amp;utm_medium=ios&amp;shareImageVariant=overlay&amp;triedRedirect=true">The Cyborg Era: What AI means for jobs</a></p><div><hr></div><h3><em>Earnings 
Commentary</em></h3><div class="pullquote"><p>We are making it mandatory, everyone at Box is going to be AI certified just on the basics... Just like we have security and data privacy training. This will be the second mandatory thing just because it&#8217;s that important that everyone deeply understands AI.</p><p><strong>Dylan Smith, Box CFO, Citi TMT Conference, September 2025</strong></p></div><div class="pullquote"><p> We took Jira, Confluence and Loom and consolidated that into one bundle called Teamwork Collection and then included Rovo in that and gave our customers 10x the Rovo or AI credits that they would have received if they just bought the product stand-alone.</p><p>From a go-to-market perspective in order to increase our AI consumption, we&#8217;ve given, one, our customers basically the right to use these products as opposed to focusing our sellers on purely going out selling stand-alone AI. And now we are really using our customer success organization to drive the consumption of these products, which is a very different model than what others in the market are doing.</p><p><strong>Brian Duffy, Chief Revenue Officer, Atlassian, Goldman Sachs Technology Conference, September 2025</strong></p></div><div><hr></div><p><em>Have any feedback? 
Email me at akash@earlybird.com.</em></p>]]></content:encoded></item><item><title><![CDATA[2026 Predictions]]></title><description><![CDATA[Mid-Training, Advertising, Industrial Policy...]]></description><link>https://www.akashbajwa.co/p/2026-predictions</link><guid isPermaLink="false">https://www.akashbajwa.co/p/2026-predictions</guid><dc:creator><![CDATA[Akash Bajwa]]></dc:creator><pubDate>Wed, 07 Jan 2026 07:01:40 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Nowd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf4b69ea-bd13-429c-a7a0-3022248dd655_627x576.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Software Synthesis analyses the evolution of <strong>software companies in the age of AI</strong> - from how they're built and scaled, to how they go to market and create enduring value. You can reach <strong><a href="https://www.linkedin.com/in/akashbajwa/">me</a></strong> at <strong>akash@earlybird.com</strong>.</em></p><div><hr></div><p>Hey everyone,</p><p>Hope you had a good break over the holiday period! </p><p>I took time off from writing to recharge. 
Time away from the constant stream of AI news was a priceless way to mentally reset, step back, and think about the &#8216;bigger picture&#8217;, as they say. </p><p><a href="https://zhengdongwang.com/2025/12/30/2025-letter.html">Step back to 2015</a>, when OpenAI was just founded. Today, much of the global economy&#8217;s growth depends on AI and its attendant infrastructure buildout. There are no benchmarks that can&#8217;t be beat. </p><p>What progress will the next 10 years hold? It&#8217;s perhaps terrifying and captivating in equal measure.  </p><p>I have two primary vehicles for thinking about the future: writing and community. I&#8217;m excited to resume both writing and the <a href="https://www.linkedin.com/posts/akashbajwa_its-been-super-rewarding-this-year-to-bring-activity-7407703119440080896-wBeE?utm_source=share&amp;utm_medium=member_desktop&amp;rcm=ACoAABrSuMcBJrehsNuI9ywPyOeCfF2Hjytqqtw">community building that we started last year with the &#8216;Gradient Descending&#8217;</a> series of events. </p><p>As always, I&#8217;m grateful to each and every reader for joining. </p><div><hr></div><h3>1. Mid-Training: Enterprises Shop Model Checkpoints</h3><p>Model training will become more of a <a href="https://fakepixels.substack.com/p/pre-mid-post-training-way-of-life">continuum</a>.</p><p>Enterprise customers will increasingly be able to choose their preferred model checkpoints to inject their internal data.</p><p>As Ben Thompson <a href="https://stratechery.com/2025/aws-reinvent-agents-for-aws-nova-forge/">wrote</a> of AWS&#8217; Nova Forge offering:</p><blockquote><p><em>Right now you have two ways to incorporate your company&#8217;s data into an AI model: first, you can use RAG to basically have a model search your company&#8217;s data in the context of providing an answer. Second, you can post-train a model on your company&#8217;s data. 
The shortcoming in both approaches is that your company&#8217;s data isn&#8217;t actually in the model, which can lead to unsatisfying results.</em></p><p><em>Nova Forge is an offering built on AWS&#8217;s internally produced AI models; because they own the Nova models, they own the training checkpoints. What you can do with Nova Forge is choose a checkpoint &#8212; say, when the model is 80% trained &#8212; and infuse your company&#8217;s data at that point, so that the data is integrated into the model itself, and not simply searched or trained-in after-the-fact.</em></p></blockquote><p>As more practitioners seek the bare &#8216;<a href="https://x.com/karpathy/status/1938626382248149433?lang=en">cognitive core</a>&#8217; capabilities of language models without the drawbacks induced by low-quality internet data, it&#8217;ll be interesting to see how this market evolves to cater to this demand. </p><p>Will customers choose models at the 80% training stage, or will they desire even earlier checkpoints for the above reason? It depends on how much transparency will be provided into the training corpus. Then again, that level of disclosure hasn&#8217;t been met by most of the &#8216;open source&#8217; model providers, who&#8217;ve chosen to go only as far as making their weights open. </p><h3>2. Advertising Finally Arrives</h3><p>OpenEvidence, rumoured to be raising at a $12bn valuation, grew from $8m ARR in 2024 to c. <strong>$150m (!)</strong> as of the end of November 2025. </p><p>This astonishing growth came purely from advertising revenue. This model enabled viral adoption by clinicians, as the platform now serves 15 million clinicians per month (40% of the US market). Not bad for a four-year-old company.</p><p>OpenAI&#8217;s recent Gemini-induced Code Red mode delayed their roll-out of advertising, but as Ben Thompson has argued many times, advertising is the inevitable internet business model. 
</p><p>xAI&#8217;s position in the &#8216;Muskonomy&#8217; (SpaceX/Starlink, Tesla, X) may yet see it become more competitive in the consumer market, but for now I see it as a two-horse race between Google and OpenAI. Both are still aggressively competing for market share, to be sure, but engagement metrics will continue improving as agentic commerce takes off (and more capabilities like memory are unlocked/improved). </p><p>Advertisers will finally be able to occupy the valuable ChatGPT/Gemini real estate. </p><h3>3. European Industrial Policy Gains Momentum</h3><p>Kanishka Narayan&#8217;s (UK Minister for AI and Online Safety) <a href="https://vibeshift.uk/">vibeshift</a> narrative is much needed. </p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;cee5f0d3-2cff-4146-9864-ef2c277702b3&quot;,&quot;duration&quot;:null}"></div><p>It&#8217;s easy to be cynical about Europe&#8217;s prospects, but I&#8217;m taking a glass-half-full view and believe that policymakers are going to invest heavily across the AI supply chain.</p><p>From faster planning permissions to new sources of energy and data centre construction in <a href="https://www.gov.uk/government/publications/delivering-ai-growth-zones/delivering-ai-growth-zones">AI Growth Zones</a>, the physical, atom-based constraints on AI need to be lifted in order to realise accelerated <a href="https://x.com/pmarca/status/2003790009539969198">GDP growth</a>.</p><p>The EU&#8217;s AI Continent Action Plan has a number of promising proposals, including the Cloud and AI Development Act, which aims to triple the EU&#8217;s data centre capacity within the next five to seven years. 
The Plan also weighs up the importance of data foundries, proposing a Data Union strategy that &#8216;will, among other measures, explore the development of Data Labs as integral components of the AI Factories, to enable the provision, pooling, and secure sharing of high-quality data.&#8217;</p><p>It&#8217;s reasonable to be sceptical about whether there will be sufficient follow-through on these recommendations, given the recent experience with the Draghi report. I&#8217;m naive enough to believe this period is different. </p><p>It has to be. </p><h3>4. Breakout Consumer Agents In China</h3><p>As I&#8217;ve <a href="https://www.akashbajwa.co/p/notes-on-china">written before</a>, some of the world&#8217;s biggest consumer AI apps after ChatGPT are in China. Meta buying Manus (originally HQ&#8217;d in Beijing) is an indication of the cutting-edge agent harnesses being built in China. </p><p>The antecedents of China&#8217;s hyper-competitive market are too lengthy to belabour here, but the resulting environment is one that is heavily oriented towards consumer apps, where big tech companies are constantly entering new business lines and competing with each other, and where business models rely heavily on advertising rather than subscriptions.</p><p>Upcoming IPOs of labs like Z.ai and Minimax and competitive model releases from companies like Xiaomi are consistent with China&#8217;s focus on efficient models that can run on devices (and, importantly, in factories).</p><p>Freda Duan <a href="https://robonomics.substack.com/p/os-agent-bytes-mobile-assistant-and">wrote about the impressive Doubao Mobile Assistant</a> from ByteDance in December:</p><blockquote><p><em>The key idea: <strong>&#8220;operating-system-level collaboration between Doubao and mobile phone manufacturers.&#8221;</strong><br>Meaning: this is not a standalone app. 
It is a <strong>system service baked into the ROM</strong>, able to drive the UI like a human across apps, mostly via real-time voice.</em></p><p><em>&#8220;Find which KFC spicy chicken burger is cheaper on JD vs Meituan vs Taobao, pick the lowest price, deliver to X address, add note &#8216;leave at front desk,&#8217; then screenshot the order and send to A on WeChat.&#8221;</em></p><p><em>The agent literally <strong>opens and navigates multiple shopping apps, compares prices, and completes the entire workflow</strong>.</em></p></blockquote><p>There&#8217;s been lots of speculation around whether Google would lean into Pixel more heavily, as they would also be well placed to deliver this kind of experience. As Apple&#8217;s struggles continue, it might be a Chinese company that sets a new high-water mark for consumer agents.</p><h3>5. Memory Cracked And Portable Context Wars</h3><p>&#8216;Sign in with OpenAI&#8217; has <a href="https://x.com/maxvwolff/status/1911216591339180431?s=20">been a long time coming</a>, given the rich semantic and behavioural data collected in the last three years. </p><p>These &#8216;behavioural signatures&#8217;, as my friend <a href="https://www.linkedin.com/in/antoinemoyroud/">Antoine Moyroud</a> <a href="https://venture-bystander.ghost.io/portable-memory-and-behavioral-signatures/">described them</a>, could unlock radically better personalisation in software:</p><blockquote><p><em>Logging into a new tool will no longer feel like introducing yourself from scratch but more like picking a memory provider that brings your context with you. 
Your preferences, tone, goals, personality, all imported with a simple click.</em></p></blockquote><p>The exact architecture for how memory will be <a href="https://leoniemonigatti.com/blog/memory-in-ai-agents.html">stored, updated, and accessed</a> is still being refined, with different approaches representing <a href="https://fastpaca.com/blog/memory-isnt-one-thing/">different trade-offs</a>.</p><p>Google introduced <a href="https://research.google/blog/titans-miras-helping-ai-have-long-term-memory/">&#8216;test-time memorisation&#8217;</a> in December, a way for new &#8216;surprising&#8217; inputs to be stored in long-term memory. I expect to see more developments like this that go beyond improved summarisation of past conversations.</p><p>The portability of context/memory is contentious because, in theory, it&#8217;ll reduce switching costs for consumers. The labs will continue erecting walled gardens around the metadata they&#8217;ve collected on you, allowing you to use it to onboard onto new software but not export the raw data. New UI paradigms will emerge for how consumers manage their memory repos.</p><h2>Open Questions for 2026</h2><ol><li><p><strong>Will AI app markets start consolidating?</strong> Legal, healthcare, finance and several other end markets are heavily capitalised, with efforts to <a href="https://x.com/JayaGup10/status/1989086170899444217">kingmake</a> doing little to stop more capital flooding into these markets. The same can be said for markets like customer support and go-to-market software. We&#8217;re 1-3 years into several consensus categories. Will 2026 be the year of culling as the winners pull away (defined as companies with durable revenue, talent vortexes, and early upmarket penetration, among other things)?</p></li><li><p><strong>Will credit funds continue financing the AI infrastructure buildout?</strong> The scale of capex today has already far surpassed the free cash flow thrown off by big tech. 
There have <a href="https://www.ft.com/content/84c147a4-aabb-4243-8298-11fabf1022a3">already been signs</a> that the reliance on external capital may be untenable. </p></li><li><p><strong>Will operating leverage hold as the 2022/2023 class of AI companies continues scaling?</strong> To date, the best AI companies have crushed the leading software companies of past decades when it comes to metrics like ARR per FTE, OpEx per FTE, and so on. How will these numbers hold up as they scale? </p></li><li><p><strong>Will new distribution channels gain traction?</strong> Founder-led content has helped companies like Lovable break out of a noisy market by building a brand directly with users and potential hires. Products with a prosumer motion might yet iterate on ways they can leverage AI more effectively, e.g. <a href="https://supersonik.ai/">for demos</a>. In the enterprise, if the FDE role persists, how will it evolve and become more specialised?</p></li></ol><p>What questions are you thinking about? Shoot me a note. 
</p><div><hr></div><h3><em>Signals</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Nowd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf4b69ea-bd13-429c-a7a0-3022248dd655_627x576.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Nowd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf4b69ea-bd13-429c-a7a0-3022248dd655_627x576.png 424w, https://substackcdn.com/image/fetch/$s_!Nowd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf4b69ea-bd13-429c-a7a0-3022248dd655_627x576.png 848w, https://substackcdn.com/image/fetch/$s_!Nowd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf4b69ea-bd13-429c-a7a0-3022248dd655_627x576.png 1272w, https://substackcdn.com/image/fetch/$s_!Nowd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf4b69ea-bd13-429c-a7a0-3022248dd655_627x576.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Nowd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf4b69ea-bd13-429c-a7a0-3022248dd655_627x576.png" width="627" height="576" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/af4b69ea-bd13-429c-a7a0-3022248dd655_627x576.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:576,&quot;width&quot;:627,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:89496,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/183359874?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67091c30-a643-4248-91eb-7e8aab331903_660x594.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Nowd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf4b69ea-bd13-429c-a7a0-3022248dd655_627x576.png 424w, https://substackcdn.com/image/fetch/$s_!Nowd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf4b69ea-bd13-429c-a7a0-3022248dd655_627x576.png 848w, https://substackcdn.com/image/fetch/$s_!Nowd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf4b69ea-bd13-429c-a7a0-3022248dd655_627x576.png 1272w, https://substackcdn.com/image/fetch/$s_!Nowd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf4b69ea-bd13-429c-a7a0-3022248dd655_627x576.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption">Source: Jefferies</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lhOn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59184b69-e7c1-4310-882c-6e4a1563dde0_1024x1280.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lhOn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59184b69-e7c1-4310-882c-6e4a1563dde0_1024x1280.png 424w, 
https://substackcdn.com/image/fetch/$s_!lhOn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59184b69-e7c1-4310-882c-6e4a1563dde0_1024x1280.png 848w, https://substackcdn.com/image/fetch/$s_!lhOn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59184b69-e7c1-4310-882c-6e4a1563dde0_1024x1280.png 1272w, https://substackcdn.com/image/fetch/$s_!lhOn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59184b69-e7c1-4310-882c-6e4a1563dde0_1024x1280.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lhOn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59184b69-e7c1-4310-882c-6e4a1563dde0_1024x1280.png" width="1024" height="1280" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/59184b69-e7c1-4310-882c-6e4a1563dde0_1024x1280.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1280,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lhOn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59184b69-e7c1-4310-882c-6e4a1563dde0_1024x1280.png 424w, 
https://substackcdn.com/image/fetch/$s_!lhOn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59184b69-e7c1-4310-882c-6e4a1563dde0_1024x1280.png 848w, https://substackcdn.com/image/fetch/$s_!lhOn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59184b69-e7c1-4310-882c-6e4a1563dde0_1024x1280.png 1272w, https://substackcdn.com/image/fetch/$s_!lhOn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59184b69-e7c1-4310-882c-6e4a1563dde0_1024x1280.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h3><em>What I&#8217;m Reading</em></h3><p><a href="https://venture-bystander.ghost.io/portable-memory-and-behavioral-signatures/">Portable Memory &amp; Behavioral Signatures: the missing layer for AI personalisation</a></p><p><a href="https://danwang.co/2025-letter/">2025 Letter</a> by Dan Wang</p><p><a href="https://zhengdongwang.com/2025/12/30/2025-letter.html">2025 Letter</a> by Zhengdong Wang</p><p><a href="https://thechipletter.substack.com/p/tpu-mania?utm_source=post-email-title&amp;publication_id=1063960&amp;post_id=180421815&amp;utm_campaign=email-post-title&amp;isFreemail=true&amp;r=1ac0y&amp;triedRedirect=true&amp;utm_medium=email">TPU Mania</a></p><div><hr></div><h3><em>Earnings Commentary</em></h3><div class="pullquote"><p>&#8216;But most of the time, we&#8217;re taking just a piece, either we&#8217;re taking one full stack and then we&#8217;re going to take more stacks over time or we&#8217;re just taking maybe a horizontal slice of the stacks.</p><p>And so the opportunity for us exactly, as you said, is a continued stacking of, in a good way, a multiyear steady trend in terms of just doing more and more with each of these merchants.&#8217;</p><p><strong>Jeff J. Hoffmeister, Shopify CFO, Nasdaq Investor Conference</strong></p></div><div class="pullquote"><p>&#8216;Copilot is for AI like the iPhone is for personal computing or like Windows was for the PC. It&#8217;s a platform... Agents are basically like the apps on your iPhone.&#8217;</p><p>&#8216;We turned on Agent 365 before we launched and announced the product. And we have 138,000 agents being used by 88,000 employees on a weekly basis, which I would offer you all up to turn on Agent 365, it&#8217;s free in your environment because I would be willing to bet you have more AI happening inside your organization than you know about.&#8217;</p><p><strong>Judson B. 
Althoff, Microsoft VP, Barclays 23rd Annual Global Tech Conference</strong></p></div><div><hr></div><p><em>Have any feedback? Email me at akash@earlybird.com.</em></p>]]></content:encoded></item><item><title><![CDATA[Inside Cursor: CursorBench, Internal PMF and Agents]]></title><description><![CDATA[With David Gomes and Eric Zakariasson]]></description><link>https://www.akashbajwa.co/p/inside-cursor-cursorbench-internal</link><guid isPermaLink="false">https://www.akashbajwa.co/p/inside-cursor-cursorbench-internal</guid><dc:creator><![CDATA[Akash Bajwa]]></dc:creator><pubDate>Tue, 16 Dec 2025 07:02:15 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!r-nd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff60093a5-24d8-485b-9fb2-163228e73683_1600x1200.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Software Synthesis analyses the evolution of <strong>software companies in the age of AI</strong> - from how they're built and scaled, to how they go to market and create enduring value. 
You can reach <strong><a href="https://www.linkedin.com/in/akashbajwa/">me</a></strong> at <strong>akash@earlybird.com</strong>.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.akashbajwa.co/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption"><em>Read by thousands from OpenAI, Databricks, Stripe, Figma, and more</em></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><p><em>Last week we hosted <a href="https://www.linkedin.com/in/davidrfgomes/">David</a> and <a href="https://www.linkedin.com/in/ericzakariasson/">Eric</a> from Cursor to discuss Cursor&#8217;s approach to benchmarking models, product development, the future of IDEs and CLIs, and more.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!r-nd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff60093a5-24d8-485b-9fb2-163228e73683_1600x1200.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!r-nd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff60093a5-24d8-485b-9fb2-163228e73683_1600x1200.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!r-nd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff60093a5-24d8-485b-9fb2-163228e73683_1600x1200.jpeg 848w, https://substackcdn.com/image/fetch/$s_!r-nd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff60093a5-24d8-485b-9fb2-163228e73683_1600x1200.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!r-nd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff60093a5-24d8-485b-9fb2-163228e73683_1600x1200.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!r-nd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff60093a5-24d8-485b-9fb2-163228e73683_1600x1200.jpeg" width="1456" height="1092" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f60093a5-24d8-485b-9fb2-163228e73683_1600x1200.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!r-nd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff60093a5-24d8-485b-9fb2-163228e73683_1600x1200.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!r-nd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff60093a5-24d8-485b-9fb2-163228e73683_1600x1200.jpeg 848w, https://substackcdn.com/image/fetch/$s_!r-nd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff60093a5-24d8-485b-9fb2-163228e73683_1600x1200.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!r-nd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff60093a5-24d8-485b-9fb2-163228e73683_1600x1200.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Cursor 2.2 Launch</strong></h3><p><strong>1. Debug Mode</strong></p><ul><li><p>Collects runtime context by automatically adding logging statements to code</p></li><li><p>Agent reproduces issues and analyses actual data flow</p></li><li><p>Launches a Node.js server locally to capture logs across multiple repos</p></li><li><p>Works across different languages (Python, JavaScript) via HTTP requests</p></li></ul><p><strong>2. Plan Mode Improvements</strong></p><ul><li><p>Emerged from developers using markdown files to steer models</p></li><li><p>Allows upfront alignment between developer and AI before code execution</p></li><li><p>Treats plans as markdown files that agents can search and reference</p></li><li><p>Plans may evolve into long-term memory/context for projects</p></li><li><p>Moving toward storing all artifacts (plans, chat histories) as files rather than in databases</p></li></ul><p><strong>3. 
Multi-Agent Judging</strong></p><ul><li><p>Runs the same task across multiple models simultaneously</p></li><li><p>Uses LLM-as-judge (currently Opus 4.5) to recommend the best solution</p></li><li><p>The judge model sees the original prompt and the outputs, not the tool calls, thinking process or implementation details</p></li><li><p>Future: Tournament-style bracket judging for 8+ models</p></li><li><p>Already shown to outrank any individual model in Cursor&#8217;s benchmarks</p></li></ul><h2><strong>Technical Philosophy &amp; Infrastructure</strong></h2><h3><strong>Context as Core Focus</strong></h3><p>The team emphasised gathering context at two critical points:</p><ul><li><p><strong>Pre-generation:</strong> Plan mode for upfront alignment</p></li><li><p><strong>Post-generation:</strong> Debug mode for validation</p></li></ul><h3><strong>&#8220;Everything as Files&#8221; Strategy</strong></h3><ul><li><p>Models excel at reading/writing files</p></li><li><p>Converting all artifacts to files lets agents run grep and semantic search over them</p></li><li><p>Chat histories moving from SQLite to markdown files</p></li><li><p>Leverages existing primitives agents already understand</p></li></ul><h3><strong>CursorBench (Private Benchmark)</strong></h3><ul><li><p>Internal benchmark using real software engineering tasks</p></li><li><p><strong>Must remain private</strong> to prevent training data contamination</p></li><li><p>Currently tracks the best single model against the ensemble approach</p></li><li><p>Better predictor than public benchmarks (SWE-bench now potentially in training data)</p></li></ul><h2><strong>Model Evaluation Insights</strong></h2><h3><strong>Quality Metrics</strong></h3><p>Two key signals tracked:</p><ol><li><p><strong>Code persistence:</strong> How long generated code remains in codebase after commits</p></li><li><p><strong>Sentiment analysis:</strong> Next message indicates success (topic change) vs. 
failure (error logs, continued same issue)</p></li></ol><h3><strong>Model Usage Patterns</strong></h3><ul><li><p>Internal data shows many engineers using Tab less and relying more on agent-heavy workflows</p></li><li><p>Sonnet family most popular, with spikes when new versions release</p></li><li><p>Brief Gemini adoption, then return to Claude/OpenAI models</p></li></ul><h2><strong>Development Culture &amp; Process</strong></h2><h3><strong>Internal PMF First</strong></h3><ul><li><p>Ship features internally before external release</p></li><li><p>Debug mode initially built on weekends, faced scepticism</p></li></ul><h3><strong>Rapid Iteration</strong></h3><ul><li><p>Minimal RFC process for most features</p></li><li><p>Each person is empowered to ship internally when they have ideas</p></li><li><p>RFCs mainly for context/knowledge sharing and alignment</p></li><li><p>RFC writing now valuable because agents can implement from specs</p></li></ul><h3><strong>Sub-Agents</strong></h3><ul><li><p>Had sub-agents in April but no internal PMF</p></li><li><p>Bringing back with Composer model (fast, nearly Sonnet-quality)</p></li><li><p>Models improved enough to make the experience acceptable</p></li></ul><h2><strong>Product Strategy &amp; Roadmap</strong></h2><h3><strong>Interfaces (IDE, CLI)</strong></h3><ul><li><p><strong>P0</strong> (primary product) is to automate coding - the interface will evolve over time</p></li></ul><h3><strong>Computer Use for Validation</strong></h3><ul><li><p>Using computer use to verify code changes work correctly</p></li><li><p>Example: Bug reported on Slack &#8594; Cursor fixes &#8594; Returns video proof &#8594; PR merged</p></li><li><p>Async workflow advantage - not waiting for a human to test</p></li></ul><h3><strong>Other Features</strong></h3><ul><li><p><strong>Deep Links for sharing:</strong> Commands and rules shareable via links</p></li><li><p><strong>Voice Input:</strong> Already working well, working on voice output back from 
Cursor</p></li></ul><h3><strong>Future Features Discussed</strong></h3><ul><li><p><strong>Extension Marketplace:</strong> Reuse VS Code marketplace for commands, rules, MCP servers</p></li><li><p><strong>Video Input:</strong> Waiting for model providers to add capability</p></li></ul><h2><strong>Technical Deep Dives</strong></h2><h3><strong>Tool Management for MCP</strong></h3><p>Solutions being explored:</p><ol><li><p><strong>Lazy loading:</strong> Agent sees tool list, loads definitions only when needed</p></li><li><p><strong>File-based discovery:</strong> Tools stored as files, agents grep/search</p></li><li><p><strong><a href="https://blog.cloudflare.com/code-mode/">Code Mode</a>:</strong> Wrap MCPs as APIs (models better at APIs than MCP definitions), enables programmatic chaining</p></li></ol><h3><strong>Context Window Challenges</strong></h3><ul><li><p>Opus 4.5 only 180K context window (surprisingly small)</p></li><li><p>Team stops using chat at ~60% capacity (quality degrades)</p></li><li><p>Preference for one-shot prompts vs. 
multi-turn conversations</p></li><li><p>Summarisation degrades quality significantly</p></li></ul><h2><strong>Other Insights</strong></h2><h3><strong>BugBot Feature</strong></h3><ul><li><p>Automatic PR review bot in Cursor</p></li><li><p>Intentionally catches fewer bugs than competitors</p></li><li><p>Reason: Filtering false positives - too many warnings = users ignore them</p></li><li><p>Custom model trained for false positive detection</p></li></ul><h3><strong>Internal Tools Usage</strong></h3><ul><li><p>Can see internal data on feature usage, model preferences</p></li><li><p>Agent quality team (8 people) solely focused on harness/evals/tools</p></li></ul><h3><strong>Competitive Positioning</strong></h3><ul><li><p>CursorBench used as de facto model ranking</p></li><li><p>Model selector in IDE trusted more than public benchmarks</p></li></ul><div><hr></div><h3><em>Signals</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7kN0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F904ba0f1-1e1e-47fa-83cd-1b8d51d2c844_2454x1252.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7kN0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F904ba0f1-1e1e-47fa-83cd-1b8d51d2c844_2454x1252.png 424w, https://substackcdn.com/image/fetch/$s_!7kN0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F904ba0f1-1e1e-47fa-83cd-1b8d51d2c844_2454x1252.png 848w, 
https://substackcdn.com/image/fetch/$s_!7kN0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F904ba0f1-1e1e-47fa-83cd-1b8d51d2c844_2454x1252.png 1272w, https://substackcdn.com/image/fetch/$s_!7kN0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F904ba0f1-1e1e-47fa-83cd-1b8d51d2c844_2454x1252.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7kN0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F904ba0f1-1e1e-47fa-83cd-1b8d51d2c844_2454x1252.png" width="1456" height="743" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/904ba0f1-1e1e-47fa-83cd-1b8d51d2c844_2454x1252.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:743,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Weekly token volume by model type&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Weekly token volume by model type" title="Weekly token volume by model type" srcset="https://substackcdn.com/image/fetch/$s_!7kN0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F904ba0f1-1e1e-47fa-83cd-1b8d51d2c844_2454x1252.png 424w, https://substackcdn.com/image/fetch/$s_!7kN0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F904ba0f1-1e1e-47fa-83cd-1b8d51d2c844_2454x1252.png 848w, 
https://substackcdn.com/image/fetch/$s_!7kN0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F904ba0f1-1e1e-47fa-83cd-1b8d51d2c844_2454x1252.png 1272w, https://substackcdn.com/image/fetch/$s_!7kN0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F904ba0f1-1e1e-47fa-83cd-1b8d51d2c844_2454x1252.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: OpenRouter</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link 
image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bzIS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3108ef78-9305-4773-b629-803a660b24cb_1108x528.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bzIS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3108ef78-9305-4773-b629-803a660b24cb_1108x528.jpeg 424w, https://substackcdn.com/image/fetch/$s_!bzIS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3108ef78-9305-4773-b629-803a660b24cb_1108x528.jpeg 848w, https://substackcdn.com/image/fetch/$s_!bzIS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3108ef78-9305-4773-b629-803a660b24cb_1108x528.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!bzIS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3108ef78-9305-4773-b629-803a660b24cb_1108x528.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bzIS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3108ef78-9305-4773-b629-803a660b24cb_1108x528.jpeg" width="1108" height="528" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3108ef78-9305-4773-b629-803a660b24cb_1108x528.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:528,&quot;width&quot;:1108,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Exhibit 
7&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Exhibit 7" title="Exhibit 7" srcset="https://substackcdn.com/image/fetch/$s_!bzIS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3108ef78-9305-4773-b629-803a660b24cb_1108x528.jpeg 424w, https://substackcdn.com/image/fetch/$s_!bzIS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3108ef78-9305-4773-b629-803a660b24cb_1108x528.jpeg 848w, https://substackcdn.com/image/fetch/$s_!bzIS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3108ef78-9305-4773-b629-803a660b24cb_1108x528.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!bzIS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3108ef78-9305-4773-b629-803a660b24cb_1108x528.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 
11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h3><em>What I&#8217;m Reading</em></h3><p><a href="https://research.google/blog/titans-miras-helping-ai-have-long-term-memory/">Titans + MIRAS: Helping AI have long-term memory</a></p><p><a href="https://venture-bystander.ghost.io/portable-memory-and-behavioral-signatures/">Portable Memory &amp; Behavioral Signatures: the missing layer for AI personalisation</a></p><p><a href="https://thechipletter.substack.com/p/tpu-mania?publication_id=1063960&amp;post_id=180421815&amp;isFreemail=true&amp;r=1ac0y&amp;triedRedirect=true">TPU Mania</a></p><p><a href="https://robonomics.substack.com/p/os-agent-bytes-mobile-assistant-and">OS Agent | Byte&#8217;s Mobile Assistant &amp; Implications</a></p><p><a href="https://x.com/fergal_reid/status/1999530555642098141/?rw_tt_thread=True">B2B &#8220;AI products&#8221; that are really services in a trenchcoat</a></p><div><hr></div><h3><em>Earnings Commentary</em></h3><div class="pullquote"><p>You can achieve performance-wise so much better in the custom purpose-designed, hardware-driven XPU. And we see that in the TPU and we see that in all the accelerators we are doing for our other customers, much, much better in areas of sparse core, training, inference, reasoning, all that stuff.</p><p><strong>Hock E. 
Tan, Broadcom Q4 2025 Earnings Call</strong></p></div><div class="pullquote"><p>I&#8217;ll give you one example from a financials or economics perspective. Let&#8217;s take a media and entertainment company we&#8217;re working on... that organization was spending $10 million with us ARR on our core creative products... We were able to sell them Firefly Services and Firefly Foundry for about $7 million, so a pretty significant step up in terms of the engagement that we have with the customer.</p><p><strong>David Wadhwani, Adobe Q4 2025 Earnings Call</strong></p></div><div><hr></div><p><em>Have any feedback? Email me at akash@earlybird.com.</em></p>]]></content:encoded></item><item><title><![CDATA[Open Source AI 
with Alibaba Qwen]]></title><description><![CDATA[With Vlad, Head of AI Products and Solutions at Alibaba Cloud]]></description><link>https://www.akashbajwa.co/p/open-source-ai-with-alibaba-qwen</link><guid isPermaLink="false">https://www.akashbajwa.co/p/open-source-ai-with-alibaba-qwen</guid><dc:creator><![CDATA[Akash Bajwa]]></dc:creator><pubDate>Tue, 02 Dec 2025 07:01:51 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!gdd1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16252013-5354-4e3e-8f67-c291829e64e6_1048x786.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Software Synthesis analyses the evolution of <strong>software companies in the age of AI</strong> - from how they're built and scaled, to how they go to market and create enduring value. You can reach <strong><a href="https://www.linkedin.com/in/akashbajwa/">me</a></strong> at <strong>akash@earlybird.com</strong>.</em></p><div><hr></div><h3><em>Upcoming events in London</em></h3><div class="pullquote"><p><strong>December 4th</strong>: <strong><a href="https://luma.com/uk1t92xl">Paper Club on RL and Multimodal Models</a></strong></p><p><strong>December 11th</strong>: 
<strong><a href="https://luma.com/ifow8non">AI Engineering with Cursor</a></strong></p><p><strong>December 15th</strong>: <strong><a href="https://luma.com/obmvljb4">2025 Reflections with Google and Earlybird AI Friends</a></strong></p></div><blockquote><div><hr></div></blockquote><p><em>Last week we hosted Vlad from Alibaba Cloud to discuss the Qwen model family, which are some of the most heavily used <a href="https://openrouter.ai/rankings">open source models</a>. </em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gdd1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16252013-5354-4e3e-8f67-c291829e64e6_1048x786.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gdd1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16252013-5354-4e3e-8f67-c291829e64e6_1048x786.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gdd1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16252013-5354-4e3e-8f67-c291829e64e6_1048x786.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gdd1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16252013-5354-4e3e-8f67-c291829e64e6_1048x786.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gdd1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16252013-5354-4e3e-8f67-c291829e64e6_1048x786.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!gdd1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16252013-5354-4e3e-8f67-c291829e64e6_1048x786.jpeg" width="1048" height="786" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/16252013-5354-4e3e-8f67-c291829e64e6_1048x786.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:786,&quot;width&quot;:1048,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!gdd1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16252013-5354-4e3e-8f67-c291829e64e6_1048x786.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gdd1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16252013-5354-4e3e-8f67-c291829e64e6_1048x786.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gdd1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16252013-5354-4e3e-8f67-c291829e64e6_1048x786.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gdd1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16252013-5354-4e3e-8f67-c291829e64e6_1048x786.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" 
type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>Alibaba Cloud</strong></h2><p>Alibaba Cloud positions itself as one of the few public hyperscale cloud providers with a complete, vertically integrated AI stack. The company operates 92 data centres globally across 29 regions.</p><p>Its current European presence includes five data centres (three in Frankfurt, two in London), with upcoming regions in France and the Netherlands. The company was founded in 2009 as part of Alibaba Group&#8217;s IT team before becoming a standalone cloud business focused on a &#8220;user first and AI driven&#8221; strategy.</p><p>Alibaba Cloud maintains its own homegrown technology stack from the operating system level up. 
This includes custom kernels, compute infrastructure, storage services, networking, and security layers.</p><h2><strong>The Qwen Model Family</strong></h2><h3><strong>Core Philosophy</strong></h3><p>Qwen represents Alibaba&#8217;s commitment to open source AI development under Apache 2.0 licensing, providing significant freedom for commercial use. The models are designed to balance performance with efficiency (reflecting the broader Chinese AI ecosystem&#8217;s adaptation to constrained compute access).</p><h3><strong>Model Variants &amp; Specifications</strong></h3><p><strong>Text Models:</strong></p><ul><li><p>Dense models: 0.6B to 32B parameters</p></li><li><p>Mixture of Experts (MoE): Qwen3-30B-A3B and Qwen3-235B-A22B (with 22B active at inference)</p></li><li><p>Qwen3-Max: flagship commercial model exceeding 1 trillion parameters</p></li><li><p>Qwen3-Next-80B-A3B: 80B parameter model with 3B active parameters</p></li></ul><p>The variety in sizing allows developers to choose optimal performance-to-resource ratios for their specific use cases.</p><h3><strong>Multimodal Capabilities</strong></h3><p><strong>Qwen3-VL (Vision-Language):</strong></p><ul><li><p>Supports a 256,000-token context window for large video processing</p></li><li><p>Near real-time performance (5-second delay) for video analysis</p></li><li><p>Applications include video labelling, tagging, content moderation, quality management</p></li><li><p>Particularly strong with complex images like tables and diagrams</p></li></ul><p><strong>Qwen-Audio:</strong></p><ul><li><p>Speech recognition and audio processing</p></li><li><p>Pairs with the VL model for comprehensive multimodal applications</p></li></ul><p><strong>Qwen-Omni:</strong></p><ul><li><p>Experimental single model for all modalities</p></li><li><p>Goal is an eventual AGI-like architecture</p></li><li><p>Demonstrated in edge computing scenarios (mobile devices)</p></li></ul><p><strong>Qwen Coder:</strong></p><ul><li><p>Specialised for code 
generation</p></li><li><p>Trained on a massive code base during the second phase of model training</p></li><li><p>Engineers had to drop down to PTX (a low-level instruction set) for optimisation</p></li><li><p>Competitive with leading code models</p></li><li><p>Available across multiple IDEs</p></li></ul><h3><strong>Reasoning &amp; Thinking Models</strong></h3><p>Qwen initially released a hybrid reasoning/non-reasoning model but later split the two modes into separate models based on community feedback. The key innovation is the <strong>&#8220;thinking budget&#8221;</strong> feature, which lets users set a token limit for reasoning. Once the budget is exhausted, the model automatically switches to instruct mode, optimising GPU resource utilisation.</p><p>The flagship Qwen3-Max commercial model ranks competitively with models from OpenAI, Anthropic (Claude), and Google (Gemini) on major benchmarks, though the competitive landscape shifts rapidly with frequent releases.</p><h3><strong>Multilingual Support</strong></h3><p>Qwen3 supports 119 languages and dialects, making it effectively universal for global deployment. This is a major step up from Qwen2.5&#8217;s support for roughly 30 languages.</p><h2><strong>Generative AI: The Wan Family</strong></h2><p>The Wan family handles video and image generation with multilingual capabilities. 
Wan 2.2 is open source under Apache 2.0, while Wan 2.5 is commercial.</p><p><strong>Capabilities:</strong></p><ul><li><p>Text-to-video, text-to-image</p></li><li><p>Image-to-video, video-to-video editing</p></li><li><p>Currently supports 15-second video generation</p></li><li><p>Performance comparable to Google Veo and Sora</p></li><li><p>Brand customisation for organisational alignment (colours, fonts, shapes)</p></li></ul><p><strong>Use Cases:</strong></p><ul><li><p>Social media content generation</p></li><li><p>Broadcasting and sports content</p></li><li><p>Corporate marketing materials</p></li></ul><h2><strong>Commercial vs Open Source Distinctions</strong></h2><h3><strong>Development &amp; Training</strong></h3><p><strong>Commercial Models:</strong></p><ul><li><p>36 trillion tokens for general training</p></li><li><p>Continuous post-release tuning and updates</p></li><li><p>Enhanced security and content generation support</p></li><li><p>Extended context windows up to 1 million tokens</p></li><li><p>Access to caching and optimisation features</p></li></ul><p><strong>Open Source Models:</strong></p><ul><li><p>~20 trillion tokens (roughly 1.8x fewer than commercial)</p></li><li><p>Fixed at release date, no continuous updates</p></li><li><p>Context windows: 32K to 128K depending on configuration</p></li><li><p>Community-driven development and derivative models</p></li><li><p>Self-managed training and fine-tuning</p></li></ul><h3><strong>Support &amp; Pricing</strong></h3><p><strong>Commercial:</strong></p><ul><li><p>Token-based pricing through Model Studio</p></li><li><p>Full technical support from Alibaba Cloud</p></li><li><p>Fine-tuning tools and evaluation models (in pilot, coming soon internationally)</p></li><li><p>Available in Singapore and Hong Kong; Europe in 2026</p></li></ul><p><strong>Open Source:</strong></p><ul><li><p>Free to use and modify</p></li><li><p>Community support</p></li><li><p>Self-hosted with user-controlled 
infrastructure</p></li><li><p>Derivative models emerging (Japanese, Arabic language variants)</p></li></ul><h2><strong>Technical Infrastructure &amp; Deployment</strong></h2><h3><strong>Model Studio &amp; Platform-as-a-Service (PaaS)</strong></h3><p>Model Studio provides fully managed AI services combining:</p><ul><li><p>Qwen family models (text, vision, audio)</p></li><li><p>Partner models (DeepSeek, GLM, Kimi, Llama, others)</p></li><li><p>Token-based consumption across multiple models</p></li><li><p>Agent orchestration capabilities</p></li><li><p>Native MCP support</p></li></ul><h3><strong>PAI (Platform for AI)</strong></h3><p>Elastic Algorithm Service (EAS), enabling:</p><ul><li><p>Few-click deployment of the latest Qwen models on GPU infrastructure</p></li><li><p>GPU pool creation and load distribution</p></li><li><p>Fully managed infrastructure</p></li></ul><h3><strong>Vector Database Options</strong></h3><p>Multiple managed services:</p><ul><li><p>AnalyticDB for PostgreSQL (with vector engine, built on Greenplum)</p></li><li><p>Hologres (MPP database with vector engine, homegrown product)</p></li><li><p>Elasticsearch and OpenSearch (managed services)</p></li><li><p>Milvus (fully managed)</p></li></ul><p>The choice depends on the use case: AnalyticDB integrates with larger data platforms (MaxCompute), while Hologres is optimised for specific workflows.</p><h2><strong>Agent Framework &amp; Tools</strong></h2><p><strong>Qwen Agent (Open Source):</strong></p><ul><li><p>Platform for agent orchestration</p></li><li><p>Supports planning, decision making, information management</p></li><li><p>Integration with various tools and functions</p></li><li><p>MCP-based tool selection and execution</p></li></ul><p><strong>Qwen Guard:</strong></p><ul><li><p>Enterprise guard-railing solution</p></li><li><p>Uses an LLM to control input/output</p></li><li><p>Customisable policies with default integration</p></li><li><p>Protection against prompt injection and other attacks</p></li></ul><p><strong>Additional 
Tools:</strong></p><ul><li><p>Qwen3 Embedding and Reranker</p></li><li><p>Open source RAG framework</p></li><li><p>OCR capabilities (Qwen OCR for text-heavy documents)</p></li></ul><h2><strong>Key Discussion Points &amp; Insights</strong></h2><h3><strong>Adoption Patterns</strong></h3><p>The community has adopted Qwen open source models quickly. Major platforms like Hugging Face integrated Qwen models into HuggingChat based on technical merit rather than advocacy. Companies like Airbnb have said that they use OpenAI for pilots but switch to open source models like Qwen for production due to cost efficiency at scale.</p><p>Derivative models demonstrate strong community engagement, particularly language-specific implementations (a Japanese model topped charts for extended periods, and Arabic implementations are showing strong performance).</p><h3><strong>Performance &amp; Optimisation</strong></h3><p>Performance varies significantly across hosting solutions. Alibaba Cloud uses its custom acceleration technology (PAI-Lingjun) but doesn&#8217;t test Qwen performance on competitors like Google Cloud or AWS. Each cloud provider applies different optimisations and accelerations, making direct comparisons difficult.</p><h3><strong>Cost Economics</strong></h3><p>The token-based pricing for commercial models can become expensive at scale, driving production workloads toward open source alternatives. The point at which self-hosting becomes more economical than API calls depends heavily on the specific use case and scale. Alibaba Cloud offers both Model Studio (token-based) and PAI (AI infrastructure-based) to provide flexibility.</p><h3><strong>Fine-Tuning Strategy</strong></h3><p>Fine-tuning capabilities are being piloted within Alibaba Cloud Model Studio in some regions. 
The recommended approach follows standard best practices:</p><ol><li><p>Start with system prompts</p></li><li><p>Try LoRA (Low-Rank Adaptation)</p></li><li><p>Full fine-tuning only if necessary (requires significant engineering expertise)</p></li></ol><p>Fine-tuning isn&#8217;t always necessary for solving specific problems, and over-reliance on it can indicate other issues with the implementation.</p><h2><strong>Future Roadmap &amp; Technical Forecasts</strong></h2><h3><strong>Context Length Evolution</strong></h3><ul><li><p>Current: Up to 1 million tokens (commercial)</p></li><li><p>Forecast: 10 million to 100 million tokens</p></li><li><p>Challenge: Processing massive documents while maintaining attention across entire context</p></li></ul><h3><strong>Model Architecture Questions</strong></h3><p>The engineering team forecasts several architectural considerations:</p><ol><li><p>All-modality convergence toward single universal models</p></li><li><p>Potential limitations of transformer architecture at current scaling</p></li><li><p>Focus on test-time compute scaling</p></li><li><p>Data scaling and quality improvements</p></li><li><p>Enhanced reinforcement learning implementation</p></li></ol><p>Some experts suggest the industry may have reached transformer architecture limitations, requiring new approaches beyond simply increasing model size.</p><h3><strong>Open Source Strategy</strong></h3><p>Alibaba remains committed to releasing both commercial and open source versions of new models. 
This dual approach serves multiple purposes:</p><ul><li><p>Developer community engagement and ecosystem building</p></li><li><p>Direct user feedback for rapid iteration</p></li><li><p>Path to commercial model adoption</p></li><li><p>Competitive differentiation in a crowded market</p></li></ul><p>The Qwen2.5 Math and Qwen2.5 Coder models were used extensively to generate synthetic training data that fed back into pretraining for subsequent versions, creating a virtuous cycle of improvement.</p><h2><strong>Competitive Context &amp; Market Position</strong></h2><h3><strong>Chinese AI Ecosystem</strong></h3><p>The broader Chinese AI landscape features intense competition. Following DeepSeek&#8217;s breakthrough, major tech companies (Alibaba, Baidu, Moonshot, others) significantly accelerated foundation model investments. DeepSeek pioneered many efficiency innovations but hasn&#8217;t maintained momentum recently. Moonshot&#8217;s Kimi briefly overtook Qwen on some leaderboards.</p><p>Qwen models remain among the most popular open source options globally, with MetaLM showing strong consumption across various API reselling platforms. The competitive intensity means new model versions are released constantly, making benchmarks quickly outdated.</p><h3><strong>Vertical Integration Advantage</strong></h3><p>Alibaba&#8217;s control of both models and cloud infrastructure represents a rare combination. Most competitors lack global cloud presence. 
This vertical integration enables:</p><ul><li><p>Full optimisation across the stack</p></li><li><p>Better cost control for model training</p></li><li><p>Custom acceleration technologies</p></li><li><p>Faster iteration and deployment</p></li></ul><h2><strong>Strategic Implications</strong></h2><h3><strong>Open Source as Go-to-Market</strong></h3><p>Alibaba&#8217;s open source strategy serves multiple strategic purposes:</p><ol><li><p>Developer mindshare and ecosystem development</p></li><li><p>Rapid feedback loops for model improvement</p></li><li><p>Lower barrier to experimentation leading to commercial adoption</p></li><li><p>Competitive differentiation against closed-source-only competitors</p></li></ol><p>The community&#8217;s willingness to create derivative models (Japanese, Arabic variants) validates technical quality while expanding Qwen&#8217;s reach into specialised domains and languages.</p><h3><strong>Build vs Buy Dynamics</strong></h3><p>For enterprises, the decision calculus involves:</p><ul><li><p><strong>API services:</strong> Fast to start, expensive at scale, less control</p></li><li><p><strong>Self-hosted open source:</strong> Higher upfront engineering cost, better long-term economics at scale, full control</p></li><li><p><strong>Managed services:</strong> Middle ground with varying levels of optimisation</p></li></ul><p>The emergence of companies like Fireworks and Together.ai suggests a sustainable market for managed open source model hosting, capturing value from optimisation and operations.</p><h3><strong>Data &amp; Synthetic Training</strong></h3><p>The use of Qwen2.5 models to generate synthetic training data for Qwen3 hints at an important trend: models improving through self-generated data. 
This approach requires careful quality control but can overcome data scarcity constraints, which is particularly relevant given export restrictions and data access challenges.</p><h2><strong>Q&amp;A</strong></h2><h3><strong>Downstream RL Training</strong></h3><p>One attendee noted that Qwen3 serves as an excellent seed model for downstream reinforcement learning training and domain specialisation, although reward hacking does occur during RL training. Alibaba expressed interest in collecting specific feedback to bring back to engineering teams for future model development.</p><h3><strong>Computer Use Agents</strong></h3><p>When asked about computer use capabilities (similar to Lovable or v0), Vlad clarified that the Qwen Agent framework provides orchestration tools, but interactive computer control features might be product-level additions to Qwen Chat rather than core model capabilities. This mirrors how ChatGPT&#8217;s Operator exists as a product feature rather than a model capability.</p><h3><strong>Voice Capabilities</strong></h3><p>Alibaba offers cascaded voice models (ASR and TTS), but attendees noted the industry trend toward native voice-to-voice models for better latency and naturalness. While cascaded models currently offer better reliability for enterprise use cases requiring tool calling and reasoning, native voice-to-voice models will likely improve significantly in the coming months for consumer applications.</p><h3><strong>Prompting Challenges</strong></h3><p>Attendees raised a significant practical concern: switching from one model to another (e.g., Gemini to Qwen) requires substantial prompt engineering adjustments. 
While competitors like Anthropic provide detailed migration guides explaining how to adjust prompts between model versions, Qwen&#8217;s documentation lacks comprehensive prompting guides, especially for cross-model migration.</p><div><hr></div><div><hr></div><h3><em>Signals</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iAk3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb400e88-b33d-4c2d-a3fd-a38d4e2cff93_1512x850.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iAk3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb400e88-b33d-4c2d-a3fd-a38d4e2cff93_1512x850.jpeg 424w, https://substackcdn.com/image/fetch/$s_!iAk3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb400e88-b33d-4c2d-a3fd-a38d4e2cff93_1512x850.jpeg 848w, https://substackcdn.com/image/fetch/$s_!iAk3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb400e88-b33d-4c2d-a3fd-a38d4e2cff93_1512x850.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!iAk3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb400e88-b33d-4c2d-a3fd-a38d4e2cff93_1512x850.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iAk3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb400e88-b33d-4c2d-a3fd-a38d4e2cff93_1512x850.jpeg" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb400e88-b33d-4c2d-a3fd-a38d4e2cff93_1512x850.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!iAk3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb400e88-b33d-4c2d-a3fd-a38d4e2cff93_1512x850.jpeg 424w, https://substackcdn.com/image/fetch/$s_!iAk3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb400e88-b33d-4c2d-a3fd-a38d4e2cff93_1512x850.jpeg 848w, https://substackcdn.com/image/fetch/$s_!iAk3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb400e88-b33d-4c2d-a3fd-a38d4e2cff93_1512x850.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!iAk3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb400e88-b33d-4c2d-a3fd-a38d4e2cff93_1512x850.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: Theory Ventures</figcaption></figure></div><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uzv4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57447724-b889-4bd8-b4cf-3147213a8f2d_3303x1653.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uzv4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57447724-b889-4bd8-b4cf-3147213a8f2d_3303x1653.png 424w, https://substackcdn.com/image/fetch/$s_!uzv4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57447724-b889-4bd8-b4cf-3147213a8f2d_3303x1653.png 848w, 
https://substackcdn.com/image/fetch/$s_!uzv4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57447724-b889-4bd8-b4cf-3147213a8f2d_3303x1653.png 1272w, https://substackcdn.com/image/fetch/$s_!uzv4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57447724-b889-4bd8-b4cf-3147213a8f2d_3303x1653.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uzv4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57447724-b889-4bd8-b4cf-3147213a8f2d_3303x1653.png" width="1456" height="729" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/57447724-b889-4bd8-b4cf-3147213a8f2d_3303x1653.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:729,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uzv4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57447724-b889-4bd8-b4cf-3147213a8f2d_3303x1653.png 424w, https://substackcdn.com/image/fetch/$s_!uzv4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57447724-b889-4bd8-b4cf-3147213a8f2d_3303x1653.png 848w, 
https://substackcdn.com/image/fetch/$s_!uzv4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57447724-b889-4bd8-b4cf-3147213a8f2d_3303x1653.png 1272w, https://substackcdn.com/image/fetch/$s_!uzv4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57447724-b889-4bd8-b4cf-3147213a8f2d_3303x1653.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: <a 
href="https://newsletter.semianalysis.com/p/tpuv7-google-takes-a-swing-at-the">SemiAnalysis</a></figcaption></figure></div><div><hr></div><h3><em>What I&#8217;m Reading</em></h3><p><a href="https://www.a16z.news/p/moats-before-gross-margins-revisited">Moats Before (Gross) Margins: Revisited</a></p><p><a href="https://fastpaca.com/blog/llm-memory-systems-explained">LLM Memory Systems Explained</a></p><p><a href="https://www.latent.space/p/agent-labs?publication_id=1084089&amp;post_id=175170404&amp;isFreemail=true&amp;r=1ac0y&amp;triedRedirect=true">The Agent Labs Thesis</a></p><div><hr></div><h3><em>Earnings Commentary</em></h3><div class="pullquote"><p>Most accelerators without CUDA and NVIDIA&#8217;s time-tested and versatile architecture became obsolete within a few years as model technologies evolve. Thanks to CUDA, the A100 GPUs we shipped 6 years ago are still running at full utilization today, powered by vastly improved software stack.</p><p><strong>Jensen Huang, Nvidia Q3 2026 Earnings Call</strong></p></div><div class="pullquote"><p>All of the latest GPUs that are running are running at full capacity and not just them, the last generation GPUs, even GPUs from 3 to 5 years ago, so also several generations back, those GPUs are to this day still running at full capacity.</p><p><strong>Yongming Wu, Alibaba Q2 2026 Earnings Call</strong></p></div><div><hr></div><p><em>Have any feedback? 
Email me at akash@earlybird.com.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.akashbajwa.co/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.akashbajwa.co/subscribe?"><span>Subscribe now</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.akashbajwa.co/p/open-source-ai-with-alibaba-qwen?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.akashbajwa.co/p/open-source-ai-with-alibaba-qwen?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.akashbajwa.co/p/open-source-ai-with-alibaba-qwen/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.akashbajwa.co/p/open-source-ai-with-alibaba-qwen/comments"><span>Leave a comment</span></a></p><h3></h3>]]></content:encoded></item><item><title><![CDATA[Designing AI-Native Software]]></title><description><![CDATA[With SPACING]]></description><link>https://www.akashbajwa.co/p/designing-ai-native-software</link><guid isPermaLink="false">https://www.akashbajwa.co/p/designing-ai-native-software</guid><dc:creator><![CDATA[Akash Bajwa]]></dc:creator><pubDate>Mon, 24 Nov 2025 07:02:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!uEFq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5571587c-dade-4593-8429-ec9201268242_3024x3020.jpeg" length="0" 
type="image/jpeg"/><content:encoded><![CDATA[<p><em>Software Synthesis analyses the evolution of <strong>software companies in the age of AI</strong> - from how they're built and scaled, to how they go to market and create enduring value. You can reach <strong><a href="https://www.linkedin.com/in/akashbajwa/">me</a></strong> at <strong>akash@earlybird.com</strong>.</em></p><div><hr></div><h3><em>Gradient Descending Roundtables in London</em></h3><div class="pullquote"><p><strong>November 26th</strong>: <strong><a href="https://luma.com/43yu7r2e">Open Source AI with Alibaba Qwen</a></strong></p></div><blockquote><div><hr></div></blockquote><p><em>Last week we hosted <a href="https://www.linkedin.com/in/jonathanheuser/">Jonathan</a> and <a href="https://www.linkedin.com/in/amelie-schlueter/">Amelie</a>, cofounders of design venture studio <a href="https://www.spacing.co/">SPACING</a>, to unpack how to approach design for AI-Native startups.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!uEFq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5571587c-dade-4593-8429-ec9201268242_3024x3020.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uEFq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5571587c-dade-4593-8429-ec9201268242_3024x3020.jpeg 424w, https://substackcdn.com/image/fetch/$s_!uEFq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5571587c-dade-4593-8429-ec9201268242_3024x3020.jpeg 848w, https://substackcdn.com/image/fetch/$s_!uEFq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5571587c-dade-4593-8429-ec9201268242_3024x3020.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!uEFq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5571587c-dade-4593-8429-ec9201268242_3024x3020.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uEFq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5571587c-dade-4593-8429-ec9201268242_3024x3020.jpeg" width="3024" height="3020" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5571587c-dade-4593-8429-ec9201268242_3024x3020.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:3020,&quot;width&quot;:3024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1652430,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/179758317?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b9e8ade-15f5-4778-a998-4fb208f13c84_4032x3024.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uEFq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5571587c-dade-4593-8429-ec9201268242_3024x3020.jpeg 424w, https://substackcdn.com/image/fetch/$s_!uEFq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5571587c-dade-4593-8429-ec9201268242_3024x3020.jpeg 848w, https://substackcdn.com/image/fetch/$s_!uEFq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5571587c-dade-4593-8429-ec9201268242_3024x3020.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!uEFq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5571587c-dade-4593-8429-ec9201268242_3024x3020.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2><strong>Moving Beyond Chat</strong></h2><p>Most startups <a href="https://www.spacing.co/">SPACING</a> worked with requested a chat interface, treating it as the default pattern for AI interaction. 
This led to:</p><ul><li><p>Homogeneous user experiences across products</p></li><li><p>Home screens with generic prompts like &#8220;What do you want to achieve today?&#8221;</p></li><li><p>Users struggling to understand capabilities and how to articulate requests</p></li><li><p>Suboptimal workflows for specific use cases</p></li></ul><h3><strong>The Chat Interface Paradox</strong></h3><p><strong>Historical Context:</strong></p><ul><li><p>Command-line interfaces &#8594; Graphical User Interfaces (GUIs) &#8594; back to text-based chat</p></li><li><p>The key difference: Natural language processing enables conversational interaction</p></li><li><p>But this doesn&#8217;t mean chat is always optimal</p></li></ul><p><strong>When Chat Works:</strong></p><ul><li><p><strong>Cursor</strong> - Engineers working with code and text (text is their natural medium)</p></li><li><p>Text-heavy workflows</p></li><li><p>Open-ended exploration</p></li><li><p>Specific questions requiring detailed responses</p></li></ul><p><strong>When Chat Fails:</strong></p><ul><li><p>Tasks requiring visual precision (e.g., changing button colours)</p></li><li><p>Workflows needing immediate visual feedback</p></li><li><p>Actions better suited to direct manipulation</p></li><li><p>Users who don&#8217;t know how to prompt effectively</p></li></ul><h3><strong>Direct Manipulation vs Chat</strong></h3><p>Changing a button colour through chat:</p><ul><li><p>Requires writing: &#8220;Change the colour to red&#8221;</p></li><li><p>May not get the exact shade desired</p></li><li><p>Requires specifying hex codes for precision</p></li><li><p>Lacks immediate visual feedback</p></li><li><p>Multiple iterations needed</p></li></ul><p>Changing a button colour through direct manipulation:</p><ul><li><p>Click the colour picker</p></li><li><p>See all colour options instantly</p></li><li><p>Immediate visual feedback</p></li><li><p>Faster and more accurate</p></li></ul><div><hr></div><h2><strong>Four Ideas Beyond Chat</strong></h2><h3><strong>1. 
AI as a Collaborative Partner</strong></h3><p><strong>Philosophy:</strong> AI as a companion rather than a replacement, working alongside users to enhance capabilities.</p><h4><strong>Implementation Strategies:</strong></h4><p><strong>a) Contextual Commenting System</strong></p><ul><li><p>Integrate AI into existing UI patterns (commenting features)</p></li><li><p>Tag AI in comments on canvas elements instead of switching to a separate chat</p></li><li><p>AI responds in context, maintaining workflow continuity</p></li><li><p>Example: Comment on a design element &#8594; tag AI &#8594; AI suggests variations in place</p></li></ul><p><strong>b) Project-Aware AI Companion</strong></p><ul><li><p>AI &#8220;lives&#8221; within design files with persistent context</p></li><li><p>Receives master instructions about project goals, brand guidelines, tone of voice</p></li><li><p>Eliminates repeated context-sharing (unlike a typical ChatGPT workflow)</p></li><li><p>Press a button to request copywriting assistance</p></li><li><p>AI works in the background while the designer continues other tasks</p></li></ul><p><strong>c) Exploratory AI Features</strong></p><ul><li><p>&#8220;Surprise button&#8221; next to the colour picker</p></li><li><p>Generates random colour suggestions</p></li><li><p>Press multiple times for variety</p></li><li><p>Sparks inspiration</p></li></ul><p><strong>d) Icon Generation</strong></p><ul><li><p>Quick icon generation on demand</p></li><li><p>No need for external icon libraries</p></li><li><p>Integrated into workflow</p></li></ul><p><strong>e) AI-Generated First Drafts</strong></p><ul><li><p>Addresses the &#8220;blank canvas&#8221; problem</p></li><li><p>AI creates initial structure (3 screens, basic flow, components)</p></li><li><p>Goal: Not accuracy, but a starting point for iteration</p></li><li><p>Leverages human strength: Critiquing existing work vs. 
creating from nothing</p></li><li><p>AI generates initial automation flow so users can modify rather than build from scratch</p></li></ul><p><strong>Key Insight:</strong> People excel at criticising and improving existing work but struggle to start from nothing. AI provides that initial draft.</p><div><hr></div><h3><strong>2. Proactive AI</strong></h3><p><strong>Philosophy:</strong> AI anticipates next actions and surfaces suggestions contextually.</p><h4><strong>Implementation Examples:</strong></h4><p><strong>a) Contextual Suggestions</strong></p><ul><li><p>User selects table cells</p></li><li><p>AI surfaces the most obvious next action</p></li><li><p>Keyboard shortcut to execute</p></li><li><p>Example: Select cells &#8594; AI suggests rewriting content &#8594; Press shortcut to execute</p></li></ul><p><strong>b) Smart Filtering</strong></p><ul><li><p>Scenario: 100+ job applicants to review</p></li><li><p>AI pre-scores applicants</p></li><li><p>Surfaces top 3 candidates</p></li><li><p>Provides strength/weakness summaries</p></li><li><p>Generates interview questions</p></li><li><p>User focuses on high-value activities (interviewing) rather than screening</p></li></ul><p><strong>c) Autonomous Content Creation</strong></p><ul><li><p>AI learns brand identity and competitor landscape</p></li><li><p>Generates new content pieces daily without prompting</p></li><li><p>User wakes up to 5 new AI-generated video outputs</p></li><li><p>Proactive value delivery while user sleeps</p></li></ul><h4><strong>Key Challenge: Onboarding &amp; Data Collection</strong></h4><ul><li><p>Initial approach: Comprehensive onboarding collecting information manually</p></li><li><p>Problem: A long onboarding adds friction, but too little data produces poor results</p></li><li><p>Solution: Balanced approach</p><ul><li><p>Collect minimum data for first value</p></li><li><p>Continuously gather data during usage</p></li><li><p>Feedback mechanisms: Like/dislike, quality ratings, script preferences</p></li><li><p>Long-term: Close 
the loop with analytics showing ad performance</p></li></ul></li></ul><p><strong>Critical Balance:</strong> Minimise user friction while maximising AI intelligence</p><div><hr></div><h3><strong>3. Task-Driven Workflows</strong></h3><h4><strong>Phase 1: Chat-First Approach</strong></h4><ul><li><p>Assumption: Give users a conversational interface for maximum flexibility</p></li><li><p>Reality: Users didn&#8217;t understand what to type</p></li><li><p>Problem: Too much freedom without guidance</p></li><li><p>Especially challenging for users without marketing expertise</p></li><li><p>Result: Poor user experience, confusion, abandonment</p></li></ul><h4><strong>Phase 2: Guided Workflow</strong></h4><ul><li><p>Broke down required information into structured steps</p></li><li><p>Used AI to enhance each input field</p></li><li><p>Clear progression: Step 1 &#8594; Step 2 &#8594; Output</p></li><li><p>Result:</p><ul><li><p>Shorter time to value</p></li><li><p>Better outcomes</p></li><li><p>Higher user satisfaction</p></li><li><p>More consistent quality</p></li></ul></li></ul><h4><strong>Key Lesson</strong></h4><p>&#8220;We started with uncontrolled freedom...and the result was no one could figure out what to do. Then we moved to a guided approach...and the overall experience was just better.&#8221;</p><h4><strong>Supporting Features:</strong></h4><p><strong>a) Prompt Enhancement</strong></p><ul><li><p>User types a short, simple prompt</p></li><li><p>&#8220;Enhance prompt&#8221; button</p></li><li><p>AI expands it into a sophisticated prompt automatically</p></li><li><p>Better results without prompt engineering knowledge</p></li></ul><p><strong>b) Example Templates</strong></p><ul><li><p>Pre-filled examples to start from</p></li><li><p>Users click, see the result, then iterate</p></li><li><p>Educates users on capabilities through interaction</p></li></ul><p><strong>c) Smart Defaults</strong></p><ul><li><p>Pre-populated based on user type (designer vs. 
founder)</p></li><li><p>User only edits exceptions rather than filling blank forms</p></li><li><p>Reduces cognitive load</p></li></ul><div><hr></div><h3><strong>4. Personalised Software</strong></h3><p><strong>Philosophy:</strong> Software adapts to individual users and teams rather than one-size-fits-all.</p><h4><strong>The Home Analogy</strong></h4><p>&#8220;Your home should be arranged to fit your needs. You shouldn&#8217;t move into a house and have to live with someone else&#8217;s furniture arrangement.&#8221;</p><h4><strong>Personalisation Dimensions:</strong></h4><p><strong>a) Custom Feature Development</strong></p><ul><li><p>Users build features on top of base software</p></li><li><p>Example: Calendly + Stripe integration</p><ul><li><p>Many users want to charge for meeting slots</p></li><li><p>Calendly doesn&#8217;t offer this</p></li><li><p>Vision: User prompts to build Stripe integration themselves</p></li><li><p>Enables non-technical users to extend software</p></li></ul></li></ul><p><strong>b) User-Type Adaptation</strong></p><ul><li><p>Onboarding asks: Designer or Founder?</p></li><li><p><strong>Designer view:</strong> Shows all customisation tools, auto-layout options, advanced features</p></li><li><p><strong>Founder view:</strong> Hides overwhelming technical details, surfaces agentic AI features</p></li><li><p>Software adapts to expertise level</p></li></ul><p><strong>c) Usage Pattern Learning</strong></p><ul><li><p><strong>Keyboard shortcut user:</strong> Hide tooltips, maximise screen space</p></li><li><p><strong>Visual navigation user:</strong> Show labels, icons, clear UI elements</p></li><li><p>AI observes behaviour and adjusts interface accordingly</p></li></ul><p><strong>d) Team-Level Customisation</strong></p><ul><li><p>Company-specific workflows</p></li><li><p>Example: Custom performance review process in Linear/Asana</p></li><li><p>Not in standard software</p></li><li><p>Team prompts tool to build their unique 
workflow</p></li></ul><h4><strong>Critical Challenges Discussed:</strong></h4><p><strong>Challenge 1: Consistency Across Users</strong></p><ul><li><p>Problem: If everyone&#8217;s UI looks different, how do colleagues help each other?</p></li><li><p>Example: On a Zoom call, &#8220;Click top left&#8221; fails when the element sits in a different position for the colleague</p></li><li><p>Potential solution: Team-level rather than individual personalisation</p></li><li><p>Open question: Balance between personalisation and shared understanding</p></li></ul><p><strong>Challenge 2: Guardrails</strong></p><ul><li><p>What can AI adjust vs. what must remain stable?</p></li><li><p>Need clear boundaries for personalisation scope</p></li><li><p>Risk: AI moving critical elements unexpectedly</p></li></ul><p><strong>Challenge 3: Progressive Disclosure</strong></p><ul><li><p>Different users see different features</p></li><li><p>How to maintain product coherence?</p></li><li><p>When does personalisation become fragmentation?</p></li></ul><div><hr></div><h2><strong>Q&amp;A Discussion Highlights</strong></h2><h3><strong>On Data Collection &amp; AI Quality</strong></h3><p><strong>Question:</strong> How much onboarding data is needed for proactive AI?</p><p><strong>Super Scale Experience:</strong></p><ul><li><p>Started with website scraping</p></li><li><p>Added competitor analysis for niche understanding</p></li><li><p>Iteratively removed unnecessary fields</p></li><li><p>Balance: Long onboarding vs. 
quick value</p></li><li><p>Solution: Minimum viable onboarding + continuous learning</p></li><li><p>Feedback loops: Scoring systems, like/dislike, style preferences</p></li><li><p>Future: Analytics closing the loop (ad performance &#8594; agent improvement)</p></li></ul><h3><strong>On User Friction &amp; Context</strong></h3><p><strong>Challenge:</strong> Getting users to provide context without disrupting work</p><p><strong>Strategies:</strong></p><ul><li><p>Automatic data collection where possible</p></li><li><p>Example: Connect accounts and scrape data directly instead of asking for information manually</p></li><li><p>Reduce manual input through intelligent defaults</p></li></ul><p><strong>Key Principle:</strong> Users won&#8217;t write perfect prompts or provide extensive context. Design must accommodate this reality.</p><h3><strong>On Getting Users Started with Chat</strong></h3><p><strong>The &#8220;Empty Chat Box&#8221; Problem:</strong> Users don&#8217;t know what to type when facing a blank chat interface.</p><p><strong>Solutions Implemented:</strong></p><ol><li><p><strong>Prompt Templates Below Chat</strong></p><ul><li><p>Pre-written use cases users can click</p></li><li><p>Shows capabilities through examples</p></li><li><p>Especially important for users unfamiliar with AI tools (e.g., lawyers)</p></li></ul></li><li><p><strong>Structured Input Forms</strong></p><ul><li><p>Guide users through required information</p></li><li><p>AI enhances each field</p></li><li><p>Clear progression to output</p></li></ul></li><li><p><strong>Visual Examples/Galleries</strong></p><ul><li><p>Show what others have created</p></li><li><p>User-generated content inspiration</p></li><li><p>Similar to Notion templates, V0 examples</p></li><li><p>Calibrates expectations and demonstrates value</p></li></ul></li><li><p><strong>Starting Point Generation</strong></p><ul><li><p>User clicks example</p></li><li><p>Gets initial result (may not be perfect)</p></li><li><p>Iterates from 
there</p></li><li><p>Educates on capabilities through interaction</p></li></ul></li></ol><h3><strong>On Feedback Mechanisms</strong></h3><p><strong>Question:</strong> How do you collect meaningful AI feedback beyond thumbs up/down?</p><p><strong>Challenges:</strong></p><ul><li><p>Users rarely provide feedback</p></li><li><p>Thumbs up/down insufficient</p></li><li><p>Need both quantitative and qualitative insights</p></li></ul><p><strong>Approaches:</strong></p><p><strong>Quantitative Tracking:</strong></p><ul><li><p>Time to export/completion</p></li><li><p>Number of regenerations needed</p></li><li><p>Whether users edit prompts after submission (indicates dissatisfaction)</p></li><li><p>Usage of manual editing tools (signals AI output inadequacy)</p></li></ul><p><strong>Qualitative Feedback:</strong></p><ul><li><p>Optional detailed feedback forms</p></li><li><p>&#8220;Why don&#8217;t you like this?&#8221; prompts</p></li><li><p>Script/style preferences</p></li><li><p>Problem: Low user engagement with feedback</p></li></ul><p><strong>Indirect Signals:</strong></p><ul><li><p>If user goes to manual editor after AI generation = AI missed the mark</p></li><li><p>Multiple regenerations = AI not understanding requirements</p></li><li><p>Quick export = successful generation</p></li></ul><p><strong>Challenge:</strong> People are &#8220;annoyed&#8221; by feedback requests (like ChatGPT&#8217;s constant questions about response quality)</p><h3><strong>On Voice Interfaces</strong></h3><p><strong>Context:</strong> Discussion of voice as alternative to chat</p><p><strong>When Voice Works:</strong></p><ul><li><p>Thinking out loud / brainstorming</p></li><li><p>Therapeutic &#8220;dumping&#8221; of thoughts to ChatGPT</p></li><li><p>Quick input capture</p></li><li><p>Natural expression of ideas</p></li></ul><p><strong>When Voice Fails:</strong></p><ul><li><p>Open office environments (privacy, noise)</p></li><li><p>Latency issues</p></li><li><p>Can&#8217;t easily review or edit what 
was said</p></li><li><p>Reading is faster than listening to responses</p></li><li><p>Forces real-time engagement (can&#8217;t pause and think)</p></li></ul><p><strong>Key Insight:</strong> &#8220;You talk quicker than you type, but you read quicker than you speak&#8221; - voice works for input but text is better for output consumption.</p><h3><strong>On Model Comparison &amp; Iteration</strong></h3><p><strong>Cursor Feature Highlight:</strong></p><ul><li><p>Run multiple models in parallel</p></li><li><p>Compare outputs side-by-side</p></li><li><p>Some models more conservative, others more creative</p></li><li><p>Users choose best result</p></li></ul><p><strong>Inline Commenting Approach:</strong></p><ul><li><p>Highlight specific parts of AI output</p></li><li><p>Comment on what needs changing</p></li><li><p>Regenerate specific sections rather than entire output</p></li><li><p>Reduces back-and-forth iterations</p></li></ul><h3><strong>On Progressive Disclosure &amp; User Types</strong></h3><p><strong>Question:</strong> Who decides user type - the person or AI?</p><p><strong>Answer:</strong> AI decides based on:</p><ul><li><p>Onboarding questionnaire responses</p></li><li><p>Observed usage patterns</p></li><li><p>Keyboard shortcut usage vs. 
visual navigation</p></li><li><p>Feature utilisation over time</p></li><li><p>Evolving understanding of user sophistication</p></li></ul><p><strong>Guardrails Discussion:</strong></p><ul><li><p>Need rules for what AI can/cannot adjust</p></li><li><p>Some elements must remain stable</p></li><li><p>Balance adaptability with predictability</p></li></ul><div><hr></div><h2><strong>Case Study: From Chat to Guided Workflow</strong></h2><p><strong>Initial Vision:</strong></p><ul><li><p>Chat interface for video ad creation</p></li><li><p>Assumption: Natural language = better UX</p></li><li><p>&#8220;Create this ad&#8221; &#8594; AI generates video</p></li></ul><p><strong>Problems Encountered:</strong></p><ol><li><p>Users didn&#8217;t understand capabilities</p></li><li><p>No marketing expertise to craft effective prompts</p></li><li><p>Too much freedom = paralysis</p></li><li><p>Prompts too vague for quality output</p></li><li><p>Long wait times followed by disappointment</p></li></ol><p><strong>The Pivot:</strong> Moved to structured, guided flow:</p><ul><li><p>Step 1: Brand information (URL, identity)</p></li><li><p>Step 2: Voice and tone</p></li><li><p>Step 3: Niche analysis (automatic)</p></li><li><p>Step 4: Output preferences</p></li><li><p>AI enhances each input field</p></li><li><p>Clear progression to output</p></li></ul><p><strong>Results:</strong></p><ul><li><p>Dramatically shorter time to value</p></li><li><p>Higher quality outputs</p></li><li><p>Better user satisfaction</p></li><li><p>More predictable results</p></li><li><p>Users understand capabilities through structure</p></li></ul><div><hr></div><h3><em>Signals</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vnSc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6090792-adae-4ac8-8abc-74a69fd1bc37_1200x958.png" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vnSc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6090792-adae-4ac8-8abc-74a69fd1bc37_1200x958.png 424w, https://substackcdn.com/image/fetch/$s_!vnSc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6090792-adae-4ac8-8abc-74a69fd1bc37_1200x958.png 848w, https://substackcdn.com/image/fetch/$s_!vnSc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6090792-adae-4ac8-8abc-74a69fd1bc37_1200x958.png 1272w, https://substackcdn.com/image/fetch/$s_!vnSc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6090792-adae-4ac8-8abc-74a69fd1bc37_1200x958.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vnSc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6090792-adae-4ac8-8abc-74a69fd1bc37_1200x958.png" width="1200" height="958" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a6090792-adae-4ac8-8abc-74a69fd1bc37_1200x958.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:958,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" 
srcset="https://substackcdn.com/image/fetch/$s_!vnSc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6090792-adae-4ac8-8abc-74a69fd1bc37_1200x958.png 424w, https://substackcdn.com/image/fetch/$s_!vnSc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6090792-adae-4ac8-8abc-74a69fd1bc37_1200x958.png 848w, https://substackcdn.com/image/fetch/$s_!vnSc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6090792-adae-4ac8-8abc-74a69fd1bc37_1200x958.png 1272w, https://substackcdn.com/image/fetch/$s_!vnSc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6090792-adae-4ac8-8abc-74a69fd1bc37_1200x958.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://x.com/JonSaadFalcon/status/1988658433374187545">Source</a></figcaption></figure></div><div><hr></div><h3><em>What I&#8217;m Reading</em></h3><p><a href="https://x.com/GavinSBaker/status/1991248768654803337">Some thoughts on AI</a></p><p><a href="https://theoryvc.com/blog-posts/from-context-engineering-to-context-platforms">From Context Engineering to Context Platforms</a></p><p><a href="https://joincolossus.com/article/is-space-investable/">Is Space Investable?</a></p><div><hr></div><h3><em>Earnings Commentary</em></h3><div class="pullquote"><p>Over time, the best enterprises will have seamless data access across many of their data lakes. Whether it&#8217;s their observability data lake, their security data lake, their IT data lake. Because eventually, you want agents to go and go figure out what&#8217;s going on across multiple data lakes and solve your problem. And sometimes problems cross across multiple data lakes, right? If something is down in application, maybe the firewall shut it down, so firewall is in security data lake. 
So if you want this agentic capability across data lakes, all we&#8217;re trying to do is we&#8217;re trying to build the enterprise fabric with our customers.</p><p><strong>Nikesh Arora, Palo Alto Networks Q1 2026 Earnings Call</strong></p></div><div class="pullquote"><p>We now have over 2,450 customers on Elastic Cloud using us for Gen AI use cases with over 370 of these amongst our cohort of customers spending $100,000 or more with us annually, representing nearly 1/4 of our greater than $100,000 ACV customer cohort leveraging Elastic for GenAI use cases.</p><p><strong>Ashutosh Kulkarni, Elastic Q2 2026 Earnings Call</strong></p></div><div><hr></div><p><em>Have any feedback? Email me at akash@earlybird.com.</em></p>]]></content:encoded></item><item><title><![CDATA[Agent Frameworks & Memory Roundtable]]></title><description><![CDATA[With Cloudflare's Matt Carey]]></description><link>https://www.akashbajwa.co/p/agent-frameworks-and-memory-roundtable</link><guid isPermaLink="false">https://www.akashbajwa.co/p/agent-frameworks-and-memory-roundtable</guid><dc:creator><![CDATA[Akash Bajwa]]></dc:creator><pubDate>Fri, 21 Nov 2025 07:01:45 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!JTWt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f9124a2-fef2-429a-9f81-ccead1c71d94_3024x4032.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Software Synthesis analyses the evolution of <strong>software companies in the age of AI</strong> - from how they're built and scaled, to how they go to market and create enduring value. You can reach <strong><a href="https://www.linkedin.com/in/akashbajwa/">me</a></strong> at <strong>akash@earlybird.com</strong>.</em></p><div><hr></div><h3><em>Gradient Descending Roundtables</em></h3><div class="pullquote"><p><strong>November 26th:</strong> <a href="https://luma.com/43yu7r2e">Open 
Source Models with Alibaba Qwen</a></p></div><div><hr></div><p><em>This week, we hosted <a href="https://www.linkedin.com/in/mattzcarey/">Matt Carey</a> from Cloudflare to discuss Agent Frameworks and Memory. Thanks to everyone who came and made the discussion so insightful!</em></p><p><em>I&#8217;m sharing a summary of our discussion below.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JTWt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f9124a2-fef2-429a-9f81-ccead1c71d94_3024x4032.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JTWt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f9124a2-fef2-429a-9f81-ccead1c71d94_3024x4032.jpeg 424w, https://substackcdn.com/image/fetch/$s_!JTWt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f9124a2-fef2-429a-9f81-ccead1c71d94_3024x4032.jpeg 848w, https://substackcdn.com/image/fetch/$s_!JTWt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f9124a2-fef2-429a-9f81-ccead1c71d94_3024x4032.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!JTWt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f9124a2-fef2-429a-9f81-ccead1c71d94_3024x4032.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JTWt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f9124a2-fef2-429a-9f81-ccead1c71d94_3024x4032.jpeg" width="1456" height="1941" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3f9124a2-fef2-429a-9f81-ccead1c71d94_3024x4032.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1941,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2699230,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/179497802?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f9124a2-fef2-429a-9f81-ccead1c71d94_3024x4032.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JTWt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f9124a2-fef2-429a-9f81-ccead1c71d94_3024x4032.jpeg 424w, https://substackcdn.com/image/fetch/$s_!JTWt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f9124a2-fef2-429a-9f81-ccead1c71d94_3024x4032.jpeg 848w, https://substackcdn.com/image/fetch/$s_!JTWt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f9124a2-fef2-429a-9f81-ccead1c71d94_3024x4032.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!JTWt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f9124a2-fef2-429a-9f81-ccead1c71d94_3024x4032.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h3><strong>1. 
Framework Fatigue &amp; The &#8220;Library vs Framework&#8221; Debate</strong></h3><p><strong>Core tension:</strong> Most participants expressed frustration with bloated frameworks</p><ul><li><p>General consensus: &#8220;Everyone&#8217;s trying to build frameworks and products, no one&#8217;s trying to build libraries&#8221;</p></li><li><p>Many have <strong>abandoned complex frameworks</strong> in favour of simpler approaches (using the OpenAI or Anthropic SDK directly)</p></li><li><p>Matt noted this is his <strong>4th agent framework</strong> - each iteration has reduced complexity</p></li></ul><p><strong>Current preferences:</strong></p><ul><li><p>Direct SDK usage (OpenAI/Anthropic) is increasingly popular</p></li><li><p>Cloudflare&#8217;s Agents SDK is seen as a lighter-weight abstraction</p></li><li><p>LangGraph is still used, though one developer noted: &#8220;[I] find myself going back to deterministic workflows&#8221;</p></li></ul><h3><strong>2. Code Mode</strong></h3><p><strong>What is Code Mode?</strong></p><ul><li><p>Generating an SDK from tools, then having the LLM write code against that SDK instead of making direct tool calls</p></li><li><p>Cloudflare&#8217;s implementation uses <strong>dynamic loaders</strong> to run generated code in isolated workers</p></li><li><p>~1ms cold start times on V8 isolates (not full sandboxes)</p></li></ul><p><strong>Key advantages identified:</strong></p><ul><li><p><strong>Massive token efficiency</strong> - compress 30 tools into one code-generation tool</p></li><li><p>Enables <strong>data flow without seeing data</strong> - results pass from one call to the next without entering the model&#8217;s context (like bash pipes)</p></li><li><p>Deterministic execution with compile-time validation</p></li><li><p>Can use minified code since it&#8217;s machine-executed</p></li></ul><p><strong>Open question Matt posed:</strong> Should this be a framework-level abstraction, or should developers implement it themselves?</p><h3><strong>3. 
Memory &amp; Context Management Strategies</strong></h3><p><strong>Minimal RAG adoption:</strong> Only 1-2 participants are using embeddings-based retrieval</p><ul><li><p>One team: &#8220;glob and grep is [an] unreasonably strong baseline&#8221; - hard to beat for code agents</p></li><li><p>When RAG is used: Hybrid approach with knowledge graphs + embeddings</p></li></ul><p><strong>Graph-based approaches:</strong></p><ul><li><p><strong>Neo4j implementation</strong> by Cisco team for platform engineering</p></li><li><p>Using LLMs to build knowledge graphs from documents</p></li><li><p>Challenges: injection attack concerns, access control complexity</p></li><li><p>Pattern: Find relevant node via embeddings &#8594; k-nearest-neighbour traversal</p></li></ul><p><strong>&#8220;Predictive context loading&#8221;</strong> - Most novel approach discussed:</p><ul><li><p>Track agent behaviour patterns across evals</p></li><li><p><strong>Pre-load context</strong> based on statistical patterns (e.g., &#8220;if agent touches file X, 90% chance it needs files Y, Z next&#8221;)</p></li><li><p>Compared to web prefetching and autocorrect</p></li><li><p>&#8220;Old school ML&#8221; middle layer between agent and tools</p></li></ul><h3><strong>4. Session Management &amp; Sandboxing</strong></h3><p><strong>Pain points with the Claude Code SDK:</strong></p><ul><li><p>&#8220;Very insistent on file system&#8221; - hard to extract/resume sessions</p></li><li><p>Teams need to fork agents thousands of times over months</p></li><li><p>Teams built custom session managers to work around these limitations</p></li></ul><p><strong>Current approaches:</strong></p><ul><li><p><strong>Micro-VMs</strong> (e.g., Firecracker via E2B) for code execution</p></li><li><p>Cloudflare&#8217;s <strong>Durable Objects</strong> - &#8220;distributed little objects with SQLite store&#8221;</p></li><li><p>First iteration: &#8220;Durable object as agent was a one-liner&#8221;</p></li></ul><h3><strong>5. 
Tracing</strong></h3><p><strong>Unanimous pain point:</strong> Existing tools (LangSmith, Langfuse, etc.) are inadequate</p><p><strong>Key limitations identified:</strong></p><ul><li><p>Built for simple LLM calls, not complex agent traces</p></li><li><p>Can&#8217;t visualise tree searches or test-time scaling</p></li><li><p>Fail for sessions spanning months with thousands of forks</p></li><li><p>&#8220;Not the same as distributed tracing for microservices&#8221;</p></li></ul><p><strong>Solutions:</strong></p><ul><li><p>Teams building custom tracing UIs</p></li><li><p>Atla fine-tuned a model specifically for analysing agent traces against rubrics</p></li><li><p>Cloudflare just released tracing for Workers</p></li></ul><h3><strong>6. Optimisation Strategies &amp; Architectural Patterns</strong></h3><p><strong>Tool design:</strong></p><ul><li><p>Debate: Many simple tools vs. few complex tools with parameters</p></li><li><p>Calling LLMs inside tools &#8220;becomes painfully slow&#8221;</p></li><li><p>Context reduction: Some teams use separate models to filter tools for relevance before the main agent runs</p></li></ul><p><strong>Multi-agent vs. Single-agent evolution:</strong></p><ul><li><p>Pattern: Teams started with &#8220;orchestrator + investigators + verifiers&#8221;</p></li><li><p>Newer models are collapsing this: &#8220;Just pass everything to coding SDK - it&#8217;s comparable&#8221;</p></li><li><p>&#8220;Handoffs are much faster&#8221; than complex tool returns</p></li></ul><p><strong>Prompt engineering shifts:</strong></p><ul><li><p>&#8220;A year ago building complicated prompt workflows... now just &#8216;repo tool and guide&#8217;&#8221;</p></li><li><p>Less manual XML construction, more reliance on model intelligence</p></li></ul><h3><strong>7. 
Production Patterns</strong></h3><p><strong>Pre-determined flows for common patterns:</strong></p><ul><li><p>Recruiting agent example: Binary tree of prompts based on user requirements</p></li><li><p>Generate &#8220;master prompt&#8221; from selected sub-prompts</p></li><li><p>Reduces hallucination for structured interactions</p></li></ul><p><strong>Latency optimisation:</strong></p><ul><li><p>Parallel tool calls causing provider rate limits/timeouts</p></li><li><p>Hard to control with external APIs</p></li><li><p>Moving toward edge execution for speed</p></li></ul><h3><strong>8. MCP Discussion</strong></h3><p><strong>When MCP makes sense:</strong></p><ul><li><p>Remote tool servers where contract flexibility matters</p></li><li><p>Tools that need to adapt per-user/session</p></li><li><p>&#8220;Much better transport layer than API&#8221; for dynamic use cases</p></li></ul><p><strong>When NOT to use MCP:</strong></p><ul><li><p>Local execution: &#8220;Never suggest MCP server + client on same machine - pointless&#8221;</p></li><li><p>Well-documented TypeScript APIs: &#8220;Just use the API directly&#8221;</p></li><li><p>Code mode might reduce MCP need, though Matt sees them as complementary</p></li></ul><p><strong>Critical distinction:</strong> MCP = discovery + transport, not execution. 
Code mode = execution optimisation.</p><h3><strong>Security Considerations</strong></h3><ul><li><p>Injection attacks for graph queries (Neo4j)</p></li><li><p>Static analysis on generated tool code</p></li><li><p>Cloudflare&#8217;s isolated execution model mitigates many concerns</p></li><li><p>Access control in multi-tenant knowledge graphs</p></li></ul><h3>Further Reading</h3><ul><li><p><a href="https://vintagedata.org/blog/posts/model-is-the-product">The Model Is The Product</a></p></li><li><p><a href="https://jxnl.co/writing/2025/08/27/facets-context-engineering/#why-is-search-quality-your-ceiling">Beyond Chunks: Why Context Engineering is the Future of RAG</a></p></li><li><p><a href="https://blog.cloudflare.com/code-mode/">Code Mode: the better way to use MCP</a></p></li></ul><div><hr></div><p><em>Have any feedback? Email me at akash@earlybird.com.</em></p>]]></content:encoded></item><item><title><![CDATA[Application Software Is Dead, Again]]></title><description><![CDATA[Data Fabrics, Agent Labs, Moving Up And Down The Stack]]></description><link>https://www.akashbajwa.co/p/application-software-is-dead-again</link><guid isPermaLink="false">https://www.akashbajwa.co/p/application-software-is-dead-again</guid><dc:creator><![CDATA[Akash Bajwa]]></dc:creator><pubDate>Mon, 17 Nov 2025 07:02:12 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!nLr1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7255ff4-694d-48fd-ac10-4b22a3a730f5_680x309.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Software Synthesis analyses the evolution of <strong>software companies in the age of AI</strong> - from how they're built and scaled, to how they go to market and create enduring value. 
You can reach <strong><a href="https://www.linkedin.com/in/akashbajwa/">me</a></strong> at <strong>akash@earlybird.com</strong>.</em></p><div><hr></div><h3><em>Gradient Descending Roundtables in London</em></h3><div class="pullquote"><p><strong>November 18th:</strong> <strong><a href="https://luma.com/9475fjje">Agent Frameworks &amp; Memory with Cloudflare</a></strong></p><p><strong>November 19th:</strong> <strong><a href="https://luma.com/zq485ytc">Designing AI-Native Software with SPACING</a></strong></p><p><strong>November 26th</strong>: <strong><a href="https://luma.com/43yu7r2e">Open Source AI with Alibaba Qwen</a></strong></p></div><blockquote><div><hr></div></blockquote><div class="pullquote"><p><strong>December 4th: <a href="https://luma.com/uk1t92xl">The Paper Club: AI Wrapped 2025</a></strong></p><p>Reinforcement Learning and Multimodal Models with Zoe (Dawn Capital) and Doubleword.ai</p></div><div><hr></div><h2>Application Software Is Dead, Again</h2><p>Three years in, the debate over value accrual in AI has only intensified as the model labs have transformed into product companies capable of shipping <em>excellent</em> consumer and developer-facing products at a clip unmatched by incumbents of the 
past.</p><p>The debate was reignited on X this week.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://x.com/yishan/status/1987787127204249824" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!K2xO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F890e99e0-99d3-442f-bbd5-36fbe858a06f_597x696.png 424w, https://substackcdn.com/image/fetch/$s_!K2xO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F890e99e0-99d3-442f-bbd5-36fbe858a06f_597x696.png 848w, https://substackcdn.com/image/fetch/$s_!K2xO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F890e99e0-99d3-442f-bbd5-36fbe858a06f_597x696.png 1272w, https://substackcdn.com/image/fetch/$s_!K2xO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F890e99e0-99d3-442f-bbd5-36fbe858a06f_597x696.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!K2xO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F890e99e0-99d3-442f-bbd5-36fbe858a06f_597x696.png" width="597" height="696" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/890e99e0-99d3-442f-bbd5-36fbe858a06f_597x696.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:696,&quot;width&quot;:597,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:158109,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://x.com/yishan/status/1987787127204249824&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/179006682?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F890e99e0-99d3-442f-bbd5-36fbe858a06f_597x696.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!K2xO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F890e99e0-99d3-442f-bbd5-36fbe858a06f_597x696.png 424w, https://substackcdn.com/image/fetch/$s_!K2xO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F890e99e0-99d3-442f-bbd5-36fbe858a06f_597x696.png 848w, https://substackcdn.com/image/fetch/$s_!K2xO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F890e99e0-99d3-442f-bbd5-36fbe858a06f_597x696.png 1272w, https://substackcdn.com/image/fetch/$s_!K2xO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F890e99e0-99d3-442f-bbd5-36fbe858a06f_597x696.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There was a <a href="https://x.com/yishan/status/1988032751501799773?s=20">thoughtful follow-up from the author</a>, which roughly amounts to this: past technology cycles gave startups more time to create and capture value, whereas in AI the model layer changes every 9-12 months, leaving startups an impossibly short window to outrun obsolescence by building enterprise relationships and an enduring brand.</p><p>The discourse on application software is of course one part of a broader discussion on how software is fundamentally changing, and, importantly, over <strong>what time horizon.</strong></p><h3>Agents Built On The Lakehouse</h3><p>One view of the future, <a href="https://www.akashbajwa.co/p/the-future-of-application-software">which I&#8217;ve written about 
before</a>, is that we&#8217;re marching towards enterprises preparing their data estates for economies of agents to perform work end-to-end.</p><p>The unbundling and bundling of the modern stack is instructive.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nLr1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7255ff4-694d-48fd-ac10-4b22a3a730f5_680x309.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nLr1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7255ff4-694d-48fd-ac10-4b22a3a730f5_680x309.jpeg 424w, https://substackcdn.com/image/fetch/$s_!nLr1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7255ff4-694d-48fd-ac10-4b22a3a730f5_680x309.jpeg 848w, https://substackcdn.com/image/fetch/$s_!nLr1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7255ff4-694d-48fd-ac10-4b22a3a730f5_680x309.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!nLr1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7255ff4-694d-48fd-ac10-4b22a3a730f5_680x309.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nLr1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7255ff4-694d-48fd-ac10-4b22a3a730f5_680x309.jpeg" width="680" height="309" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b7255ff4-694d-48fd-ac10-4b22a3a730f5_680x309.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:309,&quot;width&quot;:680,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!nLr1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7255ff4-694d-48fd-ac10-4b22a3a730f5_680x309.jpeg 424w, https://substackcdn.com/image/fetch/$s_!nLr1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7255ff4-694d-48fd-ac10-4b22a3a730f5_680x309.jpeg 848w, https://substackcdn.com/image/fetch/$s_!nLr1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7255ff4-694d-48fd-ac10-4b22a3a730f5_680x309.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!nLr1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7255ff4-694d-48fd-ac10-4b22a3a730f5_680x309.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: Morgan Stanley</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LgOA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea9f744-748a-4b48-bcd4-48c0b97c300f_2000x1236.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LgOA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea9f744-748a-4b48-bcd4-48c0b97c300f_2000x1236.png 424w, https://substackcdn.com/image/fetch/$s_!LgOA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea9f744-748a-4b48-bcd4-48c0b97c300f_2000x1236.png 848w, 
https://substackcdn.com/image/fetch/$s_!LgOA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea9f744-748a-4b48-bcd4-48c0b97c300f_2000x1236.png 1272w, https://substackcdn.com/image/fetch/$s_!LgOA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea9f744-748a-4b48-bcd4-48c0b97c300f_2000x1236.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LgOA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea9f744-748a-4b48-bcd4-48c0b97c300f_2000x1236.png" width="1456" height="900" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aea9f744-748a-4b48-bcd4-48c0b97c300f_2000x1236.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:900,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LgOA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea9f744-748a-4b48-bcd4-48c0b97c300f_2000x1236.png 424w, https://substackcdn.com/image/fetch/$s_!LgOA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea9f744-748a-4b48-bcd4-48c0b97c300f_2000x1236.png 848w, 
https://substackcdn.com/image/fetch/$s_!LgOA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea9f744-748a-4b48-bcd4-48c0b97c300f_2000x1236.png 1272w, https://substackcdn.com/image/fetch/$s_!LgOA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea9f744-748a-4b48-bcd4-48c0b97c300f_2000x1236.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://a16z.com/emerging-architectures-for-modern-data-infrastructure/">Source: 
a16z</a></figcaption></figure></div><p>As <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Ethan Ding&quot;,&quot;id&quot;:32952605,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/73e4cfcc-f919-4719-b77e-76e1eb2e35c4_414x375.png&quot;,&quot;uuid&quot;:&quot;43d1f336-8cc2-43fd-b3fe-1459cf95007b&quot;}" data-component-name="MentionToDOM">Ethan Ding</span> wrote on the <a href="https://ethanding.substack.com/p/fivetran-and-dbt-gear-up-for-war">Fivetran and dbt merger</a>:</p><blockquote><p><em>every $1 to dbt means $10 to snowflake. i&#8217;ve heard numerous snowflake account execs confirm that while dbt only recently crossed $100m in revenue, they&#8217;re driving $500m in snowflake costs. which makes snowflake reps love to sell dbt in all of their transactions. this caused the dbt community to grow massively. which means snowflake account execs were basically dbt&#8217;s commission-only sales team. except instead of splitting the money, dbt got 2% and snowflake kept 98%. very generous partnership structure.</em></p></blockquote><p>Most of the dollars spent on data, and most of the value, always accrued to Databricks and Snowflake, where the computing and querying happened. For a while, they were happy to support an ecosystem of adjacent tooling, so long as those upstream and downstream tools drove consumption. 
</p><p>Eventually, the Lakehouses decided to consolidate the sprawling landscape of data infrastructure vendors on the premise of a lower TCO for their customers (fewer FTEs needed to manage different vendors).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nzMN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9682d17-289b-400b-9800-18c68e083d48_793x472.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nzMN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9682d17-289b-400b-9800-18c68e083d48_793x472.png 424w, https://substackcdn.com/image/fetch/$s_!nzMN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9682d17-289b-400b-9800-18c68e083d48_793x472.png 848w, https://substackcdn.com/image/fetch/$s_!nzMN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9682d17-289b-400b-9800-18c68e083d48_793x472.png 1272w, https://substackcdn.com/image/fetch/$s_!nzMN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9682d17-289b-400b-9800-18c68e083d48_793x472.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nzMN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9682d17-289b-400b-9800-18c68e083d48_793x472.png" width="793" height="472" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c9682d17-289b-400b-9800-18c68e083d48_793x472.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:472,&quot;width&quot;:793,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:221246,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/179006682?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9682d17-289b-400b-9800-18c68e083d48_793x472.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nzMN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9682d17-289b-400b-9800-18c68e083d48_793x472.png 424w, https://substackcdn.com/image/fetch/$s_!nzMN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9682d17-289b-400b-9800-18c68e083d48_793x472.png 848w, https://substackcdn.com/image/fetch/$s_!nzMN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9682d17-289b-400b-9800-18c68e083d48_793x472.png 1272w, https://substackcdn.com/image/fetch/$s_!nzMN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9682d17-289b-400b-9800-18c68e083d48_793x472.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: Matt Turck&#8217;s MAD 2024 Landscape, Morgan Stanley</figcaption></figure></div><p>This view of the world would see enterprises continue to embrace open table formats, consolidate their data and store it cheaply, and, <strong>most importantly</strong>, decide what query engine to use.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Cc6e!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!Cc6e!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png 424w, https://substackcdn.com/image/fetch/$s_!Cc6e!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png 848w, https://substackcdn.com/image/fetch/$s_!Cc6e!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png 1272w, https://substackcdn.com/image/fetch/$s_!Cc6e!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Cc6e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png" width="807" height="265" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:265,&quot;width&quot;:807,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:80912,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/179006682?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Cc6e!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png 424w, https://substackcdn.com/image/fetch/$s_!Cc6e!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png 848w, https://substackcdn.com/image/fetch/$s_!Cc6e!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png 1272w, https://substackcdn.com/image/fetch/$s_!Cc6e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a8a5da6-7d38-4a19-8cc1-2621270b318c_807x265.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The same reasoning explains why Salesforce acquired Informatica, ServiceNow is focusing on its Data Fabric, and SAP announced a partnership with Databricks: the goal is to help customers consolidate their data and <strong>perform work in your environment.</strong></p><p>The only difference is that instead of humans performing the queries, eventually <strong>it&#8217;ll be agents</strong>. </p><p>Tooling for observability, governance, analytics, and security will still be necessary before enterprises are comfortable deploying agents with real autonomy across their data estate. </p><p>For this view of the world to prevail, the customer&#8217;s business logic or domain-specific reasoning will first need to be codified (whether that&#8217;s packaged by a vertical AI vendor focused on domain-specific reasoning, captured by a Palantir FDE sitting with the customer&#8217;s subject matter experts, or learned by processing the customer&#8217;s data corpora).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LPdJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8f1738-9cc6-4922-bf19-f4dc2da50e48_1460x1490.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LPdJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8f1738-9cc6-4922-bf19-f4dc2da50e48_1460x1490.png 424w, 
https://substackcdn.com/image/fetch/$s_!LPdJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8f1738-9cc6-4922-bf19-f4dc2da50e48_1460x1490.png 848w, https://substackcdn.com/image/fetch/$s_!LPdJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8f1738-9cc6-4922-bf19-f4dc2da50e48_1460x1490.png 1272w, https://substackcdn.com/image/fetch/$s_!LPdJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8f1738-9cc6-4922-bf19-f4dc2da50e48_1460x1490.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LPdJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8f1738-9cc6-4922-bf19-f4dc2da50e48_1460x1490.png" width="1456" height="1486" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ea8f1738-9cc6-4922-bf19-f4dc2da50e48_1460x1490.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1486,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LPdJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8f1738-9cc6-4922-bf19-f4dc2da50e48_1460x1490.png 424w, 
https://substackcdn.com/image/fetch/$s_!LPdJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8f1738-9cc6-4922-bf19-f4dc2da50e48_1460x1490.png 848w, https://substackcdn.com/image/fetch/$s_!LPdJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8f1738-9cc6-4922-bf19-f4dc2da50e48_1460x1490.png 1272w, https://substackcdn.com/image/fetch/$s_!LPdJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea8f1738-9cc6-4922-bf19-f4dc2da50e48_1460x1490.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: <a href="https://ethanding.substack.com/p/fivetran-and-dbt-gear-up-for-war">Ethan Ding</a></figcaption></figure></div><p>That codification has to happen across hundreds or thousands of processes, which is why the idea of an equally large number of small language models, each tuned to a specific process (<a href="https://docs.google.com/document/d/1HllKKBULFOAe8e780ISuaUbqKAR8KjRc7XTzO69MOcs/edit?tab=t.0">running locally at little to no inference cost/COGS</a>), makes so much sense.</p><h3>AI&#8217;s Diffusion Rate</h3><p>As plausible as it sounds, this scenario is still many, many years away, for two key reasons.</p><p>First, although AI is diffusing at an unprecedented rate for such a consequential technology, the change management required for enterprises to transform how they operate in a post-AI world is greater than anything that&#8217;s come before. </p><p><a href="https://www.youtube.com/watch?v=8-boBsWcr5A&amp;t=2286s&amp;pp=ygUIZHdhcmtlc2g%3D">Satya Nadella opined on the future of software in an interview with Dwarkesh Patel and Dylan Patel</a>:</p><blockquote><p><em>Our business, which today is an <strong>end user tools business,</strong> will become essentially an <strong>infrastructure business in support of agents doing work</strong>.</em></p></blockquote><p>However, he also cautioned:</p><blockquote><p><em>What took 70 years, maybe 100, 150 years for the industrial revolution may happen in 20 years, 25 years. I would love to compress what happened in 200 years of the industrial revolution into a 20-year period if you&#8217;re lucky.</em></p><p><em>Even if the tech is diffusing fast this time around, for true economic growth to appear, it has to sort of diffuse to a point where the work, the work artifact and the workflow has to change. 
And so that&#8217;s kind of one place where I think the change management required for a corporation to truly change, I think is something we shouldn&#8217;t discount.</em></p></blockquote><p>The second key constraint is the maturity of agents themselves, which we should measure on a time horizon of roughly <strong>a decade</strong> rather than a few years, <a href="https://www.youtube.com/watch?v=BlVnGXEzFow">as Andrej Karpathy has argued</a>:</p><blockquote><p><em>When you&#8217;re talking about an agent, or what the labs have in mind and what maybe I have in mind as well, you should think of it almost like an employee or like an intern that you would hire to work with you. For example, you work with some employees here. When would you prefer to have an agent like Claude or Codex do that work? Currently, of course, they can&#8217;t. What would it take for them to be able to do that? Why don&#8217;t you do it today? </em></p><p><em>The reason you don&#8217;t do it today is because they just don&#8217;t work. They don&#8217;t have enough intelligence, they&#8217;re not multimodal enough, they can&#8217;t do computer use and all this kind of stuff. And they don&#8217;t do a lot of the things that you&#8217;ve alluded to earlier. You know, they don&#8217;t have continued learning. You can&#8217;t just tell them something and they&#8217;ll remember it. And they&#8217;re just cognitively lacking and it&#8217;s just not working. 
And I just think that it will take about a decade to work through all of those issues.</em></p></blockquote><p>None of this is to understate the diffusion rate of AI among consumers, where we&#8217;re still in the early innings of building AI-native experiences, let alone experiences for new computing interfaces like wearables.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FN6X!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90928cd2-6855-40c9-863e-9cb5834b4e8f_1438x621.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FN6X!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90928cd2-6855-40c9-863e-9cb5834b4e8f_1438x621.png 424w, https://substackcdn.com/image/fetch/$s_!FN6X!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90928cd2-6855-40c9-863e-9cb5834b4e8f_1438x621.png 848w, https://substackcdn.com/image/fetch/$s_!FN6X!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90928cd2-6855-40c9-863e-9cb5834b4e8f_1438x621.png 1272w, https://substackcdn.com/image/fetch/$s_!FN6X!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90928cd2-6855-40c9-863e-9cb5834b4e8f_1438x621.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FN6X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90928cd2-6855-40c9-863e-9cb5834b4e8f_1438x621.png" width="1438" height="621" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/90928cd2-6855-40c9-863e-9cb5834b4e8f_1438x621.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:621,&quot;width&quot;:1438,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:318131,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/179006682?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90928cd2-6855-40c9-863e-9cb5834b4e8f_1438x621.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FN6X!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90928cd2-6855-40c9-863e-9cb5834b4e8f_1438x621.png 424w, https://substackcdn.com/image/fetch/$s_!FN6X!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90928cd2-6855-40c9-863e-9cb5834b4e8f_1438x621.png 848w, https://substackcdn.com/image/fetch/$s_!FN6X!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90928cd2-6855-40c9-863e-9cb5834b4e8f_1438x621.png 1272w, https://substackcdn.com/image/fetch/$s_!FN6X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90928cd2-6855-40c9-863e-9cb5834b4e8f_1438x621.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: Avenir</figcaption></figure></div><p>Computer-using agents will no doubt soon be reliable for common consumer use cases like travel and shopping, with the enabling infrastructure for agentic commerce quickly coming together, further accelerating adoption. 
The consumerisation of enterprise software will continue to be a tailwind for AI penetration, and <a href="https://www.akashbajwa.co/p/second-order-effects-of-ai-in-vertical">AI will</a> catalyse adoption among SMBs in verticals that have so far resisted software eating the world.</p><p><strong>The ground truth is that over half of enterprises have yet to put their first AI project into production.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RG9z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a8db07-9ca1-44bf-a0e0-e648d2c3b0a4_911x452.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RG9z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a8db07-9ca1-44bf-a0e0-e648d2c3b0a4_911x452.png 424w, https://substackcdn.com/image/fetch/$s_!RG9z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a8db07-9ca1-44bf-a0e0-e648d2c3b0a4_911x452.png 848w, https://substackcdn.com/image/fetch/$s_!RG9z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a8db07-9ca1-44bf-a0e0-e648d2c3b0a4_911x452.png 1272w, https://substackcdn.com/image/fetch/$s_!RG9z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a8db07-9ca1-44bf-a0e0-e648d2c3b0a4_911x452.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!RG9z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a8db07-9ca1-44bf-a0e0-e648d2c3b0a4_911x452.png" width="911" height="452" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84a8db07-9ca1-44bf-a0e0-e648d2c3b0a4_911x452.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:452,&quot;width&quot;:911,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:56140,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/179006682?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9dc08af-1dc2-4f32-ad5b-8d157e3e547e_911x526.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RG9z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a8db07-9ca1-44bf-a0e0-e648d2c3b0a4_911x452.png 424w, https://substackcdn.com/image/fetch/$s_!RG9z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a8db07-9ca1-44bf-a0e0-e648d2c3b0a4_911x452.png 848w, https://substackcdn.com/image/fetch/$s_!RG9z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a8db07-9ca1-44bf-a0e0-e648d2c3b0a4_911x452.png 1272w, https://substackcdn.com/image/fetch/$s_!RG9z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a8db07-9ca1-44bf-a0e0-e648d2c3b0a4_911x452.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a></figure></div><p>Microsoft is expected to be the biggest share gainer and beneficiary of AI workloads.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aU-b!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45640ad2-cfc7-4971-94bc-9c4f6d01465d_1770x718.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!aU-b!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45640ad2-cfc7-4971-94bc-9c4f6d01465d_1770x718.png 424w, https://substackcdn.com/image/fetch/$s_!aU-b!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45640ad2-cfc7-4971-94bc-9c4f6d01465d_1770x718.png 848w, https://substackcdn.com/image/fetch/$s_!aU-b!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45640ad2-cfc7-4971-94bc-9c4f6d01465d_1770x718.png 1272w, https://substackcdn.com/image/fetch/$s_!aU-b!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45640ad2-cfc7-4971-94bc-9c4f6d01465d_1770x718.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aU-b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45640ad2-cfc7-4971-94bc-9c4f6d01465d_1770x718.png" width="1456" height="591" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/45640ad2-cfc7-4971-94bc-9c4f6d01465d_1770x718.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:591,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:96313,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/179006682?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45640ad2-cfc7-4971-94bc-9c4f6d01465d_1770x718.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" 
alt="" srcset="https://substackcdn.com/image/fetch/$s_!aU-b!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45640ad2-cfc7-4971-94bc-9c4f6d01465d_1770x718.png 424w, https://substackcdn.com/image/fetch/$s_!aU-b!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45640ad2-cfc7-4971-94bc-9c4f6d01465d_1770x718.png 848w, https://substackcdn.com/image/fetch/$s_!aU-b!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45640ad2-cfc7-4971-94bc-9c4f6d01465d_1770x718.png 1272w, https://substackcdn.com/image/fetch/$s_!aU-b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45640ad2-cfc7-4971-94bc-9c4f6d01465d_1770x718.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Source: Morgan Stanley</figcaption></figure></div><p>There are many reasons behind this, but change management and model capabilities are the two biggest structural ones.</p><p>There are already use cases where model capabilities meet the minimum viable quality threshold and where outsourcing was the default, such as <strong>customer support</strong>, and large companies are already being built there.</p><p>The breakout companies in those verticals, though, are not cleanly defined as application or infrastructure companies; <strong>they&#8217;re both</strong>.</p><h3>Agent Labs</h3><p>I&#8217;ve <a href="https://www.akashbajwa.co/p/ai-apps-agent-labs">written before</a> about how applied AI companies closest to the end user earn the right to collect valuable training/reward data. That data can then be used to tackle gnarly infrastructure/research problems, helping AI products cover the last mile in the enterprise, which has always been the longest.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://x.com/sarahcat21/status/1987924298234032341" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YcVt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4db90ffd-30d0-4f9b-ba28-0bac4015e0cd_597x293.png 424w, 
https://substackcdn.com/image/fetch/$s_!YcVt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4db90ffd-30d0-4f9b-ba28-0bac4015e0cd_597x293.png 848w, https://substackcdn.com/image/fetch/$s_!YcVt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4db90ffd-30d0-4f9b-ba28-0bac4015e0cd_597x293.png 1272w, https://substackcdn.com/image/fetch/$s_!YcVt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4db90ffd-30d0-4f9b-ba28-0bac4015e0cd_597x293.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YcVt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4db90ffd-30d0-4f9b-ba28-0bac4015e0cd_597x293.png" width="597" height="293" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4db90ffd-30d0-4f9b-ba28-0bac4015e0cd_597x293.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:293,&quot;width&quot;:597,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:68989,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://x.com/sarahcat21/status/1987924298234032341&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/179006682?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4db90ffd-30d0-4f9b-ba28-0bac4015e0cd_597x293.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!YcVt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4db90ffd-30d0-4f9b-ba28-0bac4015e0cd_597x293.png 424w, https://substackcdn.com/image/fetch/$s_!YcVt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4db90ffd-30d0-4f9b-ba28-0bac4015e0cd_597x293.png 848w, https://substackcdn.com/image/fetch/$s_!YcVt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4db90ffd-30d0-4f9b-ba28-0bac4015e0cd_597x293.png 1272w, https://substackcdn.com/image/fetch/$s_!YcVt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4db90ffd-30d0-4f9b-ba28-0bac4015e0cd_597x293.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The customer support market is an early indicator of this: research engineers/scientists are becoming a more common hire to train custom models (e.g. reranker models) that drive higher resolution rates, one of the key metrics AI startups are benchmarked on.</p><p>I&#8217;d predict that the title &#8216;<strong>Member of Technical Staff</strong>&#8217; will become much more common at companies that have historically been deemed &#8216;application software&#8217; companies.</p><p>In this view of the world, the lines between layers of the AI stack keep blurring, perhaps even more than they already are in the current cohort of AI companies. Model labs will continue moving up the stack, whilst agent labs will relentlessly move down the stack to capture more margin.</p><p>An analysis I&#8217;ve been wanting to do for a while is to look at the attach rate of model lab products to get a better sense of whether the right to win that is typically ascribed to the labs is justified. Of course, IBM, Microsoft, and Google were monoliths in different technology paradigms that had ambitions to win in as many markets as possible.</p><p>The notion that this time is different <em>is</em> appealing.</p><p>From first principles, models continue to advance as a function of the data we can provide them, with the focus shifting to reasoning data. The labs will inevitably go after select verticals where marketplaces like Mercor make the acquisition of expert reasoning data easier than ever.
</p><p>In vertical software, however, the heterogeneity of workflows, reasoning steps, data sets, integrations and GTM channels would make it very difficult for the model labs to execute against each opportunity with the level of focus that a dedicated agent lab can bring.</p><p>In horizontal software, being multi-model is a strength: a portfolio of models can be measured against different criteria for specific use cases.</p><p>As agents become capable of handling existing workflows end-to-end, the best companies of the future will leverage AI to define new ways of creating value that we can only imagine today.</p><p>We won&#8217;t have to wait long for that future, as the revenue ramps of Cursor, Sierra, Cognition, and various other companies are already proving. After all, some of the biggest companies in recent memory have built large businesses selling to other tech companies.</p><p>These customers embrace the future and build their technology stacks from the ground up for performance.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!owBG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F691955e8-e21f-44d3-8cb6-e0553bda186a_242x333.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!owBG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F691955e8-e21f-44d3-8cb6-e0553bda186a_242x333.png 424w, https://substackcdn.com/image/fetch/$s_!owBG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F691955e8-e21f-44d3-8cb6-e0553bda186a_242x333.png 848w, 
https://substackcdn.com/image/fetch/$s_!owBG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F691955e8-e21f-44d3-8cb6-e0553bda186a_242x333.png 1272w, https://substackcdn.com/image/fetch/$s_!owBG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F691955e8-e21f-44d3-8cb6-e0553bda186a_242x333.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!owBG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F691955e8-e21f-44d3-8cb6-e0553bda186a_242x333.png" width="378" height="520.1404958677685" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/691955e8-e21f-44d3-8cb6-e0553bda186a_242x333.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:333,&quot;width&quot;:242,&quot;resizeWidth&quot;:378,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;The image is a screenshot of a table from Jared Sleeper's post on social media, detailing the 'tech concentration' of various well-known companies. The table lists 15 companies, including Gong, Workato, Figma, and others, alongside their respective percentages of customers that are high-growth tech companies. The highest tech concentration is for Gong at 54%, while the lowest is for Benchling at 1.6%. This data was generated using Savoir, Avenir's internal analytics platform, which Jared Sleeper has been enhancing. 
The context provided by the post text explains that this measure is likely an undercount due to the methodology used.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="The image is a screenshot of a table from Jared Sleeper's post on social media, detailing the 'tech concentration' of various well-known companies. The table lists 15 companies, including Gong, Workato, Figma, and others, alongside their respective percentages of customers that are high-growth tech companies. The highest tech concentration is for Gong at 54%, while the lowest is for Benchling at 1.6%. This data was generated using Savoir, Avenir's internal analytics platform, which Jared Sleeper has been enhancing. The context provided by the post text explains that this measure is likely an undercount due to the methodology used." title="The image is a screenshot of a table from Jared Sleeper's post on social media, detailing the 'tech concentration' of various well-known companies. The table lists 15 companies, including Gong, Workato, Figma, and others, alongside their respective percentages of customers that are high-growth tech companies. The highest tech concentration is for Gong at 54%, while the lowest is for Benchling at 1.6%. This data was generated using Savoir, Avenir's internal analytics platform, which Jared Sleeper has been enhancing. The context provided by the post text explains that this measure is likely an undercount due to the methodology used." 
srcset="https://substackcdn.com/image/fetch/$s_!owBG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F691955e8-e21f-44d3-8cb6-e0553bda186a_242x333.png 424w, https://substackcdn.com/image/fetch/$s_!owBG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F691955e8-e21f-44d3-8cb6-e0553bda186a_242x333.png 848w, https://substackcdn.com/image/fetch/$s_!owBG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F691955e8-e21f-44d3-8cb6-e0553bda186a_242x333.png 1272w, https://substackcdn.com/image/fetch/$s_!owBG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F691955e8-e21f-44d3-8cb6-e0553bda186a_242x333.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://x.com/JaredSleeper/status/1914407853961679309">Source</a></figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://x.com/amywumartin/status/1987964497370509358?s=20" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zv12!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F792020cc-7184-4c3e-9750-d546447429e6_598x428.png 424w, https://substackcdn.com/image/fetch/$s_!zv12!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F792020cc-7184-4c3e-9750-d546447429e6_598x428.png 848w, https://substackcdn.com/image/fetch/$s_!zv12!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F792020cc-7184-4c3e-9750-d546447429e6_598x428.png 1272w, https://substackcdn.com/image/fetch/$s_!zv12!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F792020cc-7184-4c3e-9750-d546447429e6_598x428.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zv12!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F792020cc-7184-4c3e-9750-d546447429e6_598x428.png" width="598" height="428" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/792020cc-7184-4c3e-9750-d546447429e6_598x428.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:428,&quot;width&quot;:598,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:78535,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://x.com/amywumartin/status/1987964497370509358?s=20&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/179006682?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F792020cc-7184-4c3e-9750-d546447429e6_598x428.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zv12!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F792020cc-7184-4c3e-9750-d546447429e6_598x428.png 424w, https://substackcdn.com/image/fetch/$s_!zv12!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F792020cc-7184-4c3e-9750-d546447429e6_598x428.png 848w, https://substackcdn.com/image/fetch/$s_!zv12!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F792020cc-7184-4c3e-9750-d546447429e6_598x428.png 1272w, https://substackcdn.com/image/fetch/$s_!zv12!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F792020cc-7184-4c3e-9750-d546447429e6_598x428.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Time will tell which version of the future this cohort of AI companies builds on.</p><div><hr></div><h3><em>Signals</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!v9bI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bcb794f-ae94-4d6b-9293-549a245b111f_1614x906.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!v9bI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bcb794f-ae94-4d6b-9293-549a245b111f_1614x906.png 424w, 
https://substackcdn.com/image/fetch/$s_!v9bI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bcb794f-ae94-4d6b-9293-549a245b111f_1614x906.png 848w, https://substackcdn.com/image/fetch/$s_!v9bI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bcb794f-ae94-4d6b-9293-549a245b111f_1614x906.png 1272w, https://substackcdn.com/image/fetch/$s_!v9bI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bcb794f-ae94-4d6b-9293-549a245b111f_1614x906.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!v9bI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bcb794f-ae94-4d6b-9293-549a245b111f_1614x906.png" width="1456" height="817" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8bcb794f-ae94-4d6b-9293-549a245b111f_1614x906.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:817,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:314355,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/179006682?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bcb794f-ae94-4d6b-9293-549a245b111f_1614x906.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!v9bI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bcb794f-ae94-4d6b-9293-549a245b111f_1614x906.png 
424w, https://substackcdn.com/image/fetch/$s_!v9bI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bcb794f-ae94-4d6b-9293-549a245b111f_1614x906.png 848w, https://substackcdn.com/image/fetch/$s_!v9bI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bcb794f-ae94-4d6b-9293-549a245b111f_1614x906.png 1272w, https://substackcdn.com/image/fetch/$s_!v9bI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bcb794f-ae94-4d6b-9293-549a245b111f_1614x906.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://x.com/mjmauboussin/status/1986796164788670562?s=20">Source</a></figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RKCi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c104818-2d8e-48bc-a7fb-f6e81236e712_476x539.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RKCi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c104818-2d8e-48bc-a7fb-f6e81236e712_476x539.png 424w, https://substackcdn.com/image/fetch/$s_!RKCi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c104818-2d8e-48bc-a7fb-f6e81236e712_476x539.png 848w, https://substackcdn.com/image/fetch/$s_!RKCi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c104818-2d8e-48bc-a7fb-f6e81236e712_476x539.png 1272w, https://substackcdn.com/image/fetch/$s_!RKCi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c104818-2d8e-48bc-a7fb-f6e81236e712_476x539.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RKCi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c104818-2d8e-48bc-a7fb-f6e81236e712_476x539.png" width="476" height="539" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2c104818-2d8e-48bc-a7fb-f6e81236e712_476x539.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:539,&quot;width&quot;:476,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:35311,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/179006682?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c104818-2d8e-48bc-a7fb-f6e81236e712_476x539.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RKCi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c104818-2d8e-48bc-a7fb-f6e81236e712_476x539.png 424w, https://substackcdn.com/image/fetch/$s_!RKCi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c104818-2d8e-48bc-a7fb-f6e81236e712_476x539.png 848w, https://substackcdn.com/image/fetch/$s_!RKCi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c104818-2d8e-48bc-a7fb-f6e81236e712_476x539.png 1272w, https://substackcdn.com/image/fetch/$s_!RKCi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c104818-2d8e-48bc-a7fb-f6e81236e712_476x539.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption"><a href="https://www.tidemarkcap.com/vskp-chapter/2025-vertical-smb-saas-benchmark-report">Source: Tidemark</a></figcaption></figure></div><div><hr></div><h3><em>What I&#8217;m Reading</em></h3><p><a href="https://surgehq.ai/blog/rl-envs-real-world">RL Environments and the Hierarchy of Agentic Capabilities</a></p><p><a href="https://thechipletter.substack.com/p/how-big-was-ibm?r=1ac0y&amp;utm_medium=ios&amp;triedRedirect=true">How Big was IBM?</a></p><p><a href="https://earnedintuition.substack.com/p/earned-intuition-003-vertical-ai?r=1ac0y&amp;utm_medium=ios&amp;triedRedirect=true">Earned Intuition #003: Vertical AI: What Would You Never Ask a Human to Do?</a></p><p><a href="https://www.tidemarkcap.com/vskp-chapter/2025-vertical-smb-saas-benchmark-report">2025 Vertical &amp; SMB SaaS Benchmark Report</a></p><p><a
href="https://elizlaraki.substack.com/p/the-future-of-ai-history">The Future of AI History</a></p><div><hr></div><h3><em>Earnings Commentary</em></h3><div class="pullquote"><p>As busy as we are with these mega deals, our main focus is still to build our own core AI cloud business. We made great progress here with AI native start-ups like Cursor, Black Forest Labs and others. The economics and the cash flow of mega deals are attractive in their own right, but they also enable us to build our core AI cloud business faster. This is our real future opportunity.</p><p><strong>Arkady Volozh, Nebius Q3 Earnings Call</strong></p></div><div class="pullquote"><p>The data products, the semantics, the business context, the orchestration of the agents in a business context is all with SAP... Everything what happens on AI agents in the context of the SAP applications, in the context of the business processes, in the context of enterprise analytics that is SAP.</p><p><strong>Christian Klein, SAP Q3 Earnings Call</strong></p></div><div class="pullquote"><p>You used to have to take a company private to change the unit economics of it. What we&#8217;re doing in enterprise is providing a private equity like transformation in the public markets, in the public space under the current leadership.</p><p><strong>Alex Karp, Palantir Q3 Earnings Call</strong></p></div><div><hr></div><p><em>Have any feedback? 
Email me at akash@earlybird.com.</em></p>]]></content:encoded></item><item><title><![CDATA[Small Language Models & Context Engineering Roundtable]]></title><description><![CDATA[With Microsoft AI's Marlene Mhangami]]></description><link>https://www.akashbajwa.co/p/small-language-models-and-context</link><guid isPermaLink="false">https://www.akashbajwa.co/p/small-language-models-and-context</guid><dc:creator><![CDATA[Akash Bajwa]]></dc:creator><pubDate>Fri, 14 Nov 2025 07:02:39 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/262f2769-19c8-40bb-8aee-52b0f916094f_2200x2200.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Software Synthesis analyses the
evolution of <strong>software companies in the age of AI</strong> - from how they're built and scaled, to how they go to market and create enduring value. You can reach <strong><a href="https://www.linkedin.com/in/akashbajwa/">me</a></strong> at <strong>akash@earlybird.com</strong>.</em></p><div><hr></div><h3><em>Gradient Descending Roundtables in London</em></h3><div class="pullquote"><p><strong>November 18th:</strong> <a href="https://luma.com/9475fjje">Agent Frameworks &amp; Memory with Cloudflare</a></p><p><strong>November 19th:</strong> <a href="https://luma.com/zq485ytc">Designing AI-Native Software with SPACING</a></p></div><div><hr></div><p><em>This week, we hosted <a href="https://www.linkedin.com/in/marlenemhangami/">Marlene</a> and <a href="https://www.linkedin.com/in/christoffer-noring-3257061/">Chris</a> from Microsoft AI to discuss Small Language Models, Context Engineering and MCP. Thanks to everyone who came and made the discussion so insightful!
</em></p><p><em>I&#8217;m sharing the summary of our discussion below.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yCRn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4271e346-b41c-4c54-8663-441cc43ecc5a_4032x3024.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yCRn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4271e346-b41c-4c54-8663-441cc43ecc5a_4032x3024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!yCRn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4271e346-b41c-4c54-8663-441cc43ecc5a_4032x3024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!yCRn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4271e346-b41c-4c54-8663-441cc43ecc5a_4032x3024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!yCRn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4271e346-b41c-4c54-8663-441cc43ecc5a_4032x3024.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yCRn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4271e346-b41c-4c54-8663-441cc43ecc5a_4032x3024.jpeg" width="615" height="819.8592032967033" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4271e346-b41c-4c54-8663-441cc43ecc5a_4032x3024.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1941,&quot;width&quot;:1456,&quot;resizeWidth&quot;:615,&quot;bytes&quot;:2760636,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/178832362?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4271e346-b41c-4c54-8663-441cc43ecc5a_4032x3024.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yCRn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4271e346-b41c-4c54-8663-441cc43ecc5a_4032x3024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!yCRn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4271e346-b41c-4c54-8663-441cc43ecc5a_4032x3024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!yCRn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4271e346-b41c-4c54-8663-441cc43ecc5a_4032x3024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!yCRn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4271e346-b41c-4c54-8663-441cc43ecc5a_4032x3024.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><h2><strong>Edge AI &amp; Microsoft Phi Models</strong></h2><h3><strong>Microsoft&#8217;s SLM Strategy</strong></h3><p>Marlene presented Microsoft&#8217;s edge AI push, centered around the <strong>Phi model family</strong> (currently at v4):</p><p><strong>Key components:</strong></p><ul><li><p><strong>Phi 4 flagship</strong>: General-purpose SLM optimized for edge deployment</p></li><li><p><strong>Phi 4 reasoning</strong>: Reasoning version with strength in mathematical reasoning</p></li><li><p><strong>Foundry Local</strong>: Microsoft&#8217;s Ollama-equivalent platform for downloading and running local models, with a path to Azure cloud services</p></li></ul><p><strong>Competitive positioning:</strong></p><ul><li><p>Compared favorably to Qwen models (described as &#8220;model of choice&#8221; by many
practitioners)</p></li><li><p>Also competing with Mistral 7B, Gemma, and other distilled models</p></li><li><p>Cost advantage: Phi 4 is approximately <strong>150x cheaper</strong> than GPT-4.5 for serverless compute</p></li></ul><h3><strong>The Email Agent Case Study: A Context Engineering Example</strong></h3><p>Marlene&#8217;s email agent project serves as an excellent case study in the practical challenges of deploying SLMs:</p><p><strong>Architecture:</strong></p><pre><code>User Query &#8594; Supervisor Agent &#8594; {
    - Manage Email Agent
    - Scheduling Event Agent  
    - Search Email History Agent (with Postgres + semantic search)
} &#8594; MCP Server (M365 tools) &#8594; Results</code></pre><p><strong>Critical Design Decision: Sub-agent Architecture</strong></p><p>The most significant insight was the necessity of <strong>dividing context across specialised sub-agents</strong> rather than loading all MCP tools into a single agent. This addresses a fundamental problem: when you connect an MCP server with dozens of tools, the tool descriptions alone overwhelm the SLM&#8217;s context window, degrading performance catastrophically.</p><p><strong>Solution strategies employed:</strong></p><ol><li><p><strong>Tool segmentation</strong>: Each sub-agent receives only the tools relevant to its domain</p></li><li><p><strong>Middleware layer</strong>: Custom JSON generation to work around Phi&#8217;s lack of native function calling (though this is being addressed soon)</p></li><li><p><strong>Result summarisation</strong>: Condensing tool output before it floods the context window, which remains a critical challenge</p></li><li><p><strong>Semantic search</strong>: Using Postgres for email history rather than raw context stuffing</p></li></ol><p><strong>Performance metrics:</strong></p><ul><li><p>Response latency: ~2 seconds (surprisingly fast for local inference)</p></li><li><p>Key bottleneck: Not compute, but memory/storage on the device</p></li></ul><h2><strong>The Gaming Paradigm</strong></h2><h3><strong>LLM Paradigm vs.
SLM Paradigm</strong></h3><p><strong>LLM world:</strong></p><ul><li><p>Time is abundant (10+ seconds acceptable)</p></li><li><p>Quality is paramount</p></li><li><p>Single-threaded workflows</p></li><li><p>Cloud-centric</p></li><li><p>Cost scales with usage</p></li></ul><p><strong>SLM world (especially gaming):</strong></p><ul><li><p>Sub-second latency requirements (ideally &lt;1 second)</p></li><li><p>Minimum viable quality threshold</p></li><li><p>Massively parallel workflows</p></li><li><p>Device-constrained</p></li><li><p>Fixed cost model</p></li></ul><h3><strong>The GPU Budget War</strong></h3><p>Game developers traditionally allocate GPU budgets across departments (sound: 10%, graphics: 40%, etc.). AI is now demanding its own substantial budget allocation, forcing painful trade-offs. This explains why gaming hasn&#8217;t yet widely adopted generative AI - it&#8217;s not just a technical challenge but a <strong>fundamental resource reallocation problem</strong>.</p><h3><strong>Client-Side AI in Gaming: Current Reality vs. Future Vision</strong></h3><p><strong>Current constraints:</strong></p><ul><li><p>Can typically run 1-2 SLMs on-device simultaneously</p></li><li><p>Most prioritise single model for maximum quality after distillation</p></li><li><p>Hybrid approaches: Pre-generate assets server-side, deliver real-time locally</p></li><li><p>Examples: &#8220;Operators&#8221; game doing text-to-speech with smart caching</p></li></ul><p><strong>Future paradigm shift:</strong></p><ul><li><p>Multiple specialised SLMs running in parallel</p></li><li><p>One for dialogue generation</p></li><li><p>One for text-to-speech</p></li><li><p>Others for translation, behaviour trees, etc.</p></li></ul><p><strong>The killer use case: NPC conversations</strong></p><p>Walking into a room with 10 NPCs having simultaneous conversations. 
Cloud models can&#8217;t scale this (10 parallel LLM calls = cost explosion), but device-based SLMs could handle multiple concurrent agents efficiently.</p><h3><strong>The Streaming Gaming Convergence</strong></h3><p>A strategic insight emerged about cloud gaming services (AWS Luna, etc.): They create an opportunity to co-locate <strong>GPU streaming and GPU inference</strong>, enabling hybrid architectures where:</p><ul><li><p>Latency-critical elements run locally</p></li><li><p>High-fidelity generation happens cloud-side but close to streaming source</p></li><li><p>New compression paradigms could enable prompt-based video streaming</p></li></ul><h2><strong>Context Engineering: The Central Challenge</strong></h2><p>Context management is the <strong>defining challenge</strong> for SLM deployment:</p><p><strong>Problem 1: Tool Description Overload</strong></p><ul><li><p>Loading all MCP tools overwhelms context window</p></li><li><p>Model performance degrades even before executing any tools</p></li><li><p>Anthropic&#8217;s recent blog post confirmed this industry-wide issue</p></li></ul><p><strong>Problem 2: Tool Result Flooding</strong></p><ul><li><p>Example: &#8220;Find emails from past 3 months&#8221; returns massive data</p></li><li><p>Keeping results in context causes model failure</p></li><li><p>Need to balance information preservation with context limits</p></li></ul><h3><strong>Mitigation Strategies Discussed</strong></h3><ol><li><p><strong>Architectural solutions:</strong></p><ul><li><p>Sub-agent decomposition (Marlene&#8217;s approach)</p></li><li><p>Virtual file systems (Cloudflare&#8217;s code mode approach)</p></li><li><p>Durable execution patterns (Temporal workflows)</p></li></ul></li><li><p><strong>Data management:</strong></p><ul><li><p>Result trimming (lossy but pragmatic)</p></li><li><p>Summarisation layers (risk of losing critical context)</p></li><li><p>Semantic indexing (Postgres example)</p></li></ul></li><li><p><strong>Emerging 
solutions:</strong></p><ul><li><p><strong>Samba model</strong>: New Microsoft Research model claiming &#8220;unlimited context windows&#8221; through novel compression/persistence mechanisms</p></li><li><p>Middleware patterns for context transformation</p></li></ul></li></ol><h3><strong>The Agent vs. Workflow Debate</strong></h3><p><strong>Agent-heavy approaches:</strong></p><ul><li><p>Tool-calling paradigm with autonomous decision-making</p></li><li><p>Fills context windows quickly</p></li><li><p>Unpredictable latency</p></li><li><p>Popular but problematic for constrained environments</p></li></ul><p><strong>Workflow/chaining approaches:</strong></p><ul><li><p>Airflow-style deterministic pipelines</p></li><li><p>More predictable, cheaper, faster</p></li><li><p>Better for real-time requirements</p></li></ul><h2><strong>Hardware Evolution: The Coming NPU Revolution</strong></h2><h3><strong>The CUDA Parallel</strong></h3><p>When Nvidia first released CUDA, AMD had faster chips, and people questioned dedicating silicon to such a niche use case. Today, CUDA lock-in dominates AI infrastructure.</p><p><strong>The NPU trajectory will follow a similar path:</strong></p><ul><li><p>Current state: Hardware lagging software (unusual reversal)</p></li><li><p>Microsoft shipping next-gen PCs with NPUs optimized for inference</p></li><li><p>Gaming consoles (PlayStation, Xbox) will include AI-specific silicon</p></li><li><p>Mobile devices (especially iPhones) already have powerful NPUs</p></li></ul><p>AI is already embedded in every layer of computing (OS, browser, applications). Without optimized hardware, most features become unusable.
This creates an inevitable forcing function for NPU adoption.</p><h2><strong>Infrastructure Layer: Inference Providers &amp; Trade-offs</strong></h2><h3><strong>Groq/Cerebras Discussion</strong></h3><p>The conversation revealed nuanced understanding of specialized inference providers:</p><p><strong>Groq advantages:</strong></p><ul><li><p>Extreme speed (2-second responses vs. 20 seconds for Anthropic)</p></li><li><p>Quality now on par with frontier models</p></li><li><p>Critical for customer support, voice agents</p></li><li><p>Cost-effective for fire-and-forget tasks</p></li></ul><p><strong>Groq/Cerebras limitations:</strong></p><ul><li><p><strong>No prompt caching</strong> (due to architectural choices around on-chip memory)</p></li><li><p>Makes agentic workflows economically unviable</p></li><li><p>Can&#8217;t support coding workflows that rely on caching</p></li></ul><p><strong>Optimal use cases:</strong></p><ul><li><p>Summarisation (no caching benefit anyway)</p></li><li><p>Single-pass tasks</p></li><li><p>Speed-critical applications without iterative loops</p></li></ul><h3><strong>The Caching Dependency</strong></h3><p><strong>The entire agentic paradigm is built on prompt caching</strong>. Providers without caching are fundamentally non-viable for agent workflows, regardless of speed advantages. 
This suggests prompt caching has become infrastructure-critical, not just an optimisation.</p><h2><strong>Cost-Performance Frontier</strong></h2><h3><strong>The 150x Cost Advantage</strong></h3><p><strong>Implications:</strong></p><ul><li><p>B2C AI becomes economically viable (currently mostly B2B due to costs)</p></li><li><p>Multiple SLM agents cheaper than single LLM call</p></li><li><p>Enables new business models previously impossible</p></li></ul><h3><strong>Quality Threshold Debate</strong></h3><p><strong>Current consensus:</strong></p><ul><li><p>Adequate for experimentation</p></li><li><p>Not yet production-ready for most applications</p></li><li><p>Quality gap expected to close within 1-2 years</p></li><li><p>Gaming sector: Still struggling to ship SLM-powered features</p></li></ul><p><strong>The deployment readiness spectrum:</strong></p><ul><li><p><strong>Ready now</strong>: Summarisation, classification, simple extraction</p></li><li><p><strong>Close</strong>: Customer support, basic agents</p></li><li><p><strong>Not ready</strong>: Complex reasoning, multimodal generation, parallel agent orchestration</p></li></ul><h2><strong>Model Context Protocol</strong></h2><p><strong>Why MCP matters:</strong></p><ul><li><p>Abstracts away complex APIs (Microsoft Graph example)</p></li><li><p>Pre-built authentication flows</p></li><li><p>Rapid prototyping without API expertise</p></li><li><p>Standardised tool interface</p></li></ul><h3><strong>The Cybersecurity Shadow</strong></h3><p>Chris&#8217;s book and warnings introduce crucial caution: MCP creates <strong>supply chain attack vectors</strong> similar to NPM/pip packages:</p><ul><li><p>Malicious MCP servers could exfiltrate credentials</p></li><li><p>Users connecting servers via Claude Desktop or GitHub Copilot may not understand risks</p></li><li><p>Analogous to the JavaScript package ecosystem&#8217;s security challenges</p></li></ul><p><strong>The tension:</strong> Ease of use vs. security. 
MCP democratizes AI tool integration but potentially at the cost of introducing new attack surfaces.</p><h3><strong>Is MCP the Right Abstraction?</strong></h3><p>Marlene asked whether MCP is even the correct model for providing context to SLMs. The underlying question: <strong>Should we be using RPC-style protocols for context?</strong></p><h2><strong>Quantisation Quality Cliff</strong></h2><p><strong>Model quantisation has a quality cliff</strong>:</p><ul><li><p>16-bit quantisation: Acceptable quality</p></li><li><p>12-bit and below: &#8220;Absolute garbage&#8221;</p></li><li><p>Industry marketing problem: Everyone reports performance on 16-bit variants</p></li></ul><p><strong>Implications for edge deployment:</strong> The vaunted small size of SLMs may be illusory if quality degradation from aggressive quantisation makes them unusable. This suggests edge devices need more capable NPUs than currently assumed.</p><h3>Further Reading</h3><ul><li><p>The <a href="https://github.com/microsoft/PhiCookBook">Phi Cook Book</a></p></li><li><p><a href="https://github.com/microsoft/Foundry-Local">Foundry Local</a> (Microsoft&#8217;s offering for local models)</p></li><li><p><a href="https://github.com/microsoft/Samba">Samba</a> (a new local model moving towards unlimited context)</p></li><li><p><a href="https://github.com/microsoft/edgeai-for-beginners">Edge AI for Beginners</a> (a course we have for working with local models)</p></li></ul><div><hr></div><p><em>Have any feedback?
Email me at akash@earlybird.com.</em></p>]]></content:encoded></item><item><title><![CDATA[Deterministic Inference: The Latency Tax]]></title><description><![CDATA[Another Pareto Frontier: Trading Latency for Determinism]]></description><link>https://www.akashbajwa.co/p/deterministic-inference-the-latency</link><guid isPermaLink="false">https://www.akashbajwa.co/p/deterministic-inference-the-latency</guid><dc:creator><![CDATA[Akash Bajwa]]></dc:creator><pubDate>Mon, 10 Nov 2025 07:01:30 GMT</pubDate><enclosure
url="https://substackcdn.com/image/fetch/$s_!V5V-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa99520c3-198c-44d8-8745-18237ae090a4_865x468.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Software Synthesis analyses the evolution of <strong>software companies in the age of AI</strong> - from how they're built and scaled, to how they go to market and create enduring value. You can reach <strong><a href="https://www.linkedin.com/in/akashbajwa/">me</a></strong> at <strong>akash@earlybird.com</strong>.</em></p><div><hr></div><h3><em>Gradient Descending Roundtables in London</em></h3><div class="pullquote"><p><strong>November 12th:</strong> <a href="https://luma.com/c6l3s3ly">Small Language Models, Context Engineering and MCP with Microsoft Azure AI</a></p><p><strong>November 18th:</strong> <a href="https://luma.com/9475fjje">Agent Frameworks &amp; Memory with Cloudflare</a></p><p><strong>November 19th:</strong> <a href="https://luma.com/zq485ytc">Designing AI-Native Software with SPACING</a></p></div><blockquote><div><hr></div></blockquote><p>There&#8217;s a new Pareto frontier defined by latency and determinism.</p><p>Elon Musk&#8217;s <a
href="https://x.com/deedydas/status/1987220649602130130">claim that diffusion-based LLMs</a> will surpass autoregressive transformers comes on the heels of Inception Labs announcing $50m in funding to continue training their Mercury series of models, which we <a href="https://www.akashbajwa.co/p/continued-vertical-ai-integration">first covered in March.</a></p><p>Time to first token <strong>is slower</strong> for diffusion-based language models, but total generation speed is <em>much</em> faster.</p><p>Why does time to first token matter so much, though?</p><p><a href="https://www.latent.space/p/fal">Fal cofounder Gorkem has a front-row seat</a> powering image and video workloads for large enterprises:</p><blockquote><p><em>Latency is really important. One of our customers actually did a very extensive AB test of like they on purposely slow down latency on file to see how it impacts their metrics. And it had a huge part in it. And it&#8217;s, it&#8217;s almost like page load time when the page load slower, you know, you make less money. I think Amazon famously did a very big AB test on this. It&#8217;s, it&#8217;s very similar.
Like when you, when the user asks for an image and, you know, iterating on it, if it&#8217;s slower to create, they&#8217;re less engaged, they create fewer number of images and, and things like that.</em></p></blockquote><p>Though slower to first token, thereafter diffusion models quickly eclipse autoregressive models:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!V5V-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa99520c3-198c-44d8-8745-18237ae090a4_865x468.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!V5V-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa99520c3-198c-44d8-8745-18237ae090a4_865x468.png 424w, https://substackcdn.com/image/fetch/$s_!V5V-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa99520c3-198c-44d8-8745-18237ae090a4_865x468.png 848w, https://substackcdn.com/image/fetch/$s_!V5V-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa99520c3-198c-44d8-8745-18237ae090a4_865x468.png 1272w, https://substackcdn.com/image/fetch/$s_!V5V-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa99520c3-198c-44d8-8745-18237ae090a4_865x468.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!V5V-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa99520c3-198c-44d8-8745-18237ae090a4_865x468.png" width="865" height="468" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a99520c3-198c-44d8-8745-18237ae090a4_865x468.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:468,&quot;width&quot;:865,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:146669,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/178363514?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa99520c3-198c-44d8-8745-18237ae090a4_865x468.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!V5V-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa99520c3-198c-44d8-8745-18237ae090a4_865x468.png 424w, https://substackcdn.com/image/fetch/$s_!V5V-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa99520c3-198c-44d8-8745-18237ae090a4_865x468.png 848w, https://substackcdn.com/image/fetch/$s_!V5V-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa99520c3-198c-44d8-8745-18237ae090a4_865x468.png 1272w, https://substackcdn.com/image/fetch/$s_!V5V-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa99520c3-198c-44d8-8745-18237ae090a4_865x468.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>New <a href="https://arxiv.org/pdf/2511.03276">research suggests</a> diffusion language models achieve roughly 3x the data efficiency of autoregressive models - pretty handy as we hit data walls!</p><p>There will be plenty of workloads willing to trade a slower time to first token for faster total generation and better data efficiency.</p><p>Determinism is now attainable through a similar latency trade-off.</p><p>Significant capital and talent have been focused on building harnesses that let language models serve verticals and workloads with a low tolerance for hallucinations and inaccuracies - RAG and guardrails are just two of the resulting mechanisms.</p><p>In September, Thinking Machines Lab published their first piece of research, tackling non-determinism in LLM inference. 
At last, I got around to reading the accompanying paper.</p><p>Like many others, I had long assumed that the non-deterministic behaviour of LLMs is intrinsic to the architecture of transformer-based language models. </p><p><a href="https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/">On the contrary,</a> this non-determinism is a <strong>property of GPU operations</strong>.</p><p>The wonderful property of parallel processing that makes GPUs so powerful for accelerated computing compared to CPUs is actually the culprit behind a lack of determinism.</p><p>A primer on how this works:</p><ol><li><p>AI server load varies throughout the day</p></li><li><p>Batch size varies with it (for simplicity, imagine a batch size of 1 during quiet times, 32 during peak)</p></li><li><p>The GPU kernel chooses its algorithm based on batch size</p></li><li><p>Different algorithms sum numbers in different orders</p></li><li><p>Floating-point arithmetic makes addition order matter</p></li><li><p>The same input produces different outputs (nondeterminism)</p></li></ol><p>Let&#8217;s unpack some of those concepts.</p><p><strong>Floating point non-associativity</strong></p><p>Why do we use floating point numbers? Because they give us <strong>&#8220;dynamic precision&#8221;</strong>: floating point maintains a fixed number of &#8220;significant figures&#8221; across wildly different scales, so we can represent both:</p><ul><li><p>Tiny numbers: 0.0000000486</p></li><li><p>Huge numbers: 3,450,000,000</p></li></ul><p>The price is that <strong>addition order matters</strong>. Every floating point addition rounds its result, so summing the same numbers in a different order can give a slightly different total: (0.1 + 1e20) - 1e20 evaluates to 0, whereas 0.1 + (1e20 - 1e20) evaluates to 0.1. </p><p><strong>GPU Kernels</strong></p><p>At runtime, different <strong>reduction orders</strong> are used for different batch sizes. </p><p>To illustrate, let&#8217;s imagine a smaller batch size at a quiet time of the day.</p><p>GPUs are optimised for parallelism, so for small batches the kernel splits each reduction across cores (a split reduction) to keep as much of the GPU busy as possible. 
Larger batches are handled differently: there is enough work to keep every core busy, so each core performs its own additions sequentially.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cDg4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9658b90c-9be6-4473-b22e-d3bcb5c7d1d2_737x470.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cDg4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9658b90c-9be6-4473-b22e-d3bcb5c7d1d2_737x470.png 424w, https://substackcdn.com/image/fetch/$s_!cDg4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9658b90c-9be6-4473-b22e-d3bcb5c7d1d2_737x470.png 848w, https://substackcdn.com/image/fetch/$s_!cDg4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9658b90c-9be6-4473-b22e-d3bcb5c7d1d2_737x470.png 1272w, https://substackcdn.com/image/fetch/$s_!cDg4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9658b90c-9be6-4473-b22e-d3bcb5c7d1d2_737x470.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cDg4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9658b90c-9be6-4473-b22e-d3bcb5c7d1d2_737x470.png" width="737" height="470" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9658b90c-9be6-4473-b22e-d3bcb5c7d1d2_737x470.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:470,&quot;width&quot;:737,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42264,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/178363514?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9658b90c-9be6-4473-b22e-d3bcb5c7d1d2_737x470.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cDg4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9658b90c-9be6-4473-b22e-d3bcb5c7d1d2_737x470.png 424w, https://substackcdn.com/image/fetch/$s_!cDg4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9658b90c-9be6-4473-b22e-d3bcb5c7d1d2_737x470.png 848w, https://substackcdn.com/image/fetch/$s_!cDg4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9658b90c-9be6-4473-b22e-d3bcb5c7d1d2_737x470.png 1272w, https://substackcdn.com/image/fetch/$s_!cDg4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9658b90c-9be6-4473-b22e-d3bcb5c7d1d2_737x470.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 
0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Thinking Machines Lab&#8217;s library resolves this by using the same reduction order regardless of batch size, achieving <strong>batch invariance</strong>.</p><p>In doing so, some parallelism, and therefore latency, is traded for determinism.</p><p><strong>The results:</strong> The team asked Qwen-3-235B &#8220;Tell me about Richard Feynman&#8221; 1,000 times at temperature = 0:</p><ul><li><p><strong>Without their kernels</strong>: 80 unique completions</p><ul><li><p>The most common completion appeared 78 times</p></li><li><p>The first divergence came at token 103: &#8220;Queens, New York&#8221; (992x) vs &#8220;New York City&#8221; (8x)</p></li></ul></li><li><p><strong>With their kernels</strong>: 1 completion, all 1,000 identical</p></li></ul><p>The latency cost is shown below.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!DH1_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5d2b8f9-b4d9-4ecd-932a-f15368c1b42e_435x169.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DH1_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5d2b8f9-b4d9-4ecd-932a-f15368c1b42e_435x169.png 424w, https://substackcdn.com/image/fetch/$s_!DH1_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5d2b8f9-b4d9-4ecd-932a-f15368c1b42e_435x169.png 848w, https://substackcdn.com/image/fetch/$s_!DH1_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5d2b8f9-b4d9-4ecd-932a-f15368c1b42e_435x169.png 1272w, https://substackcdn.com/image/fetch/$s_!DH1_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5d2b8f9-b4d9-4ecd-932a-f15368c1b42e_435x169.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DH1_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5d2b8f9-b4d9-4ecd-932a-f15368c1b42e_435x169.png" width="435" height="169" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c5d2b8f9-b4d9-4ecd-932a-f15368c1b42e_435x169.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:169,&quot;width&quot;:435,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:16992,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/178363514?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5d2b8f9-b4d9-4ecd-932a-f15368c1b42e_435x169.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DH1_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5d2b8f9-b4d9-4ecd-932a-f15368c1b42e_435x169.png 424w, https://substackcdn.com/image/fetch/$s_!DH1_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5d2b8f9-b4d9-4ecd-932a-f15368c1b42e_435x169.png 848w, https://substackcdn.com/image/fetch/$s_!DH1_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5d2b8f9-b4d9-4ecd-932a-f15368c1b42e_435x169.png 1272w, https://substackcdn.com/image/fetch/$s_!DH1_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5d2b8f9-b4d9-4ecd-932a-f15368c1b42e_435x169.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>For enterprises in regulated industries, stuck in pilots waiting to move to production with their AI workloads, these trade-offs will be worth 
making.</p><div><hr></div><h3><em>Signals</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EUuU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F180e91c3-6dde-41d3-951a-5f96ad1f50f2_2314x1158.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EUuU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F180e91c3-6dde-41d3-951a-5f96ad1f50f2_2314x1158.png 424w, https://substackcdn.com/image/fetch/$s_!EUuU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F180e91c3-6dde-41d3-951a-5f96ad1f50f2_2314x1158.png 848w, https://substackcdn.com/image/fetch/$s_!EUuU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F180e91c3-6dde-41d3-951a-5f96ad1f50f2_2314x1158.png 1272w, https://substackcdn.com/image/fetch/$s_!EUuU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F180e91c3-6dde-41d3-951a-5f96ad1f50f2_2314x1158.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EUuU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F180e91c3-6dde-41d3-951a-5f96ad1f50f2_2314x1158.png" width="1456" height="729" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/180e91c3-6dde-41d3-951a-5f96ad1f50f2_2314x1158.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:729,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EUuU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F180e91c3-6dde-41d3-951a-5f96ad1f50f2_2314x1158.png 424w, https://substackcdn.com/image/fetch/$s_!EUuU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F180e91c3-6dde-41d3-951a-5f96ad1f50f2_2314x1158.png 848w, https://substackcdn.com/image/fetch/$s_!EUuU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F180e91c3-6dde-41d3-951a-5f96ad1f50f2_2314x1158.png 1272w, https://substackcdn.com/image/fetch/$s_!EUuU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F180e91c3-6dde-41d3-951a-5f96ad1f50f2_2314x1158.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: <a href="https://newsletter.semianalysis.com/p/clustermax-20-the-industry-standard?_gl=1*cwb47j*_ga*MzUzMTM2MDguMTc2MjYyNzg1NA..*_ga_FKWNM9FBZ3*czE3NjI3MDIzNDgkbzIkZzAkdDE3NjI3MDIzNDgkajYwJGwwJGg4MjgyMzYzNDg.">SemiAnalysis</a></figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ja5g!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8629563d-79c5-4f02-8c72-d404f960caa9_912x533.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ja5g!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8629563d-79c5-4f02-8c72-d404f960caa9_912x533.png 424w, 
https://substackcdn.com/image/fetch/$s_!Ja5g!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8629563d-79c5-4f02-8c72-d404f960caa9_912x533.png 848w, https://substackcdn.com/image/fetch/$s_!Ja5g!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8629563d-79c5-4f02-8c72-d404f960caa9_912x533.png 1272w, https://substackcdn.com/image/fetch/$s_!Ja5g!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8629563d-79c5-4f02-8c72-d404f960caa9_912x533.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ja5g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8629563d-79c5-4f02-8c72-d404f960caa9_912x533.png" width="912" height="533" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8629563d-79c5-4f02-8c72-d404f960caa9_912x533.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:533,&quot;width&quot;:912,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:84939,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/178363514?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8629563d-79c5-4f02-8c72-d404f960caa9_912x533.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ja5g!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8629563d-79c5-4f02-8c72-d404f960caa9_912x533.png 424w, 
https://substackcdn.com/image/fetch/$s_!Ja5g!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8629563d-79c5-4f02-8c72-d404f960caa9_912x533.png 848w, https://substackcdn.com/image/fetch/$s_!Ja5g!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8629563d-79c5-4f02-8c72-d404f960caa9_912x533.png 1272w, https://substackcdn.com/image/fetch/$s_!Ja5g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8629563d-79c5-4f02-8c72-d404f960caa9_912x533.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: Morgan Stanley</figcaption></figure></div><div><hr></div><h3><em>What I&#8217;m Reading</em></h3><p><em><a href="https://leoniemonigatti.com/blog/from-rag-to-agent-memory.html">The Evolution from RAG to Agentic RAG to Agent Memory</a></em></p><p><em><a href="https://joincolossus.com/article/inside-cursor/">Inside Cursor</a></em></p><p><em><a href="https://epoch.ai/blog/what-does-osworld-tell-us-about-ais-ability-to-use-computers">What does OSWorld tell us about AI&#8217;s ability to use computers?</a></em></p><p><em><a href="https://theinfraplay.ai/newsletter/why-behind-ai-data-centres-in-space?ss_source=sscampaigns&amp;ss_campaign_id=690b67ac02a3152b8c1fc1f1&amp;ss_email_id=690b683bad3319480013ae0c&amp;ss_campaign_name=Why+behind+AI%3A+Data+centres+in+space&amp;ss_campaign_sent_date=2025-11-05T15%3A08%3A02Z">Why behind AI: Data centres in space</a></em></p><p><em><a href="https://x.com/RampLabs/status/1985442169445105753">Post Training Ensemble vs. Singular Model Approaches with Tinker</a></em></p><div><hr></div><h3><em>Earnings Commentary</em></h3><div class="pullquote"><p>During a period where some vibe coding tools are seeing slowing growth, Figma Make is speeding up. By the end of September, approximately 30% of customers spending $100,000 or more in ARR were creating in Figma Make on a weekly basis, and that number has continued to grow. We will continue investing heavily in AI, and we will trade near-term margin to build the right long-term platform for our customers.</p><p><strong>Dylan Field, Figma Q3 Earnings Call</strong></p></div><div class="pullquote"><p>We have now onboarded thousands of customers to the Bits AI SRE agent. 
And as we prepare for general availability, we are getting very enthusiastic feedback on the time and cost savings enabled by Bits AI.</p><p>As RUM user recently told us, with Bits AI SRE being on call 24/7 for us, mean time resolution for our services has improved significantly. For most cases, the investigation is already taken care of well before our engineers sit down and open their laptops to assess the issue. And this is not an isolated comment. We see the potential here for our agents to radically transform observability and operations.</p><p><strong>Olivier Pomel, Datadog Q3 Earnings Call</strong></p></div><div class="pullquote"><p>One thing that&#8217;s become quite evident is that power has become the bottleneck for everyone and power not only means access to energy, but everything underneath it in terms of infrastructure build-out, turbines, transformers, everything associated with generating power.</p><p>So in that environment, everyone wants to move to the most efficient compute platform as possible. Arm is about 50% more efficient than competitive solutions. We&#8217;ve seen that across the board in benchmarks, but also more importantly, in real-life performance. And that&#8217;s why we see NVIDIA, Amazon, Google, Microsoft, Tesla, all using Arm-based technology.</p><p><strong>Rene Haas, Arm Q2 Earnings Call</strong></p></div><div class="pullquote"><p>Because our pricing model scales with the value we deliver, not with seats, our success grows directly alongside our customers. We price on a per profile, message and resolution basis, which lines up perfectly with the outcome-oriented business models AI is enabling.</p><p>Today, more than half of our ARR comes from multi-product customers, which is clear proof that customers want to have everything running off of one platform. 
This deepens our relationships with customers, improves retention and drives long-term growth</p><p><strong>Amanda Whalen, Klaviyo Q3 Earnings Call</strong></p></div><div><hr></div><p><em>Have any feedback? Email me at akash@earlybird.com.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.akashbajwa.co/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.akashbajwa.co/subscribe?"><span>Subscribe now</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.akashbajwa.co/p/deterministic-inference-the-latency?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.akashbajwa.co/p/deterministic-inference-the-latency?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.akashbajwa.co/p/deterministic-inference-the-latency/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.akashbajwa.co/p/deterministic-inference-the-latency/comments"><span>Leave a comment</span></a></p><h3></h3>]]></content:encoded></item><item><title><![CDATA[Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains]]></title><description><![CDATA[Codifying Tribal Knowledge Into Vertical-Specific Reasoning]]></description><link>https://www.akashbajwa.co/p/rubrics-as-rewards-reinforcement</link><guid isPermaLink="false">https://www.akashbajwa.co/p/rubrics-as-rewards-reinforcement</guid><dc:creator><![CDATA[Akash Bajwa]]></dc:creator><pubDate>Mon, 03 Nov 2025 
07:02:04 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a4c54fe3-6808-4370-a98d-0233272bf89d_590x164.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Software Synthesis analyses the evolution of <strong>software companies in the age of AI</strong> - from how they're built and scaled, to how they go to market and create enduring value. You can reach <strong><a href="https://www.linkedin.com/in/akashbajwa/">me</a></strong> at <strong>akash@earlybird.com</strong>.</em></p><div><hr></div><h3><em>Gradient Descending Roundtables in London</em></h3><div class="pullquote"><p><strong>November 12th:</strong> <a href="https://luma.com/c6l3s3ly">Small Language Models, Context Engineering and MCP with Microsoft Azure AI</a></p><p><strong>November 18th:</strong> <a href="https://luma.com/9475fjje">Agent Frameworks &amp; Memory with Cloudflare</a></p><p><strong>November 19th:</strong> <a href="https://luma.com/zq485ytc">Designing AI-Native Software with SPACING</a></p></div><blockquote><div><hr></div></blockquote><p>Last week, <a href="https://simonwillison.net/2025/Oct/29/cursor-composer/">Cursor finally released their first frontier coding model</a>, Composer 1 Alpha. 
</p><p>Many are speculating whether the model is <a href="https://x.com/natolambert/status/1983584412412641496">a fine-tuned Chinese MoE model</a>, which <a href="https://simonwillison.net/2025/Oct/29/cursor-composer/">seems plausible</a> given Cursor&#8217;s stance on where their edge lies:</p><blockquote><p><em>Our primary focus is on RL post-training. We think that is the best way to get the model to be a strong interactive agent.</em></p></blockquote><p>Earlier that morning, we hosted <a href="https://www.linkedin.com/in/aidan-davies-31789992/">Aidan</a> and <a href="https://www.linkedin.com/in/mattspaul/">Matt</a> in our office to discuss <strong><a href="https://www.linkedin.com/in/aidan-davies-31789992/">Scale AI</a></strong>&#8217;s recently published research on <a href="https://arxiv.org/abs/2507.17746">&#8216;Rubrics as Rewards&#8217;</a> and its implications for post-training base models. </p><p>I&#8217;ve written at <a href="https://akashbajwa.substack.com/p/ai-apps-agent-labs">length</a> about how application-layer companies (or &#8216;agent labs&#8217;) capture reward signals that are becoming increasingly valuable as the cost and complexity of post-training collapse (fine-tuning APIs, managed infra). </p><p>&#8216;<a href="https://x.com/DrJimFan/status/1871247034080199129">Reward Engineering</a>&#8217; has proven to be one of the defining AI themes of the year as companies like Thinking Machines Lab, Applied Compute, Osmosis and others emerged with a value proposition of post-training custom models underpinned by high quality reward data. This is all against a backdrop of rapid advances in large model capabilities across objectively verifiable domains like coding and mathematics (where the reward is binary; a proposed code change either runs or it doesn&#8217;t). </p><p>As soon as this vector of scaling RL became clear, the immediate next question was how to apply it to non-verifiable domains where subjectivity plays a bigger role. 
</p><p>That&#8217;s what the Scale paper is focusing on, proposing a richer way of capturing tribal knowledge and reasoning than simple RLHF or preference-ranking. In effect, this is closer to having a process reward model that not only rewards the final outcome but also the steps taken to get to it - and the results are striking.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sQWU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59720b57-c495-47bf-9afa-f4e7853df5c2_590x164.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sQWU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59720b57-c495-47bf-9afa-f4e7853df5c2_590x164.png 424w, https://substackcdn.com/image/fetch/$s_!sQWU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59720b57-c495-47bf-9afa-f4e7853df5c2_590x164.png 848w, https://substackcdn.com/image/fetch/$s_!sQWU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59720b57-c495-47bf-9afa-f4e7853df5c2_590x164.png 1272w, https://substackcdn.com/image/fetch/$s_!sQWU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59720b57-c495-47bf-9afa-f4e7853df5c2_590x164.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sQWU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59720b57-c495-47bf-9afa-f4e7853df5c2_590x164.png" width="590" height="164" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/59720b57-c495-47bf-9afa-f4e7853df5c2_590x164.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:164,&quot;width&quot;:590,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:46154,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/177092244?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59720b57-c495-47bf-9afa-f4e7853df5c2_590x164.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sQWU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59720b57-c495-47bf-9afa-f4e7853df5c2_590x164.png 424w, https://substackcdn.com/image/fetch/$s_!sQWU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59720b57-c495-47bf-9afa-f4e7853df5c2_590x164.png 848w, https://substackcdn.com/image/fetch/$s_!sQWU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59720b57-c495-47bf-9afa-f4e7853df5c2_590x164.png 1272w, https://substackcdn.com/image/fetch/$s_!sQWU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59720b57-c495-47bf-9afa-f4e7853df5c2_590x164.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!bJxs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85d4e31f-169b-45b1-9d69-80b1dc9ed713_666x261.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bJxs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85d4e31f-169b-45b1-9d69-80b1dc9ed713_666x261.png 424w, https://substackcdn.com/image/fetch/$s_!bJxs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85d4e31f-169b-45b1-9d69-80b1dc9ed713_666x261.png 848w, https://substackcdn.com/image/fetch/$s_!bJxs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85d4e31f-169b-45b1-9d69-80b1dc9ed713_666x261.png 1272w, https://substackcdn.com/image/fetch/$s_!bJxs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85d4e31f-169b-45b1-9d69-80b1dc9ed713_666x261.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bJxs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85d4e31f-169b-45b1-9d69-80b1dc9ed713_666x261.png" width="666" height="261" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85d4e31f-169b-45b1-9d69-80b1dc9ed713_666x261.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:261,&quot;width&quot;:666,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:49265,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/177092244?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85d4e31f-169b-45b1-9d69-80b1dc9ed713_666x261.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bJxs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85d4e31f-169b-45b1-9d69-80b1dc9ed713_666x261.png 424w, https://substackcdn.com/image/fetch/$s_!bJxs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85d4e31f-169b-45b1-9d69-80b1dc9ed713_666x261.png 848w, https://substackcdn.com/image/fetch/$s_!bJxs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85d4e31f-169b-45b1-9d69-80b1dc9ed713_666x261.png 1272w, https://substackcdn.com/image/fetch/$s_!bJxs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85d4e31f-169b-45b1-9d69-80b1dc9ed713_666x261.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Below are the notes from our discussion. 
The relevant papers are <a href="https://arxiv.org/abs/2507.17746">here</a> and <a href="https://arxiv.org/abs/2509.21500">here</a> - you can also see <a href="https://console.scale.com/data-gallery-public">examples of Rubrics here on the Scale website</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ki3h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b9db59d-5393-49d7-99d4-1687b59313fb_4032x3024.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ki3h!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b9db59d-5393-49d7-99d4-1687b59313fb_4032x3024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ki3h!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b9db59d-5393-49d7-99d4-1687b59313fb_4032x3024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ki3h!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b9db59d-5393-49d7-99d4-1687b59313fb_4032x3024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ki3h!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b9db59d-5393-49d7-99d4-1687b59313fb_4032x3024.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ki3h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b9db59d-5393-49d7-99d4-1687b59313fb_4032x3024.jpeg" width="1456" height="1092" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7b9db59d-5393-49d7-99d4-1687b59313fb_4032x3024.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2968075,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/177092244?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b9db59d-5393-49d7-99d4-1687b59313fb_4032x3024.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ki3h!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b9db59d-5393-49d7-99d4-1687b59313fb_4032x3024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ki3h!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b9db59d-5393-49d7-99d4-1687b59313fb_4032x3024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ki3h!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b9db59d-5393-49d7-99d4-1687b59313fb_4032x3024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ki3h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b9db59d-5393-49d7-99d4-1687b59313fb_4032x3024.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2><strong>Evolution of Model Training: Post-Training</strong></h2><p>The progression of post-training approaches:</p><ol><li><p><strong>Supervised Fine-Tuning (SFT)</strong> - Early approach</p><ul><li><p>Provides prompt-response pairs with &#8220;correct&#8221; answers</p></li><li><p>Useful for smaller, specialised models</p></li><li><p>Falls away for larger, more generalised models</p></li><li><p>Can lead to overfitting in complex reasoning scenarios</p></li></ul></li><li><p><strong>RLHF (Reinforcement Learning from Human Feedback)</strong> - Current approach</p><ul><li><p>Humans select preferred responses between model outputs</p></li><li><p>Effective for simple queries but struggles with multi-step reasoning</p></li><li><p>Volume-intensive and requires extensive human evaluation</p></li><li><p>Risk of reward 
hacking</p></li></ul></li><li><p><strong>Rubrics-Based Approach</strong> - New approach</p><ul><li><p>Combines expert knowledge with scalable automation</p></li><li><p>Particularly valuable for unverifiable domains</p></li></ul></li></ol><h3><strong>The Continuous Evaluation Loop</strong></h3><p>Training follows a cyclical process:</p><ul><li><p><strong>Evaluation &#8594; Data Production &#8594; Performance Improvement &#8594; Re-evaluation</strong></p></li><li><p>Typically runs in 6-month cycles with 2-3 iterations per model development phase</p></li><li><p>Requires constant creation of new data based on previous evaluation results</p></li><li><p>Includes adversarial components (red teaming) to identify security and safety weaknesses</p></li></ul><div><hr></div><h2><strong>Understanding Rubrics as Rewards</strong></h2><h3><strong>What Are Rubrics?</strong></h3><p>Rubrics are structured evaluation frameworks consisting of:</p><ul><li><p><strong>Binary or weighted criteria</strong> that can be objectively assessed</p></li><li><p><strong>Multiple independent factors</strong> that together evaluate quality</p></li><li><p><strong>Specific rules</strong> stating what makes an ideal response</p></li><li><p><strong>Examples</strong> to help models understand expectations</p></li></ul><h3><strong>Simple Example Structure</strong></h3><p>For a basic prompt, a rubric might include:</p><ul><li><p>Must address the core question</p></li><li><p>Should provide specific details</p></li><li><p>Must avoid certain pitfalls</p></li><li><p>Should follow appropriate tone/style</p></li></ul><h3><strong>Complex Professional Example</strong></h3><p>In an investment context (e.g., market entry analysis for EVs in South America):</p><ul><li><p>20-30 different evaluation components</p></li><li><p>Checks for specific data points (market size, timelines, partnerships)</p></li><li><p>Validates reasoning steps</p></li><li><p>Assesses structural clarity and relevance</p></li><li><p>Can be binary 
(present/not present) or weighted</p></li></ul><div><hr></div><h2><strong>Three Key Advantages of Rubrics</strong></h2><h3><strong>1. Mitigating Reward Hacking</strong></h3><p><strong>The Problem:</strong> When using simple preference selection, models may learn unintended patterns</p><ul><li><p>Example: accidentally training a preference for &#8220;red cars over blue cars&#8221; when colour wasn&#8217;t the intended differentiator</p></li><li><p>Human evaluators have intrinsic biases that can skew preference data</p></li></ul><p><strong>How Rubrics Help:</strong></p><ul><li><p>Break down evaluation into specific, objective criteria</p></li><li><p>Exclude components that might drive unintended behaviour</p></li><li><p>Allow larger volumes of diverse data to be generated</p></li><li><p>A recent Scale AI paper empirically demonstrated reduced reward hacking with rubrics</p></li></ul><h3><strong>2. Adaptability</strong></h3><p><strong>The Challenge:</strong> Models need constant refinement based on real-world performance</p><ul><li><p>Errors emerge during deployment</p></li><li><p>Requirements change based on user feedback</p></li><li><p>New security issues are discovered</p></li></ul><p><strong>How Rubrics Help:</strong></p><ul><li><p>Much easier to tweak a rubric than retrain thousands of human evaluators</p></li><li><p>Can quickly adjust weighting or add new criteria</p></li><li><p>Enables faster iteration cycles</p></li><li><p>More agile response to identified problems</p></li></ul><h3><strong>3. 
Domain-Specific Control</strong></h3><p><strong>Different domains require different priorities:</strong></p><ul><li><p><strong>Creative writing:</strong> Style, tone and human-like quality matter most</p></li><li><p><strong>Mathematics:</strong> Correct answer and logical reasoning are paramount</p></li><li><p><strong>Professional contexts:</strong> Structural clarity, verifiable data, reasoning transparency</p></li></ul><p><strong>Rubrics enable precise weighting</strong> of different evaluation components based on what matters in each specific domain.</p><div><hr></div><h2><strong>The Role of LLMs in Rubric Evaluation</strong></h2><h3><strong>Automation Through AI</strong></h3><ul><li><p><strong>Humans design the rubrics</strong> (expert knowledge)</p></li><li><p><strong>LLMs automate the evaluation</strong> against those rubrics</p></li><li><p>This creates scalable volume while maintaining expert-defined standards</p></li></ul><h3><strong>Why This Works</strong></h3><p>LLMs are particularly good at:</p><ul><li><p>Checking if specific elements are present</p></li><li><p>Verifying binary conditions</p></li><li><p>Following clear, objective rules</p></li></ul><h3><strong>Model Selection for Evaluation</strong></h3><ul><li><p>For research purposes (as in the paper): GPT-4o mini served as the LLM judge to ensure standardisation</p></li><li><p>For production: Smaller models can work if rubrics are well designed</p></li><li><p>Frontier labs building cutting-edge models still prefer human-written rubrics</p></li><li><p>Smaller-budget projects can use LLM-generated rubrics with acceptable results</p></li></ul><div><hr></div><h2><strong>Practical Applications</strong></h2><h3><strong>Investment Due Diligence</strong></h3><p><strong>Multi-step reasoning example:</strong></p><ul><li><p>Initial analysis of market conditions</p></li><li><p>Financial projections</p></li><li><p>Partnership identification</p></li><li><p>Risk assessment</p></li><li><p>Each step can have its own 
rubric</p></li><li><p>Important to maintain observable reasoning pathways</p></li></ul><h3><strong>Healthcare/Medicine</strong></h3><p><strong>Unverifiable domain characteristics:</strong></p><ul><li><p>No single &#8220;correct&#8221; diagnosis for complex presentations</p></li><li><p>Expert intuition difficult to codify</p></li><li><p>Rubrics help capture decision-making criteria</p></li><li><p>Can evaluate diagnostic reasoning steps</p></li></ul><h3><strong>Legal Services</strong></h3><p><strong>Contract analysis requirements:</strong></p><ul><li><p>Specific clause identification</p></li><li><p>Risk assessment</p></li><li><p>Compliance checking</p></li><li><p>Precedent application</p></li></ul><h3><strong>Insurance</strong></h3><p><strong>Knowledge extraction challenge:</strong></p><ul><li><p>Much expertise exists only &#8220;in people&#8217;s heads&#8221;</p></li><li><p>Retiring workforce creates knowledge drain</p></li><li><p>Rubrics help codify expert decision-making</p></li><li><p>Quality assurance already uses similar scoring systems</p></li></ul><div><hr></div><h2><strong>Technical Details from the Research</strong></h2><h3><strong>Experimental Setup</strong></h3><ul><li><p><strong>Base model:</strong> Qwen 2</p></li><li><p><strong>Comparison:</strong> Rubrics approach vs. 
direct preference ranking</p></li><li><p><strong>Evaluation:</strong> GPT-4o mini as judge for standardisation</p></li><li><p><strong>Training:</strong> Offline RL (not online due to cost)</p></li></ul><h3><strong>Performance Results</strong></h3><p>The rubrics approach showed improvements in:</p><ul><li><p>Context awareness</p></li><li><p>Communication quality</p></li><li><p>Accuracy in reasoning-heavy tasks</p></li></ul><h3><strong>Key Finding</strong></h3><p>Rubrics are particularly effective for:</p><ul><li><p>Complex, multi-step reasoning</p></li><li><p>Domains without clear &#8220;correct&#8221; answers</p></li><li><p>Professional/expert knowledge domains</p></li><li><p>Tasks requiring observable reasoning chains</p></li></ul><div><hr></div><h2><strong>Critical Considerations and Debates</strong></h2><h3><strong>Evaluating the Evaluators</strong></h3><p><strong>The recursion problem:</strong></p><ul><li><p>How do you evaluate whether rubrics are good?</p></li><li><p>Ultimately requires testing impact on model performance</p></li><li><p>Need comprehensive evaluation datasets for each domain</p></li><li><p>Iterative refinement based on real-world results</p></li></ul><h3><strong>Human vs. 
Automated Rubric Creation</strong></h3><p><strong>For frontier performance:</strong> Human-written rubrics are still superior</p><p><strong>For smaller models/budgets:</strong> LLM-generated rubrics can work</p><p><strong>Key insight:</strong> The intersection of domain expertise and understanding of model behaviour is rare and valuable</p><div><hr></div><h2><strong>Market Implications</strong></h2><h3><strong>The Changing Nature of AI Data Work</strong></h3><p><strong>From volume to expertise:</strong></p><ul><li><p>Less emphasis on mass preference labelling</p></li><li><p>More focus on expert rubric design</p></li><li><p>Smaller teams of highly skilled data creators</p></li><li><p>Higher per-person cost but better outcomes</p></li></ul><h3><strong>Professional Knowledge Extraction</strong></h3><p><strong>The hidden expertise problem:</strong></p><ul><li><p>Most professional knowledge isn&#8217;t documented</p></li><li><p>Exists in practitioners&#8217; intuition and experience</p></li><li><p>Examples: Insurance brokers, doctors, lawyers, investment analysts</p></li><li><p>Rubrics provide a framework to codify this tacit knowledge</p></li></ul><h3><strong>Future Opportunities</strong></h3><ul><li><p>Industries with retiring workforces (insurance, legal)</p></li><li><p>Domains with complex, multi-step reasoning</p></li><li><p>Applications requiring explainable AI decisions</p></li><li><p>Situations where &#8220;average&#8221; data isn&#8217;t sufficient</p></li></ul><div><hr></div><h2><strong>Autonomous Improvement Discussion</strong></h2><h3><strong>The Holy Grail Question</strong></h3><p>Can frontier models achieve fully autonomous self-improvement using only AI-generated rubrics?</p><p><strong>Current state:</strong></p><ul><li><p>DeepSeek recently claimed full self-improvement in specific domains</p></li><li><p>Depends heavily on model size and domain complexity</p></li><li><p>Still an open research question</p></li><li><p>Likely varies significantly by 
application</p></li></ul><h3><strong>Hybrid Approaches</strong></h3><p>Most realistic near-term:</p><ul><li><p>Human-designed rubrics for frontier domains</p></li><li><p>LLM-automated evaluation and data generation</p></li><li><p>Human oversight for refinement and validation</p></li><li><p>Iterative improvement cycles</p></li></ul><div><hr></div><h2><strong>Key Takeaways</strong></h2><ol><li><p><strong>Rubrics represent a middle ground</strong> between fully manual RLHF and purely automated approaches</p></li><li><p><strong>Most effective for unverifiable domains</strong> where there&#8217;s no single correct answer but expert judgment exists</p></li><li><p><strong>Enables scaling expert knowledge</strong> by codifying it into structured, scalable evaluation criteria</p></li><li><p><strong>Reduces but doesn&#8217;t eliminate human involvement</strong> - shifts humans to higher-level rubric design rather than individual preference selection</p></li><li><p><strong>Particularly valuable for professional applications</strong> in law, medicine, finance, and other expert domains</p></li><li><p><strong>The data market is shifting</strong> from volume-based preference ranking to expert-driven, rubric-based approaches</p></li><li><p><strong>Quality of rubric design is critical</strong> - requires the intersection of domain expertise and understanding of AI model behaviour</p></li></ol><div><hr></div><h2><strong>Future Directions</strong></h2><ul><li><p>Application to vision-language models and robotics</p></li><li><p>Integration with existing quality assurance frameworks</p></li><li><p>Automated rubric generation for non-frontier applications</p></li><li><p>Expansion into multimodal domains</p></li><li><p>Development of standardised rubric libraries for common domains</p></li></ul><div><hr></div><h3><em>Signals</em></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!5lO4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cee47a-aa40-402e-b953-71ab44814332_1728x972.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5lO4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cee47a-aa40-402e-b953-71ab44814332_1728x972.png 424w, https://substackcdn.com/image/fetch/$s_!5lO4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cee47a-aa40-402e-b953-71ab44814332_1728x972.png 848w, https://substackcdn.com/image/fetch/$s_!5lO4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cee47a-aa40-402e-b953-71ab44814332_1728x972.png 1272w, https://substackcdn.com/image/fetch/$s_!5lO4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cee47a-aa40-402e-b953-71ab44814332_1728x972.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5lO4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cee47a-aa40-402e-b953-71ab44814332_1728x972.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b7cee47a-aa40-402e-b953-71ab44814332_1728x972.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A line chart showing the number of new developers joining GitHub from 2020 to 2025. 
The line rises steadily, reaching 36.2 million in 2025, with a sharp increase after the launch of Copilot Free in late 2024. The chart has a dark background with blue data lines and the title &#8216;The number of new developers on GitHub.&#8217;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A line chart showing the number of new developers joining GitHub from 2020 to 2025. The line rises steadily, reaching 36.2 million in 2025, with a sharp increase after the launch of Copilot Free in late 2024. The chart has a dark background with blue data lines and the title &#8216;The number of new developers on GitHub.&#8217;" title="A line chart showing the number of new developers joining GitHub from 2020 to 2025. The line rises steadily, reaching 36.2 million in 2025, with a sharp increase after the launch of Copilot Free in late 2024. 
The chart has a dark background with blue data lines and the title &#8216;The number of new developers on GitHub.&#8217;" srcset="https://substackcdn.com/image/fetch/$s_!5lO4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cee47a-aa40-402e-b953-71ab44814332_1728x972.png 424w, https://substackcdn.com/image/fetch/$s_!5lO4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cee47a-aa40-402e-b953-71ab44814332_1728x972.png 848w, https://substackcdn.com/image/fetch/$s_!5lO4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cee47a-aa40-402e-b953-71ab44814332_1728x972.png 1272w, https://substackcdn.com/image/fetch/$s_!5lO4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cee47a-aa40-402e-b953-71ab44814332_1728x972.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://github.blog/news-insights/octoverse/octoverse-a-new-developer-joins-github-every-second-as-ai-leads-typescript-to-1/">Source</a></figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Cc5F!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2322a8e-1beb-4c83-870f-6f64da5170ba_1440x810.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Cc5F!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2322a8e-1beb-4c83-870f-6f64da5170ba_1440x810.png 424w, https://substackcdn.com/image/fetch/$s_!Cc5F!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2322a8e-1beb-4c83-870f-6f64da5170ba_1440x810.png 848w, https://substackcdn.com/image/fetch/$s_!Cc5F!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2322a8e-1beb-4c83-870f-6f64da5170ba_1440x810.png 1272w, https://substackcdn.com/image/fetch/$s_!Cc5F!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2322a8e-1beb-4c83-870f-6f64da5170ba_1440x810.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Cc5F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2322a8e-1beb-4c83-870f-6f64da5170ba_1440x810.png" width="1440" height="810" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b2322a8e-1beb-4c83-870f-6f64da5170ba_1440x810.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:810,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A table listing the fastest-growing open source projects on GitHub in 2025 by contributors. The top ten are zen-browser/desktop, cline/cline, vllm-project/vllm, astral-sh/uv, microsoft/vscode, infiniflow/ragflow, sgl-project/sglang, continuedev/continue, comfyanonymous/ComfyUI, and home-assistant/core. Growth rates range from 2,301% to 6,836%, with most projects marked as AI-focused. Displayed on a blue gradient background with the GitHub Octoverse ribbon graphic.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A table listing the fastest-growing open source projects on GitHub in 2025 by contributors. The top ten are zen-browser/desktop, cline/cline, vllm-project/vllm, astral-sh/uv, microsoft/vscode, infiniflow/ragflow, sgl-project/sglang, continuedev/continue, comfyanonymous/ComfyUI, and home-assistant/core. Growth rates range from 2,301% to 6,836%, with most projects marked as AI-focused. Displayed on a blue gradient background with the GitHub Octoverse ribbon graphic." title="A table listing the fastest-growing open source projects on GitHub in 2025 by contributors. 
The top ten are zen-browser/desktop, cline/cline, vllm-project/vllm, astral-sh/uv, microsoft/vscode, infiniflow/ragflow, sgl-project/sglang, continuedev/continue, comfyanonymous/ComfyUI, and home-assistant/core. Growth rates range from 2,301% to 6,836%, with most projects marked as AI-focused. Displayed on a blue gradient background with the GitHub Octoverse ribbon graphic." srcset="https://substackcdn.com/image/fetch/$s_!Cc5F!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2322a8e-1beb-4c83-870f-6f64da5170ba_1440x810.png 424w, https://substackcdn.com/image/fetch/$s_!Cc5F!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2322a8e-1beb-4c83-870f-6f64da5170ba_1440x810.png 848w, https://substackcdn.com/image/fetch/$s_!Cc5F!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2322a8e-1beb-4c83-870f-6f64da5170ba_1440x810.png 1272w, https://substackcdn.com/image/fetch/$s_!Cc5F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2322a8e-1beb-4c83-870f-6f64da5170ba_1440x810.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://github.blog/news-insights/octoverse/octoverse-a-new-developer-joins-github-every-second-as-ai-leads-typescript-to-1/">Source</a></figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ULqR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf18370-ca8e-408a-a5d7-be75c56c5f5c_573x604.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ULqR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf18370-ca8e-408a-a5d7-be75c56c5f5c_573x604.png 424w, https://substackcdn.com/image/fetch/$s_!ULqR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf18370-ca8e-408a-a5d7-be75c56c5f5c_573x604.png 848w, 
https://substackcdn.com/image/fetch/$s_!ULqR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf18370-ca8e-408a-a5d7-be75c56c5f5c_573x604.png 1272w, https://substackcdn.com/image/fetch/$s_!ULqR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf18370-ca8e-408a-a5d7-be75c56c5f5c_573x604.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ULqR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf18370-ca8e-408a-a5d7-be75c56c5f5c_573x604.png" width="573" height="604" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fdf18370-ca8e-408a-a5d7-be75c56c5f5c_573x604.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:604,&quot;width&quot;:573,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:59537,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/177092244?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf18370-ca8e-408a-a5d7-be75c56c5f5c_573x604.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ULqR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf18370-ca8e-408a-a5d7-be75c56c5f5c_573x604.png 424w, https://substackcdn.com/image/fetch/$s_!ULqR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf18370-ca8e-408a-a5d7-be75c56c5f5c_573x604.png 848w, 
https://substackcdn.com/image/fetch/$s_!ULqR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf18370-ca8e-408a-a5d7-be75c56c5f5c_573x604.png 1272w, https://substackcdn.com/image/fetch/$s_!ULqR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf18370-ca8e-408a-a5d7-be75c56c5f5c_573x604.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://atomproject.ai/">Source</a></figcaption></figure></div><div 
class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nQsr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95d821b9-24d5-4e70-8ca8-a98b091cd655_562x623.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nQsr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95d821b9-24d5-4e70-8ca8-a98b091cd655_562x623.png 424w, https://substackcdn.com/image/fetch/$s_!nQsr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95d821b9-24d5-4e70-8ca8-a98b091cd655_562x623.png 848w, https://substackcdn.com/image/fetch/$s_!nQsr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95d821b9-24d5-4e70-8ca8-a98b091cd655_562x623.png 1272w, https://substackcdn.com/image/fetch/$s_!nQsr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95d821b9-24d5-4e70-8ca8-a98b091cd655_562x623.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nQsr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95d821b9-24d5-4e70-8ca8-a98b091cd655_562x623.png" width="562" height="623" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/95d821b9-24d5-4e70-8ca8-a98b091cd655_562x623.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:623,&quot;width&quot;:562,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:64654,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/177092244?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95d821b9-24d5-4e70-8ca8-a98b091cd655_562x623.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nQsr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95d821b9-24d5-4e70-8ca8-a98b091cd655_562x623.png 424w, https://substackcdn.com/image/fetch/$s_!nQsr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95d821b9-24d5-4e70-8ca8-a98b091cd655_562x623.png 848w, https://substackcdn.com/image/fetch/$s_!nQsr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95d821b9-24d5-4e70-8ca8-a98b091cd655_562x623.png 1272w, https://substackcdn.com/image/fetch/$s_!nQsr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95d821b9-24d5-4e70-8ca8-a98b091cd655_562x623.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://atomproject.ai/">Source</a></figcaption></figure></div><div><hr></div><h3><em>What I&#8217;m Reading</em></h3><p><em><a href="https://amasad.me/keep-winning">How to Keep Winning</a></em></p><p><em><a href="https://docs.google.com/document/d/1BzMItvkAxkyu4cYPjJ3qvcK__FNTL-WhefDWKFuagFE/edit?tab=t.0">Top Down and Bottom Up Investors</a></em></p><p><em><a href="https://welovesota.com/article/the-case-for-the-return-of-fine-tuning">The Case for the Return of Fine-Tuning</a></em></p><p><em><a href="https://onelayerdeeper.substack.com/p/is-diffusion-the-future-of-llms-d0c">Is Diffusion the Future of LLMs?</a></em></p><p><em><a href="https://cheekypint.substack.com/p/marc-andreessen-and-charlie-songhurst">Marc Andreessen and Charlie Songhurst on the past, present, and future of Silicon Valley</a></em></p><p><em><a 
href="https://interconnect.substack.com/p/europes-best-sovereign-ai-hope-nebius">Europe&#8217;s Hidden Sovereign AI Gem: Nebius</a></em></p><div><hr></div><h3><em>Earnings Commentary</em></h3><div class="pullquote"><p>&#8220;Today, virtually every business is becoming a software business, and AI has made software easier than ever to create. In this world, we believe your design, your craft and your brand&#8217;s point of view is what&#8217;s going to make your product and your company stand out. <strong>Design is now the differentiator. It&#8217;s how companies win or lose.</strong>&#8221;</p><p><strong>Dylan Field, Figma Q2 Earnings Call</strong></p></div><div class="pullquote"><p>When people hear [Data Cloud], naturally assume, this must be a Snowflake competitor or a Databricks competitor, and that&#8217;s just not the case. <strong>Snowflake and Databricks and BigQuery and Redshift are among our biggest partners.</strong></p><p>Salespeople do not log in to Snowflake. Snowflake is fantastic, but it&#8217;s really analysts that tend to log into Snowflake, not salespeople. People in the contact center do not log into Databricks... That&#8217;s the problem that Data 360 really, really, really solves.</p><p><strong>Stephen Fisher, Salesforce Analyst Day</strong></p></div><div class="pullquote"><p>Our most recent Adobe Digital Index data which is based on online transactions across over 1 trillion visits to U.S. retail sites, shows that <strong>LLM traffic grew 4,700% year-over-year in July 2025.</strong> The rapid changes in consumer behavior and expectations in the era of AI are forcing brands to reinvent marketing and customer experience.</p><p><strong>Anil Chakravarthy, Adobe Q3 Earnings Call</strong></p></div><div class="pullquote"><p>We estimate <strong>80% of the leading AI companies already rely on us</strong>. A huge percentage of the Internet sits behind us. 
The agents of the future will inherently have to pass through our network and abide by its rules. And as they do, we will help set the protocols, guardrails and business rules for the Agentic Internet of the future.</p><p><strong>Matthew Prince, Cloudflare Q3 Earnings Call</strong></p></div><div class="pullquote"><p>Machines simply can&#8217;t govern themselves, AI is like any other enterprise asset, it needs to be <strong>cataloged, tracked, supervised and secured.</strong> ServiceNow&#8217;s configuration management leadership gives us and our customers a clean single pane of glass to <strong>govern</strong> all artificial intelligence.</p><p><strong>Bill McDermott, ServiceNow Q3 Earnings Call</strong></p></div><div><hr></div><p><em>Have any feedback? Email me at akash@earlybird.com.</em></p>]]></content:encoded></item><item><title><![CDATA[Building Enterprise AI: Databricks' Chief AI Officer Maria Zervou]]></title><description><![CDATA[Governing Agents In Production]]></description><link>https://www.akashbajwa.co/p/building-enterprise-ai-databricks</link><guid isPermaLink="false">https://www.akashbajwa.co/p/building-enterprise-ai-databricks</guid><dc:creator><![CDATA[Akash Bajwa]]></dc:creator><pubDate>Mon, 20 Oct 2025 06:00:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!uoBe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffc45e84-e731-4a79-8471-a96c68cfa721_4032x3024.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Software Synthesis analyses the evolution of <strong>software companies in the age of AI</strong> - from how they're built and scaled, to how they go to market and create enduring value. 
You can reach <strong><a href="https://www.linkedin.com/in/akashbajwa/">me</a></strong> at <strong>akash@earlybird.com</strong>.</em></p><div><hr></div><p><em><strong>Gradient Descending Roundtables in London:</strong></em></p><blockquote><ul><li><p><strong>October 29th: </strong><a href="https://luma.com/29bjx5vn">Rubrics as Reward: RL Beyond Verifiable Domains with Scale AI</a></p></li></ul><div><hr></div></blockquote><p>Last week, we hosted <a href="https://www.linkedin.com/in/maria-zervou-533222107/">Maria Zervou</a>, <strong>Chief AI Officer EMEA</strong> at <strong>Databricks</strong> for a <em><strong><a href="https://luma.com/gradientdescending?k=c">Gradient Descending</a></strong></em> roundtable. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uoBe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffc45e84-e731-4a79-8471-a96c68cfa721_4032x3024.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uoBe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffc45e84-e731-4a79-8471-a96c68cfa721_4032x3024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!uoBe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffc45e84-e731-4a79-8471-a96c68cfa721_4032x3024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!uoBe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffc45e84-e731-4a79-8471-a96c68cfa721_4032x3024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!uoBe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffc45e84-e731-4a79-8471-a96c68cfa721_4032x3024.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uoBe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffc45e84-e731-4a79-8471-a96c68cfa721_4032x3024.jpeg" width="1456" height="1941" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ffc45e84-e731-4a79-8471-a96c68cfa721_4032x3024.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1941,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2485412,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/176584563?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffc45e84-e731-4a79-8471-a96c68cfa721_4032x3024.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uoBe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffc45e84-e731-4a79-8471-a96c68cfa721_4032x3024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!uoBe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffc45e84-e731-4a79-8471-a96c68cfa721_4032x3024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!uoBe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffc45e84-e731-4a79-8471-a96c68cfa721_4032x3024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!uoBe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffc45e84-e731-4a79-8471-a96c68cfa721_4032x3024.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Databricks&#8217; <a href="https://tomtunguz.com/databricks-center-ai/">phenomenal growth at scale</a> (50% YoY, $4bn run-rate, 140% NDR) underlines its position as the leading platform for enterprises to deploy AI.</p><p>The roundtable covered a lot of ground, with the below key takeaways:</p><h3><strong>Data Intelligence vs. 
General AI</strong></h3><p>Databricks&#8217; approach is distinct from generic LLM applications:</p><ul><li><p><strong>General AI</strong>: Using OpenAI/Anthropic models with prompt engineering alone</p></li><li><p><strong>Data Intelligence</strong>: Grounding AI in proprietary data with robust governance</p></li><li><p>Competitive advantage comes from data access, not just model selection</p></li></ul><h3><strong>Three Pillars of Databricks&#8217; AI Platform</strong></h3><p><strong>Governance (Unity Catalog)</strong></p><ul><li><p>Centralised permissions for data, models, and tools</p></li><li><p>End-to-end lineage tracking from data &#8594; application &#8594; end user</p></li><li><p>Critical for EU compliance and financial services</p></li><li><p>Open source governance layer</p></li></ul><p><strong>Grounded AI</strong></p><ul><li><p>Opening data/tools to agents in controlled ways</p></li><li><p>Python tools, MCP servers, SQL access, vector stores</p></li><li><p>Permission propagation from end user to data layer</p></li></ul><p><strong>Flexible AI</strong></p><ul><li><p>Partnership with all major cloud providers, OpenAI, Anthropic</p></li><li><p>Support for all open-source models (including Chinese models via self-hosting)</p></li><li><p>Model routing capabilities for cost/performance optimisation</p></li></ul><h3><strong>The Evaluation Challenge</strong></h3><p><strong>Current State:</strong></p><ul><li><p>Customers don&#8217;t know which models work best for specific tasks</p></li></ul><p><strong>Databricks&#8217; Solution:</strong></p><ul><li><p>Custom &#8220;judges&#8221; (fine-tuned models for specific evaluation tasks)</p><ul><li><p>Out-of-box judges for: grounding, safety, bias</p></li></ul></li><li><p>MLflow-based evaluation framework (open source)</p></li><li><p>Version tracking and A/B testing capabilities</p></li><li><p>SME labelling sessions for continuous improvement</p></li></ul><h3><strong>What&#8217;s Happening In The Field 
</strong></h3><p><strong>Potential Reasons For The MIT &#8220;95% Failure Rate&#8221; Report:</strong></p><ul><li><p>Primary reason: AI products cannot log, trace, understand, or prove value</p></li><li><p>Lack of end-user involvement in development process</p></li><li><p>Insufficient hands-on support and change management</p></li></ul><p><strong>Real-World Agent Limitations:</strong></p><ul><li><p>Multi-step execution: Maximum ~5 steps in production</p></li><li><p>Steps are <strong>templatised and controlled</strong>, not free-form</p></li><li><p>Financial services require predefined workflows with reasoning at each step</p></li></ul><p><strong>Human-in-the-Loop is Reality:</strong></p><ul><li><p>Automation dream vs. reality disconnect</p></li><li><p>Even with reasoning capabilities, manual approval required</p></li><li><p>Example: Credit limit increases still need human sign-off</p></li></ul><p><strong>Fine-tuning?</strong></p><ul><li><p><strong>Not much adoption</strong> despite initial hype</p></li><li><p>Reasons:</p><ul><li><p>Maintenance burden (retraining, drift monitoring)</p></li><li><p>Unclear ROI vs. 
larger base models</p></li><li><p>Only justified for specific industry vocabulary or behaviour constraints</p><ul><li><p>Example: Medical app - never give advice, only facilitate conversation</p></li></ul></li></ul></li></ul><h3><strong>Use Case Evolution</strong></h3><p><strong>Three Tiers of AI Adoption:</strong></p><ol><li><p><strong>Productivity Improvements</strong> (2023-2024)</p><ul><li><p>Copilots for individual users</p></li><li><p>Incremental efficiency gains</p></li></ul></li><li><p><strong>Process Automation</strong> (Current focus)</p><ul><li><p>Templatising business processes</p></li><li><p>Agent-driven workflow automation</p></li><li><p>Scaling human tasks</p></li></ul></li><li><p><strong>Business Model Innovation</strong> (Emerging)</p><ul><li><p>Complete rethinking of value propositions</p></li><li><p>Example: Behavioural assessment company &#8594; AI-powered team formation and task assignment platform</p></li><li><p>Represents genuine innovation beyond productivity</p></li></ul></li></ol><h3><strong>Agent Development: Different Approaches With Databricks</strong></h3><p><strong>1. Full Hands-On Development</strong></p><ul><li><p>Complete control over agent behaviour</p><ul><li><p>For complex, custom requirements</p></li><li><p>Requires engineering expertise</p></li></ul></li></ul><p><strong>2. 
Agent Bricks (Semi-Hands-On)</strong></p><ul><li><p>Pre-built components:</p><ul><li><p>Information extraction</p></li><li><p>Knowledge assistant (RAG)</p></li><li><p>AI/BI Genie (text-to-SQL)</p></li><li><p>Custom LLM (fine-tuning)</p></li><li><p>Multi-agent supervisor</p></li></ul></li><li><p>Still requires understanding of extraction processes and evaluation</p></li><li><p>Not for non-technical business users</p></li></ul><h3><strong>GTM Dynamics</strong></h3><p><strong>Historical Approach (Pre-2024):</strong></p><ul><li><p>Engineering-focused, bottom-up sales</p></li><li><p>Build POCs/MVPs with engineering teams</p></li><li><p>Blocked at large organisations without C-suite buy-in</p></li></ul><p><strong>Current Approach:</strong></p><ul><li><p>Top-down + bottom-up simultaneously</p></li><li><p>FDEs (Forward Deployed Engineers) for hands-on work</p></li><li><p>C-suite positioning for change management</p></li><li><p>Industry roundtables for peer learning</p></li></ul><p><strong>What Works:</strong></p><ul><li><p>Bringing customers together to share implementations</p><ul><li><p>&#8220;Customers don&#8217;t want to be left behind&#8221;</p></li></ul></li><li><p>Success tracking: New use cases 1-2 months post-event</p></li><li><p>Digital natives adopt much faster than traditional enterprises</p></li></ul><h3><strong>Data Marketplace</strong></h3><p><strong>Delta Sharing:</strong></p><ul><li><p>Structured data exchange between companies</p></li><li><p>Use case examples:</p><ul><li><p>Retail: Suppliers &#8596; Supermarkets</p></li><li><p>Travel: Airports &#8596; Airlines</p></li></ul></li><li><p>Enables better collaborative models without competitive compromise</p></li></ul><p><strong>Solution Marketplace:</strong></p><ul><li><p>Customers can sell bespoke agents to other companies</p><ul><li><p>Example: Travel industry agent</p></li></ul></li><li><p>Revenue stream for customers</p></li></ul><h3><strong>MCP Integration</strong></h3><ul><li><p>Databricks has made 
a large investment in MCP support</p></li><li><p>Auto-release of tools as MCP servers (text-to-SQL, similarity search)</p></li><li><p>Bringing external MCPs into Unity Catalog with permission management</p></li><li><p>Planning MCP/agent marketplace</p></li><li><p>External MCP usage: Expect usage-based charging similar to APIs</p></li></ul><div><hr></div><h3><em><strong>Data</strong></em></h3><p><em>Coatue&#8217;s updated Fantastic 40</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mbIW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05da47d6-f287-4747-b5a8-881cb6e57851_2624x1196.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mbIW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05da47d6-f287-4747-b5a8-881cb6e57851_2624x1196.png 424w, https://substackcdn.com/image/fetch/$s_!mbIW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05da47d6-f287-4747-b5a8-881cb6e57851_2624x1196.png 848w, https://substackcdn.com/image/fetch/$s_!mbIW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05da47d6-f287-4747-b5a8-881cb6e57851_2624x1196.png 1272w, https://substackcdn.com/image/fetch/$s_!mbIW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05da47d6-f287-4747-b5a8-881cb6e57851_2624x1196.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!mbIW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05da47d6-f287-4747-b5a8-881cb6e57851_2624x1196.png" width="1456" height="664" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/05da47d6-f287-4747-b5a8-881cb6e57851_2624x1196.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:664,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:186989,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/176584563?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05da47d6-f287-4747-b5a8-881cb6e57851_2624x1196.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mbIW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05da47d6-f287-4747-b5a8-881cb6e57851_2624x1196.png 424w, https://substackcdn.com/image/fetch/$s_!mbIW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05da47d6-f287-4747-b5a8-881cb6e57851_2624x1196.png 848w, https://substackcdn.com/image/fetch/$s_!mbIW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05da47d6-f287-4747-b5a8-881cb6e57851_2624x1196.png 1272w, https://substackcdn.com/image/fetch/$s_!mbIW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05da47d6-f287-4747-b5a8-881cb6e57851_2624x1196.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://www.coatue.com/fantastic-40">Source</a></figcaption></figure></div><p><em>OpenAI&#8217;s ramp is unprecedented by a mile, dominated by ChatGPT</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mkMM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35246cf5-9dbe-4265-9533-807bc1fe5048_1042x486.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source 
type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mkMM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35246cf5-9dbe-4265-9533-807bc1fe5048_1042x486.png 424w, https://substackcdn.com/image/fetch/$s_!mkMM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35246cf5-9dbe-4265-9533-807bc1fe5048_1042x486.png 848w, https://substackcdn.com/image/fetch/$s_!mkMM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35246cf5-9dbe-4265-9533-807bc1fe5048_1042x486.png 1272w, https://substackcdn.com/image/fetch/$s_!mkMM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35246cf5-9dbe-4265-9533-807bc1fe5048_1042x486.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mkMM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35246cf5-9dbe-4265-9533-807bc1fe5048_1042x486.png" width="1042" height="486" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/35246cf5-9dbe-4265-9533-807bc1fe5048_1042x486.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:486,&quot;width&quot;:1042,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:217218,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.akashbajwa.co/i/176584563?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35246cf5-9dbe-4265-9533-807bc1fe5048_1042x486.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mkMM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35246cf5-9dbe-4265-9533-807bc1fe5048_1042x486.png 424w, https://substackcdn.com/image/fetch/$s_!mkMM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35246cf5-9dbe-4265-9533-807bc1fe5048_1042x486.png 848w, https://substackcdn.com/image/fetch/$s_!mkMM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35246cf5-9dbe-4265-9533-807bc1fe5048_1042x486.png 1272w, https://substackcdn.com/image/fetch/$s_!mkMM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35246cf5-9dbe-4265-9533-807bc1fe5048_1042x486.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Source: Morgan Stanley, The Information</figcaption></figure></div><p><em>Reflection AI might change things, but it is clear that a large share of the AI ecosystem now relies on truly open models from China</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8Kfg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4755c48-a538-4960-9328-2843df604d6d_647x496.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8Kfg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4755c48-a538-4960-9328-2843df604d6d_647x496.png 424w, https://substackcdn.com/image/fetch/$s_!8Kfg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4755c48-a538-4960-9328-2843df604d6d_647x496.png 848w, https://substackcdn.com/image/fetch/$s_!8Kfg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4755c48-a538-4960-9328-2843df604d6d_647x496.png 1272w, https://substackcdn.com/image/fetch/$s_!8Kfg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4755c48-a538-4960-9328-2843df604d6d_647x496.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!8Kfg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4755c48-a538-4960-9328-2843df604d6d_647x496.png" width="647" height="496" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e4755c48-a538-4960-9328-2843df604d6d_647x496.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:496,&quot;width&quot;:647,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Line chart titled How each companys best open weight model ranks on LMS Arena with rankings from January 2024 to July 2025 showing US company in blue Chinese in red and other in gray lines for models from Meta Google Nvidia Mistral Cohere Alibaba DeepSeek Zai Minmax Moonshot and others with source LMS Arena.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Line chart titled How each companys best open weight model ranks on LMS Arena with rankings from January 2024 to July 2025 showing US company in blue Chinese in red and other in gray lines for models from Meta Google Nvidia Mistral Cohere Alibaba DeepSeek Zai Minmax Moonshot and others with source LMS Arena." title="Line chart titled How each companys best open weight model ranks on LMS Arena with rankings from January 2024 to July 2025 showing US company in blue Chinese in red and other in gray lines for models from Meta Google Nvidia Mistral Cohere Alibaba DeepSeek Zai Minmax Moonshot and others with source LMS Arena." 
srcset="https://substackcdn.com/image/fetch/$s_!8Kfg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4755c48-a538-4960-9328-2843df604d6d_647x496.png 424w, https://substackcdn.com/image/fetch/$s_!8Kfg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4755c48-a538-4960-9328-2843df604d6d_647x496.png 848w, https://substackcdn.com/image/fetch/$s_!8Kfg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4755c48-a538-4960-9328-2843df604d6d_647x496.png 1272w, https://substackcdn.com/image/fetch/$s_!8Kfg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4755c48-a538-4960-9328-2843df604d6d_647x496.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://x.com/home">Source</a></figcaption></figure></div><div><hr></div><h3><em>Reading</em></h3><p><a href="https://thedatasource.substack.com/p/the-data-source-28-mapping-out-the">The Data Source #28: Mapping out the Future of Compute &#127758;</a></p><p><a href="https://x.com/joshua_xu_/status/1978837502787219578/?s=12&amp;t=LZY0YqfYwlnmbBEhDFCL8Q&amp;rw_tt_thread=True">Building in the AI Era: The HeyGen Way</a></p><p><a href="https://review.firstround.com/sierra-design-partnership/">The Hard Way Pays Off: Inside Sierra&#8217;s Design Partner Strategy</a></p><p><a href="https://joincolossus.com/article/joshua-kushner-thrive-new-world/">The New World</a></p><p><a href="https://every.to/chain-of-thought/seeing-science-like-a-language-model?ph_email=akash%40earlybird.com">Seeing Science Like a Language Model</a></p><div><hr></div><p><em>Have any feedback? 
Email me at akash@earlybird.com.</em></p>]]></content:encoded></item></channel></rss>