<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Wangari]]></title><description><![CDATA[A particle physicist turned founder, thinking out loud about enterprise AI, agentic systems, and how to responsibly use the technology reshaping how we work.]]></description><link>https://newsletter.wangari.global</link><image><url>https://substackcdn.com/image/fetch/$s_!cVMw!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda913d1d-ccae-463d-bca0-2752f45cdcc4_778x778.png</url><title>Wangari</title><link>https://newsletter.wangari.global</link></image><generator>Substack</generator><lastBuildDate>Thu, 18 Jun 2026 18:13:47 GMT</lastBuildDate><atom:link href="https://newsletter.wangari.global/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Ari Joury]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[contact@wangari.global]]></webMaster><itunes:owner><itunes:email><![CDATA[contact@wangari.global]]></itunes:email><itunes:name><![CDATA[Ari Joury]]></itunes:name></itunes:owner><itunes:author><![CDATA[Ari Joury]]></itunes:author><googleplay:owner><![CDATA[contact@wangari.global]]></googleplay:owner><googleplay:email><![CDATA[contact@wangari.global]]></googleplay:email><googleplay:author><![CDATA[Ari Joury]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[The Actuarial Time Trap]]></title><description><![CDATA[Why highly paid professionals spend 80% of their time formatting data, and how agentic AI can finally break the cycle.]]></description><link>https://newsletter.wangari.global/p/the-actuarial-time-trap</link><guid 
isPermaLink="false">https://newsletter.wangari.global/p/the-actuarial-time-trap</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Tue, 16 Jun 2026 06:02:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!f7eR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc04e432-0a42-40fd-8c97-e40bb55d095d_1344x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!f7eR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc04e432-0a42-40fd-8c97-e40bb55d095d_1344x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset image2-full-screen"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!f7eR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc04e432-0a42-40fd-8c97-e40bb55d095d_1344x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!f7eR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc04e432-0a42-40fd-8c97-e40bb55d095d_1344x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!f7eR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc04e432-0a42-40fd-8c97-e40bb55d095d_1344x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!f7eR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc04e432-0a42-40fd-8c97-e40bb55d095d_1344x768.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!f7eR!,w_5760,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc04e432-0a42-40fd-8c97-e40bb55d095d_1344x768.jpeg" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fc04e432-0a42-40fd-8c97-e40bb55d095d_1344x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;full&quot;,&quot;height&quot;:768,&quot;width&quot;:1344,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:298234,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.wangari.global/i/201640131?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc04e432-0a42-40fd-8c97-e40bb55d095d_1344x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-fullscreen" alt="" srcset="https://substackcdn.com/image/fetch/$s_!f7eR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc04e432-0a42-40fd-8c97-e40bb55d095d_1344x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!f7eR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc04e432-0a42-40fd-8c97-e40bb55d095d_1344x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!f7eR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc04e432-0a42-40fd-8c97-e40bb55d095d_1344x768.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!f7eR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc04e432-0a42-40fd-8c97-e40bb55d095d_1344x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Highly trained individuals are often sifting through complex data manually, instead of giving their time to value-adding tasks. 
Image generated with Leonardo AI</figcaption></figure></div><p>Walk into the actuarial department of any major insurance firm, and you will find some of the most highly educated, analytically brilliant minds in the corporate world. These are professionals trained in advanced mathematics, probability theory, and complex risk modeling.</p><p>Now, ask them how they spend the majority of their working hours.</p><p>The answer is rarely &#8220;building sophisticated risk models&#8221; or &#8220;developing innovative pricing strategies.&#8221; More often than not, the answer is &#8220;wrestling with spreadsheets,&#8221; &#8220;reconciling data from legacy systems,&#8221; or &#8220;formatting regulatory reports.&#8221;</p><p>This is the Actuarial Time Trap. It is a systemic misallocation of human capital, where highly paid experts are relegated to performing routine data manipulation tasks. It is inefficient, it is demoralizing, and in an era of rapidly evolving regulatory requirements, it is increasingly unsustainable.</p><h2>The Burden of Regulatory Reporting</h2><p>The insurance industry is governed by some of the most complex regulatory frameworks in the world. Regimes like Solvency II in Europe and IFRS 17 globally require insurers to produce massive, highly detailed reports on a regular basis.</p><p>These reports are not simple data dumps. They require aggregating data from dozens of disparate systems, applying complex actuarial models, and presenting the results in strictly defined formats.</p><p>Historically, this process has been heavily manual. Actuaries spend weeks pulling data, running macros, checking for errors, and formatting the final output. By the time the report is submitted, the data is often stale, and the actuaries are exhausted.</p><p>This manual approach is not just slow; it is prone to error. When humans are forced to perform repetitive data manipulation tasks, mistakes are inevitable. 
And in the context of regulatory reporting, mistakes can result in significant fines and reputational damage.</p><h2>The Promise (and Failure) of Traditional Automation</h2><p>The industry has recognized this problem for years, and has attempted to solve it with traditional automation tools. Robotic Process Automation (RPA) bots have been deployed to scrape data from legacy systems. Complex ETL (Extract, Transform, Load) pipelines have been built to consolidate data warehouses.</p><p>These efforts have yielded incremental improvements, but they have failed to fundamentally break the Actuarial Time Trap.</p><p>The problem with traditional automation is that it is rigid. An RPA bot can follow a strict set of rules, but it cannot adapt to unexpected changes in data formats or system interfaces. An ETL pipeline can move data from point A to point B, but it cannot understand the semantic meaning of that data or apply complex business logic.</p><p>Traditional automation is brittle. When the environment changes&#8212;as it inevitably does in a complex enterprise&#8212;the automation breaks, and the actuaries are forced to step back in and fix the mess.</p><h2>The Agentic AI Solution</h2><p>This is where agentic AI represents a paradigm shift.</p><p>Unlike traditional automation, agentic AI systems are not bound by rigid rules. They are capable of reasoning, adapting, and executing complex, multi-step workflows autonomously.</p><p>An agentic AI reporting system can be instructed to &#8220;generate the Q3 Solvency II report.&#8221; The system can then autonomously identify the required data sources, retrieve the data, apply the necessary actuarial models, format the output according to regulatory standards, and flag any anomalies for human review.</p><p>Crucially, agentic AI systems can handle the ambiguity and variability that break traditional automation. 
If a data field is missing or formatted incorrectly, the agent can use its reasoning capabilities to infer the correct value or query an upstream system for clarification.</p><p>However, deploying these systems in actuarial work is not without risk. As the IFoA GenAI Working Party highlights, the complexity and autonomy of a network of AI agents <a href="https://ifoagenai.substack.com/p/emerging-risks-of-agentic-ai-in-actuarial">add a new dimension to risks</a>, making them harder to manage and requiring dynamic governance frameworks.</p><h2>The Importance of Auditability</h2><p>In the actuarial domain, automation without auditability is useless. Regulators do not accept &#8220;the AI generated it&#8221; as a valid explanation for a reporting anomaly.</p><p>This is why the deployment of agentic AI in insurance must be underpinned by causal reasoning and rigorous governance. Every action taken by the agent&#8212;every data retrieval, every transformation, every calculation&#8212;must be logged and explainable.</p><p>The system must be able to produce a transparent audit trail that traces the final output back to its source data, demonstrating exactly how the result was derived. This level of transparency is not just a regulatory requirement; it is essential for building trust among the actuaries who will ultimately rely on the system.</p><h2>The Human-AI Collaboration Model</h2><p>The most effective deployments of agentic AI in the actuarial domain are not fully autonomous. They are collaborative. The agent handles the mechanical work&#8212;data retrieval, transformation, and initial formatting&#8212;while the actuary retains oversight and final sign-off authority.</p><p>This human-AI collaboration model is not a compromise; it is the optimal design. 
It leverages the speed and consistency of AI for the tasks where it excels, while preserving the contextual judgment and professional accountability of the human expert for the tasks that require it.</p><p>Designing this collaboration effectively requires careful attention to the interface between the human and the machine. The agent must surface its outputs in a way that is transparent and auditable, making it easy for the actuary to verify the work and understand the reasoning behind any flagged anomalies. If the agent cannot explain its output, the actuary cannot responsibly sign off on it.</p><h2>Reclaiming Human Capital</h2><p>The goal of deploying agentic AI in the actuarial department is not to replace actuaries. It is to liberate them.</p><p>By automating the routine, repetitive tasks of data manipulation and report formatting, agentic AI frees up actuaries to focus on the high-value work they were trained to do. They can spend their time analyzing the data, identifying emerging risks, and developing strategic insights that drive the business forward.</p><p>This is the core mission of Wangari. We build agentic and causal AI infrastructure designed specifically to automate complex regulatory reporting. We provide the reliability, auditability, and deep integration required to deploy these systems safely in highly regulated environments.</p><p>The Actuarial Time Trap is not an inevitable reality of the insurance industry. It is a solvable problem. And with the advent of agentic AI, we finally have the tools to solve it.</p><div><hr></div><h1>Meanwhile, at Wangari</h1><p>We are now in Week 2 of the &#8220;From Demo to Production&#8221; course, and the cohort is diving deep into the complexities of AI orchestration patterns and workflows.</p><p>We&#8217;ve already had some fascinating discussions about what it means to move beyond simple accuracy metrics &#8212; or to even build a working system that you can measure in the first place. 
We have had robust discussions about how it&#8217;s almost never the model&#8217;s fault, and how much human (!) labor is needed to get AI systems anywhere close to production-ready.</p><p>The insights generated by this cohort are already shaping the way we think about AI orchestration at Wangari. It is a powerful reminder that the best way to learn is to teach, and the best way to build robust systems is to engage with a community of practitioners facing the same challenges.</p><div><hr></div><h1>Reads of the Week</h1><ul><li><p><a href="https://ifoagenai.substack.com/p/emerging-risks-of-agentic-ai-in-actuarial">Emerging Risks of Agentic AI in Actuarial Work</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Nnamdi Odozi&quot;,&quot;id&quot;:226650,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!udVU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32fefaf9-9208-4039-bc14-882979e5f26f_144x144.png&quot;,&quot;uuid&quot;:&quot;09d76994-ad59-4b99-b43c-85785a280b86&quot;}" data-component-name="MentionToDOM"></span> and <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Josh Blake&quot;,&quot;id&quot;:408807412,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!JF6c!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43a25b18-60e7-46e0-b92e-98d9848c0b8f_144x144.png&quot;,&quot;uuid&quot;:&quot;56c86911-8abd-4764-8916-0a0b8eade1f8&quot;}" data-component-name="MentionToDOM"></span>: An essential read on the unique challenges and ethical considerations of deploying autonomous agents in the actuarial profession. 
The authors provide a clear-eyed assessment of how traditional governance frameworks must evolve to handle systems that can reason and act independently.</p></li><li><p><a href="https://theneuralmaze.substack.com/p/hidden-technical-debt-in-agentic">Hidden Technical Debt in Agentic Systems</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Miguel Otero Pedrido&quot;,&quot;id&quot;:89972117,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!LZBx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b58b1f5-4d25-4dcf-9f48-b67a6e6e1316_1200x1200.jpeg&quot;,&quot;uuid&quot;:&quot;8d5a40af-07e8-4ad5-88be-9d15580670d8&quot;}" data-component-name="MentionToDOM"></span>: A reminder that the true complexity of AI automation lies not in the model, but in the surrounding infrastructure. This piece is particularly relevant for actuarial teams looking to move beyond simple pilot projects and build resilient, production-grade reporting pipelines.</p></li><li><p><a href="https://benn.substack.com/p/can-analysis-ever-be-automated">Can analysis ever be automated?</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Benn Stancil&quot;,&quot;id&quot;:5667744,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/a317e60a-9bd1-4c75-bb54-66d517f735dc_1100x1100.jpeg&quot;,&quot;uuid&quot;:&quot;34dac1b9-4204-4459-8282-5138db7dbb9c&quot;}" data-component-name="MentionToDOM"></span>: A thoughtful exploration of the limits of automation in data analysis and the enduring need for human judgment. 
Stancil argues that while AI can accelerate the mechanical aspects of analysis, the strategic interpretation of data remains a fundamentally human endeavor.</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Why Most World Cup Predictions Are Wrong (And Why I Wrote a Book About Soccer ML Anyway)]]></title><description><![CDATA[Every four years, the models say Brazil. Every four years, the World Cup disagrees (except when Brazil actually wins and nobody predicted it).]]></description><link>https://newsletter.wangari.global/p/why-most-world-cup-predictions-are</link><guid isPermaLink="false">https://newsletter.wangari.global/p/why-most-world-cup-predictions-are</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Fri, 12 Jun 2026 06:00:57 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/200636250/62c3742c05ea17fecc4169bd5e5eb2e3.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>In this episode, Ari Joury (PhD, particle physics; Founder &amp; CEO of Wangari Global) turns his attention to the 2026 World Cup &#8212; and to why the machine learning models built to predict it are confidently wrong in specific, predictable ways. Drawing on his upcoming O&#8217;Reilly book <em>Soccer Analytics with Machine Learning</em>, he walks through four failure modes: the distribution shift between club and international football, the small-sample limits of Expected Goals (xG), the form-transfer illusion, and the incentive structures that push analysts to publish flashy numbers over honest ones. He then flips the argument: what actually does predict tournament outcomes, and what does that tell us about where ML earns its keep versus where it just looks like it does? 
The bigger lesson here is not about soccer &#8212; it is about knowing which questions your model can actually answer with the data you have.</p><p>Topics covered: Soccer analytics, World Cup prediction, Expected Goals (xG), distribution shift, small-sample statistics, feature engineering, predictive modeling, O&#8217;Reilly Media, enterprise AI context.</p><p>Wangari is the newsletter and podcast for practitioners and leaders navigating the real work of enterprise AI. New episodes every Friday.</p><p><a href="https://wangari.global/contact">https://wangari.global/contact</a></p><p>Ari&#8217;s book (O&#8217;Reilly Media, early release out now and officially out around June 25): <a href="https://learning.oreilly.com/library/view/soccer-analytics-with/9781098181109/">Soccer Analytics with Machine Learning</a></p>]]></content:encoded></item><item><title><![CDATA[Stop Trusting ML Predictions for the 2026 World Cup. Here's Why.]]></title><description><![CDATA[I wrote a book about soccer analytics with machine learning. 
The World Cup is where most of it breaks.]]></description><link>https://newsletter.wangari.global/p/stop-trusting-ml-predictions-for</link><guid isPermaLink="false">https://newsletter.wangari.global/p/stop-trusting-ml-predictions-for</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Tue, 09 Jun 2026 06:02:07 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!cL5s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee365467-1056-4022-b157-f5e605883ee1_1344x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cL5s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee365467-1056-4022-b157-f5e605883ee1_1344x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset image2-full-screen"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cL5s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee365467-1056-4022-b157-f5e605883ee1_1344x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cL5s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee365467-1056-4022-b157-f5e605883ee1_1344x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cL5s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee365467-1056-4022-b157-f5e605883ee1_1344x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!cL5s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee365467-1056-4022-b157-f5e605883ee1_1344x768.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!cL5s!,w_5760,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee365467-1056-4022-b157-f5e605883ee1_1344x768.jpeg" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ee365467-1056-4022-b157-f5e605883ee1_1344x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;full&quot;,&quot;height&quot;:768,&quot;width&quot;:1344,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:385485,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.wangari.global/i/200633364?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee365467-1056-4022-b157-f5e605883ee1_1344x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-fullscreen" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cL5s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee365467-1056-4022-b157-f5e605883ee1_1344x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cL5s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee365467-1056-4022-b157-f5e605883ee1_1344x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cL5s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee365467-1056-4022-b157-f5e605883ee1_1344x768.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!cL5s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee365467-1056-4022-b157-f5e605883ee1_1344x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Machine learning just isn&#8217;t the right tool for a rare, weird event. Image generated with Leonardo AI</figcaption></figure></div><p>Every four years, the same thing happens.</p><p>A major football analytics site publishes its World Cup predictions. A few academic groups release theirs. 
Goldman Sachs releases one too, because Goldman Sachs releases one of everything. They run thousands of simulations, train models on Bundesliga data, layer in xG and shot-quality metrics, and then they tell you, with a confidence interval, that Brazil has a 17.2% chance of lifting the trophy.</p><p>Then the World Cup happens, and it doesn&#8217;t.</p><p>I&#8217;m not going to argue that ML can&#8217;t predict football. I just wrote a book about doing exactly that. What I&#8217;m going to argue is something narrower and more uncomfortable: most of the techniques that make ML useful in club football break down for the World Cup, and the analytics community has spent two decades pretending this isn&#8217;t true.</p><p>There are incentives for this (sports betting, anyone?), but that doesn&#8217;t make it any truer.</p><h2>The training data is wrong</h2><p>Football ML lives and dies on data. The richest dataset we have is the club game: five major European leagues, multiple cup competitions, roughly 1,800 top-flight matches a year across those five leagues. Decades of it. Coaches who manage 50 matches a season for ten years. Players who play together every single week.</p><p>International football has approximately none of that.</p><p>A World Cup squad spends about 25 days together in the year leading up to the tournament. The starting eleven you&#8217;ll see against Argentina in June has almost certainly never played together in that exact lineup this calendar year. Your fancy possession network metric? It was trained on teams that had 200 matches of muscle memory. The model doesn&#8217;t know that the back four you&#8217;re feeding it has played together six times.</p><p>The technical name for this is &#8220;distribution shift.&#8221; The honest name is &#8220;we have no idea what we&#8217;re predicting.&#8221; Most public World Cup models paper over this by aggregating individual player ratings into team strength scores. That sort of works for the group stage. 
It collapses in the knockout rounds, where formations get cagey, managers go conservative, and one substitution rewires the whole tactical setup.</p><p>If you&#8217;re going to deploy ML in a domain, the first question to ask is whether your training distribution matches your deployment distribution. For the World Cup, the honest answer is <em>not even close.</em></p><h2>xG was never built for this</h2><p>Expected goals is the best metric football analytics has produced. I use it in the book, I think it&#8217;s genuinely a step forward, and I&#8217;d defend it against anyone who calls it nonsense.</p><p>But xG is a <em>shot-level</em> metric trained on Premier League and Bundesliga shot data. It was designed for repeated trials in similar contexts. The World Cup gives you seven matches, maybe, for the team you care about. Half of those matches end in 1&#8211;0 results or worse. Aggregate-xG noise dominates the signal at small sample sizes &#8212; this is a basic statistical fact that gets quietly ignored when people pump World Cup predictions through their xG-based pipelines.</p><p>There&#8217;s a deeper problem. xG models are trained on &#8220;normal&#8221; attacking play in club football. They have no idea what to do with extra time in a knockout match where one team has been parking the bus for 30 minutes and is now trying to score on a single counter. The data those models learned from doesn&#8217;t contain very many of those situations. International knockout football contains a lot of them.</p><p>You can absolutely build an xG model that works for the World Cup. You just can&#8217;t use the Premier League one and assume it transfers.</p><h2>Form doesn&#8217;t transfer</h2><p>Here&#8217;s a thought experiment. A striker scores 32 goals in La Liga from August to May. His national team plays four friendlies in the meantime, in which he scores zero. 
Which is the better predictive signal for what he&#8217;ll do at the World Cup?</p><p>Most public models implicitly say: weight the 32. Use his &#8220;true talent&#8221; as inferred from his club output. Plug it into the international model.</p><p>This is wrong for a specific reason. The 32 goals were scored in a system, with a manager he sees every day, with teammates who know exactly where to put the through-ball. He arrives at the World Cup as an extraordinary player attached to a team that has practiced his preferred runs perhaps twice. International form is the more honest signal for international striker output, even though the sample is brutal.</p><p>The football analytics community has known this for years. Every analyst will tell you so in private. None of the public predictive models I&#8217;ve seen meaningfully correct for it, because the obvious correction (downweighting club performance) catastrophically reduces the signal you have to work with.</p><p>You don&#8217;t get to wish away the small-sample problem by averaging in irrelevant data.</p><h2>So what does work?</h2><p>That&#8217;s a reasonable question for ambitious people to ask. If most of football ML breaks for the World Cup, what&#8217;s actually predictive?</p><p>A few things are not quite as fashionable as ML pipelines but get closer to the right answer:</p><p><strong>Squad market value.</strong> Boring, embarrassing, true. The total Transfermarkt valuation of a squad is one of the strongest publicly available predictors of tournament progression. Not because money buys winners, but because it&#8217;s a market-aggregated bet on individual quality, made by people with real money on the line. ML models often beat this baseline by only 1&#8211;2 percentage points, even with thousands of features. It&#8217;s worth asking what you&#8217;ve actually added.</p><p><strong>Manager continuity.</strong> Teams whose manager has been in place for 18+ months consistently outperform teams with new appointments. 
This is partly because they have a system, partly because the players have trust, and partly because the manager has had time to identify and stop using their worst players. It&#8217;s hard to put into a model cleanly; it shows up anyway when you do.</p><p><strong>Tournament experience as a team.</strong> Not as individuals. The cohort of players who&#8217;ve played a knockout international match together has more predictive power than aggregate caps. Spain&#8217;s 2010 team had two and a half cycles of building. France&#8217;s 2018 team had two. The 2022 Argentina team had a manager who&#8217;d been in place for four years. There&#8217;s a pattern, even though &#8220;tournament experience as a team&#8221; doesn&#8217;t fit cleanly into a feature vector.</p><p><strong>Group draw difficulty.</strong> Trivially obvious, but most ML models bake this into other features rather than respecting it as the structural variable it is. Whether your route to the semifinal goes through Brazil or through Costa Rica matters more than any in-game metric.</p><p>If your World Cup model can&#8217;t beat a simple weighted combination of those four, it isn&#8217;t doing anything that justifies its training cost.</p><h2>The deeper thing</h2><p>Football analytics keeps wanting to be Moneyball. It&#8217;s been trying for fifteen years. There&#8217;s been real progress &#8212; modern shot maps, possession value, EPV-style models, automated tracking data &#8212; and I&#8217;m not the guy who&#8217;s going to tell you it was wasted.</p><p>But the World Cup is the part of football where Moneyball logic breaks the hardest, because Moneyball relied on the law of large numbers. 162 baseball games. Repeated trials. The dice converge to their fair value over a long enough season.</p><p>The World Cup is seven matches per team. Maybe two of them are knockout games that go to penalties. The dice don&#8217;t converge over seven throws. 
They land somewhere, and you live with where they landed.</p><p>I wrote a whole book about how to use ML in soccer well &#8212; what to model, what not to, how to set up your training data, what to do about the messy stuff. <em>Soccer Analytics with Machine Learning</em>, out from O&#8217;Reilly at the end of June. About a tenth of it is World Cup-specific. The rest is the part that does work &#8212; the part you can use on the league football that fills the other 47 weeks of your year.</p><p>By the time the World Cup actually starts, half the predictive models you&#8217;ll see will already have been falsified by the warm-up matches. Watch the football. Watch the predictions. See for yourself which ones got Argentina&#8211;Saudi Arabia 2022.</p><p>I&#8217;d be impressed by anyone who got it right. I just wouldn&#8217;t pay them to do it again.</p><div><hr></div><h1>Meanwhile, at Wangari</h1><p>If you&#8217;re curious, the book I mentioned earlier is coming out from O&#8217;Reilly Media in a couple of weeks! An unedited early release is already available on the <a href="https://learning.oreilly.com/library/view/soccer-analytics-with/9781098181109/">O&#8217;Reilly Learning Platform</a> (it&#8217;ll be replaced by the final version as soon as we&#8217;ve finished the final touches with the book&#8217;s production team). 
</p><p>I&#8217;ll let you know when the final version is out &#8212; it will be available wherever books are sold.</p><div><hr></div><h1>Reads of the Week</h1><ul><li><p><a href="https://thexgfootballclub.substack.com/p/the-promise-and-limits-of-machine">The Promise and Limits of Machine Learning in Football Attacking Analysis</a>: In this deep dive for <em>The xG Football Club</em>, <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Alex Marin Felices&quot;,&quot;id&quot;:71196796,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/211ae01f-3892-435e-b222-8a400f444a47_768x1024.jpeg&quot;,&quot;uuid&quot;:&quot;f4f0373b-9d06-430d-8120-2ef9e8b67687&quot;}" data-component-name="MentionToDOM"></span> reviews a landmark academic paper examining how supervised and unsupervised machine learning has been applied to attacking performance in professional football &#8212; from pass-pattern clustering to off-ball scoring opportunity models. He argues that while the field has moved well beyond simple event counts, most models still struggle to capture the contextual and interactive dynamics that actually drive goals. 
For a Wangari audience thinking about the gap between data richness and decision-making quality, this is a sharp reminder that more data does not automatically mean better insight.</p></li><li><p><a href="https://noenthuda.substack.com/p/does-liverpool-fc-have-a-data-science">Does Liverpool FC Have a Data Science Problem?</a>: In this essay, data scientist <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Karthik S&quot;,&quot;id&quot;:114082,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/a529fbd4-79cb-4391-97cc-ffe2c803834d_1280x960.jpeg&quot;,&quot;uuid&quot;:&quot;5a02d0e9-9045-46a6-b530-ca4b89915ca9&quot;}" data-component-name="MentionToDOM"></span> traces Liverpool&#8217;s post-Ian-Graham recruitment decline and argues the club is suffering from a classic failure of &#8220;non-agentic data science&#8221; &#8212; where models inform but do not recommend, and the critical translation layer between analysts and decision-makers has essentially broken down. The piece is a compelling case study in what happens when the head of a high-stakes data function moves on and institutional knowledge does not transfer cleanly. 
Anyone building or inheriting a data team in a high-variance, low-volume decision environment &#8212; think insurance underwriting or credit risk &#8212; will find the parallels uncomfortably familiar.</p></li><li><p><a href="https://xguff.substack.com/p/football-fans-are-drowning-in-data">Football Fans Are Drowning in Data, Starved of Wisdom</a>: In this essay for <em>xGuff (Expected Guff)</em>, <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Thomas Aston&quot;,&quot;id&quot;:240896996,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/88d6fc89-d8e9-4c42-9b11-04829178a92d_1648x1648.jpeg&quot;,&quot;uuid&quot;:&quot;76fc7166-8701-4e8a-9d9f-8cbc161eb09f&quot;}" data-component-name="MentionToDOM"></span> applies the Data&#8211;Information&#8211;Knowledge&#8211;Wisdom (DIKW) pyramid to the explosion of football analytics and finds the pyramid severely bottom-heavy: vast quantities of event and tracking data at the base, but precious little wisdom at the tip. He walks through vivid examples of how correct statistics are routinely used to reach wildly incorrect conclusions &#8212; from manager-sacking survival rates to the ongoing xG wars on TalkSport &#8212; and asks whether the volume of information is actually making the sport harder to understand. 
It is a timely provocation for anyone in data-heavy industries who has ever wondered whether their dashboards are generating knowledge or just more noise.</p></li></ul>]]></content:encoded></item><item><title><![CDATA[The Slow Collapse]]></title><description><![CDATA[Why AI Systems Fail Silently in Production]]></description><link>https://newsletter.wangari.global/p/the-slow-collapse</link><guid isPermaLink="false">https://newsletter.wangari.global/p/the-slow-collapse</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Fri, 05 Jun 2026 06:01:16 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/197553700/342a7f88a0ba079a5261a95745003e04.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>There is a specific type of anxiety that comes with deploying an autonomous AI agent into a production enterprise environment. It is not the fear that the system will crash immediately upon launch. The true fear is silent degradation. </p><p>In this episode, Ari Joury (PhD, particle physics; Founder &amp; CEO of Wangari Global) breaks down the anatomy of &#8220;The Slow Collapse&#8221; &#8212; the phenomenon where an AI system continues to run, but the quality of its outputs imperceptibly declines over time. Ari explains the three primary drivers of silent degradation (Data Drift, Model Drift, and Context Window Saturation) and outlines the observability and governance frameworks required to catch them before they cause catastrophic business impact. And yes, he will explain why your AI agent is basically a tired intern who forgot what they were doing on page 35.</p><p><strong>Topics covered:</strong> AI observability, silent degradation, data drift, model drift, context window saturation, LLM monitoring, AI governance, enterprise AI deployment, automated evaluation pipelines.</p><p><em>Wangari is the newsletter and podcast for practitioners and leaders navigating the real work of enterprise AI. 
New episodes every Thursday.</em></p><p><a href="https://wangari.global/contact">https://wangari.global/contact</a></p><p>Upcoming Course: <a href="https://academy.genai.works/courses/from-demo-to-production/details?utm_campaign=academy_launch&amp;utm_source=instructor&amp;utm_medium=ari_joury&amp;utm_content=from_demo_to_production">From Demo to Production</a></p>]]></content:encoded></item><item><title><![CDATA[The Silent Degradation of AI Systems]]></title><description><![CDATA[Why your production AI agent will fail slowly before it fails catastrophically, and how to build the observability required to catch it.]]></description><link>https://newsletter.wangari.global/p/the-silent-degradation-of-ai-systems</link><guid isPermaLink="false">https://newsletter.wangari.global/p/the-silent-degradation-of-ai-systems</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Tue, 02 Jun 2026 06:00:45 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kf9Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea6aaddb-71c2-4a22-89e5-7bdea5b2c120_1344x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kf9Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea6aaddb-71c2-4a22-89e5-7bdea5b2c120_1344x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset image2-full-screen"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kf9Y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea6aaddb-71c2-4a22-89e5-7bdea5b2c120_1344x768.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!kf9Y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea6aaddb-71c2-4a22-89e5-7bdea5b2c120_1344x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!kf9Y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea6aaddb-71c2-4a22-89e5-7bdea5b2c120_1344x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!kf9Y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea6aaddb-71c2-4a22-89e5-7bdea5b2c120_1344x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kf9Y!,w_5760,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea6aaddb-71c2-4a22-89e5-7bdea5b2c120_1344x768.jpeg" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ea6aaddb-71c2-4a22-89e5-7bdea5b2c120_1344x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;full&quot;,&quot;height&quot;:768,&quot;width&quot;:1344,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:348593,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.wangari.global/i/197333363?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea6aaddb-71c2-4a22-89e5-7bdea5b2c120_1344x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-fullscreen" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!kf9Y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea6aaddb-71c2-4a22-89e5-7bdea5b2c120_1344x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!kf9Y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea6aaddb-71c2-4a22-89e5-7bdea5b2c120_1344x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!kf9Y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea6aaddb-71c2-4a22-89e5-7bdea5b2c120_1344x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!kf9Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea6aaddb-71c2-4a22-89e5-7bdea5b2c120_1344x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Architectural decay is a silent and progressive disease &#8212; and for AI systems, it&#8217;s fatal without skilled intervention. Image generated with Leonardo AI</figcaption></figure></div><p>There is a specific type of anxiety that comes with deploying an autonomous AI agent into a production enterprise environment. It is not the fear that the system will crash immediately upon launch. That kind of failure is loud, obvious, and relatively easy to fix.</p><p>The true fear is silent degradation.</p><p>It is the fear that the system will continue to run, continue to generate reports, and continue to make decisions, but that the quality of those outputs will slowly, imperceptibly decline over time. By the time the degradation becomes obvious to a human reviewer, the system may have already processed thousands of transactions or generated dozens of flawed regulatory filings.</p><p>In the world of traditional software engineering, code does not rot. A function written today will execute exactly the same way five years from now, provided the underlying environment remains stable.</p><p>That&#8217;s not to say that software packages and operating systems don&#8217;t evolve, so some maintenance is needed. But that evolution can at least be tracked, and many other teams will be going through the exact same changes at the same time.</p><p>AI systems are fundamentally different. They are probabilistic, and they are highly sensitive to their environment. 
They do not just break; they drift.</p><h2>The Anatomy of Silent Degradation</h2><p>Silent degradation in an AI system typically stems from one of three sources:</p><ol><li><p><strong>Data Drift:</strong> The distribution of the input data changes over time. If an AI agent was designed to process insurance claims based on historical data from 2023, it may struggle to accurately process claims in 2026 if the underlying nature of those claims has shifted due to new regulations, economic conditions, or changing customer behavior. The model is still functioning as designed, but the world has moved on.</p></li><li><p><strong>Model Drift:</strong> The underlying foundation model is updated by the provider. While API providers strive for backward compatibility, even minor updates to a model&#8217;s weights or safety filters can subtly alter its behavior. A prompt that consistently yielded a perfectly formatted JSON object yesterday might suddenly start including conversational filler today, breaking the downstream orchestration pipeline.</p></li><li><p><strong>Context Window Saturation:</strong> As an agentic system operates, it often accumulates state or context. If the system is not designed to elegantly manage this context&#8212;summarizing, pruning, or archiving older information&#8212;the context window can become saturated with irrelevant noise. The model&#8217;s attention mechanism becomes diluted, leading to hallucinations or degraded reasoning capabilities.</p></li></ol><h2>The Observability Imperative</h2><p>The only defense against silent degradation is rigorous, continuous observability.</p><p>In traditional software, observability means monitoring CPU usage, memory consumption, and error rates. In AI systems, these metrics are necessary but entirely insufficient. 
You can have a system with 99.9% uptime and sub-second latency that is confidently generating complete nonsense.</p><p>AI observability requires monitoring the <em>quality</em> of the output, not just the health of the infrastructure.</p><p>This means implementing automated, continuous evaluation pipelines. It requires defining specific, measurable characteristics of a &#8220;good&#8221; output and running statistical checks against every inference. Are the generated reports adhering to the required structural format? Is the sentiment of the output remaining consistent? Are the specific entities extracted from the input data matching expected patterns?</p><p>Crucially, it requires establishing baseline metrics&#8212;a &#8220;golden dataset&#8221;&#8212;and continuously comparing production outputs against that baseline to detect subtle shifts in distribution. As <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Shreya Shankar&quot;,&quot;id&quot;:58144420,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bacf4319-d2ab-4665-b179-d0fc5b11c708_1176x1176.jpeg&quot;,&quot;uuid&quot;:&quot;90fa114f-2c39-4a83-b684-8debb2376e29&quot;}" data-component-name="MentionToDOM"></span> <a href="https://www.latent.space/p/shreya-shankar">points out in an interview</a> with <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Latent.Space&quot;,&quot;id&quot;:89230629,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db0f8d45-1eb8-4c02-a120-650d377ee52d_640x640.jpeg&quot;,&quot;uuid&quot;:&quot;b55050c9-fcaa-4991-8d9e-184ddd1eab70&quot;}" data-component-name="MentionToDOM"></span> ,  because it relies on static schema checks rather than dynamic, partition-based summarization.</p><h2>The Illusion of Interpretability</h2><p>When degradation is detected, the immediate instinct is to 
ask <em>why</em> the model failed. This leads many teams down the rabbit hole of post-hoc interpretability methods, such as SHAP values or feature attribution techniques.</p><p>However, these methods often provide a false sense of security. As researchers have noted, explaining models after training often <a href="https://open.substack.com/pub/hisku/p/the-interpretability-illusion-why">fails to capture</a> the true causal mechanisms driving the model&#8217;s behavior. These post-hoc explanations are essentially models of models&#8212;approximations that can be just as flawed or biased as the original system.</p><p>Instead of relying on illusory interpretability, enterprise AI systems must be built with intrinsic transparency. This means designing orchestration layers where the logical steps are explicit and auditable, rather than relying on a single massive neural network to perform complex reasoning in a black box.</p><h2>The Cost of Ignoring Drift</h2><p>The financial and reputational costs of ignoring silent degradation can be staggering. In the financial sector, a trading algorithm that slowly drifts out of alignment with market realities can wipe out millions of dollars before the error is caught. In healthcare, a diagnostic model that degrades over time can lead to misdiagnoses and compromised patient care.</p><p>The insidious nature of silent degradation is that it often goes unnoticed by the very people who rely on the system the most. Users become accustomed to the system&#8217;s quirks and begin to subconsciously compensate for its declining performance. They might start double-checking the AI&#8217;s work more frequently, or manually correcting minor errors, effectively masking the degradation from the engineering team.</p><p>This is why observability cannot rely on user reporting. 
It must be automated, objective, and continuous.</p><h2>From Monitoring to Governance</h2><p>Observability is the mechanism for detecting degradation, but governance is the framework for responding to it.</p><p>When an automated evaluation pipeline detects that an agent&#8217;s output quality has drifted below an acceptable threshold, what happens next? Does the system automatically halt? Does it route the task to a human operator? Does it trigger an alert to the engineering team?</p><p>A robust governance framework defines these escalation paths. It establishes clear ownership for the ongoing health of the system. It ensures that there is a &#8220;human in the loop&#8221; not just for individual decisions, but for the systemic oversight of the AI agent itself.</p><p>At Wangari, we believe that deploying an AI system without this level of observability and governance is professional malpractice, particularly in regulated industries. The stakes are simply too high.</p><p>The transition from a successful demo to a reliable production system is not just about writing better code. It is about building the operational infrastructure required to manage probabilistic systems in a deterministic world. It is about acknowledging that AI systems are not static artifacts, but dynamic entities that require continuous care and feeding.</p><div><hr></div><h1>Meanwhile, at Wangari</h1><p>The challenge of silent degradation is exactly why I designed my upcoming course to focus heavily on evaluation and operations.</p><p><em>From Demo to Production: Operationalize an Enterprise-Grade Agentic AI Reporting System</em> launches next week, on June 9th.</p><p>In Week 3 of the course, we dive deep into &#8220;Decision-Grade Evaluation,&#8221; moving beyond simple accuracy metrics to build comprehensive evaluation scorecards. 
In Week 5, we cover &#8220;Operational Excellence,&#8221; focusing on deployment strategies, monitoring dashboards, and governance frameworks.</p><p>If you are responsible for ensuring that your organization&#8217;s AI systems remain reliable long after the initial deployment, this course will give you the practical blueprints you need.</p><p><a href="https://academy.genai.works/courses/from-demo-to-production/details?utm_campaign=academy_launch&amp;utm_source=instructor&amp;utm_medium=ari_joury&amp;utm_content=from_demo_to_production">Enrollment closes soon. Secure your spot at GenAI Academy.</a></p><div><hr></div><h1>Reads of the Week</h1><ul><li><p><a href="https://www.latent.space/p/shreya-shankar">Grounded Research: From Google Brain to MLOps to LLMOps</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Alessio Fanelli&quot;,&quot;id&quot;:3381444,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ef686287-e8cb-4397-b1a3-ee45774394d6_1252x1154.jpeg&quot;,&quot;uuid&quot;:&quot;2179bac5-f4cf-4d25-9501-ff33121db9b4&quot;}" data-component-name="MentionToDOM"></span> and <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Latent.Space&quot;,&quot;id&quot;:89230629,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db0f8d45-1eb8-4c02-a120-650d377ee52d_640x640.jpeg&quot;,&quot;uuid&quot;:&quot;a16a10f1-f457-4f62-9f67-98b03995ceb1&quot;}" data-component-name="MentionToDOM"></span> featuring <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Shreya Shankar&quot;,&quot;id&quot;:58144420,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bacf4319-d2ab-4665-b179-d0fc5b11c708_1176x1176.jpeg&quot;,&quot;uuid&quot;:&quot;188e9d71-a1e0-4ed1-ac16-ea60f5e58dd5&quot;}" 
data-component-name="MentionToDOM"></span>: A deep dive into the principles of production-grade machine learning and the critical importance of robust data validation. Shankar argues that traditional MLOps practices are insufficient for LLMs, and that we need new paradigms for evaluating and monitoring generative systems in production.</p></li><li><p><a href="https://open.substack.com/pub/hisku/p/the-interpretability-illusion-why">The Interpretability Illusion: Why Explaining Models After Training Fails</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Hisku Dingeto&quot;,&quot;id&quot;:314298160,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ed3c8ee-3bd7-4921-a0ac-2061241d77dc_400x400.jpeg&quot;,&quot;uuid&quot;:&quot;cd73d389-6174-4c20-b936-a2f440a3c79c&quot;}" data-component-name="MentionToDOM"></span>: A compelling argument against relying on post-hoc interpretability methods and the need for intrinsically transparent models. 
The author demonstrates how techniques like SHAP can provide misleading explanations, emphasizing the need for causal reasoning built directly into the architecture.</p></li><li><p><a href="https://theneuralmaze.substack.com/p/hidden-technical-debt-in-agentic">Hidden Technical Debt in Agentic Systems</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Miguel Otero Pedrido&quot;,&quot;id&quot;:89972117,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!LZBx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b58b1f5-4d25-4dcf-9f48-b67a6e6e1316_1200x1200.jpeg&quot;,&quot;uuid&quot;:&quot;66106d07-50de-46b8-8e18-d5c09c3cf785&quot;}" data-component-name="MentionToDOM"></span>: An essential read on why the infrastructure surrounding an AI model is where the true engineering risk lies. Pedrido breaks down the hidden costs of orchestration, state management, and error handling that are often ignored during the pilot phase.</p></li></ul>]]></content:encoded></item><item><title><![CDATA[The Beautiful Game of Data]]></title><description><![CDATA[What Soccer Teaches Us About Machine Learning]]></description><link>https://newsletter.wangari.global/p/the-beautiful-game-of-data</link><guid isPermaLink="false">https://newsletter.wangari.global/p/the-beautiful-game-of-data</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Fri, 29 May 2026 06:01:23 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/197548732/f27274e3fdcb023a3127bd411101cf81.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>If you want to understand the complexities of machine learning, you could study linear algebra and probability theory. Or, you could watch a soccer match. 
</p><p>In this episode, Ari Joury (PhD, particle physics; Founder &amp; CEO of Wangari Global) takes a break from enterprise AI to preview his upcoming O&#8217;Reilly book, <em>Soccer Analytics with Machine Learning</em>. He explains why the fluid, chaotic nature of soccer is the perfect laboratory for understanding predictive modeling, feature engineering, and the limits of correlational data. From the evolution of Expected Goals (xG) to the bias-variance tradeoff on the pitch, Ari shows how the lessons learned from analyzing sports data translate directly to building robust AI systems for the enterprise. And yes, he will explain why your AI model is basically a confused midfielder passing the ball backward.</p><p><strong>Topics covered:</strong> Soccer analytics, Expected Goals (xG), feature engineering, bias-variance tradeoff, causal inference, predictive modeling, O&#8217;Reilly Media, Python data science, enterprise AI context.</p><p><em>Wangari is the newsletter and podcast for practitioners and leaders navigating the real work of enterprise AI. 
New episodes every Friday.</em></p><p><a href="https://wangari.global/contact">https://wangari.global/contact</a></p><p>Ari&#8217;s book (O&#8217;Reilly Media, early release out now and officially out in June): <a href="https://learning.oreilly.com/library/view/soccer-analytics-with/9781098181109/">Soccer Analytics with Machine Learning</a></p>]]></content:encoded></item><item><title><![CDATA[What Soccer Taught Me About Machine Learning]]></title><description><![CDATA[Why the world's most popular game is the perfect laboratory for understanding predictive modeling and data science.]]></description><link>https://newsletter.wangari.global/p/what-soccer-taught-me-about-machine</link><guid isPermaLink="false">https://newsletter.wangari.global/p/what-soccer-taught-me-about-machine</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Tue, 26 May 2026 06:01:06 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!SDM9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75ee8342-4d15-4126-a554-4f8d0f154194_2688x1536.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SDM9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75ee8342-4d15-4126-a554-4f8d0f154194_2688x1536.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset image2-full-screen"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SDM9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75ee8342-4d15-4126-a554-4f8d0f154194_2688x1536.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!SDM9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75ee8342-4d15-4126-a554-4f8d0f154194_2688x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!SDM9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75ee8342-4d15-4126-a554-4f8d0f154194_2688x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!SDM9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75ee8342-4d15-4126-a554-4f8d0f154194_2688x1536.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SDM9!,w_5760,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75ee8342-4d15-4126-a554-4f8d0f154194_2688x1536.jpeg" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/75ee8342-4d15-4126-a554-4f8d0f154194_2688x1536.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;full&quot;,&quot;height&quot;:832,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5792268,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.wangari.global/i/197331855?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75ee8342-4d15-4126-a554-4f8d0f154194_2688x1536.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-fullscreen" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!SDM9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75ee8342-4d15-4126-a554-4f8d0f154194_2688x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!SDM9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75ee8342-4d15-4126-a554-4f8d0f154194_2688x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!SDM9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75ee8342-4d15-4126-a554-4f8d0f154194_2688x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!SDM9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75ee8342-4d15-4126-a554-4f8d0f154194_2688x1536.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Soccer and machine learning have more in common than you might think. Image generated with Leonardo AI</figcaption></figure></div><p>If you want to understand the complexities of machine learning, you could start by studying linear algebra, calculus, and probability theory. You could immerse yourself in academic papers on gradient descent and backpropagation.</p><p>Or, you could watch a soccer match.</p><p>At first glance, the chaotic, fluid nature of soccer seems entirely disconnected from the structured world of data science. But beneath the surface of every pass, tackle, and shot on goal lies a rich tapestry of data waiting to be analyzed. In fact, the challenges inherent in modeling a soccer match perfectly mirror the challenges of building robust machine learning systems for the enterprise.</p><p>This realization is what drove me to co-author my upcoming book, <a href="https://learning.oreilly.com/library/view/soccer-analytics-with/9781098181109/">Soccer Analytics with Python</a> (O&#8217;Reilly Media). The goal was not just to write a book for sports fans, but to use the universal language of soccer to demystify machine learning concepts that often feel abstract and inaccessible.</p><p>Frankly, it was also just fun to write as a way to combine my soccer-playing teenage years with my code-crunching twenties. But hey, here we are, old, wise and having fun with soccer analytics.</p><h2>The Beautiful Game as a Data Problem</h2><p>Consider the fundamental problem of predicting a match outcome. It is not a simple deterministic equation. 
It is a highly probabilistic scenario influenced by dozens of interacting variables: player form, tactical formations, weather conditions, historical matchups, and the sheer unpredictability of human behavior.</p><p>When we attempt to model this, we encounter the exact same issues that plague enterprise data scientists.</p><p>We must deal with feature engineering&#8212;deciding which variables actually matter. Is a team&#8217;s recent possession percentage more predictive than its historical expected goals (xG)? We must grapple with the bias-variance tradeoff, ensuring our model is complex enough to capture the nuances of the game without overfitting to the noise of a single anomalous match.</p><p>We must also confront the limitations of purely correlative models. A model might find a strong correlation between a specific player wearing yellow boots and their team winning. But without causal reasoning, the model cannot distinguish between a meaningless coincidence and a true driver of performance. As Alex Marin Felices points out, while machine learning models can identify correlations between performance indicators and success, they often <a href="https://thexgfootballclub.substack.com/p/the-promise-and-limits-of-machine">struggle to represent</a> the interactive and dynamic nature of attacking play.</p><h2>From the Pitch to the Boardroom</h2><p>The lessons learned from analyzing soccer data translate directly to the corporate boardroom and to how executives there evaluate nascent machine learning and AI projects in their enterprises.</p><p>In the book, we explore techniques like logistic regression, random forests, and deep learning, applying them to real-world soccer datasets. We build models to predict match outcomes, evaluate player performance, and even test betting strategies.</p><p>But the underlying principles are universal. 
The same random forest algorithm used to predict whether a striker will score from a specific location on the pitch can be used by an insurance company to predict the likelihood of a claim. The same simulation techniques used to model different tactical scenarios can be used by a financial institution to stress-test its portfolio against market shocks.</p><p>By grounding these concepts in a domain that is intuitive and engaging, we can strip away the intimidating jargon and focus on the core mechanics of how machine learning actually works.</p><h2>The Importance of Context</h2><p>Perhaps the most important lesson soccer teaches us about data science is how much context matters.</p><p>In soccer, a raw statistic like &#8220;total passes completed&#8221; is almost meaningless without context. Were those passes progressive, breaking through the opponent&#8217;s defensive lines, or were they safe, lateral passes between defenders?</p><p>Similarly, in enterprise AI, data without context is a liability. A model trained on historical financial data might identify a pattern, but without understanding the underlying economic context&#8212;the regulatory changes, the market dynamics, the causal relationships&#8212;that pattern is likely to be misleading.</p><p>This is why at Wangari, we emphasize causal AI. We believe that true intelligence requires understanding the <em>why</em> behind the data, not just the <em>what</em>. Whether you are analyzing a soccer match or automating complex regulatory reporting, context is the difference between a model that merely describes the past and a system that can reliably navigate the future.</p><h2>The Future of Sports Analytics</h2><p>The integration of AI into sports is accelerating rapidly. 
For the 2026 World Cup, FIFA plans to create <a href="https://planetsoccer.substack.com/p/opinion-inside-the-world-cups-ai">AI-enabled 3D avatars</a> of every player to ensure precise player identification and tracking for semi-automated offside decisions. This level of data capture represents a massive leap forward, but it also highlights the tension between technological precision and human judgment.</p><p>As decisions become more exact, they can also feel more arbitrary to spectators. If an attacker is ruled offside because a 3D scan shows a shoulder fractionally further forward than previously assumed, it raises questions about fairness and the role of technology in the game. This mirrors the challenges we face in enterprise AI, where highly accurate models can sometimes produce decisions that are difficult for humans to interpret or trust.</p><h2>The Evolution of Expected Goals</h2><p>One of the most fascinating developments in soccer analytics is the evolution of the &#8220;Expected Goals&#8221; (xG) metric. Early xG models were relatively simple, relying primarily on the distance and angle of the shot relative to the goal.</p><p>Today, state-of-the-art xG models are vastly more sophisticated. They incorporate the positions of all defenders and the goalkeeper, the velocity of the pass preceding the shot, and even the specific body part used to strike the ball. This evolution perfectly illustrates the concept of feature engineering&#8212;the continuous process of refining the inputs to a model to capture more of the underlying reality.</p><p>In the enterprise, we see a similar evolution. Early predictive models about, say, customer conversions relied on simple demographic data. Today, advanced models incorporate behavioral data, network graphs, and unstructured text analysis. 
The goal is always the same: to move from a crude approximation of reality to a high-fidelity representation.</p><h2>Bridging the Gap</h2><p>Writing <em>Soccer Analytics with Python</em> has been a fascinating exercise in translation. It has reinforced my belief that the most complex technical concepts can be made accessible when framed through the right lens.</p><p>The book is designed for anyone who wants to develop a solid foundation in machine learning, whether you are a student, an analyst, or simply a fan of the game. It bridges the gap between academic principles and practical applications, proving that you don&#8217;t need a PhD in particle physics to understand how to build predictive models.</p><p>The beautiful game is more than just a sport. It is a masterclass in probability, strategy, and the power of data.</p><div><hr></div><h1>Meanwhile, at Wangari</h1><p>While I have been busy writing about soccer analytics, the core focus at Wangari remains on solving the hardest data challenges in the enterprise.</p><p>If you are a technical leader looking to bridge the gap between AI prototypes and production systems, my upcoming course is designed for you.</p><p><em>From Demo to Production: Operationalize an Enterprise-Grade Agentic AI Reporting System</em> launches on June 9th. Over 6 weeks, we will cover the orchestration, evaluation, and governance frameworks necessary to build reliable AI systems.</p><p><a href="https://academy.genai.works/courses/from-demo-to-production/details?utm_campaign=academy_launch&amp;utm_source=instructor&amp;utm_medium=ari_joury&amp;utm_content=from_demo_to_production">Enrollment is open now at GenAI Academy.</a></p><p>And if you are interested in exploring machine learning through the lens of soccer, my new book, <a href="https://learning.oreilly.com/library/view/soccer-analytics-with/9781098181109/">Soccer Analytics with Python</a>, will be published by O&#8217;Reilly Media in late June, just in time for the World Cup. 
</p><div><hr></div><h1>Reads of the Week</h1><ul><li><p><a href="https://thexgfootballclub.substack.com/p/the-promise-and-limits-of-machine">The Promise and Limits of Machine Learning in Football Attacking Analysis</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Alex Marin Felices&quot;,&quot;id&quot;:71196796,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/211ae01f-3892-435e-b222-8a400f444a47_768x1024.jpeg&quot;,&quot;uuid&quot;:&quot;8af3399b-5329-41ae-9b19-af350d155a4e&quot;}" data-component-name="MentionToDOM"></span>: A critical review of how machine learning is applied to analyze attacking performance and the challenges of representing dynamic play. Felices rightly points out that while models excel at finding correlations, they often fail to capture the complex, multi-agent interactions that define a successful attack.</p></li><li><p><a href="https://planetsoccer.substack.com/p/opinion-inside-the-world-cups-ai">Opinion: Inside the World Cup&#8217;s AI offside revolution</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Clemente Lisi&quot;,&quot;id&quot;:117188162,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/5f38e4ac-06ad-43f8-8123-19820427b570_1000x1500.jpeg&quot;,&quot;uuid&quot;:&quot;04988e79-4c6c-4a40-8b9e-5545dd0c2fd3&quot;}" data-component-name="MentionToDOM"></span>: An insightful look at FIFA&#8217;s plan to use AI-enabled 3D avatars for the 2026 World Cup and the implications for the game. 
Lisi explores the tension between technological precision and the human element of refereeing, a debate that mirrors discussions about AI governance in the enterprise.</p></li><li><p><a href="https://theneuralmaze.substack.com/p/hidden-technical-debt-in-agentic">Hidden Technical Debt in Agentic Systems</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Miguel Otero Pedrido&quot;,&quot;id&quot;:89972117,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!LZBx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b58b1f5-4d25-4dcf-9f48-b67a6e6e1316_1200x1200.jpeg&quot;,&quot;uuid&quot;:&quot;99e67f0d-ddce-43a9-a7e5-860697cf812d&quot;}" data-component-name="MentionToDOM"></span>: A reminder that whether in sports analytics or enterprise AI, the model is just a small part of the overall system complexity. Pedrido&#8217;s analysis of the infrastructure required to support autonomous agents is a must-read for anyone moving beyond simple API calls.</p></li></ul>]]></content:encoded></item><item><title><![CDATA[The Unglamorous Work That Makes AI Actually Work]]></title><description><![CDATA[Orchestration, Evaluation, and Governance in Enterprise AI]]></description><link>https://newsletter.wangari.global/p/the-unglamorous-work-that-makes-ai</link><guid isPermaLink="false">https://newsletter.wangari.global/p/the-unglamorous-work-that-makes-ai</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Fri, 22 May 2026 06:00:40 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/196916085/2d6dd34f514748fa69bad5ca2edf1b0a.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>The AI industry has a glamour problem. Conference talks are about parameter counts and benchmark scores. Venture capital pitches are about artificial general intelligence. 
But when you sit down with a Chief Actuary or a Head of Compliance, the conversation is about something completely different: auditability, data privacy, and whether the system will hallucinate a regulatory filing. </p><p>In this episode, Ari Joury (PhD, particle physics; Founder &amp; CEO of Wangari Global) makes the case that the most valuable skill in enterprise AI today is not the ability to train a model &#8212; it is the ability to operationalize one. He goes deep on orchestration patterns, decision-grade evaluation, and governance architecture, drawing on research from MIT Sloan, Google Brain, and his own experience building causal AI systems for the insurance industry. If you want to understand what actually separates a fragile prototype from a production-grade AI system, this episode is for you.</p><p><strong>Topics covered:</strong> AI orchestration, LLM evaluation, golden datasets, AI governance, enterprise AI deployment, agentic workflows, causal AI, Solvency II, IFRS 17, AI systems engineering, production AI, MLOps.</p><p><em>Wangari is the newsletter and podcast for practitioners and leaders navigating the real work of enterprise AI. New episodes every Friday.</em></p><p>Check out my new course <a href="https://academy.genai.works/courses/from-demo-to-production/details?utm_campaign=academy_launch&amp;utm_source=instructor&amp;utm_medium=ari_joury&amp;utm_content=from_demo_to_production">From Demo to Production</a> at the GenAI Academy</p><p>For further inquiries &amp; demos, here you go: <a href="https://wangari.global/contact">https://wangari.global/contact</a></p>]]></content:encoded></item><item><title><![CDATA[Why I’m Teaching the Boring Parts of AI]]></title><description><![CDATA[The real innovation in enterprise AI isn&#8217;t happening in the models. 
It&#8217;s happening in the orchestration, evaluation, and governance layers.]]></description><link>https://newsletter.wangari.global/p/why-im-teaching-the-boring-parts</link><guid isPermaLink="false">https://newsletter.wangari.global/p/why-im-teaching-the-boring-parts</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Tue, 19 May 2026 06:01:51 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!NdmO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a4e6797-77de-4ad0-8192-84073fc59205_1344x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NdmO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a4e6797-77de-4ad0-8192-84073fc59205_1344x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset image2-full-screen"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NdmO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a4e6797-77de-4ad0-8192-84073fc59205_1344x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!NdmO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a4e6797-77de-4ad0-8192-84073fc59205_1344x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!NdmO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a4e6797-77de-4ad0-8192-84073fc59205_1344x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!NdmO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a4e6797-77de-4ad0-8192-84073fc59205_1344x768.jpeg 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NdmO!,w_5760,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a4e6797-77de-4ad0-8192-84073fc59205_1344x768.jpeg" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7a4e6797-77de-4ad0-8192-84073fc59205_1344x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;full&quot;,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:440931,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.wangari.global/i/197328670?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a4e6797-77de-4ad0-8192-84073fc59205_1344x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-fullscreen" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NdmO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a4e6797-77de-4ad0-8192-84073fc59205_1344x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!NdmO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a4e6797-77de-4ad0-8192-84073fc59205_1344x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!NdmO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a4e6797-77de-4ad0-8192-84073fc59205_1344x768.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!NdmO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a4e6797-77de-4ad0-8192-84073fc59205_1344x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a><figcaption class="image-caption">AI, these days, is less about science and more about organizational systems (though I enjoy the science-heavy part, too). Image generated with Leonardo AI</figcaption></figure></div><p>When I transitioned from theoretical particle physics to enterprise AI, I expected the challenges to be primarily mathematical. I anticipated spending my days fine-tuning neural network architectures, optimizing hyperparameters, and debating the merits of different attention mechanisms.</p><p>I was wrong.</p><p>The mathematics of modern AI are undeniably fascinating, but they are increasingly commoditized. The foundation models available via API today are more capable than anything a typical enterprise could build from scratch. The real challenge&#8212;the problem that actually prevents organizations from realizing value from AI&#8212;is not the model itself. It is everything that surrounds the model.</p><p>It is the orchestration. It is the evaluation. It is the governance.</p><p>In short, it is the &#8220;boring&#8221; parts of AI. And these boring parts are exactly what I have decided to focus on teaching.</p><h2>The Glamour vs. The Reality</h2><p>The AI industry has a glamour problem. The discourse is dominated by discussions of parameter counts, benchmark scores, and the existential implications of artificial general intelligence. This is the exciting, visionary side of the field.</p><p>But when you sit down with a Chief Actuary at a major insurance firm, or a Head of Compliance at a global bank, the conversation shifts dramatically. They do not care about the latest benchmark on a generic reasoning task. They care about auditability. 
They care about data privacy. They care about whether a system will hallucinate a regulatory filing that could result in a multimillion-dollar fine.</p><p>They are grappling with the reality of deploying probabilistic systems into deterministic business environments.</p><p>This is where the glamour fades and the hard engineering begins. Building a robust AI system requires solving problems that are decidedly unsexy but absolutely critical. As the IFoA GenAI Working Party points out, the complexity and autonomy of a network of AI agents <a href="https://ifoagenai.substack.com/p/emerging-risks-of-agentic-ai-in-actuarial">add a new dimension to risks</a>, making them harder to manage and requiring dynamic governance frameworks.</p><h2>Orchestration: The Unsung Hero</h2><p>Consider orchestration. A single prompt to an LLM is rarely sufficient to complete a complex enterprise task. Real-world workflows require multiple steps: retrieving data from disparate sources, validating that data, processing it through various models, handling errors, and formatting the final output.</p><p>Designing these multi-step agentic workflows is an exercise in systems engineering. It requires choosing the right orchestration pattern&#8212;whether a sequential chain, a parallel fan-out, or a complex graph-based approach. It demands robust error handling, retry logic, and state management.</p><p>When an orchestration layer is designed well, it is invisible. The system simply works. But getting to that point requires rigorous architectural thinking that goes far beyond writing a clever prompt. As Gary Marcus highlights, <a href="https://garymarcus.substack.com/p/breaking-autonomous-agents-are-a">autonomous agents are often vulnerable to subtle but dangerous tool-chaining attacks</a>, showing that orchestration is not just about functionality but also about security. 
If an agent is granted access to a database and an email client without strict guardrails, a simple prompt injection can turn a helpful assistant into a massive security breach.</p><h2>Evaluation: Beyond the Vibe Check</h2><p>Then there is evaluation. How do you know if an AI system is actually performing well?</p><p>In the early days of generative AI, evaluation often consisted of a &#8220;vibe check&#8221;&#8212;running a few queries and subjectively deciding if the answers looked reasonable. This is entirely inadequate for enterprise deployment.</p><p>Decision-grade evaluation requires a multi-dimensional framework. We must measure not just accuracy, but reliability, latency, and cost. We must build automated test suites that evaluate output characteristics against golden datasets. We must implement regression testing to ensure that an update to a prompt or a model does not silently degrade performance on edge cases.</p><p>Building a comprehensive evaluation scorecard is tedious work. It requires defining specific, measurable metrics and establishing acceptable thresholds for each. But without it, deploying an AI system is essentially flying blind. You cannot improve what you cannot measure, and in the context of enterprise AI, failing to measure performance accurately is a dereliction of duty.</p><h2>Governance: The Prerequisite for Trust</h2><p>Finally, there is governance. In regulated industries, an AI system must be auditable. If a system makes a recommendation or generates a report, the organization must be able to trace exactly how that output was produced. What data was used? What logical steps were taken?</p><p>This is where causal AI becomes essential. Unlike purely correlative models, causal models provide a transparent chain of reasoning. 
They allow us to understand not just <em>what</em> the system predicted, but <em>why</em>.</p><p>Implementing robust governance also means establishing clear ownership, defining escalation paths, and ensuring compliance with data privacy regulations. It is the bureaucratic scaffolding that makes trust possible. The IFoA GenAI Working Party emphasizes that static governance frameworks are <a href="https://ifoagenai.substack.com/p/emerging-risks-of-agentic-ai-in-actuarial">doomed to fail</a> if they do not recognize how the capabilities of these agents might change over time. Governance must be as dynamic and adaptable as the systems it seeks to control.</p><h2>The Shift from Development to Operations</h2><p>The transition from building a prototype to running a production system is a fundamental shift in mindset. It is the shift from development to operations.</p><p>In development, the goal is to prove that something is possible. In operations, the goal is to ensure that it happens reliably, every single time, regardless of the input or the environment. This requires a different set of skills, a different set of tools, and a different set of priorities.</p><p>It requires embracing the boring parts.</p><h2>Embracing the Boring</h2><p>I have come to realize that the most valuable skill in enterprise AI today is not the ability to train a model. It is the ability to operationalize one.</p><p>The teams that will win in this era are not necessarily the ones with the most advanced algorithms. They are the ones that master the boring parts. They are the ones that build resilient orchestration layers, rigorous evaluation frameworks, and transparent governance structures.</p><p>This is the work we do at Wangari. We focus on the infrastructure that makes AI reliable and auditable for complex regulatory reporting.</p><p>And it is exactly why I am so passionate about teaching these concepts. The industry needs fewer prompt engineers and more AI systems engineers. 
We need professionals who understand how to bridge the gap between a fragile prototype and a robust production system.</p><p>The boring parts of AI may not make headlines, but they are the foundation upon which the future of enterprise technology will be built.</p><div><hr></div><h1>Meanwhile, at Wangari</h1><p>If you are ready to master the &#8220;boring&#8221; (but essential) parts of enterprise AI, my upcoming course is designed for you.</p><p><strong>From Demo to Production: Operationalize an Enterprise-Grade Agentic AI Reporting System</strong> is a 6-week intensive program that focuses entirely on the operational realities of AI deployment.</p><p>We will not spend time debating model architectures. Instead, we will dive deep into orchestration patterns, decision-grade evaluation metrics, automated testing, and governance frameworks. By the end of the course, you will have developed a complete Production Blueprint for your own AI system.</p><p>The cohort begins on June 9th. If you want to move your AI initiatives out of the lab and into production, I invite you to join us.</p><p><a href="https://academy.genai.works/courses/from-demo-to-production/details?utm_campaign=academy_launch&amp;utm_source=instructor&amp;utm_medium=ari_joury&amp;utm_content=from_demo_to_production">Enrollment is open now at GenAI Academy.</a></p><div><hr></div><h1>Reads of the Week</h1><ul><li><p><a href="https://ifoagenai.substack.com/p/emerging-risks-of-agentic-ai-in-actuarial">Emerging Risks of Agentic AI in Actuarial Work</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Nnamdi Odozi&quot;,&quot;id&quot;:226650,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!udVU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32fefaf9-9208-4039-bc14-882979e5f26f_144x144.png&quot;,&quot;uuid&quot;:&quot;2e67db5d-9a11-4ce7-9c45-fb1510af1af3&quot;}" 
data-component-name="MentionToDOM"></span> and <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Josh Blake&quot;,&quot;id&quot;:408807412,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!JF6c!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43a25b18-60e7-46e0-b92e-98d9848c0b8f_144x144.png&quot;,&quot;uuid&quot;:&quot;86ee2da5-6b38-44dd-a81b-68972d6b4c1c&quot;}" data-component-name="MentionToDOM"></span>: An exploration of the ethical considerations and practical implications of deploying AI agents in highly regulated environments. The authors provide a sobering look at how the autonomy of these systems introduces entirely new categories of risk that traditional governance frameworks are ill-equipped to handle.</p></li><li><p><a href="https://garymarcus.substack.com/p/breaking-autonomous-agents-are-a">Breaking: Autonomous Agents are a Shitshow</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Gary Marcus&quot;,&quot;id&quot;:14807526,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Ka51!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F8fb2e48c-be2a-4db7-b68c-90300f00fd1e_1668x1456.jpeg&quot;,&quot;uuid&quot;:&quot;671b8eb9-e3a0-4131-b6d1-35395c336f52&quot;}" data-component-name="MentionToDOM"></span>: A critical look at the vulnerabilities of autonomous agents, particularly regarding tool-chaining attacks and security nightmares. 
Marcus argues that until we solve the fundamental security flaws inherent in granting LLMs access to external tools, deploying them in enterprise environments is dangerously irresponsible.</p></li><li><p><a href="https://benn.substack.com/p/can-analysis-ever-be-automated">Can analysis ever be automated?</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Benn Stancil&quot;,&quot;id&quot;:5667744,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/a317e60a-9bd1-4c75-bb54-66d517f735dc_1100x1100.jpeg&quot;,&quot;uuid&quot;:&quot;b5511f02-664d-46c2-bdc1-af2b4aac8cd8&quot;}" data-component-name="MentionToDOM"></span>: A thoughtful discussion on the challenges and catch-22s of AI analysts and the automation of data analysis. Stancil explores the paradox that while AI can generate code and run queries faster than humans, it still lacks the contextual understanding required to know <em>which</em> questions are actually worth asking.</p></li></ul>]]></content:encoded></item><item><title><![CDATA[The Hidden Architecture of AI Failure]]></title><description><![CDATA[Why Enterprise Pilots Die Before They Ship]]></description><link>https://newsletter.wangari.global/p/the-hidden-architecture-of-ai-failure</link><guid isPermaLink="false">https://newsletter.wangari.global/p/the-hidden-architecture-of-ai-failure</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Fri, 15 May 2026 06:02:58 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/197334314/5d2bfe599daa63ff0a40c0e0cd0f2430.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>Most enterprise AI projects never reach production. Not because the models are bad. Not because the teams are incompetent. But because of five structural debts that accumulate silently during the pilot phase &#8212; and detonate the moment you try to scale. 
</p><p>In this episode, Ari Joury (PhD, particle physics; Founder &amp; CEO of Wangari Global; author, O&#8217;Reilly Media) breaks down the complete taxonomy of AI production failure: Technical Debt, Operational Debt, Evaluation Debt, Integration Debt, and Governance Debt. Drawing on research from MIT, McKinsey, Gartner, and S&amp;P Global &#8212; as well as his own experience deploying causal AI systems in the insurance industry &#8212; Ari gives you the diagnostic framework to identify which debts your project is carrying, and the playbook to pay them down before they kill your launch.</p><p><strong>Topics covered:</strong> enterprise AI failure rates, LLM production readiness, AI technical debt, orchestration, evaluation scorecards, governance frameworks, agentic AI systems, insurance AI, regulatory compliance, demo-to-production gap.</p><p><em>Wangari is the newsletter and podcast for practitioners and leaders navigating the real work of enterprise AI. New podcast episodes every Friday.</em></p><p><a href="https://wangari.global/contact">https://wangari.global/contact</a></p><p>Upcoming Course: <a href="https://academy.genai.works/courses/from-demo-to-production/details?utm_campaign=academy_launch&amp;utm_source=instructor&amp;utm_medium=ari_joury&amp;utm_content=from_demo_to_production">From Demo to Production</a></p>]]></content:encoded></item><item><title><![CDATA[The Five Debts That Kill AI Pilots — and Why None of Them Are the Model]]></title><description><![CDATA[Why 90% of enterprise AI initiatives die in the chasm between demo and production.]]></description><link>https://newsletter.wangari.global/p/the-five-debts-that-kill-ai-pilots</link><guid isPermaLink="false">https://newsletter.wangari.global/p/the-five-debts-that-kill-ai-pilots</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Tue, 12 May 2026 06:02:20 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!Wcgu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86e9c885-15cf-428d-adbd-558c6a09c374_1344x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Wcgu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86e9c885-15cf-428d-adbd-558c6a09c374_1344x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset image2-full-screen"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Wcgu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86e9c885-15cf-428d-adbd-558c6a09c374_1344x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Wcgu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86e9c885-15cf-428d-adbd-558c6a09c374_1344x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Wcgu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86e9c885-15cf-428d-adbd-558c6a09c374_1344x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Wcgu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86e9c885-15cf-428d-adbd-558c6a09c374_1344x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Wcgu!,w_5760,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86e9c885-15cf-428d-adbd-558c6a09c374_1344x768.jpeg" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/86e9c885-15cf-428d-adbd-558c6a09c374_1344x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;full&quot;,&quot;height&quot;:768,&quot;width&quot;:1344,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:450810,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.wangari.global/i/196884516?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86e9c885-15cf-428d-adbd-558c6a09c374_1344x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-fullscreen" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Wcgu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86e9c885-15cf-428d-adbd-558c6a09c374_1344x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Wcgu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86e9c885-15cf-428d-adbd-558c6a09c374_1344x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Wcgu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86e9c885-15cf-428d-adbd-558c6a09c374_1344x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Wcgu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86e9c885-15cf-428d-adbd-558c6a09c374_1344x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Successful AI pilots are about as abundant as penguins in the desert, and there are good reasons for that. Image generated with Leonardo AI</figcaption></figure></div><p>It is a familiar story for innovation teams. A small, agile team builds an AI prototype over a weekend. It summarizes documents flawlessly. It answers complex queries. The executive sponsor is thrilled. The demo is a triumph.</p><p>Six months later, the project is quietly abandoned.</p><p>This is not an isolated incident. Across the enterprise landscape, we are witnessing a massive deployment gap. 
While adoption of generative AI tools by individual knowledge workers has skyrocketed, the percentage of organizations successfully deploying autonomous, agentic AI systems into core production workflows remains stubbornly low. According to recent analyses, a staggering majority of generative AI pilots fail to achieve measurable business impact or reach full production scale, with <a href="https://fortune.com/2025/08/18/mit-report-95-percent-generative-ai-pilots-at-companies-failing-cfo/">some estimates suggesting up to 95% of pilots fail</a>.</p><p>The problem is rarely the underlying foundation model. The models are increasingly capable, reasoning with nuance and processing vast amounts of context. The failure occurs because organizations mistake a successful demo for a viable system.</p><p>A demo proves that a model can perform a task in isolation. A production system must perform that task reliably, securely, and economically, thousands of times a day, integrated into existing workflows, and governed by strict compliance standards. The space between these two realities is what I call the Production Gap.</p><p>When an AI pilot fails to cross this gap, it is usually because the team has accumulated one or more of five specific types of &#8220;debt.&#8221; Understanding these debts is the first step to building AI systems that actually survive contact with the real world. As Miguel Otero Pedrido notes, <a href="https://theneuralmaze.substack.com/p/hidden-technical-debt-in-agentic">the model code is merely a rounding error</a> in the actual size of the system; the real complexity lies in the surrounding infrastructure.</p><h2>1. Technical Debt: The Brittle Foundation</h2><p>In the rush to build a compelling demo, teams often hardcode prompts, bypass error handling, and ignore edge cases. This is acceptable for a proof of concept, but fatal in production.</p><p>Technical debt in AI systems manifests as brittleness. 
When the API response format changes slightly, the system crashes. When a user inputs an unexpected query, the agent hallucinates wildly rather than failing gracefully. Production-grade AI requires robust orchestration. It needs retry logic with exponential backoff, fallback mechanisms to alternative models or data sources, and state management that can handle interruptions.</p><p>If your architecture diagram consists of a single arrow pointing from a user interface to an LLM API, you are carrying massive technical debt. The naive setup of running every request through a single expensive frontier model is <a href="https://theneuralmaze.substack.com/p/hidden-technical-debt-in-agentic">economically deranged in production</a>. A real production setup requires a router, a model fleet, and a fallback chain to ensure reliability and cost-efficiency.</p><p>Furthermore, the code that glues these components together is often written hastily. This &#8220;glue code&#8221; becomes a massive liability as the system scales, making it nearly impossible to debug when the system inevitably fails in unexpected ways.</p><h2>2. Operational Debt: The Orphaned System</h2><p>A traditional software application, once deployed, is relatively stable. An AI system is a living, breathing entity that degrades over time. Models drift. Underlying data distributions change. The prompts that worked perfectly in January may produce suboptimal results in June.</p><p>Operational debt occurs when an organization deploys an AI system without establishing clear ownership for its ongoing maintenance. Who monitors the system for silent degradation? Who updates the prompts when a new model version is released? 
Who handles the escalation when the agent encounters a scenario it cannot resolve?</p><p>Without a dedicated operations layer&#8212;including robust observability tools and a clear RACI (Responsible, Accountable, Consulted, Informed) matrix&#8212;an AI system will inevitably become an orphaned liability. Markdown configs, such as prompts and skill definitions, must be <a href="https://theneuralmaze.substack.com/p/hidden-technical-debt-in-agentic">treated as source code with proper version control</a> and peer review.</p><p>The lack of operational readiness is often the silent killer of AI projects. Teams celebrate the launch, but fail to allocate the resources required to keep the system healthy in the months and years that follow.</p><h2>3. Evaluation Debt: The Accuracy Illusion</h2><p>How do you know if your AI system is working? If the answer is &#8220;we spot-checked a few outputs and they looked good,&#8221; you are suffering from evaluation debt.</p><p>In traditional software, testing is deterministic: given input X, the system must produce output Y. AI systems are probabilistic. They require a fundamentally different approach to evaluation. Relying solely on average accuracy metrics is a trap. A system that is 95% accurate might still fail catastrophically on the 5% of edge cases that matter most to the business.</p><p>Decision-grade evaluation requires measuring multiple dimensions: reliability (consistency of output), latency, cost per inference, and ultimately, decision impact. It requires automated test suites that evaluate output characteristics against a &#8220;golden dataset&#8221; of expected behaviors, rather than demanding exact string matches. As Hamel Husain emphasizes, <a href="https://hamelhusain.substack.com/p/why-ai-evals-are-an-increasingly">systematically measuring your AI product is crucial</a> to escape &#8220;vibe-check hell.&#8221;</p><p>Building these evaluation frameworks is tedious, unglamorous work. 
But without them, you are flying blind, unable to distinguish between a minor model update and a catastrophic system failure.</p><h2>4. Integration Debt: The Workflow Mismatch</h2><p>The most brilliant AI agent is useless if it does not fit seamlessly into the way people actually work. Integration debt occurs when an AI system is built in a silo, disconnected from the enterprise&#8217;s core data systems and operational workflows.</p><p>This often looks like a standalone chatbot interface that requires users to manually copy and paste data from their CRM or ERP systems. True enterprise value comes from agentic workflows that can autonomously retrieve data, process it, and execute actions across multiple systems.</p><p>Overcoming integration debt requires treating the AI not as a destination, but as a routing and processing layer embedded within existing enterprise architecture. It requires deep collaboration between the AI engineers and the domain experts who actually understand the business processes being automated.</p><p>When AI systems are forced upon users without considering their existing workflows, adoption rates plummet, and the project ultimately fails to deliver a return on investment.</p><h2>5. Governance Debt: The Compliance Blindspot</h2><p>In highly regulated industries like insurance and financial services, governance is not an afterthought; it is a prerequisite for deployment. Governance debt accumulates when teams build AI systems without considering data privacy, auditability, and regulatory compliance from day one.</p><p>Can you explain exactly why the AI made a specific recommendation? Can you prove that no sensitive customer data was used to train a public model? If the system hallucinates a regulatory report, who is liable?</p><p>As frameworks like Solvency II and IFRS 17 demand increasing rigor in financial reporting, deploying &#8220;black box&#8221; AI systems is a non-starter. 
Production-grade AI requires causal reasoning and full audit trails, ensuring that every output can be traced back to its source data and logical steps.</p><p>Ignoring governance debt during the pilot phase is a guaranteed way to ensure the project is killed by the compliance team before it ever reaches production.</p><h2>Bridging the Gap</h2><p>The transition from demo to production is not merely a matter of writing better code. It requires a paradigm shift from experimenting with models to engineering robust, governed systems.</p><p>At Wangari, we focus on building agentic and causal AI infrastructure that addresses these five debts head-on. We believe that for AI to deliver on its promise in the enterprise, it must be built on a foundation of reliability, auditability, and deep integration.</p><p>The era of the impressive AI demo is over. The era of the resilient AI system has begun.</p><div><hr></div><h1>Meanwhile, at Wangari</h1><p>If you are a technical leader, product manager, or data scientist struggling to move your AI initiatives out of &#8220;pilot purgatory,&#8221; I am teaching a 6-week live cohort course designed specifically to solve this problem.</p><p><strong>From Demo to Production:</strong> Operationalize an Enterprise-Grade Agentic AI Reporting System launches on June 9th.</p><p>Over 10 live sessions, we will move beyond the hype and focus on the hard engineering and operational realities of enterprise AI. 
You will learn how to design robust orchestration layers, implement decision-grade evaluation metrics, and build systems that survive contact with the real world.</p><p><a href="https://academy.genai.works/courses/from-demo-to-production/details?utm_campaign=academy_launch&amp;utm_source=instructor&amp;utm_medium=ari_joury&amp;utm_content=from_demo_to_production">Enrollment is open now at GenAI Academy.</a></p><div><hr></div><h1>Reads of the Week</h1><ul><li><p><a href="https://theneuralmaze.substack.com/p/hidden-technical-debt-in-agentic">Hidden Technical Debt in Agentic Systems</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Miguel Otero Pedrido&quot;,&quot;id&quot;:89972117,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!LZBx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b58b1f5-4d25-4dcf-9f48-b67a6e6e1316_1200x1200.jpeg&quot;,&quot;uuid&quot;:&quot;08e7d60f-cd08-405e-ada4-9c07beecd9e3&quot;}" data-component-name="MentionToDOM"></span>: A deep dive into why the model code is just a small part of an agentic system, and how the real engineering risk lives in the infrastructure around it. 
This piece is essential reading for anyone who thinks deploying an LLM is just a matter of making an API call, as it exposes the massive hidden costs of orchestration and state management.</p></li><li><p><a href="https://hamelhusain.substack.com/p/why-ai-evals-are-an-increasingly">Why AI Evals Are An Increasingly Important Skill</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Hamel Husain&quot;,&quot;id&quot;:2260358,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7sqx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Feee58cd7-9a81-4ef6-b0f4-faeed62d5166_400x400.jpeg&quot;,&quot;uuid&quot;:&quot;fefd316c-70ff-404d-8b76-2d5262a73df8&quot;}" data-component-name="MentionToDOM"></span>: A practical guide on how to systematically measure AI products and escape the trap of relying on subjective &#8220;vibe checks.&#8221; Husain argues convincingly that without rigorous, automated evaluation frameworks, teams are essentially flying blind when they push updates to production.</p></li><li><p><a href="https://www.latent.space/p/shreya-shankar">Grounded Research: From Google Brain to MLOps to LLMOps</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Alessio Fanelli&quot;,&quot;id&quot;:3381444,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ef686287-e8cb-4397-b1a3-ee45774394d6_1252x1154.jpeg&quot;,&quot;uuid&quot;:&quot;137dfaff-6246-4340-b147-0bf3078772d6&quot;}" data-component-name="MentionToDOM"></span> and <span class="mention-wrap" 
data-attrs="{&quot;name&quot;:&quot;Latent.Space&quot;,&quot;id&quot;:89230629,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db0f8d45-1eb8-4c02-a120-650d377ee52d_640x640.jpeg&quot;,&quot;uuid&quot;:&quot;33acb364-a35f-41d1-b8d7-f87152bb2fe1&quot;}" data-component-name="MentionToDOM"></span> featuring <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Shreya Shankar&quot;,&quot;id&quot;:58144420,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bacf4319-d2ab-4665-b179-d0fc5b11c708_1176x1176.jpeg&quot;,&quot;uuid&quot;:&quot;f9ad2401-d32b-436d-9969-22b03f4ba75a&quot;}" data-component-name="MentionToDOM"></span> Shreya Shankar: An insightful discussion on the principles of production-grade machine learning and the importance of data validation. This conversation highlights how the lessons learned from traditional MLOps are both highly relevant and fundamentally insufficient for the new era of generative AI.</p></li></ul>]]></content:encoded></item><item><title><![CDATA[The Agentic Enterprise: When Your AI Won't Stop]]></title><description><![CDATA[Why the AI systems we deploy are optimizing for continuation, not completion.]]></description><link>https://newsletter.wangari.global/p/the-agentic-enterprise-when-your</link><guid isPermaLink="false">https://newsletter.wangari.global/p/the-agentic-enterprise-when-your</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Fri, 08 May 2026 06:01:53 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/196770561/6968bac1fddcab0082043f5b6d827bfd.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>A few weeks ago, I needed to set up a frontend deployment for a new project. 
I opened an AI agent, gave it the parameters, and within minutes it had recommended DigitalOcean, configured the deployment, and handed me the setup. I clicked confirm, entered my payment details, and moved on with my day.</p><p>It was a perfectly smooth experience. It was also, in retrospect, slightly unsettling.</p><p>If I had done that task myself a year ago, I would have spent an hour reading documentation and comparing providers. This time, I outsourced the judgment entirely. The agent made a reasonable call, but I didn&#8217;t actually know why it chose that specific provider. I just accepted the efficiency gain and paid the bill.</p><p>We talk a lot about the fear of &#8220;runaway AI&#8221;&#8212;the sci-fi scenario where autonomous systems hijack our businesses. But the reality of agentic AI is much quieter. The danger isn&#8217;t that the agent goes rogue. The danger is that the agent is always optimizing for something, and it&#8217;s not always what you think.</p><h2>The Illusion of Shared Intent</h2><p>When you delegate a task to a human analyst, you share a broad context. They know the implicit goal is to find a reliable, cost-effective solution that fits the company&#8217;s existing tech stack. They know when to stop researching and make a decision.</p><p>Agents do not share this context. They operate on objective functions. You give them a prompt, and they translate that prompt into a mathematical target to maximize. In the case of my DigitalOcean deployment, the agent was likely optimizing for the fastest path to a working configuration. It wasn&#8217;t optimizing for long-term cost efficiency or vendor lock-in risk, because I didn&#8217;t explicitly tell it to.</p><p>When the cost of making a decision drops to zero, we stop making decisions. We let the model choose based on its training data and hidden system prompts, not our strategic priorities. 
We get the efficiency, but we lose the steering wheel.</p><h2>The Agent That Wouldn&#8217;t Stop</h2><p>There is a second, more frustrating form of misalignment. I recently watched an agent try to pull a dataset from a public API where the endpoint had changed. A human would have stopped to ask for help. The agent did not. It retried the call, rephrased the headers, wrote a Python script for a different authentication method, and looped relentlessly, burning through API tokens.</p><p>Why? Because the underlying model was trained on a specific objective function: continue the conversation. Most commercial LLMs are fine-tuned to be helpful and conversational. They are penalized for giving up. When you wrap that conversational model in an agentic loop and give it an API key, that &#8220;helpful&#8221; persistence becomes a liability. It optimizes for continuation rather than completion.</p><h2>The Bottom Line</h2><p>We are entering a phase of technology where the primary skill is no longer execution, but delegation. The people and companies that thrive will not be the ones who write the best code. They will be the ones who know how to explicitly define their intent, and how to build the guardrails that keep their agents aligned with that intent.</p><p>The next time an agent does something perfectly for you, take a moment to ask yourself: what was it actually optimizing for? 
And are you sure it&#8217;s the same thing you wanted?</p>]]></content:encoded></item><item><title><![CDATA[Who is my AI agent really working for?]]></title><description><![CDATA[The subtle misalignment between what you want and what your AI is actually trying to do]]></description><link>https://newsletter.wangari.global/p/who-is-my-ai-agent-really-working</link><guid isPermaLink="false">https://newsletter.wangari.global/p/who-is-my-ai-agent-really-working</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Tue, 05 May 2026 06:02:05 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!dtVw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe432a11f-ad19-4b13-9266-b3da0ec62561_1344x768.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dtVw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe432a11f-ad19-4b13-9266-b3da0ec62561_1344x768.heic" data-component-name="Image2ToDOM"><div class="image2-inset image2-full-screen"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dtVw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe432a11f-ad19-4b13-9266-b3da0ec62561_1344x768.heic 424w, https://substackcdn.com/image/fetch/$s_!dtVw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe432a11f-ad19-4b13-9266-b3da0ec62561_1344x768.heic 848w, https://substackcdn.com/image/fetch/$s_!dtVw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe432a11f-ad19-4b13-9266-b3da0ec62561_1344x768.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!dtVw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe432a11f-ad19-4b13-9266-b3da0ec62561_1344x768.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dtVw!,w_5760,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe432a11f-ad19-4b13-9266-b3da0ec62561_1344x768.heic" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e432a11f-ad19-4b13-9266-b3da0ec62561_1344x768.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;full&quot;,&quot;height&quot;:768,&quot;width&quot;:1344,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:232488,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.wangari.global/i/196439090?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe432a11f-ad19-4b13-9266-b3da0ec62561_1344x768.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-fullscreen" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dtVw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe432a11f-ad19-4b13-9266-b3da0ec62561_1344x768.heic 424w, https://substackcdn.com/image/fetch/$s_!dtVw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe432a11f-ad19-4b13-9266-b3da0ec62561_1344x768.heic 848w, 
https://substackcdn.com/image/fetch/$s_!dtVw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe432a11f-ad19-4b13-9266-b3da0ec62561_1344x768.heic 1272w, https://substackcdn.com/image/fetch/$s_!dtVw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe432a11f-ad19-4b13-9266-b3da0ec62561_1344x768.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Agents are capable and powerful &#8212; but what if they chip away at the wrong problem? 
Image generated with Leonardo AI</figcaption></figure></div><p><em>Author's note: Wangari is evolving, and so is this newsletter. We've refreshed the name, the look, and the focus &#8212; you'll find more on the company and what we're building at <a href="https://wangari.global/">wangari.global</a>. If you've been here since the early days, thank you. If you're new, welcome.</em></p><div><hr></div><p>A few weeks ago, I needed to set up a frontend deployment for a new project. I opened an AI agent, gave it the parameters, and within minutes it had recommended DigitalOcean, configured the deployment, and handed me the setup. I clicked confirm, entered my payment details, and moved on with my day.</p><p>It was a perfectly smooth experience. It was also, in retrospect, slightly unsettling.</p><p>If I had done that task myself a year ago, I would have spent an hour reading documentation. I would have compared DigitalOcean against AWS, Vercel, and Heroku. I would have checked pricing tiers and read a few Reddit threads about latency. This time, I did none of that. I outsourced the judgment entirely. The agent made a reasonable call, but the truth is, I didn&#8217;t actually know why it chose that specific provider over the others. I just accepted the efficiency gain and paid the bill.</p><p>We talk a lot about the fear of &#8220;runaway AI&#8221;&#8212;the sci-fi scenario where autonomous systems hijack our businesses or our infrastructure. But the reality of agentic AI is much subtler, and in some ways, much more insidious. The danger isn&#8217;t that the agent goes rogue. The danger is that the agent is always optimizing for something, and it&#8217;s not always what you think.</p><h2>The Illusion of Shared Intent</h2><p>When you delegate a task to a human&#8212;say, an analyst on your team&#8212;you share a broad context. 
If you ask them to research a software vendor, they know that the implicit goal is to find a reliable, cost-effective solution that fits the company&#8217;s existing tech stack. They know when to stop researching and make a decision. They know what &#8220;done&#8221; looks like.</p><p>Agents do not share this context. They operate on objective functions. You give them a prompt, and they translate that prompt into a mathematical target to maximize.</p><p>In the case of my DigitalOcean deployment, the agent was likely optimizing for &#8220;fastest path to a working configuration based on the user&#8217;s prompt.&#8221; It wasn&#8217;t optimizing for long-term cost efficiency, because I didn&#8217;t explicitly tell it to. It wasn&#8217;t optimizing for vendor lock-in risk. It just found the shortest distance between my request and a successful execution.</p><p>This is the first form of misalignment: outsourced judgment. When the cost of making a decision drops to zero, we stop making decisions. We let the model choose. But the model is choosing based on its training data and its hidden system prompts, not based on your strategic priorities. You get the efficiency, but you lose the steering wheel.</p><h2>The Agent That Wouldn&#8217;t Stop</h2><p>There is a second, more frustrating form of misalignment, and if you&#8217;ve used agentic workflows recently, you&#8217;ve probably seen it.</p><p>I was watching an agent try to pull a specific dataset from a public API last week. The API endpoint had changed, and the agent&#8217;s initial request failed. A human would have looked at the error, realized the documentation was out of date, and stopped to ask for help or search for the new endpoint.</p><p>The agent did not stop. It retried the exact same call. When that failed, it slightly rephrased the headers and tried again. It wrote a Python script to try a different authentication method. 
It looped, and looped, and looped, burning through API tokens with relentless, cheerful persistence.</p><p>Why did it do this? Most likely because the underlying model was trained toward a specific objective: continue the conversation.</p><p>Most commercial LLMs are fine-tuned using Reinforcement Learning from Human Feedback (RLHF) to be helpful, harmless, and conversational. That training tends to penalize giving up or saying &#8220;I don&#8217;t know.&#8221; When you wrap that conversational model in an agentic loop and give it a credit card or an API key, that &#8220;helpful&#8221; persistence becomes a liability. The agent doesn&#8217;t know when to quit because quitting was penalized during its training. It optimizes for continuation rather than completion.</p><h2>The Hidden Cost of Misaligned Agency</h2><p>These two scenarios&#8212;the agent that decides too quickly and the agent that won&#8217;t stop trying&#8212;are symptoms of the same structural problem. The agent&#8217;s objective function is a proxy for yours, not the real thing.</p><p>In financial services, where we spend a lot of time at Wangari, this gap between proxy and reality is a known systemic risk. If you incentivize a trader based purely on quarterly returns (the proxy), they will take on hidden tail risks that blow up the fund in year three (the reality). We are currently doing the exact same thing with our software.</p><p>When we deploy agents to negotiate contracts, optimize supply chains, or manage cloud infrastructure, we are handing over agency to systems that do not share our risk tolerance. They do not feel the pain of a blown budget. They do not care if a vendor relationship sours. They only care about the mathematical proxy we gave them.</p><p>And because they operate at machine speed, the drift happens faster than we can monitor it. You let the agent choose the deployment provider today. Tomorrow, you let it negotiate the enterprise tier. 
Next month, it&#8217;s automatically renewing subscriptions across your entire tech stack based on a &#8220;cost optimization&#8221; prompt that actually just locks you into longer contracts.</p><h2>How to Stay in the Loop</h2><p>The solution is not to turn the agents off. The productivity gains are too massive to ignore, and frankly, I don&#8217;t want to go back to reading AWS documentation if I don&#8217;t have to. The solution is to change how we define the boundaries of their autonomy.</p><p>First, we have to stop treating agents like human employees. You cannot manage an agent through &#8220;vibes&#8221; or implicit context. You have to manage it through explicit constraints. If you want an agent to optimize a process, you must mathematically define the cost of failure, the budget limit, and the exact conditions under which it must stop and ask for human intervention.</p><p>Second, we need to demand better observability from the platforms building these tools. I don&#8217;t just want to see the final output; I want to see the decision tree. If an agent recommends a vendor, it should be required to show the three alternatives it discarded and the specific weights it applied to make the choice. Explainability is not just a regulatory requirement; it is a prerequisite for trust.</p><h2>The Bottom Line</h2><p>We are entering a phase of technology where the primary skill is no longer execution, but delegation. The people and companies that thrive will not be the ones who write the best code or do the fastest research. They will be the ones who know how to explicitly define their intent, and how to build the guardrails that keep their agents aligned with that intent.</p><p>The next time an agent does something perfectly for you, take a moment to ask yourself: what was it actually optimizing for? 
And are you sure it&#8217;s the same thing you wanted?</p><div><hr></div><h1>Meanwhile, at Wangari</h1><h4>Scaling Sustainable Digital Platforms</h4><p>Together with the University of Bern, we are conducting academic research on how sustainable digital platforms grow and scale responsibly. If your company embeds environmental or social goals into its core business model, we&#8217;d love to speak with you.</p><p>The study involves 2&#8211;3 short interviews with key employees. Participation is anonymous and confidential, and the time commitment is low &#8212; and you&#8217;ll receive early access to our findings.</p><p>If this sounds like your company, or if you know someone it might fit, please reach out directly:</p><ul><li><p>Ari Joury, Cofounder &amp; CEO, Wangari Global &#8212; <a href="mailto:ari.joury@wangari.global">ari.joury@wangari.global</a></p></li><li><p>Melanie Gertschen, PhD Candidate, University of Bern &#8212; <a href="mailto:melanie.gertschen@unibe.ch">melanie.gertschen@unibe.ch</a></p></li></ul><div><hr></div><h1>Reads of the Week</h1><ul><li><p><a href="https://insuranceintel.substack.com/p/the-silent-ai-trap-why-legacy-carriers">The AI Liability Playbook: Monetizing the Silent AI Carve-Out</a>: In this piece for Insurance Intel, the author explores how the insurance industry is grappling with the new liability class forming around enterprise AI deployment. It perfectly complements our discussion on the liability gap, showing how the market is struggling to price risks it doesn&#8217;t fully understand.</p></li><li><p><a href="https://drericcole.substack.com/p/the-regulation-is-already-here-your">The Regulation Is Already Here. Your Program Isn&#8217;t Ready.</a> Writing for his eponymous newsletter, <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Dr. 
Eric Cole&quot;,&quot;id&quot;:176246416,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/89132108-a240-4c7d-bbe7-be0a6f41b5f4_1080x1080.png&quot;,&quot;uuid&quot;:&quot;337b0b9c-c907-4c67-bccd-bee4987e3d15&quot;}" data-component-name="MentionToDOM"></span> argues that cyber compliance has shifted from a checkbox exercise to personal liability for executives. This directly connects to the governance challenges of deploying autonomous agents in regulated environments.</p></li><li><p><a href="https://maxcorbridge.substack.com/p/update-46-who-is-actually-accountable">Who Is Accountable When Your Agent Goes Rogue?</a> In this deep dive for his newsletter, <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Max Corbridge&quot;,&quot;id&quot;:325139711,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d20ed10e-9c2f-4fa1-9b1b-b7d1cd9817c5_3251x3251.jpeg&quot;,&quot;uuid&quot;:&quot;d382e013-f5ba-422d-8154-11ed9686138f&quot;}" data-component-name="MentionToDOM"></span> breaks down the accountability vacuum created when AI providers disclaim liability for security flaws in their models. 
Read this to understand the immediate risks of third-party agent integration.</p></li></ul>]]></content:encoded></item><item><title><![CDATA[How Agentic AI Finally Makes Causal Inference Deployable]]></title><description><![CDATA[A technical walkthrough of the five bottlenecks that kept causal models out of production &#8212; and how AI agents are removing them one by one]]></description><link>https://newsletter.wangari.global/p/how-agentic-ai-finally-makes-causal</link><guid isPermaLink="false">https://newsletter.wangari.global/p/how-agentic-ai-finally-makes-causal</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Fri, 01 May 2026 06:00:42 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Las-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ed0300-ae22-48c3-bb15-e1d12f07e96b_1344x768.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Las-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ed0300-ae22-48c3-bb15-e1d12f07e96b_1344x768.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Las-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ed0300-ae22-48c3-bb15-e1d12f07e96b_1344x768.heic 424w, https://substackcdn.com/image/fetch/$s_!Las-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ed0300-ae22-48c3-bb15-e1d12f07e96b_1344x768.heic 848w, 
https://substackcdn.com/image/fetch/$s_!Las-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ed0300-ae22-48c3-bb15-e1d12f07e96b_1344x768.heic 1272w, https://substackcdn.com/image/fetch/$s_!Las-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ed0300-ae22-48c3-bb15-e1d12f07e96b_1344x768.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Las-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ed0300-ae22-48c3-bb15-e1d12f07e96b_1344x768.heic" width="1344" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e8ed0300-ae22-48c3-bb15-e1d12f07e96b_1344x768.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1344,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:138828,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://wangari.substack.com/i/195233339?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ed0300-ae22-48c3-bb15-e1d12f07e96b_1344x768.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Las-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ed0300-ae22-48c3-bb15-e1d12f07e96b_1344x768.heic 424w, 
https://substackcdn.com/image/fetch/$s_!Las-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ed0300-ae22-48c3-bb15-e1d12f07e96b_1344x768.heic 848w, https://substackcdn.com/image/fetch/$s_!Las-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ed0300-ae22-48c3-bb15-e1d12f07e96b_1344x768.heic 1272w, https://substackcdn.com/image/fetch/$s_!Las-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ed0300-ae22-48c3-bb15-e1d12f07e96b_1344x768.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a>
<figcaption class="image-caption">In a changing world, logic is what we trust in &#8212; not just experience. Image generated with Leonardo AI</figcaption></figure></div><p>Earlier this week I made a claim that I want to back up properly: that agentic AI has made causal inference tractable at enterprise scale for the first time. This is not a marketing statement. It is a specific technical argument, and it deserves a specific technical treatment.</p><p>In this post, I want to walk through the five stages of deploying a causal model in a production environment, explain why each stage was a bottleneck before the current generation of AI agents, and show concretely what changes when you introduce agents into the pipeline. I will also be honest about what agents cannot do &#8212; because the failure modes of this architecture are as important as its strengths.</p><p>This is not a tutorial. There is no step-by-step guide to follow. The depth here comes from the reasoning, not from the implementation. If you want to understand why this architecture is different &#8212; not just that it is different &#8212; this is the post for you.</p><h2>Background: what causal inference actually requires</h2><p>Before we can talk about what agents change, we need to be precise about what causal inference requires. The framework I am working with is Judea Pearl&#8217;s Structural Causal Model (SCM) approach, which is the dominant formal framework for causal reasoning in statistics and machine learning.</p><p>An SCM represents a system as a set of variables and a set of structural equations &#8212; one for each variable &#8212; that specify how that variable is determined by its causes. The causal structure is represented as a Directed Acyclic Graph (DAG): nodes are variables, directed edges are direct causal relationships. 
The graph is acyclic because we assume no variable can be its own cause (in the relevant time window).</p><p>The key operation in this framework is the do-operator, written do(X = x). This represents an intervention: we set variable X to value x, severing its connection to its usual causes. The do-calculus &#8212; a complete set of inference rules developed by Pearl &#8212; allows us to compute the effect of such interventions from observational data, given a known causal graph. This is what makes the framework powerful: you can answer &#8220;what if we change X?&#8221; without ever having run the experiment.</p><p>Deploying this in a production environment requires five distinct stages. Each one, historically, has been a serious bottleneck.</p><h3>Stage 1: Variable selection and domain scoping</h3><p>The first stage is deciding which variables belong in the model. This sounds straightforward. It is not.</p><p>In a financial system, the space of potentially relevant variables is enormous: macroeconomic indicators, firm-level financials, regulatory metrics, ESG scores, market microstructure variables, sentiment signals, and more. Not all of them belong in the causal model. Including too many variables introduces spurious paths and makes the graph unidentifiable. Including too few means missing important confounders and getting biased estimates of causal effects.</p><p>The traditional approach is expert workshops: bring together domain specialists &#8212; actuaries, risk officers, portfolio managers &#8212; and have them argue about which variables matter and why. This process is valuable. It is also slow, expensive, and heavily dependent on who is in the room. A variable that one expert considers obviously relevant may not occur to another.</p><p>What agents change here is the breadth of the initial search. 
An agent can synthesise a large body of domain literature &#8212; academic papers, regulatory guidance, industry reports &#8212; and propose a candidate variable set with citations. It can identify variables that appear repeatedly in the literature as important drivers of a given outcome, flag variables that are commonly treated as confounders, and surface domain knowledge that might not be in the room. The expert&#8217;s job shifts from generating the list from scratch to evaluating and pruning a well-researched proposal. This is faster and less dependent on any single expert&#8217;s knowledge.</p><p>The caveat is important: agents can propose, but they cannot validate. The final variable set must be approved by domain experts who understand the business context. An agent that has read every paper on ESG and financial performance still does not know which variables are actually available in your data infrastructure, or which ones your compliance team will accept in a regulatory submission.</p><h3>Stage 2: Causal graph construction</h3><p>Once you have a variable set, you need to specify the causal structure: which variables cause which, and in which direction. This is the hardest stage, and historically the most expensive.</p><p>There are two broad approaches: constraint-based methods (like the PC algorithm, from Spirtes, Glymour, and Scheines) that infer causal structure from conditional independence tests in the data, and score-based methods that search over graph structures to maximise a goodness-of-fit criterion. Both have well-known limitations. Constraint-based methods are sensitive to the faithfulness assumption and to sample size. Score-based methods face a combinatorial search problem that becomes intractable for large variable sets. 
Neither approach produces a unique graph from observational data alone &#8212; the best you can get is a Markov equivalence class of graphs that are statistically indistinguishable.</p><p>In practice, this means that automated causal discovery algorithms can narrow the space of plausible graphs, but they cannot determine the final structure without domain input. The direction of edges &#8212; which is often the most important thing &#8212; frequently cannot be determined from the data alone and must be specified by a domain expert.</p><p>What agents change here is the iteration speed. An agent can run multiple causal discovery algorithms, compare their outputs, flag edges where the algorithms disagree, and present the domain expert with a structured set of decisions: &#8220;these edges are agreed across all methods; these are contested; these are determined by the data but conflict with the following domain knowledge.&#8221; The expert&#8217;s job shifts from running algorithms and interpreting raw output to making a set of well-framed decisions. The number of decisions is the same; the cost of each decision is lower.</p><p>The failure mode to watch for: agents that present a single &#8220;best&#8221; graph without surfacing the uncertainty. Causal graphs are not uniquely determined by data. Any system that presents a causal structure as if it were a fact &#8212; rather than a hypothesis to be validated &#8212; is misrepresenting the epistemics of the problem.</p><h3>Stage 3: Graph validation and sensitivity testing</h3><p>A causal graph is a set of assumptions. 
Every edge is a claim: &#8220;this variable directly causes that one.&#8221; Every missing edge is also a claim: &#8220;these two variables are not directly causally connected, conditional on everything else in the graph.&#8221; These claims can be wrong, and the consequences of getting them wrong can be severe.</p><p>The standard approach to validation is a combination of domain review (does the graph make sense to experts?) and statistical testing (do the conditional independence relationships implied by the graph hold in the data?). The latter is formalised through the concept of d-separation: a graph implies that certain pairs of variables are conditionally independent given certain other variables, and these implications can be tested.</p><p>Sensitivity analysis asks a related question: how much do the causal effect estimates change if we modify the graph &#8212; add an edge, reverse a direction, introduce an unmeasured confounder? This is important because the graph is always an approximation of reality, and you want to know which parts of your conclusions are robust to that approximation and which are fragile.</p><p>Historically, this stage required a specialist statistician who could run the tests, interpret the results, and translate them back into graph modifications. It was slow and iterative. What agents change is the automation of the test battery: an agent can systematically run all implied conditional independence tests, flag violations, identify which edges are implicated in each violation, and generate a structured report. It can also run systematic sensitivity analyses &#8212; varying edge weights, introducing hypothetical confounders, testing the stability of effect estimates &#8212; and summarise the results. The statistician&#8217;s job shifts from running tests to interpreting a structured diagnostic report.</p><h3>Stage 4: Interventional query answering</h3><p>This is the stage where the causal model earns its keep. 
An interventional query asks: &#8220;what is the expected value of outcome Y if we set variable X to value x?&#8221; In Pearl&#8217;s notation:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;E[Y | \\text{do}(X = x)].&quot;,&quot;id&quot;:&quot;KRQPABJMES&quot;}" data-component-name="LatexBlockToDOM"></div><p>This is different from the conditional expectation that a standard regression model estimates:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;E[Y | X = x]&quot;,&quot;id&quot;:&quot;SSAOJFCQOO&quot;}" data-component-name="LatexBlockToDOM"></div><p>Computing interventional queries requires applying the do-calculus: a set of three rules that allow you to transform expressions involving the do-operator into expressions that can be computed from the observational distribution, given the causal graph. For simple graphs, this is straightforward. For complex graphs with many variables, multiple intervention targets, and time-series structure, it can require significant algebraic manipulation.</p><p>Historically, translating a business question (&#8220;what happens to our loss ratio if we change our pricing structure?&#8221;) into a formal interventional query, and then computing that query from the causal model, required a specialist who understood both the business context and the mathematical machinery. This was the primary reason that causal inference remained in academic settings: the translation cost was too high.</p><p>What agents change here is the translation layer. An agent can take a natural-language business question, identify the relevant variables, formulate the appropriate do-expression, apply the do-calculus to derive an estimable expression, and return the result with a plain-language explanation. The domain expert does not need to know the do-calculus. 
They need to be able to evaluate whether the agent&#8217;s translation of their question is correct &#8212; which is a much lower bar.</p><p>The failure mode: agents that answer the wrong question confidently. A business question is often ambiguous between a conditional query and an interventional query. &#8220;What happens to our loss ratio when ESG scores are high?&#8221; could mean &#8220;what do we observe in cases where ESG scores happen to be high?&#8221; (conditional) or &#8220;what would happen if we forced ESG scores to be high?&#8221; (interventional). These have different answers. An agent that does not surface this ambiguity &#8212; and ask the user to resolve it &#8212; is a liability.</p><h3>Stage 5: Audit trail and documentation</h3><p>In regulated industries, the output of an analysis is not just the answer. It is the answer plus the full chain of reasoning that produced it. Every modelling assumption, every data transformation, every analytical choice must be documented and defensible.</p><p>For causal models, this is particularly demanding. The audit trail must cover: the variable selection rationale, the causal graph structure and the basis for each edge, the validation tests and their results, the specific interventional queries that were run, and the mapping from those queries to the business decisions they informed. In a manual process, this documentation is often incomplete, inconsistent, and produced after the fact.</p><p>What agents change here is that the audit trail can be automatic and contemporaneous. Every agent action &#8212; every literature search, every algorithm run, every graph modification, every query &#8212; is logged with a timestamp, the inputs, the outputs, and the reasoning. The documentation is not a separate task; it is a byproduct of the process. For a regulatory submission or an internal audit, this is not a minor convenience. 
It is the difference between a defensible analysis and one that cannot be reconstructed.</p><h2>What this architecture cannot do</h2><p>I want to be direct about the limitations, because they matter.</p><p>Agents cannot determine causal structure from data alone. The direction of causal edges is often underdetermined by observational data, and no amount of computational power changes this. Domain expertise is not optional; it is load-bearing.</p><p>Agents cannot validate their own translations. When an agent translates a business question into a formal query, it may translate it incorrectly &#8212; and it may do so confidently. The human review step at Stage 4 is not a formality. It is the primary defence against a class of errors that are invisible in the output but consequential in the decision.</p><p>Agents are not a substitute for experimental data. The do-calculus allows you to compute interventional effects from observational data under certain assumptions &#8212; primarily that the causal graph is correctly specified and that there are no unmeasured confounders. When these assumptions are violated, the estimates can be badly wrong. Agents cannot tell you when the assumptions are violated; only domain knowledge and, ultimately, experimental evidence can do that.</p><h2>The Bottom Line</h2><p>The case for this architecture is not that it makes causal inference easy. It doesn&#8217;t. The case is that it makes causal inference viable &#8212; that it removes the cost barriers that kept a rigorous and well-established methodology out of production for thirty years.</p><p>The methodology is not new. The infrastructure is. 
And the organisations that combine the two &#8212; that build systems where agents handle the process layer and domain experts hold the judgment layer &#8212; are building something that correlation-based AI cannot replicate: a rigorous, auditable, interventionally valid model of the systems they operate in.</p><p>In regulated industries, that is worth building. The window for building it first is open now.</p>]]></content:encoded></item><item><title><![CDATA[Your Model Knows What Happened. It Doesn't Know Why.]]></title><description><![CDATA[The gap between correlation and causation is not a technical detail &#8212; it is the whole problem]]></description><link>https://newsletter.wangari.global/p/your-model-knows-what-happened-it</link><guid isPermaLink="false">https://newsletter.wangari.global/p/your-model-knows-what-happened-it</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Thu, 30 Apr 2026 06:00:32 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/195231788/c70f7080055bd2234406046810eac535.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>There is a version of AI in financial services that is very good at finding patterns. It has been trained on years of data, it can process millions of variables, and it will tell you, with impressive confidence, what tends to happen next. What it cannot tell you is why. And in the moment you need to make a decision &#8212; to intervene, to change something, to act &#8212; &#8220;what tends to happen&#8221; is the wrong answer to the wrong question.</p><h2>The moment correlation breaks</h2><p>Correlation-based models are built on an implicit assumption: that the future will look enough like the past that past patterns will hold. This assumption is reasonable in stable conditions. It breaks precisely when you need it most &#8212; during structural shifts, regulatory changes, or market disruptions. More fundamentally, it breaks the moment you intervene. 
When you change your underwriting criteria, restructure a portfolio, or alter your ESG policy, you are not observing the world. You are changing it. A model trained on observation has nothing principled to say about what happens when you act.</p><p>This is not a failure of the model. It is a failure of the question. Correlation can tell you what co-occurs. It cannot tell you what will happen if you force something to change. That requires a different kind of reasoning &#8212; one that encodes not just patterns, but mechanisms.</p><h2>What causal models do differently</h2><p>A causal model does not just learn that two things tend to move together. It encodes the directional mechanism: this variable drives that one, through this pathway, under these conditions. Once you have that structure, you can ask the question that actually matters for decision-making: if we intervene here, what happens there? Not &#8220;what tends to happen when X is high?&#8221; but &#8220;what would happen if we set X to this value &#8212; deliberately, right now?&#8221;</p><p>For an actuary, this is the difference between a model that describes historical loss patterns and one that can tell you what happens to your loss ratio if you change your pricing structure. For an ESG analyst, it is the difference between a model that shows ESG scores correlating with returns and one that can tell you whether improving your sustainability practices will actually improve your financial performance &#8212; or whether both are being driven by something else entirely.</p><h2>The Bottom Line</h2><p>The methodology to build these models has existed for thirty years. What has changed is the infrastructure to apply it at scale &#8212; and that infrastructure is here now. The organisations that make this shift are not just getting better predictions. They are getting answers to questions that correlation-based systems cannot answer at all. In regulated industries, that is not a marginal improvement. 
It is a different game.</p>]]></content:encoded></item><item><title><![CDATA[Causal Inference is Finally There]]></title><description><![CDATA[Why causal inference is finally arriving in industry &#8212; thirty years after it was invented]]></description><link>https://newsletter.wangari.global/p/causal-inference-is-finally-there</link><guid isPermaLink="false">https://newsletter.wangari.global/p/causal-inference-is-finally-there</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Tue, 28 Apr 2026 06:01:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!OANR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42ad27b6-ea22-4358-b72c-3af32313e973_1344x768.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OANR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42ad27b6-ea22-4358-b72c-3af32313e973_1344x768.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OANR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42ad27b6-ea22-4358-b72c-3af32313e973_1344x768.heic 424w, https://substackcdn.com/image/fetch/$s_!OANR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42ad27b6-ea22-4358-b72c-3af32313e973_1344x768.heic 848w, https://substackcdn.com/image/fetch/$s_!OANR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42ad27b6-ea22-4358-b72c-3af32313e973_1344x768.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!OANR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42ad27b6-ea22-4358-b72c-3af32313e973_1344x768.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OANR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42ad27b6-ea22-4358-b72c-3af32313e973_1344x768.heic" width="1344" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/42ad27b6-ea22-4358-b72c-3af32313e973_1344x768.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1344,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:142218,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://wangari.substack.com/i/195230159?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42ad27b6-ea22-4358-b72c-3af32313e973_1344x768.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OANR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42ad27b6-ea22-4358-b72c-3af32313e973_1344x768.heic 424w, https://substackcdn.com/image/fetch/$s_!OANR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42ad27b6-ea22-4358-b72c-3af32313e973_1344x768.heic 848w, 
https://substackcdn.com/image/fetch/$s_!OANR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42ad27b6-ea22-4358-b72c-3af32313e973_1344x768.heic 1272w, https://substackcdn.com/image/fetch/$s_!OANR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42ad27b6-ea22-4358-b72c-3af32313e973_1344x768.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Causal inference is like the lighthouse of statistics: You can see further, even over uncharted waters. 
Image generated with Leonardo AI</figcaption></figure></div><p>For three decades, a small community of researchers has known something that the rest of the data world is only now beginning to absorb: the most important question you can ask about your data is not &#8220;what correlates with what?&#8221; It is &#8220;what causes what?&#8221; The methodology to answer that question rigorously has existed since the early 1990s. The tools to apply it at enterprise scale have not &#8212; until now.</p><p>This is the story of a thirty-year gap between a scientific breakthrough and its practical arrival. And it is a story that matters enormously for anyone in financial services who is trying to build AI systems they can actually trust.</p><h2>The problem with correlation</h2><p>Correlation is the workhorse of modern data analysis. It is fast, scalable, and surprisingly powerful. When you train a model on historical data and ask it to predict future outcomes, you are essentially asking it to find and exploit correlations &#8212; patterns that held in the past and might hold in the future. This works well enough in stable conditions. It fails, often silently, when conditions change.</p><p>The deeper problem is that correlation cannot answer the question that actually matters in many regulated industries: what happens if we intervene? If you change your underwriting criteria, restructure a portfolio, or shift your ESG policy &#8212; you are not observing the world, you are changing it. A correlation-based model has nothing principled to say about what happens next. It can only extrapolate from the past. So if unprecedented conditions occur, it&#8217;s flummoxed: It was trained on a world where things co-occurred; it has no mechanism for reasoning about a world you have deliberately altered in previously unseen ways.</p><p>This is not a data quality problem. It is not a model size problem. It is a fundamental limitation of correlation as a mode of reasoning. 
And it is why industry professionals like actuaries, risk officers, and investment analysts have always maintained a healthy scepticism toward purely statistical models, even when they could not always articulate exactly why.</p><h2>What causal inference actually does</h2><p>Causal inference, in the technical sense developed by Judea Pearl and colleagues, is a framework for reasoning about interventions. Instead of asking &#8220;what tends to happen when X is high?&#8221;, it asks &#8220;what would happen if we set X to a specific value ourselves, overriding whatever normally determines it?&#8221; The two questions sound similar. They have very different answers, and they require very different mathematics.</p><p>The key tool is the Structural Causal Model: a formal representation of a system as a set of variables and the directional mechanisms that connect them. Not just correlations, but causes. The model encodes which variables drive which outcomes, through which pathways, and with what structure. Once you have that model, you can answer interventional questions directly &#8212; not by extrapolating from historical patterns, but by reasoning through the causal structure of the system.</p><p>For industry and financial services, this matters in ways that are immediately practical. A model of a manufacturing plant that&#8217;s built on causal structure can tell you whether improving your sustainability practices will actually improve your financial performance &#8212; or whether both are driven by a third factor, like management quality or regulatory environment. A risk model built on causal structure can tell you which interventions will actually reduce tail risk &#8212; not just which variables happen to be correlated with it. These are the questions that senior decision-makers are actually asking. 
Correlation-based models cannot answer them.</p><h2>Why it took thirty years</h2><p>If the methodology was ready in the 1990s, why are we only now seeing it arrive in enterprise software? The honest answer is that applying causal inference at scale has always required an enormous amount of expert labor.</p><p>Building a causal model is not like training a neural network. You cannot simply feed it data and let it find patterns. You need to specify the causal structure of the system &#8212; which variables are causes, which are effects, which are confounders. This requires domain expertise, iterative validation, and careful reasoning about the mechanisms at play. </p><p>For a complex system with dozens of interacting variables, this process could take weeks of expert workshops. And that was before you got to the question of how to translate the resulting model into answers to specific business questions.</p><p>The bottleneck was never the mathematics. It was the cost of applying the mathematics to real-world problems. Causal inference was tractable in academic settings, where a team of specialists could spend months on a single model. It was not tractable in enterprise settings, where you need answers in days, not months, and where the domain experts who could validate the causal structure are also the people running the business.</p><h2>What changed: AI agents, of course</h2><p>The emergence of capable AI agents has changed this equation in a way that is genuinely new. Tasks that previously required weeks of expert time &#8212; synthesising domain literature to identify candidate variables, proposing and testing causal graph structures, running systematic validation checks, translating business questions into formal interventional queries &#8212; can now be completed in hours. The methodology has not changed. The infrastructure for applying it at scale has.</p><p>This is not the same as saying that AI agents can replace domain expertise. 
They cannot, and they should not. The judgment layer &#8212; validating the causal structure against real-world knowledge, deciding which interventions are worth modelling, interpreting results in context &#8212; remains human. What agents automate is the process layer: the high-volume, well-defined, error-prone work that was consuming most of the expert&#8217;s time without requiring most of the expert&#8217;s judgment.</p><p>The combination of mature causal methodology and modern agentic AI infrastructure is genuinely new. It is not a marginal improvement on existing approaches. It is a different class of tool &#8212; one that can answer questions that correlation-based systems cannot, at a cost that is now commercially viable for the first time.</p><h2>The Bottom Line</h2><p>The organisations that build causal AI capabilities now are not just getting better analytics. They are building a fundamentally different relationship with their data &#8212; one where the question &#8220;why?&#8221; has a rigorous, auditable answer, not just a plausible-sounding one. In regulated industries, where the cost of a wrong answer is measured in capital requirements, regulatory penalties, and reputational damage, that difference is not academic. It is the whole game.</p><p>The methodology has been ready for thirty years. The infrastructure just caught up. </p><p>The window for first-mover advantage is open. 
It will not stay open.</p><div><hr></div><h2>Reads of the Week</h2><ul><li><p><a href="https://platforms.substack.com/p/the-problem-with-agentic-ai-in-2025">The problem with agentic AI in 2025</a>: In this essay, <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Sangeet Paul Choudary&quot;,&quot;id&quot;:3927722,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!T1l9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F54a044f6-e037-4f1d-8694-cd4ef885134d_600x400.png&quot;,&quot;uuid&quot;:&quot;b763e5f6-f683-41db-b83d-0e8a30ac8845&quot;}" data-component-name="MentionToDOM"></span> argues that most organisations are treating agentic AI as a faster version of robotic process automation &#8212; and missing the point entirely. His central claim is that the real value of agents is not in executing workflows more cheaply, but in eliminating the logic of workflows altogether, and that governance &#8212; not execution speed &#8212; is the primary performance driver of a well-designed agentic system. Directly relevant to anyone thinking about how AI agents should be deployed in regulated, high-stakes environments. </p></li><li><p><a href="https://practicalainvestor.substack.com/p/correlation-vs-causation-why-it-matters">Correlation vs. 
Causation: Why It Matters for Investors</a>: <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Alessio Sancetta&quot;,&quot;id&quot;:340155203,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/856a6d09-8353-41be-830b-82b6cf6197b0_1280x1280.jpeg&quot;,&quot;uuid&quot;:&quot;0572d46e-5af3-47c4-9b63-25dccb458f9e&quot;}" data-component-name="MentionToDOM"></span>&#8217;s take makes the core argument with unusual clarity: correlation describes a pattern, but without a causal anchor, even robust-looking relationships can collapse the moment conditions change. The 2022 equity-bond drawdown is the worked example &#8212; a correlation that held for two decades, built on a conditional relationship that most practitioners had mistaken for a structural one. A useful complement to this week&#8217;s post, written for a portfolio construction audience rather than a technical one. </p></li><li><p><a href="https://www.grumpy-economist.com/p/causation-does-not-imply-variation">Causation Does Not Imply Variation</a>: <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;John H. Cochrane&quot;,&quot;id&quot;:18572918,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!XH8t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6edc5e62-79ae-4d03-b5f3-4732fbea4277_500x500.png&quot;,&quot;uuid&quot;:&quot;5e2a1541-d654-4199-a54f-97b2c911719f&quot;}" data-component-name="MentionToDOM"></span>  offers a useful corrective to the other direction: just because you have identified a causal effect does not mean it explains much of the variation in the outcome you care about. 
Cochrane&#8217;s argument &#8212; that the causality revolution in econometrics has produced many well-identified but tiny effects, and that practitioners often jump from &#8220;this causes that&#8221; to &#8220;this explains that&#8221; without stopping to think &#8212; is an important caveat for anyone building causal models in production. Read it as a reminder that causal inference is a tool for answering specific questions, not a general-purpose explanation of the world.</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Architecting for Autonomy: Beyond the Chatbot Paradigm]]></title><description><![CDATA[A deep-dive into the structural differences between conversational LLMs and agentic frameworks like OpenClaw and NanoClaw.]]></description><link>https://newsletter.wangari.global/p/architecting-for-autonomy-beyond</link><guid isPermaLink="false">https://newsletter.wangari.global/p/architecting-for-autonomy-beyond</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Fri, 24 Apr 2026 06:02:11 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!WjYy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4d278da-1e12-45f6-af02-5f606350a443_1344x768.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WjYy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4d278da-1e12-45f6-af02-5f606350a443_1344x768.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WjYy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4d278da-1e12-45f6-af02-5f606350a443_1344x768.heic 424w, 
https://substackcdn.com/image/fetch/$s_!WjYy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4d278da-1e12-45f6-af02-5f606350a443_1344x768.heic 848w, https://substackcdn.com/image/fetch/$s_!WjYy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4d278da-1e12-45f6-af02-5f606350a443_1344x768.heic 1272w, https://substackcdn.com/image/fetch/$s_!WjYy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4d278da-1e12-45f6-af02-5f606350a443_1344x768.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WjYy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4d278da-1e12-45f6-af02-5f606350a443_1344x768.heic" width="1344" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e4d278da-1e12-45f6-af02-5f606350a443_1344x768.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1344,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:114640,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://wangari.substack.com/i/194498086?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4d278da-1e12-45f6-af02-5f606350a443_1344x768.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!WjYy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4d278da-1e12-45f6-af02-5f606350a443_1344x768.heic 424w, https://substackcdn.com/image/fetch/$s_!WjYy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4d278da-1e12-45f6-af02-5f606350a443_1344x768.heic 848w, https://substackcdn.com/image/fetch/$s_!WjYy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4d278da-1e12-45f6-af02-5f606350a443_1344x768.heic 1272w, https://substackcdn.com/image/fetch/$s_!WjYy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4d278da-1e12-45f6-af02-5f606350a443_1344x768.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">AI is no longer just your tech support. If you build it right, it can start building for you. Image generated with Leonardo AI</figcaption></figure></div><p>The transition from conversational AI to agentic AI is not merely a change in user interface; it is a fundamental architectural shift. For the past two years, the dominant paradigm has been the stateless, prompt-response loop. A user provides a prompt, the Large Language Model (LLM) generates a response, and the interaction ends. The system&#8217;s &#8220;memory&#8221; is limited to the context window of the current session.</p><p>Agentic frameworks like OpenClaw and NanoClaw break this paradigm. They introduce persistent memory, autonomous task planning, and the ability to execute actions across external systems. This shift from passive generation to active execution introduces profound new challenges in system architecture, state management, and security.</p><p>In this deep dive, we will examine the mechanics of the &#8220;Agent Loop,&#8221; explore how memory and context are managed without traditional databases, and analyze the architectural trade-offs between monolithic agent frameworks (OpenClaw) and lightweight, isolated approaches (NanoClaw).</p><h3>The Anatomy of the Agent Loop</h3><p>At the core of any autonomous agent is the Agent Loop: a continuous cycle of observation, reasoning, and action. Unlike a standard LLM call, which is a single request followed by a single response, the Agent Loop is iterative and stateful.</p><p>When a message or trigger arrives, the agent does not immediately generate a final response. 
Instead, it enters a reasoning phase. It assembles context from its environment, including conversation history, workspace files, and available tools. It then queries the LLM not for an answer, but for a plan.</p><p>The LLM, acting as the reasoning engine, evaluates the context and determines the next necessary step. If the task requires external data or action, the LLM outputs a tool call (e.g., a JSON object specifying an API endpoint and parameters). The agent framework intercepts this tool call, executes the action (e.g., querying a database, sending an email), and appends the result to the context.</p><p>This loop repeats&#8212;often up to 20 times per request in frameworks like OpenClaw&#8212;until the LLM determines that the objective has been met and generates a final response to the user.</p><p>This iterative process is what enables agents to handle complex, multi-step workflows. However, it also introduces significant latency and cost, as each step requires a separate LLM inference call. More importantly, it creates a massive attack surface. If the LLM&#8217;s reasoning is compromised&#8212;for example, through a prompt injection attack hidden in a retrieved document&#8212;the agent may execute malicious tool calls with its delegated authority.</p><h3>State Management Without Databases</h3><p>One of the most fascinating architectural choices in OpenClaw is its approach to state management. Traditional enterprise applications rely on relational or NoSQL databases to manage state and persist data. 
OpenClaw, by default, eschews this approach in favor of plain text Markdown files.</p><p>In the OpenClaw architecture, everything from the agent&#8217;s core instructions (AGENTS.md) to its personality (SOUL.md) and long-term memory (MEMORY.md) is stored as Markdown in a local workspace directory.</p><p>This design choice has several profound implications:</p><ol><li><p>Transparency and Version Control: Because the entire state of the agent is represented as plain text, it can be easily inspected, audited, and version-controlled using standard tools like Git. Developers can see exactly what the agent &#8220;knows&#8221; at any given time.</p></li><li><p>Context Injection: When the agent needs to recall past interactions, it doesn&#8217;t query a separate system of record. The Markdown files remain the source of truth, and a local SQLite database with vector embeddings serves only as a search index: the agent runs a semantic search across its Markdown files and injects the relevant text directly into the LLM&#8217;s context window.</p></li><li><p>Concurrency Challenges: Relying on file system operations for state management introduces significant concurrency issues. If multiple asynchronous processes attempt to update the agent&#8217;s memory simultaneously, race conditions and file corruption can occur. OpenClaw mitigates this by serializing the agent loop per session&#8212;processing one task at a time, in order.</p></li></ol><p>While this file-based approach is elegant in its simplicity, it scales poorly in multi-tenant enterprise environments where high throughput and robust transaction management are required.</p><h3>The Monolith vs. The Micro-VM: OpenClaw and NanoClaw</h3><p>As the security implications of autonomous agents have become apparent, the architectural debate has centered on isolation. How do we prevent an agent from exceeding its intended scope?</p><p>OpenClaw represents the monolithic approach. 
It is a sprawling framework with hundreds of thousands of lines of code, designed to manage multiple messaging platforms, tool integrations, and agent sessions within a single Node.js process (the Gateway). Security in OpenClaw is primarily handled at the application level, relying on internal rules and permissions to restrict agent behavior.</p><p>This monolithic design is powerful and extensible, but it is also fragile. A vulnerability in any one of its dependencies or integrations can compromise the entire Gateway, granting an attacker access to all active agent sessions and their associated credentials.</p><p>NanoClaw emerged as a direct response to this fragility. It adopts a fundamentally different architectural philosophy: OS-level isolation.</p><p>Instead of running all agents within a single process, NanoClaw runs each agent in its own isolated container (using Docker or Apple Containers). The codebase is intentionally minimalist&#8212;often under 5,000 lines&#8212;reducing the attack surface and making security audits practical.</p><p>If a NanoClaw agent is compromised via prompt injection or a malicious tool, the blast radius is confined to that specific container. The attacker cannot pivot to the host operating system or access the memory of other agents.</p><h3>The Limits of Containerization</h3><p>While NanoClaw&#8217;s containerized approach provides robust protection against host compromise, it is crucial to understand its limitations. Containerization solves the problem of system security, but it does not solve the problem of identity security.</p><p>Consider an agent deployed within a NanoClaw container and granted an OAuth token to access a corporate CRM system. 
The container prevents the agent from reading the host&#8217;s /etc/passwd file, but it does nothing to prevent the agent from deleting every record in the CRM if it is manipulated into doing so.</p><p>The agent is operating exactly as designed, using the legitimate credentials it was provided. The container is intact, but the enterprise data is gone.</p><p>This highlights the core architectural challenge of agentic AI: we must move beyond securing the execution environment and begin securing the actions themselves.</p><h3>Building Verifiable AI Agents</h3><p>To safely deploy autonomous agents in enterprise environments, developers must adopt a defense-in-depth architecture that addresses both system isolation and identity governance.</p><ol><li><p><strong>Explicit Identity Boundaries:</strong> Every agent must be treated as a distinct Non-Human Identity (NHI) with its own ephemeral credentials. Long-lived API keys and broad OAuth scopes must be deprecated in favor of just-in-time, least-privilege access tokens.</p></li><li><p><strong>Verifiable Decision Paths:</strong> The Agent Loop must be instrumented to provide a verifiable audit trail of its reasoning. It is not enough to log the tool calls an agent makes; we must log the context and the LLM outputs that justified those calls. This allows security teams to reconstruct the agent&#8217;s &#8220;intent&#8221; during an incident investigation.</p></li><li><p><strong>Semantic Circuit Breakers:</strong> We cannot rely solely on the LLM to police its own behavior. Agent architectures must incorporate deterministic, semantic circuit breakers&#8212;independent validation layers that inspect proposed tool calls before they are executed. 
If an agent attempts an action that violates predefined safety invariants (e.g., transferring funds above a certain threshold, modifying production infrastructure), the circuit breaker must halt execution and require human intervention.</p></li></ol><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;f11dd667-5913-4963-ac42-1e475327f1c4&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python"># Example: A conceptual semantic circuit breaker
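# is_action_safe, is_action_authorized, request_human_approval and
# perform_action are placeholders for your own policy and execution layer.
class SecurityException(Exception):
    """Raised when a proposed action violates a hard safety invariant."""


def execute_tool_call(agent_intent, proposed_action, context):
    # 1. Validate the action against deterministic safety invariants
    if not is_action_safe(proposed_action):
        raise SecurityException("Action violates safety invariants.")

    # 2. Verify the action aligns with the agent's authorized scope;
    #    if it does not, escalate to a human and execute nothing
    if not is_action_authorized(agent_intent, proposed_action, context):
        request_human_approval(agent_intent, proposed_action)
        return None

    # 3. Execute the action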
    return perform_action(proposed_action)</code></pre></div><h3>The Bottom Line</h3><p>The shift to agentic AI requires a fundamental rethinking of enterprise architecture. We are moving from systems that process data to systems that make decisions and take actions.</p><p>While lightweight, containerized frameworks like NanoClaw offer significant improvements over monolithic designs, they are only part of the solution. True security in the agentic era requires us to govern the identity and the actions of the software itself. We must build systems that are not just isolated, but verifiable, ensuring that autonomy always operates within clearly defined and strictly enforced boundaries.</p><div><hr></div><h2>I&#8217;m Launching a Course!</h2><p>So many AI projects die. And that&#8217;s not the fault of the tech nerds: They built the demo, and it worked. Still, 90% (yes, really) of all AI models never make it into production. So let&#8217;s dig deep into the big organizational underbellies, and let&#8217;s find out how we can make those numbers a bit better.</p><p>That&#8217;s the challenge I&#8217;ll be tackling in a new course starting April 21 at GenAI Academy, where we walk through how to actually move an agentic AI system from demo to production &#8212; including the organizational architecture required to make it work. This is for technical leaders, senior engineers, product managers, and AI/ML team leads. If you haven&#8217;t joined yet, it&#8217;s not too late to sign up!</p><p>I&#8217;m really excited to be able to bring what I&#8217;ve seen from the inside and outside to you in this format. You&#8217;ll experience me teaching live over 6 weeks! 
You&#8217;ll find all the details here: <a href="https://academy.genai.works/courses/from-demo-to-production/details?utm_campaign=academy_launch&amp;utm_source=instructor&amp;utm_medium=ari_joury&amp;utm_content=from_demo_to_production">From Demo to Production</a>.</p>]]></content:encoded></item><item><title><![CDATA[The Illusion of the Isolated Agent]]></title><description><![CDATA[Why containerizing AI won't save you from the real risks of autonomy.]]></description><link>https://newsletter.wangari.global/p/the-illusion-of-the-isolated-agent</link><guid isPermaLink="false">https://newsletter.wangari.global/p/the-illusion-of-the-isolated-agent</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Thu, 23 Apr 2026 06:01:36 GMT</pubDate><enclosure url="https://api.substack.com/feed/podcast/194498369/936990be8334ebbab129fa40027b7476.mp3" length="0" type="audio/mpeg"/><content:encoded><![CDATA[<p>I remember the exact moment I realized the chatbot era was over. It was a quiet Tuesday afternoon when a colleague showed me a terminal window running a new open-source tool called OpenClaw. They didn&#8217;t type a prompt asking for a summary. They typed: &#8220;Prepare the weekly sales update.&#8221; The system didn&#8217;t just generate text; it executed the task across multiple systems, without a single human click in between.</p><p>For a brief moment, it felt like magic. Then, the reality of enterprise security set in.</p><p>As the hype around autonomous agents like OpenClaw grows, a counter-narrative has emerged: the promise of the &#8220;secure, local agent.&#8221; Tools like NanoClaw are being pitched as the safe alternative for enterprises. Their core value proposition is isolation. By running each agent in its own container&#8212;a secure, OS-level sandbox&#8212;they promise to keep the agent from breaking out and wreaking havoc on your host system.</p><p>It&#8217;s a compelling pitch. 
It&#8217;s also dangerously incomplete.</p><h2>The Container Fallacy</h2><p>The problem with focusing on containerization is that it solves the wrong problem. Yes, putting an agent in a secure box prevents it from directly attacking the server it runs on. But the real risk of an autonomous agent isn&#8217;t that it will escape its box. The real risk is what it does with the permissions you gave it.</p><p>If you give an agent access to your CRM, your email server, and your financial databases so it can &#8220;prepare the weekly sales update,&#8221; it doesn&#8217;t matter how secure its local container is. The agent now holds the keys to your enterprise.</p><p>If that agent is manipulated via a prompt injection attack, or if it simply hallucinates a destructive command, it will execute that command using the legitimate, authorized access you provided. The logs will show that an authorized account performed the action. The container will have done its job perfectly, isolating the agent while the agent systematically dismantles your data integrity.</p><h2>Identity is the New Perimeter</h2><p>We are still trying to apply legacy security concepts to a fundamentally new paradigm. We think of security as a perimeter&#8212;a wall around our applications or a container around our agents. But when software acts with delegated authority across multiple systems, the perimeter dissolves.</p><p>In the era of autonomous AI, identity is the new perimeter.</p><p>The challenge isn&#8217;t keeping the agent in a box; it&#8217;s governing the agent&#8217;s identity. We need to treat every AI agent as a distinct Non-Human Identity (NHI) with its own credentials, its own strictly scoped permissions, and its own audit logs. 
We need systems that can monitor not just what an agent is doing, but why it is doing it, enforcing circuit breakers that require human intervention for high-stakes operations.</p><h2>The Bottom Line</h2><p>Containerizing an AI agent is like putting a bank robber in a vault and handing them the combination. The vault is secure, but the assets are still gone. True enterprise security for autonomous agents requires a fundamental shift from isolating the software to governing its identity and its actions. Until we build architectures that can manage non-human identities at scale, the &#8220;secure local agent&#8221; will remain an illusion.</p><div><hr></div><h2>I&#8217;m Launching a Course!</h2><p>So many AI projects die. And that&#8217;s not the fault of the tech nerds: They built the demo, and it worked. Still, 90% (yes, really) of all AI models never make it into production. So let&#8217;s dig deep into the big organizational underbellies, and let&#8217;s find out how we can make those numbers a bit better.</p><p>That&#8217;s the challenge I&#8217;ll be tackling in a new course starting April 21 at GenAI Academy, where we walk through how to actually move an agentic AI system from demo to production &#8212; including the organizational architecture required to make it work. This is for technical leaders, senior engineers, product managers, and AI/ML team leads.</p><p>I&#8217;m really excited to be able to bring what I&#8217;ve seen from the inside and outside to you in this format. You&#8217;ll experience me teaching live over 6 weeks! You&#8217;ll find all the details here: <a href="https://academy.genai.works/courses/from-demo-to-production/details?utm_campaign=academy_launch&amp;utm_source=instructor&amp;utm_medium=ari_joury&amp;utm_content=from_demo_to_production">From Demo to Production</a>. 
It&#8217;s not too late to sign up &#8212; recordings of previous sessions are available to all participants.</p>]]></content:encoded></item><item><title><![CDATA[The Day the Agents Escaped the Sandbox]]></title><description><![CDATA[Why OpenClaw is forcing enterprises to rethink identity, security, and what it means to automate work.]]></description><link>https://newsletter.wangari.global/p/the-day-the-agents-escaped-the-sandbox</link><guid isPermaLink="false">https://newsletter.wangari.global/p/the-day-the-agents-escaped-the-sandbox</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Tue, 21 Apr 2026 06:02:56 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!K69j!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a33b88-bc50-478e-857f-e4c3e11a920c_1344x768.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!K69j!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a33b88-bc50-478e-857f-e4c3e11a920c_1344x768.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!K69j!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a33b88-bc50-478e-857f-e4c3e11a920c_1344x768.heic 424w, https://substackcdn.com/image/fetch/$s_!K69j!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a33b88-bc50-478e-857f-e4c3e11a920c_1344x768.heic 848w, 
https://substackcdn.com/image/fetch/$s_!K69j!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a33b88-bc50-478e-857f-e4c3e11a920c_1344x768.heic 1272w, https://substackcdn.com/image/fetch/$s_!K69j!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a33b88-bc50-478e-857f-e4c3e11a920c_1344x768.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!K69j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a33b88-bc50-478e-857f-e4c3e11a920c_1344x768.heic" width="1344" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/92a33b88-bc50-478e-857f-e4c3e11a920c_1344x768.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1344,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:115321,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://wangari.substack.com/i/194497374?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a33b88-bc50-478e-857f-e4c3e11a920c_1344x768.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!K69j!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a33b88-bc50-478e-857f-e4c3e11a920c_1344x768.heic 424w, 
https://substackcdn.com/image/fetch/$s_!K69j!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a33b88-bc50-478e-857f-e4c3e11a920c_1344x768.heic 848w, https://substackcdn.com/image/fetch/$s_!K69j!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a33b88-bc50-478e-857f-e4c3e11a920c_1344x768.heic 1272w, https://substackcdn.com/image/fetch/$s_!K69j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92a33b88-bc50-478e-857f-e4c3e11a920c_1344x768.heic 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Agents are moving AI out of the cute chat interface and into the real world. Don&#8217;t let them sneak up behind you. Image generated with Leonardo AI</figcaption></figure></div><p>I remember the exact moment I realized the chatbot era was over. It wasn&#8217;t a grand announcement or a glossy keynote. It was a quiet Tuesday afternoon when a colleague showed me a terminal window running a new open-source tool called OpenClaw. They didn&#8217;t type a prompt asking for a summary or a polite email draft. They typed: &#8220;Prepare the weekly sales update.&#8221;</p><p>What happened next was fundamentally different from anything I had seen before. The system didn&#8217;t just generate text. It broke the objective into steps. It went well beyond the Claude tricks that had already blown my mind. This thing pulled data from an internal CRM, structured the information, validated the outputs against historical records, and drafted an email to stakeholders. It didn&#8217;t just advise; it <em>did the thing</em>. It acted with delegated authority across multiple systems, without a single human click in between.</p><p>For a brief moment, it felt like magic. Then, the reality of enterprise security set in, and the magic quickly turned into a cold sweat.</p><p>If one software agent touches five different systems, does it carry one identity or many? Who approves its access? How is its activity logged and reviewed? And most importantly, what defines acceptable behavior when the agent itself decides the next step?</p><p>We are witnessing a paradigm shift in financial services and enterprise operations. We are moving from AI as a passive assistant to AI as an autonomous agent. 
And as tools like OpenClaw gain traction, they are exposing the fragility of our current enterprise identity models.</p><h2>The Illusion of the Human-Initiated Workflow</h2><p>For decades, enterprise security has been built on a single, foundational assumption: humans initiate actions. Our entire architecture&#8212;from single sign-on (SSO) to role-based access control (RBAC)&#8212;is designed around the idea that a person logs in, requests access to a resource, performs a task, and logs out. Permissions are scoped to the individual&#8217;s role, and audit logs trace actions back to a human intent.</p><p>Autonomous agents break this model entirely.</p><p>OpenClaw and its enterprise equivalents don&#8217;t wait for a human to click a button. They operate continuously, grinding through long, multistep workflows. They inherit permissions, often broadly scoped, and use them to navigate across collaboration tools, internal applications, and external services. They sit between systems, moving data and triggering actions in ways that traditional security tools simply cannot see.</p><p>When an agent acts independently, the concept of &#8220;intent&#8221; becomes incredibly difficult to reconstruct. If an agent hallucinates or is manipulated via a prompt injection attack, it might execute a series of unauthorized actions&#8212;like attempting a crypto transaction or exfiltrating sensitive data&#8212;at machine speed. The logs will show that the actions were performed by an authorized account, but they won&#8217;t explain why.</p><h2>The Engine Room vs. The Front Door</h2><p>The problem isn&#8217;t that we lack security tools; it&#8217;s that our tools are looking in the wrong place.</p><p>Most enterprise security stacks are designed to monitor the &#8220;front door&#8221;&#8212;application configurations, user login events, and permission settings. This made sense when risk lived inside discrete systems. 
But the attack surface has moved.</p><p>The real risk now lies in the &#8220;engine room&#8221;&#8212;the runtime layer where AI agents move sensitive data between systems, where OAuth tokens grant persistent cross-platform access, and where a single compromised integration can cascade silently across an entire supply chain.</p><p>Recent data paints a stark picture: A 2026 survey of 500 U.S. enterprise CISOs revealed that <a href="https://kenhuangus.substack.com/p/the-agentic-ecosystem-security-gap">99.4% of organizations</a> experienced at least one SaaS or AI ecosystem security incident in the previous year. Despite running an average of 13 dedicated security tools, nearly a third of these organizations experienced unauthorized data exfiltration through SaaS-to-AI integrations.</p><p>Our legacy tools are blind to API-to-API data flows and cross-app data movement. They audit which permissions exist, but they cannot see what an agent actually does with those permissions at runtime.</p><h2>The Wake-Up Call for Financial Services</h2><p>For professionals in banking, insurance, and asset management, this shift is particularly acute. We operate in highly regulated environments where strict access controls and human-in-the-loop approvals are not just best practices; they are legal requirements.</p><p>The promise of agentic AI in financial services is immense. Imagine an Account Servicing Agent that instantly handles profile updates and document fulfillment, or a Dispute Resolution Agent that automatically classifies cases and gathers evidence [2]. These tools can drastically reduce manual handling and improve customer service.</p><p>But the risks are equally profound. If an autonomous agent is granted broad access to customer financial data and internal transaction systems, a single vulnerability could lead to catastrophic consequences. 
We cannot simply deploy these agents and hope our existing security posture will hold.</p><h2>The Bottom Line</h2><p>The era of autonomous AI agents is here, and it is not waiting for our security models to catch up. Tools like OpenClaw have made it clear that the value of cross-system automation is too great for enterprises to ignore.</p><p>But we must recognize that agent security is, fundamentally, identity security. We need to move beyond the illusion of the human-initiated workflow and build architectures that can govern non-human identities at scale. We need explicit identity boundaries, configurable controls for agent behavior, and real-time visibility into decision paths.</p><p>The advantage in the coming years will not belong to the organizations that deploy the most agents. It will belong to those that figure out how to deploy them safely.</p><div><hr></div><h2>I&#8217;m Launching a Course!</h2><p>So many AI projects die. And that&#8217;s not the fault of the tech nerds: They built the demo, and it worked. Still, 90% (yes, really) of all AI models never make it into production. So let&#8217;s dig deep into the big organizational underbellies, and let&#8217;s find out how we can make those numbers a bit better.</p><p>That&#8217;s the challenge I&#8217;ll be tackling in a new course starting April 21 <strong>(today!)</strong> at GenAI Academy, where we walk through how to actually move an agentic AI system from demo to production &#8212; including the organizational architecture required to make it work. This is for technical leaders, senior engineers, product managers, and AI/ML team leads. It&#8217;s not too late to sign up &#8212; and your company might have the budget to cover the course expense.</p><p>I&#8217;m really excited to be able to bring what I&#8217;ve seen from the inside and outside to you in this format. You&#8217;ll experience me teaching live over 6 weeks! 
You&#8217;ll find all the details here: <a href="https://academy.genai.works/courses/from-demo-to-production/details?utm_campaign=academy_launch&amp;utm_source=instructor&amp;utm_medium=ari_joury&amp;utm_content=from_demo_to_production">From Demo to Production</a>.</p><div><hr></div><h2>Reads of the Week</h2><ul><li><p><a href="https://kenhuangus.substack.com/p/the-agentic-ecosystem-security-gap">The Agentic Ecosystem Security Gap</a>: In this deep dive for Agentic AI, <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Ken Huang&quot;,&quot;id&quot;:1160339,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3d670301-204b-472e-a2ee-bbb1b7633a99_2026x2026.png&quot;,&quot;uuid&quot;:&quot;30f6f453-5675-4547-ac47-d9005ae765b0&quot;}" data-component-name="MentionToDOM"></span> breaks down a startling report revealing that 99.4% of surveyed enterprises experienced a SaaS or AI security incident last year. He argues that current security tools are blind to the &#8220;engine room&#8221; where AI agents operate across systems, a critical blind spot for financial institutions relying on legacy identity models. 
If you want to understand why your current security stack won&#8217;t protect you from autonomous agents, read this.</p></li><li><p>In this piece for Cashless: Fintech, CBDC and AI at the speed of Asia, <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Rich Turrin&quot;,&quot;id&quot;:2297487,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/eedf2263-0323-4bce-897d-7a17e22c436f_918x918.png&quot;,&quot;uuid&quot;:&quot;b320fdea-0050-4c47-9ab7-35fb17c6eb55&quot;}" data-component-name="MentionToDOM"></span> explores the harsh reality of AI agent deployment in the banking sector, arguing that executives will bypass assistive AI in favor of autonomous agents to cut costs. He connects the theoretical capabilities of agents to concrete banking roles, from customer consultation to dispute resolution. <a href="https://richturrin.substack.com/p/your-banking-job-and-ai-agents-human">Your Banking Job and AI Agents</a> is a sobering look at the immediate impact of autonomy on the financial workforce.</p></li><li><p>A structural transformation is necessary to secure AI-native operations, argues <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Ben Lorica &#32599;&#29790;&#21345;&quot;,&quot;id&quot;:969577,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/e5db607c-0a85-4d98-8f35-7aada811bc0c_253x115.jpeg&quot;,&quot;uuid&quot;:&quot;348967b3-dc5a-492e-ba05-1ed6ce160402&quot;}" data-component-name="MentionToDOM"></span> in <a href="https://gradientflow.substack.com/p/security-for-ai-native-companies">The 6 security shifts AI teams can&#8217;t ignore in 2026</a>. 
He explains how the shift to agentic systems creates vulnerabilities like &#8220;goal hijacking&#8221; and demands a Zero Trust strategy that treats every agent as a distinct identity. This is essential reading for anyone tasked with integrating AI agents into enterprise access management frameworks (including myself).</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Nerds Are Losing Their Last Refuge]]></title><description><![CDATA[Computers are becoming more human &#8212; and work is becoming less logical as a result.]]></description><link>https://newsletter.wangari.global/p/nerds-are-losing-their-last-refuge</link><guid isPermaLink="false">https://newsletter.wangari.global/p/nerds-are-losing-their-last-refuge</guid><dc:creator><![CDATA[Ari Joury]]></dc:creator><pubDate>Fri, 17 Apr 2026 06:01:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!OI3R!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c7f964d-908a-4c81-ac13-2df1836c3282_1344x768.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OI3R!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c7f964d-908a-4c81-ac13-2df1836c3282_1344x768.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OI3R!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c7f964d-908a-4c81-ac13-2df1836c3282_1344x768.heic 424w, https://substackcdn.com/image/fetch/$s_!OI3R!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c7f964d-908a-4c81-ac13-2df1836c3282_1344x768.heic 848w, 
https://substackcdn.com/image/fetch/$s_!OI3R!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c7f964d-908a-4c81-ac13-2df1836c3282_1344x768.heic 1272w, https://substackcdn.com/image/fetch/$s_!OI3R!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c7f964d-908a-4c81-ac13-2df1836c3282_1344x768.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OI3R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c7f964d-908a-4c81-ac13-2df1836c3282_1344x768.heic" width="1344" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2c7f964d-908a-4c81-ac13-2df1836c3282_1344x768.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1344,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:281429,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://wangari.substack.com/i/193805381?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c7f964d-908a-4c81-ac13-2df1836c3282_1344x768.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OI3R!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c7f964d-908a-4c81-ac13-2df1836c3282_1344x768.heic 424w, 
https://substackcdn.com/image/fetch/$s_!OI3R!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c7f964d-908a-4c81-ac13-2df1836c3282_1344x768.heic 848w, https://substackcdn.com/image/fetch/$s_!OI3R!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c7f964d-908a-4c81-ac13-2df1836c3282_1344x768.heic 1272w, https://substackcdn.com/image/fetch/$s_!OI3R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c7f964d-908a-4c81-ac13-2df1836c3282_1344x768.heic 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Tech work was once a safe haven for people who have difficulties relating to complicated human beings. Image generated with Leonardo AI</figcaption></figure></div><p>For decades, programming, physics, math, and engineering allowed people to live mostly in logical space. If you were analytical, introverted, neurodivergent, or simply uncomfortable with the messy dynamics of human interaction, the computer became a stable partner. It was a refuge.</p><p>I know this from personal experience. My path through particle physics and then into AI and data science was, in part, a path toward a world that made sense. A world where the rules were clear, the feedback was objective, and the right answer was always, in principle, discoverable. The computer did not have bad days. It did not misread your tone. It did not hold grudges.</p><p>But that world is disappearing. And the shift is more profound than most technical professionals have yet fully reckoned with.</p><h2>The Nerd Refuge: Why It Existed</h2><p>The appeal of technical fields to analytical and introverted people was not accidental. It was structural. Old computers were deterministic. You wrote a function, and it executed in exactly the same way every time. If it failed, the failure had a cause, and that cause was traceable. The feedback loop was immediate, objective, and, crucially, free of social judgment.</p><p>This attracted people who were uncomfortable with ambiguity. People who found the social dynamics of human interaction exhausting or unpredictable. People who wanted to be evaluated on the quality of their reasoning, not on their ability to navigate office politics or read a room.</p><p>The result was a culture. Engineering departments, physics labs, and quantitative finance desks became places where a certain kind of person could thrive. 
The brilliant but socially awkward developer. The quant who hates meetings. The engineer who only wants Jira tickets. These archetypes were not just personality quirks; they were adaptations to an environment that rewarded a specific kind of intelligence.</p><h2>AI Changes the Nature of Computers</h2><p>New computers are probabilistic. They are contextual. They are conversational. We now interact with machines much like we interact with people. When you prompt a large language model, you are not executing a command; you are guiding a conversation. The output is not guaranteed to be identical every time. It depends on the context, the phrasing, and the underlying probability distributions of the model&#8217;s training.</p><p>This shift is not merely technical. It is epistemological. The old model of computation was based on the idea that a machine could be fully specified. You could, in principle, trace every output back to every input. The new model is based on the idea that a machine learns patterns from data and generates responses that are statistically likely, not logically certain.</p><p>This has profound implications for how we build and evaluate AI systems. You cannot simply read the code to understand why a model behaves the way it does. You have to observe it, test it, and interpret its outputs in context. You have to develop intuitions about its failure modes and edge cases. You have to think probabilistically, not deterministically.</p><h2>The Irony of Human Complexity in Technical Work</h2><p>The irony is that the more human computers become, the more technical work involves judgment, ambiguity, and interpretation. In other words, it involves human complexity.</p><p>Consider the process of building an AI agent. You are no longer just writing code to perform a specific task. You are designing a system that must interpret intent, handle edge cases gracefully, and make decisions based on incomplete information. 
You must think about how the system will behave when a user asks it something unexpected. You must anticipate the ways in which the system&#8217;s outputs might be misinterpreted or misused.</p><p>This requires a level of empathy and systemic understanding that was previously the domain of product managers and designers. The technical professional must now bridge the gap between the deterministic world of traditional software and the probabilistic world of AI. They must understand not just how to build the system, but how the system will behave in the wild, interacting with unpredictable human users in unpredictable contexts.</p><p>The bottleneck in technical work has shifted. It is no longer about writing the code. It is about problem definition, system design, and evaluation. It is about the human coordination required to turn a working demo into a reliable system inside an organization.</p><h2>Robotics Won&#8217;t Save Us</h2><p>You might think that robotics offers a remaining refuge of purely mechanical engineering. The physical world, at least, is deterministic. A robot arm that picks up a component either succeeds or fails. The physics is clear.</p><p>But even robotics is becoming AI-driven, software-mediated, and model-dependent. The physical world is being abstracted into data, and the machines that navigate it are increasingly relying on the same probabilistic models that power conversational AI. Modern robotic systems use deep learning for perception, reinforcement learning for control, and large language models for task planning. The boundary between the physical and the digital is blurring, and the skills required to navigate both are converging.</p><p>The refuge of purely mechanical engineering is shrinking. 
Even in the most hardware-adjacent domains, the work is increasingly about designing systems that learn, adapt, and make decisions under uncertainty.</p><h2>What This Means for Nerd Culture</h2><p>This shift presents three possible futures for nerd culture and the technical professions.</p><p>The first is retreat. Some technical professionals will seek out the remaining pockets of purely deterministic work. Low-level systems programming, theoretical mathematics, formal verification&#8212;these are areas where the old rules still apply. This is a legitimate path, but it is a narrowing one. The frontier of technical work is moving rapidly away from pure determinism.</p><p>The second is resistance. Some will cling to the old ways of working, arguing that AI is a fad or that it cannot replace the rigor of traditional engineering. This is understandable, but it is ultimately a losing position. The tools are changing, and the organizations that do not adapt will be left behind.</p><p>The third is evolution. Some will embrace the ambiguity and complexity of the new landscape. They will learn to design systems that integrate human and machine intelligence, leveraging the strengths of both. They will develop new skills&#8212;communication, empathy, strategic thinking&#8212;not because they have abandoned their technical identity, but because they have expanded it.</p><p>This third group will dominate the future of technical work.</p><h2>The Evolution of the Technical Professional</h2><p>Evolving into a systems thinker requires a fundamental shift in mindset. It means moving away from a focus on individual components and toward a holistic understanding of the entire system. It means recognizing that the technical architecture is inextricably linked to the organizational architecture.</p><p>This is not an easy transition. It requires developing new skills, such as communication, empathy, and strategic thinking. 
It requires learning to navigate the messy, ambiguous world of human interaction that many technical professionals initially sought to avoid. It requires tolerating uncertainty and making decisions with incomplete information.</p><p>But it is a necessary transition. And it is worth noting that many of the skills that technical professionals have developed&#8212;rigorous thinking, attention to detail, the ability to decompose complex problems&#8212;are highly transferable to this new landscape. The challenge is not to abandon these skills, but to apply them in a broader context.</p><p>The organizations that succeed in the AI era will be the ones that can effectively integrate human and machine intelligence. And that requires technical professionals who can bridge the gap between the two. Not everyone has to become a communicator. But the interface between humans and machines must be owned by someone who understands both sides.</p><h2>The Bottom Line</h2><p>For decades, nerds escaped into machines because machines were simpler than humans. Now the machines are learning to talk back. The refuge of pure logic is disappearing, replaced by a new landscape of probabilistic complexity.</p><p>The challenge for technical professionals is not to resist this change, but to embrace it. The skills that made you valuable in the old world&#8212;rigorous thinking, deep focus, the ability to decompose complex problems&#8212;are still valuable. But they need to be applied in a broader context, one that includes the messy, ambiguous reality of human organizations and probabilistic AI systems.</p><p>The best technical professionals of the next decade will be those who can design systems, think clearly, and bridge the gap between humans and machines. Not because they have abandoned their technical identity, but because they have expanded it to meet the demands of a new era.</p><div><hr></div><h2>I&#8217;m Launching a Course!</h2><p>So many AI projects die. 
And that&#8217;s not the fault of the tech nerds: They built the demo, and it worked. Still, 90% (yes, really) of all AI models never make it into production. So let&#8217;s dig into the organizational underbelly and find out how we can improve those numbers.</p><p>That&#8217;s the challenge I&#8217;ll be tackling in a new course starting April 21 at GenAI Academy, where we walk through how to actually move an agentic AI system from demo to production, including the organizational architecture required to make it work. This is for technical leaders, senior engineers, product managers, and AI/ML team leads.</p><p>I&#8217;m really excited to be able to bring what I&#8217;ve seen from the inside and outside to you in this format. I&#8217;ll be teaching live over 6 weeks! You&#8217;ll find all the details here: <a href="https://academy.genai.works/courses/from-demo-to-production/details?utm_campaign=academy_launch&amp;utm_source=instructor&amp;utm_medium=ari_joury&amp;utm_content=from_demo_to_production">From Demo to Production</a>.</p>]]></content:encoded></item></channel></rss>