According to Andrej Karpathy, reaching 90% reliability in an AI system is only the beginning: each additional nine of reliability takes comparable engineering effort. This idea is known as the “March of Nines,” and it explains why highly reliable AI-powered systems are so hard to build. In Karpathy’s words, “every single nine is the same amount of work.”

The “March of Nines” is particularly relevant in enterprise settings, where the distance between “usually works” and “operates like dependable software” determines adoption. A typical enterprise workflow chains multiple steps: intent parsing, context retrieval, planning, tool calls, validation, formatting, and audit logging. If a workflow has n steps and each step succeeds independently with probability p, end-to-end success is approximately p^n. Small per-step failure rates therefore compound and sharply reduce overall reliability. For example, a 10-step workflow with 90% per-step success completes end to end only about 34.87% of the time (0.9^10 ≈ 0.3487). At a volume of 10 workflow runs per day, that works out to ~6.5 interruptions per day.
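The compounding arithmetic can be checked with a short Python sketch. The function names are illustrative, the steps are assumed to fail independently, and the figure of 10 runs per day is an assumption chosen because it reproduces the ~6.5 interruptions quoted above.

```python
def end_to_end_success(p: float, n: int) -> float:
    """Probability that all n steps succeed, assuming independent steps."""
    return p ** n


def expected_interruptions(p: float, n: int, runs_per_day: int) -> float:
    """Expected number of failed workflow runs per day."""
    return runs_per_day * (1.0 - end_to_end_success(p, n))


def required_per_step(target: float, n: int) -> float:
    """Per-step success rate needed to hit an end-to-end target over n steps."""
    return target ** (1.0 / n)


if __name__ == "__main__":
    steps = 10
    runs_per_day = 10  # assumed volume, not a figure from any real system
    for p in (0.90, 0.99, 0.999, 0.9999):
        success = end_to_end_success(p, steps)
        failures = expected_interruptions(p, steps, runs_per_day)
        print(f"per-step {p:.2%} -> end-to-end {success:.2%}, "
              f"about {failures:.2f} interruptions/day")
    needed = required_per_step(0.99, steps)
    print(f"99% end-to-end over {steps} steps needs about {needed:.4%} per step")
```

Read in the other direction, the same formula shows why the later nines matter: a 99% end-to-end target over 10 steps needs roughly 99.9% success at every step.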

To achieve higher reliability, teams must define measurable Service Level Objectives (SLOs) and invest in controls that reduce variance. In practice, this means setting an SLO target per workflow tier and managing an error budget so that experiments stay controlled. Karpathy advises teams to “spend a bit more time to be more concrete in your prompts” and to turn reliability into measurable objectives. Nine levers help add nines: constraining autonomy, enforcing contracts, layering validators, routing by risk, engineering tool calls, making retrieval predictable, building a production evaluation pipeline, investing in observability, and shipping an autonomy slider with deterministic fallbacks.
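As a rough illustration of how an SLO target becomes an error budget, the sketch below counts how many failed runs a workflow can absorb in a measurement window before experiments should pause. The `ErrorBudget` class, its fields, and the numbers are hypothetical and are not drawn from any particular SLO tool.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    """Hypothetical error budget for one workflow tier over one window."""
    slo_target: float   # e.g. 0.99 means 99% of runs must complete
    total_runs: int     # runs observed in the window
    failed_runs: int    # runs that did not complete

    @property
    def allowed_failures(self) -> float:
        # The budget is the share of runs the SLO permits to fail.
        return (1.0 - self.slo_target) * self.total_runs

    @property
    def remaining(self) -> float:
        return self.allowed_failures - self.failed_runs

    def experiments_allowed(self) -> bool:
        # Risky changes proceed only while some budget is left.
        return self.remaining > 0


if __name__ == "__main__":
    budget = ErrorBudget(slo_target=0.99, total_runs=1000, failed_runs=4)
    print(f"allowed failures: {budget.allowed_failures:.1f}")
    print(f"remaining budget: {budget.remaining:.1f}")
    print(f"experiments allowed: {budget.experiments_allowed()}")
```

A stricter tier would use a higher `slo_target`, which shrinks the budget and leaves less room for experiments on the same traffic.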

Experts like Nikhil Mungel, who has been building distributed systems and AI teams at SaaS companies for over 15 years, emphasize that high reliability comes from disciplined engineering. A closing checklist gives teams a starting point for the later nines: pick a top workflow, define its completion SLO, and instrument terminal status codes. Beyond that, the later nines take sustained investment in reliability work, including bounded workflows, strict interfaces, resilient dependencies, and fast operational learning loops. McKinsey’s 2025 global survey reports that 51% of organizations using AI experienced at least one negative consequence, and nearly one-third reported consequences tied to AI inaccuracy, which underscores the need for stronger measurement, guardrails, and operational controls.
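One way to instrument terminal status codes is to make every run end in exactly one named status, so that a completion rate can be measured against the SLO. The sketch below is a minimal illustration that combines a single validator with a deterministic fallback. The status names and the `run_workflow` helper are assumptions made for this example, not an established scheme.

```python
from collections import Counter
from enum import Enum
from typing import Callable, Tuple


class TerminalStatus(Enum):
    """Hypothetical terminal statuses; each run ends in exactly one."""
    COMPLETED = "completed"
    FALLBACK_AFTER_INVALID = "fallback_after_invalid"
    FALLBACK_AFTER_ERROR = "fallback_after_error"


def run_workflow(step: Callable[[], str],
                 validate: Callable[[str], bool],
                 fallback: Callable[[], str],
                 counts: Counter) -> Tuple[TerminalStatus, str]:
    """Run a step, validate its output, and fall back deterministically."""
    try:
        output = step()
    except Exception:
        # The step itself failed, for example a tool call raised.
        status, output = TerminalStatus.FALLBACK_AFTER_ERROR, fallback()
    else:
        if validate(output):
            status = TerminalStatus.COMPLETED
        else:
            # The output broke the contract, so use the fallback instead.
            status, output = TerminalStatus.FALLBACK_AFTER_INVALID, fallback()
    counts[status] += 1
    return status, output


def completion_rate(counts: Counter) -> float:
    """Share of runs that completed without needing the fallback."""
    total = sum(counts.values())
    return counts[TerminalStatus.COMPLETED] / total if total else 0.0
```

Counting fallbacks separately from completions keeps the measured completion rate honest: a run rescued by the fallback still delivers an answer, but it does not count toward the SLO.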

Reliability determines whether AI systems can be trusted with real enterprise work. By focusing on the “March of Nines” and investing in the nine key levers, enterprises can reduce business risk and move toward dependable, enterprise-grade software. As OpenAI, Nvidia, and other industry leaders continue to push the boundaries of AI innovation, the importance of reliability will only grow. The path to the later nines is complex, but with sustained investment and disciplined engineering, enterprises can get there and realize benefits ranging from improved efficiency to better customer experiences.
