In early March 2026, a report sent shockwaves through the IT security industry: an autonomous AI agent built by the startup CodeWall had, according to the company's own account, exploited a critical vulnerability in McKinsey's internal AI platform “Lilli” within just two hours. The platform, used by more than 43,000 McKinsey consultants worldwide, was reportedly accessible via unauthenticated API endpoints and a classic SQL injection flaw. The incident raises questions – not only about the security of enterprise AI platforms, but also about the credibility of the parties involved, the regulatory consequences and the true extent of the breach.

At a Glance

What: SQL injection attack on McKinsey's internal AI platform “Lilli” by an autonomous AI agent

Who: The startup CodeWall (provider of AI-based red teaming) as attacker; McKinsey as operator

When: 1 March 2026 (reported) – 2 March (patched) – 9 March (published by CodeWall)

Method: SQL injection via unauthenticated API endpoints, JSON key processing as entry point

Claimed exposure: 46.5 million chat messages, 728,000 files, 57,000 user accounts (not independently verified)

Status: Vulnerability confirmed and patched; actual data compromise disputed

What Happened? The Chronology of the Incident

McKinsey's AI platform Lilli was introduced in 2023 and serves as an internal knowledge management tool. Consultants use Lilli to access the firm's collective knowledge – from previous project reports and industry analyses to internal methodologies. The platform processes natural language queries and draws on an extensive knowledge base comprising documents, chat histories and structured data.

On 1 March 2026, the cybersecurity startup CodeWall deployed an autonomous AI agent against Lilli's publicly reachable interfaces. The agent first identified the platform's publicly accessible API documentation. Of the 22 API endpoints in total, 20 were reportedly accessible without any authentication – a fundamental architectural flaw that enabled an attacker to systematically explore the entire attack surface.

The decisive breakthrough came via a SQL injection vulnerability in the JSON key processing of one of the open endpoints. Key names from JSON requests were concatenated directly into SQL queries – without any validation or parameterisation. Through this vector, the agent reportedly gained full read and write access to the database, including the ability to manipulate the AI's system prompts.

Here a first inconsistency arises: CodeWall refers to the “production database”, whereas McKinsey took the “development environment” offline after the patch. Whether the attack targeted the production or a development environment is a significant difference in terms of impact – and a question that has not been conclusively resolved.

McKinsey responded within 24 hours: the vulnerability was patched on 2 March. A nine-day forensic investigation followed – typical incident response analyses for comparable breaches usually take several weeks to months. On 9 March, CodeWall published the details of the attack without observing the responsible disclosure period customary in the security community.

Chronology of the Lilli Hack
From platform launch to public controversy
2023
Lilli launched at McKinsey
Internal AI platform for knowledge management goes live. Over 43,000 consultants use the system.
1 March 2026
CodeWall reports vulnerability
Autonomous AI agent gains access in two hours via SQL injection through unauthenticated API endpoints.
2 March 2026
McKinsey patches and takes dev environment offline
Response within 24 hours. Forensic investigation initiated.
9 March 2026
Publication by CodeWall
The startup publishes attack details. McKinsey disputes the claimed extent of data compromise.

What Is Confirmed – and What Is Not?

When assessing the Lilli hack, a differentiated examination of the source landscape is essential. Both parties – CodeWall and McKinsey – have substantial commercial interests that influence their account.

Claim Status Assessment
SQL injection vulnerability in Lilli Confirmed Evidenced by McKinsey's own statement and the immediate patch
20 unauthenticated API endpoints Plausible Technically consistent with the attack pattern, not independently verified
46.5m messages / 728,000 files exposed Not evidenced No PoC payloads, screenshots or hashes provided
Write access to system prompts Claimed No independent evidence; would be security-critical as it enables manipulation of AI responses
Client data actually compromised Disputed McKinsey denies; forensics reportedly found no evidence of data compromise

Two Parties, Two Interests

CodeWall is a startup selling AI-based red teaming services. A successful hack against the most recognised consulting name in the world delivers enormous free publicity and validates its own business model. The incentive to present the most dramatic account possible is obvious. It is notable that the most spectacular figures – 46.5 million messages, 728,000 files, 57,000 user accounts – originate exclusively from CodeWall and are not supported by screenshots, hashes or proof-of-concept payloads.

McKinsey, in turn, has an equally strong counter-interest. Trust is a consulting firm's core product. If clients must fear that their strategic conversations with McKinsey consultants end up in a compromised database, the business foundation is at stake. The interest in downplaying the incident is obvious. Moreover, the nine-day forensic window between patch and publication is comparatively short for a thorough analysis. According to a report by the Financial Times, McKinsey emphasised that “files were stored separately and were never at risk”.

The vulnerability was real and embarrassing. The precise extent lies somewhere between CodeWall's dramatic figures and McKinsey's reassurances – both sides have commercial reasons for their version.

The Client Perspective: The Blind Spot

In the public debate between CodeWall and McKinsey, a third party is overlooked: McKinsey's clients. When consultants enter queries into Lilli such as “Should our client close location X?” or feed confidential strategy documents into the knowledge base, a compromise affects not only McKinsey's own data. The question of what confidential corporate information was shared in Lilli chats and whether those clients can be confident that their data was not exposed has yet to receive a satisfactory answer.

Regulatory Consequences

If personal data was indeed affected – and with 57,000 user accounts containing chat histories, this can hardly be ruled out – the GDPR notification obligation under Art. 33 applies: the data controller must notify the competent supervisory authority within 72 hours of a personal data breach. Whether McKinsey filed such a notification is not known. The assertion that there was “no data compromise” may also be attributable to the fact that confirmation would automatically trigger reporting obligations and potential fines.

Additionally, since January 2025 the Digital Operational Resilience Act (DORA) has been applicable. McKinsey serves as an ICT third-party provider for numerous financial institutions. Under DORA, many of the measures described below – penetration testing, incident response, monitoring – are no longer recommendations for such providers but regulatory obligations.

Why Was the Attack Possible?

Regardless of how many records were actually exposed, the fact that the attack was possible at all merits closer analysis. The most uncomfortable insight: it was not the AI that was hacked, but the classic infrastructure underneath it. That a platform of this profile failed at precisely this point suggests systematic deficiencies.

SQL Injection – the Attack Vector in Detail

To understand why the Lilli hack was so straightforward, it is worth examining how SQL injection works. Web applications communicate with databases via SQL queries – structured commands that retrieve, insert or modify data. The problem arises when user inputs are incorporated directly into these commands without prior validation.

A simplified example: when a consultant searches Lilli for the topic “Automotive”, the application internally generates a database query such as SELECT * FROM documents WHERE topic = 'Automotive'. This works as expected – the database returns all documents on the topic of automotive.

In a SQL injection, the attacker enters a manipulated string instead of a normal search term, which the database interprets as a command. Instead of “Automotive”, an attacker might enter ' OR 1=1 --. The database interprets this as: “Give me all documents where the topic is empty or 1 equals 1” – and since 1 always equals 1, the database returns every single record. The two dashes at the end comment out the remainder of the original query, so no error message is produced.

SQL INJECTION – HOW THE ATTACK WORKS NORMAL QUERY USER INPUT Automotive GENERATED SQL QUERY SELECT * FROM docs WHERE topic = ' Automotive ' Result: Only Automotive documents are returned. SQL INJECTION – MANIPULATED INPUT ATTACKER INPUT ' OR 1=1 -- GENERATED SQL QUERY (MANIPULATED) SELECT * FROM docs WHERE topic = ' ' OR 1=1 -- ' The database reads: WHERE topic = '' OR 1=1 -- (rest ignored) Result: 1=1 is ALWAYS true. The database returns ALL records – without any restriction. DEFENCE: PARAMETERISED QUERY SAME ATTACKER INPUT ' OR 1=1 -- SQL TEMPLATE (QUERY + PARAMETER SEPARATED) SELECT * FROM docs WHERE topic = ? (Parameter: "' OR 1=1 --") The database treats the entire input as harmless text – not as a command. No data leak. Principle: Inputs and commands are strictly separated. The database can no longer execute user input as code.

In the case of Lilli, the vulnerability was particularly insidious: the entry point was not the actual search inputs of users but the key names in JSON requests – that is, the technical field designations in the communication between browser and server. Instead of a normal JSON field such as {"topic": "Automotive"}, the attacker could manipulate the key itself: {"topic' OR 1=1--": "irrelevant"}. Because the developers may have validated the values but not the key names, this manipulated text was incorporated directly into the database query.

The defence against this is neither new nor exotic – it is available in every modern programming language: parameterised queries. With this approach, the SQL query and user inputs are processed strictly separately. The database first receives the command as a fixed template with placeholders, then the input values separately. Because the database treats the inputs only as data – never as executable code – SQL injection is structurally impossible. That a platform with the maturity and user base of Lilli went into production without this protection points to missing security code reviews, insufficient quality assurance or a culture in which speed was systematically prioritised over security.

Open API Endpoints

According to CodeWall, 20 of 22 API endpoints were accessible without authentication. Publicly accessible API documentation further provided the attacker with a detailed map of the attack surface. The absence of a central API gateway with uniform authentication enforcement points to an architecture in which security was not implemented as a cross-cutting concern but – if at all – left to each individual service.

Missing Monitoring

CodeWall's AI agent operated undetected on the systems for two hours. No anomaly detection was triggered, no alert was raised for the unknown API accesses, no rate-limiting mechanisms slowed the systematic enumeration. For a platform processing sensitive corporate data of one of the world's largest consulting firms, this is a grave omission.

Assessment

All three root causes are avoidable basics – none of them requires innovative security technology. In the Lilli case, a single one of these measures (parameterised queries or API authentication) would have completely prevented the attack. That all three were missing simultaneously points to a systemic problem.

What Enterprises Should Learn from This

The Lilli hack is not an isolated incident. It is emblematic of a pattern observable in many organisations: AI platforms are developed and rolled out at high speed, whilst the security of the underlying infrastructure fails to keep pace. The recommendations fall into two categories: fundamental infrastructure measures and AI-specific security precautions.

Infrastructure: Getting the Basics Right

1. Enforce secure database access

Parameterised queries must be used without exception – user inputs must never be concatenated into SQL statements. Input validation must occur at every system boundary, including for internal AI agents. The OWASP Top 10 should be treated as the mandatory minimum standard for every application. Security-focused code reviews before each deployment are not optional but obligatory.

2. Treat API security as an architectural principle

No API endpoint should be reachable without authentication – neither in production nor in staging or development environments. A central API gateway should enforce authentication uniformly, rather than delegating this responsibility to individual services. API documentation such as Swagger or OpenAPI must not be publicly exposed. Rate limiting must block systematic enumeration and brute-force attacks.

3. Implement zero-trust architecture

Every request must be verified – regardless of its network origin. The least-privilege principle must be enforced rigorously: each request receives access only to the minimally required data. Data domains such as chat histories, documents, system prompts and user management must be segmented. In particular, no write access to system prompts should be possible via user interfaces.

4. Establish monitoring and incident response

Anomaly detection must immediately flag mass queries. Real-time alerting on unusual access patterns – such as unknown API clients or atypical query frequencies – is indispensable. Audit logging for all database accesses provides the foundation for forensic investigations. Incident response plans must be prepared specifically for AI platforms, as these bring their own attack vectors.

AI-Specific Security Measures

Beyond fundamental infrastructure hardening, GenAI platforms require additional protective measures that address the particular risks of this technology.

5. Treat system prompts as sensitive configuration

System prompts define the behaviour of an AI platform. Whoever can manipulate them controls the system's responses – with potentially devastating consequences. System prompts must therefore be versioned and audited, with a defined change process. Write access should only be possible through a separate, secured admin interface. Regular integrity checks of active prompts are essential.

6. Add authorisation checks to RAG retrievals

Retrieval-Augmented Generation (RAG) accesses knowledge databases. In doing so, the AI must respect the permissions of the requesting user – a consultant must not be able to access documents via the AI for which they lack authorisation. Output filtering against cross-user data leakage and consistent data classification within the knowledge base are further necessary measures.

7. Establish offensive security testing for AI platforms

Regular penetration tests must explicitly include AI platforms. Red teaming with automated AI agents – exactly the method CodeWall employed – should be part of your own security programme. Bug bounty programmes with a clearly defined scope create additional security layers. Crucially: penetration tests must take place before go-live, not only during live operations.

8. Implement prompt injection protection

Strict separation of system and user prompts is the first line of defence against prompt injection. Input sanitisation for all user queries, monitoring for known prompt injection patterns and regular testing with known attack vectors complement the protection. These measures are obligatory for any GenAI platform in an enterprise environment.

An AI platform is only ever as secure as the infrastructure it runs on. GenAI-specific measures complement – but never replace – classic IT security.

Priority Matrix: What Would Have Prevented the Lilli Hack?

Not all measures have the same leverage. The following matrix ranks the most important security measures by their effectiveness in this specific case and the required implementation effort:

# Measure Would have prevented the hack? Effort
1 Enforce parameterised queries Yes – directly Low
2 Authenticate all API endpoints Yes – directly Low
3 Do not publicly expose API documentation Reconnaissance hindered Low
4 Anomaly detection at API/DB level Early detection Medium
5 Zero trust / least privilege Blast radius limited Medium
6 Regular penetration tests incl. AI platforms Found beforehand Medium
7 Security review before deployment Found beforehand Low

The message is clear: the most effective measures are simultaneously the simplest and most cost-effective. Enterprises operating AI platforms should audit their own systems using exactly this priority logic – secure the basics first, then address the more complex measures.

Conclusion: AI Security Starts with the Basics

The McKinsey Lilli hack is a case study – though less about the dangers of AI than about the consequences of neglecting basic IT security. The vulnerability that CodeWall's agent exploited was not a novel AI attack, not a zero-day exploit, not a supply chain attack. It was SQL injection – an attack vector that OWASP has documented since its inception.

This should give pause to enterprises that operate or plan their own GenAI platforms. The temptation is great to focus on the supposedly innovative risks of AI – prompt injection, hallucinations, model poisoning – whilst neglecting the security requirements of the underlying infrastructure. The Lilli hack demonstrates: the greatest danger lurks not in the AI itself but in the fundamentals that are taken for granted and therefore no longer tested.

At the same time, the case counsels caution when assessing spectacular security incidents. The core question is not whether 46.5 million or “only” a few million messages were exposed – the core question is why a platform of this stature could be vulnerable at all. And this question must be asked not only by McKinsey but by every organisation that entrusts its most confidential data to AI platforms operated by third parties.

The Lilli hack could mark a turning point. If even McKinsey – a firm that advises other organisations on security strategies – fails at the basics, organisations must question their assumptions about the security competence of their service providers. Demand penetration test reports and SOC 2 certifications. Ask how your partners' AI platforms are secured. And first examine the obvious in your own systems: Are all API endpoints authenticated? Do all database accesses use parameterised queries? Is there monitoring that detects unusual access patterns? Anyone unable to answer one of these questions immediately has found their field of action.

newsletter
the agentic banker

Keep reading – in your inbox every two weeks.

Capital markets insights, regulatory updates and AI trends. Concise, well-founded, free of charge.

GDPR-compliant. Unsubscribe at any time.

← Back to overview