Knowledge
Blog
November 6, 2025
5
mn
read
David Berenstein

Are AI browsers safe? A security and vulnerability analysis of OpenAI Atlas

OpenAI's Atlas browser is powered by ChatGPT, but its design choices expose unknowing users to numerous risks. They were drawn in by the wonderful marketing promise of fast, helpful, and reliable AI, while articles about exploitation continue to flood the news, just days after the beta release.
OpenAI Atlas browser security risks | LLM vulnerability analysis

Are AI browsers safe? A security and vulnerability analysis of OpenAI Atlas

OpenAI's new Atlas browser promises an AI assistant that follows you across every website, remembers your preferences, and handles tedious work autonomously. However, this comes with security risks that organisations and individuals should consider first.

What’s the main problem with the Atlas browser? It became an attack surface

OpenAI's Atlas browser is built on top of Chromium, like almost 70% of the browsers on the internet. However, OpenAI decided to integrate ChatGPT at the architectural level, rather than as a browser extension or isolated component. This design choice provides the AI with direct access to the browser's full context, encompassing every open tab, form field, and authenticated session across all domains simultaneously. Yes! You give ChatGPT access to everything, everywhere, all at once, which makes for a great movie plot but results in terrible security and privacy practices.

And it gets even better. You can decide to enable the “agent mode” to make it even “faster and more useful”, which gives the Atlas the ability to interact with web pages programmatically by clicking elements, submitting forms, and navigating between sites with your credentials.

From a threat modelling perspective, this architecture significantly expands the attack surface. While traditional browser exploits target isolated vulnerabilities, such as Cross-Site Scripting (XSS) or Remote Code Execution (RCE), Atlas presents threat actors and adversaries with a trusted component that already possesses cross-domain visibility and the capability to perform authenticated actions on behalf of users. The AI agent operates with unlimited permissions, making it an attractive target for prompt injection and other LLM-specific attack vectors. A single well-placed hack or unlucky mistake will compromise all of this, leaving you responsible for picking up the pieces.

Atlas browser architecture and LLM security risks

Traditional browsers enforce the Same-Origin Policy (SOP) and use process-level sandboxing to isolate websites. This ensures that JavaScript from domain A cannot access the DOM, cookies, or local storage of domain B. These boundaries are the key to web application security and have been the market standard for a reason. Chromium even

Atlas deviates from this model by design. The integrated AI agent operates with cross-origin visibility. It's not untrusted code attempting to bypass isolation mechanisms, but rather a privileged component with legitimate access to all browsing context. Even though it is marketed as “fast and useful”, it is a security nightmare and the wet dream of every hacker, as it introduces many OWASP AI risks:

  • OWASP LLM 01 - Persistent memory vulnerable to injection attacks The browser memory feature creates a comprehensive behavioural profile by logging browsing patterns, form inputs, and interaction data. Unlike distributed browser history or isolated session storage, this centralised repository represents a high-value target. The Attack (Injection): Attackers use techniques like Cross-Site Request Forgery (CSRF) to inject malicious instructions directly into a user's ChatGPT memory without their knowledge. This injection successfully alters the model's internal state or context. When the user later queries ChatGPT for legitimate purposes, these compromised memories are invoked, potentially triggering unauthorised actions or data exfiltration. This attack focuses on corrupting the LLM's input source or internal instructions.
  • OWASP LLM 02 - Unintended data exfiltration. ChatGPT processes content from all visited pages, including authenticated sessions in enterprise applications, email clients, CRM systems, and internal tools. This data is transmitted to OpenAI's infrastructure for inference. Organisations using Atlas may inadvertently expose customer data, proprietary information, or regulated content to a third-party LLM provider without proper data processing agreements or user consent mechanisms.
  • OWASP LLM 05 - Improper Output Handling. This risk is a downstream vulnerability where ChatGPT fails to secure the boundary between the AI and other system components, like tools and other browser applications. It occurs when the agent unsafely incorporates, executes, or renders any output generated by the model. The Vulnerability (Output Boundary Failure): If ChatGPT generates a response containing a malicious payload (e.g., JavaScript, SQL code, or shell commands) and the application fails to properly validate, sanitize, or encode that output before it is used (e.g., rendered in a browser, executed as a database query, or passed to a system function), it can lead to severe exploits like Cross-Site Scripting (XSS) or SQL Injection within the operational space of the browser. The key failure is the application's blind trust of the generated actions.
  • OWASP LLM 06 - Autonomous actions with compounding risk. In agent mode, the system makes real-time decisions about form submissions, button clicks, and navigation across authenticated sessions. ** A successfully injected malicious instruction can propagate across multiple domains and services before detection, leveraging the user's existing authentication tokens and session cookies. The temporal gap between compromise and detection creates opportunities for unauthorised data access, privilege escalation, or fraudulent transactions.
  • OWASP LLM 09 - Misinformation.  Misinformation from ChatGPT poses a core vulnerability for basic applications relying on the frontier model, which can lead to potential security breaches, reputational damage, and legal liability. This risk increases when ChatGPT has more autonomy and utilises internet sources to generate content. A related issue is over-reliance, where you or others may place excessive and unverified trust in LLM-generated content, especially in sensitive domains such as finance, law, or cybersecurity. For example, an LLM might confidently suggest an insecure or nonexistent code library; if a developer trusts this output without verification, they can introduce a vulnerability (LLM05) or inadvertently download malware (LLM03 - Supply Chain). The combination of fabricated output and user overtrust exacerbates the impact on critical decisions and processes.

As you can see, this design choice exposes unknowing users to numerous risks. They were drawn in by the wonderful marketing promise of fast, helpful, and reliable AI, while articles [1, 2, 3, 4]about exploitation continued to flood the news, just days after the beta release.

{{cta}}

OpenAI's own warning about enterprise deployment

Atlas remains in beta and is positioned by OpenAI as an experimental product for evaluation purposes, not production deployment. OpenAI's official enterprise documentation explicitly advises against deploying Atlas in regulated environments. The documentation states:

"We recommend caution using Atlas in contexts that require heightened compliance and security controls—such as regulated, confidential, or production data."

At this stage, the browser lacks some security certifications:

  • Not in scope for OpenAI are the SOC2 or ISO certification
  • No compliance API logs or SIEM integration
  • No Atlas-specific SSO enforcement or SCIM provisioning
  • No IP allowlists or private egress settings
  • Limited extension controls and browser policy management

The default configuration further complicates risk management. For Business tier customers, Atlas is enabled by default without administrative approval workflows. Enterprise customers have Atlas disabled by default, but workspace administrators can enable it through workspace settings without triggering security review processes. In both cases, the enterprise security controls available for other ChatGPT interfaces do not extend to Atlas browser sessions.

For more information, refer to OpenAI’s documentation.

Conclusion

The vulnerabilities identified in Atlas are inherent characteristics of agentic LLM architectures. Prompt injection attacks have been demonstrated across all major LLMs (Microsoft Copilot, Google Gemini, Anthropic Claude, Meta Llama…) because these systems operate on a different threat model than traditional applications. Where conventional security assumes malicious code attempting to escape isolation, agentic AI inverts this: the agent has broad permissions that allow it to process and execute instructions from untrusted environments. Atlas amplifies this risk by combining LLM-level vulnerabilities with browser-level access to authenticated sessions across all domains.

Organisations evaluating Atlas should perform risk-based deployment planning. For isolated testing environments with non-production data, Atlas provides a helpful platform for understanding agentic capabilities and developing internal expertise. For production environments handling customer data, regulated information, or business-critical operations, the absence of SOC2 coverage, audit trail infrastructure, and enterprise identity management makes Atlas unsuitable until these controls are implemented and independently validated. Organisations subject to GDPR, HIPAA, PCI-DSS, or similar regulatory frameworks should treat Atlas as out of scope for any systems processing regulated data.

About Giskard: We specialise in red teaming and securing LLM agents through continuous vulnerability testing. Our platform helps enterprises detect security issues in AI systems before they reach production. Ready to test your AI systems? Contact the Giskard team.

Continuously secure LLM agents, preventing hallucinations and security issues.
Book a demo

You will also like

Sycophancy in Large Language Models

When your AI agent tells you what you want to hear: Understanding Sycophancy in LLMs

Sycophancy in LLMs is the tendency of a model to align its responses with the user's beliefs or assumptions, prioritizing pleasing the user over truthfulness or factual accuracy. In this article, you'll learn what are the risks of sycophancy, see real-world scenarios about this weakness, and discover how to detect and prevent it in production AI systems.

View post
LLM vulnerability scanner for dynamic & multi-turn Red Teaming

[Release notes]: New LLM vulnerability scanner for dynamic & multi-turn Red Teaming

We're releasing an upgraded LLM vulnerability scanner that deploys autonomous red teaming agents to conduct dynamic, multi-turn attacks across 40+ probes, covering both security and business failures. Unlike static testing tools, this new scanner adapts attack strategies in real-time to detect sophisticated conversational vulnerabilities.

View post
AI Safety Research - Phare Benchmark - Bias Evaluation - Self-Coherency

LLMs recognise bias but also reproduce harmful stereotypes: an analysis of bias in leading LLMs

Our Phare benchmark reveals that leading LLMs reproduce stereotypes in stories despite recognising bias when asked directly. Analysis of 17 models shows the generation vs discrimination gap.

View post
Stay updated with
the Giskard Newsletter