DeepSeek’s Data Exposure: A Critical Security Oversight

30 January 2025

Overview

Security researchers have discovered a publicly accessible ClickHouse database belonging to DeepSeek, a Chinese AI startup known for its advanced AI models. The database was left open without authentication and gave full access to internal data, including over a million log entries containing chat histories, secret keys, backend details, and other highly sensitive information.

Our team immediately and responsibly disclosed this issue to DeepSeek, who promptly secured the exposed database. This blog post outlines our findings and explores the broader security implications for the AI industry.

Executive Summary

DeepSeek has recently gained global recognition for its cutting-edge AI models, particularly DeepSeek-R1, a reasoning model competing with industry leaders like OpenAI’s o1. However, as the company expanded rapidly, security researchers identified a critical security flaw in its infrastructure.

Within minutes of assessing DeepSeek’s external security posture, security researchers discovered a publicly accessible ClickHouse database at the following endpoints:

oauth2callback.deepseek.com:9000
dev.deepseek.com:9000

This database was entirely open, allowing unrestricted access to sensitive data such as:

Chat history logs
API secrets and backend operational details
System logs containing proprietary information

More alarmingly, the exposure allowed full database control, enabling potential privilege escalation within DeepSeek’s environment.

How the Exposure Was Found

Our reconnaissance started with a surface-level assessment of DeepSeek’s internet-facing assets. We identified around 30 public subdomains, mostly hosting benign services like chatbot interfaces, status pages, and API documentation.

However, our scan extended beyond standard HTTP ports (80/443), revealing two uncommon open ports (8123 & 9000) on the following hosts:

http://oauth2callback.deepseek.com:8123
http://dev.deepseek.com:8123
http://oauth2callback.deepseek.com:9000
http://dev.deepseek.com:9000

Further analysis confirmed that these ports led to an exposed ClickHouse database, which was completely open to the public without authentication—an immediate security red flag.
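
For teams auditing their own perimeter, the check described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the tooling used in the research: the function names are hypothetical, and it should only be pointed at hosts you own or are authorized to test. It attempts a plain TCP connection to ports 8123 and 9000, which are ClickHouse's default HTTP and native-protocol ports.

```python
import socket

# Ports named in this post: 8123 (ClickHouse HTTP interface)
# and 9000 (ClickHouse native TCP protocol).
CLICKHOUSE_PORTS = (8123, 9000)


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused connections, timeouts, and DNS failures all land here.
        return False


def open_clickhouse_ports(host: str, ports=CLICKHOUSE_PORTS) -> list[int]:
    """Return the subset of ports that accept connections on host."""
    return [port for port in ports if is_port_open(host, port)]
```

An open port only shows that something is listening. Whether it is a database that answers without credentials is a separate check.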

Understanding ClickHouse

ClickHouse is a high-performance, open-source columnar database system widely used for real-time data analytics and log processing. Given its role in handling large datasets, an unprotected ClickHouse instance presents a serious security risk.

Using ClickHouse’s HTTP interface, the researchers accessed the /play path, which allows arbitrary SQL queries to be executed directly from a web browser. Running a simple SHOW TABLES; query revealed multiple accessible datasets.
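
The same HTTP interface also accepts SQL in the query URL parameter, which makes it easy to test whether one of your own instances answers without credentials. The sketch below is illustrative only: the helper names are hypothetical, the default port 8123 is assumed, and the probe sends nothing more than SELECT 1 to a server you control.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_query_url(host: str, sql: str, port: int = 8123) -> str:
    """Build a ClickHouse HTTP-interface URL carrying the SQL in ?query=."""
    params = urllib.parse.urlencode({"query": sql}, quote_via=urllib.parse.quote)
    return f"http://{host}:{port}/?{params}"


def answers_without_credentials(host: str, port: int = 8123,
                                timeout: float = 5.0) -> bool:
    """Return True if the server runs a harmless query with no credentials."""
    url = build_query_url(host, "SELECT 1", port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and resp.read().strip() == b"1"
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error responses.
        return False
```

A True result on an internet-facing host means anyone who can reach the port can run queries as the default user.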

What Was Found

One table, log_stream, stood out due to its vast amount of sensitive data. This table contained:

Over 1 million log entries dating from January 6, 2025
Plaintext API keys and backend operational metadata
Internal references to DeepSeek’s API endpoints
Exposed chatbot conversation logs
System directory structures and service logs

The level of exposure posed significant risks, not only to DeepSeek but also to any users interacting with its services.
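
Plaintext API keys in a log table turn any read access into credential theft. One common mitigation is to scrub secrets before a log line is ever written. The following is a minimal sketch, not a description of DeepSeek's logging pipeline: the key format in the pattern is a hypothetical example and would need to match whatever your own services issue.

```python
import re

# Hypothetical key format for illustration: "sk-" followed by 16 or more
# letters or digits. Replace with the formats your services really use.
SECRET_PATTERN = re.compile(r"\bsk-[A-Za-z0-9]{16,}\b")


def redact_secrets(line: str, placeholder: str = "[REDACTED]") -> str:
    """Replace anything matching SECRET_PATTERN before the line is stored."""
    return SECRET_PATTERN.sub(placeholder, line)
```

Redaction at write time limits what a later misconfiguration can leak, although it does not replace access control on the log store itself.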

The Security Risks

This misconfiguration created multiple attack vectors:

Data Exfiltration – Attackers could retrieve sensitive logs, API keys, and user-generated chat messages.
Privilege Escalation – The lack of authentication allowed full control over database operations, potentially leading to deeper system compromise.
Leakage of Proprietary AI Data – Exposed logs may have contained internal AI model details, system architecture insights, and other proprietary data.
Potential Credential Exposure – In poorly secured environments, plaintext passwords and local files could be extracted using SQL queries.

(Note: Our team adhered to ethical security research standards and did not execute intrusive queries beyond enumeration.)

Key Takeaways

The rapid adoption of AI services without robust security measures presents serious risks. While discussions around AI security often focus on advanced threats, fundamental issues—such as database misconfigurations—are often the most immediate vulnerabilities.

Lessons for AI Companies and Security Teams:

Security should be a priority from the start – AI startups must integrate security into their development processes, ensuring sensitive infrastructure is properly protected.
Authentication is non-negotiable – Any internet-facing database must be secured with strong authentication and access controls.
Continuous monitoring is essential – Automated security scans and real-time logging can help detect and prevent unauthorized access.
Collaboration between AI and security teams is critical – Security teams must work closely with AI engineers to identify and mitigate risks early.
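
As one way to make the monitoring point concrete, scan results can be compared against an allowlist of ports each host is expected to expose, with anything extra raised as a finding. This is a sketch under assumptions: the host names and allowlist are made up, and the scan results are taken as given from whatever scanner you already run.

```python
# Hypothetical allowlist: ports each host is expected to expose publicly.
EXPECTED_PORTS = {
    "chat.example.com": {80, 443},
    "status.example.com": {443},
}


def unexpected_exposures(scan_results: dict[str, set[int]],
                         expected: dict[str, set[int]] = EXPECTED_PORTS
                         ) -> dict[str, set[int]]:
    """Return, per host, the open ports that are not on the allowlist."""
    findings = {}
    for host, open_ports in scan_results.items():
        # Hosts missing from the allowlist have every open port flagged.
        extra = set(open_ports) - expected.get(host, set())
        if extra:
            findings[host] = extra
    return findings
```

Run on a schedule, a check like this would flag a development host that suddenly exposes 8123 or 9000.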

Conclusion

The unprecedented pace of AI adoption has led to companies becoming critical infrastructure providers almost overnight. However, many lack the security frameworks necessary to protect sensitive data effectively.

As AI continues to integrate into businesses worldwide, the industry must enforce security standards comparable to those in cloud computing and enterprise IT. Addressing these foundational security issues today is crucial to ensuring the safe and responsible deployment of AI technologies.

DeepSeek’s data exposure serves as a stark reminder: No matter how advanced the AI, basic security hygiene remains paramount.
