Overview
Our security research team has discovered a publicly accessible ClickHouse database belonging to DeepSeek, a Chinese AI startup known for its advanced AI models. The database was left open without authentication and gave full access to internal data, including over a million log entries containing chat histories, secret keys, backend details, and other highly sensitive information.
Our team immediately and responsibly disclosed this issue to DeepSeek, who promptly secured the exposed database. This blog post outlines our findings and explores the broader security implications for the AI industry.
Executive Summary
DeepSeek has recently gained global recognition for its cutting-edge AI models, particularly DeepSeek-R1, a reasoning model competing with industry leaders like OpenAI’s o1. However, as the company expanded rapidly, our researchers identified a critical security flaw in its infrastructure.
Within minutes of assessing DeepSeek’s external security posture, we discovered a publicly accessible ClickHouse database at the following endpoints:
oauth2callback.deepseek.com:9000
dev.deepseek.com:9000
This database was entirely open, allowing unrestricted access to sensitive data such as:
- Chat history logs
- API secrets and backend operational details
- System logs containing proprietary information
More alarmingly, the exposure allowed full database control, enabling potential privilege escalation within DeepSeek’s environment.
How We Found the Exposure
Our reconnaissance started with a surface-level assessment of DeepSeek’s internet-facing assets. We identified around 30 public subdomains, mostly hosting benign services like chatbot interfaces, status pages, and API documentation.
However, our scan extended beyond standard HTTP ports (80/443), revealing two uncommon open ports (8123 & 9000) on the following hosts:
http://oauth2callback.deepseek.com:8123
http://dev.deepseek.com:8123
http://oauth2callback.deepseek.com:9000
http://dev.deepseek.com:9000
Further analysis confirmed that these ports led to an exposed ClickHouse database, which was completely open to the public without authentication—an immediate security red flag.
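A minimal sketch of this kind of port check, intended for auditing hosts you own or are authorized to test. The function name `open_ports` and the timeout are illustrative assumptions, not the tooling used in this research; 8123 and 9000 are ClickHouse's default HTTP and native TCP ports.

```python
import socket

# ClickHouse defaults: 8123 is the HTTP interface, 9000 the native TCP protocol.
CLICKHOUSE_PORTS = (8123, 9000)


def open_ports(host, ports=CLICKHOUSE_PORTS, timeout=2.0):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    found = []
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                found.append(port)
        except OSError:
            # Refused, filtered, timed out, or unresolvable: treat as not open.
            pass
    return found
```

An open port alone does not prove a database is exposed; it only tells you which services deserve a closer look.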
Understanding ClickHouse
ClickHouse is a high-performance, open-source columnar database system widely used for real-time data analytics and log processing. Given its role in handling large datasets, an unprotected ClickHouse instance presents a serious security risk.
Using ClickHouse’s HTTP interface, we accessed the /play path, which allowed us to execute arbitrary SQL queries directly from a web browser. Running a simple SHOW TABLES; query revealed multiple accessible datasets.
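The same kind of query can be issued without a browser, because ClickHouse's HTTP interface accepts SQL in a `query` URL parameter. The sketch below builds such a URL and checks whether an instance answers a harmless `SELECT 1` with no credentials supplied. The helper names and the example host are hypothetical, and the check should only be pointed at an instance you administer.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def clickhouse_query_url(host, sql, port=8123):
    """Build a URL that passes `sql` to ClickHouse's HTTP interface."""
    params = urlencode({"query": sql}, quote_via=quote)
    return f"http://{host}:{port}/?{params}"


def answers_without_credentials(host, port=8123, timeout=5.0):
    """True if the instance runs a harmless query with no credentials supplied."""
    url = clickhouse_query_url(host, "SELECT 1", port)
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and resp.read().strip() == b"1"
    except OSError:
        # Connection failures and HTTP error statuses both end up here
        # (URLError and HTTPError are OSError subclasses).
        return False


# Hypothetical internal host, shown only to illustrate the URL shape.
print(clickhouse_query_url("db.example.internal", "SHOW TABLES"))
```

If `answers_without_credentials` returns True for an internet-facing host, the instance is accepting queries from anyone who can reach it.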
What We Found
One table, log_stream, stood out for the volume of sensitive data it held. This table contained:
- Over 1 million log entries dating from January 6, 2025
- Plaintext API keys and backend operational metadata
- Internal references to DeepSeek’s API endpoints
- Exposed chatbot conversation logs
- System directory structures and service logs
The level of exposure posed significant risks, not only to DeepSeek but also to any users interacting with its services.
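One way to catch this class of leak before it reaches a log store is to scan log lines for secret-shaped strings. The sketch below is illustrative only: the patterns, including the `sk-` prefix, are assumptions about what a credential might look like, not the actual format of the exposed data.

```python
import re

# Hypothetical patterns; tune these to the credential formats your services issue.
SECRET_PATTERNS = {
    "api_key": re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),
    "bearer_token": re.compile(r"\bBearer\s+[A-Za-z0-9._\-]{20,}"),
    "password_field": re.compile(
        r"\"?password\"?\s*[:=]\s*\"?[^\s\",]+", re.IGNORECASE
    ),
}


def find_secrets(lines):
    """Yield (line_number, pattern_name) for each log line matching a pattern."""
    for number, line in enumerate(lines, start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                yield number, name


sample = [
    "GET /health 200",
    "auth ok key=sk-abcdefghijklmnop1234",
    'login {"password": "hunter2"}',
]
for hit in find_secrets(sample):
    print(hit)
```

Pattern matching like this produces false positives and misses unfamiliar formats, so it complements, rather than replaces, keeping secrets out of logs in the first place.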
The Security Risks
This misconfiguration created multiple attack vectors:
- Data Exfiltration – Attackers could retrieve sensitive logs, API keys, and user-generated chat messages.
- Privilege Escalation – The lack of authentication allowed full control over database operations, potentially leading to deeper system compromise.
- Leakage of Proprietary AI Data – Exposed logs may have contained internal AI model details, system architecture insights, and other proprietary data.
- Potential Credential Exposure – Depending on the ClickHouse configuration, an attacker could potentially extract plaintext passwords and local files from the server using SQL queries.
(Note: Our team adhered to ethical security research standards and did not execute intrusive queries beyond enumeration.)
Key Takeaways
The rapid adoption of AI services without robust security measures presents serious risks. While discussions around AI security often focus on advanced threats, fundamental issues—such as database misconfigurations—are often the most immediate vulnerabilities.
Lessons for AI Companies and Security Teams:
- Security should be a priority from the start – AI startups must integrate security into their development processes, ensuring sensitive infrastructure is properly protected.
- Authentication is non-negotiable – Any internet-facing database must be secured with strong authentication and access controls.
- Continuous monitoring is essential – Automated security scans and real-time logging can help detect and prevent unauthorized access.
- Collaboration between AI and security teams is critical – Security teams must work closely with AI engineers to identify and mitigate risks early.
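The monitoring lesson can be made concrete with a small check that compares what a scan observes against what each host is expected to expose. The host names, the allowlist, and the function name below are hypothetical; the observed ports would come from whatever scanner your team already runs.

```python
# Ports each host is *expected* to expose publicly (hypothetical inventory).
ALLOWED = {
    "chat.example.com": {80, 443},
    "dev.example.com": {443},
}


def unexpected_exposure(observed, allowed=ALLOWED):
    """Map each host to the sorted list of open ports missing from its allowlist."""
    findings = {}
    for host, ports in observed.items():
        # Hosts absent from the inventory have an empty allowlist,
        # so every open port on them is reported.
        extra = set(ports) - allowed.get(host, set())
        if extra:
            findings[host] = sorted(extra)
    return findings


scan = {
    "chat.example.com": [80, 443],
    "dev.example.com": [443, 8123, 9000],
}
print(unexpected_exposure(scan))
```

Run on a schedule, a check like this would flag database ports such as 8123 and 9000 the moment they appear on a public host.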
Conclusion
The unprecedented pace of AI adoption has led to companies becoming critical infrastructure providers almost overnight. However, many lack the security frameworks necessary to protect sensitive data effectively.
As AI continues to integrate into businesses worldwide, the industry must enforce security standards comparable to those in cloud computing and enterprise IT. Addressing these foundational security issues today is crucial to ensuring the safe and responsible deployment of AI technologies.
DeepSeek’s data exposure serves as a stark reminder: No matter how advanced the AI, basic security hygiene remains paramount.