Wiz Research recently uncovered a ClickHouse database belonging to DeepSeek that was left publicly accessible without any authentication. The exposure allowed complete control over the database and leaked more than a million log entries containing chat history, secret keys, backend details, and other critical information. Following responsible disclosure by Wiz Research, DeepSeek swiftly secured the database.

The Discovery and Its Industry Implications

DeepSeek, a prominent Chinese AI startup, has attracted attention for its advanced AI models, notably the DeepSeek-R1 reasoning model. Known for its efficiency and cost-effectiveness, the model competes with leading AI systems such as OpenAI's offerings. As DeepSeek's profile rose, Wiz Research assessed its external security posture to identify potential vulnerabilities.

Within minutes, Wiz Research identified a publicly accessible, unprotected ClickHouse database reachable at oauth2callback.deepseek.com:9000 and dev.deepseek.com:9000. It contained sensitive data including chat logs, backend information, API secrets, and more. With no authentication required, the exposure gave anyone who found it potential full control of the database and a possible path to privilege escalation.

Detailed Exposure Walkthrough

Wiz Research began by analyzing DeepSeek's public domains, mapping about 30 accessible subdomains. Most appeared benign, hosting chatbot interfaces and documentation, but the researchers also found two unusual open ports (8123 and 9000) that led to an unsecured ClickHouse database.
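For defenders who want to run the same kind of check against their own infrastructure, the following is a minimal sketch, not the method Wiz Research used. It assumes the two ports named above correspond to ClickHouse's default HTTP interface (8123) and native protocol (9000); the function names are hypothetical, and it should only be pointed at hosts you are authorized to assess.

```python
import socket

# ClickHouse defaults: 8123 serves the HTTP interface, 9000 the native protocol.
CLICKHOUSE_PORTS = (8123, 9000)


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def open_clickhouse_ports(host: str, ports=CLICKHOUSE_PORTS, timeout: float = 2.0) -> list:
    """Return the subset of ports that accept TCP connections on host."""
    return [port for port in ports if is_port_open(host, port, timeout)]
```

Calling `open_clickhouse_ports` with one of your own hostnames returns the list of reachable ports. An open port alone does not prove the database is unauthenticated, only that it is reachable from wherever the check was run.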

ClickHouse, a database known for fast analytical queries on large datasets, was misconfigured in a way that allowed arbitrary SQL queries through its HTTP interface. Simple queries revealed tables containing sensitive information, highlighting the severity of the exposure. Notable fields in the exposed logs included:

  • timestamp: Entries from January 6, 2025
  • span_name: Internal DeepSeek API endpoints
  • string.values: Logs with chat history, API keys, and backend details
  • _service: Service origin logs
  • _source: Origins of log requests, exposing structured data

This level of access posed severe risks not only to DeepSeek but also to its users: an attacker could retrieve sensitive information and possibly obtain more through additional queries. Wiz Research limited itself to non-intrusive queries, but the findings show how much damage a malicious actor could have done.
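Teams running their own ClickHouse instances can test for this class of misconfiguration with a harmless query. The sketch below is an illustration under stated assumptions, not Wiz Research's tooling: it relies on the ClickHouse HTTP interface accepting SQL in a `query` URL parameter, sends only `SELECT 1` with no credentials, and treats a 200 response whose body is `1` as evidence that anonymous queries are accepted. The function names are hypothetical, and the check should only be run against systems you own.

```python
from urllib.error import URLError
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_probe_url(host: str, port: int = 8123, query: str = "SELECT 1") -> str:
    """Build a ClickHouse HTTP-interface URL carrying the SQL in `query`."""
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"http://{host}:{port}/?{urlencode({'query': query}, quote_via=quote)}"


def accepts_anonymous_queries(status: int, body: str) -> bool:
    """Interpret a response to an unauthenticated `SELECT 1` probe."""
    return status == 200 and body.strip() == "1"


def probe(host: str, port: int = 8123, timeout: float = 3.0) -> bool:
    """Return True if host answers `SELECT 1` without any credentials."""
    url = build_probe_url(host, port)
    try:
        with urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return accepts_anonymous_queries(resp.status, body)
    except (URLError, OSError):
        # Auth errors, refused connections, and timeouts all count as
        # "not confirmed exposed".
        return False
```

A `True` result from `probe` means the instance should be placed behind authentication and network restrictions immediately. A `False` result is weaker evidence, since it only shows that this particular unauthenticated request did not succeed.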

Critical Lessons Learned

The widespread adoption of AI technologies without matching security measures presents inherent risks. The exposure of DeepSeek's database shows that many of these vulnerabilities stem from the infrastructure and supporting tools around AI systems rather than from the models themselves.

While AI security discussions often focus on futuristic risks, immediate hazards like unintentional database exposures represent significant threats. As businesses increasingly adopt AI tools, they must prioritize security, safeguarding sensitive user data above all. Security teams need to collaborate closely with AI developers to ensure full transparency and protection of sensitive data.

Conclusion

The rapid integration of AI into global industry infrastructure is unprecedented. Many AI firms have grown into indispensable service providers without the rigorous security measures that such a role demands. As AI systems become more embedded in business operations, it is crucial to enforce security protocols akin to those required of cloud providers. Recognizing and mitigating the inherent risks of handling sensitive information will secure long-term trust in AI technologies and their reliability.
