I performed a static analysis of DeepSeek, a Chinese LLM chatbot, using version 1.8.0 from the Google Play Store. The goal was to identify potential security and privacy issues.
I've blogged about DeepSeek previously here.
Additional security and privacy concerns about DeepSeek have been raised.
See also this analysis by NowSecure of the iPhone version of DeepSeek.
The findings detailed in this report are based purely on static analysis. This means that while the code exists within the app, there is no conclusive evidence that all of it is executed in practice. Nonetheless, the presence of such code warrants scrutiny, especially given the growing concerns around data privacy, surveillance, the potential misuse of AI-driven applications, and cyber-espionage dynamics between global powers.
Key Findings
Suspicious Data Handling & Exfiltration
- Hardcoded URLs direct data to external servers, such as ByteDance "volce.com" endpoints, raising concerns about user activity tracking. NowSecure recently identified these in the iPhone app as well.
- Bespoke encryption and data obfuscation methods are present, with indications that they could be used to exfiltrate user details.
- The app contains hard-coded public keys, rather than relying on the user device's chain of trust.
- UI interaction tracking captures detailed user behavior without clear consent.
- WebView manipulation is present, which could allow the app to access private external browser data when links are opened. More details about WebView manipulation are here.
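As a rough illustration of how hardcoded endpoints like those in the first finding can be surfaced during static review, the sketch below pulls URLs out of decompiled source text and groups them by hostname. The function name, the regular expression, and the sample file contents are all hypothetical. None of them are taken from the DeepSeek app or from the tooling used for this report.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Matches http(s) URLs embedded in string literals, comments, or resources.
# Quote characters are excluded so a match stops at the end of a literal.
URL_PATTERN = re.compile(r"https?://[A-Za-z0-9._~:/?#@!$&*+,;=%-]+")


def hosts_in_sources(sources):
    """Map each hostname to the sorted list of files that reference it.

    `sources` is a dict of {file name: decompiled source text}.
    """
    found = defaultdict(set)
    for name, text in sources.items():
        for url in URL_PATTERN.findall(text):
            host = urlparse(url).hostname
            if host:
                found[host].add(name)
    return {host: sorted(files) for host, files in found.items()}


# Hypothetical decompiled snippets, used only to exercise the function.
sample = {
    "a/Net.java": 'String u = "https://api.example.com/v1/log";',
    "b/Cfg.java": 'String v = "http://cdn.example.org/x"; // https://api.example.com/ping',
}
result = hosts_in_sources(sample)
for host, files in sorted(result.items()):
    print(host, files)
```

A host list like this is only a starting point: it shows where the app could send data, not that it does so at runtime.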
Device Fingerprinting & Tracking
A significant portion of the analyzed code appears to focus on gathering device-specific details, which can be used for tracking and fingerprinting.
- The app collects various unique device identifiers, including UDID, Android ID, IMEI, IMSI, and carrier details.
- System properties, installed packages, and root detection mechanisms suggest potential anti-tampering measures. For example, the app probes for the presence of Magisk, a tool that privacy advocates and security researchers use to root their Android devices.
- Geolocation and network profiling are present, indicating potential tracking capabilities and the enabling or disabling of fingerprinting routines by region.
- Hardcoded device model lists suggest the application may behave differently depending on the detected hardware.
- Multiple vendor-specific services are used to extract additional device details. For example, if the app cannot identify the device through the standard Android SIM lookup (because permission was not granted), it tries manufacturer-specific extensions to access the same information.
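To make the findings above more concrete, here is a minimal sketch of how a reviewer might flag identifier collection and root probing in decompiled code by searching for well-known Android API names. The indicator table is an assumption chosen for illustration: the method names are real Android platform APIs, but the list is not derived from the DeepSeek app, and the sample snippet is invented.

```python
# Illustrative indicator table. The API names are real Android methods or
# constants, but this list is an assumption, NOT extracted from the app.
INDICATORS = {
    "IMEI": ["getImei", "getDeviceId"],
    "IMSI": ["getSubscriberId"],
    "SIM serial": ["getSimSerialNumber"],
    "Android ID": ["android_id"],
    "Carrier": ["getNetworkOperatorName", "getSimOperator"],
    "Installed packages": ["getInstalledPackages"],
    "Root probe": ["magisk", "/system/xbin/su"],
}


def fingerprint_hits(text):
    """Return the sorted labels whose indicator strings occur in `text`.

    Matching is case-insensitive, so both the ANDROID_ID constant and its
    "android_id" string value are caught.
    """
    lowered = text.lower()
    return sorted(
        label
        for label, needles in INDICATORS.items()
        if any(needle.lower() in lowered for needle in needles)
    )


# Hypothetical decompiled snippet, used only to exercise the function.
snippet = (
    "String id = tm.getImei();\n"
    "String aid = Settings.Secure.getString(cr, Settings.Secure.ANDROID_ID);\n"
    'boolean rooted = new File("/sbin/.magisk").exists();\n'
)
hits = fingerprint_hits(snippet)
print(hits)
```

String matching of this kind produces false positives and misses reflection-based or obfuscated calls, so any hit still needs manual confirmation.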
Potential Malware-Like Behavior
While no definitive conclusions can be drawn without dynamic analysis, several observed behaviors align with known spyware and malware patterns:
- The app uses reflection and UI overlays, which could facilitate unauthorized screen capture or phishing attacks.
- SIM card details, serial numbers, and other device-specific data are aggregated for unknown purposes.
- The app implements country-based access restrictions and "risk-device" detection, suggesting possible surveillance mechanisms.
- The app makes calls to load Dex modules, where additional code is loaded from files with a .so extension at runtime.
- The .so files themselves turn around and make additional calls to dlopen(), which can be used to load additional .so files. This facility is not typically checked by Google Play Protect and other static analysis services.
- The .so files can be implemented in native code, such as C++. The use of native code adds a layer of complexity to the analysis process and obscures the full extent of the app's capabilities. Moreover, native code can be leveraged to more easily escalate privileges, potentially exploiting vulnerabilities within the operating system or device hardware.
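Because the list above mentions Dex code being loaded from files that carry a .so extension, one simple triage step is to compare each file's extension with its leading magic bytes. The sketch below assumes the APK contents have already been read into memory as a mapping of file name to bytes. The function names and sample data are hypothetical and are not drawn from the app.

```python
def classify_payload(data: bytes) -> str:
    """Classify a file by its leading magic bytes, ignoring its name."""
    if data.startswith(b"\x7fELF"):
        return "elf"  # native shared object or executable
    if data.startswith(b"dex\n"):
        return "dex"  # Dalvik executable
    if data.startswith(b"PK\x03\x04"):
        return "zip"  # APK, JAR, or plain ZIP archive
    return "unknown"


def mismatched_libraries(files):
    """Return the names ending in .so whose content is not an ELF binary.

    `files` is a dict of {file name: raw bytes}.
    """
    return sorted(
        name
        for name, data in files.items()
        if name.endswith(".so") and classify_payload(data) != "elf"
    )


# Hypothetical archive contents, used only to exercise the functions.
sample_files = {
    "lib/arm64-v8a/libfoo.so": b"\x7fELF\x02\x01\x01\x00",
    "assets/libbar.so": b"dex\n035\x00",
    "classes.dex": b"dex\n035\x00",
}
suspicious = mismatched_libraries(sample_files)
print(suspicious)
```

A mismatch does not prove malicious intent, and encrypted or packed payloads would show up only as "unknown", so this check narrows down what to inspect by hand rather than replacing that inspection.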
Remarks
While data collection is common in modern applications for debugging and improving user experience, aggressive fingerprinting raises significant privacy concerns. The DeepSeek app requires users to log in with a valid email, which should already provide sufficient authentication. There is no legitimate reason for the app to aggressively collect and transmit unique device identifiers, IMEI numbers, SIM card details, and other system properties.
The extent of tracking observed here exceeds typical analytics practices, potentially enabling persistent user tracking and re-identification across devices. These behaviors, combined with obfuscation techniques and network communication with third-party tracking services, call for a higher level of scrutiny from security researchers and users alike.
The use of runtime code loading together with the bundling of native code suggests that the app could allow the deployment and execution of unreviewed, remotely delivered code. This is a serious potential attack vector. This report presents no evidence that remotely deployed code is being executed, only that the facility for it appears to be present.
Additionally, the app's approach to detecting rooted devices appears excessive for an AI chatbot. Root detection is typically justified in DRM-protected streaming services, where security and content protection are critical, or in competitive video games to prevent cheating. However, there is no clear rationale for such strict measures in an application of this nature, raising further questions about its intent.
Users and organizations considering installing DeepSeek should be aware of these potential risks. If this application is being used within an enterprise or government environment, additional vetting and security controls should be enforced before allowing its deployment on managed devices.
Disclaimer: The analysis presented in this report is based on static code review and does not imply that all identified functions are actively used. Further investigation is needed for definitive conclusions.