feat: add parser for Zoom Team Chat encrypted databases #2833
calilkhalil wants to merge 22 commits
Add data model classes for Zoom DPAPI forensic parser integration, following IPED's existing parser pattern (separate class per model). Models ported from zoom_forensics with inner classes split into individual files: ZoomData, MasterKeyData, ZoomUserAccount, ZoomSystemInfo, ZoomMessage, ZoomMeeting, ZoomParticipant, ZoomSharedFile, ZoomRecording, ZoomKeyValue, ZoomTimelineEvent. Includes unit tests (15 tests, all passing).
Port DataReader (binary parser), LocalDataDecryptor (AES/CBC with SID-derived key), DPAPIBlobDecryptor (DPAPI blob decryption with 3DES/AES and SHA-1/256/384/512 HMAC), and DPAPIMasterKeyDecryptor (DPAPI master key decryption with PBKDF2). Adapted from zoom_forensics: removed filesystem I/O (readFile), classes now accept byte[] directly for IPED's evidence tree model. Shared hex conversion utilities made static for cross-class reuse. Includes unit tests (15 tests, all passing).
Port ZoomDatabaseReader using SQLiteMCSqlCipherConfig API from io.github.willena:sqlite-jdbc to configure cipher parameters before connection (PRAGMA-based config is not possible, see issue sepinf-inc#159). Port ZoomDataExtractor with methods accepting JDBC Connection directly instead of filesystem paths, suitable for IPED's evidence tree model. Extracts user accounts, participants, messages, meetings, shared files, recordings, key-values, system info, and timeline events. Add io.github.willena:sqlite-jdbc:3.49.1.0 dependency to pom.xml for SQLCipher Multiple Ciphers (MC) support. Includes unit tests (8 tests, all passing).
Add ZoomDpapiParser extending AbstractParser, triggered by Zoom.us.ini files (application/x-zoom-dpapi-ini). Reads DPAPI-encrypted OSKEY from INI, locates encrypted databases via IItemSearcher, extracts forensic data, and emits meetings/messages/account as virtual items via EmbeddedDocumentExtractor following WhatsAppParser pattern. Add ZoomReportGenerator producing HTML fragments for meetings and account info suitable for IPED's viewer. Configurable via @Field: extractMessages, decryptedOskey. Includes unit tests (11 tests, all passing; 49 total across package).
Add ZoomDetector implementing Tika Detector interface to identify Zoom.us.ini files by filename and assign application/x-zoom-dpapi-ini MIME type. Encrypted .enc.db files are discovered by the parser via IItemSearcher since they lack standard SQLite headers. Register ZoomDetector in META-INF/services/org.apache.tika.detect.Detector. Includes unit tests (5 tests, all passing; 54 total across package).
Register ZoomDpapiParser in ParserConfig.xml with extractMessages=true. Add ZoomDpapi.Report.* localization keys to all 6 locale files: English (default), Portuguese (pt_BR), German (de_DE), Spanish (es_AR), French (fr_FR), and Italian (it_IT).
Port DPAPIHash (hash string parser), PasswordCracker (dictionary-based cracking with MD4/NTLM for domain accounts, PBKDF2 key derivation), and HashGenerator (generates $DPAPImk$ hashes from master key files). Adapted for IPED: PasswordCracker accepts List<String> instead of file path, HashGenerator accepts byte[] instead of file path, DPAPIHash extracted to its own class file. Includes unit tests (12 tests, all passing; 66 total across package).
Change visibility of ZoomDpapiParser.extractEncryptedKey() and ZoomDataExtractor.formatSize() from package-private to public for external testability and reuse.
Register Zoom MIME types in CategoriesConfig.json:
- Chats > Zoom (application/x-zoom-meeting)
- Instant Messages (message/x-zoom-message)
- User Accounts (application/x-zoom-account)
Embed wordlist.txt (669 passwords) in parser resources for automatic DPAPI master key password cracking. Rewrite tryDecryptOskey to use HashGenerator + PasswordCracker with the embedded wordlist instead of only trying empty password. Add informative logging at each decryption stage.
Rewrite ZoomReportGenerator to match the standalone zoom_forensics HtmlReportGenerator visual style with complete forensic details:
- Meeting cards with all identifiers (Meeting Number, SDK UID, Conference ID, Host ID)
- Stats with first/last message timestamps, duration, counts
- Participants table with GUIDs from chat correlation
- Shared files with full encryption details (algorithm, keys, DB keys, K attribute, SHA-256 hashes, sender IDs)
- Chat messages with timestamps, sender names, and message GUIDs
- Account view with 2-column grid (Credentials + System info) with parsed fields (CPU Name, GPU Name from WMI data)
Return null instead of OCTET_STREAM when file is not Zoom.us.ini, preventing the detector from overriding MIME types of all other files. Extract filename from full paths (both Windows and Unix separators) before comparison, so detection works when IPED passes the complete evidence path as resource name. Added tests for full path detection (68 tests total, all passing).
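The filename handling described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's ZoomDetector: the class and method names are hypothetical, the result is a plain String rather than a Tika MediaType so the snippet stays self-contained, and the case-insensitive comparison is an assumption.

```java
import java.util.Locale;

/** Hypothetical helper illustrating the commit; not the actual ZoomDetector code. */
final class ZoomIniNameMatcher {

    static final String ZOOM_INI_MIME = "application/x-zoom-dpapi-ini";

    /** Strips any directory prefix, accepting both '\' and '/' separators. */
    static String baseName(String resourceName) {
        int cut = Math.max(resourceName.lastIndexOf('/'), resourceName.lastIndexOf('\\'));
        return resourceName.substring(cut + 1);
    }

    /** Returns the Zoom INI MIME type, or null so that other detectors decide. */
    static String detect(String resourceName) {
        if (resourceName == null || resourceName.isEmpty()) {
            return null;
        }
        // Assumption: the filename is compared case-insensitively.
        String name = baseName(resourceName).toLowerCase(Locale.ROOT);
        return "zoom.us.ini".equals(name) ? ZOOM_INI_MIME : null;
    }
}
```

Returning null for every other file is the key point: a non-null fallback type would claim files the detector knows nothing about.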
Add INFO-level logging at each step of the parse() method to diagnose whether the decryptedOskey @Field parameter is being received and whether DPAPI decryption is attempted.
…pe patterns
Replace naive NAME-only query with two-strategy approach:
1. PATH + NAME query using INI file's parent directory (Skype pattern)
2. Fallback to NAME with path proximity filtering (WhatsApp pattern)
This fixes the blocker where findDatabaseFile returned empty results during IPED processing because the simple name query was insufficient.
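One way to picture the "path proximity" fallback is to rank same-named candidates by how many leading path segments they share with the INI file's directory. The sketch below assumes that interpretation. The class, its methods, and the use of plain path strings instead of IPED items are all hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical ranking helper; the PR's findDatabaseFile works on IPED items. */
final class PathProximity {

    /** Directory part of a path, with separators normalized to '/'. */
    static String parentDir(String path) {
        String p = path.replace('\\', '/');
        int cut = p.lastIndexOf('/');
        return cut < 0 ? "" : p.substring(0, cut);
    }

    /** Number of leading path segments the two paths have in common. */
    static int sharedSegments(String a, String b) {
        String[] x = a.replace('\\', '/').split("/");
        String[] y = b.replace('\\', '/').split("/");
        int n = 0;
        while (n < x.length && n < y.length && x[n].equalsIgnoreCase(y[n])) {
            n++;
        }
        return n;
    }

    /** Picks the candidate whose path is closest to the INI file's directory. */
    static Optional<String> closest(String iniPath, List<String> candidatePaths) {
        String iniDir = parentDir(iniPath);
        return candidatePaths.stream()
                .max(Comparator.comparingInt(p -> sharedSegments(iniDir, p)));
    }
}
```

With two users' profiles in the same evidence, this prefers the database that sits under the same profile as the INI file.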
- Add ZOOM_INI to QueuesProcessingOrder with priority 3, ensuring IItemSearcher is available when the parser runs (same as Skype/Discord)
- Fix extractSidFromPath to search Protect directory children via IItemSearcher instead of broken wildcard query
- Add extractUserBasePath helper for deriving user profile path
- Improve tryDecryptOskey logging for better diagnostics
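For context, DPAPI master keys live under the user's AppData\Roaming\Microsoft\Protect\<SID> directory, so both the SID and the profile root can be read off an item path. The sketch below shows only that path arithmetic with regular expressions. It is not the PR's approach, which lists the Protect directory's children through IItemSearcher, and the class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical path helpers; the PR resolves these through IItemSearcher. */
final class ProtectPathSketch {

    // A SID-shaped path segment directly under a "Protect" directory.
    private static final Pattern SID_UNDER_PROTECT = Pattern.compile(
            "(?i)[\\\\/]Protect[\\\\/](S-1-[0-9]+(?:-[0-9]+)+)(?:[\\\\/]|$)");

    // Everything before the first "AppData" segment, i.e. the user profile root.
    private static final Pattern BEFORE_APPDATA = Pattern.compile(
            "(?i)^(.*?)[\\\\/]AppData[\\\\/]");

    static Optional<String> sidFromPath(String path) {
        Matcher m = SID_UNDER_PROTECT.matcher(path);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    static Optional<String> userBasePath(String path) {
        Matcher m = BEFORE_APPDATA.matcher(path);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

Given the path of Zoom.us.ini, userBasePath yields the profile root from which the Protect directory can be located. sidFromPath applies to the master key files found there.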
…sion
Without this, the Zoom.us.ini was not in an expandable category, so IPED never emitted the virtual meeting/message/account subitems.
- Extract shared crypto utilities into CryptoUtil (eliminates duplication across PasswordCracker, DPAPIMasterKeyDecryptor, DPAPIBlobDecryptor)
- Remove dead code: ZoomData, MasterKeyData, unused testConnection(), getTitle(), extractRecordings(), ZOOM_ACCOUNT constant
- Fix ZoomMessage.getDate() epoch seconds vs milliseconds bug
- Add saved meeting extraction and confID/sdkMeetingUid linking
- Improve meeting report: full HTML forensic reports with grid stats, participants, shared files with encryption details, timeline
- Add sub-class-of text/html for x-zoom-meeting (enables HTML preview)
- Add Zoom category/mime icons, fix icon mapping hierarchy
- Add SLF4J logging, replace silent catches with logger.debug
- Add bounds checking to DataReader.skip()
- Use TemporaryResources instead of deleteOnExit() for temp files
- Remove unused ZoomDpapi.Report.* localization keys (6 languages)
- Update tests to match refactored API (64 tests passing)
|
Hi @calilkhalil, I've just run your example case and it worked great. Regarding the issue of A few other points that might be worth adjusting:
|
|
Hi @aberenguel, thanks for the review! Just to clarify on The current logic does a multi-stage search:
Does this look like the right approach for finding sibling files in the evidence tree? I based it on how the WhatsApp/Skype parsers use Btw, I'm already working on the other suggestions! |
- Map zoom icon to application/x-zoom-meeting MIME type
- Add message/x-zoom-message to Instant Messages category
- Remove non-deterministic generated date from HTML report
- Add Conversation: metadata (id, Name, messagesCount, Participants)
|
@aberenguel, also, if this PR gets merged, would it be okay to mention it on my LinkedIn? It's part of research I'm conducting on insider threats and their correlation with third-party apps, and as the research evolves, more parsers will likely follow. Proper credit to your team would definitely be included (feel free to share your LinkedIn or company handle so I can tag you guys).
|
With users and contributors abroad, maintaining English as the standard language helps everyone follow the discussions. |
|
Thanks very much @calilkhalil for this contribution!
Just a question: is there file size information in the Zoom database being decoded? If so, I think it should be included in the search query. Searching by file name alone is dangerous, since it could link to the wrong files.
|
@lfcnassif, yes we can't rely on file size since the databases vary, but the name-only fallback already has a couple of safeguards in place:
That said, I think we can tighten it up a bit more. Since Zoom SQLCipher databases always use a 1024-byte page size, we could reject any file that isn't a multiple of 1024 bytes (or is smaller than 1024 bytes). That would cut out most false positives without adding much complexity. Would that work for you, or did you have something else in mind? |
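The size check proposed in this comment could look roughly like the following sketch. The class and method names are hypothetical, and only the 1024-byte page size comes from the discussion.

```java
/** Hypothetical pre-filter for candidate .enc.db files, based on file length only. */
final class SqlCipherSizeFilter {

    /** Page size stated in the discussion for Zoom's SQLCipher databases. */
    static final long PAGE_SIZE = 1024;

    /** True when the length is at least one page and a whole number of pages. */
    static boolean plausibleDatabaseSize(long fileLength) {
        return fileLength >= PAGE_SIZE && fileLength % PAGE_SIZE == 0;
    }
}
```

The check is a cheap rejection test, not proof of a match: any unrelated file whose length happens to be a multiple of 1024 bytes still passes it.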
|
Just a quick observation (without having dove into the code yet): we should ensure the search logic handles deleted files correctly. In cases where two files share the same path, we need to distinguish between the active version and the one where |
|
Thanks for your work, @calilkhalil.
Good to know the
Given the current implementation of Is there any scenario where a Using |
…in findDatabaseFile
Addresses PR review feedback from @wladimirleite and @aberenguel:
- Add filterDeleted() to prefer active items over deleted ones when multiple files share the same path (e.g. after Zoom auto-updates)
- Scope the name-only fallback query with EVIDENCE_UUID to prevent cross-evidence matching in multi-disk cases
- Follow the same IItemSearcher patterns used by DiscordParser
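The "prefer active over deleted" rule can be illustrated with a small self-contained sketch. The Candidate type is a hypothetical stand-in for an IPED item and preferActive is not the PR's filterDeleted(). Falling back to deleted items when no active one exists is an assumption about the intended behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of preferring active items over deleted ones. */
final class DeletedItemFilter {

    /** Stand-in for an evidence item; IPED's real item type has many more fields. */
    static final class Candidate {
        final String path;
        final boolean deleted;

        Candidate(String path, boolean deleted) {
            this.path = path;
            this.deleted = deleted;
        }
    }

    /** Keeps only active candidates when any exist; otherwise returns the input unchanged. */
    static List<Candidate> preferActive(List<Candidate> candidates) {
        List<Candidate> active = new ArrayList<>();
        for (Candidate c : candidates) {
            if (!c.deleted) {
                active.add(c);
            }
        }
        // Assumption: a deleted database is still better than no database at all.
        return active.isEmpty() ? candidates : active;
    }
}
```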
Force-pushed from c771227 to 5556423.
|
@wladimirleite and @aberenguel. Pushed a fix in 5556423 I've added @aberenguel, the name-only fallback now includes Refs:
|
|
Thank you very much @calilkhalil for this PR! @aberenguel, since you did an initial review and test, would you mind doing a final review/test and, if it is OK, approve this PR so we could get it merged? |
|
@aberenguel thx for fixing the FX thing. I forgot about it. Just used to test it on my environment. |
|
Hi @calilkhalil! I've updated the logic to better prioritize and select the candidate databases. It now checks the file against the |
|
Fair enough, thank you again @aberenguel! One thing is concerning me: everyone should know that, in order to crack the password, the user has to include it in the word list. This needs to be documented somewhere. Do you need any help with that? |
|
Hi, @calilkhalil ! I got some errors parsing E01 files due to duplicated sqlite libraries in lib folder:
I had to remove sqlite-jdbc-3.49.1.0.jar manually, but I don't know the impact on SQLite encrypted libraries.
|
Hi @aberenguel, this is exactly the conflict I flagged in the PR description: the two sqlite-jdbc versions cannot coexist on the classpath. The reason we need 3.49.1.0 is that Zoom's SQLCipher databases use custom cipher parameters (1024-byte page size, 4000 KDF iterations, SHA-512 HMAC), and the only way to configure them is through the SQLiteMCSqlCipherConfig API. PRAGMA-based configuration doesn't work because the driver validates the database file during connection initialization, before any statement can execute. I ran into this myself and opened an issue upstream: Willena/sqlite-jdbc-crypt#159. So the fix should be to keep only 3.49.1.0 and remove 3.41.2.2. The question is whether SleuthKit's Happy to help investigate on my end.
|
The problem is that the xerial sqlite jdbc library changed setReadUncommited to setReadUncommitted, as first noticed here: They should have kept the old method as deprecated... I have had some headaches in the past with class loaders. As we already use a sleuthkit fork with some patches, if the latest sleuthkit (4.14) doesn't use the updated method name, I think we could add one more patch to our fork.
The method name was updated in TSK a couple of years ago: TSK 4.14 uses 3.49.1.0 version of Xerial SQLite JDBC. |
Great! So, we need to upgrade to TSK-4.14 first; it is being tracked here: #2229 @aberenguel, could you help with that too?
Sure! I'm working on a case with an encrypted MacBook disk. I'll try to make it work with TSK-4.14 merged with our changes.
Great! Thank you very much! |
Closes #2708
Summary
Pure Java parser that decrypts and extracts forensic artifacts from Zoom Team Chat encrypted databases (zoomus.enc.db, zoommeeting.enc.db) using Windows DPAPI master key cracking. Works on both Windows and Linux, with no native dependencies.
Decryption Pipeline
Extracted Artifacts
application/x-zoom-meeting
Each meeting generates an individual HTML report with child message subitems for the global timeline, following the pattern established by WhatsApp and Telegram parsers.
Report Contents
Each forensic report includes:
Implementation
41 files changed, ~4560 lines added across 4 modules:
Parser Package (iped.parsers.zoomdpapi)
- ZoomUserAccount, ZoomMessage, ZoomMeeting, ZoomParticipant, ZoomSharedFile, ZoomRecording, ZoomKeyValue, ZoomSystemInfo, ZoomTimelineEvent
- CryptoUtil, DataReader, LocalDataDecryptor, DPAPIBlobDecryptor, DPAPIMasterKeyDecryptor
- DPAPIHash, PasswordCracker, HashGenerator
- ZoomDpapiParser, ZoomDetector, ZoomDatabaseReader, ZoomDataExtractor, ZoomReportGenerator
Configuration
- CustomSignatures.xml: MIME types plus sub-class-of text/html for report preview
- ParserConfig.xml: Parser registration with extractMessages parameter
- CategoriesConfig.json: Chats > Zoom category mapping
- META-INF/services: ZoomDetector registration
Dependencies
- io.github.willena:sqlite-jdbc:3.49.1.0 (SQLCipher MC support) added to iped-parsers-impl/pom.xml
- sqlite-jdbc-3.41.2.2.jar and they cannot coexist on the classpath (NoSuchFieldError: CIPHER)
SQLCipher Configuration
V4 defaults with Zoom-specific parameters:
- SQLiteMCSqlCipherConfig API (PRAGMA-based config does not work)
Design Decisions
- AbstractParser instead of SQLite3DBParser: the parser needs to orchestrate multi-file decryption (INI plus DPAPI master keys plus databases), not just open a single SQLite file
- ZoomDetector triggers on Zoom.us.ini and returns null for non-Zoom files
Testing
64 unit tests in 6 test classes:
- ZoomModelsTest
- ZoomCryptoTest
- ZoomDataExtractorTest
- ZoomDpapiParserTest
- ZoomDetectorTest
- ZoomCrackingTest
E2E Validation
Tested against real Windows evidence (user profile with Zoom installation):
Known Limitation
findDatabaseFile() uses IItemSearcher to locate .enc.db files in the evidence tree. Currently returns no results during IPED processing, likely a timing issue (items not yet indexed when parser runs). The parser works correctly when databases are found (proven by E2E test). This is the main remaining point for full integration and may need guidance from maintainers.
Screenshots