TAXII Collector bot and STIX Parser bot #2611

laciKE · 2025-04-29T23:23:21Z

At a bare minimum, the TAXII Collector currently collects only objects of type indicator. These objects contain information about indicators and their detection patterns, e.g. in stix, pcre, sigma, snort, suricata or yara format. The pattern, pattern_type and valid_from properties are required, while confidence, description and labels are optional. However, the optional properties are present in several TAXII feeds and could be used to determine classification.taxonomy and classification.type even without processing the relationships of the indicators (e.g. indicator indicates malware).
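To illustrate the required and optional indicator properties named above, here is a minimal sketch using only the standard library. The helper name `split_indicator` and the sample object (its id, pattern and label value) are made up for illustration and are not taken from the bot's code.

```python
# Property names as listed in the description above.
REQUIRED = ("pattern", "pattern_type", "valid_from")
OPTIONAL = ("confidence", "description", "labels")


def split_indicator(obj):
    """Return (required, optional) property dicts, or None if unusable.

    Hypothetical helper: objects that are not indicators, or that lack a
    required property, are rejected.
    """
    if obj.get("type") != "indicator":
        return None
    if any(key not in obj for key in REQUIRED):
        return None
    required = {key: obj[key] for key in REQUIRED}
    optional = {key: obj[key] for key in OPTIONAL if key in obj}
    return required, optional


# Illustrative STIX 2.1 indicator; all values are invented.
sample = {
    "type": "indicator",
    "spec_version": "2.1",
    "id": "indicator--00000000-0000-4000-8000-000000000000",
    "pattern": "[domain-name:value = 'example.com']",
    "pattern_type": "stix",
    "valid_from": "2025-01-01T00:00:00Z",
    "labels": ["malicious-activity"],
}

result = split_indicator(sample)
```

Here `result` is a pair of dicts; the second one contains only `labels`, because the sample carries neither `confidence` nor `description`.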

The STIX Parser can currently parse objects of type indicator (usually retrieved by the TAXII Collector). From each indicator object it extracts the detection pattern (currently only single Observation Expressions in STIX format are supported). It supports IP addresses, domains and URLs as indicator values. The parser also attempts to extract some optional properties of STIX objects, such as description and labels, which can be useful for further classification of the event by Expert Bots.
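As a sketch of what extracting a value from a single Observation Expression can look like, the snippet below uses a regular expression. It is not the parser's actual implementation: the function name `extract_simple_ioc` and the mapping to IntelMQ-style field names are assumptions, and the regex deliberately handles only one `type:value = '...'` comparison.

```python
import re

# Matches exactly one comparison, e.g. [ipv4-addr:value = '198.51.100.7'].
SIMPLE_PATTERN = re.compile(
    r"^\[\s*([a-z0-9-]+):value\s*=\s*'([^']*)'\s*\]$"
)

# Assumed mapping from STIX object types to event field names.
FIELD_BY_TYPE = {
    "ipv4-addr": "source.ip",
    "ipv6-addr": "source.ip",
    "domain-name": "source.fqdn",
    "url": "source.url",
}


def extract_simple_ioc(pattern):
    """Return (field, value) for a supported single comparison, else None."""
    match = SIMPLE_PATTERN.match(pattern.strip())
    if match is None:
        # Compound expressions (AND/OR, several comparisons) are not handled.
        return None
    object_type, value = match.groups()
    field = FIELD_BY_TYPE.get(object_type)
    if field is None:
        return None
    return field, value


ip_result = extract_simple_ioc("[ipv4-addr:value = '198.51.100.7']")
compound_result = extract_simple_ioc(
    "[ipv4-addr:value = '198.51.100.7' OR ipv4-addr:value = '198.51.100.8']"
)
```

The compound pattern yields `None`, which mirrors the stated limitation to single Observation Expressions.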

TAXII Collector tests for missing parameters, with a mock of a simple TAXII server providing a minimal collection with a simple indicator object
STIX Parser tests for indicator pattern parsing
Improvements based on @sebix comments, collection title used as feed.code
Fix codestyle in TAXII and STIX bots
Fix Python 3.8 support in STIX Parser bot


laciKE · 2025-04-29T23:32:19Z

The TAXII and STIX bots are currently tested with the ESET Threat Intelligence (ETI) feeds.
Recently, ETI added several new feeds which are available only via TAXII/STIX 2.1, and older ESETCollectorBot and ESETParserBot cannot handle them.

I am working on an Expert Bot for classifying events from ETI, and I would like to publish it when it is ready, together with the feeds in feeds.yaml.

laciKE · 2025-05-01T23:25:20Z

Hello, I have a question regarding the proposal from the last commit.

I created ESETExpertBot which can add the proper classification.type and malware.name (if possible) to the events produced by StixParserBot. Ref: https://github.com/laciKE/intelmq/blob/eset/intelmq/bots/experts/eset/expert.py

When I tried to add the ESET Threat Intelligence TAXII feeds to feeds.yaml together with the expert bot, the tests failed, because it seems that expert bots are not allowed in feeds.yaml.

Especially with the TAXII feeds, three bots will be needed to ingest those feeds:

Collect STIX objects from TAXII server (generic TAXII Collector)
Parse generic STIX indicator objects (generic STIX Parser)
Apply vendor-specific enrichment of events based on optional STIX properties used by the particular vendor (vendor-specific Expert bot).

As far as I understand, two parsers cannot be chained in the pipeline (because a parser's input is a Report and its output is an Event).
What is the suggested way to do a three-step ingestion in similar cases? One generic Parser bot for a given format, with all vendor-specific bots inheriting from that generic parser bot?

sebix · 2025-05-02T09:17:34Z

From what I understand from reading the code, the ESET expert fixes the classification for all events coming from the ESET feed. That logic should be in the Parser instead. Or is the code of the ESET expert also useful for sources other than ESET?

laciKE · 2025-05-02T19:29:17Z

Thank you for your answer. You are right, that expert bot fixes the classification and it is ESET-specific. I will change it to a parser bot which inherits from the StixParserBot from this pull request. After that, I will add the commits with "EsetStixParserBot" to this pull request.

sebix · 2025-05-02T19:55:07Z

Ah, I see. That parser also works for multiple sources, other than ESET?

laciKE · 2025-05-02T21:46:37Z

This StixParserBot, yes: it should work for any source which provides Threat Intelligence data in STIX 2.1 format. I created it from scratch by reading the STIX 2.1 documentation, and it is able to parse Indicator objects with simple patterns.

StixParserBot (and TaxiiCollectorBot) should be usable with any TAXII/STIX 2.1 feed. General parsing of indicators works, but for correct classification a vendor-specific bot is needed. This is why I asked what the proper way to do it is.

So far I have tested TaxiiCollectorBot+StixParserBot only with ESET Threat Intelligence TAXII feeds, because I do not have access to other TAXII 2.1 feeds. For correct classification, I created the ESETExpertBot, which I am going to change to ESETStixParserBot (it will be a child class of the generic StixParserBot).

Parser bot for enriching events from ESET Threat Intelligence which were collected by TaxiiCollectorBot. It inherits from the generic StixParserBot and implements vendor-specific parsing. The ESET STIX Parser bot analyzes the comment (based on the original description of the STIX Indicator object), chooses the proper classification type and, if possible, also fills in malware.name in the event.
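The inheritance approach discussed above (a generic parser with a vendor-specific child class that refines the classification) could be sketched roughly as follows. This is not the IntelMQ bot API and not the actual ESET logic: the class names, the `classify` hook and the keyword rules are all hypothetical stand-ins, and events are plain dicts.

```python
class GenericStixParser:
    """Stand-in for a generic STIX parser (hypothetical, plain dict events)."""

    def parse_indicator(self, indicator):
        # Keep the indicator description as the event comment.
        event = {"comment": indicator.get("description", "")}
        event.update(self.classify(indicator, event))
        return event

    def classify(self, indicator, event):
        # The generic parser cannot know vendor conventions.
        return {"classification.type": "undetermined"}


class VendorStixParser(GenericStixParser):
    """Vendor-specific child class overriding only the classification step."""

    # Invented keyword rules; a real feed needs its own table.
    RULES = (
        ("phishing", "phishing"),
        ("c&c", "c2-server"),
    )

    def classify(self, indicator, event):
        text = event["comment"].lower()
        for keyword, classification_type in self.RULES:
            if keyword in text:
                return {"classification.type": classification_type}
        # Fall back to the generic behaviour when no rule matches.
        return super().classify(indicator, event)


generic_event = GenericStixParser().parse_indicator(
    {"description": "Phishing page"}
)
vendor_event = VendorStixParser().parse_indicator(
    {"description": "Phishing page"}
)
```

With this layout the pipeline still contains a single parser per feed, so the Report-in, Event-out contract of parsers is not affected.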

ETI feeds with URLs, domains and IP addresses, which can be collected by TaxiiCollectorBot and parsed by ESETStixParserBot

laciKE · 2025-05-23T21:54:31Z

I will try to do better parsing for STIX2 patterns.

Also, in ESET Threat Intelligence there are sometimes domains reported in the URL feed and IP addresses in the Domain feed, and this causes InvalidValue exceptions in the produced events. I will try to address it, at least by discarding those indicators without raising exceptions (raise_failure=False).
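A rough, standard-library-only sketch of the kind of check that lets mismatched indicators be dropped quietly is shown below. It is a stand-in, not the bot's code: the function names are hypothetical, and the shape checks are intentionally crude compared with real harmonization validation.

```python
import ipaddress


def is_ip(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True


def keep_indicator(declared_type, value):
    """Return True when the value's shape matches the declared STIX type."""
    if declared_type in ("ipv4-addr", "ipv6-addr"):
        return is_ip(value)
    if declared_type == "domain-name":
        # An IP address, or anything containing a path, is not a domain name.
        return not is_ip(value) and "/" not in value and "." in value
    if declared_type == "url":
        # A bare domain without a scheme is not treated as a URL.
        return "://" in value
    return False


indicators = [
    ("domain-name", "192.0.2.1"),    # IP address in a domain feed
    ("url", "example.com"),          # domain in a URL feed
    ("domain-name", "example.com"),
    ("url", "http://example.com/a"),
]
kept = [item for item in indicators if keep_indicator(*item)]
```

Only the last two entries survive the filter; the first two are discarded without any exception being raised.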

Use the official STIX2 Pattern Validator to get the comparison expressions and extract simple IoCs from them. Support for URLs, domains, IPv4, IPv6 and also for MD5, SHA-1 and SHA-256 hashes. Small fixes and workarounds implemented to address certain anomalies in STIX data provided by some vendors (e.g. ETI): SHA1 and SHA256 keywords are accepted, and invalid objects reported as domains or URLs are dropped without raising exceptions
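The hash keyword workaround mentioned above can be illustrated with a small sketch. It does not use the STIX2 Pattern Validator and is not the bot's code: `normalize_hash` is a hypothetical helper that accepts the `SHA1`/`SHA256` spellings alongside `SHA-1`/`SHA-256` and checks the hex digest length.

```python
import re

# Expected hex digest lengths for the supported algorithms.
HASH_LENGTHS = {"MD5": 32, "SHA-1": 40, "SHA-256": 64}

# Non-standard spellings seen in some feeds, mapped to the STIX names.
ALIASES = {"SHA1": "SHA-1", "SHA256": "SHA-256"}


def normalize_hash(algorithm, value):
    """Return (canonical_algorithm, lowercase_hex) or None if invalid."""
    name = algorithm.strip("'\"").upper()
    name = ALIASES.get(name, name)
    expected = HASH_LENGTHS.get(name)
    if expected is None:
        return None
    if not re.fullmatch(r"[0-9a-fA-F]{%d}" % expected, value):
        return None
    return name, value.lower()


alias_result = normalize_hash("SHA256", "a" * 64)
quoted_result = normalize_hash("'SHA-1'", "A" * 40)
bad_result = normalize_hash("MD5", "xyz")
```

Returning `None` for unknown algorithms or malformed digests lets the caller skip the value instead of raising.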

laciKE · 2025-05-29T00:00:32Z

Better parsing for STIX2 patterns is ready; the STIX parser bot can now also extract hashes.

The above-mentioned issues with ESET Threat Intelligence are fixed.

From my side, the PR is ready for review. If I should change something or if I forgot to do something, please let me know. This is my first PR to IntelMQ.

sebix added the component: bots label Apr 30, 2025

laciKE added 2 commits May 3, 2025 01:43

Add ESET Threat Intelligence feeds

c177068

ETI feeds with URLs, domains and IP addresses, which can be collected by TaxiiCollectorBot and parsed by ESETStixParserBot

laciKE force-pushed the taxii branch 2 times, most recently from 0760f22 to c177068 Compare May 3, 2025 01:04

laciKE marked this pull request as draft May 23, 2025 21:48

laciKE added 3 commits May 29, 2025 00:04

Add TAXII and STIX bots documentation

36e4688

Merge branch 'develop' into taxii

dbe7bee

laciKE marked this pull request as ready for review May 28, 2025 23:57

Fix missing dependency error

ef9508b

laciKE force-pushed the taxii branch from 900f4af to ef9508b Compare May 29, 2025 07:25
