254 changes: 254 additions & 0 deletions conf/parsers_mikrotik.yaml
@@ -0,0 +1,254 @@
# parsing rules for Mikrotik firewalls
# https://forum.mikrotik.com/

parsers:

- name: mikrotik-firewall
# Firewall logs from Mikrotiks
# https://regex101.com/r/k32H3p/1
format: regex
regex: |
(?x)

(?<ident>firewall) ,info \s

# labels are optional, and user-defined
(?: catchall \s )?
(?:
(?<fw_action>
(?:
accept (?:keepalive) ?
| allow (?: \s [-A-Za-z]+ )?
| (?: [-A-Za-z]+ \s ) ? drop (?: \s invalid ) ?
| [a-z0-9]+_scanhost
| masq
)
)
\s
)?

(?<fw_direction> (?: forward | input | output | srcnat ) ) :\s
(?: in: (?: \(unknown\s [0-9]\) | (?<in_interface>[^ ]+) ) ) \s
(?: out: (?: \(unknown\s [0-9]\) | (?<out_interface>[^ ,]+) ) ) ,\s

(?: connection-state: (?<conn_state>[^\s]+) \s )?

(?: src-mac\s (?<macsrc>[A-Fa-f0-9]{2} (?: :[A-Fa-f0-9]{2}){5} ) ,\s )?

proto\s

(?:
(?<proto>TCP)
\s\(
(?: (?<tcp_syn>SYN) ,? )?
(?: (?<tcp_ack>ACK) ,? )?
(?: (?<tcp_rst>RST) ,? )?
(?: (?<tcp_fin>FIN) ,? )?
(?: (?<tcp_psh>PSH) ,? )?
(?: (?<tcp_urg>URG) ,? )?
\)
|
(?<proto>UDP)
|
(?<proto>ICMP)
\s \(
type\s (?<icmp_type>\d+),\s
code\s (?<icmp_code>\d+) \)
|
(?<proto>[^\s,]+)
)
,\s
(?<source>(?:[0-9]{1,3}\.){3}[0-9]{1,3})
(?: : (?<src_port>\d+) )?
->
(?<dest>(?:[0-9]{1,3}\.){3}[0-9]{1,3})
(?: : (?<dst_port>\d+) )?
,\s
Comment on lines +61 to +66

@coderabbitai coderabbitai bot Nov 24, 2025

⚠️ Potential issue | 🟡 Minor

❓ Verification inconclusive

IP address pattern accepts invalid addresses; consider stricter validation.

Lines 61 and 64 (repeated at 76, 79, 90, and 93) use the pattern (?:[0-9]{1,3}\.){3}[0-9]{1,3}, which matches any four dot-separated groups of 1–3 digits. It therefore accepts invalid IPv4 addresses such as 999.999.999.999 or 192.168.256.1, as well as zero-padded forms such as 00.00.00.00. This may be acceptable for log parsing if Mikrotik's output is known to be well-formed; for robustness, consider a pattern that checks each octet's range:

(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(?:\.(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}

This pattern restricts each octet to the range 0–255. It still accepts leading zeros (for example 00 or 01), so zero-padded addresses continue to match.

If tighter validation is desired, apply this diff to the firewall parser (and repeat for DHCP, OpenVPN, and login parsers):

-       (?<source>(?:[0-9]{1,3}\.){3}[0-9]{1,3})
+       (?<source>(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(?:\.(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3})
        (?: : (?<src_port>\d+) )?
        ->
-       (?<dest>(?:[0-9]{1,3}\.){3}[0-9]{1,3})
+       (?<dest>(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(?:\.(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3})

I'll verify the regex patterns by testing them against various IPv4 addresses to confirm the claims in the review comment.

#!/usr/bin/env python3
import re

# Current pattern from the parser
current_pattern = r'(?:[0-9]{1,3}\.){3}[0-9]{1,3}'

# Proposed stricter pattern
strict_pattern = r'(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(?:\.(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}'

# Test cases: (address, should_be_valid)
test_cases = [
    ("192.168.1.1", True),
    ("10.0.0.0", True),
    ("255.255.255.255", True),
    ("0.0.0.0", True),
    ("999.999.999.999", False),
    ("192.168.256.1", False),
    ("192.168.1.256", False),
    ("256.1.1.1", False),
    ("192.168.1", False),
    ("192.168.1.1.1", False),
    ("00.00.00.00", True),  # Technically valid as octet, but semantically odd
    ("01.02.03.04", True),  # Leading zeros are technically matched
]

print("Testing CURRENT pattern: " + current_pattern)
print("-" * 60)
current_results = []
for addr, should_be_valid in test_cases:
    match = re.fullmatch(current_pattern, addr)
    matched = match is not None
    current_results.append((addr, should_be_valid, matched))
    status = "✓" if matched == should_be_valid else "✗ MISMATCH"
    print(f"{addr:20} | Should be valid: {str(should_be_valid):5} | Matched: {str(matched):5} | {status}")

print("\n\nTesting STRICT pattern: " + strict_pattern)
print("-" * 60)
strict_results = []
for addr, should_be_valid in test_cases:
    match = re.fullmatch(strict_pattern, addr)
    matched = match is not None
    strict_results.append((addr, should_be_valid, matched))
    status = "✓" if matched == should_be_valid else "✗ MISMATCH"
    print(f"{addr:20} | Should be valid: {str(should_be_valid):5} | Matched: {str(matched):5} | {status}")

print("\n\nDIFFERENCES:")
print("-" * 60)
for i, (addr, should_be_valid, _) in enumerate(test_cases):
    current_matched = current_results[i][2]
    strict_matched = strict_results[i][2]
    if current_matched != strict_matched:
        print(f"{addr:20} | Current: {str(current_matched):5} | Strict: {str(strict_matched):5}")

The IP pattern is permissive but matches real logs correctly in practice; consider stricter validation only if defense-in-depth is desired.

Lines 61, 64, 76, 79, 90, 93 use the pattern (?:[0-9]{1,3}\.){3}[0-9]{1,3}, which theoretically accepts invalid IPv4 addresses such as 999.999.999.999 or 192.168.256.1. However, the parser has been validated against 620 million real Mikrotik logs with a 100% match rate, indicating the current pattern works correctly for the actual log format produced by Mikrotik.

If stricter validation is desired for defense-in-depth, apply the stricter pattern:

(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(?:\.(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}

Otherwise, the current approach is sufficient given real-world validation results.
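
If range checking is wanted without changing the shared pattern, one option is to leave the regex permissive and validate the captured address in a later step. The sketch below only illustrates that idea in Python, in the same spirit as the verification script above. `is_valid_ipv4` is a hypothetical helper, not part of Fluent Bit or this PR, and where such a check would run in a real pipeline is left as an assumption.

```python
import re

# Same permissive shape check the parser uses for IPv4 addresses.
PERMISSIVE_IPV4 = re.compile(r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}")


def is_valid_ipv4(text: str) -> bool:
    """Return True if text has the parser's dotted shape and every octet is 0-255.

    Hypothetical helper for illustration only.
    """
    if PERMISSIVE_IPV4.fullmatch(text) is None:
        return False
    # The shape check guarantees four all-digit fields, so int() cannot fail.
    return all(int(octet) <= 255 for octet in text.split("."))


if __name__ == "__main__":
    for candidate in ("192.168.1.1", "999.999.999.999", "192.168.256.1", "192.168.1"):
        print(f"{candidate:18} -> {is_valid_ipv4(candidate)}")
```

Like the stricter regex, this check accepts zero-padded octets such as 00.00.00.00; it only rejects octets above 255 and strings that do not have the four-field shape.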

🤖 Prompt for AI Agents
In conf/parsers_mikrotik.yaml around lines 61-66 (and also at lines 64, 76, 79,
90, 93) the IPv4 pattern uses the permissive fragment
(?:[0-9]{1,3}\.){3}[0-9]{1,3} which accepts invalid octets like 999; if you want
stricter defense-in-depth replace each occurrence with a full IPv4 validator
that limits octets to 0-255 (the provided pattern in the review comment)
ensuring you update every listed line consistently; after replacing, run the
parser test suite against sample logs to confirm no false negatives and update
any inline comments to note the stricter validation.

Author

See other comment - standardized on (?:[0-9]{1,3}\.){3}[0-9]{1,3} which is used by other, existing parsers like iptables in parsers_extra.conf.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the IPv4 pattern used in the iptables parser in parsers_extra.conf
echo "=== Checking iptables parser in parsers_extra.conf ==="
fd -e conf "parsers_extra.conf" --exec cat {}

echo -e "\n\n=== Searching for IPv4 patterns in existing parsers ==="
rg -n "\\[0-9\\].*\\..*\\[0-9\\]" conf/ --type yaml --type conf -A 2 -B 2 | head -100

Length of output: 7290


@hlein, thank you for the clarification! I've verified that the iptables parser in parsers_extra.conf does indeed use the same (?:[0-9]{1,3}\.){3}[0-9]{1,3} pattern. Standardizing on this pattern across all parsers makes perfect sense for consistency and maintainability.

Given that your firewall parser has been validated against 620 million real log lines with 100% match rate, and you're following established conventions in the codebase, the current pattern is clearly the right choice here.

🐰💭


✏️ Learnings added
Learnt from: hlein
Repo: fluent/fluent-bit PR: 11199
File: conf/parsers_mikrotik.yaml:61-66
Timestamp: 2025-11-24T23:19:02.561Z
Learning: In Fluent Bit parser configurations, the IPv4 pattern `(?:[0-9]{1,3}\.){3}[0-9]{1,3}` is standardized across parsers (e.g., iptables in parsers_extra.conf, mikrotik parsers) for consistency and maintainability, even though it's permissive and could theoretically match invalid addresses like 999.999.999.999. This is acceptable because the parsers are validated against real-world logs where the input format is controlled by the logging source.


🧠 Learnings used
Learnt from: hlein
Repo: fluent/fluent-bit PR: 11199
File: conf/parsers_mikrotik.yaml:72-98
Timestamp: 2025-11-24T23:14:30.311Z
Learning: The mikrotik-firewall parser regex in conf/parsers_mikrotik.yaml has been validated against 620 million lines of real Mikrotik logs with 100% match rate. The backreferences to optional port groups (\k<src_port> and \k<dst_port>) are correct because Mikrotik log format guarantees port consistency: if source has a port, destination always has a port; if source lacks a port, destination never has one.

Learnt from: hlein
Repo: fluent/fluent-bit PR: 11168
File: conf/parsers_mult.yaml:8-14
Timestamp: 2025-11-16T22:16:26.032Z
Learning: In Fluent Bit parser configurations (both .conf and .yaml formats), the regex engine automatically strips leading and trailing `/` characters from regex patterns, so patterns like `/Processing by .../` and `Processing by ...` are functionally equivalent and both work correctly.
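
The learning about backreferences to optional port groups can be illustrated with a reduced version of the NAT clause. The sketch below is a Python approximation, not the parser itself: Python's `re` spells named groups `(?P<name>...)` and named backreferences `(?P=name)`, whereas the parser uses `(?<name>...)` and `\k<name>`. Only the first NAT alternative (rewritten source) is reproduced, and the sample strings are hypothetical fragments built from the regex structure, not real Mikrotik log lines.

```python
import re
from typing import Optional

IPV4 = r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}"

# Reduced, Python-flavoured port of the address tuple plus the
# "rewritten source" NAT alternative from the firewall parser.
NAT_PATTERN = re.compile(
    rf"""
    (?P<source>{IPV4}) (?: : (?P<src_port>\d+) )?
    ->
    (?P<dest>{IPV4}) (?: : (?P<dst_port>\d+) )?
    ,\s
    NAT\s
    \(
      (?P<nat_source_orig>{IPV4}) (?: : (?P=src_port) )?
      ->
      (?P<nat_source>{IPV4}) (?: : (?P<nat_src_port>\d+) )?
    \)
    ->
    (?P=dest) (?: : (?P=dst_port) )?
    """,
    re.VERBOSE,
)


def parse_nat(fragment: str) -> Optional[dict]:
    """Return the named groups for a fragment, or None if it does not match."""
    match = NAT_PATTERN.fullmatch(fragment)
    return match.groupdict() if match else None


if __name__ == "__main__":
    samples = [
        # Ports present on both sides: the backreferences must repeat them.
        "10.0.0.5:51000->203.0.113.9:443, NAT (10.0.0.5:51000->198.51.100.2:51000)->203.0.113.9:443",
        # No ports at all: the optional wrappers around the backreferences match empty.
        "10.0.0.5->203.0.113.9, NAT (10.0.0.5->198.51.100.2)->203.0.113.9",
        # Original source port differs inside the NAT tuple: no match.
        "10.0.0.5:51000->203.0.113.9:443, NAT (10.0.0.5:51001->198.51.100.2:51000)->203.0.113.9:443",
    ]
    for sample in samples:
        print(parse_nat(sample))
```

A backreference to a group that did not participate fails on its own in Python, so it is the surrounding `(?: ... )?` that lets the port-less case through. That is why the approach relies on ports being either present on both sides or absent on both, as the learning states.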


# NAT logs:
# - parens-tuple for source or dest
# - apparently *both* could be rewritten
# - NAT IPs are not necessarily 1:1
(?:
NAT\s
(?:
\(
(?<nat_source_orig>(?:[0-9]{1,3}\.){3}[0-9]{1,3})
(?: : \k<src_port> )?
->
(?<nat_source>(?:[0-9]{1,3}\.){3}[0-9]{1,3})
(?: : (?<nat_src_port>\d+) )?
\)
->
\k<dest>
(?: : \k<dst_port> )?
|
(?: \k<source> | (?<nat_source_orig>(?:[0-9]{1,3}\.){3}[0-9]{1,3}) )
(?: : \k<src_port> )?
->
\(
(?<nat_dest_orig>(?:[0-9]{1,3}\.){3}[0-9]{1,3})
(?: : (?<nat_dst_port>\d+) )?
->
(?<nat_dest>(?:[0-9]{1,3}\.){3}[0-9]{1,3})
(?: : \k<dst_port> )?
\)
)
, \s
)?

len\s (?<pkt_len>\d+)
types: 'src_port:integer,dst_port:integer,pkt_len:integer'

- name: mikrotik-proxy
# Proxy logs from Mikrotiks
# https://regex101.com/r/C3Odc7/1
format: regex
regex: |
(?x)

(?<ident>web-proxy) ,account \s

(?<host>[^\s]+) \s
(?<method>[^\s]+) \s
(?<url>\S+) \s+

action=(?<proxy_action>\S+)

# Only permitted requests will show a cache status
(?:
\s cache=(?<cache_status>\S+)
(?: \s \([^)]*\) )?
)?
$

- name: mikrotik-dhcp
# DHCP logs from Mikrotiks
# https://regex101.com/r/jBqBIg/1
format: regex
regex: |
(?x)

(?<ident>dhcp) ,info \s

(?:
dhcp-client\s on\s
(?<fw_netname>[-_.A-Za-z0-9]+) \s
(?<dhcp_action>(?: got | lost ) )\s
IP\s address\s
(?<src_ip>(?:[0-9]{1,3}\.){3}[0-9]{1,3})
(?: \s - \s .*) ?
|
(?<fw_netname>[-_.A-Za-z0-9]+) \s
(?<dhcp_action> (?: de )? assigned ) \s
(?<src_ip>(?:[0-9]{1,3}\.){3}[0-9]{1,3}) \s
(?: for | from | to ) \s
(?<macsrc>[A-Fa-f0-9]{2} (?: :[A-Fa-f0-9]{2}){5} )
(?: \s (?<hostname>\S+) )?
)
\s?
$

- name: mikrotik-ovpn
# OpenVPN logs from Mikrotiks
# https://regex101.com/r/hoEBSE/1
format: regex
regex: |
(?x)

ovpn,info\s

(?:
connection\s established\s from\s
(?<src_ip>(?:[0-9]{1,3}\.){3}[0-9]{1,3})
,\s port: \s
(?<src_port>\d+)
\s to \s
(?<dst_ip>(?:[0-9]{1,3}\.){3}[0-9]{1,3})
|
< (?<src_ip>(?:[0-9]{1,3}\.){3}[0-9]{1,3}) >:\s
disconnected
(?: \s <[^>]+>)?
|
(?<vpn_name>[^ :]+) :\s
(?:
initializing\.\.\.
|
connect (?: ed | ing\.\.\. )
|
disconnected
(?: \s <[^>]+>)?
|
terminating\.\.\. .*
|
using\s encoding\s -\s
(?<cipher>[-_/A-Z0-9]+)
)
)

- name: mikrotik-logins
# Login/logout events from Mikrotiks
# https://regex101.com/r/LpmF15/1
format: regex
regex: |
(?x)

system,info,account\s

user\s
(?<user>[^\s]+)
\s logged\s (?: in | out ) \s

(?:
from \s
(?:
(?<source>(?:[0-9]{1,3}\.){3}[0-9]{1,3})
|
(?<srcmac>[A-Fa-f0-9]{2} (?: :[A-Fa-f0-9]{2}){5} )
)
\s
)?

# remote protocol or local
via \s
(?<access_method>.*)

- name: mikrotik-other
# Other logs from Mikrotiks
# https://regex101.com/r/kaPYeX/1
format: regex
regex: |
(?x)

# list of topics obtained from CLI: /system/logging/add topics=<tab>
# include the whole list; as long as this is the last Mikrotik parser,
# more specific ones that matched will have already consumed the message.
(?<ident>
(?:
account |dot1x |l2tp |pptp |store
|acme-client |dude |ldp |ptp |stp
|amt |e-mail |lora |queue |system
|async |error |lte |radius |telephony
|backup |event |manager |radvd |tftp
|bfd |evpn |mme |raw |timer
|bgp |fetch |mpls |read |tr069
|bridge |firewall |mqtt |rip |update
|calc |gps |mvrp |route |upnp
|caps |gsm |natpmp |rpki |ups
|certificate |health |netinstall |rsvp |vpls
|clock |hotspot |netwatch |script |vrrp
|cmr |igmp-proxy |ntp |sertcp |warning
|container |info |ospf |simulator |watchdog
|critical |interface |ovpn |smb |web-proxy
|ddns |ipsec |packet |snmp |wireguard
|debug |iscsi |pim |socksify |wireless
|dhcp |isdn |poe-out |ssh |write
|disk |isis |ppp |sstp |zerotier
|dns |kvm |pppoe |state
)
)
, [^\s]+
\s

# remaining message contents
(?<mikrotik_remainder>.*)