packrat ereport storage and snitch implementation #2126

Open
wants to merge 61 commits into
base: master
61 commits
5a626d5
WIP
cbiffle Jun 30, 2025
da365ee
disembreak merge conflict with cosmo
hawkw Jul 1, 2025
7c60074
restart nonce plumbing
hawkw Jul 1, 2025
b3d580e
ringbuf
hawkw Jul 1, 2025
6c40169
actually make CBOR lists
hawkw Jul 2, 2025
0b6406c
reticulating packrat
hawkw Jul 2, 2025
2df36e8
add rng_driver deps
hawkw Jul 3, 2025
9dde194
implement more of packrat, feature-flag ereports
hawkw Jul 3, 2025
7dca1ed
add VPD metadata
hawkw Jul 3, 2025
a0d2954
reticulating app.toml
hawkw Jul 3, 2025
4689314
draw the rest of the owl
hawkw Jul 3, 2025
c4d4530
right-size rx buf, give everyone a snitch
hawkw Jul 3, 2025
31a951e
rm packrat size limits
hawkw Jul 3, 2025
c055af3
untangle rng_driver prio mess on gimletlet
hawkw Jul 3, 2025
3334d72
proper timestampiness
hawkw Jul 3, 2025
db5d6d6
PROPERLY untangle priority mess on gimletlet
hawkw Jul 3, 2025
adaa20d
BEHOLD THE EREPORTULATOR
hawkw Jul 3, 2025
653765f
shut clippy up
hawkw Jul 3, 2025
3dd9f20
always send empty meta map
hawkw Jul 3, 2025
60bb54a
dont ringbuf zerocopy types
hawkw Jul 3, 2025
5f1634b
be smart about not overflowing buf
hawkw Jul 3, 2025
ee7d004
implement limit
hawkw Jul 4, 2025
809fe11
give ereportulator a Hiffy interface
hawkw Jul 5, 2025
90b401d
fix type mismatch for disabled ereport IPCs
hawkw Jul 6, 2025
728dfbb
packrat docs
hawkw Jul 6, 2025
3135cb8
don't crash packrat on internal errors
hawkw Jul 6, 2025
9745992
embiggen packrat ereport stacks
hawkw Jul 6, 2025
9a9befa
reticulating stack sizes
hawkw Jul 6, 2025
b88c6ae
rng to packrat: "don't call us, we'll call you"
hawkw Jul 6, 2025
84f6658
Merge branch 'master' into eliza/snitch-again
hawkw Jul 7, 2025
8eca916
Merge branch 'master' into eliza/snitch-again
hawkw Jul 7, 2025
f98b43f
record in the ringbuf when data is lost
hawkw Jul 7, 2025
4a92c66
include task id and timestamp
hawkw Jul 8, 2025
e0a14ca
ENAs start at 1
hawkw Jul 8, 2025
0629765
encode serial and part numbers as CBOR strings
hawkw Jul 8, 2025
593c204
ena tidiness
hawkw Jul 8, 2025
81497b6
constify CBOR break byte
hawkw Jul 8, 2025
e633749
blargh
hawkw Jul 8, 2025
28b5b87
metadata embetterment
hawkw Jul 8, 2025
d2c77d5
ENAs start at 1 now
hawkw Jul 8, 2025
ff93754
add test reproducing panic when trying to insert loss
hawkw Jul 8, 2025
0b8e1ed
always reserve space for loss
hawkw Jul 8, 2025
9bb7ffa
Revert "always reserve space for loss"
hawkw Jul 8, 2025
22044d1
nicer solution: make the check for loss record space better
hawkw Jul 8, 2025
610b075
these are results now
hawkw Jul 8, 2025
920d9f8
prettier ringbuf
hawkw Jul 9, 2025
32244ee
don't encode any metadata if we don't have VPD
hawkw Jul 9, 2025
1f70fd1
shrink ringbuf a bit
hawkw Jul 9, 2025
f0ff13a
add lots of commentary
hawkw Jul 9, 2025
8e5949c
ereportulator: user-controllable fake VPD
hawkw Jul 9, 2025
f09f283
add ereport stuff to grapefruit
hawkw Jul 9, 2025
864fa57
priority wiggling (ensure sntich is right under net)
hawkw Jul 9, 2025
1aa9cf1
Merge branch 'master' into eliza/snitch-again
hawkw Jul 9, 2025
21057ba
reticulating priorities
hawkw Jul 10, 2025
a3386d7
review suggestions from @mkeeter
hawkw Jul 10, 2025
c76f9d4
oh, that was the wrong error type
hawkw Jul 10, 2025
8951909
make things be constants
hawkw Jul 10, 2025
17ee24e
properly account for number of flushed ereports
hawkw Jul 11, 2025
7543e3b
Merge branch 'master' into eliza/snitch-again
hawkw Jul 11, 2025
76c93af
Merge branch 'master' into eliza/snitch-again
hawkw Jul 14, 2025
a9f2673
Merge branch 'master' into eliza/snitch-again
hawkw Jul 18, 2025
590 changes: 324 additions & 266 deletions Cargo.lock

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions Cargo.toml
@@ -92,6 +92,7 @@ leb128 = { version = "0.2.5", default-features = false }
lpc55-pac = { version = "0.4", default-features = false }
memchr = { version = "2.4", default-features = false }
memoffset = { version = "0.6.5", default-features = false }
minicbor = { version = "0.26.4", default-features = false }
multimap = { version = "0.8.3", default-features = false }
nb = { version = "1", default-features = false }
num = { version = "0.4", default-features = false }
@@ -144,6 +145,7 @@ zip = { version = "0.6", default-features = false, features = ["bzip2", "deflate
attest-data = { git = "https://github.com/oxidecomputer/dice-util", default-features = false, version = "0.4.0" }
dice-mfg-msgs = { git = "https://github.com/oxidecomputer/dice-util", default-features = false, version = "0.2.1" }
gateway-messages = { git = "https://github.com/oxidecomputer/management-gateway-service", default-features = false, features = ["smoltcp"] }
gateway-ereport-messages = { git = "https://github.com/oxidecomputer/management-gateway-service", default-features = false }
gimlet-inspector-protocol = { git = "https://github.com/oxidecomputer/gimlet-inspector-protocol", version = "0.1.0" }
hif = { git = "https://github.com/oxidecomputer/hif", default-features = false }
humpty = { git = "https://github.com/oxidecomputer/humpty", default-features = false, version = "0.1.3" }
32 changes: 31 additions & 1 deletion app/cosmo/base.toml
@@ -122,10 +122,20 @@ notifications = ["i2c1-irq", "i2c2-irq", "i2c3-irq", "i2c4-irq"]
[tasks.packrat]
name = "task-packrat"
priority = 1
stacksize = 1040
start = true
# task-slots is explicitly empty: packrat should not send IPCs!
task-slots = []
-features = ["cosmo"]
+features = ["cosmo", "ereport"]

[tasks.rng_driver]
features = ["h753", "ereport"]
name = "drv-stm32h7-rng"
priority = 6
uses = ["rng"]
start = true
stacksize = 512
task-slots = ["sys", "packrat"]

[tasks.thermal]
name = "task-thermal"
@@ -347,6 +357,17 @@ extern-regions = ["sram1", "sram2", "sram3", "sram4"]
notifications = ["socket"]
features = ["net", "vlan"]

[tasks.snitch]
name = "task-snitch"
# The snitch should have a priority immediately below that of the net task,
# to minimize the number of components that can starve it from resources.
priority = 6
stacksize = 1200
start = true
task-slots = ["net", "packrat"]
features = ["vlan"]
notifications = ["socket"]

[tasks.spd]
name = "task-cosmo-spd"
priority = 7
@@ -1517,6 +1538,15 @@ port = 11113
tx = { packets = 3, bytes = 1024 }
rx = { packets = 3, bytes = 1024 }

[config.net.sockets.ereport]
kind = "udp"
owner = {name = "snitch", notification = "socket"}
port = 57005
tx = { packets = 3, bytes = 1024 }
# v0 ereport requests are always 35B, so just make the buffer exactly
# that size...
rx = { packets = 3, bytes = 35 }

[config.sprot]
# ROT_IRQ (af=0 for GPIO, af=15 when EXTI is implemneted)
rot_irq = { port = "F", pin = 2, af = 0} # XXX can we use EXTI now?
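The `rx` sizing comment in the `ereport` socket config above can be read as a length contract: a v0 request is exactly 35 bytes, so a datagram of any other length is not a valid v0 request. A minimal Rust sketch of that check follows. The constant name and `as_v0_request` are hypothetical and do not come from this PR; the actual snitch task may parse and reject datagrams differently.

```rust
use std::convert::TryFrom;

/// Length of a v0 ereport request, per the comment in the socket config.
const EREPORT_REQUEST_V0_LEN: usize = 35;

/// Hypothetical helper (not from the PR): view a datagram as a v0 request
/// only if it is exactly `EREPORT_REQUEST_V0_LEN` bytes long.
fn as_v0_request(datagram: &[u8]) -> Option<&[u8; EREPORT_REQUEST_V0_LEN]> {
    // Converting a slice to a fixed-size array reference fails unless the
    // lengths match exactly, which is the check we want.
    <&[u8; EREPORT_REQUEST_V0_LEN]>::try_from(datagram).ok()
}

fn main() {
    let exact = [0u8; EREPORT_REQUEST_V0_LEN];
    let short = [0u8; 34];
    let long = [0u8; 36];

    assert!(as_v0_request(&exact).is_some());
    assert!(as_v0_request(&short).is_none());
    assert!(as_v0_request(&long).is_none());
}
```

With a fixed-length request, an exact-size check like this is enough to reject truncated or oversized input before any field is decoded.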
1 change: 0 additions & 1 deletion app/demo-stm32h7-nucleo/app-h753.toml
@@ -71,7 +71,6 @@ task-slots = ["sys"]
[tasks.packrat]
name = "task-packrat"
priority = 2
-max-sizes = {flash = 8192, ram = 2048}
start = true
# task-slots is explicitly empty: packrat should not send IPCs!
task-slots = []
32 changes: 31 additions & 1 deletion app/gimlet/base.toml
@@ -123,10 +123,11 @@ notifications = ["i2c1-irq", "jefe-state-change"]
[tasks.packrat]
name = "task-packrat"
priority = 1
stacksize = 1040
start = true
# task-slots is explicitly empty: packrat should not send IPCs!
task-slots = []
-features = ["gimlet"]
+features = ["gimlet", "ereport"]

[tasks.thermal]
name = "task-thermal"
@@ -194,6 +195,15 @@ interrupts = {"hash.irq" = "hash-irq"}
task-slots = ["sys"]
notifications = ["hash-irq"]

[tasks.rng_driver]
features = ["h753", "ereport"]
name = "drv-stm32h7-rng"
priority = 6
uses = ["rng"]
start = true
stacksize = 512
task-slots = ["sys", "packrat"]

[tasks.hf]
name = "drv-gimlet-hf-server"
features = ["h753"]
@@ -336,6 +346,17 @@ extern-regions = ["sram1", "sram2", "sram3", "sram4"]
notifications = ["socket"]
features = ["net", "vlan"]

[tasks.snitch]
name = "task-snitch"
# The snitch should have a priority immediately below that of the net task,
# to minimize the number of components that can starve it from resources.
priority = 6
stacksize = 1200
start = true
task-slots = ["net", "packrat"]
features = ["vlan"]
notifications = ["socket"]

[tasks.sbrmi]
name = "drv-sbrmi"
priority = 4
@@ -1315,6 +1336,15 @@ port = 23547
tx = { packets = 3, bytes = 1024 }
rx = { packets = 3, bytes = 512 }

[config.net.sockets.ereport]
kind = "udp"
owner = {name = "snitch", notification = "socket"}
port = 57005
tx = { packets = 3, bytes = 1024 }
# v0 ereport requests are always 35B, so just make the buffer exactly
# that size...
rx = { packets = 3, bytes = 35 }

[config.sprot]
# ROT_IRQ (af=0 for GPIO, af=15 when EXTI is implemneted)
rot_irq = { port = "E", pin = 3, af = 0}
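The priority comment on `tasks.snitch` and the `task-slots` lists above can be sanity-checked mechanically. The sketch below assumes the Hubris convention that a numerically lower `priority` is more important, and that a task should only hold task-slots for tasks that are strictly more important than itself. `Task` and `priority_violations` are hypothetical names, and the `net` priority of 5 is an assumed value that is not shown in this diff; the other values are taken from the gimlet config above.

```rust
/// A task as described in an app TOML: name, priority, and task-slots.
struct Task {
    name: &'static str,
    priority: u8,
    task_slots: &'static [&'static str],
}

/// Returns `(client, server)` pairs where the client's priority number is
/// not strictly greater than the server's. Slots that name a task missing
/// from `tasks` are skipped.
fn priority_violations(tasks: &[Task]) -> Vec<(&'static str, &'static str)> {
    let mut out = Vec::new();
    for client in tasks {
        for slot in client.task_slots {
            if let Some(server) = tasks.iter().find(|t| t.name == *slot) {
                if server.priority >= client.priority {
                    out.push((client.name, server.name));
                }
            }
        }
    }
    out
}

fn main() {
    let tasks = [
        Task { name: "packrat", priority: 1, task_slots: &[] },
        // Assumed value: the net task's priority is not shown in this diff.
        Task { name: "net", priority: 5, task_slots: &[] },
        Task { name: "snitch", priority: 6, task_slots: &["net", "packrat"] },
        // "sys" is not listed here, so that slot is skipped by the check.
        Task { name: "rng_driver", priority: 6, task_slots: &["sys", "packrat"] },
    ];
    assert!(priority_violations(&tasks).is_empty());
}
```

Under these assumptions, giving the snitch the same priority number as `net` would be reported as a violation, which is consistent with the comment asking for a priority immediately below the net task rather than equal to it.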
31 changes: 31 additions & 0 deletions app/gimletlet/app-ereportlet.toml
@@ -0,0 +1,31 @@
# Gimletlet Ereport test application
[Review comment, Collaborator] I don't feel strongly about this, but I'd be fine having this be part of the basic gimletlet image, since it's additive.
[Reply, Member Author] Yeah. I was thinking we might not always want it to save on flash, but that seems like a problem that could be solved when it actually becomes a problem!
#
# This image includes the `ereportulator` task, which may be used to generate
# fake error reports to test the ereport aggregation and evacuation subsystem.
#
# Ereports may be generated using `humility hiffy` to call the
# `Ereportulator.fake_ereport` IPC operation. This takes one argument, `n`,
# which is an arbitrary `u32` value included in the ereport data payload. This
# is intended to be used to differentiate between multiple ereports during
# testing.
#
# For example:
#
# $ humility hiffy -t gimletlet hiffy -c Ereportulator.fake_ereport -a n=42
#
name = "gimletlet-ereportlet"
inherit = "app.toml"

[tasks.jefe.config.allowed-callers]
request_reset = ["hiffy"]

[tasks.hiffy]
features = ["h753", "stm32h7", "i2c", "gpio", "spi"]
task-slots = ["sys", "i2c_driver", "user_leds", "ereportulator"]

[tasks.ereportulator]
name = "task-ereportulator"
priority = 5
start = true
task-slots = ["packrat"]
notifications = []
29 changes: 27 additions & 2 deletions app/gimletlet/app.toml
@@ -42,11 +42,12 @@ owner = {name = "sprot", notification = "rot_irq"}

[tasks.packrat]
name = "task-packrat"
-priority = 3
-max-sizes = {flash = 8192, ram = 2048}
+priority = 1
start = true
# task-slots is explicitly empty: packrat should not send IPCs!
task-slots = []
stacksize = 1040
features = ["ereport"]

[tasks.control_plane_agent]
name = "task-control-plane-agent"
@@ -185,6 +186,21 @@ task-slots = ["net", "packrat"]
features = ["vlan"]
notifications = ["socket"]

[tasks.snitch]
name = "task-snitch"
# The snitch should have a priority immediately below that of the net task,
# to minimize the number of components that can starve it from resources.
priority = 4
stacksize = 1200
start = true
task-slots = ["net", "packrat"]
features = ["vlan"]
notifications = ["socket"]

[tasks.rng_driver]
features = ["h753", "ereport"]
task-slots = ["sys", "user_leds", "packrat"]

# VLAN configuration
[config.net.vlans.sidecar1]
vid = 0x301
@@ -233,6 +249,15 @@ port = 11113
tx = { packets = 3, bytes = 1024 }
rx = { packets = 3, bytes = 1024 }

[config.net.sockets.ereport]
kind = "udp"
owner = {name = "snitch", notification = "socket"}
port = 57005
tx = { packets = 3, bytes = 1024 }
# v0 ereport requests are always 35B, so just make the buffer exactly
# that size...
rx = { packets = 3, bytes = 35 }

[tasks.sprot]
name = "drv-stm32h7-sprot-server"
priority = 5
40 changes: 40 additions & 0 deletions app/grapefruit/app-dev.toml
@@ -6,3 +6,43 @@ inherit = "base.toml"
features = ["uart8"]
uses = ["uart8"]
interrupts = {"uart8.irq" = "usart-irq"}

# Ereport stuff
[tasks.packrat]
stacksize = 1040
features = ["ereport"]

[tasks.snitch]
name = "task-snitch"
priority = 4
stacksize = 1200
start = true
task-slots = ["net", "packrat"]
features = ["vlan"]
notifications = ["socket"]

[tasks.rng_driver]
features = ["h753", "ereport"]
name = "drv-stm32h7-rng"
priority = 6
uses = ["rng"]
start = true
stacksize = 512
task-slots = ["sys", "packrat"]

# Demo/test task for ereports
[tasks.ereportulator]
name = "task-ereportulator"
priority = 6
start = true
task-slots = ["packrat"]
notifications = []

[config.net.sockets.ereport]
kind = "udp"
owner = {name = "snitch", notification = "socket"}
port = 57005
tx = { packets = 3, bytes = 1024 }
# v0 ereport requests are always 35B, so just make the buffer exactly
# that size...
rx = { packets = 3, bytes = 35 }
3 changes: 1 addition & 2 deletions app/grapefruit/base.toml
@@ -107,8 +107,7 @@ notifications = ["timer"]

[tasks.packrat]
name = "task-packrat"
-priority = 3
-max-sizes = {flash = 8192, ram = 2048}
+priority = 1
start = true
# task-slots is explicitly empty: packrat should not send IPCs!
task-slots = []
33 changes: 31 additions & 2 deletions app/psc/base.toml
@@ -90,6 +90,14 @@ owner = {name = "sequencer", notification = "psu_pwr_ok_6"}



[tasks.rng_driver]
features = ["h753", "ereport"]
name = "drv-stm32h7-rng"
priority = 6
uses = ["rng"]
start = true
stacksize = 512
task-slots = ["sys", "packrat"]

[tasks.i2c_driver]
name = "drv-stm32xx-i2c-server"
@@ -109,11 +117,12 @@ notifications = ["i2c2-irq", "i2c3-irq"]

[tasks.packrat]
name = "task-packrat"
-priority = 3
-max-sizes = {flash = 8192, ram = 2048}
+priority = 1
stacksize = 1040
start = true
# task-slots is explicitly empty: packrat should not send IPCs!
task-slots = []
features = ["ereport"]

[tasks.sequencer]
name = "drv-psc-seq-server"
@@ -313,6 +322,17 @@ extern-regions = [ "sram1", "sram2", "sram3", "sram4" ]
notifications = ["socket"]
features = ["net", "vlan"]

[tasks.snitch]
name = "task-snitch"
# The snitch should have a priority immediately below that of the net task,
# to minimize the number of components that can starve it from resources.
priority = 5
stacksize = 1200
start = true
task-slots = ["net", "packrat"]
features = ["vlan"]
notifications = ["socket"]

[tasks.idle]
name = "task-idle"
priority = 7
@@ -535,3 +555,12 @@ owner = {name = "dump_agent", notification = "socket"}
port = 11113
tx = { packets = 3, bytes = 1024 }
rx = { packets = 3, bytes = 1024 }

[config.net.sockets.ereport]
kind = "udp"
owner = {name = "snitch", notification = "socket"}
port = 57005
tx = { packets = 3, bytes = 1024 }
# v0 ereport requests are always 35B, so just make the buffer exactly
# that size...
rx = { packets = 3, bytes = 35 }
25 changes: 23 additions & 2 deletions app/sidecar/base.toml
@@ -256,11 +256,12 @@ notifications = ["socket", "timer"]

[tasks.packrat]
name = "task-packrat"
-priority = 3
-max-sizes = {flash = 8192, ram = 2048}
+priority = 1
stacksize = 1040
start = true
# task-slots is explicitly empty: packrat should not send IPCs!
task-slots = []
features = ["ereport"]

[tasks.sequencer]
name = "drv-sidecar-seq-server"
@@ -333,6 +334,17 @@ extern-regions = [ "sram1", "sram2", "sram3", "sram4" ]
notifications = ["socket"]
features = ["net", "vlan"]

[tasks.snitch]
name = "task-snitch"
# The snitch should have a priority immediately below that of the net task,
# to minimize the number of components that can starve it from resources.
priority = 6
stacksize = 1200
start = true
task-slots = ["net", "packrat"]
features = ["vlan"]
notifications = ["socket"]

[tasks.idle]
name = "task-idle"
priority = 8
@@ -1163,6 +1175,15 @@ port = 11112
tx = { packets = 3, bytes = 2048 }
rx = { packets = 3, bytes = 2048 }

[config.net.sockets.ereport]
kind = "udp"
owner = {name = "snitch", notification = "socket"}
port = 57005
tx = { packets = 3, bytes = 1024 }
# v0 ereport requests are always 35B, so just make the buffer exactly
# that size...
rx = { packets = 3, bytes = 35 }

[config.auxflash]
memory-size = 33_554_432 # 256 Mib / 32 MiB
slot-count = 16 # 2 MiB slots