Troubleshooting guide #435

@lentzi90

Description

We should add a troubleshooting guide with the most common issues and solutions. Since we integrate with Cluster API, it also makes sense to link to their troubleshooting page, and if possible give some guidance on when an issue is with CAPI and when it is with Metal3.

Things to include

BMO/Ironic/IPA:

  • How to verify that BMO and Ironic are operational, to rule out configuration errors (tracked in "Troubleshooting: Verify BMO and Ironic healthy", #490)
    • Containers should be running, not restarting. Restarts could indicate that BMO cannot connect to Ironic.
    • Ironic should not be "waiting for IP". Check Ironic logs!
  • What do inspection errors look like, and what are the likely causes and solutions?
    • BMC credentials could be wrong or missing. These issues should show up in the BareMetalHost status or as events (to be confirmed)
    • The host is not able to communicate back results to Ironic (this will result in a timeout). Access to serial logs is needed to determine the exact issue in these cases.
    • Incompatible configuration, for example an attempt to use virtual media or UEFI where it is not supported. This should show up in the BareMetalHost status or as events (to be confirmed).
  • Provisioning errors
    • Image errors (wrong checksum, missing image, image too large to decompress or too large for disk).
      This should show up in the BMH status (to be confirmed)
    • No root device found
      This should show up in the BMH status (to be confirmed)
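
The checks in the list above can be sketched as a small script. This is a rough illustration, not a vetted diagnostic: it assumes the input is the decoded JSON from `kubectl get pods -n <namespace> -o json` and `kubectl get baremetalhosts -A -o json`, and that BareMetalHost errors appear in `status.errorType` and `status.errorMessage` (which this issue still marks as "to be confirmed"). The function names are hypothetical.

```python
def unhealthy_containers(pod_list, max_restarts=0):
    """List (pod, container, reason) for containers that look unhealthy.

    A container is reported if it is waiting (for example CrashLoopBackOff),
    has restarted more than max_restarts times, or is not ready.
    """
    problems = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        for cs in pod.get("status", {}).get("containerStatuses", []):
            waiting = cs.get("state", {}).get("waiting")
            restarts = cs.get("restartCount", 0)
            if waiting is not None:
                problems.append((pod_name, cs["name"], waiting.get("reason", "waiting")))
            elif restarts > max_restarts:
                problems.append((pod_name, cs["name"], f"restarted {restarts} times"))
            elif not cs.get("ready", False):
                problems.append((pod_name, cs["name"], "not ready"))
    return problems


def bmh_errors(bmh_list):
    """List (host, provisioning state, error type, error message) for hosts with an error.

    Assumption: a non-empty status.errorMessage means the host is in an error state.
    """
    errors = []
    for bmh in bmh_list.get("items", []):
        status = bmh.get("status", {})
        message = status.get("errorMessage", "")
        if message:
            errors.append((
                bmh["metadata"]["name"],
                status.get("provisioning", {}).get("state", ""),
                status.get("errorType", ""),
                message,
            ))
    return errors
```

A restarting BMO container or a not-ready Ironic container is only a starting point. The container logs are still needed to tell, for example, whether Ironic is waiting for an IP. Timeouts where the host cannot report back to Ironic may not surface here at all and need serial logs.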

CAPM3/IPAM:

  • No BareMetalHost available/matching. This should show as an event when describing the Metal3Machine (to be confirmed)
  • Provider ID missing. This can happen if noCloudProvider is set to false on the Metal3Cluster when no external cloud provider is used.
  • nodeRef missing. This is really a CAPI-level issue. It could be caused by a failure to boot the image or a failure to join the node to the cluster. Access to the node or its serial logs is needed to determine the cause. The cloud-final logs are of particular interest.
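
As a rough sketch of how the last two items could be triaged, the function below inspects a Cluster API Machine object decoded from `kubectl get machine <name> -o json`. It assumes the provider ID lives in `spec.providerID` and the node reference in `status.nodeRef`. The function name and the hint texts are illustrative only.

```python
def machine_findings(machine):
    """Return troubleshooting hints for a Cluster API Machine given as a dict."""
    findings = []
    # Assumption: an absent or empty spec.providerID means no provider ID was set.
    if not machine.get("spec", {}).get("providerID"):
        findings.append(
            "providerID missing: check noCloudProvider on the Metal3Cluster"
        )
    # Assumption: an absent status.nodeRef means the node never joined the cluster.
    if not machine.get("status", {}).get("nodeRef"):
        findings.append(
            "nodeRef missing: check serial logs and cloud-final logs on the node"
        )
    return findings
```

An empty result only means these two fields are populated. It does not rule out other CAPI or Metal3 problems.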

Metadata

Labels

  • help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
  • lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
  • triage/accepted: Indicates an issue is ready to be actively worked on.
