Skip to content

Conversation

@alshabib
Copy link
Contributor

@alshabib alshabib commented Sep 4, 2025

No description provided.

@alshabib alshabib requested a review from robshakir September 4, 2025 20:28
@coveralls
Copy link

coveralls commented Sep 4, 2025

Pull Request Test Coverage Report for Build 18838834993

Details

  • 0 of 36 (0.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 0.0%

Changes Missing Coverage Covered Lines Changed/Added Lines %
containerz/containerz.pb.go 0 36 0.0%
Totals Coverage Status
Change from base Build 18826081167: 0.0%
Covered Lines: 0
Relevant Lines: 12467

💛 - Coveralls

Copy link
Member

@robshakir robshakir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a reasonable change to me. No concerns.

Copy link

@LimeHat LimeHat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to describe what happens if a supervisor role changes: do we start/stop containers according to the new roles automatically?

// location defines where a container should run. If a container is started
// on the backup, it is expected that any resources (image, volume, etc)
// be replicated to the backup prior to actually starting the container.
// This parameter cannot be updated, but rather the container needs to be
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the supervisor role changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PRIMARY should always point to the current PRIMARY and same goes for BACKUP.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One alternate here is to have this be a component reference that allows one to specify the name of the component the container should be launched on. It requires knowledge of the component name (potentially a downside because this is not uniform, whereas PRIMARY and BACKUP are), but would be extensible to allow for containers to be launched on other components in the future.

Copy link

@LimeHat LimeHat Sep 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PRIMARY should always point to the current PRIMARY and same goes for BACKUP.

Of course, but the question what happens to the already deployed containers?

Let's say you launch containers foo on primary and bar on backup (with the restart policy=always), and the primary reboots. Then:

  • foo temporary goes offline
  • the control module that hosted bar transitions from backup to primary role.
    • do we stop bar?
    • do we launch foo?
  • the other control module comes back online after the reboot in the backup role
    • do we restart foo (which was initially deployed to run on primary)?
    • do we start bar?

Indeed, the Rob's suggestion would work if your objective is a static placement (instead of a dynamic container placement that follows the role). It really depends on the use case.

From the current description I can't tell any of this.

Copy link
Contributor Author

@alshabib alshabib Sep 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - I think the discussion revolves around what happens when a backup card comes back online. I think we can distinguish to cases, dual and single, where dual mode both cards are available, and single mode where only one card is available.

  • Device is in dual mode: Containers should be placed according to the location parameter.
  • Device is in single mode: All containers should be running on the remaining card.

One nuance may be the ALL location where the two containers where running on both cards, when the device is in single mode there should still only be one instance of the container.

We could introduce another state which could be BACKUP_ONLY meaning that a container can only run on a backup and therefore if there is no backup card available then the container does not run.

PRIMARY_ONLY does not make sense since there is always a primary.

Does this make sense to you? If yes, I will update the spec.

I think Rob's suggestion could be accommodated by adding a COMPONENT state which indicated that a the location will be specified in a field holding the component name.

// Devices to be added to the container.
repeated Device devices = 14;

// Location describes where a resources (image, container, volume) is located.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it describes the image placement (and possibly volume placement)? Aren't the images always synchronized between active and standby?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not specify image placement as images are provided by the Deploy RPC. Images should probably be replicated to the primary and backup all time, but the spec does not mandate when those images should be replicated.

But this comment is not accurate - I updated it to reflect a more accurate description.

@LimeHat
Copy link

LimeHat commented Sep 15, 2025

Another consideration, and I don't know if you already thought about it and whether it requires any modifications to the .proto, is related to networking.

string network = 8;

What we discussed previously was that network (typically) will be set to host, and host should imply the namespace where the request was received.
This host logic won't work for deployments on backup or all. It would be nice to understand what's the plan here (we can move this conversation to gchat/buganizer if that'd be preferred)

@alshabib
Copy link
Contributor Author

Let's have this conversation over chat or a meet and bring it back here. I am not sure I follow the reasoning here.

@alshabib
Copy link
Contributor Author

Another consideration, and I don't know if you already thought about it and whether it requires any modifications to the .proto, is related to networking.

string network = 8;

What we discussed previously was that network (typically) will be set to host, and host should imply the namespace where the request was received.
This host logic won't work for deployments on backup or all. It would be nice to understand what's the plan here (we can move this conversation to gchat/buganizer if that'd be preferred)

I think the meaning of host is to run the container container in the same namespace as the docker daemon. I don't believe the location field has any impact on this.

@alshabib alshabib force-pushed the locations branch 2 times, most recently from 0457a71 to ce173cf Compare October 27, 2025 10:46
@alshabib alshabib requested review from LimeHat and robshakir October 27, 2025 10:46
@alshabib alshabib merged commit 212af31 into openconfig:main Dec 2, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants