Skip to content

Conversation

junpengdev
Copy link
Member

Issue #, if available:

Description of changes:

In this PR, I am setting up hybrid nodes for nvidia test. It doesn't belong to add-on test as the nodes require specialized hardware. I modified InstanceType func in NodeadmOS interface to accept gpuInstance flag. When this flag is set to true, it choose instance type among g4dn.xlarge, g4dn.2xlarge, g5g.xlarge, g5g.2xlarge, based on architecture and instance size.

Testing (if applicable):

Documentation added/planned (if applicable):

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@@ -94,8 +105,16 @@ func getAmiIDFromSSM(ctx context.Context, client *ssm.Client, amiName string) (*
}

// an unknown size and arch combination is a coding error, so we panic
func getInstanceTypeFromRegionAndArch(_ string, arch architecture, instanceSize e2e.InstanceSize) string {
instanceType, ok := instanceSizeToType[arch][instanceSize]
func getInstanceTypeFromRegionAndArch(_ string, arch architecture, instanceSize e2e.InstanceSize, gpuInstance bool) string {
Copy link
Member

@tatlat tatlat Jun 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we avoid using boolean arguments? It can make code less readable and less extensible. An enum approach might be better. Maybe something like this:

const (
    StandardInstance InstanceCategory = iota
    GPUInstance
)

func getInstanceTypeFromRegionAndArch(_ string, arch architecture, instanceSize e2e.InstanceSize, category InstanceCategory)

@@ -98,7 +98,7 @@ var _ = Describe("Hybrid Nodes", func() {
k8sVersion = test.OverrideNodeK8sVersion
}

testNode := test.NewTestNode(ctx, instanceName, nodeName, k8sVersion, nodeOS, provider, e2e.Large)
testNode := test.NewTestNode(ctx, instanceName, nodeName, k8sVersion, nodeOS, provider, e2e.Large, false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I meant by less readable code. I don't know what "false" means here.

@junpengdev junpengdev force-pushed the setup_nvidia_test branch from 23d209e to fd3e73a Compare June 6, 2025 17:58
tatlat
tatlat previously approved these changes Jun 9, 2025
@junpengdev junpengdev merged commit d9944d8 into aws:main Jun 10, 2025
6 checks passed
@junpengdev junpengdev deleted the setup_nvidia_test branch July 31, 2025 18:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants