Issue with GPU sharing on NVIDIA L40S #522

@barbarian23

What happened?

Hi,
I am using the KAI scheduler on my EKS cluster. My instance type is g6e, which uses the NVIDIA L40S.
I would like to use GPU sharing to split the GPU resource in half so that 2 AI agent pods can run on one node.
I configured the queues like this:

apiVersion: scheduling.run.ai/v2
kind: Queue
metadata:
  name: default
spec:
  resources:
    cpu:
      quota: -1
      limit: -1
      overQuotaWeight: 1
    gpu:
      quota: -1
      limit: -1
      overQuotaWeight: 1
    memory:
      quota: -1
      limit: -1
      overQuotaWeight: 1
---
apiVersion: scheduling.run.ai/v2
kind: Queue
metadata:
  name: test
spec:
  parentQueue: default
  resources:
    cpu:
      quota: -1
      limit: -1
      overQuotaWeight: 1
    gpu:
      quota: -1
      limit: -1
      overQuotaWeight: 1
    memory:
      quota: -1
      limit: -1
      overQuotaWeight: 1

This is my test pod:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-sharing-01
  labels:
    kai.scheduler/queue: test
  annotations:
    gpu-fraction: "0.5"
spec:
  schedulerName: kai-scheduler
  tolerations:     
    - effect: NoSchedule
      key: ai-type
      operator: Equal
      value: strong
    - effect: NoExecute
      key: ai-type
      operator: Equal
      value: strong
    - effect: NoSchedule  
      key: nvidia.com/gpu
      value: "true"
  containers:
    - name: ubuntu
      image: ubuntu
      args: ["sleep", "infinity"]      

But the pod gets stuck in the Pending state. When I describe the pod, I get this error:

  ERROR   Reached timeout while waiting for GPU reservation pod to be allocated   {"controller": "bindrequest", "controllerGroup": "scheduling.run.ai", "controllerKind": "BindRequest", "BindRequest": {"name":"gpu-sharing-01","namespace":"test"}, "namespace": "test", "name": "gpu-sharing-01", "reconcileID": "41b3665e-0a46-4730-b84a-abf0da9bd1ed", "nodeName": "ip-10-0-3-95.eu-central-1.compute.internal", "name": "gpu-reservation-ip-10-0-3-95.eu-central-1.compute.internal-4vhm5", "error": "timeout"}
github.com/NVIDIA/KAI-scheduler/pkg/binder/binding/resourcereservation.(*service).waitForGPUReservationPodAllocation
        /local/pkg/binder/binding/resourcereservation/resource_reservation.go:424
github.com/NVIDIA/KAI-scheduler/pkg/binder/binding/resourcereservation.(*service).createGPUReservationPodAndGetIndex
        /local/pkg/binder/binding/resourcereservation/resource_reservation.go:333
github.com/NVIDIA/KAI-scheduler/pkg/binder/binding/resourcereservation.(*service).acquireGPUIndexByGroup
        /local/pkg/binder/binding/resourcereservation/resource_reservation.go:299
github.com/NVIDIA/KAI-scheduler/pkg/binder/binding/resourcereservation.(*service).ReserveGpuDevice
        /local/pkg/binder/binding/resourcereservation/resource_reservation.go:209
github.com/NVIDIA/KAI-scheduler/pkg/binder/binding.(*Binder).reserveGPUs
        /local/pkg/binder/binding/binder.go:101
github.com/NVIDIA/KAI-scheduler/pkg/binder/binding.(*Binder).Bind
        /local/pkg/binder/binding/binder.go:53
github.com/NVIDIA/KAI-scheduler/pkg/binder/controllers.(*BindRequestReconciler).Reconcile
        /local/pkg/binder/controllers/bindrequest_controller.go:155
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:119
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:334
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:294
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:255

Does GPU sharing only work for GPUs that support MIG?
If the answer to the above question is yes, could you suggest a solution for this case? We don't have enough budget to use higher-end NVIDIA GPUs like the H100 or H200.

What did you expect to happen?

We are using the NVIDIA L40S and expect to deploy 2 pods onto one node.

Environment

  • Kubernetes version: v1.33.1
  • KAI Scheduler version: v0.9.2
  • Cloud provider or hardware configuration: AWS Elastic Kubernetes Service
  • Tools that you are using KAI together with: Helm, Nvidia GPU-Operator
  • Anything else that is relevant: None
