| 1 | +<!-- |
| 2 | + SPDX-FileCopyrightText: Copyright (c) 2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved. |
| 3 | + SPDX-License-Identifier: Apache-2.0 |
| 4 | +--> |
| 5 | + |
| 6 | +# Configure RBAC |
| 7 | + |
| 8 | +````{only} not publish_bsp |
| 9 | +```{contents} |
| 10 | +:depth: 2 |
| 11 | +:backlinks: none |
| 12 | +:local: true |
| 13 | +``` |
| 14 | +```` |
| 15 | + |
| 16 | +## Inject Istio |
| 17 | + |
| 18 | +1. Label the namespace to enable Istio injection. |
| 19 | + |
| 20 | + ```console |
| 21 | + kubectl label namespace <namespace> istio-injection=enabled --overwrite |
| 22 | + ``` |
| 23 | + |
| 24 | + Replace `<namespace>` with your target namespace. |
| 25 | + |
| 26 | +2. Delete the existing pods to recreate them with Istio sidecar containers. |
| 27 | + |
| 28 | + ```console |
| 29 | + kubectl delete pod $(kubectl get pods -n <namespace> --no-headers | awk '{print $1}') -n <namespace> |
| 30 | + ``` |
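
After the pods restart, you can check whether each pod received a sidecar. The following is a minimal sketch, not part of the original procedure. It assumes that each pod ran a single container before injection, so an injected pod reports `2/2` in the `READY` column. The function name `pods_missing_sidecar` and the sample pod names are hypothetical.

```shell
# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin
# and print the names of pods that report fewer than two containers.
# Assumption: each pod had one container before the sidecar was injected.
pods_missing_sidecar() {
  awk '{ split($2, ready, "/"); if (ready[2] + 0 < 2) print $1 }'
}

# Sample input with made-up pod names; the second pod has no sidecar.
printf '%s\n' \
  'sample-nim-0        2/2   Running   0   5m' \
  'sample-keycloak-0   1/1   Running   0   5m' | pods_missing_sidecar
```

On a cluster, pipe the real pod list into the function: `kubectl get pods -n <namespace> --no-headers | pods_missing_sidecar`. Empty output suggests that every pod has a second container.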
| 31 | + |
| 32 | +## Deploy Manifests |
| 33 | + |
| 34 | +1. The following sample manifest deploys a gateway and ingress virtual service. |
| 35 | + |
| 36 | + - Update the target namespace for the virtual service resource. |
| 37 | + - The sample manifest applies to NVIDIA NIM for LLMs. For other NVIDIA microservices, update the `match` and `route` for the microservice endpoints. |
| 38 | + - For information about the microservice endpoints, refer to the following documents: |
| 39 | + - [NIM Inference API Reference](https://docs.nvidia.com/nim/large-language-models/latest/api-reference.html) |
| 40 | + - [NIM Embedding API Reference](https://docs.nvidia.com/nim/nemo-retriever/text-embedding/latest/reference.html) |
| 41 | + - [NIM ReRanking API Reference](https://docs.nvidia.com/nim/nemo-retriever/text-reranking/latest/reference.html) |
| 42 | + |
| 43 | + ```{literalinclude} ./manifests/istio-sample-manifest.yaml |
| 44 | + :language: yaml |
| 45 | + ``` |
| 46 | + |
| 47 | +2. Apply the manifest. |
| 48 | + |
| 49 | + ```console |
| 50 | + kubectl apply -f istio-sample-manifest.yaml |
| 51 | + ``` |
| 52 | + |
| 53 | +3. Determine the Istio ingress gateway node port. |
| 54 | + |
| 55 | + ```console |
| 56 | + kubectl get svc -n istio-system | grep ingress |
| 57 | + ``` |
| 58 | + |
| 59 | + *Example Output* |
| 60 | + |
| 61 | + ```output |
| 62 | + istio-ingressgateway LoadBalancer 10.102.8.149 10.28.234.101 15021:32658/TCP,80:30611/TCP,443:31874/TCP,31400:30160/TCP,15443:32430/TCP 22h |
| 63 | + ``` |
| 64 | + |
| 65 | +4. List the worker IP addresses. |
| 66 | + |
| 67 | + ```console |
| 68 | + for node in `kubectl get nodes | awk '{print $1}' | grep -v NAME`; do echo $node ' ' | tr -d '\n'; kubectl describe node $node | grep -i 'internalIP:' | awk '{print $2}'; done |
| 69 | + ``` |
| 70 | + |
| 71 | + *Example Output* |
| 72 | + |
| 73 | + ```output |
| 74 | + nim-test-cluster-03-worker-nbhk9-56b4b888dd-8lpqd 10.120.199.16 |
| 75 | + nim-test-cluster-03-worker-nbhk9-56b4b888dd-hnrxr 10.120.199.23 |
| 76 | + ``` |
| 77 | + |
| 78 | +5. The following manifest creates request authentication resources. |
| 79 | + |
| 80 | + - Update the target namespace. |
| 81 | + - Modify the issuer in the manifest to use one of the preceding worker IP addresses and the preceding Istio ingress gateway node port that maps to port 80. |
| 82 | + |
| 83 | + ```{literalinclude} ./manifests/requestAuthentication.yaml |
| 84 | + :language: yaml |
| 85 | + ``` |
| 86 | + |
| 87 | +6. Apply the manifest. |
| 88 | + |
| 89 | + ```console |
| 90 | + kubectl apply -f requestAuthentication.yaml |
| 91 | + ``` |
| 92 | + |
| 93 | +7. The following manifest creates an authorization policy resource. |
| 94 | + |
| 95 | + - Update the target namespace. |
| 96 | + - Update the rules that apply to the target microservices. |
| 97 | + |
| 98 | + ```{literalinclude} ./manifests/authorizationPolicy.yaml |
| 99 | + :language: yaml |
| 100 | + ``` |
| 101 | + |
| 102 | +8. Apply the manifest. |
| 103 | + |
| 104 | + ```console |
| 105 | + kubectl apply -f authorizationPolicy.yaml |
| 106 | + ``` |
| 107 | + |
| 108 | +9. Create a token for Keycloak authentication. |
| 109 | + Update the node IP address and ingress gateway node port. |
| 110 | + |
| 111 | + ```console |
| 112 | + TOKEN=$(curl -s -X POST -d "client_id=nvidia-nim" -d "username=nim" -d "password=nvidia123" -d "grant_type=password" "http://10.217.19.114:30611/realms/nvidia-nim-llm/protocol/openid-connect/token" | jq -r .access_token) |
| 113 | + ``` |
| 114 | + |
| 115 | +10. Verify that you can access the microservice through the Istio gateway with the Keycloak token. |
| 116 | + |
| 117 | + ```console |
| 118 | + curl -v -X POST http://10.217.19.114:30611/v1/completions -H "Authorization: Bearer $TOKEN" -H 'accept: application/json' -H 'Content-Type: application/json' -d '{ "model": "llama-2-13b-chat","prompt": "What is Kubernetes?","max_tokens": 16,"temperature": 1, "n": 1, "stream": false, "stop": "string", "frequency_penalty": 0.0 }' |
| 119 | + ``` |
| 120 | + |
| 121 | + Update the node IP address and ingress gateway port. |
| 122 | + Update the model name if it is not `llama-2-13b-chat`. |
| 123 | + |
| 124 | +11. Generate additional traffic so that you can visualize it on the Kiali dashboard in the next step. |
| 125 | + |
| 126 | + ```console |
| 127 | + for i in $(seq 1 100); do curl -X POST http://10.217.19.114:30611/v1/chat/completions -H 'accept: application/json' -H "Authorization: Bearer $TOKEN" -H 'Content-Type: application/json' -d '{"model": "llama-2-13b-chat","messages": [{"role": "system","content": "You are a helpful assistant."},{"role": "user", "content": "Hello!"}]}' -s -o /dev/null; done |
| 128 | + ``` |
| 129 | + |
| 130 | +12. Access the Kiali dashboard, specifying your client system IP address. |
| 131 | + |
| 132 | + ```console |
| 133 | + istioctl dashboard kiali --address <system-ip> |
| 134 | + ``` |
| 135 | + |
| 136 | + In a browser, access the dashboard at the `<system-ip>` address on port `20001`. |
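
Several of the preceding steps combine a worker IP address with the ingress gateway node port that maps to port 80. The following is a minimal sketch of deriving that base URL from the example outputs shown earlier, rather than copying the values by hand. The function names `node_port_for` and `gateway_url` are hypothetical and are not part of any NVIDIA or Istio tooling.

```shell
# Hypothetical helper: print the node port that maps to a service port.
# $1 is the PORT(S) column from `kubectl get svc`; $2 is the service port.
node_port_for() {
  printf '%s\n' "$1" | tr ',' '\n' |
    awk -F'[:/]' -v port="$2" '$1 == port { print $2 }'
}

# Hypothetical helper: build the gateway base URL from a node IP and node port.
gateway_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}

# Values copied from the example outputs in this procedure.
PORTS='15021:32658/TCP,80:30611/TCP,443:31874/TCP,31400:30160/TCP,15443:32430/TCP'
NODE_IP='10.120.199.16'

GATEWAY=$(gateway_url "$NODE_IP" "$(node_port_for "$PORTS" 80)")
echo "$GATEWAY"
```

With the sample values, the script prints `http://10.120.199.16:30611`. You could then substitute `$GATEWAY` for the hard-coded address in the `curl` commands, for example `"$GATEWAY/v1/completions"`.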
| 137 | + |
| 138 | +## Conclusion |
| 139 | + |
| 140 | +This architecture provides a secure and scalable way to deploy NVIDIA NeMo microservices. It combines Istio service mesh capabilities with OIDC authentication through Keycloak, so that requests are authenticated and authorized at the ingress gateway before they reach a microservice. |