Commit f4c8a81

Merge pull request #4 from redhat-gpte-devopsautomation/main
update
2 parents 45ff4cb + 548f7e8 commit f4c8a81

6 files changed

+154
-0
lines changed


content/modules/ROOT/nav.adoc

Lines changed: 4 additions & 0 deletions
@@ -24,3 +24,7 @@
 
 * xref:13-AI-model-exploration.adoc[13. AI Model Exploration]
 
+* xref:14-AI-bring-your-own-model.adoc[14. Bring your own model]
+
+* xref:15-troubleshooting.adoc[15. Troubleshooting]
+
Lines changed: 136 additions & 0 deletions
@@ -0,0 +1,136 @@
== AI Bring your own model

These steps cover downloading a model from Hugging Face and uploading it into MinIO, where it can be served via RHOAI's vLLM.

Find your desired model on Hugging Face. Models with about 8B parameters or fewer have a better chance of "fitting" in the available GPU memory.

Also, some models require approval on huggingface.co; make sure to apply and get approved if that is required.

```
open https://huggingface.co/Qwen/Qwen2.5-7B-Instruct
```

Install the Hugging Face CLI:

```
brew install huggingface-cli
```

Then install the MinIO CLI:

```
brew install minio/stable/mc
```

You will also need `oc`, and you must be logged in as an admin.

Decide where you plan to download the files:

```
cd /Users/burr/my-projects/models
```

```
huggingface-cli download Qwen/Qwen2.5-7B-Instruct --local-dir Qwen2.5-7B-Instruct
```

Wait for the download to finish, then list the files:

```
ls -la Qwen2.5-7B-Instruct
```

```
total 29792296
drwxr-xr-x 17 burr staff 544 Sep 23 17:59 .
drwxr-xr-x 7 burr staff 224 Sep 23 17:59 ..
drwxr-xr-x 3 burr staff 96 Sep 23 17:54 .cache
-rw-r--r-- 1 burr staff 1519 Sep 23 17:54 .gitattributes
-rw-r--r-- 1 burr staff 11343 Sep 23 17:54 LICENSE
-rw-r--r-- 1 burr staff 5978 Sep 23 17:54 README.md
-rw-r--r-- 1 burr staff 663 Sep 23 17:54 config.json
-rw-r--r-- 1 burr staff 243 Sep 23 17:54 generation_config.json
-rw-r--r-- 1 burr staff 1671839 Sep 23 17:54 merges.txt
-rw-r--r-- 1 burr staff 3945441440 Sep 23 17:58 model-00001-of-00004.safetensors
-rw-r--r-- 1 burr staff 3864726352 Sep 23 17:57 model-00002-of-00004.safetensors
-rw-r--r-- 1 burr staff 3864726424 Sep 23 17:57 model-00003-of-00004.safetensors
-rw-r--r-- 1 burr staff 3556377672 Sep 23 17:59 model-00004-of-00004.safetensors
-rw-r--r-- 1 burr staff 27752 Sep 23 17:54 model.safetensors.index.json
-rw-r--r-- 1 burr staff 7031645 Sep 23 17:54 tokenizer.json
-rw-r--r-- 1 burr staff 7305 Sep 23 17:54 tokenizer_config.json
-rw-r--r-- 1 burr staff 2776833 Sep 23 17:54 vocab.json
```
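
Before uploading, it can help to confirm that every weight shard arrived. The sketch below assumes the shards follow the `model-NNNNN-of-MMMMM.safetensors` naming seen in the listing; `check_shards` is a hypothetical helper, not part of `huggingface-cli`, and it only counts files rather than verifying their sizes or checksums.

```shell
# Hypothetical helper: compare the number of shard files in a directory with
# the total encoded in their names (model-00001-of-00004.safetensors -> 4).
check_shards() {
  dir=$1
  # Count the shard files that are actually present.
  actual=$(find "$dir" -maxdepth 1 -name 'model-*-of-*.safetensors' | wc -l | tr -d ' ')
  if [ "$actual" -eq 0 ]; then
    echo "no shards found in $dir"
    return 1
  fi
  # Read the expected total from any one shard name, dropping leading zeros.
  sample=$(find "$dir" -maxdepth 1 -name 'model-*-of-*.safetensors' | head -n 1)
  expected=$(basename "$sample" | sed -E 's/^model-[0-9]+-of-0*([0-9]+)\.safetensors$/\1/')
  if [ "$actual" -eq "$expected" ]; then
    echo "all $expected shards present"
  else
    echo "found $actual of $expected shards"
    return 1
  fi
}
```

For the download in this guide, the call would be `check_shards Qwen2.5-7B-Instruct`.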

Now upload to your cluster's MinIO:

```
oc project ic-shared-minio
```

```
MINIO_API="https://$(oc get route minio -o jsonpath='{.spec.host}')"
```

```
# Secret values are stored base64-encoded, so decode them after reading
B64_USER=$(kubectl get secret minio-keys -o jsonpath='{.data.minio_root_user}' -n ic-shared-minio)
MINIO_USER=$(echo "$B64_USER" | base64 --decode)
echo "user:$B64_USER is decoded as $MINIO_USER"
```

```
B64_PASSWORD=$(kubectl get secret minio-keys -o jsonpath='{.data.minio_root_password}' -n ic-shared-minio)
MINIO_PASSWORD=$(echo "$B64_PASSWORD" | base64 --decode)
echo "password:$B64_PASSWORD is decoded as $MINIO_PASSWORD"
```
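
Kubernetes stores Secret values base64-encoded, which is why each value is piped through `base64 --decode`. A minimal sketch of that step as a reusable function follows; `decode_b64` is a hypothetical name, and the sample string is simply the encoding of the word `minio`, not a real credential.

```shell
# Hypothetical helper: decode one base64-encoded Secret value.
# printf '%s' passes the value through unchanged, without the trailing
# newline or option parsing that echo can introduce.
decode_b64() {
  printf '%s' "$1" | base64 --decode
}

# Example with a harmless sample value (the base64 encoding of "minio").
decode_b64 "bWluaW8="
```

With this helper, the user lookup could be written as `MINIO_USER=$(decode_b64 "$B64_USER")`, which also avoids printing the decoded value to the terminal.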

```
mc alias set minio "$MINIO_API" "$MINIO_USER" "$MINIO_PASSWORD"
```

```
mc ls minio/models
```

Rename the model directory. The demo's templates use the name in MinIO as the model name, and the original directory name is rejected with this message: "a InferenceService name must consist of lower case alphanumeric characters or ‘-’, and must start with alphabetical character."

```
mv Qwen2.5-7B-Instruct qwen257binstruct
```
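
The new name can also be derived mechanically. The sketch below lowercases the directory name and drops every character that is not a letter or digit, which reproduces `qwen257binstruct`; `to_model_name` is a hypothetical helper, and dropping `-` (which the naming rule allows) is a choice made only to match the name used in this guide.

```shell
# Hypothetical helper: turn a model directory name into a string that
# satisfies the InferenceService naming rule quoted in this guide.
to_model_name() {
  # Lowercase, then delete everything except a-z and 0-9.
  name=$(printf '%s' "$1" | LC_ALL=C tr '[:upper:]' '[:lower:]' | LC_ALL=C tr -cd 'a-z0-9')
  case $name in
    # The rule also requires the name to start with a letter.
    [a-z]*) printf '%s\n' "$name" ;;
    *) echo "name must start with a letter: '$name'" >&2; return 1 ;;
  esac
}
```

For example, `mv Qwen2.5-7B-Instruct "$(to_model_name Qwen2.5-7B-Instruct)"` performs the same rename as the `mv` command in this guide.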

```
mc cp --recursive qwen257binstruct minio/models/
```

Wait for the upload to finish; the model files total roughly 15 GB. Then list the bucket again:

```
mc ls minio/models
```

You can also check the result in the MinIO GUI:

image::bring-your-own-model-0.png[]

Add the new model to your favorite template.

You can edit `rhoai-on-rhdh-templates` -> `scaffolder-templates` -> `chatbot-self-hosted-llm-template` -> `template.yaml`.

Use the same string, `qwen257binstruct`, because that is the model's name in MinIO and the demo's templates assume that the model name matches the name in MinIO.

image::bring-your-own-model-1.png[]

Now run the wizard:

image::bring-your-own-model-2.png[]

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
== Troubleshooting

=== Cluster Sleep/Wake

When ordering this demo from demo.redhat.com, your cluster will be woken up from a *stopped* state. Once startup is complete, wait around a further 15 minutes for the cluster to fully initialize. Occasionally, certain components in the cluster do not initialize in a healthy state, and the following issues occur:

==== Webhook not triggering pipeline run

You will notice that source code changes, tagging, and releasing do not trigger a pipeline. To confirm this issue, navigate to your software component's *xxx-dev* project, go to *Workloads -> Pods*, and locate the *event listener* pod with the name *el-<component>-el\**. On the *Logs* tab, you should see a certificate error while communicating with the *tekton-triggers-core-interceptors* service. To fix this issue, switch to the *openshift-pipelines* project and go to *Workloads -> Pods*. Find the pod *tekton-triggers-core-interceptors-** and delete it. The pod should be recreated automatically, which should resolve the issue. Attempt to trigger the pipeline again.
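
The same fix can be applied from a terminal. This is a sketch, assuming you are logged in with `oc` as an admin; `select_interceptor_pods` is a hypothetical helper that only filters pod names, and the delete command is left commented out so that nothing is removed by accident.

```shell
# Hypothetical helper: keep only the core-interceptors pod(s) from the output
# of `oc get pods -o name`, which prints one "pod/<name>" entry per line.
select_interceptor_pods() {
  # "|| true" keeps the exit status at 0 when no pod matches.
  grep '^pod/tekton-triggers-core-interceptors-' || true
}

# Against a live cluster (uncomment to run):
# oc get pods -n openshift-pipelines -o name \
#   | select_interceptor_pods \
#   | xargs oc delete -n openshift-pipelines
```

Listing the pods in *openshift-pipelines* again afterwards should show a replacement pod with a new name suffix.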

[IMPORTANT]
====
If the event listener logs show no certificate error, the cause may be a timing issue, i.e. you may have triggered your pipeline before the webhook was successfully created. To ensure that the webhook was created, open your software component in Developer Hub and make sure you are on the *Overview* tab. On the bottom right, under *Deployment Summary*, confirm that the ArgoCD App *<component-name>-dev-build* has a *Sync status* of *Synced* and a *Health status* of *Healthy* before triggering your pipeline.
====
