Changes from all commits (245 commits)
cf60682
DocSum - add files for deploy app with ROCm vLLM
Feb 13, 2025
1fd1de1
DocSum - fix main
Feb 13, 2025
bd2d47e
DocSum - add files for deploy app with ROCm vLLM
Feb 13, 2025
2459ecb
DocSum - fix main
Feb 13, 2025
4d35065
Merge remote-tracking branch 'origin/main'
Feb 19, 2025
6d5049d
DocSum - add files for deploy app with ROCm vLLM
Feb 13, 2025
9dfbdc5
DocSum - fix main
Feb 13, 2025
a8857ae
DocSum - add files for deploy app with ROCm vLLM
Feb 13, 2025
5a38b26
DocSum - fix main
Feb 13, 2025
0e2ef94
Merge remote-tracking branch 'origin/main'
Feb 25, 2025
30071db
Merge branch 'main' of https://github.com/opea-project/GenAIExamples
Mar 11, 2025
0757dec
Merge branch 'opea-project:main' into main
artem-astafev Mar 20, 2025
9aaf378
Merge branch 'main' of https://github.com/opea-project/GenAIExamples
Mar 26, 2025
9cf4b6e
Merge branch 'main' of https://github.com/opea-project/GenAIExamples
Apr 3, 2025
8e89787
Merge branch 'main' of https://github.com/opea-project/GenAIExamples
Apr 5, 2025
a117c69
Merge branch 'main' of https://github.com/opea-project/GenAIExamples
Apr 11, 2025
7fed5cf
Merge branch 'main' of https://github.com/opea-project/GenAIExamples
Apr 15, 2025
28504e1
Merge branch 'main' of https://github.com/opea-project/GenAIExamples
Apr 15, 2025
4cd6a50
Merge branch 'main' of https://github.com/opea-project/GenAIExamples
chyundunovDatamonsters Apr 21, 2025
9ccf540
DocSum - refactoring README.md
chyundunovDatamonsters Apr 24, 2025
b5df348
Fix mismatched environment variable (#1575)
xiguiw Feb 19, 2025
60dd862
Fix trivy issue (#1569)
ZePan110 Feb 20, 2025
06d31cc
Update AgentQnA and DocIndexRetriever (#1564)
minmin-intel Feb 22, 2025
59ffc84
Update README.md of AIPC quick start (#1578)
yinghu5 Feb 23, 2025
4bd9c1a
Fix "OpenAI" & "response" spelling (#1561)
eero-t Feb 25, 2025
2abf738
Bump gradio from 5.5.0 to 5.11.0 in /DocSum/ui/gradio (#1576)
dependabot[bot] Feb 25, 2025
8e8d296
DocSum - add files for deploy app with ROCm vLLM
Feb 13, 2025
9aba6d0
DocSum - fix main
Feb 13, 2025
24f886f
DocSum - add files for deploy app with ROCm vLLM
Feb 13, 2025
2e1b401
DocSum - fix main
Feb 13, 2025
c9a7807
DocSum - add files for deploy app with ROCm vLLM
Feb 13, 2025
b2e1523
DocSum - fix main
Feb 13, 2025
947aa81
DocSum - add files for deploy app with ROCm vLLM
Feb 13, 2025
6b2b297
DocSum - fix main
Feb 13, 2025
f7b3be6
Align mongo related image names with comps (#1543)
Spycsh Feb 27, 2025
23fbd2f
Fix ChatQnA ROCm compose Readme file and absolute path for ROCM CI te…
artem-astafev Feb 27, 2025
4b47c3e
Fix async in chatqna bug (#1589)
XinyaoWa Feb 27, 2025
ed01594
Fix benchmark scripts (#1517)
chensuyue Feb 28, 2025
d1861f9
Top level README: add link to github.io documentation (#1584)
alexsin368 Feb 28, 2025
a30a6e3
fix click example button issue (#1586)
WenjiaoYue Feb 28, 2025
859b697
ChatQnA Docker compose file for Milvus as vdb (#1548)
ezelanza Feb 28, 2025
3460a38
Fix cd workflow condition (#1588)
chensuyue Mar 3, 2025
fc75a8c
Update DBQnA tgi docker image to latest tgi 2.4.0 (#1593)
yinghu5 Mar 3, 2025
2b701ca
Revert chatqna async and enhance tests (#1598)
Spycsh Mar 3, 2025
12845f1
Use model cache for docker compose test (#1582)
ZePan110 Mar 4, 2025
aef57f6
open chatqna frontend test (#1594)
chensuyue Mar 4, 2025
bd0996c
Enable CodeGen,CodeTrans and DocSum model cache for docker compose te…
ZePan110 Mar 4, 2025
fe3132e
bugfix GraphRAG updated docker compose and env settings to fix issues…
rbrugaro Mar 4, 2025
a9154e8
Enable ChatQnA model cache for docker compose test. (#1605)
ZePan110 Mar 5, 2025
10fb928
Enable SearchQnA model cache for docker compose test. (#1606)
ZePan110 Mar 5, 2025
f746c78
Fix docker image opea/edgecraftrag security issue #1577 (#1617)
Yongbozzz Mar 5, 2025
a7c83e3
[AudioQnA] Fix the LLM model field for inputs alignment (#1611)
wangkl2 Mar 5, 2025
db31e55
Update compose.yaml for SearchQnA (#1622)
ZePan110 Mar 7, 2025
3b33f30
Update compose.yaml for ChatQnA (#1621)
ZePan110 Mar 7, 2025
e6d0c27
Update compose.yaml (#1620)
ZePan110 Mar 7, 2025
caefabe
Update compose.yaml (#1619)
ZePan110 Mar 7, 2025
a74dc1e
Enable vllm for CodeTrans (#1626)
letonghan Mar 7, 2025
f6b63d1
Update model cache for AgentQnA (#1627)
ZePan110 Mar 7, 2025
e836b36
Use GenAIComp base image to simplify Dockerfiles (#1612)
eero-t Mar 7, 2025
48a6a0a
[Bug: 112] Fix introduction in GenAIExamples main README (#1631)
srajabos Mar 7, 2025
48ee4c4
Fix corner CI issue when the example path deleted (#1634)
chensuyue Mar 7, 2025
7b78247
[ChatQnA] Show spinner after query to improve user experience (#1003)…
wangleflex Mar 7, 2025
6fbe02d
Use the latest HabanaAI/vllm-fork release tag to build vllm-gaudi ima…
chensuyue Mar 7, 2025
06dab21
Set vLLM as default model for FaqGen (#1580)
XinyaoWa Mar 10, 2025
a6d6f1f
Fix vllm model cache directory (#1642)
wangkl2 Mar 10, 2025
cb831b0
Enhance ChatQnA test scripts (#1643)
chensuyue Mar 10, 2025
ffa0ead
Add GitHub Action to check and close stale issues and PRs (#1646)
XuehaoSun Mar 12, 2025
b725c26
Use GenAIComp base image to simplify Dockerfiles & reduce image sizes…
eero-t Mar 13, 2025
faf8f09
Enable inject_commit to docker image feature. (#1653)
ZePan110 Mar 13, 2025
6e262af
Enable CodeGen vLLM (#1636)
xiguiw Mar 13, 2025
ceffcff
[ChatQnA][docker]Check healthy of redis to avoid dataprep failure (#1…
gavinlichn Mar 13, 2025
7a4e2a7
Enable GraphRAG and ProductivitySuite model cache for docker compose …
ZePan110 Mar 13, 2025
0e6eacb
Enable Gaudi3, Rocm and Arc on manually release test. (#1615)
ZePan110 Mar 13, 2025
a658f80
Refine README with highlighted examples and updated support info (#1006)
CharleneHu-42 Mar 13, 2025
d12765b
[AudioQnA] Enable vLLM and set it as default LLM serving (#1657)
wangkl2 Mar 14, 2025
a355478
[ChatQnA] Enable Prometheus and Grafana with telemetry docker compos…
louie-tsai Mar 14, 2025
43df124
Update stale issue and PR settings to 30 days for inactivity (#1661)
XuehaoSun Mar 14, 2025
62b72dd
Add final README.md and set_env.sh script for quickstart review. Prev…
jedwards-habana Mar 14, 2025
67ab246
Fix input issue for manual-image-build.yml (#1666)
chensuyue Mar 17, 2025
422698d
Set vLLM as default model for VisualQnA (#1644)
Spycsh Mar 18, 2025
0516abd
Fix workflow issues. (#1691)
ZePan110 Mar 19, 2025
d8191af
Enable base image build in CI/CD (#1669)
chensuyue Mar 19, 2025
c1a8cdc
fix errors for running AgentQnA on xeon with openai and update readme…
minmin-intel Mar 20, 2025
bbfe10a
Add new UI/new features for EC-RAG (#1665)
Yongbozzz Mar 20, 2025
ef0a480
Merge FaqGen into ChatQnA (#1654)
XinyaoWa Mar 20, 2025
40ddb37
Integrate docker images into compose yaml file to simplify the run in…
louie-tsai Mar 21, 2025
d067da7
change gaudi node exporter from default one to 41612 (#1702)
louie-tsai Mar 21, 2025
6938c7e
Use GenAIComp base image to simplify Dockerfiles - part 3/4 (#1671)
eero-t Mar 24, 2025
aadc101
Enhance port release before CI test (#1704)
chensuyue Mar 24, 2025
3673479
Adding files to deploy AudioQnA application on ROCm vLLM (#1655)
chyundunovDatamonsters Mar 24, 2025
92848dc
Fix CD cancel issue (#1706)
ZePan110 Mar 24, 2025
ca1cd8e
Adding files to deploy CodeGen application on ROCm vLLM (#1544)
chyundunovDatamonsters Mar 24, 2025
ebcaf1a
Adding files to deploy CodeTrans application on ROCm vLLM (#1545)
chyundunovDatamonsters Mar 24, 2025
ec814b1
remove 3 useless environments. (#1708)
lkk12014402 Mar 24, 2025
ac069c9
Remove FaqGen from ProductivitySuite (#1709)
XinyaoWa Mar 24, 2025
da0f28e
[docs] Multimodal Endpoints Issue (#1700)
theBeginner86 Mar 25, 2025
298de03
typo for docker image (#1717)
louie-tsai Mar 25, 2025
7b0840f
[Bug: 899] Create a version of DocIndexRetriever example with Zilliz/…
srajabos Mar 26, 2025
2c90366
Update TEI docker image to 1.6 (#1650)
xiguiw Mar 27, 2025
a03f538
Enable vllm for DocSum (#1716)
letonghan Mar 28, 2025
e34a215
Expand example running timeout for the new test cluster with k8s runn…
chensuyue Mar 31, 2025
e4f7f1b
Adding files to deploy Translation application on ROCm vLLM (#1648)
chyundunovDatamonsters Mar 31, 2025
a649e33
Adding files to deploy SearchQnA application on ROCm vLLM (#1649)
chyundunovDatamonsters Mar 31, 2025
852f9d4
Update MMQnA tgi-gaudi verison to match compose.yaml (#1663)
dmsuehir Mar 31, 2025
2740b7b
Update TGI image versions (#1625)
xiaotia3 Apr 1, 2025
40bae97
Add short descriptions to the images OPEA publishes on Docker Hub (#1…
zhanmyz Apr 1, 2025
c126d41
Update README.md to have Table for contents (#1721)
louie-tsai Apr 1, 2025
a3cfbc8
Add short descriptions to the images OPEA publishes on Docker Hub (#1…
zhanmyz Apr 2, 2025
622fffa
Adding files to deploy AgentQnA application on ROCm vLLM (#1613)
chyundunovDatamonsters Apr 2, 2025
66be3e6
Fix model cache path and use Random to avoid ns conflict (#1734)
yongfengdu Apr 2, 2025
b21bf43
Add model environment variable (#1660)
ZePan110 Apr 2, 2025
4eb572a
Add Telemetry support for AgentQnA using Grafana, Prometheus and Jaeg…
louie-tsai Apr 3, 2025
955caa7
Fix README for deploy AgentQnA application on ROCm vLLM (#1742)
chyundunovDatamonsters Apr 3, 2025
98aa001
MultimodalQnA audio features completion (#1698)
mhbuehler Apr 3, 2025
a109d6e
Code Enhancement for vllm inference (#1729)
Yongbozzz Apr 3, 2025
04f2cac
Adding files to deploy DocSum application on ROCm vLLM (#1572)
chyundunovDatamonsters Apr 3, 2025
a319903
Fix relative path validity issue (#1750)
ZePan110 Apr 3, 2025
5d06aea
Adding files to deploy ChatQnA application on ROCm vLLM (#1560)
chyundunovDatamonsters Apr 3, 2025
1746b2f
Add AudioQnA multilang tts test (#1746)
Spycsh Apr 3, 2025
733bb81
[CICD enhance] ChatQnA run CI with latest base image, group logs in G…
chensuyue Apr 3, 2025
4a539cf
[ChatQnA] update to the latest Grafana Dashboard (#1728)
louie-tsai Apr 3, 2025
5f8774a
ChatQnA - Adding files to deploy an application in the K8S environmen…
Apr 5, 2025
9784ba6
ChatQnA - Adding files to deploy an application in the K8S environmen…
Apr 5, 2025
eca2a54
ChatQnA - Adding files to deploy an application in the K8S environmen…
Apr 5, 2025
796cea2
Adding files to deploy VisualQnA application on ROCm vLLM (#1751)
artem-astafev Apr 7, 2025
43fbc46
Fix vllm and vllm-fork tags (#1766)
ZePan110 Apr 7, 2025
840920b
Add dockerhub login step to avoid 429 Too Many Requests (#1772)
chensuyue Apr 8, 2025
ce00546
Refine third parties links (#1764)
Spycsh Apr 8, 2025
ee2aad2
compatible open-webui for opea agent. (#1765)
lkk12014402 Apr 8, 2025
9129af5
fix docker image clean up issue (#1773)
chensuyue Apr 8, 2025
50ed6f3
Sync values yaml file for 1.3 release (#1748)
yongfengdu Apr 8, 2025
dd01990
Fix ChatQnA port to internal vllm port (#1763)
XinyaoWa Apr 9, 2025
e7b3847
Fix vLLM CPU initialize engine issue for DeepSeek models (#1762)
lvliang-intel Apr 9, 2025
c9e3baa
Fix GenAIExamples #1607 (#1776)
ctao456 Apr 9, 2025
1ab4b56
Update ChatQna & CodeGen README.md with new Automated Terraform Deplo…
lucasmelogithub Apr 9, 2025
29c31ec
Iteratively add image docker hub description (#1768)
zhanmyz Apr 9, 2025
449edab
Adaptation to vllm v0.8.3 build paths (#1761)
ZePan110 Apr 9, 2025
42cd24a
Use GenAIComp base image to simplify Dockerfiles & reduce image sizes…
eero-t Apr 9, 2025
17a0499
CodGen Examples using-RAG-and-Agents (#1757)
MSCetin37 Apr 9, 2025
239595a
Fix typo in CodeGen README (#1783)
lvliang-intel Apr 9, 2025
fbf78b9
[pre-commit.ci] pre-commit autoupdate (#1771)
pre-commit-ci[bot] Apr 9, 2025
4377f19
Adding files to deploy MultimodalQnA application on ROCm vLLM (#1737)
artem-astafev Apr 10, 2025
54a6525
Enable model cache for Rocm docker compose test. (#1614)
ZePan110 Apr 10, 2025
9836332
fix bugs in DocIndexRetriever (#1770)
minmin-intel Apr 10, 2025
f2d95a9
Enable AvatarChatbot model cache for docker compose test. (#1604)
ZePan110 Apr 10, 2025
69cdec4
Unified build.yaml file writing style (#1781)
ZePan110 Apr 10, 2025
f251ab2
Align DocSum env to vllm (#1784)
XinyaoWa Apr 10, 2025
a190a84
Add new secrets for docker compose test (#1786)
chensuyue Apr 10, 2025
222a343
Update model cache for MultimodalQnA (#1618)
ZePan110 Apr 10, 2025
162793f
Redefine docker images list. (#1743)
ZePan110 Apr 10, 2025
592edb4
[Translation] Integrate set_env.sh into test scripts. (#1785)
ZePan110 Apr 11, 2025
56ab290
update AgentQnA (#1790)
minmin-intel Apr 11, 2025
cc23fb9
Fix VideoQnA (#1696)
cwlacewe Apr 12, 2025
48085d2
support rocm helm charts test (#1787)
chensuyue Apr 13, 2025
dcb85f2
add 'N/A' to option (#1801)
NeoZhangJianyu Apr 14, 2025
7c54f40
Add Finance Agent Example (#1752)
minmin-intel Apr 14, 2025
d907684
Optimize the nightly/weekly example test (#1806)
chensuyue Apr 14, 2025
fb19fcd
Update vLLM parameter max-seq-len-to-capture (#1809)
lvliang-intel Apr 15, 2025
7f7f475
Update AgentQnA and DocSum for Gaudi Compatibility (#1777)
Mahathi-Vatsal Apr 16, 2025
b6ca1c4
Group log lines in GHA outputs for better readable logs. (#1821)
chensuyue Apr 16, 2025
0c70d96
Remove template_llava.jinja in command (#1831)
XinyuYe-Intel Apr 16, 2025
7b0f89a
Replaced TGI with vLLM for guardrail serving (#1815)
lvliang-intel Apr 16, 2025
4a6a675
Adding the two missing packages for ingest script (#1822)
ashahba Apr 16, 2025
b4cd1d9
Update FinanceAgent v1.3 (#1819)
minmin-intel Apr 16, 2025
7387a48
Fix Multimodal & ProductivitySuite Issue (#1820)
letonghan Apr 17, 2025
6514ded
Update TEI docker images to CPU-1.6 (#1791)
xiguiw Apr 17, 2025
b8fb7d4
Enable health check for dataprep in ChatQnA (#1799)
letonghan Apr 17, 2025
311f2c3
Enable dataprep health check for examples (#1800)
letonghan Apr 17, 2025
3ef1823
Update docker images list. (#1835)
ZePan110 Apr 17, 2025
56e9840
new chatqna readme template (#1755)
srinarayan-srikanthan Apr 17, 2025
97deca0
fix missing package (#1841)
Yongbozzz Apr 17, 2025
83a12a9
Update README.md of ChatQnA for layout (#1842)
yinghu5 Apr 18, 2025
b5f1146
Redirect Users to github.io for ChatQnA telemetry materials (#1845)
louie-tsai Apr 18, 2025
3c42c19
Remote inference support for examples in Productivity suite (#1818)
srinarayan-srikanthan Apr 18, 2025
1589db0
Redirect users to new github.io sections for AgentQnA opentelemetry m…
louie-tsai Apr 18, 2025
b5a77d6
Refine ChatQnA READMEs (#1850)
lvliang-intel Apr 20, 2025
4776312
Refine readme of InstructionTuning (#1794)
XinyuYe-Intel Apr 20, 2025
fe80211
CodeGen Gradio UI Updates for new delete endpoint features (#1851)
okhleif-10 Apr 20, 2025
6333338
[ Translation ] Refine documents (#1795)
ZePan110 Apr 20, 2025
174d528
Refine documents for DocSum (#1802)
XinyaoWa Apr 20, 2025
79cce28
add AudioQnA key parameters to comply with the image size reduction (…
Spycsh Apr 20, 2025
1ce338f
Update chatqna values file changes (#1844)
yongfengdu Apr 21, 2025
7920799
[Bug: 900] Create a version of MultimodalQnA example with Zilliz/Milv…
srajabos Apr 21, 2025
b18bce0
Added Initial version of DocSum support for benchmarking scripts for …
vrantala Apr 21, 2025
8142872
hot fix for permission issue (#1849)
Yongbozzz Apr 21, 2025
531d21b
AgentQnA group log lines in test outputs for better readable logs. (…
chensuyue Apr 21, 2025
940506b
Refine the READMEs of CodeTrans (#1796)
letonghan Apr 21, 2025
d7cc6da
[ SearchQnA ] Refine documents (#1803)
WenjiaoYue Apr 21, 2025
a48f11d
Enable more flexible support for test HWs (#1816)
chensuyue Apr 21, 2025
082f5d5
Refine readme of AudioQnA (#1804)
Spycsh Apr 21, 2025
257b633
Refine readme of CodeGen (#1797)
yao531441 Apr 21, 2025
6dfffa3
New Productivity Suite react UI and Bug Fixes (#1834)
sgurunat Apr 21, 2025
07e1b71
[CICD enhance] AudioQnA run CI with latest base image, group logs in …
chensuyue Apr 21, 2025
86dfda5
Fixes for MultimodalQnA with the Milvus vector db (#1859)
dmsuehir Apr 21, 2025
30bb758
ChatQnA - Adding files to deploy an application in the K8S environmen…
chyundunovDatamonsters Apr 22, 2025
988db95
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 22, 2025
20422f0
Update README.md of DBQnA (#1855)
Ervin0307 Apr 22, 2025
18a3e69
Added CodeGen Gradio README link to Docker Images List (#1864)
okhleif-10 Apr 22, 2025
4449660
Refine AuidoQnA README.MD for AMD ROCm docker compose deployment (#1862)
artem-astafev Apr 23, 2025
ab05955
Fix compose file and functional tests for Avatarchatbot on AMD ROCm p…
artem-astafev Apr 23, 2025
b719089
Set opea_branch for CD test (#1870)
chensuyue Apr 24, 2025
c95fda8
Refine README.MD for AMD ROCm docker compose deployment (#1856)
artem-astafev Apr 24, 2025
50d110a
Update image links. (#1866)
ZePan110 Apr 24, 2025
f0c7ace
Remove proxy in CodeTrans test (#1874)
chensuyue Apr 24, 2025
b85a06d
CodeTrans - refactoring README.md for deploy application on ROCm with…
chyundunovDatamonsters Apr 24, 2025
cb03db2
[CICD enhance] EdgeCraftRAG run CI with latest base image, group logs…
chensuyue Apr 24, 2025
b068668
ChatQnA - refactoring README.md for deploy application on ROCm (#1857)
chyundunovDatamonsters Apr 25, 2025
6b19e10
Refine README.MD for SearchQnA on AMD ROCm platform (#1876)
artem-astafev Apr 25, 2025
f54021e
Update ChatQnA/kubernetes/helm/README.md
chensuyue May 9, 2025
dcbc56a
ChatQnA - Adding files to deploy an application in the K8S environmen…
chyundunovDatamonsters May 16, 2025
1c5eedb
DocSum - add files for deploy app with ROCm vLLM
Feb 13, 2025
578c2e5
DocSum - fix main
Feb 13, 2025
d4aafc6
DocSum - add files for deploy app with ROCm vLLM
Feb 13, 2025
1e81f17
DocSum - fix main
Feb 13, 2025
441f617
DocSum - add files for deploy app with ROCm vLLM
Feb 13, 2025
e518929
DocSum - fix main
Feb 13, 2025
e03ac0b
DocSum - add files for deploy app with ROCm vLLM
Feb 13, 2025
11c9bd1
DocSum - fix main
Feb 13, 2025
7fedeb8
ChatQnA - Adding files to deploy an application in the K8S environmen…
Apr 5, 2025
3b99f89
ChatQnA - Adding files to deploy an application in the K8S environmen…
Apr 5, 2025
a5e1f6c
ChatQnA - Adding files to deploy an application in the K8S environmen…
Apr 5, 2025
233ef4d
ChatQnA - Adding files to deploy an application in the K8S environmen…
chyundunovDatamonsters Apr 22, 2025
54b003d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 22, 2025
74ede45
Update ChatQnA/kubernetes/helm/README.md
chensuyue May 9, 2025
daf423d
ChatQnA - Adding files to deploy an application in the K8S environmen…
chyundunovDatamonsters May 16, 2025
e85c2ae
Merge branch 'main' of https://github.com/chyundunovDatamonsters/OPEA…
chyundunovDatamonsters May 16, 2025
df0c956
ChatQnA - Adding files to deploy an application in the K8S environmen…
chyundunovDatamonsters May 16, 2025
49a7eae
ChatQnA - Adding files to deploy an application in the K8S environmen…
chyundunovDatamonsters May 16, 2025
fb14f29
ChatQnA - Adding files to deploy an application in the K8S environmen…
chyundunovDatamonsters May 16, 2025
e48dee7
ChatQnA - Adding files to deploy an application in the K8S environmen…
chyundunovDatamonsters May 16, 2025
814caa4
ChatQnA - Adding files to deploy an application in the K8S environmen…
chyundunovDatamonsters May 16, 2025
32c9fec
ChatQnA - Adding files to deploy an application in the K8S environmen…
chyundunovDatamonsters May 16, 2025
2d399ac
ChatQnA - Adding files to deploy an application in the K8S environmen…
chyundunovDatamonsters May 20, 2025
ee32b05
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 20, 2025
73b0162
ChatQnA - Adding files to deploy an application in the K8S environmen…
chyundunovDatamonsters May 27, 2025
8139f55
Merge remote-tracking branch 'origin/feature/ChatQnA_k8s' into featur…
chyundunovDatamonsters May 27, 2025
d7af65a
ChatQnA - Adding files to deploy an application in the K8S environmen…
chyundunovDatamonsters May 27, 2025
2ecf319
ChatQnA - Adding files to deploy an application in the K8S environmen…
chyundunovDatamonsters May 29, 2025
618ab09
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 29, 2025
8db1dd8
Merge branch 'main' into feature/ChatQnA_k8s
chyundunovDatamonsters May 29, 2025
ad8c766
ChatQnA - Adding files to deploy an application in the K8S environmen…
chyundunovDatamonsters May 29, 2025
1674112
Merge branch 'main' into feature/ChatQnA_k8s
chensuyue May 30, 2025
5358393
Merge branch 'main' into feature/ChatQnA_k8s
yinghu5 Jun 12, 2025
0da422a
Merge remote-tracking branch 'opea-origin/main' into feature/ChatQnA_k8s
chyundunovDatamonsters Jul 4, 2025
60b830a
Merge remote-tracking branch 'origin/feature/ChatQnA_k8s' into featur…
chyundunovDatamonsters Jul 4, 2025
221 changes: 221 additions & 0 deletions ChatQnA/kubernetes/helm/README.md
@@ -28,3 +28,224 @@ helm install chatqna oci://ghcr.io/opea-project/charts/chatqna --set global.HUG
```

See other *-values.yaml files in this directory for more reference.

## Deploy on AMD ROCm using Helm charts from the binary Helm repository

### Creating working dirs

```bash
mkdir ~/chatqna-k8s-install && cd ~/chatqna-k8s-install
```

### Cloning repos

```bash
git clone https://github.com/opea-project/GenAIExamples.git
```

### Go to the installation directory

```bash
cd GenAIExamples/ChatQnA/kubernetes/helm
```

### Setting system variables

```bash
export HFTOKEN="your_huggingface_token"
export MODELDIR="/mnt/opea-models"
export MODELNAME="meta-llama/Meta-Llama-3-8B-Instruct"
```
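Before proceeding, the three exports can be sanity-checked so a typo does not surface later as a confusing install failure. This is a minimal bash sketch, not part of the chart; it only checks the variable names exported above:

```shell
# Sanity-check the exports above before installing (bash sketch, not part of the chart)
check_env() {
  local missing=0
  for var in HFTOKEN MODELDIR MODELNAME; do
    if [ -z "${!var}" ]; then          # indirect expansion: value of the variable named by $var
      echo "ERROR: $var is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}
check_env && echo "environment OK" || echo "fix the variables above"
```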

### Setting variables in Values files

#### If using ROCm vLLM

```bash
nano ~/chatqna-k8s-install/GenAIExamples/ChatQnA/kubernetes/helm/rocm-values.yaml
```

> Reviewer comment: Could assume you are already in the correct directory, namely `~/chatqna-k8s-install/GenAIExamples/ChatQnA/kubernetes/helm`.

#### If deploying a FaqGen-based application on an AMD ROCm device with vLLM

```bash
nano ~/chatqna-k8s-install/GenAIExamples/ChatQnA/kubernetes/helm/faqgen-rocm-values.yaml
```

> Reviewer comment: Ditto, could assume you are in the correct directory.

- `HIP_VISIBLE_DEVICES`: the ID(s) of the GPU(s) to use. Specify either a single ID or several comma-separated IDs, e.g. "0" or "0,1,2,3".
- `TENSOR_PARALLEL_SIZE`: must match the number of GPUs used.
- ```yaml
  resources:
    limits:
      amd.com/gpu: "1" # replace "1" with the number of GPUs used
  ```
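For example, a four-GPU vLLM setup would keep all three settings in agreement (a hypothetical sketch; adjust the count to your hardware):

```yaml
# Hypothetical 4-GPU example: these three values must agree with each other
HIP_VISIBLE_DEVICES: "0,1,2,3"
TENSOR_PARALLEL_SIZE: "4"
resources:
  limits:
    amd.com/gpu: "4"
```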

#### If using ROCm TGI

```bash
nano ~/chatqna-k8s-install/GenAIExamples/ChatQnA/kubernetes/helm/rocm-tgi-values.yaml
```

> Reviewer comment: ditto on directory.

#### If deploying a FaqGen-based application on an AMD ROCm device with TGI

```bash
nano ~/chatqna-k8s-install/GenAIExamples/ChatQnA/kubernetes/helm/faqgen-rocm-tgi-values.yaml
```

- `HIP_VISIBLE_DEVICES`: the ID(s) of the GPU(s) to use. Specify either a single ID or several comma-separated IDs, e.g. "0" or "0,1,2,3".
- `extraCmdArgs: [ "--num-shard","1" ]`: replace "1" with the number of GPUs used.
- ```yaml
  resources:
    limits:
      amd.com/gpu: "1" # replace "1" with the number of GPUs used
  ```

### Installing the Helm Chart

#### If using ROCm vLLM

> Reviewer comment: why is this heading not bold faced while others below are?
```bash
helm upgrade --install chatqna oci://ghcr.io/opea-project/charts/chatqna \
--set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} \
--values rocm-values.yaml
```

#### If using ROCm TGI
```bash
helm upgrade --install chatqna oci://ghcr.io/opea-project/charts/chatqna \
--set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} \
--values rocm-tgi-values.yaml
```

#### If deploying a FaqGen-based application on an AMD ROCm device with vLLM
```bash
helm upgrade --install chatqna oci://ghcr.io/opea-project/charts/chatqna \
--set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} \
--values faqgen-rocm-values.yaml
```

#### If deploying a FaqGen-based application on an AMD ROCm device with TGI
```bash
helm upgrade --install chatqna oci://ghcr.io/opea-project/charts/chatqna \
--set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} \
--values faqgen-rocm-tgi-values.yaml
```
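Whichever variant was installed above, the rollout can then be sanity-checked. The snippet below only writes the check commands to a script and syntax-checks it, since actually running them requires a live cluster; the release/service name `chatqna` and port `8888` with endpoint `/v1/chatqna` are assumptions based on the chart's defaults, so verify them with `kubectl get svc` first:

```shell
# Post-install check (sketch). Service name "chatqna", port 8888 and the
# /v1/chatqna endpoint are assumptions; confirm them on your cluster.
cat > chatqna-check.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
helm status chatqna
kubectl get pods
kubectl port-forward svc/chatqna 8888:8888 &
sleep 5
curl -s http://localhost:8888/v1/chatqna \
  -H 'Content-Type: application/json' \
  -d '{"messages": "What is OPEA?"}'
EOF
bash -n chatqna-check.sh && echo "check script OK"
```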

## Deploy on AMD ROCm using Helm charts from Git repositories

### Creating working dirs

```bash
mkdir ~/chatqna-k8s-install && cd ~/chatqna-k8s-install
```

### Cloning repos

```bash
git clone https://github.com/opea-project/GenAIExamples.git
git clone https://github.com/opea-project/GenAIInfra.git
```

### Go to the installation directory

```bash
cd GenAIExamples/ChatQnA/kubernetes/helm
```

### Setting system variables

```bash
export HFTOKEN="your_huggingface_token"
export MODELDIR="/mnt/opea-models"
export MODELNAME="Intel/neural-chat-7b-v3-3"
```

### Setting variables in Values files

#### If using ROCm vLLM
```bash
nano ~/chatqna-k8s-install/GenAIExamples/ChatQnA/kubernetes/helm/rocm-values.yaml
```

- `HIP_VISIBLE_DEVICES`: the ID(s) of the GPU(s) to use. Specify either a single ID or several comma-separated IDs, e.g. "0" or "0,1,2,3".
- `TENSOR_PARALLEL_SIZE`: must match the number of GPUs used.
- ```yaml
  resources:
    limits:
      amd.com/gpu: "1" # replace "1" with the number of GPUs used
  ```

#### If using ROCm TGI

```bash
nano ~/chatqna-k8s-install/GenAIExamples/ChatQnA/kubernetes/helm/rocm-tgi-values.yaml
```

- `HIP_VISIBLE_DEVICES`: the ID(s) of the GPU(s) to use. Specify either a single ID or several comma-separated IDs, e.g. "0" or "0,1,2,3".
- `extraCmdArgs: [ "--num-shard","1" ]`: replace "1" with the number of GPUs used.
- ```yaml
  resources:
    limits:
      amd.com/gpu: "1" # replace "1" with the number of GPUs used
  ```

#### If deploying a FaqGen-based application on an AMD ROCm device with vLLM
```bash
nano ~/chatqna-k8s-install/GenAIExamples/ChatQnA/kubernetes/helm/faqgen-rocm-values.yaml
```

- `HIP_VISIBLE_DEVICES`: the ID(s) of the GPU(s) to use. Specify either a single ID or several comma-separated IDs, e.g. "0" or "0,1,2,3".
- `TENSOR_PARALLEL_SIZE`: must match the number of GPUs used.
- ```yaml
  resources:
    limits:
      amd.com/gpu: "1" # replace "1" with the number of GPUs used
  ```

#### If deploying a FaqGen-based application on an AMD ROCm device with TGI

```bash
nano ~/chatqna-k8s-install/GenAIExamples/ChatQnA/kubernetes/helm/faqgen-rocm-tgi-values.yaml
```

- `HIP_VISIBLE_DEVICES`: the ID(s) of the GPU(s) to use. Specify either a single ID or several comma-separated IDs, e.g. "0" or "0,1,2,3".
- `extraCmdArgs: [ "--num-shard","1" ]`: replace "1" with the number of GPUs used.
- ```yaml
  resources:
    limits:
      amd.com/gpu: "1" # replace "1" with the number of GPUs used
  ```

### Installing the Helm Chart

#### If using ROCm vLLM
```bash
cd ~/chatqna-k8s-install/GenAIInfra/helm-charts
./update_dependency.sh
helm dependency update chatqna
helm upgrade --install chatqna chatqna \
--set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} \
--values ../../GenAIExamples/ChatQnA/kubernetes/helm/rocm-values.yaml
```

#### If using ROCm TGI
```bash
cd ~/chatqna-k8s-install/GenAIInfra/helm-charts
./update_dependency.sh
helm dependency update chatqna
helm upgrade --install chatqna chatqna \
--set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} \
--values ../../GenAIExamples/ChatQnA/kubernetes/helm/rocm-tgi-values.yaml
```

#### If deploying a FaqGen-based application on an AMD ROCm device with vLLM
```bash
cd ~/chatqna-k8s-install/GenAIInfra/helm-charts
./update_dependency.sh
helm dependency update chatqna
helm upgrade --install chatqna chatqna \
--set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} \
--values ../../GenAIExamples/ChatQnA/kubernetes/helm/faqgen-rocm-values.yaml
```

#### If deploying a FaqGen-based application on an AMD ROCm device with TGI
```bash
cd ~/chatqna-k8s-install/GenAIInfra/helm-charts
./update_dependency.sh
helm dependency update chatqna
helm upgrade --install chatqna chatqna \
--set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} \
--values ../../GenAIExamples/ChatQnA/kubernetes/helm/faqgen-rocm-tgi-values.yaml
```
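Note that `MODELDIR` and `MODELNAME`, exported earlier, are not referenced by any of the install commands above. If the chart should pick them up, they could be passed as extra `--set` flags; the value paths `global.modelUseHostPath` and `vllm.LLM_MODEL_ID` below are assumptions about the chart's value names, not confirmed keys, so check them against the chart before use:

```shell
# Hypothetical: wiring MODELDIR and MODELNAME into the install command.
# "global.modelUseHostPath" and "vllm.LLM_MODEL_ID" are assumed value paths.
HELM_CMD="helm upgrade --install chatqna chatqna \
  --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} \
  --set global.modelUseHostPath=${MODELDIR} \
  --set vllm.LLM_MODEL_ID=${MODELNAME} \
  --values ../../GenAIExamples/ChatQnA/kubernetes/helm/rocm-values.yaml"
echo "$HELM_CMD"
```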
66 changes: 66 additions & 0 deletions ChatQnA/kubernetes/helm/faqgen-rocm-tgi-values.yaml
@@ -0,0 +1,66 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

CHATQNA_TYPE: "CHATQNA_FAQGEN"
llm-uservice:
enabled: true
image:
repository: opea/llm-faqgen
LLM_MODEL_ID: meta-llama/Meta-Llama-3-8B-Instruct
FAQGEN_BACKEND: "TGI"
service:
port: 80
tgi:
enabled: true
accelDevice: "rocm"
image:
repository: ghcr.io/huggingface/text-generation-inference
tag: "2.4.1-rocm"
LLM_MODEL_ID: meta-llama/Meta-Llama-3-8B-Instruct
MAX_INPUT_LENGTH: "3072"
MAX_TOTAL_TOKENS: "4096"
PYTORCH_TUNABLEOP_ENABLED: "0"
USE_FLASH_ATTENTION: "true"
FLASH_ATTENTION_RECOMPUTE: "false"
HIP_VISIBLE_DEVICES: "0,1"
MAX_BATCH_SIZE: "2"
extraCmdArgs: [ "--num-shard","2" ]
resources:
limits:
amd.com/gpu: "2"
requests:
cpu: 1
memory: 16Gi
securityContext:
readOnlyRootFilesystem: false
runAsNonRoot: false
runAsUser: 0
capabilities:
add:
- SYS_PTRACE
readinessProbe:
initialDelaySeconds: 60
periodSeconds: 5
timeoutSeconds: 1
failureThreshold: 120
startupProbe:
initialDelaySeconds: 60
periodSeconds: 5
timeoutSeconds: 1
failureThreshold: 120
vllm:
enabled: false

# Reranking: second largest bottleneck when reranking is in use
# (i.e. query context docs have been uploaded with data-prep)
#
# TODO: could vLLM be used also for reranking / embedding?
teirerank:
accelDevice: "cpu"
image:
repository: ghcr.io/huggingface/text-embeddings-inference
tag: cpu-1.5
# securityContext:
# readOnlyRootFilesystem: false
readinessProbe:
timeoutSeconds: 1
59 changes: 59 additions & 0 deletions ChatQnA/kubernetes/helm/faqgen-rocm-values.yaml
@@ -0,0 +1,59 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

CHATQNA_TYPE: "CHATQNA_FAQGEN"
llm-uservice:
enabled: true
image:
repository: opea/llm-faqgen
LLM_MODEL_ID: meta-llama/Meta-Llama-3-8B-Instruct
FAQGEN_BACKEND: "vLLM"
service:
port: 80
tgi:
enabled: false
vllm:
enabled: true
accelDevice: "rocm"
image:
repository: opea/vllm-rocm
tag: latest
env:
HIP_VISIBLE_DEVICES: "0"
TENSOR_PARALLEL_SIZE: "1"
HF_HUB_DISABLE_PROGRESS_BARS: "1"
HF_HUB_ENABLE_HF_TRANSFER: "0"
VLLM_USE_TRITON_FLASH_ATTN: "0"
VLLM_WORKER_MULTIPROC_METHOD: "spawn"
PYTORCH_JIT: "0"
HF_HOME: "/data"
extraCmd:
command: [ "python3", "/workspace/api_server.py" ]
extraCmdArgs: [ "--swap-space", "16",
"--disable-log-requests",
"--dtype", "float16",
"--num-scheduler-steps", "1",
"--distributed-executor-backend", "mp" ]
resources:
limits:
amd.com/gpu: "1"
startupProbe:
failureThreshold: 180
securityContext:
readOnlyRootFilesystem: false
runAsNonRoot: false
runAsUser: 0

# Reranking: second largest bottleneck when reranking is in use
# (i.e. query context docs have been uploaded with data-prep)
#
# TODO: could vLLM be used also for reranking / embedding?
teirerank:
accelDevice: "cpu"
image:
repository: ghcr.io/huggingface/text-embeddings-inference
tag: cpu-1.5
# securityContext:
# readOnlyRootFilesystem: false
readinessProbe:
timeoutSeconds: 1