nawkwoo/boaz-rag-chatbot

🤖 boaz-rag-chatbot

A BOAZ FAQ chatbot built on RAG (Retrieval-Augmented Generation). It gives accurate, dynamically generated answers to the questions applicants ask most often.

📌 Project Introduction

To automate the FAQ responses that repeat every BOAZ recruiting period, we are building a chatbot that combines crawling, vector search, and a generative model.

μ‚¬μš©μžμ˜ μ§ˆλ¬Έμ„ 의미 기반으둜 λ²‘ν„°ν™”ν•˜μ—¬ κ΄€λ ¨ λ¬Έμ„œλ₯Ό κ²€μƒ‰ν•˜κ³ , κ·Έ λ¬Έμ„œλ₯Ό 기반으둜 μžμ—°μŠ€λŸ¬μš΄ λ¬Έμž₯을 μƒμ„±ν•˜μ—¬ μ‘λ‹΅ν•©λ‹ˆλ‹€.

🎯 Goals

  • Deploy the chatbot on the official BOAZ website
  • Respond accurately to the wide range of questions applicants ask
  • Design a structure that later cohorts can also maintain

👥 Team Members

경재영 · 김완철 · 손관우 · 정예린

🛠️ Run Guide

# 1. Put your Pinecone and Gemini API keys in the .env file
# Example (.env)
# PINECONE_API_KEY=your_pinecone_key
# GEMINI_API_KEY=your_gemini_key
# 2. Create a virtual environment
python -m venv venv
# 3. Activate the virtual environment (Windows, from a Bash-style shell such as Git Bash)
#    On macOS/Linux use: source venv/bin/activate
source venv/Scripts/activate
# 4. Install packages
pip install -r requirements.txt
# 5. Build the vector stores
python vectorstore/dense_uploader.py
python vectorstore/sparse_uploader.py
# 6. Run the Streamlit app
streamlit run app.py
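
Before running steps 5 and 6, it can help to confirm that both keys from step 1 are actually set. The following is a minimal standard-library sketch of such a check. The helpers `parse_env` and `missing_keys` are hypothetical and not part of this repository, and the project itself may load `.env` differently (for example through a dotenv library).

```python
import os
from pathlib import Path

# Key names taken from the .env example in step 1.
REQUIRED_KEYS = ("PINECONE_API_KEY", "GEMINI_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [k for k in REQUIRED_KEYS if not values.get(k)]


if __name__ == "__main__":
    env_path = Path(".env")
    values = parse_env(env_path.read_text(encoding="utf-8")) if env_path.exists() else {}
    # Variables already set in the process environment take precedence.
    values.update({k: os.environ[k] for k in REQUIRED_KEYS if k in os.environ})
    missing = missing_keys(values)
    print("Missing keys:", ", ".join(missing) if missing else "none")
```

Running it from the project root prints which of the two keys, if any, still need to be filled in.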

📁 Project Structure

boaz_rag/
├── app.py                     # Runs the Streamlit UI
├── chain.py                   # Defines the Gemini LLM and QA chain
├── config.py                  # Global settings (model name, index name, etc.)
├── preprocess.py              # Loads and chunks PDF/CSV documents
├── retriever/                 # Vector retriever definitions
│   ├── dense_retriever.py     # SBERT-based
│   ├── sparse_retriever.py    # BM25-based
│   └── factory.py             # Selects a retriever based on config
├── vectorstore/               # Index upload scripts
│   ├── dense_uploader.py
│   └── sparse_uploader.py
├── data/                      # Folder for original documents
├── data_with_meta/            # Storage for chunks + mapping JSON
│   └── id_to_text_dense.json
├── .env                       # Environment variable settings
├── requirements.txt           # Package list
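
The structure lists `retriever/factory.py` as choosing a retriever from configuration. One common way to do that is a small registry keyed by a config string, sketched below. This is an assumption about the general pattern, not the repository's actual code: the class names, the `get_retriever` function, and the `"dense"` / `"sparse"` keys are hypothetical, and the retrievers return dummy strings instead of querying SBERT embeddings or a BM25 index.

```python
from typing import Callable, Protocol


class Retriever(Protocol):
    """Minimal interface every retriever is assumed to share."""

    def retrieve(self, query: str, k: int = 3) -> list[str]: ...


class DenseRetriever:
    """Placeholder for an embedding-based (e.g. SBERT) retriever."""

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        return [f"dense result for {query!r}"][:k]


class SparseRetriever:
    """Placeholder for a keyword-based (e.g. BM25) retriever."""

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        return [f"sparse result for {query!r}"][:k]


# Maps a config value to a constructor; new retrievers are added here.
_REGISTRY: dict[str, Callable[[], Retriever]] = {
    "dense": DenseRetriever,
    "sparse": SparseRetriever,
}


def get_retriever(kind: str) -> Retriever:
    """Build the retriever named by a config string (case-insensitive)."""
    try:
        return _REGISTRY[kind.lower()]()
    except KeyError:
        options = ", ".join(sorted(_REGISTRY))
        raise ValueError(f"Unknown retriever {kind!r}; expected one of: {options}") from None


if __name__ == "__main__":
    retriever = get_retriever("dense")  # in practice the value would come from config
    print(retriever.retrieve("When does recruiting start?"))
```

Keeping the selection in one place means the UI and chain code only depend on the shared `retrieve` interface, which fits the stated goal of a structure that later cohorts can maintain.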
