Add HSTU model #122
Conversation
Introduces the HSTU (Hierarchical Sequential Transduction Units) generative recommendation model, including core layers, model definition, utility functions, and trainer. Adds MovieLens-1M preprocessing for HSTU, example scripts, and sequence data utilities. Updates package imports to support new generative components.
Major improvements to the MovieLens-1M preprocessing and HSTU model pipeline: preprocessing now uses a sliding window strategy to generate multiple training samples per user, includes time-difference features for time-aware modeling, and applies cold-start filtering. The HSTU model and layers now support time embeddings and causal masking. The training and evaluation scripts are updated to handle time-difference inputs and provide ranking metrics. These changes improve data efficiency, model expressiveness, and evaluation rigor.
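As a rough illustration of the sliding-window strategy described above, a minimal sketch of the per-user sample generation might look like the following (column names, window size, and the exact time-difference encoding are assumptions here; the actual preprocess_ml_hstu.py may differ):

```python
import numpy as np
import pandas as pd


def make_sliding_samples(events: pd.DataFrame, max_len: int = 200, min_len: int = 5) -> pd.DataFrame:
    """Generate multiple (history, time-diff, target) samples per user.

    `events` is assumed to have columns user_id, item_id, timestamp, sorted by timestamp.
    Users with fewer than `min_len` interactions are dropped (cold-start filtering).
    """
    samples = []
    for user_id, grp in events.groupby("user_id"):
        items = grp["item_id"].to_numpy()
        ts = grp["timestamp"].to_numpy()
        if len(items) < min_len:  # cold-start filtering
            continue
        # time difference to the previous interaction; 0 for the first event
        time_diff = np.diff(ts, prepend=ts[0])
        # sliding window: every prefix ending at position t predicts item t
        for t in range(min_len - 1, len(items)):
            start = max(0, t - max_len)
            samples.append({
                "user_id": user_id,
                "hist_items": items[start:t].tolist(),
                "hist_time_diff": time_diff[start:t].tolist(),
                "target_item": int(items[t]),
            })
    return pd.DataFrame(samples)
```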
Improved MovieLens-1M preprocessing for HSTU by switching to a sliding window strategy, adding time-difference features, and updating documentation and comments for clarity. Unified data format to include time-aware features, updated training and evaluation scripts to use time-aware positional encoding, and enhanced docstrings for HSTU layers and blocks. These changes align the implementation more closely with Meta's official HSTU logic and improve ranking metrics by better modeling temporal information.
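One common way to realize the time-aware positional encoding mentioned above is a learned bias over bucketized time gaps that is added to the attention logits. The sketch below is only illustrative and is not necessarily how the PR's RelPosBias is implemented:

```python
import torch
import torch.nn as nn


class TimeAwareRelPosBias(nn.Module):
    """Illustrative time-aware attention bias: bucketize log time gaps and
    learn one bias value per (bucket, head). Not necessarily the PR's RelPosBias."""

    def __init__(self, num_heads: int, num_buckets: int = 128):
        super().__init__()
        self.num_buckets = num_buckets
        self.bias = nn.Embedding(num_buckets, num_heads)

    def forward(self, timestamps: torch.Tensor) -> torch.Tensor:
        # timestamps: (batch, seq_len) in seconds
        gap = (timestamps.unsqueeze(2) - timestamps.unsqueeze(1)).abs().float()
        bucket = torch.log1p(gap).long().clamp(max=self.num_buckets - 1)
        # (batch, seq_len, seq_len, heads) -> (batch, heads, seq_len, seq_len),
        # ready to be added to the attention logits
        return self.bias(bucket).permute(0, 3, 1, 2)
```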
Added 'HSTU Reproduction' to the blog section in mkdocs.yml for both English and Chinese navigation. Updated the Chinese blog post to summarize recent HSTU-related commits and removed outdated commit details.
Reformatted code across HSTU-related modules and data utilities to use more compact function calls and initialization patterns, reducing unnecessary line breaks and improving readability. No functional changes were made; this is a style and maintainability update.
HSTU model
Bumped yapf and isort versions in CI workflow to match pyproject.toml. Removed test_hstu_imports.py, which contained import and basic functionality tests for HSTU components.
Codecov Report ❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #122 +/- ##
==========================================
- Coverage 39.20% 36.39% -2.81%
==========================================
Files 47 52 +5
Lines 2844 3283 +439
==========================================
+ Hits 1115 1195 +80
- Misses 1729 2088 +359
Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
Pull Request
What does this PR do?
This PR implements the HSTU (Hierarchical Sequential Transduction Units) generative recommendation model, an advanced sequential recommendation model proposed by Meta. Key features include:

Complete HSTU model implementation
- HSTULayer: multi-head attention + gating + FFN (see the sketch after this list)
- HSTUBlock: stacked HSTU transduction units
- HSTUModel: time-aware autoregressive generative recommendation

Time-aware data preprocessing pipeline

Training and evaluation framework
- SeqTrainer

Utilities and documentation
- RelPosBias, VocabMapper, VocabMask
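For orientation, here is a minimal sketch of the "multi-head attention + gating + FFN" composition listed above; it is only illustrative and does not reproduce the exact HSTULayer interface in torch_rechub/basic/layers.py:

```python
import torch
import torch.nn as nn


class GatedAttentionLayer(nn.Module):
    """Minimal sketch: causal multi-head attention, input-conditioned gating, then FFN.
    The real HSTULayer may differ in normalization, gating form, and bias injection."""

    def __init__(self, d_model: int, n_heads: int, d_ff: int, dropout: float = 0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
        self.gate = nn.Linear(d_model, d_model)  # elementwise gate on the attention output
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor, causal_mask: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); causal_mask: (seq_len, seq_len) bool, True = masked
        h, _ = self.attn(x, x, x, attn_mask=causal_mask, need_weights=False)
        h = torch.sigmoid(self.gate(x)) * h  # input-conditioned gating
        x = self.norm1(x + h)
        return self.norm2(x + self.ffn(x))
```

A causal mask for autoregressive training can be built with torch.triu(torch.ones(L, L, dtype=torch.bool), diagonal=1), where True marks future positions that attention is not allowed to see.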
Type of Change

Related Issues
N/A - this is a new feature implementation

Key Implementation Details
1. Model Architecture
- HSTULayer (torch_rechub/basic/layers.py)
- HSTUModel (torch_rechub/models/generative/hstu.py)

2. Data Processing
- Preprocessing script (examples/generative/data/ml-1m/preprocess_ml_hstu.py)

3. Training Framework
- SeqTrainer (torch_rechub/trainers/seq_trainer.py)

4. Comparison with Official Implementation
Similarities:
Differences:

How to Test

1. Data Preprocessing
cd examples/generative/data/ml-1m
python preprocess_ml_hstu.py

2. Train Model

3. Run Complete Example

cd examples/generative
python run_hstu_movielens.py

Checklist
- Code follows the project style (ran python config/format_code.py)

Additional Notes
File Structure

Performance Metrics
Preliminary test results on the MovieLens-1M dataset:

Note: these metrics are preliminary results and can be further improved by tuning hyperparameters (e.g., more layers, larger dimensions, more training epochs).
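The metric table itself is not included in this excerpt; for reference, ranking metrics for next-item prediction are commonly reported as HR@K and NDCG@K (the PR does not name the exact metrics here), which can be computed along these lines:

```python
import torch


def hit_rate_and_ndcg_at_k(scores: torch.Tensor, target: torch.Tensor, k: int = 10):
    """scores: (batch, num_items) predicted scores; target: (batch,) ground-truth item ids.
    Returns (HR@k, NDCG@k) averaged over the batch."""
    topk = scores.topk(k, dim=-1).indices            # (batch, k) highest-scoring items
    hits = topk == target.unsqueeze(1)               # (batch, k) bool, True where target found
    hr = hits.any(dim=1).float().mean().item()
    rank = hits.float().argmax(dim=1)                # 0-based rank of the target within top-k
    ndcg = (hits.any(dim=1).float() / torch.log2(rank.float() + 2.0)).mean().item()
    return hr, ndcg
```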
Future Improvements
- Performance optimization
- Feature extensions
- Documentation improvements

References
Thank you for reviewing! Feel free to discuss any questions or suggestions.