Add workflow to run SDGym monthly and publish results #427
Open. R-Palazzo wants to merge 48 commits into main from issue-425-workflow-sdgym
Commits
64f64f6 def 425 (R-Palazzo)
db0ee3a define run_benchmark.ym file (R-Palazzo)
6d63bc6 def 3 (R-Palazzo)
92a4f1b define upload workflow (R-Palazzo)
f6a3d1c restructure files (R-Palazzo)
266cbf2 trigger on pushes (R-Palazzo)
234957f fix upload benchmark workflow (R-Palazzo)
1c25b7f fix workflow (R-Palazzo)
cfc31fd fix run workflow (R-Palazzo)
86c2eaf add unit test upload_benchmark (R-Palazzo)
f7e9733 update write_run_id (R-Palazzo)
d15900c fix sving big pickles (R-Palazzo)
755ca33 increase timeout for large data (R-Palazzo)
fc1f7a1 fix benchmark (R-Palazzo)
ce661ea debug (R-Palazzo)
54b9f31 use logger info (R-Palazzo)
37ea8cf set level to info (R-Palazzo)
3d082cc add logging handler (R-Palazzo)
bff2d22 fix logs (R-Palazzo)
9a07f7a clean + fix region name (R-Palazzo)
af7afbb update aws validation (R-Palazzo)
620c8b0 debug _score (R-Palazzo)
4104835 cleaning (R-Palazzo)
878005d add unit test (R-Palazzo)
0d52be3 make variable name consistent (R-Palazzo)
cf270c4 add region name (R-Palazzo)
7fc725b improve datetime logic (R-Palazzo)
4c411b8 add unit test (R-Palazzo)
c8e3067 address comments (R-Palazzo)
e8eb5be def sclack 1 (R-Palazzo)
a96fe88 pyproject slack sdk (R-Palazzo)
d7fe8bf fix parameter name (R-Palazzo)
c00d200 add token (R-Palazzo)
9f7fe60 update slack message (R-Palazzo)
d5657df update message 1 (R-Palazzo)
a5ccd6d update uploading workflow (R-Palazzo)
84e4c7c fix upload (R-Palazzo)
3eed4fe add unit tests (R-Palazzo)
614b419 clean run_benchmark (R-Palazzo)
e63a58d cleaning 1 (R-Palazzo)
a21c7f8 lauch benchmark with RealTabFormer (R-Palazzo)
fad6908 debug run with timeout 1 (R-Palazzo)
2556dc4 debug run with timeout 2 (R-Palazzo)
0b3f0f6 debug run with timeout 3 (R-Palazzo)
b93a237 debug run with timeout 4 (R-Palazzo)
734045e debug run with timeout 5 (R-Palazzo)
e38dde4 test with 1s time out (R-Palazzo)
584f3a7 run benchmark with timeout (R-Palazzo)
@@ -0,0 +1,34 @@
name: Run SDGym Benchmark

on:
  push:
    branches:
      - issue-425-workflow-sdgym
  workflow_dispatch:
  schedule:
    - cron: '0 5 5 * *'

jobs:
  run-sdgym-benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Set up latest Python
        uses: actions/setup-python@v5
        with:
          python-version-file: 'pyproject.toml'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          python -m pip install -e .[dev]

      - name: Run SDGym Benchmark
        env:
          SLACK_TOKEN: ${{ secrets.SLACK_TOKEN }}
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          AWS_DEFAULT_REGION: ${{ secrets.AWS_REGION }}

        run: invoke run-sdgym-benchmark
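
The `schedule` trigger in the workflow above uses the cron expression `'0 5 5 * *'`: minute 0, hour 5, day-of-month 5, any month, any weekday. GitHub Actions evaluates cron schedules in UTC. As a rough illustration of what that expression means, here is a minimal Python sketch; `next_monthly_run` is a hypothetical helper written for this note, not part of SDGym or of this PR.

```python
from datetime import datetime, timezone


def next_monthly_run(now: datetime, day: int = 5, hour: int = 5) -> datetime:
    """Return the first time strictly after `now` that matches
    cron '0 <hour> <day> * *'. `now` is assumed to be in UTC.

    Hypothetical helper for illustration only.
    """
    # Candidate slot in the current month (day 5 exists in every month).
    candidate = now.replace(day=day, hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # This month's slot has already passed: roll over to next month.
        if now.month == 12:
            candidate = candidate.replace(year=now.year + 1, month=1)
        else:
            candidate = candidate.replace(month=now.month + 1)
    return candidate


# A run checked on 20 December rolls over into January of the next year.
print(next_monthly_run(datetime(2025, 12, 20, tzinfo=timezone.utc)))
# prints 2026-01-05 05:00:00+00:00
```

Two caveats from the GitHub Actions documentation apply to the real schedule: `schedule` events are only triggered from the workflow file on the default branch, and scheduled runs can start later than the nominal time when the service is busy.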
@@ -0,0 +1,101 @@
name: Upload SDGym Benchmark results

on:
  workflow_run:
    workflows: ["Run SDGym Benchmark"]
    types:
      - completed
  workflow_dispatch:
  schedule:
    - cron: '0 6 * * *'

jobs:
  upload-sdgym-benchmark:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up latest Python
        uses: actions/setup-python@v5
        with:
          python-version-file: 'pyproject.toml'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          python -m pip install -e .[dev]

      - name: Upload SDGym Benchmark
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          GITHUB_LOCAL_RESULTS_DIR: ${{ runner.temp }}/sdgym-leaderboard-files
        run: |
          invoke upload-benchmark-results
          echo "GITHUB_LOCAL_RESULTS_DIR=$GITHUB_LOCAL_RESULTS_DIR" >> $GITHUB_ENV

      - name: Prepare files for PR
        if: env.SKIP_UPLOAD != 'true'
        run: |
          mkdir pr-staging
          echo "Looking for files in: $GITHUB_LOCAL_RESULTS_DIR"
          ls -l "$GITHUB_LOCAL_RESULTS_DIR"
          for f in "$GITHUB_LOCAL_RESULTS_DIR"/${FOLDER_NAME}_*.csv; do
            base=$(basename "$f")
            cp "$f" "pr-staging/${base}"
          done

          echo "Files staged for PR:"
          ls -l pr-staging

      - name: Checkout target repo (sdv-dev.github.io)
        if: env.SKIP_UPLOAD != 'true'
        run: |
          git clone https://github.com/sdv-dev/sdv-dev.github.io.git target-repo
          cd target-repo
          git checkout gatsby-home

      - name: Copy results and create PR
        if: env.SKIP_UPLOAD != 'true'
        env:
          GH_TOKEN: ${{ secrets.GH_TOKEN }}
          FOLDER_NAME: ${{ env.FOLDER_NAME }}
        run: |
          cp pr-staging/* target-repo/assets/sdgym-leaderboard-files/
          cd target-repo
          git checkout -b sdgym-benchmark-upload-${FOLDER_NAME}
          git config --local user.name "github-actions[bot]"
          git config --local user.email "41898282+github-actions[bot]@users.noreply.github.com"

          git add assets/
          git commit -m "Upload SDGym Benchmark Results ($FOLDER_NAME)"
          git remote set-url origin https://x-access-token:${GH_TOKEN}@github.com/sdv-dev/sdv-dev.github.io.git
          git push origin sdgym-benchmark-upload-${FOLDER_NAME}

          gh pr create \
            --repo sdv-dev/sdv-dev.github.io \
            --head sdgym-benchmark-upload-${FOLDER_NAME} \
            --base gatsby-home \
            --title "Upload SDGym Benchmark Results ($FOLDER_NAME)" \
            --body "Automated benchmark results upload" \
            --reviewer "pcarapic15"
Review comment on `--reviewer "pcarapic15"`: Can we automerge this instead of having Predrag approve?

          # Capture PR URL
          PR_URL=$(gh pr view sdgym-benchmark-upload-${FOLDER_NAME} \
            --repo sdv-dev/sdv-dev.github.io \
            --json url -q .url)

          echo "PR URL: $PR_URL"
          echo "PR_URL=$PR_URL" >> $GITHUB_ENV

      - name: Send Slack notification
        if: env.SKIP_UPLOAD != 'true'
        env:
          SLACK_TOKEN: ${{ secrets.SLACK_TOKEN }}
        run: |
          invoke notify-sdgym-benchmark-uploaded \
            --folder-name "$FOLDER_NAME" \
            --pr-url "$PR_URL"
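
The "Prepare files for PR" step in this workflow copies every `${FOLDER_NAME}_*.csv` file from the results directory into `pr-staging`. In bash, when that glob matches nothing, the loop runs once over the literal pattern and `cp` fails with a fairly opaque "No such file" error. The sketch below shows the same staging logic in Python with an explicit guard for the empty case. It is only an illustration: `stage_result_files` and its parameters are hypothetical names, not functions from this PR, and it assumes the results are plain CSV files in a single directory.

```python
import shutil
from pathlib import Path


def stage_result_files(results_dir, folder_name, staging_dir="pr-staging"):
    """Copy `<folder_name>_*.csv` files from results_dir into staging_dir.

    Hypothetical helper mirroring the shell loop in the workflow.
    Returns the staged paths, sorted by file name.
    """
    results_dir = Path(results_dir)
    staging_dir = Path(staging_dir)

    # Same selection as the shell glob "${FOLDER_NAME}_*.csv".
    prefix = f"{folder_name}_"
    matches = sorted(
        path
        for path in results_dir.iterdir()
        if path.is_file() and path.name.startswith(prefix) and path.suffix == ".csv"
    )
    if not matches:
        # Fail early with a clear message instead of copying nothing.
        raise FileNotFoundError(
            f"No '{prefix}*.csv' files found in {results_dir}"
        )

    # Unlike a bare `mkdir`, this does not fail if the directory exists.
    staging_dir.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(src, staging_dir / src.name)) for src in matches]
```

Raising on an empty match is one possible design choice; the workflow as written instead relies on `SKIP_UPLOAD` to skip the later steps when there is nothing to publish.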
This can be updated. Since I launched one today, I think every month on the 5th makes sense.