Use larger runners for nightly jobs #933
base: master
Conversation
Make the steps simpler by using tricks such as setting SHELLOPTS, and use the new larger runner (from Google ML Velocity) for the build job, because the job is running out of disk space on the default runner.
These are safe to use.
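As a rough illustration of how those two changes fit together, here is a minimal sketch; the runner label, the SHELLOPTS value, and the step contents are assumptions for illustration, not taken from this PR:

```yaml
# Minimal sketch only; 'self-hosted-large' and the steps are placeholders.
jobs:
  nightly-build:
    # Assumed label for one of the larger Google ML Velocity runners.
    runs-on: self-hosted-large
    env:
      # Exporting SHELLOPTS makes every bash step start with these options
      # enabled, so individual steps don't need their own `set -o pipefail`.
      SHELLOPTS: pipefail
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/build_nightly.sh 2>&1 | tee build.log
```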
It's impossible to test with the new ML runners in a fork, because the fork is not part of the TensorFlow org and thus doesn't have access to the runners. We need to provide the ability to specify the runner when invoking the workflow manually.
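One way to do that is a `workflow_dispatch` input for the runner label; this sketch uses a hypothetical input name and default rather than anything from this PR:

```yaml
# Hypothetical sketch: let a manual run pick the runner label.
on:
  workflow_dispatch:
    inputs:
      runner:
        description: 'Runner label to use for the build job'
        required: false
        default: 'ubuntu-latest'

jobs:
  nightly-build:
    # The inputs context is populated for workflow_dispatch runs.
    runs-on: ${{ inputs.runner }}
```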
In some contexts in GitHub Actions workflows, you have to use single quotes. Remembering when and where in YAML is too hard, so just use single quotes everywhere.
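For example (the values here are illustrative only):

```yaml
# Inside ${{ }} expressions, only single-quoted string literals are valid,
# while plain YAML values accept either quote style, so using single quotes
# in both places keeps the rule uniform.
jobs:
  nightly-build:
    if: ${{ github.event_name == 'schedule' }}
    runs-on: 'ubuntu-latest'
    steps:
      - run: echo 'nightly run'
```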
Force-pushed from 7945915 to e34f32a
Even though this was valid YAML, the GitHub expression parser seems to have a hard time with string literals that contain dashes. And apparently it's unnecessary to provide a fallback in this case anyway.
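Roughly what that looked like, as an assumed illustration rather than the exact expression from this PR:

```yaml
# Assumed illustration of the problem described above.
# A fallback with a dashed string literal, along these lines, is what the
# expression parser appeared to stumble over:
#   runs-on: ${{ inputs.runner || 'ubuntu-22.04-16core' }}
# Dropping the fallback sidesteps the issue:
runs-on: ${{ inputs.runner }}
```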
MichaelBroughton left a comment
This seems like it's combining a lot of changes all at once that may interact in complex ways. Can we split this into two PRs? The first would be just the runner upgrade, very simple and short; verify that it works and doesn't break anything. Then we can move on to all of the other stuff:
In addition, this PR makes some simplifications to the workflows to get rid of some of the overelaborate things I did before, and makes other minor changes in keeping with best practices, such as putting timeouts on jobs.
Yes, that's reasonable.
The nightly jobs are failing due to running out of disk space. (It's the same problem we had recently with the CI checks workflow.) This PR updates the jobs to use the larger self-hosted runners from Google ML Velocity.
In addition, this PR makes some simplifications to the workflows to get rid of some of the overelaborate things I did before, and makes other minor changes in keeping with best practices, such as putting timeouts on jobs.
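For instance, a per-job timeout along these lines (the label and the value are assumptions) keeps a hung nightly job from tying up a runner indefinitely:

```yaml
jobs:
  nightly-build:
    runs-on: self-hosted-large  # assumed label, as in the sketch above
    # Assumed value; fail the job if it hasn't finished within two hours.
    timeout-minutes: 120
```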