-
Notifications
You must be signed in to change notification settings - Fork 12
Set current job as environment variable, and add util to get job type #431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
/unit_test_py |
|
/integration_test |
|
/e2e_test |
GiGL Automation@ 23:21:40UTC : 🔄 @ 24:36:24UTC : ✅ Workflow completed successfully. |
GiGL Automation@ 23:21:45UTC : 🔄 @ 24:30:51UTC : ✅ Workflow completed successfully. |
GiGL Automation@ 23:21:49UTC : 🔄 @ 24:36:14UTC : ✅ Workflow completed successfully. |
svij-sc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a fan of introducing env vars as it may limit flexibility to move to other backends for train/infer. Specifically we want to support K8s training/inference sometime later this year.
Btw, how does this affect if a user is doing local training/inference?
Will we have to set this var?
If it doesn't affect local training I am fine with this change.
I think that it's pretty easy to migrate the env vars to k8s/etc, if we're not able to set env vars at all then I think using that as a backend would be a non-starter. Regardless in the migration to k8s we're going to need to migrate the injected CLI flags (among other things like labels 1).
What do you mean by this? AFAIK people aren't running |
Yes, but those are expected for any dist training. Anyways, I am just concerned but this is not blocking. |
mkolodner-sc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM, thanks Kyle!
Is your concern that we may now need to require users to provide more env vars? I guess that's fair - but given that we build datasets differently based on the component 1 I'm not sure what the other approach here is, we need to signal somehow, either via CLI flag or env var, and I feel that env var is less intrusive. FWIW I don't expect we'll require users to set this all the time, but I do think it'll be useful for graphstore more (which is sort of weird to run locally anyways...) |
Scope of work done
We do this so we can infer the job type downstream, e.g. for
is_inference1.I thought about making this a CLI flag that we inject, similar to
use_cuda2. But I strongly feel that we should not be injecting CLI flags, and that the CLI flags should only be controled by users.If I'm a user, I expect there to be all sorts of stuff in the environment variables, but I'd expect the CLI flags to be mine, (and I don't thikn we should be using the ones we have anyways...)
We could still inject the CLI flag but I think it's going to cause more pain down the road, and its going to be unsafe to add to the colcateed (e.g. non-graphstore jobs).
Another note: I think we should use the "job type" vs
is_inferenceas we may have other "job" types in the future (e.g. some GLT preprocessor step to compute PEs or similar.Where is the documentation for this feature?: N/A
Did you add automated tests or write a test plan?
Updated Changelog.md? NO
Ready for code review?: NO