Changing the hashing methodology for cache folder creation of models. #481
base: main
Conversation
review WIP.
QEfficient/base/modeling_qeff.py
Outdated
@@ -5,7 +5,7 @@
 #
 # ----------------------------------------------------------------------------

-import hashlib
+# import hashlib
Commented-out code.
Make sure commented-out code is not left in ready-to-review PRs.
QEfficient/base/modeling_qeff.py
Outdated
self.model_params.update(kwargs)
self.model_params["config"] = self.model.config.to_diff_dict()
self.model_params["_transform_names"] = self._transform_names()
self.compile_params = {}
Initialize this only when compile is called.
There is no point in creating this dictionary if the user never calls compile.
QEfficient/base/modeling_qeff.py
Outdated
self.model_params = {}
self.model_params.update(kwargs)
Better to do self.model_params = copy.deepcopy(kwargs).
This lets other methods mutate kwargs freely; otherwise we would need to ensure that no other method mutates kwargs.
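A small illustration of the aliasing concern (the `quant_config` key is just a made-up example):

```python
import copy

kwargs = {"quant_config": {"bits": 8}}

# dict.update() copies only the top level; nested objects stay shared with the caller
model_params = {}
model_params.update(kwargs)
kwargs["quant_config"]["bits"] = 4
print(model_params["quant_config"]["bits"])  # 4 -- the hash inputs changed under us

# deepcopy snapshots kwargs, so later mutation elsewhere is harmless
kwargs = {"quant_config": {"bits": 8}}
model_params = copy.deepcopy(kwargs)
kwargs["quant_config"]["bits"] = 4
print(model_params["quant_config"]["bits"])  # 8
```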
QEfficient/base/modeling_qeff.py
Outdated
if export_kwargs is not None:
    self.model_params.update(export_kwargs)
if onnx_transform_kwargs is not None:
    self.model_params.update(onnx_transform_kwargs)
One-liners are better:
self.model_params.update(export_kwargs) if export_kwargs is not None else None
self.model_params.update(onnx_transform_kwargs) if onnx_transform_kwargs is not None else None
QEfficient/base/modeling_qeff.py
Outdated
self.model_params["output_names"] = output_names | ||
self.model_params["dynamic_axes"] = dynamic_axes |
Better to keep them one more level down, as self.model_params["export_params"] = export_params, and put all export params inside export_params, which is another dict.
That makes the dumped JSON readable for the user.
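A sketch of the nested layout being suggested (keys and values below are illustrative, not the repo's actual params):

```python
import json

model_params = {
    "config": {"model_type": "llama", "num_hidden_layers": 32},
    "export_params": {
        "output_names": ["logits", "past_key_values"],
        "dynamic_axes": {"input_ids": {0: "batch_size", 1: "seq_len"}},
    },
}
# Grouping export-related keys under "export_params" keeps the dumped JSON grouped
# instead of flat, which is easier for a user to read.
print(json.dumps(model_params, indent=4, default=str))
```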
QEfficient/base/modeling_qeff.py
Outdated
model_params_json = export_dir / "model_params.json"
with open(model_params_json, "w") as fp:
    json.dump(
        {
            "model_params": [
                {k: make_serializable(self.model_params[k]) for k in sorted(self.model_params.keys())}
            ]
        },
        fp,
        indent=4,
    )
Dumping should happen after export.
If the model errors out during export and we still dump the JSON, it does not make sense.
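One way to sequence it, as a sketch with assumed helper names rather than the repo's actual code:

```python
import json
from pathlib import Path
from typing import Callable

def export_then_dump(export_fn: Callable[[], Path], export_dir: Path, model_params: dict) -> Path:
    """Run the export step first; write model_params.json only if it succeeded."""
    onnx_path = export_fn()  # raises on failure, so no JSON is written for a broken export
    with open(export_dir / "model_params.json", "w") as fp:
        json.dump(model_params, fp, indent=4, default=str)
    return onnx_path
```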
-self.pretrained_model_name_or_path = kwargs.get("pretrained_model_name_or_path", None)
+# self.pretrained_model_name_or_path = kwargs.get("pretrained_model_name_or_path", None)
?
To ensure the user does not initialize the model directly via the modeling class's init, please use metaclass control to control the flow and print the warning or raise the error. For example:
```python
class NoInitMeta(type):
    def __call__(cls, *args, **kwargs):
        raise RuntimeError("Use `from_pretrained` to create an instance.")

class MyModel(metaclass=NoInitMeta):
    def __init__(self, data):
        self.data = data

    @classmethod
    def from_pretrained(cls, path):
        instance = object.__new__(cls)
        instance.__init__(f"Loaded from {path}")
        return instance
```
You can read more about metaclasses here: https://stackoverflow.com/questions/100003/what-are-metaclasses-in-python.
Put that metaclass in utils and use it for all the modeling classes.
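Quick usage check for the sketch above:

```python
try:
    MyModel("direct")  # blocked: NoInitMeta.__call__ raises before __init__ runs
except RuntimeError as err:
    print(err)  # Use `from_pretrained` to create an instance.

model = MyModel.from_pretrained("/some/path")  # allowed: bypasses the metaclass via object.__new__
print(model.data)  # Loaded from /some/path
```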
QEfficient/base/modeling_qeff.py
Outdated
@@ -357,6 +388,19 @@ def _compile(
    logger.info(f"Running compiler: {' '.join(command)}")
    try:
        subprocess.run(command, capture_output=True, check=True)

        # Dumping compile paramters in a JSON file after successful QPC compilation
Remove all the code related to compile_param_json from here, including the dumping, and handle it inside the dump_qconfig decorator. Let's keep the base methods clean.
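A rough sketch of how such a decorator could wrap _compile; this is not the repo's actual dump_qconfig, and the attribute and file names are assumptions:

```python
import functools
import json
from pathlib import Path

def dump_qconfig(func):
    """Dump the recorded compile parameters next to the QPC only after _compile succeeds."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        qpc_path = func(self, *args, **kwargs)        # run the wrapped _compile first
        params = getattr(self, "compile_params", {})  # whatever the method recorded
        with open(Path(qpc_path).parent / "compile_params.json", "w") as fp:
            json.dump(params, fp, indent=4, default=str)
        return qpc_path
    return wrapper
```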
- There is a typo in "paramters" on line 391.
- Define a method in utils for creating the compile/export params JSON, use that same method for both the export and compile JSON, and dump it using the dump_json method.
QEfficient/base/modeling_qeff.py
Outdated
model_params_json = export_dir / "model_params.json"
with open(model_params_json, "w") as fp:
    json.dump(
        {
Same as for compile: create a decorator and handle all these param updates and the dumping inside it.
Also use the dump_json from the utils.
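A minimal sketch of the kind of shared helper being suggested; the actual dump_json in utils may have a different name or signature:

```python
import json
from pathlib import Path

def dump_json(data: dict, json_path: Path) -> None:
    """Write the export/compile params JSON through one shared utils helper."""
    json_path.parent.mkdir(parents=True, exist_ok=True)
    with open(json_path, "w") as fp:
        json.dump(data, fp, indent=4, sort_keys=True, default=str)
```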
QEfficient/base/modeling_qeff.py
Outdated
export_params["output_names"] = output_names | ||
export_params["dynamic_axes"] = dynamic_axes | ||
|
||
self.model_params["export_params"] = export_params |
Handle it in the decorator. Let's keep our base methods clean.
Agreed, let's write a decorator implementation to handle this.
Do we need a decorator here? How would we return the hash from the decorator and construct the export dir?
Wouldn't it be better to move the hash creation to a separate function and call it from here?
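A sketch of the separate-function idea raised here (function name, serialization, and truncation length are assumptions, not the final API):

```python
import hashlib
import json

def create_export_hash(params: dict, hash_len: int = 16) -> str:
    """Standalone hash helper so _export can build the export dir name directly."""
    payload = json.dumps(params, sort_keys=True, default=str).encode()
    return hashlib.sha256(payload).hexdigest()[:hash_len]

# Illustrative call site inside _export:
# export_hash = create_export_hash(self.model_params)
# export_dir = cache_root / self.model_architecture / f"{model_type}-{export_hash}"
```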
QEfficient/base/modeling_qeff.py
Outdated
# Store Model parameters to Calculate Hash for caching
self.model_params = {}
self.model_params = copy.deepcopy(kwargs)
- Do we need self.model_params = {}?
- Move the updating of model_params into a method; it keeps the init clean.
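One way to keep the init clean, sketched against a simplified stand-in for the base class (the helper name is an assumption):

```python
import copy

class QEFFBaseModel:  # simplified stand-in for the real base class
    _pytorch_transforms = []

    def __init__(self, model, **kwargs):
        self.model = model
        self._init_model_params(**kwargs)

    def _init_model_params(self, **kwargs):
        # deepcopy alone is enough; no separate self.model_params = {} is needed
        self.model_params = copy.deepcopy(kwargs)
        self.model_params["config"] = self.model.config.to_diff_dict()
        self.model_params["applied_transforms"] = self._transform_names()

    def _transform_names(self):
        return [t.__name__ for t in self._pytorch_transforms]
```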
Force-pushed from 99ff668 to 7b9cfba.
QEfficient/base/modeling_qeff.py
Outdated
model_params = copy.deepcopy(kwargs)

model_params["config"] = self.model.config.to_diff_dict()
model_params["_transform_names"] = self._transform_names()
You can name it "applied_transforms" instead of "_transform_names". Also, move this method below the init.
QEfficient/base/modeling_qeff.py
Outdated
if hasattr(self.model.config, "architectures"):
    self.model_architecture = self.model.config.architectures[0]
This would leave self.model_architecture unset (raising an AttributeError later) if the config doesn't have architectures. Instead you can use:
self.model_architecture = getattr(self.model.config, "architectures", [None])[0]
QEfficient/base/modeling_qeff.py
Outdated
dynamic_axes=dynamic_axes,
export_kwargs=export_kwargs,
onnx_transform_kwargs=onnx_transform_kwargs,
export_dir=export_dir,
Why do we need export dir in hash params?
QEfficient/base/modeling_qeff.py
Outdated
@@ -210,6 +236,11 @@ def _export(
    finally:
        shutil.rmtree(tmp_onnx_dir, ignore_errors=True)

    # Dump JSON file with hashed parameters
    hashed_params_export_path = export_dir / "hashed_model_params.json"
(Suggestion) Would it be better to name it export_model_params or something similar, since we have compile_model_params as well?
# Check if already compiled
compile_hash = compile_hash.hexdigest()[:16]
compile_hash, hashed_params = hash_compile_params(
Remove the comment above, "# Check if already compiled", as it is not needed here.
# Make Embedding specific transforms like appending pooling
if pooling:
    self.model, _ = PoolingTransform.apply(self.model, pooling)

self.model.base_model.config.use_cache = True

self.pretrained_model_name_or_path = kwargs.get("pretrained_model_name_or_path", None)
self.hash_params["qeff_class"] = self.__class__.__name__
Nit: which would be better for readability, "qeff_class" or "qeff_auto_class"?
QEfficient/utils/cache.py
Outdated
@@ -39,3 +43,11 @@ def to_hashable(obj) -> bytes:
        default=json_serializable,
        sort_keys=True,
    ).encode()


def hash_dict_params(dict_items: Dict, hash_string_size: int = HASH_HEXDIGEST_STR_LEN):
It would be better to create a new Python file in utils and keep all hash-related helpers there.
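A sketch of what a dedicated hash-utilities module could contain (module name and constant are assumptions, and to_hashable here is simplified compared with the one in utils/cache.py):

```python
# QEfficient/utils/hash_utils.py (hypothetical location)
import hashlib
import json
from typing import Dict

HASH_HEXDIGEST_STR_LEN = 16

def to_hashable(obj) -> bytes:
    """Serialize an arbitrary params dict deterministically."""
    return json.dumps(obj, default=str, sort_keys=True).encode()

def hash_dict_params(dict_items: Dict, hash_string_size: int = HASH_HEXDIGEST_STR_LEN) -> str:
    """Return a truncated SHA-256 hex digest of the given parameters."""
    return hashlib.sha256(to_hashable(dict_items)).hexdigest()[:hash_string_size]
```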
Force-pushed from b28d33e to a07406b.
… QNN compilation not included yet. The cache folder mechanism has been modified to have a parent directory for a model based on the architecture retrieved from the model config. The hash calculation for the ONNX export now incorporates all model kwargs as well as export kwargs and parameters. The parameters used to create the hash also get dumped as a serialized JSON file in the ONNX folder; the same happens for the compile parameters inside the respective QPC folder. Signed-off-by: Dhiraj Kumar Sah <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
This fixed the issue with higher BS compilation for SwiftKV models:
```
Compiler command: ['/opt/qti-aic/exec/qaic-exec', '-aic-hw', '-aic-hw-version=2.0', '-m=/prj/qct/aisyssol_scratch/users/shagsood/quic_shagun/LlamaSwiftKVForCausalLM-a5879ebc0e59ab40/LlamaSwiftKVForCausalLM.onnx', '-compile-only', '-retained-state', '-convert-to-fp16', '-aic-num-cores=16', '-network-specialization-config=/prj/qct/aisyssol_scratch/users/shagsood/quic_shagun/LlamaSwiftKVForCausalLM-a5879ebc0e59ab40/qpc-60f86f912a187346/specializations.json', '-custom-IO-list-file=/prj/qct/aisyssol_scratch/users/shagsood/quic_shagun/LlamaSwiftKVForCausalLM-a5879ebc0e59ab40/qpc-60f86f912a187346/custom_io.yaml', '-mdp-load-partition-config=/prj/qct/aisyssol_scratch/users/shagsood/quic_shagun/LlamaSwiftKVForCausalLM-a5879ebc0e59ab40/qpc-60f86f912a187346/mdp_ts_4.json', '-aic-binary-dir=/prj/qct/aisyssol_scratch/users/shagsood/quic_shagun/LlamaSwiftKVForCausalLM-a5879ebc0e59ab40/qpc-60f86f912a187346/qpc']
Compiler exitcode: 1
Compiler stderr:
QAIC_ERROR:
Error message: [Operator-'/model/layers.16/self_attn/Reshape'] : Reshape: input shape (4, 4, 4096) and output shape (4, 1, 32, 128) have different number of elements (in 65536 vs. out 16384)
Unable to AddNodesToGraphFromModel
```
Tested with BS4. Able to compile now. Signed-off-by: quic-shagun <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
CI enablement and other minor fixes for Gemma3 --------- Signed-off-by: Ann Kuruvilla <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Added fix for spdtransform due to change in hash --------- Signed-off-by: Dipankar Sarkar <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
- Enabled CI tests for Finetuning. - Updated Jenkins file to install torch_qaic as it is required during FT tests. - Added finetune as a new pytest flag and updated other existing tests not to trigger for this flag. --------- Signed-off-by: meetkuma <[email protected]> Co-authored-by: Meet Patel <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
CI enablement and other minor fixes for Gemma3 --------- --------- Signed-off-by: Ann Kuruvilla <[email protected]> Signed-off-by: Dipankar Sarkar <[email protected]> Co-authored-by: Dipankar Sarkar <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Reverts quic#484 Signed-off-by: Dhiraj Kumar Sah <[email protected]>
…boarded features in docs (quic#423) This PR is created for updating the readme and docs for adding the latest features added in this release. --------- Signed-off-by: Abukhoyer Shaik <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
…next file. (quic#475) Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Signed-off-by: Rishin Raj <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Padding the dataset with dummy samples (they won't contribute to total_loss) to make the number of samples a multiple of (degree of ddp * batch_size) in case of 1) fine-tuning through DDP, 2) train_batch_size > 1 or val_batch_size > 0. --------- Signed-off-by: Swati Allabadi <[email protected]> Co-authored-by: Swati Allabadi <[email protected]> Co-authored-by: Mamta Singh <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Generating data format config file fails for encoder onnx graph without past key or past value. Fixed a coding bug in the function. --------- Signed-off-by: Shubham Agrawal <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Changed Total (E2E) inference time from decode/sec to sec. Signed-off-by: Asmita Goswami <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
…ng optimizer step only. (quic#477) Disabling gradient sync is necessary when using gradient_accumulation_step > 1 with DDP enabled. Currently, we are syncing gradients at every loss.backward() call, which happens at every step. When using gradient accumulation, the weight update happens only during the opt.step() step, and only at that step should the gradients across devices be synced with each other. The model.no_sync() context manager solves this issue; here we are not using it but instead setting ddp_model.require_backward_grad_sync to True or False depending on which step we are on. --------- Signed-off-by: Meet Patel <[email protected]> Signed-off-by: meetkuma <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
…uic#371) 1. Implement logger for finetuning 2. enable dumping logs by given flag --------- Signed-off-by: Mamta Singh <[email protected]> Co-authored-by: Mamta Singh <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Falcon Modeling fix to accommodate multiple config. This is a fix for falcon 40b Signed-off-by: Dipankar Sarkar <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
…ic#482) - Removed all the references of samsum dataset from finetuning code. - Samsum dataset can be used via custom dataset path. --------- Signed-off-by: Meet Patel <[email protected]> Signed-off-by: meetkuma <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Signed-off-by: Rishin <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Upgrading onnx, onnxruntime, onnxscript and protobuf. Also updating transformers to 4.52.3. 1. onnx==1.18.0 2. onnxruntime==1.22 3. onnxscript==0.2.5 4. protobuf==6.31.0 --------- Signed-off-by: Dipankar Sarkar <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
1. fix task_type variable in configs 2. enabled passing peft_config yaml/json file from command line. 3. updated run_ft_model.py --------- Signed-off-by: Mamta Singh <[email protected]> Co-authored-by: Mamta Singh <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Signed-off-by: Dhiraj Kumar Sah <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Signed-off-by: Dhiraj Kumar Sah <[email protected]> Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Signed-off-by: Dhiraj Kumar Sah <[email protected]>
…or export Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Functions made to filter and modify hashes. Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Signed-off-by: Dhiraj Kumar Sah <[email protected]>
…m_pretrained based model creation. Will add that functionality separately after the hashing methodology is finalized. Signed-off-by: Dhiraj Kumar Sah <[email protected]>
…ents on naming and ordering as well. Signed-off-by: Dhiraj Kumar Sah <[email protected]>
…file to contain all hashing related methods and tools. Minor code clean ups. Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Force-pushed from a07406b to c5bed92.
…' instead of 'model_params'. Signed-off-by: Dhiraj Kumar Sah <[email protected]>
…into an error if that parameter doesn't exist in the config. Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Signed-off-by: Dhiraj Kumar Sah <[email protected]>
Detaching the hash function for model cache path calculation. Changes for QNN compilation are not included yet.
The cache folder mechanism has been modified to have a parent directory for each model based on the architecture retrieved from the model config. The hash calculation for the ONNX export now incorporates all model kwargs as well as export kwargs and parameters. The parameters used to create the hash are also dumped as a serialized JSON file in the ONNX folder; the same happens for the compile parameters inside the respective QPC folder.
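A sketch of the cache layout and hashing flow described above; directory names, the JSON file name, and the helper are illustrative rather than the exact implementation:

```python
import hashlib
import json
from pathlib import Path

def export_cache_dir(cache_root: Path, architecture: str, model_kwargs: dict, export_kwargs: dict) -> Path:
    """Illustrative: one parent dir per architecture, hash over model + export params."""
    hash_inputs = {"model_kwargs": model_kwargs, "export_kwargs": export_kwargs}
    export_hash = hashlib.sha256(
        json.dumps(hash_inputs, sort_keys=True, default=str).encode()
    ).hexdigest()[:16]
    export_dir = cache_root / architecture / f"onnx-{export_hash}"
    export_dir.mkdir(parents=True, exist_ok=True)
    # the same params that produced the hash get dumped alongside the ONNX artifacts
    with open(export_dir / "export_model_params.json", "w") as fp:
        json.dump(hash_inputs, fp, indent=4, default=str)
    return export_dir
```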