Releases: snowflakedb/snowflake-ml-python
1.2.0
Bug Fixes
- Model Registry: Fix "XGBoost version not compiled with GPU support" error when running CPU inference against open-source XGBoost models deployed to SPCS.
- Model Registry: Fix model deployment to SPCS on Windows machines.
Behavior Changes
New Features
- Model Development: Introduced XGBoost external memory training feature. This feature enables training XGBoost models on large datasets that don't fit into memory.
- Registry: New Registry class named `snowflake.ml.registry.Registry` that provides APIs similar to the old one but works with the new MODEL object in Snowflake SQL. Also, we are providing `snowflake.ml.model.Model` and `snowflake.ml.model.ModelVersion` to represent a model and a specific version of a model.
- Model Development: Add support for `fit_predict` method in `AgglomerativeClustering`, `DBSCAN`, and `OPTICS` classes.
- Model Development: Add support for `fit_transform` method in `MDS`, `SpectralEmbedding` and `TSNE` classes.
Additional Notes
- Model Registry: The `snowflake.ml.registry.model_registry.ModelRegistry` has been deprecated starting from version 1.2.0. It will stay in the Private Preview phase. For future implementations, kindly utilize `snowflake.ml.registry.Registry`, except when specifically required. The old model registry will be removed once all its primary functionalities are fully integrated into the new registry.
1.1.2
Bug Fixes
- Generic: Fix an issue where the stack trace was unexpectedly hidden by telemetry.
- Model Development: Execute model signature inference without materializing full dataframe in memory.
- Model Registry: Fix occasional 'snowflake-ml-python library does not exist' error when deploying to SPCS.
Behavior Changes
- Model Registry: When calling `predict` with Snowpark DataFrame, both inferred and normalized column names are accepted.
- Model Registry: When logging a Snowpark ML Modeling Model, sample input data or manually provided signature will be ignored since they are not necessary.
New Features
- Model Development: SQL implementation of binary `precision_score` metric.
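For reference, binary precision is the share of predicted positives that are truly positive, TP / (TP + FP). The following is a minimal plain-Python sketch of that definition, not the library's SQL implementation; the helper name `binary_precision` and the choice to return 0.0 when nothing is predicted positive are assumptions made for illustration.

```python
from typing import Sequence


def binary_precision(y_true: Sequence[int], y_pred: Sequence[int], pos_label: int = 1) -> float:
    """Hypothetical helper: precision = TP / (TP + FP) for one positive label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == pos_label and t == pos_label)
    # False positives: predicted positive but actually something else.
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == pos_label and t != pos_label)
    if tp + fp == 0:
        # Assumption: report 0.0 when there are no positive predictions.
        return 0.0
    return tp / (tp + fp)


precision = binary_precision([1, 0, 1, 1, 0], [1, 1, 1, 0, 0])
print(precision)
```

In the sample call, three rows are predicted positive and two of them are correct, so the result is two thirds. An aggregation-based implementation would compute the same two counts before dividing.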
1.1.1
Bug Fixes
- Model Registry: The `predict` target method on registered models is now compatible with unsupervised estimators.
- Model Development: Fix confusion_matrix incorrect results when the row number cannot be divided by the batch size.
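The confusion_matrix fix concerns inputs whose row count is not a multiple of the batch size. The sketch below is not the library's code; it is a plain-Python illustration, with the hypothetical helper `batched_confusion_matrix`, of accumulating counts batch by batch so that the short final batch is still included.

```python
from typing import List, Sequence


def batched_confusion_matrix(
    y_true: Sequence[int], y_pred: Sequence[int], n_classes: int, batch_size: int
) -> List[List[int]]:
    """Hypothetical helper: accumulate counts per batch, including a short last batch."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matrix = [[0] * n_classes for _ in range(n_classes)]
    # Stepping by batch_size also yields the start of the final partial batch,
    # so rows after the last full batch are not dropped.
    for start in range(0, len(y_true), batch_size):
        stop = min(start + batch_size, len(y_true))
        for actual, predicted in zip(y_true[start:stop], y_pred[start:stop]):
            matrix[actual][predicted] += 1
    return matrix


# Five rows with batch_size=2 leave a final batch of a single row.
result = batched_confusion_matrix([0, 1, 1, 0, 1], [0, 1, 0, 0, 1], n_classes=2, batch_size=2)
print(result)
```

A quick sanity check for this kind of bug is that the matrix entries sum to the number of input rows.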
Behavior Changes
New Features
- Introduced passthrough_col param in Modeling API. This new param is helpful in scenarios that require automatic input_cols inference but need to exclude specific columns, like index columns, during training or inference.
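The idea behind the passthrough parameter can be sketched as a simple column filter. This is an illustration under assumptions, not the Modeling API itself: the function `infer_input_cols` and its argument names are hypothetical, and the column names are made up.

```python
from typing import List, Sequence


def infer_input_cols(
    all_cols: Sequence[str],
    label_cols: Sequence[str] = (),
    passthrough: Sequence[str] = (),
) -> List[str]:
    """Hypothetical helper: every column that is neither a label nor passed through."""
    excluded = set(label_cols) | set(passthrough)
    # Keep the original column order for the remaining feature columns.
    return [col for col in all_cols if col not in excluded]


columns = ["ROW_ID", "AGE", "INCOME", "LABEL"]
inferred = infer_input_cols(columns, label_cols=["LABEL"], passthrough=["ROW_ID"])
print(inferred)
```

Here the index-like `ROW_ID` column stays in the data but is kept out of the inferred feature list.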
1.1.0
Bug Fixes
- Model Registry: Fix pandas DataFrame input not handling first row properly.
- Model Development: OrdinalEncoder and LabelEncoder output_columns do not need to be valid snowflake identifiers. They
would previously be excluded if the normalized name did not match the name specified in output_columns.
Behavior Changes
New Features
- Model Registry: Add support for invoking public endpoint on SPCS service, by providing an "enable_ingress" SPCS deployment option.
- Model Development: Add support for distributed HPO - GridSearchCV and RandomizedSearchCV execution will be distributed on multi-node warehouses.
1.0.12
Bug Fixes
- Model Registry: Fix regression issue that container logging is not shown during model deployment to SPCS.
- Model Development: Enhance the column capacity of OrdinalEncoder.
- Model Registry: Fix unbound `batch_size` error when deploying a model other than Hugging Face Pipeline and LLM with GPU on SPCS.
Behavior Changes
- Model Registry: Raise early error when deploying to SPCS with db/schema that starts with underscore.
- Model Registry: `conda-forge` channel is now automatically added to channel lists when deploying to SPCS.
- Model Registry: `relax_version` will not strip all version specifiers; instead, it will relax the `==x.y.z` specifier to `>=x.y,<(x+1)`.
- Model Registry: Python with a different patch level but the same major and minor version will not result in a warning when loading the model via Model Registry, and will be considered usable when deploying to SPCS.
- Model Registry: When logging a `snowflake.ml.model.models.huggingface_pipeline.HuggingFacePipelineModel` object, versions of locally installed libraries won't be picked as dependencies of models; instead, some pre-defined dependencies are used to improve user experience.
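The relaxation rule for pinned versions stated above (an exact `==x.y.z` pin becomes `>=x.y,<(x+1)`) can be sketched as follows. This is a minimal illustration of the rule as written in the note, not the library's implementation; the helper name `relax_pinned_requirement` and the regular expression are assumptions.

```python
import re

# Hypothetical pattern: a package name, an exact pin, and at least major.minor.
_PINNED = re.compile(
    r"^(?P<name>[A-Za-z0-9_.\-]+)==(?P<major>\d+)\.(?P<minor>\d+)(?:\.\d+)*$"
)


def relax_pinned_requirement(requirement: str) -> str:
    """Hypothetical helper: relax ==x.y.z to >=x.y,<(x+1); leave anything else as-is."""
    match = _PINNED.match(requirement.strip())
    if match is None:
        # Not an exact pin (for example a range), so nothing is relaxed.
        return requirement
    major = int(match.group("major"))
    minor = int(match.group("minor"))
    return f"{match.group('name')}>={major}.{minor},<{major + 1}"


print(relax_pinned_requirement("xgboost==1.7.3"))
print(relax_pinned_requirement("numpy>=1.23"))
```

With this rule, a pin such as `xgboost==1.7.3` still accepts later 1.x releases while excluding the next major version.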
New Features
- Model Registry: Enable best-effort SPCS job/service log streaming when logging level is set to INFO.
1.0.11
New Features
- Model Registry: Add log_artifact() public method.
- Model Development: Add support for `kneighbors`.
Behavior Changes
- Model Registry: Change log_model() argument from TrainingDataset to List of Artifact.
- Model Registry: Change get_training_dataset() to get_artifact().
Bug Fixes
- Model Development: Fix support for XGBoost and LightGBM models using SKLearn Grid Search and Randomized Search model selectors.
- Model Development: DecimalType is now supported as a DataType.
- Model Development: Fix metrics compatibility with Snowpark Dataframes that use Snowflake identifiers.
1.0.10
Behavior Changes
- Model Development: precision_score, recall_score, f1_score, fbeta_score, precision_recall_fscore_support, mean_absolute_error, mean_squared_error, and mean_absolute_percentage_error metric calculations are now distributed.
- Model Registry: `deploy` will now return `Deployment` for deployment information.
New Features
- Model Registry: When the model signature is auto-inferred, it will be printed to the log for reference.
- Model Registry: For SPCS deployment, `Deployment` details will contain `image_name`, `service_spec` and `service_function_sql`.
Bug Fixes
- Model Development: Fix an issue leading to UTF-8 decoding errors when using modeling modules on Windows.
- Model Development: Fix an issue where alias definitions cause `SnowparkSQLUnexpectedAliasException` in inference.
- Model Registry: Fix an issue where signature inference could be incorrect when using Snowpark DataFrame as sample input.
- Model Registry: Fix overly strict data type validation when predicting. Now, for example, if the signature has an INT8 feature and you provide an INT64 dataframe whose values are all within the INT8 range, prediction does not fail.
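The relaxed validation can be thought of as checking values against the range of the signature type rather than comparing type names. The sketch below is an assumption-laden illustration in plain Python, not the registry's validation code; `INT_RANGES` and `values_fit_signature` are hypothetical names.

```python
from typing import Dict, Iterable, Tuple

# Inclusive ranges of signed integer types (two's complement).
INT_RANGES: Dict[str, Tuple[int, int]] = {
    "INT8": (-(2**7), 2**7 - 1),
    "INT16": (-(2**15), 2**15 - 1),
    "INT32": (-(2**31), 2**31 - 1),
    "INT64": (-(2**63), 2**63 - 1),
}


def values_fit_signature(values: Iterable[int], signature_type: str) -> bool:
    """Hypothetical check: accept wider input when every value fits the signature type."""
    low, high = INT_RANGES[signature_type]
    return all(low <= value <= high for value in values)


# Values stored as 64-bit integers, checked against an INT8 feature.
print(values_fit_signature([0, 100, -128], "INT8"))
print(values_fit_signature([0, 128], "INT8"))
```

The first call passes because every value lies within the INT8 range; the second fails because 128 exceeds the INT8 maximum of 127.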
[1.0.9]
Behavior Changes
- Model Development: log_loss metric calculation is now distributed.
Bug Fixes
- Model Registry: Fix an issue where building images fails with specific docker setup.
- Model Registry: Fix an issue where a local ML library could not be embedded when the library is imported by `zipimport`.
- Model Registry: Fix out-of-date doc about the `platform` argument in the `deploy` function.
- Model Registry: Fix an issue where a GPU-trained PyTorch model could not be deployed to a platform where GPU is not available.
[1.0.8]
Bug Fixes
- Model Development: Ordinal encoder can be used with mixed input column types.
- Model Registry: Fix an issue where an incorrect docker executable is used when building images.
- Model Registry: Fix an issue where specifying the `token` argument when using `snowflake.ml.model.models.huggingface_pipeline.HuggingFacePipelineModel` with `transformers < 4.32.0` is not effective.
- Model Registry: Fix an issue where an incorrect system function call is used when deploying to SPCS.
- Model Registry: Fix an issue when using a `transformers.pipeline` that does not have a `tokenizer`.
- Model Registry: Fix incorrectly-inferred image repository name during model deployment to SPCS.
- Model Registry: Fix GPU resource retention issue caused by failed or stuck previous deployments in SPCS.
[1.0.7]
Bug Fixes
- Model Development & Model Registry: Fix an error related to pandas.io.json.json_normalize.