Commit 2b044fc

sfc-gh-anavalos and Snowflake Authors authored
Project import generated by Copybara. (#114)
GitOrigin-RevId: 516a6129d65f30b2dbfc2160bc41cc35c6f468a8
Co-authored-by: Snowflake Authors <[email protected]>
1 parent 123693a commit 2b044fc

144 files changed (+5584, -4794 lines)


CHANGELOG.md

Lines changed: 34 additions & 1 deletion
```diff
@@ -1,6 +1,31 @@
 # Release History
 
-## 1.6.0
+## 1.6.1 (TBD)
+
+### Bug Fixes
+
+- Feature Store: Support large metadata blob when generating dataset
+- Feature Store: Added a hidden knob in FeatureView as kargs for setting customized
+  refresh_mode
+- Registry: Fix an error message in Model Version `run` when `function_name` is not mentioned and model has multiple
+  target methods.
+- Cortex inference: snowflake.cortex.Complete now only uses the REST API for streaming and the use_rest_api_experimental
+  is no longer needed.
+- Feature Store: Add a new API: FeatureView.list_columns() which list all column information.
+- Data: Fix `DataFrame` ingestion with `ArrowIngestor`.
+
+### New Features
+
+- Enable `set_params` to set the parameters of the underlying sklearn estimator, if the snowflake-ml model has been fit.
+- Data: Add top-level exports for `DataConnector` and `DataSource` to `snowflake.ml.data`.
+- Data: Add `snowflake.ml.data.ingestor_utils` module with utility functions helpful for `DataIngestor` implementations.
+- Data: Add new `to_torch_dataset()` connector to `DataConnector` to replace deprecated DataPipe.
+- Registry: Option to `enable_explainability` set to True by default for XGBoost, LightGBM and CatBoost as PuPr feature.
+- Registry: Option to `enable_explainability` when registering SHAP supported sklearn models.
+
+### Behavior Changes
+
+## 1.6.0 (2024-07-29)
 
 ### Bug Fixes
 
@@ -29,6 +54,14 @@
   distributed_hpo_trainer.ENABLE_EFFICIENT_MEMORY_USAGE = False
   `
 - Registry: Option to `enable_explainability` when registering LightGBM models as a pre-PuPr feature.
+- Data: Add new `snowflake.ml.data` preview module which contains data reading utilities like `DataConnector`
+  - `DataConnector` provides efficient connectors from Snowpark `DataFrame`
+    and Snowpark ML `Dataset` to external frameworks like PyTorch, TensorFlow, and Pandas. Create `DataConnector`
+    instances using the classmethod constructors `DataConnector.from_dataset()` and `DataConnector.from_dataframe()`.
+- Data: Add new `DataConnector.from_sources()` classmethod constructor for constructing from `DataSource` objects.
+- Data: Add new `ingestor_class` arg to `DataConnector` classmethod constructors for easier `DataIngestor` injection.
+- Dataset: `DatasetReader` now subclasses new `DataConnector` class.
+  - Add optional `limit` arg to `DatasetReader.to_pandas()`
 
 ### Behavior Changes
```
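For context on the `snowflake.ml.data` entries in the changelog above, here is a minimal usage sketch. It assumes an active Snowpark `session`; the table name `MY_TABLE` is a placeholder, and only calls named in the changelog are used:

```python
# Sketch only: assumes snowflake-ml-python >= 1.6.1 and an existing
# snowpark.Session bound to `session`. MY_TABLE is illustrative.
from snowflake.ml.data import DataConnector  # top-level export added in 1.6.1

df = session.table("MY_TABLE")  # a Snowpark DataFrame

# Classmethod constructor introduced with the 1.6.0 preview module.
dc = DataConnector.from_dataframe(df)

pandas_df = dc.to_pandas()        # eager read into a pandas DataFrame
torch_ds = dc.to_torch_dataset()  # added in 1.6.1; replaces the deprecated DataPipe
```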
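The `set_params` entry likewise lends itself to a short sketch. The estimator class and parameter below are illustrative, not taken from this commit; any `snowflake.ml.modeling` estimator that wraps sklearn would do:

```python
# Sketch only: LogisticRegression and the C parameter are example choices.
from snowflake.ml.modeling.linear_model import LogisticRegression

clf = LogisticRegression(input_cols=["F1", "F2"], label_cols=["LABEL"])
clf.fit(train_df)  # train_df: a Snowpark DataFrame assumed to be in scope

# Per the 1.6.1 note, set_params can now be called on an already-fitted
# model and updates the parameters of the underlying sklearn estimator.
clf.set_params(C=0.5)
```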

bazel/py_rules.bzl

Lines changed: 1 addition & 0 deletions
```diff
@@ -256,6 +256,7 @@ def _py_wheel_impl(ctx):
             ctx.file.pyproject_toml.path,
             execution_root_relative_path,
             "--wheel",
+            "--sdist",
             "--outdir",
             wheel_output_dir.path,
         ],
```

ci/conda_recipe/meta.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -17,7 +17,7 @@ build:
   noarch: python
 package:
   name: snowflake-ml-python
-  version: 1.6.0
+  version: 1.6.1
 requirements:
   build:
     - python
```

ci/targets/quarantine/prod3.txt

Lines changed: 1 addition & 0 deletions
```diff
@@ -2,3 +2,4 @@
 //tests/integ/snowflake/ml/registry:model_registry_snowservice_integ_test
 //tests/integ/snowflake/ml/model:spcs_llm_model_integ_test
 //tests/integ/snowflake/ml/extra_tests:xgboost_external_memory_training_test
+//tests/integ/snowflake/ml/registry:model_registry_snowservice_merge_gate_integ_test
```

codegen/build_file_autogen.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -14,7 +14,7 @@
 from absl import app
 
 from codegen import sklearn_wrapper_autogen as swa
-from snowflake.ml._internal.snowpark_pandassnowpark_pandas import imports
+from snowflake.ml._internal.snowpark_pandas import imports
 
 
 @dataclass(frozen=True)
```

snowflake/cortex/_complete.py

Lines changed: 7 additions & 33 deletions
```diff
@@ -90,7 +90,6 @@ def _call_complete_rest(
     prompt: Union[str, List[ConversationMessage]],
     options: Optional[CompleteOptions] = None,
     session: Optional[snowpark.Session] = None,
-    stream: bool = False,
 ) -> requests.Response:
     session = session or context.get_active_session()
     if session is None:
@@ -121,7 +120,7 @@
 
     data = {
         "model": model,
-        "stream": stream,
+        "stream": True,
     }
     if isinstance(prompt, List):
         data["messages"] = prompt
@@ -137,32 +136,15 @@
     if "top_p" in options:
         data["top_p"] = options["top_p"]
 
-    logger.debug(f"making POST request to {url} (model={model}, stream={stream})")
+    logger.debug(f"making POST request to {url} (model={model})")
     return requests.post(
         url,
         json=data,
         headers=headers,
-        stream=stream,
+        stream=True,
     )
 
 
-def _process_rest_response(
-    response: requests.Response,
-    stream: bool = False,
-    deadline: Optional[float] = None,
-) -> Union[str, Iterator[str]]:
-    if stream:
-        return _return_stream_response(response, deadline)
-
-    try:
-        content = response.json()["choices"][0]["message"]["content"]
-        assert isinstance(content, str)
-        return content
-    except (KeyError, IndexError, AssertionError) as e:
-        # Unlike the streaming case, errors are not ignored because a message must be returned.
-        raise ResponseParseException("Failed to parse message from response.") from e
-
-
 def _return_stream_response(response: requests.Response, deadline: Optional[float]) -> Iterator[str]:
     client = SSEClient(response)
     for event in client.events():
@@ -243,7 +225,6 @@ def _complete_impl(
     prompt: Union[str, List[ConversationMessage], snowpark.Column],
     options: Optional[CompleteOptions] = None,
     session: Optional[snowpark.Session] = None,
-    use_rest_api_experimental: bool = False,
     stream: bool = False,
     function: str = "snowflake.cortex.complete",
     timeout: Optional[float] = None,
@@ -253,16 +234,14 @@
         raise ValueError('only one of "timeout" and "deadline" must be set')
     if timeout is not None:
         deadline = time.time() + timeout
-    if use_rest_api_experimental:
+    if stream:
         if not isinstance(model, str):
             raise ValueError("in REST mode, 'model' must be a string")
         if not isinstance(prompt, str) and not isinstance(prompt, List):
             raise ValueError("in REST mode, 'prompt' must be a string or a list of ConversationMessage")
-        response = _call_complete_rest(model, prompt, options, session=session, stream=stream, deadline=deadline)
+        response = _call_complete_rest(model, prompt, options, session=session, deadline=deadline)
         assert response.status_code >= 200 and response.status_code < 300
-        return _process_rest_response(response, stream=stream)
-    if stream is True:
-        raise ValueError("streaming can only be enabled in REST mode, set use_rest_api_experimental=True")
+        return _return_stream_response(response, deadline)
     return _complete_sql_impl(function, model, prompt, options, session)
 
 
@@ -275,7 +254,6 @@ def Complete(
     *,
     options: Optional[CompleteOptions] = None,
     session: Optional[snowpark.Session] = None,
-    use_rest_api_experimental: bool = False,
     stream: bool = False,
     timeout: Optional[float] = None,
     deadline: Optional[float] = None,
@@ -287,16 +265,13 @@
         prompt: A Column of prompts to send to the LLM.
         options: A instance of snowflake.cortex.CompleteOptions
         session: The snowpark session to use. Will be inferred by context if not specified.
-        use_rest_api_experimental (bool): Toggles between the use of SQL and REST implementation. This feature is
-            experimental and can be removed at any time.
         stream (bool): Enables streaming. When enabled, a generator function is returned that provides the streaming
             output as it is received. Each update is a string containing the new text content since the previous update.
-            The use of streaming requires the experimental use_rest_api_experimental flag to be enabled.
         timeout (float): Timeout in seconds to retry failed REST requests.
         deadline (float): Time in seconds since the epoch (as returned by time.time()) to retry failed REST requests.
 
     Raises:
-        ValueError: If `stream` is set to True and `use_rest_api_experimental` is set to False.
+        ValueError: incorrect argument.
 
     Returns:
         A column of string responses.
@@ -307,7 +282,6 @@
         prompt,
         options=options,
         session=session,
-        use_rest_api_experimental=use_rest_api_experimental,
         stream=stream,
         timeout=timeout,
         deadline=deadline,
```