Merged
6 changes: 6 additions & 0 deletions docs/getting-started/installation.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,12 @@ Install Liminal via pip:
pip install liminal-orm
```

Install Liminal via uv:

```bash
uv add liminal-orm
```

Install Liminal via github:

```bash
Expand Down
8 changes: 2 additions & 6 deletions docs/getting-started/prerequisites.md
Original file line number Diff line number Diff line change
@@ -1,10 +1,6 @@
1. **Benchling Admin Account**: Liminal builds on top of Benchling's LIMS system. You will need access and credentials to an admin account for your Benchling tenant(s). Liminal needs credentials with full admin privileges in order to manipulate Benchling schemas through their API.
1. **Benchling Admin Account**: Liminal builds on top of Benchling's LIMS system. You will need access and credentials to an admin account for your Benchling tenant(s). Liminal needs admin privileges in order to manipulate Benchling schemas through their API.

2. **SSO optional**: A requirement for Liminal's migration service to work is for your Benchling tenant to have SSO optional (or disabled). At the moment, a part of Liminal's API connection requires an admin email and password login (non-SSO). You can message Benchling support to request that your tenant be configured to be SSO optional (or disabled).

Note that as a Benchling admin, you can enforce SSO for all users and create only a single non-SSO user for Liminal to use. This is what we recommend to maintain the highest level of security.

3. **Python**: Liminal is built using Python. You will need Python 3.9 or later installed on your machine.
2. **Python**: Liminal is built using Python. You will need Python 3.9 or later installed on your machine.

### Notes

Expand Down
12 changes: 5 additions & 7 deletions docs/getting-started/setup.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,18 +9,15 @@
3. Populate the `env.py` file with your Benchling connection information, following the instructions in the file. For example:

```python
from liminal.connection import BenchlingConnection, TenantConfigFlags
from liminal.connection import BenchlingConnection

# It is highly recommended to use a secrets manager to store your credentials.
prod_connection = BenchlingConnection(
tenant_name="pizzahouse-prod",
tenant_name="pizzahouseprod",
tenant_alias="prod",
api_client_id="my-secret-api-client-id",
api_client_secret="my-secret-api-client-secret",
warehouse_connection_string="...",
internal_api_admin_email="my-secret-internal-api-admin-email",
internal_api_admin_password="my-secret-internal-api-admin-password",
config_flags=TenantConfigFlags(...)
)

staging_connection = BenchlingConnection(...)
Expand All @@ -29,8 +26,9 @@
```

* **Required**: The `api_client_id` and `api_client_secret` are used to connect to Benchling's SDK. For more information, see the [Benchling API documentation](https://docs.benchling.com/docs/getting-started-benchling-apps#calling-the-api-as-an-app).
* **Required**: The `internal_api_admin_email` and `internal_api_admin_password` are used to connect to Benchling's API for the migration service. This must be the email and password used to log in to an Admin account.
* Optional: The `warehouse_connection_string` is used to connect to Benchling's read-only warehouse. If you have access, set this as the connection string for the warehouse.
* **Required**: If your tenant has SSO set to optional or required, Liminal prompts the user to log in through a playwright browser session that opens automatically when Liminal is run. This ensures Liminal uses the user's own Benchling authentication.
If your tenant has SSO turned off, the `internal_api_admin_email` and `internal_api_admin_password` properties must be set; they are used to connect to Benchling's API for the migration service. These must be the email and password used to log in to an Admin account.
* Optional: The `warehouse_connection_string` is used to connect to Benchling's read-only warehouse. If you have warehouse access, set this as the connection string for the warehouse.
* Optional: The `config_flags` parameter is used to set tenant-specific configuration flags. For more information, see the [BenchlingConnection](../reference/benchling-connection.md) reference.
* Set `schemas_enable_change_warehouse_name` to `True` if you want to enable changing schema and field warehouse names.
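
The authentication rules in the bullets above can be sketched as a small decision function. This is only an illustration, not Liminal's API: `choose_login_path`, its return labels, and the local `SSODisabledError` stand-in are hypothetical names, and the status-code mapping is an assumption based on the checks that `autogenerate_auth` performs in this PR (200 on the sign-in page, 403 or 302 on the SAML sign-in endpoint).

```python
class SSODisabledError(ValueError):
    """Local stand-in for the error raised when SSO is off and no credentials are set."""


def choose_login_path(has_credentials: bool, status_code: int) -> str:
    """Return which login flow would be used for a probe response (illustrative only)."""
    # Credentials supplied and the password sign-in page is served: use admin login.
    if has_credentials and status_code == 200:
        return "admin_login"
    # No credentials and the SAML endpoint is forbidden: SSO is off, so credentials are needed.
    if not has_credentials and status_code == 403:
        raise SSODisabledError("SSO is turned off; email and password are required.")
    # A redirect means the tenant hands sign-in to SSO: fall back to the browser login.
    if status_code == 302:
        return "playwright_sso"
    raise ValueError(f"Unexpected response: status code {status_code}")


print(choose_login_path(True, 200))
print(choose_login_path(False, 302))
```

Under these assumptions, a tenant with SSO required takes the browser path even when an email and password are configured, because the sign-in page redirects instead of returning 200.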

Expand Down
15 changes: 10 additions & 5 deletions docs/reference/benchling-connection.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
## BenchlingConnection: [class](https://github.com/dynotx/liminal-orm/blob/main/liminal/connection/benchling_connection.py)

The `BenchlingConnection` class is used to define the connection information for a particular Benchling tenant. The BenchlingConnection class is defined in your `env.py` file and is also used to create a BenchlingService object. In the `env.py` file, the api_client and internal_api parameters are required for the BenchlingConnection object in order to be used in the migration service. The BenchlingService can be imported from the liminal package and be used to connect to [Benchling's SDK](https://docs.benchling.com/docs/getting-started-with-the-sdk), internal API, and/or Postgres warehouse.
The `BenchlingConnection` class is used to define the connection information for a particular Benchling tenant. The BenchlingConnection class is defined in your `env.py` file and is also used to create a BenchlingService object. In the `env.py` file, the api_client is required for the BenchlingConnection object in order to be used in the migration service. The BenchlingService can be imported from the liminal package and be used to connect to [Benchling's SDK](https://docs.benchling.com/docs/getting-started-with-the-sdk), internal API, and/or Postgres warehouse.

```python
# Example BenchlingConnection definition
Expand All @@ -13,8 +13,6 @@ connection = BenchlingConnection(
api_client_id="my-secret-api-client-id",
api_client_secret="my-secret-api-client-secret",
warehouse_connection_string="my-warehouse-connection-string",
internal_api_admin_email="my-secret-internal-api-admin-email",
internal_api_admin_password="my-secret-internal-api-admin-password",
config_flags=TenantConfigFlags()
)
```
Expand Down Expand Up @@ -43,11 +41,18 @@ connection = BenchlingConnection(

- **internal_api_admin_email: Optional[str] = None**

The email of the internal API admin.
The email of the internal API admin. If SSO is disabled or optional on your Benchling tenant, this email is used to log in to Benchling and give Liminal the authenticated internal API session cookie.

- **internal_api_admin_password: Optional[str] = None**

The password of the internal API admin.
The password of the internal API admin. If SSO is disabled or optional on your Benchling tenant, this password is used to log in to Benchling and give Liminal the authenticated internal API session cookie.

- **playwright_data_dir: Optional[str] = "~/.liminal/playwright_chrome_data/"**

The directory to store the playwright browser user data. If SSO is enabled and required on your Benchling tenant,
Liminal uses playwright so the user can log into Benchling in order to give Liminal the user's authenticated internal API session cookie.
This directory is used to store playwright's persistent context, allowing the user to set up a persistent Chrome user profile.
Set this to None to disable playwright's persistent context. Note that the persistent context is what enables automatic login.

- **fieldsets: bool = False**

Expand Down
6 changes: 1 addition & 5 deletions liminal/cli/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ def _check_liminal_directory_initialized(liminal_dir_path: Path) -> None:
"""Raises an exception if the liminal directory does not exist at the given path."""
if not liminal_dir_path.exists() or not liminal_dir_path.is_dir():
raise Exception(
"Liminal directory not found at current working directory. Run `liminal init` or check your current working directory."
"/liminal directory not found at current working directory where `liminal` command was run. Run `liminal init` or ensure that your current working directory is where the /liminal environment is located."
)
else:
if not (liminal_dir_path / "env.py").exists():
Expand Down Expand Up @@ -67,10 +67,6 @@ def _read_local_env_file(
raise Exception(
"api_client_id and api_client_secret must be provided in BenchlingConnection in liminal/env.py. This is necessary for the migration service."
)
if not bc.internal_api_admin_email or not bc.internal_api_admin_password:
raise Exception(
"internal_api_admin_email and internal_api_admin_password must be provided in BenchlingConnection in liminal/env.py. This is necessary for the migration service."
)
return bc
raise Exception(
f"BenchlingConnection with tenant name or alias {benchling_tenant} not found in liminal/env.py. Please update the env.py file with a correctly defined BenchlingConnection."
Expand Down
10 changes: 8 additions & 2 deletions liminal/connection/benchling_connection.py
Original file line number Diff line number Diff line change
Expand Up @@ -37,9 +37,14 @@ class BenchlingConnection(BaseModel):
warehouse_connection_string: str | None = None
The connection string for the warehouse.
internal_api_admin_email: str | None = None
The email of the internal API admin.
The email of the internal API admin. If SSO is disabled or optional on your Benchling tenant, this email is used to log in to Benchling and give Liminal the authenticated internal API session cookie.
internal_api_admin_password: str | None = None
The password of the internal API admin.
The password of the internal API admin. If SSO is disabled or optional on your Benchling tenant, this password is used to log in to Benchling and give Liminal the authenticated internal API session cookie.
playwright_data_dir: str | None = "~/.liminal/playwright_chrome_data/"
The directory to store the playwright browser user data. If SSO is enabled and required on your Benchling tenant,
Liminal uses playwright so the user can log into Benchling in order to give Liminal the authenticated internal API session cookie.
This directory is used to store playwright's persistent context, allowing the user to set up a persistent Chrome profile.
Set this to None to disable playwright's persistent context. Note that the persistent context is what enables automatic login.
fieldsets: bool = False
Whether your Benchling tenant has access to fieldsets.
config_flags: TenantConfigFlags = TenantConfigFlags()
Expand All @@ -54,6 +59,7 @@ class BenchlingConnection(BaseModel):
warehouse_connection_string: str | None = None
internal_api_admin_email: str | None = None
internal_api_admin_password: str | None = None
playwright_data_dir: str | None = "~/.liminal/playwright_chrome_data/"
fieldsets: bool = False
config_flags: TenantConfigFlags = TenantConfigFlags()

Expand Down
149 changes: 123 additions & 26 deletions liminal/connection/benchling_service.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,9 @@
import asyncio
import logging
import os
from typing import Any

from playwright.async_api import async_playwright
import requests
from benchling_sdk.auth.client_credentials_oauth2 import ClientCredentialsOAuth2
from benchling_sdk.benchling import Benchling
Expand Down Expand Up @@ -32,6 +35,10 @@
REMOTE_REVISION_ID_FIELD_WH_NAME = "revision_id"


class SSODisabledError(ValueError):
pass


class BenchlingService(Benchling):
"""
Class that creates a connection object that can be used to connect to Benchling's API, database, or internal API.
Expand Down Expand Up @@ -88,30 +95,33 @@ def __init__(
)
self.use_internal_api = use_internal_api
if use_internal_api:
if (
connection.internal_api_admin_email
and connection.internal_api_admin_password
):
csrf_token, session = self.autogenerate_auth(
connection.tenant_name,
connection.internal_api_admin_email,
connection.internal_api_admin_password,
try:
authenticated_session = asyncio.get_event_loop().run_until_complete(
self.autogenerate_auth(
connection.tenant_name,
connection.internal_api_admin_email,
connection.internal_api_admin_password,
connection.playwright_data_dir,
)
)
self.custom_post_cookies = {
"session": session,
}
self.custom_post_headers = {
"X-Csrftoken": csrf_token,
"Referer": f"https://{connection.tenant_name}.benchling.com/",
"Content-Type": "application/json",
}
LOGGER.info(
f"Tenant {connection.tenant_name}: Connected to Benchling internal API."
except SSODisabledError as e:
raise SSODisabledError(
f"{e} Please provide `internal_api_admin_email` and `internal_api_admin_password` in your BenchlingConnection."
)
else:
raise ValueError(
"use_internal_api is True but internal_api_admin_email and internal_api_admin_password not provided in BenchlingConnection."
except RuntimeError as e:
raise RuntimeError(
f"{e}. If you are running this in a Jupyter notebook, use `nest_asyncio.apply()` to allow the async playwright login to run."
)
self.custom_post_cookies = {
"session": authenticated_session,
}
self.custom_post_headers = {
"Referer": f"https://{connection.tenant_name}.benchling.com/",
"Content-Type": "application/json",
}
LOGGER.info(
f"Tenant {connection.tenant_name}: Connected to Benchling internal API."
)

@property
def session(self) -> Session:
Expand Down Expand Up @@ -249,16 +259,101 @@ def upsert_remote_revision_id(self, revision_id: str) -> bool:
f"Error finding field on {REMOTE_LIMINAL_SCHEMA_NAME} schema with warehouse_name {REMOTE_REVISION_ID_FIELD_WH_NAME}. Check schema fields to ensure this field exists and is defined according to documentation."
)

    @classmethod
    async def autogenerate_auth(
        cls,
        benchling_tenant: str,
        email: str | None = None,
        password: str | None = None,
        playwright_data_dir: str | None = None,
    ) -> str:
        """Logs in to Benchling using the admin email and password or playwright and returns the session cookie.
        If email and password are not passed in or if SSO is set to required on the Benchling tenant, playwright is used to log in.
        Otherwise, the admin email and password are used to log in."""
        with requests.Session() as session:
            if email and password:
                signin_page = session.get(
                    f"https://{benchling_tenant}.benchling.com/signin",
                    allow_redirects=False,
                )
                if signin_page.status_code == 200:
                    return cls.get_authenticated_session_benchling_admin_login(
                        benchling_tenant, email, password
                    )

            else:
                signin_page = session.get(
                    f"https://{benchling_tenant}.benchling.com/ext/saml/signin:begin",
                    allow_redirects=False,
                )
                if signin_page.status_code == 403:
                    raise SSODisabledError(
                        f"admin_email and admin_password not provided when sso is turned off for Benchling tenant {benchling_tenant}."
                    )
            if signin_page.status_code == 302:
                return await cls.get_authenticated_session_sso_login_playwright(
                    benchling_tenant, playwright_data_dir
                )
            else:
                raise ValueError(
                    f"Unexpected response: Status code {signin_page.status_code}: {signin_page.text}"
                )

    @classmethod
    async def get_authenticated_session_sso_login_playwright(
        cls, benchling_tenant: str, playwright_data_dir: str | None = None
    ) -> str:
        """Logs in to Benchling using playwright and returns the session cookie.
        This can be used when SSO is enabled and required on the Benchling tenant."""
        LOGGER.info(f"Log into your {benchling_tenant} Benchling tenant...")
        async with async_playwright() as playwright:
            if playwright_data_dir:
                context = await playwright.chromium.launch_persistent_context(
                    channel="chrome",
                    headless=False,
                    user_data_dir=os.path.expanduser(playwright_data_dir),
                )
            else:
                browser = await playwright.chromium.launch(
                    channel="chrome", headless=False
                )
                context = await browser.new_context()
            page = await context.new_page()
            try:
                await page.goto(f"https://{benchling_tenant}.benchling.com")
            except Exception:
                raise ValueError(
                    f"Error navigating to https://{benchling_tenant}.benchling.com"
                )
            try:
                # 600_000 ms is a 10 minute window for the user to finish logging in.
                await page.wait_for_url(
                    f"**/{benchling_tenant}.benchling.com/**", timeout=600_000
                )
            except Exception:
                raise TimeoutError(
                    f"Log in cancelled or timed out (10 min timeout). Did not detect SSO log in for https://{benchling_tenant}.benchling.com."
                )

            cookies = await context.cookies()
            session_cookie = next(
                (c["value"] for c in cookies if c["name"] == "session"), None
            )
            if not session_cookie:
                raise ValueError("No session cookie found.")
            return session_cookie

@classmethod
@retry(
stop=stop_after_attempt(3),
retry=retry_if_exception_type(ValueError),
wait=wait_exponential(multiplier=1, min=1, max=8),
reraise=True,
)
def autogenerate_auth(
def get_authenticated_session_benchling_admin_login(
cls, benchling_tenant: str, email: str, password: str
) -> tuple[str, str]:
) -> str:
"""Logs in to Benchling using the admin email and password and returns the session cookie.
This can be used when SSO is disabled or optional on the Benchling tenant."""
with requests.Session() as session:
homepage = session.get(f"https://{benchling_tenant}.benchling.com/signin")
soup = BeautifulSoup(homepage.content, features="lxml")
Expand Down Expand Up @@ -286,6 +381,8 @@ def autogenerate_auth(
raise ValueError(
f"Failed to sign in to Benchling: {signin_response.text}"
)
return csrf_token, signin_response.headers["Set-Cookie"].split("; Secure")[
0
].removeprefix("session=")
return (
signin_response.headers["Set-Cookie"]
.split("; Secure")[0]
.removeprefix("session=")
)
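
The return expression above can be exercised in isolation. The following is a minimal sketch, assuming a `Set-Cookie` header where `Secure` is the first attribute after the cookie value; `extract_session_cookie` and the sample header are hypothetical and exist only for illustration.

```python
def extract_session_cookie(set_cookie_header: str) -> str:
    """Pull the session value out of a Set-Cookie header (illustrative helper)."""
    # Keep everything before the first "; Secure" attribute, then drop the cookie name.
    # str.removeprefix requires Python 3.9 or later.
    return set_cookie_header.split("; Secure")[0].removeprefix("session=")


# Made-up header value, not a real Benchling response.
header = "session=abc123; Secure; HttpOnly; Path=/"
print(extract_session_cookie(header))  # prints "abc123"
```

Because the split keys on `"; Secure"`, any attribute that appears before `Secure` in the header would remain attached to the returned value.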
1 change: 1 addition & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@ dependencies = [
"psycopg2-binary>=2.9.10,<3",
"tornado==6.5.0",
"click>=8.0.0,<8.2.0",
"playwright>=1.58.0",
]

[virtualenvs]
Expand Down