The demo kit makes a demo and PoC super simple:

- Creates a tiny database in the cloud
- Automatically generates secure names, passwords, and firewall rules
- Configures the database
- Enables replication on the catalog and tables
- Starts Lakeflow Connect
  - Connection, staging and target schemas, Gateway and Ingestion pipelines, and Jobs
- Generates DMLs on the tables
  - `intpk` - insert, update, delete on the primary key table
  - `dtix` - insert on the non-primary-key table
- Customizable via CLIs
  - Databricks CLI, database CLI, cloud CLI
After two hours, all objects created are automatically deleted. The database instance created is tiny and meant for a functional demo.

Don't reboot the laptop while the demo is running: rebooting the laptop kills the background cleanup jobs.
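The timed cleanup runs as ordinary background jobs in the demo's shell session, which is why a reboot (or losing the session) cancels it. A minimal sketch of the pattern, with a 2-second sleep and an echo standing in for the real interval and delete commands:

```shell
# A timed cleanup is essentially "sleep for the interval, then delete".
# The 2-second sleep and echo below are stand-ins for the real commands.
( sleep 2 && echo "cleanup would run now" ) &
CLEANUP_PID=$!

jobs -l                 # the pending cleanup is visible as a background job
kill "$CLEANUP_PID"     # killing it (or losing the session) cancels the cleanup
```

Losing the session before the sleep finishes means the delete never runs, which is what the reboot warning above is about.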
This is a one-time task at the beginning. Copy and paste the commands into a terminal window to install (or upgrade) the CLIs.
Open a new terminal using one of the ways below.

ttyd, if set up from launchctl, at http://localhost:7681/

- open a new tab from a browser with URL http://localhost:7681/

ttyd started from a terminal at http://localhost:7681/

- open Terminal or iTerm and run ttyd:

  ```shell
  nohup ttyd -W tmux new -A -s lakeflow.ttyd &
  ```

- open a new tab from a browser with URL http://localhost:7681/
Use bash 4.0 or greater:

```shell
/opt/homebrew/bin/bash
```
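To confirm the shell in use is new enough before running the demo, the major version can be checked; a small sketch (`check_bash_major` is a helper name invented here):

```shell
# check_bash_major: exit 0 when the given major version is >= 4.
check_bash_major() {
  [ "${1:-0}" -ge 4 ]
}

# BASH_VERSINFO[0] is the running bash's major version (3 on the stock macOS bash).
if check_bash_major "${BASH_VERSINFO[0]:-0}"; then
  echo "bash $BASH_VERSION is new enough"
else
  echo "bash too old; start /opt/homebrew/bin/bash first" >&2
fi
```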
Initialize environment variables in a new terminal session for a new database. Customize with `export` commands as required.

```shell
# [ optional ] customize export commands here as required
source <(curl -s -L https://raw.githubusercontent.com/rsleedbx/lakeflow_connect/refs/heads/main/00_lakeflow_connect_env.sh)
```
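For example, the variables documented later in this page can be exported before sourcing the script; the values below are purely illustrative:

```shell
# [ optional ] customization example -- illustrative values only
export CDC_CT_MODE=BOTH                      # BOTH is the default
export DELETE_DB_AFTER_SLEEP="67m"           # keep the database objects a bit longer
export DATABRICKS_CONFIG_PROFILE="DEFAULT"   # profile from .databrickscfg
source <(curl -s -L https://raw.githubusercontent.com/rsleedbx/lakeflow_connect/refs/heads/main/00_lakeflow_connect_env.sh)
```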
Start and configure one of the following database instances.

SQL Server

SQL Server: AWS RDS SQL Server

```shell
source <(curl -s -L https://raw.githubusercontent.com/rsleedbx/lakeflow_connect/refs/heads/main/sqlserver/01_aws_sqlserver.sh)
source <(curl -s -L https://raw.githubusercontent.com/rsleedbx/lakeflow_connect/refs/heads/main/sqlserver/02_sqlserver_configure.sh)
```

SQL Server: Azure SQL Server

```shell
source <(curl -s -L https://raw.githubusercontent.com/rsleedbx/lakeflow_connect/refs/heads/main/sqlserver/01_azure_sqlserver.sh)
source <(curl -s -L https://raw.githubusercontent.com/rsleedbx/lakeflow_connect/refs/heads/main/sqlserver/02_sqlserver_configure.sh)
```

SQL Server: Azure SQL Server Managed Instance

The cost is relatively high if the free version is not available.

```shell
source <(curl -s -L https://raw.githubusercontent.com/rsleedbx/lakeflow_connect/refs/heads/main/sqlserver/01_azure_managed_instance.sh)
source <(curl -s -L https://raw.githubusercontent.com/rsleedbx/lakeflow_connect/refs/heads/main/sqlserver/02_sqlserver_configure.sh)
```

SQL Server: Google CloudSQL SQL Server

```shell
source <(curl -s -L https://raw.githubusercontent.com/rsleedbx/lakeflow_connect/refs/heads/main/sqlserver/01_gcloud_sqlserver_instance.sh)
source <(curl -s -L https://raw.githubusercontent.com/rsleedbx/lakeflow_connect/refs/heads/main/sqlserver/02_sqlserver_configure.sh)
```

Postgres

Postgres: Azure Postgres Flexible Server

```shell
source <(curl -s -L https://raw.githubusercontent.com/rsleedbx/lakeflow_connect/refs/heads/main/postgres/01_azure_postgres.sh)
source <(curl -s -L https://raw.githubusercontent.com/rsleedbx/lakeflow_connect/refs/heads/main/postgres/02_postgres_configure.sh)
```

Postgres: AWS RDS Postgres

```shell
source <(curl -s -L https://raw.githubusercontent.com/rsleedbx/lakeflow_connect/refs/heads/main/postgres/01_aws_postgres.sh)
source <(curl -s -L https://raw.githubusercontent.com/rsleedbx/lakeflow_connect/refs/heads/main/postgres/02_postgres_configure.sh)
```
Start the Databricks Lakeflow Connect Database Demo.

```shell
source <(curl -s -L https://raw.githubusercontent.com/rsleedbx/lakeflow_connect/refs/heads/main/03_lakeflow_connect_demo.sh)
```

The terminal session maintains variables that include host, user names, and passwords. The variable names used for a connection are:
- `$DBA_USERNAME`
- `$DBA_PASSWORD`
- `$USER_USERNAME`
- `$USER_PASSWORD`
- `$DB_HOST_FQDN`
- `$DB_PORT`
- `$DB_CATALOG`
Example of `echo $DBA_USERNAME` to see the value:

```
L9P0RQPHY7:lakeflow_connect robert.lee$ echo $DBA_USERNAME
eirai7opei9ahp3h
```
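To see every connection variable at once rather than one `echo` at a time, a small loop using bash indirect expansion works (a convenience sketch, not part of the kit):

```shell
# Print each connection variable the demo session exports.
# "${!v}" is bash indirect expansion: the value of the variable named by $v.
for v in DBA_USERNAME DBA_PASSWORD USER_USERNAME USER_PASSWORD \
         DB_HOST_FQDN DB_PORT DB_CATALOG; do
  printf '%s=%s\n' "$v" "${!v-}"
done
```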
- Type `SQLCLI_DBA` in the terminal after creating the database. This will issue commands to connect as the DBA using `$DBA_USERNAME:$DBA_PASSWORD@$DB_HOST_FQDN:$DB_PORT/`. For Postgres, psql is used and postgres is the catalog. For SQL Server, sqlcmd is used and master is the catalog.

Example of Postgres using `SQLCLI_DBA`:

```
. postgres/01_azure_postgres.sh
SQLCLI_DBA
L9P0RQPHY7:lakeflow_connect robert.lee$ SQLCLI_DBA
PGPASSWORD=$DBA_PASSWORD psql postgresql://eirai7opei9ahp3h@eip9aeth9ke3oiji-zp.postgres.database.azure.com:5432/ievoo7ai?sslmode=allow
psql (14.15 (Homebrew), server 16.8)
WARNING: psql major version 14, server major version 16.
         Some psql features might not work.
Type "help" for help.
ievoo7ai=>
```
- Type `SQLCLI` in the terminal after configuring the database. This will issue commands to connect as the user using `$USER_USERNAME:$USER_PASSWORD@$DB_HOST_FQDN:$DB_PORT/$DB_CATALOG`. For Postgres, psql is used. For SQL Server, sqlcmd is used.

Example of Postgres using `SQLCLI`:

```
. postgres/02_postgres_configure.sh
SQLCLI
L9P0RQPHY7:lakeflow_connect robert.lee$ SQLCLI
PGPASSWORD=$USER_PASSWORD psql postgresql://eine4jeip3eej4ja@eip9aeth9ke3oiji-zp.postgres.database.azure.com:5432/ievoo7ai?sslmode=allow
psql (14.15 (Homebrew), server 16.8)
WARNING: psql major version 14, server major version 16.
         Some psql features might not work.
Type "help" for help.
ievoo7ai=>
```

- Manually connecting to the database
- Type the following command to connect as the DBA:

```shell
PGPASSWORD=$DBA_PASSWORD psql "postgresql://${DBA_USERNAME}@${DB_HOST_FQDN}:${DB_PORT}/postgres?sslmode=allow"
```

- Type the following command to connect as the user:

```shell
PGPASSWORD=$USER_PASSWORD psql "postgresql://${USER_USERNAME}@${DB_HOST_FQDN}:${DB_PORT}/${DB_CATALOG}?sslmode=allow"
```

`CDC_CT_MODE` defaults to `BOTH`.
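The connection strings above all follow one pattern; assembling the URI separately makes it reusable with other tools. A sketch with made-up values (only the shape matters):

```shell
# Assemble the same URI psql receives (values here are illustrative).
DB_HOST_FQDN=example.postgres.database.azure.com
DB_PORT=5432
DB_CATALOG=demo
USER_USERNAME=demo_user

PG_URI="postgresql://${USER_USERNAME}@${DB_HOST_FQDN}:${DB_PORT}/${DB_CATALOG}?sslmode=allow"
echo "$PG_URI"
# -> postgresql://demo_user@example.postgres.database.azure.com:5432/demo?sslmode=allow
```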
Example usage: only replicate tables that do not have primary keys.

```shell
export CDC_CT_MODE=CDC
. ./00_lakeflow_connect_env.sh
```

| CDC_CT_MODE | Postgres | SQL Server |
|---|---|---|
| CDC | set replica full on tables without pk | enable CDC on tables without pk |
| CT | set replica default on tables with pk | enable CT on tables with pk |
| BOTH | set replica full on tables without pk, set replica default on tables with pk | enable CDC on tables without pk, enable CT on tables with pk |
| NONE | set replica nothing on the tables | enable CDC and CT on the table |
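To verify on the Postgres side what `CDC_CT_MODE` actually set, the replica identity of each table can be queried from `pg_class`; `relreplident` is `f` for full, `d` for default, `n` for nothing. A sketch (the `public` schema name is an assumption about where the demo tables live):

```shell
# SQL to show replica identity per ordinary table in the public schema.
REPLIDENT_SQL="SELECT relname, relreplident FROM pg_class
               WHERE relkind = 'r' AND relnamespace = 'public'::regnamespace;"

# Run it through the same connection SQLCLI uses, e.g.:
#   PGPASSWORD=$USER_PASSWORD psql "postgresql://${USER_USERNAME}@${DB_HOST_FQDN}:${DB_PORT}/${DB_CATALOG}?sslmode=allow" -c "$REPLIDENT_SQL"
echo "$REPLIDENT_SQL"
```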
The default is to open the database to the public. For security, a random server name, catalog name, user name, DBA name, user password, and DBA password are used. The database is deleted after 1 hour by default.
Example usage: set up the firewall to allow connections from 192.168.0.0/24 and 10.10.10.12/32.

```shell
export DB_FIREWALL_CIDRS="192.168.0.0/24 10.10.10.12/32"
. ./00_lakeflow_connect_env.sh
```

By default, the database objects (server, catalog, schema, tables, UC Connection) the script creates are deleted after the interval set in `DELETE_DB_AFTER_SLEEP`.
- To not delete, set `DELETE_DB_AFTER_SLEEP=""`.
- To change the time, set `DELETE_DB_AFTER_SLEEP="67m"`, for example.

If the server was already created, it won't be deleted even if this is set.

Example usage:

```shell
export DELETE_DB_AFTER_SLEEP=""
. ./00_lakeflow_connect_env.sh
```

By default, the pipeline objects (gateway, ingestion, jobs) the script creates are deleted after the interval set in `DELETE_PIPELINES_AFTER_SLEEP`.
- To not delete, set `DELETE_PIPELINES_AFTER_SLEEP=""`.
- To change the time, set `DELETE_PIPELINES_AFTER_SLEEP="67m"`, for example.

Example usage:

```shell
export DELETE_PIPELINES_AFTER_SLEEP=""
. ./00_lakeflow_connect_env.sh
```

The default Databricks profile is DEFAULT. Change `DATABRICKS_CONFIG_PROFILE` to use a different profile name.
Example usage of the azure profile name from the .databrickscfg file:

```shell
export DATABRICKS_CONFIG_PROFILE="azure"
. ./00_lakeflow_connect_env.sh
```

psql meta-commands:

- `\l` list catalogs (databases)
- `\dn` list schemas
- `\dt *.*` list schemas and tables
- `\q` quit

Using SQL:

- `select * from information_schema.schemata;` list schemas
- `select * from information_schema.tables;` list schemas and tables
tmux shortcuts:

- Ctrl + `b` + `0` select window 0
- Ctrl + `b` + `1` select window 1
- Ctrl + `b` + `c` create a new window
- Ctrl + `b` + `%` split the current pane vertically
- Ctrl + `b` + `"` split the current pane horizontally
- Ctrl + `b` + `x` close the current pane
To perform a full refresh of a table:

- Select a table to refresh and start the pipeline:

```shell
databricks api post /api/2.0/pipelines/$INGESTION_PIPELINE_ID/updates --json '{
  "full_refresh": false,
  "full_refresh_selection": [
    "intpk"
  ]
}'
```
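To refresh every table instead of a selection, the same endpoint accepts `"full_refresh": true`. A sketch that builds the request body first (the post itself requires a configured Databricks CLI and the `$INGESTION_PIPELINE_ID` set by the demo scripts):

```shell
# Request body for a full refresh of ALL tables (use with care).
FULL_REFRESH_JSON='{"full_refresh": true}'

# Then post it:
#   databricks api post /api/2.0/pipelines/$INGESTION_PIPELINE_ID/updates --json "$FULL_REFRESH_JSON"
echo "$FULL_REFRESH_JSON"
```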
- Kill the jobs that delete the pipeline:

```shell
kill $(jobs -l | grep "pipelines delete" | awk '{print $2}')
```

