Overview

This demo supports SQL Server as the source database.

TL;DR

Start a Personal Compute Cluster.
Open the sqlserver-arcion-dbx notebook.
Attach the notebook to the Personal Compute Cluster.
Follow the instructions in the notebook.

Purpose

This demo environment is designed to help answer the following questions:

How does it work?
How fast is snapshot (bulk insert) performance?
How fast is real-time (CDC) performance?

The demo is not intended for:

Schema and data type validation
Tuning based on environmental factors
Data value conversion
HA and resilience testing

Infrastructure

The demo environment runs entirely on a single Databricks Personal Compute Cluster.

```text
  +------------------ Databricks Personal Compute Cluster --------------------+
  |  +-----------+    +-----------+          +----------+     +------------+  |
  |  |  Workload |    | Source DB |          |  Arcion  |     | Target DB  |  |
  |  |           |    |           | <--------|   Demo   |     |            |  |
  |  |   YCSB    | -->| SQL Server|          | Notebook | --> | Databricks |  |
  |  |           |    |           | -------->|   CLI    |     |            |  |
  |  +-----------+    +-----------+          +----------+     +------------+  |
  +---------------------------------------------------------------------------+
```

In a cloud deployment, the following is the expected separation.

```text
  +----------Customer Cloud---------+   F   +---------- Databricks -----------+
  |  +-----------+    +-----------+ |   I   | +----------+     +------------+ |
  |  |  Workload |    | Source DB | |   R   | |  Arcion  |     | Target DB  | |
  |  |           |    |           | | <-E-- | |   Demo   |     |            | |
  |  |   YCSB    | -->| SQL Server| |   W   | | Notebook | --> | Databricks | |
  |  |           |    |           | | --A-> | |    UI    |     |            | |
  |  +-----------+    +-----------+ |   L   | +----------+     +------------+ |
  +---------------------------------+   L   +---------------------------------+
```

In an on-prem deployment, the following is the expected separation.

```text
  +-------------------Customer Cloud----------------+   F   +-- Databricks --+
  |  +-----------+    +-----------+    +----------+ |   I   | +------------+ |
  |  |  Workload |    | Source DB |    |  Arcion  | |   R   | | Target DB  | |
  |  |           |    |           |    |   Demo   | |   E   | |            | |
  |  |   YCSB    | -->| SQL Server| -->| Notebook | |   W   | | Databricks | |
  |  |           |    |           |    |    UI    | | --A-> | |            | |
  |  +-----------+    +-----------+    +----------+ |   L   | +------------+ |
  +-------------------------------------------------+   L   +----------------+
```

Schema

The demo uses dense and sparse tables, which together approximate a star schema. Dense tables can be long and wide, with most fields populated. Sparse tables can be short and, if required, wide, with most fields not populated.

An arbitrary number of dense and sparse tables can be defined.
Each table can have a defined number of fields.
The range of populated fields can be defined.
A specified number of records is inserted (appended) using native bulk copy.

Dense tables have all fields populated with data up to the maximum length of the field. Sparse table fields are populated with NULLs.

```text
+--------+    +--------+  +--------+    +--------+
| Dense  |    | Dense  |  | Sparse |    | Sparse |
| Tables | ...| Tables |  | Tables | ...| Tables |
|   1    |    |   n    |  |   1    |    |    n   |
+--------+    +--------+  +--------+    +--------+
```
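
The dense and sparse layout described above can be illustrated with a minimal Python sketch. It is not the demo's actual generator: the function names `create_table_sql` and `make_row`, the column naming (`id`, `field0`, ...), and the `VARCHAR` type are assumptions made for illustration only.

```python
from typing import List, Optional


def create_table_sql(name: str, field_count: int, field_len: int) -> str:
    """Build a CREATE TABLE statement with a key column and nullable VARCHAR fields.

    Column names and types are hypothetical, not taken from the demo.
    """
    cols = [f"field{i} VARCHAR({field_len}) NULL" for i in range(field_count)]
    return f"CREATE TABLE {name} (id INT PRIMARY KEY, " + ", ".join(cols) + ")"


def make_row(row_id: int, field_count: int, field_len: int,
             populated: int) -> List[Optional[object]]:
    """Fill the first `populated` fields to max length; leave the rest as None (NULL)."""
    values: List[Optional[object]] = [
        "x" * field_len if i < populated else None for i in range(field_count)
    ]
    return [row_id] + values


# A dense row populates every field; a sparse row leaves every field NULL.
dense_row = make_row(1, field_count=5, field_len=8, populated=5)
sparse_row = make_row(1, field_count=5, field_len=8, populated=0)
```

Varying `populated` between 0 and `field_count` corresponds to the configurable range of populated fields.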

Dense and sparse tables allow one to model capacity and performance of:

Star schema
IoT data
Big data

Workload

The workload uses a customized Yahoo! Cloud Serving Benchmark (YCSB).
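
As a rough illustration of how a YCSB workload is parameterized, the sketch below renders a properties file using property names from upstream YCSB's core workload. The customized YCSB in this demo may use different or additional properties, and the helper name `ycsb_properties` and the example values are assumptions.

```python
def ycsb_properties(record_count: int, operation_count: int,
                    field_count: int, field_length: int,
                    update_proportion: float = 1.0) -> str:
    """Render upstream YCSB core workload properties as key=value lines."""
    props = {
        # Class name used by recent upstream YCSB releases;
        # older releases use the com.yahoo.ycsb package instead.
        "workload": "site.ycsb.workloads.CoreWorkload",
        "recordcount": record_count,        # rows loaded in the load phase
        "operationcount": operation_count,  # operations in the run phase
        "fieldcount": field_count,          # fields per record
        "fieldlength": field_length,        # bytes per field
        "readproportion": round(1.0 - update_proportion, 6),
        "updateproportion": update_proportion,
    }
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


# Example values are illustrative only: an update-only workload.
text = ycsb_properties(record_count=1000, operation_count=5000,
                       field_count=10, field_length=100)
```

The resulting text could be written to a file and passed to YCSB with its `-P` option, assuming the customized build keeps the upstream command-line interface.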

Arcion

The legacy Arcion CLI is used.

Staging

DBFS

The root directory will be created if it does not exist. It can be accessed under /dbfs/
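
The create-if-missing behavior can be sketched as follows. This is an assumption about the approach, not the notebook's actual code: the helper `ensure_root` and the directory name `arcion-demo` are hypothetical, and a temporary directory stands in for the `/dbfs/` mount so the sketch runs outside Databricks.

```python
import os
import tempfile


def ensure_root(base: str, name: str) -> str:
    """Create the staging root under `base` if it is missing and return its path."""
    root = os.path.join(base, name)
    # exist_ok=True makes the call a no-op when the directory already exists.
    os.makedirs(root, exist_ok=True)
    return root


# On a Databricks cluster, `base` would be the /dbfs/ mount point.
base = tempfile.mkdtemp()
root = ensure_root(base, "arcion-demo")  # hypothetical directory name
```

Calling `ensure_root` a second time with the same arguments returns the same path without raising an error.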

