|
5 | 5 | 1. [Overview](#overview) |
6 | 6 | - [Cost](#cost) |
7 | 7 | 2. [Prequisites](#prequisites) |
8 | | -3. [CloudShell Deployment](#cloudshell-deployment) |
9 | | -4. [Local Deployment (Mac / Linux)](#local-deployment-mac--linux) |
10 | | -5. [Deployment Validation](#deployment-validation) |
11 | | -6. [Running the Guidance](#running-the-guidance) |
12 | | -7. [Evaluation and Testing](#evaluation-and-testing) |
13 | | -8. [Tracing and Observability](#tracing-and-observability) |
14 | | -9. [Next Steps](#next-steps) |
15 | | -10. [Cleanup](#cleanup) |
16 | | -11. [Common issues, and debugging](#common-issues-and-debugging) |
17 | | -12. [Revisions](#revisions) |
18 | | -13. [Authors](#authors) |
| 8 | +3. [Notes](#notes) |
| 9 | +4. [CloudShell Deployment](#cloudshell-deployment) |
| 10 | +5. [Local Deployment (Mac / Linux)](#local-deployment-mac--linux) |
| 11 | +6. [Deployment Validation](#deployment-validation) |
| 12 | +7. [Running the Guidance](#running-the-guidance) |
| 13 | +8. [Evaluation and Testing](#evaluation-and-testing) |
| 14 | +9. [Tracing and Observability](#tracing-and-observability) |
| 15 | +10. [Next Steps](#next-steps) |
| 16 | +11. [Cleanup](#cleanup) |
| 17 | +12. [Common issues, and debugging](#common-issues-and-debugging) |
| 18 | +13. [Revisions](#revisions) |
| 19 | +14. [Authors](#authors) |
19 | 20 |
|
20 | 21 | ## Overview |
21 | 22 |
|
@@ -148,272 +149,89 @@ This Guidance uses AWS CDK. If you are using AWS CDK for the first time, CDK wil |
148 | 149 |
|
149 | 150 | ### Prerequisites |
150 | 151 |
|
151 | | -1. **Node.js and npm** (version 21.x or higher) |
| 152 | +- **Node.js 21.x+** and **npm**: `brew install node` |
| 153 | +- **AWS CLI 2.27.51+**: [Install guide](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) + `aws configure` |
| 154 | +- **AWS CDK CLI**: `npm install -g aws-cdk` |
| 155 | +- **Python 3.13+**: `brew install python` |
| 156 | +- **Container runtime**: [Podman](https://podman.io/) or [Docker](https://www.docker.com/) |
152 | 157 |
|
153 | | - ```bash |
154 | | - # For macOS |
155 | | - brew install node |
156 | | - ``` |
157 | | - |
158 | | -2. **AWS CLI** (version 2.27.51 or higher) |
159 | | - |
160 | | - ```bash |
161 | | - # For macOS |
162 | | - curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg" |
163 | | - sudo installer -pkg AWSCLIV2.pkg -target / |
164 | | - |
165 | | - # Configure with your credentials |
166 | | - aws configure |
167 | | - ``` |
168 | | - |
169 | | -3. **AWS CDK CLI** |
170 | | - |
171 | | - ```bash |
172 | | - npm install -g aws-cdk |
173 | | - ``` |
174 | | - |
175 | | -4. **Python** (version 3.13.x or higher) |
176 | | - |
177 | | - **Without** Homebrew |
| 158 | +### Quick Start |
178 | 159 |
|
179 | | - ```bash |
180 | | - /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)" |
181 | | - python3 --version |
182 | | - ``` |
183 | | - |
184 | | - **With** Homebrew |
185 | | - |
186 | | - ```bash |
187 | | - brew install python |
188 | | - python3 --version |
189 | | - ``` |
190 | | - |
191 | | -5. Either **[Podman](https://podman.io/)** or **[Docker](https://www.docker.com/)** installed and running |
192 | | - |
193 | | -### Deployment Steps |
194 | | - |
195 | | -1. **Clone the repository** |
196 | | - |
197 | | - ```bash |
198 | | - git -b v2 clone https://github.com/aws-solutions-library-samples/guidance-for-agentic-data-exploration-on-aws.git |
199 | | - cd guidance-for-agentic-data-exploration-on-aws |
200 | | - ``` |
201 | | - |
202 | | -2. **Install dependencies:** |
203 | | - |
204 | | - ```bash |
205 | | - npm install |
206 | | - ``` |
207 | | - |
208 | | -3. **Bootstrap AWS environment** (if not already done): |
209 | | - |
210 | | - ```bash |
211 | | - npx cdk bootstrap |
212 | | - ``` |
213 | | - |
214 | | -4. **Start container runtime** (one time if not already done): |
215 | | - |
216 | | - ```bash |
217 | | - podman machine init |
218 | | - podman machine start |
219 | | - ``` |
220 | | - |
221 | | -5. **Deploy:** |
222 | | - |
223 | | - ```bash |
224 | | - # Standard deployment |
225 | | - ./dev-tools/deploy.sh |
226 | | - |
227 | | - # With Neptune graph database |
228 | | - ./dev-tools/deploy.sh --with-graph-db |
229 | | - ``` |
| 160 | +```bash |
| 161 | +# 1. Clone and setup |
| 162 | +git clone -b v2 https://github.com/aws-solutions-library-samples/guidance-for-agentic-data-exploration-on-aws.git |
| 163 | +cd guidance-for-agentic-data-exploration-on-aws |
| 164 | +npm install |
230 | 165 |
|
231 | | -### Deployment Configuration Options |
| 166 | +# 2. Bootstrap AWS (one-time) |
| 167 | +npx cdk bootstrap |
232 | 168 |
|
233 | | -#### Option A: Quick Deployment (New VPC) |
| 169 | +# 3. Start container runtime (one-time) |
| 170 | +podman machine init && podman machine start |
234 | 171 |
|
235 | | -```bash |
236 | | -# Deploy with new VPC, no graph database |
| 172 | +# 4. Deploy |
237 | 173 | ./dev-tools/deploy.sh |
238 | | - |
239 | | -# Deploy with new VPC and Neptune graph database |
240 | | -./dev-tools/deploy.sh --with-graph-db |
241 | | - |
242 | | -# Deploy with new VPC, graph database, and guardrails |
243 | | -./dev-tools/deploy.sh --with-graph-db --guardrail-mode enforce |
244 | 174 | ``` |
245 | 175 |
|
246 | | -#### Option B: Custom Configuration |
| 176 | +### Deployment Options |
247 | 177 |
|
248 | | -```bash |
249 | | -# Copy and customize the template |
250 | | -cp ./dev-tools/deploy-local-template.sh ./dev-tools/deploy-local.sh |
| 178 | +| Configuration | Command | |
| 179 | +| ------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------- | |
| 180 | +| **New VPC, No Graph DB** | `./dev-tools/deploy.sh` | |
| 181 | +| **New VPC with New Graph DB** | `./dev-tools/deploy.sh --with-graph-db` | |
| 182 | +| **Existing VPC, No Graph DB** | `./dev-tools/deploy.sh --vpc-id vpc-123` | |
| 183 | +| **Existing VPC with New Graph DB** | `./dev-tools/deploy.sh --vpc-id vpc-123 --with-graph-db` | |
| 184 | +| **Existing VPC and Graph DB** | `./dev-tools/deploy.sh --vpc-id vpc-123 --neptune-sg sg-456 --neptune-host cluster.neptune.amazonaws.com --guardrail-mode enforce` | |
| 185 | +| **Guardrails Mode**<br/> **enforce** blocks<br/> **shadow** _(default)_ logs only | `./dev-tools/deploy.sh --guardrail-mode enforce` | |
251 | 186 |
|
252 | | -# Edit deploy-local.sh with your VPC/Neptune/Guardrails settings |
253 | | -# Then deploy with your custom configuration |
254 | | -./dev-tools/deploy-local.sh |
255 | | -``` |
| 187 | +### Advanced Configuration |
256 | 188 |
|
257 | | -#### Option C: Manual Parameters |
| 189 | +#### Template-based Configuration |
258 | 190 |
|
259 | 191 | ```bash |
260 | | -# Deploy with existing VPC and Neptune |
261 | | -./dev-tools/deploy.sh \ |
262 | | - --vpc-id vpc-123 \ |
263 | | - --neptune-sg sg-456 \ |
264 | | - --neptune-host cluster.neptune.amazonaws.com \ |
265 | | - --guardrail-mode shadow |
266 | | - |
267 | | -# Deploy with new Neptune graph database |
268 | | -./dev-tools/deploy.sh \ |
269 | | - --with-graph-db \ |
270 | | - --guardrail-mode enforce |
| 192 | +# Copy and customize the deployment template |
| 193 | +cp ./dev-tools/deploy-local-template.sh ./dev-tools/deploy-local.sh |
| 194 | +# Edit deploy-local.sh with your settings, then run: |
| 195 | +./dev-tools/deploy-local.sh |
271 | 196 | ``` |
272 | 197 |
|
273 | | -### VPC Configuration |
274 | | - |
275 | | -#### Default Deployment (New VPC) |
| 198 | +#### VPC Requirements (for existing VPC) |
276 | 199 |
|
277 | | -By default, the stack creates a new VPC with the required infrastructure: |
278 | | - |
279 | | -```bash |
280 | | -CDK_DOCKER=podman npx cdk deploy --require-approval never |
281 | | -``` |
| 200 | +Your VPC must have: |
282 | 201 |
|
283 | | -#### Bring Your Own VPC (Optional) |
| 202 | +- **Public subnets** (2+ AZs) with Internet Gateway routes and `aws-cdk:subnet-type = Public` tags |
| 203 | +- **Private subnets** (2+ AZs) with NAT Gateway routes and `aws-cdk:subnet-type = Private` tags |
| 204 | +- **Internet Gateway** and **NAT Gateway(s)** properly configured |
284 | 205 |
|
285 | | -You can deploy the services into an existing VPC by providing the VPC ID as a context value: |
| 206 | +#### Neptune Integration |
286 | 207 |
|
287 | 208 | ```bash |
288 | | -CDK_DOCKER=podman npx cdk deploy --context vpcId=vpc-xxxxxxxxx --require-approval never |
289 | | -``` |
290 | | - |
291 | | -#### VPC Requirements |
292 | | - |
293 | | -When using an existing VPC, it **must** have the following configuration: |
294 | | - |
295 | | -**Required Subnets:** |
296 | | - |
297 | | -1. **Public Subnets** (for Application Load Balancer): |
298 | | - |
299 | | - - At least 2 subnets in different Availability Zones |
300 | | - - Must have routes to an Internet Gateway (`0.0.0.0/0 → igw-xxxxx`) |
301 | | - - Must be tagged with `aws-cdk:subnet-type = Public` |
302 | | - - Must have `MapPublicIpOnLaunch = true` |
| 209 | +# Get Neptune details for existing cluster |
| 210 | +CLUSTER_NAME="your-cluster-name" |
| 211 | +NEPTUNE_SG=$(aws neptune describe-db-clusters --db-cluster-identifier $CLUSTER_NAME --query "DBClusters[0].VpcSecurityGroups[0].VpcSecurityGroupId" --output text) |
| 212 | +NEPTUNE_HOST=$(aws neptune describe-db-clusters --db-cluster-identifier $CLUSTER_NAME --query "DBClusters[0].ReaderEndpoint" --output text) |
303 | 213 |
|
304 | | -2. **Private Subnets** (for Fargate Tasks): |
305 | | - - At least 2 subnets in different Availability Zones |
306 | | - - Must have routes to NAT Gateway for outbound access (`0.0.0.0/0 → nat-xxxxx`) |
307 | | - - Must be tagged with `aws-cdk:subnet-type = Private` |
308 | | - - Must have `MapPublicIpOnLaunch = false` |
309 | | - |
310 | | -**Required Infrastructure:** |
311 | | - |
312 | | -- **Internet Gateway** attached to the VPC |
313 | | -- **NAT Gateway(s)** in public subnets for private subnet outbound access |
314 | | -- **Route Tables** properly configured: |
315 | | - - Public route table: `0.0.0.0/0 → Internet Gateway` |
316 | | - - Private route table: `0.0.0.0/0 → NAT Gateway` |
317 | | - |
318 | | -#### Example VPC Structure |
319 | | - |
320 | | -``` |
321 | | -VPC (172.30.0.0/16) |
322 | | -├── Public Subnets (Load Balancer) |
323 | | -│ ├── subnet-xxxxx (us-east-1a) - 172.30.3.0/24 |
324 | | -│ ├── subnet-xxxxx (us-east-1b) - 172.30.4.0/24 |
325 | | -│ └── subnet-xxxxx (us-east-1c) - 172.30.5.0/24 |
326 | | -├── Private Subnets (Fargate Tasks) |
327 | | -│ ├── subnet-xxxxx (us-east-1a) - 172.30.0.0/24 |
328 | | -│ ├── subnet-xxxxx (us-east-1b) - 172.30.1.0/24 |
329 | | -│ └── subnet-xxxxx (us-east-1c) - 172.30.2.0/24 |
330 | | -├── Internet Gateway |
331 | | -└── NAT Gateway(s) |
| 214 | +# Deploy with Neptune integration |
| 215 | +./dev-tools/deploy.sh --vpc-id vpc-123 --neptune-sg $NEPTUNE_SG --neptune-host $NEPTUNE_HOST |
332 | 216 | ``` |
333 | 217 |
|
334 | | -### Graph Database Deployment |
335 | | - |
336 | | -The AI Data Explorer can optionally deploy a Neptune graph database for advanced graph analytics and relationship modeling. |
337 | | - |
338 | | -**Deploy with Graph Database:** |
| 218 | +#### Set Deployment Region |
339 | 219 |
|
340 | 220 | ```bash |
341 | | -# Deploy everything including Neptune graph database |
342 | | -./dev-tools/deploy.sh --with-graph-db |
343 | | - |
344 | | -# Deploy with existing VPC |
345 | | -./dev-tools/deploy.sh --with-graph-db --vpc-id vpc-123 |
| 221 | +# Deploy to different region |
| 222 | +export AWS_DEFAULT_REGION=us-west-2 |
| 223 | +npx cdk bootstrap # if not already done in this region |
| 224 | +./dev-tools/deploy.sh |
346 | 225 | ``` |
347 | 226 |
|
348 | | -**Graph Database Components:** |
349 | | - |
350 | | -- **Neptune Cluster**: Serverless Neptune database with auto-scaling |
351 | | -- **ETL Pipeline**: Bedrock-powered data transformation workflows |
352 | | -- **Lambda Functions**: ETL processor and bulk data loader |
353 | | -- **DynamoDB Tables**: Logging for data analysis and schema translation |
354 | | -- **S3 Buckets**: ETL data storage and access logging |
355 | | - |
356 | | -**Deploy Graph Database Only:** |
| 227 | +#### Graph Database Only |
357 | 228 |
|
358 | 229 | ```bash |
359 | | -# Standalone graph database deployment |
| 230 | +# Deploy standalone Neptune cluster |
360 | 231 | ./dev-tools/deploy-graph-db.sh |
361 | | - |
362 | | -# With existing VPC |
363 | | -./dev-tools/deploy-graph-db.sh --vpc-id vpc-123 |
| 232 | +./dev-tools/deploy-graph-db.sh --vpc-id vpc-123 # with existing VPC |
364 | 233 | ``` |
365 | 234 |
|
366 | | -### Neptune Integration Setup |
367 | | - |
368 | | -For Neptune connectivity in the same VPC: |
369 | | - |
370 | | -1. **Get your Neptune cluster details**: |
371 | | - |
372 | | - ```bash |
373 | | - CLUSTER_NAME="name-of-your-cluster" |
374 | | - NEPTUNE_SG=$(aws neptune describe-db-clusters --db-cluster-identifier $CLUSTER_NAME --query "DBClusters[0].VpcSecurityGroups[0].VpcSecurityGroupId" --output text) |
375 | | - NEPTUNE_HOST=$(aws neptune describe-db-clusters --db-cluster-identifier $CLUSTER_NAME --query "DBClusters[0].ReaderEndpoint" --output text) |
376 | | - echo "Neptune Security Group ID: $NEPTUNE_SG" |
377 | | - echo "Neptune Reader Endpoint: $NEPTUNE_HOST" |
378 | | - ``` |
379 | | - |
380 | | -2. **Deploy with automatic Neptune integration**: |
381 | | - ```bash |
382 | | - CDK_DOCKER=podman npx cdk deploy \ |
383 | | - --context vpcId=vpc-0af137533d471cd3b \ |
384 | | - --context neptuneSgId=$NEPTUNE_SG \ |
385 | | - --context neptuneHost=$NEPTUNE_HOST \ |
386 | | - --require-approval never |
387 | | - ``` |
388 | | - |
389 | | -This automatically configures the security group rules and Neptune endpoint for the agent service. |
390 | | - |
391 | | -### Multi-Region Deployment |
392 | | - |
393 | | -The application is region-agnostic and can be deployed to any AWS region that supports Amazon Bedrock, Amazon S3 Vectors, and the features used by this guidance. |
394 | | - |
395 | | -**To deploy in a different region (e.g., us-west-2):** |
396 | | - |
397 | | -1. **Set your AWS CLI region:** |
398 | | - |
399 | | - ```bash |
400 | | - export AWS_DEFAULT_REGION=us-west-2 |
401 | | - # or |
402 | | - aws configure set region us-west-2 |
403 | | - ``` |
404 | | - |
405 | | -2. **Bootstrap CDK in the new region** (if not already done): |
406 | | - |
407 | | - ```bash |
408 | | - npx cdk bootstrap |
409 | | - ``` |
410 | | - |
411 | | -3. **Deploy normally:** |
412 | | - |
413 | | - ```bash |
414 | | - ./dev-tools/deploy.sh |
415 | | - ``` |
416 | | - |
417 | 235 | ## Deployment Validation |
418 | 236 |
|
419 | 237 | After deployment, verify the system is working correctly: |
|