Use MSK CreateTopic API, upgrade Kafka to 3.8.x, and replace pykafka with confluent-kafka #4
Open
revsystem wants to merge 14 commits into aws-samples:main from
…de TestData.csv for stream ingestion testing; update notebooks for Kafka integration using confluent-kafka
…a in 2.StreamIngest
- Template: KafkaVersion 2.8.0 -> 3.8.x (AWS recommended)
- Template: Add Parameters KnowledgeBaseName, DataSourceName with defaults; Outputs; DescribeStacks IAM
- Template: SageMakerMSKAccessPolicy + kafka:CreateTopic, kafka:ListTopics, kafka:DescribeTopic, kafka-cluster:CreateTopic
- 1.Setup: Replace terminal topic creation with boto3 upgrade + CreateTopic + ListTopics; get KBName/DSName from stack outputs; KB/DS lookup by KBName/DSName
- 2.StreamIngest: Migrate from pykafka to confluent-kafka; ../data/TestData.csv; timezone-aware datetime
- 3.Cleanup: No change (identical to upstream)
Co-authored-by: Cursor <cursoragent@cursor.com>
…aint Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
- Move data/TestData.csv to notebooks/TestData.csv
- Update 2.StreamIngest to pd.read_csv('TestData.csv')
- Update CLAUDE.md references
Co-authored-by: Cursor <cursoragent@cursor.com>
…enhancing clarity
This update aims to simplify the setup process and improve the overall readability of the notebook.
…instructions and code structure improvements
- Added markdown cell in 1.Setup notebook to clarify boto3/botocore upgrade requirements and handling dependency conflicts.
- Improved code structure in 2.StreamIngest notebook by removing redundant lines and ensuring consistent formatting.
- Included installation command for confluent-kafka in 2.StreamIngest notebook.
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
…nused Export)
- Align notebook JSON indentation with upstream (1-space)
- Preserve upstream trailing whitespace in unchanged lines
- Remove unused Export from CloudFormation Outputs
Co-authored-by: Cursor <cursoragent@cursor.com>
…3 layer integration
- Add instructions for creating and using a boto3 Lambda layer in CLAUDE.md
- Update CloudFormation template to include Boto3LayerArn parameter and reference it in the Lambda function
- Modify .gitignore to exclude additional build and configuration files
Co-authored-by: Cursor <cursoragent@cursor.com>
60b4930 to 11e35dc
Issue #, if available: #3
Description of changes:
Summary
This PR simplifies the setup process and replaces deprecated dependencies across the sample project.
Changes
CloudFormation template (templates/bedrock-kb-stream-ingest.yml)
- Add KnowledgeBaseName and DataSourceName as CloudFormation Parameters with sensible defaults
- Add IAM permissions (kafka:CreateTopic, kafka:ListTopics, kafka:DescribeTopic, kafka-cluster:CreateTopic) and cloudformation:DescribeStacks
1.Setup.ipynb
- Replace terminal topic creation with the CreateTopic and ListTopics APIs via boto3
- Get the Knowledge Base and Data Source names from the stack outputs via DescribeStacks, then look up Knowledge Base and Data Source by exact name match
- Store KBName and DSName via %store for use in subsequent notebooks
2.StreamIngest.ipynb
- Add a %store -r guard before the data ingestion cell to provide a clear error message if variables from 1.Setup are missing
- Use timezone-aware datetime (timezone.utc) instead of deprecated datetime.utcfromtimestamp()
- Place TestData.csv in notebooks/ (same directory as notebook) for simpler path handling
Other
- Add notebooks/TestData.csv (was previously at data/TestData.csv, which did not exist in the repository)
Testing
All changes verified end-to-end:
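For reviewers, the name-based lookup described under 1.Setup.ipynb can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names are hypothetical, the output keys KnowledgeBaseName and DataSourceName are assumed to mirror the template parameter names, and pagination of the list calls is not handled.

```python
from typing import Optional


def stack_outputs_to_dict(stack: dict) -> dict:
    """Flatten a CloudFormation stack description's Outputs list into {key: value}."""
    return {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}


def find_id_by_exact_name(summaries: list, name: str, id_field: str) -> Optional[str]:
    """Return the ID of the summary whose 'name' equals `name` exactly, or None."""
    matches = [s[id_field] for s in summaries if s.get("name") == name]
    if len(matches) > 1:
        raise ValueError(f"Multiple resources named {name!r}: {matches}")
    return matches[0] if matches else None


def lookup_kb_and_ds(stack_name: str):
    """Resolve Knowledge Base and Data Source IDs for a deployed stack.

    Requires AWS credentials; boto3 is imported lazily so the pure helpers
    above can be used without it.
    """
    import boto3

    cfn = boto3.client("cloudformation")
    agent = boto3.client("bedrock-agent")

    stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
    outputs = stack_outputs_to_dict(stack)
    kb_name = outputs["KnowledgeBaseName"]  # assumed output key
    ds_name = outputs["DataSourceName"]     # assumed output key

    # Only the first page of results is inspected in this sketch.
    kbs = agent.list_knowledge_bases()["knowledgeBaseSummaries"]
    kb_id = find_id_by_exact_name(kbs, kb_name, "knowledgeBaseId")
    if kb_id is None:
        raise LookupError(f"No Knowledge Base named {kb_name!r}")

    dss = agent.list_data_sources(knowledgeBaseId=kb_id)["dataSourceSummaries"]
    ds_id = find_id_by_exact_name(dss, ds_name, "dataSourceId")
    if ds_id is None:
        raise LookupError(f"No Data Source named {ds_name!r}")
    return kb_id, ds_id
```

Matching on the exact name (rather than a prefix or substring) avoids picking up a similarly named resource left over from another deployment in the same account.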
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Made with Cursor
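As a rough illustration of the 2.StreamIngest.ipynb changes listed above (timezone-aware timestamps and the move to confluent-kafka), a producer loop might look like the sketch below. The function names, the JSON payload shape, and the "timestamp" field are assumptions for illustration only; the security settings an MSK cluster normally needs (TLS or IAM authentication) are omitted.

```python
import json
import time
from datetime import datetime, timezone


def utc_from_epoch(epoch_seconds: float) -> datetime:
    """Timezone-aware replacement for the deprecated datetime.utcfromtimestamp()."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def build_message(row: dict, epoch_seconds: float) -> bytes:
    """Serialize one record as UTF-8 JSON with an ISO 8601 UTC timestamp (hypothetical shape)."""
    payload = dict(row, timestamp=utc_from_epoch(epoch_seconds).isoformat())
    return json.dumps(payload).encode("utf-8")


def send_rows(rows, bootstrap_servers: str, topic: str) -> None:
    """Send each row to Kafka; needs the confluent-kafka package and a reachable broker."""
    from confluent_kafka import Producer  # lazy import: only needed when sending

    producer = Producer({"bootstrap.servers": bootstrap_servers})
    for row in rows:
        producer.produce(topic, value=build_message(row, time.time()))
        producer.poll(0)  # serve delivery callbacks without blocking
    producer.flush()      # block until all queued messages are delivered
```

In a notebook the rows could come from something like pd.read_csv('TestData.csv').to_dict('records'), with bootstrap_servers taken from the MSK cluster's broker string.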