34 changes: 34 additions & 0 deletions Technical/Deployment-Guides/Nifi-Deployment-Guide.md
@@ -0,0 +1,34 @@
# NiFi Setup and Configuration Guide

⚠️ Potential issue | 🟡 Minor

Add the SPDX license header.

The PR description states that an Apache-2.0 SPDX license header is included, but it's missing from the file. Documentation files should include the license header at the top.

📄 Proposed fix to add license header

```diff
+<!-- SPDX-License-Identifier: Apache-2.0 -->
+
 # NiFi Setup and Configuration Guide
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Technical/Deployment-Guides/Nifi-Deployment-Guide.md` at line 1, add the SPDX
license header as the very first line of the Markdown file by inserting an HTML
comment containing "SPDX-License-Identifier: Apache-2.0" (i.e., <!--
SPDX-License-Identifier: Apache-2.0 -->) above the existing title "NiFi Setup
and Configuration Guide" so the document clearly carries the Apache-2.0 SPDX
identifier.


After executing the Docker Compose setup, follow these steps to configure NiFi with the necessary templates and services:

## 1. Configure AWS Credentials

- Navigate to the **Controller Services** section.
- In the **AWSCredentialsProviderControllerService**, enter the required AWS credentials:
- **Access Key ID**
- **Secret Access Key**
Comment on lines +5 to +10

⚠️ Potential issue | 🟠 Major

Add security best practices and required IAM permissions.

The guide instructs users to enter AWS credentials directly without mentioning security best practices or required permissions. Consider adding:

  1. Security guidance: Recommend using IAM roles when possible, or secrets management solutions for production environments, rather than hardcoded credentials.
  2. Required IAM permissions: Document the specific AWS permissions needed for the NiFi operations (S3, etc.).
  3. Region configuration: Specify if AWS region configuration is required.
  4. Navigation path: Provide the complete UI navigation path (e.g., "hamburger menu → Controller Settings → Controller Services").
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Technical/Deployment-Guides/Nifi-Deployment-Guide.md` around lines 5-10,
update the "Configure AWS Credentials" section to add security best practices
and required settings: advise using IAM roles (instance/profile or IRSA) or a
secrets manager instead of hardcoding credentials, and show how to configure
AWSCredentialsProviderControllerService safely; list the specific IAM
permissions NiFi needs (e.g., s3:GetObject, s3:PutObject, s3:ListBucket plus
kms:Decrypt/kms:Encrypt if using KMS, and sts:AssumeRole if using role chaining)
and any required resource ARNs or least-privilege guidance; state that AWS
region must be set (or inherited) and where to set it in NiFi; and add the full
UI navigation path to reach Controller Services (e.g., hamburger menu →
Controller Settings → Controller Services) to make locating
AWSCredentialsProviderControllerService clear.
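To make the permissions finding above concrete, here is a minimal sketch of a least-privilege IAM policy built from the actions the review lists. The bucket name `tazama` is the guide's stated default; the statement IDs, KMS/STS additions, and exact resource ARNs are assumptions to adapt to your deployment.

```python
import json

# Hypothetical least-privilege policy for the NiFi S3 flow, assembled from the
# permissions named in the review comment. Adjust ARNs, and add kms:Decrypt /
# kms:Encrypt or sts:AssumeRole statements, to match your actual setup.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "NifiS3Objects",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::tazama/*",
        },
        {
            "Sid": "NifiS3List",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": "arn:aws:s3:::tazama",
        },
    ],
}

print(json.dumps(policy, indent=2))
```

Attaching a policy like this to an IAM role (and letting NiFi assume the role) avoids hardcoding an Access Key ID and Secret Access Key in the controller service.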


## 2. Configure Database Credentials

- For each database integrated with NiFi, enter the respective credentials:
- **Database passwords**
- Enable all controller services in the **NiFi Flow Configuration** by clicking the **lightning icon**.
Comment on lines +12 to +16

⚠️ Potential issue | 🟠 Major

Clarify which databases and controller services need configuration.

The instructions are too vague:

  1. Specify databases: List which specific databases need to be configured (e.g., PostgreSQL, Arango, Redis, etc.).
  2. Identify controller services: Name the specific controller services where database passwords should be entered.
  3. Reorder steps: Enabling "all controller services" should occur after all configurations (including Parameter Contexts in step 3) are complete, not in this step.
  4. Improve UI reference: Instead of "lightning icon", use a more descriptive reference like "Configuration icon (lightning bolt) in the NiFi Flow Configuration menu".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Technical/Deployment-Guides/Nifi-Deployment-Guide.md` around lines 12-16,
update the "2. Configure Database Credentials" section to explicitly list the
databases to configure (e.g., PostgreSQL, ArangoDB, Redis) and name the specific
NiFi controller services where credentials must be entered (e.g.,
DBCPConnectionPool for PostgreSQL, ArangoDBConnectionService,
RedisConnectionPool), move the instruction to enable controller services so it
occurs after completing all configurations and Parameter Contexts (i.e.,
reference enabling in a later step), and replace "lightning icon" with a clearer
UI reference such as "Configuration icon (lightning bolt) in the NiFi Flow
Configuration menu" to improve clarity.
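Besides the lightning-bolt icon in the UI, controller services can be enabled over NiFi's REST API. The sketch below builds the request for NiFi 1.x's `PUT /nifi-api/controller-services/{id}/run-status` endpoint; the base URL, service id, and revision version are placeholders for your deployment.

```python
import json

# Hedged sketch: enable a controller service via NiFi's REST API instead of
# clicking the lightning-bolt icon. Assumes NiFi 1.x. The service id and
# revision version come from a prior GET on the service.
def enable_request(base_url: str, service_id: str, revision_version: int):
    """Return the (url, json_body) pair for an enable call."""
    url = f"{base_url}/nifi-api/controller-services/{service_id}/run-status"
    body = {"revision": {"version": revision_version}, "state": "ENABLED"}
    return url, json.dumps(body)

# Placeholder values; substitute your NiFi host and the real service id.
url, body = enable_request("https://localhost:8443", "abc-123", 0)
```

Send the pair with any HTTP client (e.g. `curl -X PUT -H 'Content-Type: application/json'`). As the review notes, enable services only after all credentials and Parameter Contexts are configured.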


## 3. Set Parameter Context for Specific Buckets

- To add a specific AWS bucket, navigate to **Parameter Contexts** and select **pbucket**.
- The default bucket name is **tazama** (case-sensitive).
- Add the following parameter contexts:
- **phttp** for InvokeHTTP
- **pozone** for PutS3Bucket
- **pbucket** for UpdateAttributes
Comment on lines +18 to +25

⚠️ Potential issue | 🟠 Major

Clarify Parameter Context instructions and provide parameter details.

This section has confusing instructions and missing critical information:

  1. Clarify create vs. select: Line 20 says "select pbucket" but line 22 says "Add the following parameter contexts" including pbucket. Are these contexts pre-existing or do they need to be created? Make this explicit.

  2. Provide parameter details: For each parameter context, specify the actual parameter names and values to configure:

    • phttp: What parameters are needed for InvokeHTTP? (e.g., endpoint URLs, timeouts, headers?)
    • pozone: What parameters are needed for PutS3Bucket? (e.g., region, bucket name?)
    • pbucket: What is the parameter name for the bucket? (e.g., bucket-name: tazama?)
  3. Add examples: Include example parameter configurations or screenshots to improve clarity.

Without these details, users cannot complete the configuration.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Technical/Deployment-Guides/Nifi-Deployment-Guide.md` around lines 18-25,
clarify that the steps create or update Parameter Contexts rather than only
selecting them: explicitly state "Create or select the Parameter Context named
pbucket" and whether phttp and pozone must be created if missing; then list
exact parameter keys and example values to add to each context — for phttp add
parameters like http.endpoint (e.g., https://api.example.com/ingest),
http.timeout (e.g., 30s), http.headers (e.g., Authorization: Bearer <token>);
for pozone add s3.region (e.g., us-east-1) and s3.bucket (e.g., tazama) and
s3.credentials-id (NiFi credentials reference); for pbucket add bucket.name =
tazama (case-sensitive) and bucket.env (optional); finally include a short
example block showing the three contexts and their key=value pairs so users can
copy the exact parameter names and values when configuring phttp, pozone, and
pbucket.
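As a concrete illustration of what the review asks for, the three contexts might be filled in as below. Every key and value here is a hypothetical placeholder except the bucket name `tazama`, which the guide states is the case-sensitive default.

```
# Parameter Context: phttp (used by InvokeHTTP)
http.endpoint = https://api.example.com/ingest
http.timeout  = 30s

# Parameter Context: pozone (used by PutS3Bucket)
s3.region = us-east-1
s3.bucket = tazama

# Parameter Context: pbucket (used by UpdateAttributes)
bucket.name = tazama   # case-sensitive
```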


## 4. Start All Processors

- Select all processors using **Shift + Click** or **Shift + Arrow Keys**.
- Go to the **Operate** menu and click **Start** to begin processing.
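The same step can be scripted: NiFi 1.x exposes `PUT /nifi-api/flow/process-groups/{id}` to schedule every processor in a process group at once. This sketch builds that request; the base URL is a placeholder, and `root` addresses the top-level group.

```python
import json

# Hedged sketch: start all processors in a process group via NiFi's REST API,
# as an alternative to Shift + Click and the Operate menu's Start button.
def start_all_request(base_url: str, group_id: str = "root"):
    """Return the (url, json_body) pair that sets the whole group to RUNNING."""
    url = f"{base_url}/nifi-api/flow/process-groups/{group_id}"
    body = {"id": group_id, "state": "RUNNING"}
    return url, json.dumps(body)

# Placeholder host; substitute your NiFi instance's URL.
url, body = start_all_request("https://localhost:8443")
```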

## Conclusion

Ensure all configuration steps are completed and all controller services are enabled. This will allow the NiFi flow to run smoothly and interact correctly with the configured AWS services and databases.