
Handling errors in AmazonS3MoveCleanUpPolicy configuration #694


Description


Hi,

Does the AmazonS3MoveCleanupPolicy configuration ("fs.cleanup.policy.move.failure.aws.bucket.name": "") support storing the entire file to the error path when an issue occurs, or does it store only the problematic records?

"name": "s3_file_pulse_connector",
"config": {
"connector class": "io. streamthoughts.kafka.connect.filepulse.source.FilePulseSourceConnector",
"topic": "",
"tasks.max": "1",
"tasks. reader.class": "io.streamthoughts.kafka.connect.filepulse.fs.reader.AmazonS3ROwFileInputReader",
"fs.listing.class": "io.streamthoughts.kafka.connect.filepulse.fs.AmazonS3FileSystemListing",
"aws. s3. bucket. name": "
"aws. s3. bucket-prefix": "”
"aws. s3.region": "",
"fs. cleanup policy.class": "io.streamthoughts.kafka.connect.filepulse.fs.clean.AmazonS3MoveCleanupPolicy",
"fs. cleanup-policy.move. success.aws.bucket.name": "”
"fs.cleanup.policy.move.success.aws.prefix.path": "”
"fs. cleanup.policy-move.failure.aws.bucket.name": "”
"fs.cleanup.policy.move.failure.aws.prefix.path": “",
"tasks. file.status.storage.bootstrap.servers": “”
"tasks. file.status.storage.topic":
"tasks. file.status.storage.topic.partitions":10,
"tasks. file.status.storage.topic.replication. factor":1,
"errors. log. include messages": "true",
"errors. log enable": "true",
"key, converter": "org. apache. kafka.connect. storage. StringConverter",
"value. converter": "org,apache.kafka.connect.storage.StringConverter"
}

Activity


fhussonnois commented on Jan 24, 2025


Hi @Teju2404, the cleanup policy is used to move the entire file when an error occurs during processing.
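For illustration, here is a minimal sketch of the settings that drive this behavior; the bucket name and prefix (my-error-bucket, failed/) are placeholders, not values from this thread:

    "fs.cleanup.policy.class": "io.streamthoughts.kafka.connect.filepulse.fs.clean.AmazonS3MoveCleanupPolicy",
    "fs.cleanup.policy.move.failure.aws.bucket.name": "my-error-bucket",
    "fs.cleanup.policy.move.failure.aws.prefix.path": "failed/"

With these settings, a source file whose processing fails would be moved in its entirety to my-error-bucket under the failed/ prefix; the policy operates on whole files, not on individual records.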


Teju2404 commented on Jan 26, 2025


Hi @fhussonnois, thank you for the response!

I have a few other questions; could you please help me with them?

  1. We have a use case where files need to be read from an S3 bucket using the FilePulse connector. If there are any issues reading a file with FilePulse, is there a configuration available to send the individual failed records somewhere? If yes, could you provide the configuration details?
  2. When reading from S3 using the FilePulse connector, if there are multiple optional fields that do not need to be published to the topic, is there a configuration available to exclude those fields from being published? (A hedged configuration sketch for both points follows this list.)
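A sketch of how both points might be configured, assuming the installed FilePulse version ships the ExcludeFilter; the filter alias ExcludeOptional and the field names optionalField1, optionalField2 are hypothetical:

    "errors.tolerance": "all",
    "filters": "ExcludeOptional",
    "filters.ExcludeOptional.type": "io.streamthoughts.kafka.connect.filepulse.filter.ExcludeFilter",
    "filters.ExcludeOptional.fields": "optionalField1,optionalField2"

On point 1, note that Kafka Connect's errors.tolerance=all only skips records that fail conversion or transformation, and the dead letter queue (errors.deadletterqueue.topic.name) applies to sink connectors only, so it does not route failed source records to an error topic on its own. On point 2, the ExcludeFilter removes the listed fields from each record before it is published.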

Teju2404 commented on Jan 29, 2025


Hi @fhussonnois, could you help me with the questions above?
