Add TTL for retention policy #1699
Open
Description of the issue
If a customer frequently restarts the agent while a high number of deployments or log groups is configured, DescribeLogGroups (DLG) and PutRetentionPolicy (PRP) calls can be throttled. The DLG/PRP calls are used to add/update retention policies for log groups at the start of the agent.
Description of changes
A new state file named Amazon_CloudWatch_RetentionPolicyTTL is being added. It contains each log group and the last timestamp at which its retention policy was checked (and no updates were needed) or updated. The format per line of the file is loggroupname:timestamp. An example of two log groups:

RetentionPolicyTTL
Timestamps read from the state file are held in oldTimestamps. The IsExpired(group) call reads from oldTimestamps to determine whether the retention policy should be checked/updated.
New timestamps are held in RetentionPolicyTTL, which has the field newTimestamps. There is one scenario in which timestamps from oldTimestamps are persisted: when the timestamp is not expired. This is so we do not lose timestamps from previous agent runs. As a side effect, this helps clean up timestamps for log groups that are no longer configured by the user.
Stop()
Target
An IsExpired(group) call is made. If the timestamp is not expired, the read timestamp is persisted into the new timestamp cache using UpdateFromFile(group). If it is expired, the logic of checking/updating the retention policy continues.
Update(group) is called when the retention policy is valid (checked using DLG) or when the retention policy was updated (updated using PRP).
Logfile Input
Translation
License
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Tests
Scenarios
1. Two log groups configured, first run
$ cat /opt/aws/amazon-cloudwatch-agent/logs/state/Amazon_CloudWatch_RetentionPolicyTTL
fails because file does not exist
2. Two log groups configured, restarted within 5 minutes of TTL
3. Two log groups configured, restarted after 5 minutes of TTL
4. Two logs to one log group configured, restarted
Requirements
Before committing the code, please complete the following steps.
make fmt and make fmt-sh
make lint