Add a new section for transitioning indices to data streams #2216
base: main
I'm also wondering whether this tutorial is related to the goal we're trying to achieve in the Migrate from Indices to DataStreams knowledgebase article.
Also, applying some editorial changes to blend in with the structure of the page.
…ios described Hopefully this helps someone decide which procedure to follow.
manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md
## Manage general content with data streams [manage-general-content-with-data-streams]

[Data streams](/manage-data/data-store/data-streams.md) are specifically designed for time series data.
If you want to manage general content (data without timestamps) with data streams, you can set up [ingest pipelines](/manage-data/ingest/transform-enrich/ingest-pipelines.md) to transform and enrich your general content at [ingest](/manage-data/ingest.md) time, so that you can transition from periodic indices to a data stream and get the benefits of time-based data management.
I don't understand the final part of the sentence:
so that you can transition from periodic indices to a data stream and get the benefits of time-based data management.
-> the word periodic feels really weird.
I understand that we propose setting up an ingest pipeline that transforms and enriches the document at ingestion time, but the final part (which is the key) is not clear, in my opinion. I would suggest something direct, like:
by adding a timestamp field and get the benefits of time-based data management.
I'd add a warning note telling users to double-check that this makes sense and adds benefit for their use case (I'll share one use case example in private so you can consider adding something like that too).
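For context, the "adding a timestamp field" idea could look roughly like the sketch below. It only builds the request body for an ingest pipeline with a single `set` processor that copies the ingest time into `@timestamp`; nothing is sent to a cluster. The pipeline name `add-ingest-timestamp` is a hypothetical placeholder and is not taken from the PR or the knowledgebase article.

```python
import json

# Hypothetical pipeline name, used only for illustration.
PIPELINE_ID = "add-ingest-timestamp"


def build_timestamp_pipeline() -> dict:
    """Return an ingest pipeline body that stamps each document with its ingest time."""
    return {
        "description": "Add @timestamp to documents that have no time field",
        "processors": [
            {
                "set": {
                    # Data streams require a @timestamp field on every document.
                    "field": "@timestamp",
                    # Mustache template that resolves to the time of ingestion.
                    "value": "{{{_ingest.timestamp}}}",
                }
            }
        ],
    }


if __name__ == "__main__":
    # Print the request in console style so it can be reviewed before use.
    print(f"PUT _ingest/pipeline/{PIPELINE_ID}")
    print(json.dumps(build_timestamp_pipeline(), indent=2))
```

Whether an ingest-time timestamp is meaningful for a given data set is exactly the use-case question raised above, so this is an illustration rather than a recommendation.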
Thanks, Edu! Instead of a warning I added an example for context to (hopefully) help users determine if this procedure fits their scenario/use case.
manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md
1. [Reindex with a data stream](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-reindex) to copy your documents from an existing index to the data stream you created.

1. [Roll over](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-roll-over) the reindexed data stream so that the lifecycle policy and ingest pipeline you created will be applied to new data.
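As a rough illustration of the reindex step in the list above, the sketch below builds a `_reindex` request body that copies documents from an existing index into a data stream. The names `general-content-index`, `general-content-stream`, and `add-ingest-timestamp` are hypothetical placeholders. The one detail that matters is `op_type: create`, which reindexing into a data stream requires; the optional `pipeline` entry is an assumption about how existing documents might be given a timestamp.

```python
import json
from typing import Optional


def build_reindex_request(
    source_index: str,
    data_stream: str,
    pipeline: Optional[str] = None,
) -> dict:
    """Return a _reindex body that copies source_index into data_stream."""
    dest = {
        "index": data_stream,
        # Data streams are append-only, so reindexing must use op_type "create".
        "op_type": "create",
    }
    if pipeline is not None:
        # Optionally run each copied document through an ingest pipeline.
        dest["pipeline"] = pipeline
    return {"source": {"index": source_index}, "dest": dest}


if __name__ == "__main__":
    # All three names below are hypothetical examples.
    body = build_reindex_request(
        "general-content-index",
        "general-content-stream",
        pipeline="add-ingest-timestamp",
    )
    print("POST _reindex")
    print(json.dumps(body, indent=2))
```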
Rollover is supposed to happen automatically, so I don't see the point of this step.
It would just prove to the user that rollover works, but if they have ILM set up to ensure rollover is done after 10GB, this will cause an extra rollover.
Thank you, that's a good point. I'm not sure why this step was included in the knowledgebase tutorial, but I will reach out to Zoia to double-check with her.
### Roll over the reindexed data stream [manage-general-content-with-data-streams-roll-over]

Use the [_rollover API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-rollover) to create a new write index for the stream. This ensures that the lifecycle policy and ingest pipeline you've created will apply to any new documents that you index.
If we have already created an index and we have (maybe) performed a reindex, everything should be in place. What's the point of a rollover API call?
The user just needs to send data to the data stream. The rollover API should be used manually only when you want an exceptional or extra rollover for any reason. You can definitely do it, and it's harmless, but you will cause a rollover regardless of whether the backing index really needs to be rolled over based on the ILM settings.
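To make the difference concrete, here is a minimal sketch of the two forms the manual call can take. It only builds the request; the data stream name `general-content-stream` and the `max_age` value are hypothetical. Without a body, the rollover happens immediately. With `conditions`, it only happens if the current write index meets at least one of them.

```python
import json
from typing import Optional, Tuple


def build_rollover_request(
    data_stream: str,
    conditions: Optional[dict] = None,
) -> Tuple[str, str, dict]:
    """Return (method, path, body) for a manual rollover of data_stream."""
    # An empty body forces a rollover; conditions make the call a no-op
    # unless the current write index satisfies at least one of them.
    body = {"conditions": conditions} if conditions else {}
    return "POST", f"/{data_stream}/_rollover", body


if __name__ == "__main__":
    # Unconditional: always creates a new write index (the "extra" rollover).
    print(build_rollover_request("general-content-stream"))

    # Conditional: hypothetical threshold, checked once at request time.
    method, path, body = build_rollover_request(
        "general-content-stream", conditions={"max_age": "7d"}
    )
    print(method, path)
    print(json.dumps(body, indent=2))
```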
Thanks, my plan is to check with Zoia on the intention of this step and then remove this guidance if the step is just checking that indices get rolled over.
I can mention that info in the procedure description, as an optional step and remove this section as there's no need to elaborate on it.
In general, it feels great. I just added some minor comments for your consideration, including the removal of one step.
Hi @kilfoyle, I've implemented most of the changes that Edu suggested (one item is pending, I'm checking with the author about the goal of the manual rollover step), so depending on that, I might need to remove that step. Other than that, it should be ready for another peer review, whenever you get the chance. Please and thank you! 😃 P.S. The build seems to be broken now, but you should be able to use the preview regardless.
LGTM! ⛵
Very nice work on this @yetanothertw!
The goal of this PR is to include the content from the Migrate from Indices to DataStreams knowledgebase article into our ILM-related documentation.
Initially, the suggestion was to include this new content as a new section on the Manage existing indices page. Upon further review of existing content, the Tutorial: Automate rollover seems like a more appropriate home for this content as it already includes two other equivalent use cases:
The reason I think these use cases are equivalent is because they're trying to use ILM policies to migrate from periodic indices to a more automated way to manage rollover and replace the need to schedule or script index creation (one option migrates from indices to data streams and the other one migrates to using aliases in order to manage their backing indices). The new content adds this use case that's equivalent in scope:
Fixes #1571