Description
I had to re-index my `Episode` table after adding a new attribute and running into various other issues. The table contains about 26k records, each of reasonable size. Keep in mind that all of this data was already indexed once; I am attempting to re-index it.
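For context, the model is configured roughly like this (a minimal sketch following the README's DSL; the attribute names are hypothetical):

```ruby
class Episode < ApplicationRecord
  include MeiliSearch::Rails

  meilisearch do
    # Hypothetical attributes; the actual model indexes more fields,
    # including the newly added one that triggered the re-index.
    attribute :title, :description
  end
end
```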
Attempt 1 - Episode.reindex!
My first attempt was to simply follow the code in the README and call `Episode.reindex!`. After some time this raised the following error and left a lot of records missing:

```
413 Payload Too Large - The provided payload reached the size limit. The maximum accepted payload size is 20 MiB. See https://docs.meilisearch.com/errors#payload_too_large. (MeiliSearch::ApiError)
```
I tried running this multiple times but always ended up with this error. I don't understand why data size would be an issue here, since all of my records were previously indexed without hitting it, especially when looking at the next two attempts.
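For what it's worth, a batch of 1000 should only trip the 20 MiB limit if documents average more than roughly 20 KB each. A rough way to sanity-check this from a Rails console (a sketch that approximates each document with `as_json`; the actual payload depends on the attributes configured in the `meilisearch` block):

```ruby
# Estimate the serialized size of a batch by sampling records.
# NOTE: as_json is only a proxy for what meilisearch-rails actually sends.
sample = Episode.limit(100).to_a
avg_bytes = sample.sum { |e| e.as_json.to_json.bytesize } / sample.size.to_f
puts format('~%.1f MiB per batch of 1000', avg_bytes * 1000 / 1024.0 / 1024)
```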
Attempt 2 - Episode.reindex! with smaller batch size
For this I ran `Episode.reindex!(100)`, decreasing the batch size to 10% of the default 1000. This seems to work but takes forever and eventually times out; the timeout could be due to my SSH connection dropping at that point. However, in this case I don't get the `payload_too_large` error, which is strange since the same data is being sent.
To me this suggests that the issue might be a batch size that is too big? This seems like a bug.
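In case the timeout turns out to be on the client rather than my SSH session, the gem's README documents `timeout` and `max_retries` keys in the configuration; a sketch with placeholder values:

```ruby
# config/initializers/meilisearch.rb
MeiliSearch::Rails.configuration = {
  meilisearch_url: ENV.fetch('MEILISEARCH_URL'),
  meilisearch_api_key: ENV.fetch('MEILISEARCH_API_KEY'),
  timeout: 30,     # seconds; placeholder value
  max_retries: 2   # placeholder value
}
```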
Attempt 3 - Custom batch size and background job
The way I finally got it to work (mostly; I will open a separate issue for the remaining problem) was by batching records myself and moving the indexing into a background job:
```ruby
class ReindexMeilisearchJob < ApplicationJob
  queue_as :meilisearch_index

  def perform(model_name, start_id, end_id)
    model = model_name.constantize
    # Re-index only the records in the given id range.
    records = model.where(id: start_id..end_id)
    records.reindex!
  end
end
```
which I then enqueue like this:
```ruby
Episode.in_batches(of: 1000).each do |batch|
  ReindexMeilisearchJob.perform_later("Episode", batch.first.id, batch.last.id)
end
```
Notice that this still creates batches of 1000 records, yet it does not throw a `payload_too_large` error. It also runs a lot faster, since I am running 5 jobs in parallel.
I guess my questions are:

- Why am I getting a `payload_too_large` error?
- What is the proper way to re-index a model, small or big?
- Do I need to re-index the whole table when I add a new attribute, or is there a more efficient way?
Environment (please complete the following information):
- OS: [e.g. Debian GNU/Linux]
- Meilisearch server version: Cloud v1.11.3 (v1.12 in development due to some bug on the cloud dashboard)
- meilisearch-rails version: v0.14.1
- Rails version: v8.0