Replies: 3 comments 5 replies
- Hi @lzh010817, thanks a lot for your proposal. I will check the details and get back to you soon.
- Hi @lzh010817, thanks a lot for your proposal. IMO, a distributed cache is quite useful. We currently only have the local cache, which introduces consistency problems when multiple Gravitino nodes are deployed as a federation. But the deployment-complexity concern you mentioned also exists. I would suggest investigating the different cache solutions further, and then we can discuss which one is the best fit. Currently, I can think of 3 options:
  Also looping in @unknowntpo, who has done some initial investigation; maybe we can discuss more here.
- @lzh010817 Thanks for your proposal, please allow me some time to elaborate on my thoughts.

I would like to initiate a discussion regarding the potential integration of Redis-based distributed caching to complement the existing Caffeine local cache. While Caffeine excels in providing low-latency, in-memory caching for single-node deployments, Gravitino may benefit from a distributed caching layer to address challenges in high-concurrency scenarios and multi-node environments.
Key Considerations for Redis Integration:
1. As Gravitino scales horizontally, node-specific local caches (e.g., Caffeine) may lead to data inconsistency during node restarts or parallel operations. Redis, as a distributed cache, could ensure consistent metadata access across all nodes, reducing redundant backend queries and improving throughput.
2. Redis offers features like persistence, replication, and automatic failover, which mitigate risks of cache loss during node failures. This aligns with Gravitino’s need for reliable metadata management in distributed setups.
3. While Caffeine provides nanosecond-level access latency, Redis can handle cross-node cache synchronization with minimal latency penalties using pipelining and cluster-mode operations. For frequently accessed metadata (e.g., catalog details), Redis could serve as a shared L2 cache, while Caffeine remains the L1 node-local cache.
4. Suggested integration approach (a rough sketch follows this list):
4.1 Introduce a cache abstraction layer to support pluggable cache providers (e.g., Caffeine for local, Redis for distributed).
4.2 Leverage Redis Cluster for high availability, with cache warm-up strategies to preload hot metadata on startup.
4.3 Use key-based expiration and invalidation policies to ensure data freshness across nodes.
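To make the pluggable abstraction in 4.1 and the L1/L2 hybrid in point 3 more concrete, here is a minimal sketch in Java. All names (`EntityCache`, `CaffeineEntityCache`, `RedisEntityCache`, `TwoLevelEntityCache`) are hypothetical and not existing Gravitino APIs; the Redis side assumes the Jedis client, entities serialized to strings, and a fixed TTL, purely for illustration.

```java
import java.time.Duration;
import java.util.Optional;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import redis.clients.jedis.JedisPooled;

// Hypothetical pluggable cache abstraction (4.1); not an existing Gravitino interface.
interface EntityCache {
  Optional<String> get(String key);

  void put(String key, String serializedEntity);

  void invalidate(String key);
}

// L1: node-local Caffeine cache with size and TTL bounds.
class CaffeineEntityCache implements EntityCache {
  private final Cache<String, String> cache =
      Caffeine.newBuilder().maximumSize(10_000).expireAfterWrite(Duration.ofMinutes(5)).build();

  public Optional<String> get(String key) {
    return Optional.ofNullable(cache.getIfPresent(key));
  }

  public void put(String key, String serializedEntity) {
    cache.put(key, serializedEntity);
  }

  public void invalidate(String key) {
    cache.invalidate(key);
  }
}

// L2: shared Redis cache with key-based expiration (4.3). Endpoint and TTL are assumptions.
class RedisEntityCache implements EntityCache {
  private static final long TTL_SECONDS = 300;
  private final JedisPooled jedis = new JedisPooled("localhost", 6379);

  public Optional<String> get(String key) {
    return Optional.ofNullable(jedis.get(key));
  }

  public void put(String key, String serializedEntity) {
    jedis.setex(key, TTL_SECONDS, serializedEntity);
  }

  public void invalidate(String key) {
    jedis.del(key);
  }
}

// Hybrid lookup: check the local L1 first, fall back to the shared L2, backfill L1 on a hit.
class TwoLevelEntityCache implements EntityCache {
  private final EntityCache l1 = new CaffeineEntityCache();
  private final EntityCache l2 = new RedisEntityCache();

  public Optional<String> get(String key) {
    Optional<String> local = l1.get(key);
    if (local.isPresent()) {
      return local;
    }
    Optional<String> remote = l2.get(key);
    remote.ifPresent(value -> l1.put(key, value)); // backfill the node-local cache
    return remote;
  }

  public void put(String key, String serializedEntity) {
    l2.put(key, serializedEntity); // write through to the shared cache first
    l1.put(key, serializedEntity);
  }

  public void invalidate(String key) {
    l2.invalidate(key);
    l1.invalidate(key);
  }
}
```

In this shape the backend store is only consulted when both levels miss, and the L1 TTL bounds how long a node can keep serving a value that another node has already refreshed in Redis.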
Open Questions for Community Feedback:
Are there specific use cases in Gravitino where distributed caching would provide the most value (e.g., multi-region deployments, frequent schema updates)?
How might we balance the trade-offs between added infrastructure complexity (Redis cluster management) and performance gains?
Would a hybrid cache architecture (Caffeine + Redis) be feasible, and what strategies could optimize cache coherence?
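On the coherence question, one strategy worth evaluating is pushing invalidations over Redis pub/sub: the node that changes a piece of metadata deletes the Redis entry and publishes the affected key, and every node evicts that key from its local Caffeine cache. A minimal sketch, assuming the Jedis client, string keys, and a hypothetical channel name:

```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import redis.clients.jedis.JedisPooled;
import redis.clients.jedis.JedisPubSub;

// Sketch of cross-node L1 invalidation over Redis pub/sub; channel name and endpoint are assumptions.
class CacheInvalidationBus {
  private static final String CHANNEL = "gravitino-cache-invalidation";

  private final JedisPooled jedis = new JedisPooled("localhost", 6379);
  private final Cache<String, String> localCache =
      Caffeine.newBuilder().maximumSize(10_000).build();

  // Called by the node that changed the metadata: drop the shared entry, then notify peers.
  void publishInvalidation(String key) {
    jedis.del(key);
    jedis.publish(CHANNEL, key);
  }

  // Each node runs a long-lived subscriber that evicts published keys from its local cache.
  void startSubscriber() {
    Thread subscriber = new Thread(() -> {
      jedis.subscribe(new JedisPubSub() {
        @Override
        public void onMessage(String channel, String invalidatedKey) {
          localCache.invalidate(invalidatedKey);
        }
      }, CHANNEL);
    }, "gravitino-cache-invalidation-subscriber");
    subscriber.setDaemon(true);
    subscriber.start();
  }
}
```

The trade-off is that pub/sub delivery is fire-and-forget, so a node that is momentarily disconnected can miss an invalidation; pairing this with an L1 TTL (as in the sketch above) keeps such staleness bounded.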
I believe exploring Redis integration could strengthen Gravitino’s performance in distributed environments while maintaining backward compatibility. Looking forward to your insights and collaboration!