|
| 1 | +--- |
| 2 | +layout: post |
| 3 | +title: "To escape the maze of optimizations" |
| 4 | +--- |
| 5 | + |
| 6 | +I used to work on map data visualization, and one of the painful tasks |
| 7 | +was loading GeoJSON files. The data was large enough to make browser |
| 8 | +interactions lag, and all of it was loaded up front regardless of zoom |
| 9 | +level. One of the lessons we learn as web developers is that we should use |
| 10 | +pagination. But at that time, I wondered what criteria I should paginate on. |
| 11 | +The map boundary is determined by a center point and a zoom |
| 12 | +level, and the center point's coordinates are floating-point numbers. |
| 13 | +They can't be used for chunked and cacheable endpoints. Then one day I learned |
| 14 | +about Uber's hexagon grid[0], which helped me create endpoints that split the data |
| 15 | +into chunks. The project schedule was short and it came to an end, but |
| 16 | +I have carried the feeling that the implementation wasn't optimal. |
| 17 | + |
| 18 | +The first thing to do is split the data into small chunks, this time with the default |
| 19 | +XYZ tile grid instead of Uber's hexagon grid. Since the endpoints now only |
| 20 | +use x/y/z, which are integers, the responses are easy to cache. |
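To illustrate why integer x/y/z keys cache well, here is a minimal TypeScript sketch that maps a longitude/latitude to the XYZ tile containing it, using the standard Web Mercator tiling formula. The `tileUrl` helper and the `/api/points/...` path are hypothetical names made up for this example, not the endpoints from the project.

```typescript
// Integer address of one tile in the XYZ (slippy map) scheme.
interface TileCoord {
  x: number;
  y: number;
  z: number;
}

// Convert a WGS84 longitude/latitude (degrees) to the tile containing it at zoom z.
function lonLatToTile(lon: number, lat: number, z: number): TileCoord {
  const n = 2 ** z; // number of tiles along each axis at this zoom
  const latRad = (lat * Math.PI) / 180;
  const xRaw = Math.floor(((lon + 180) / 360) * n);
  const yRaw = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
  );
  // Clamp so that lon = 180 and latitudes near the poles stay inside the grid.
  const clamp = (v: number): number => Math.min(n - 1, Math.max(0, v));
  return { x: clamp(xRaw), y: clamp(yRaw), z };
}

// Hypothetical endpoint layout (an assumption): one URL per tile.
function tileUrl(t: TileCoord): string {
  return `/api/points/${t.z}/${t.x}/${t.y}.json`;
}
```

Every viewport whose center falls in the same tile resolves to the same URL, so a browser or CDN cache can reuse the response, which a URL built from a floating-point center point could not offer.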
| 21 | + |
| 22 | +The points are assigned cluster ids so that they aren't all loaded in a high view |
| 23 | +(small zoom value), thanks to the `ST_ClusterDBScan` function. I only have three |
| 24 | +layers of clusters for now, and I don't think that's a good fit. For example, at zoom level |
| 25 | +15, which is still pretty far out, all the points are loaded. |
| 26 | + |
| 27 | +Event listeners for tile loading (when a tile starts to load) and tile unloading are plugged |
| 28 | +in. I want to use the unload events because when users zoom out, the number of |
| 29 | +points is reduced based on their cluster ids, which means points get |
| 30 | +removed. It seemed simple, so I started coding, and once the |
| 31 | +web page opened, the dev tools were flooded with API calls and map interactions |
| 32 | +stuttered. |
| 33 | + |
| 34 | +There are two main reasons: |
| 35 | + |
| 36 | + - Firstly, the network latency is high, about one and a half seconds. The |
| 37 | + points for a tile that hasn't finished loading get rendered later even though the |
| 38 | + tile itself has been unloaded. Or the unload event is blocked and, once it gets a |
| 39 | + chance to run, it wipes out points that should be there. |
| 40 | + - Secondly, it fetches the same set of points for every tile. |
| 41 | + That wastes a lot of resources on both the web page and the API server. |
| 42 | + |
| 43 | +To reduce the API calls, I set a rule: when users zoom in, the web page |
| 44 | +won't process the events if a zoom level in the same cluster range |
| 45 | +has already been loaded. It's not a significant reduction, but I feel good having |
| 46 | +it. |
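A minimal TypeScript sketch of that rule, as I read it. The three layer boundaries in `LAYER_MIN_ZOOMS` are assumptions for illustration (the post only says there are three layers and that zoom 15 loads every point), and both function names are hypothetical.

```typescript
// Assumed lowest zoom of each of the three cluster layers (illustrative values only).
const LAYER_MIN_ZOOMS: ReadonlyArray<number> = [0, 10, 15];

// Index of the cluster layer that covers a zoom level (fractional zooms allowed).
function clusterLayerForZoom(zoom: number): number {
  let layer = 0;
  LAYER_MIN_ZOOMS.forEach((minZoom, i) => {
    if (zoom >= minZoom) layer = i;
  });
  return layer;
}

// Skip tile events when zooming in within a cluster layer:
// the points for that layer are already on the map.
function shouldProcessTileEvents(prevZoom: number, nextZoom: number): boolean {
  const zoomingIn = nextZoom > prevZoom;
  const sameLayer =
    clusterLayerForZoom(prevZoom) === clusterLayerForZoom(nextZoom);
  return !(zoomingIn && sameLayer);
}
```

This sketch looks only at zoom levels and ignores panning, so it is a simplification rather than the full rule.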
| 47 | + |
| 48 | +The first issue is the reason for the title: the maze of optimizations. I did
| 49 | +a lot of trials, put listeners here and there, and nothing seemed to work. |
| 50 | +Dealing with user interactions is extremely hard because we don't know what users are |
| 51 | +going to do with the application, especially when it is a map. Not only does it listen to those |
| 52 | +two events, there are also zoom and move events, and the zoom event triggers |
| 53 | +the moveend event as well. |
| 54 | + |
| 55 | +I spent days trying. I thought to myself what a loser I am. The load-all |
| 56 | +approach works pretty well and there might be no issue at all because my |
| 57 | +website will never gain that much data. There are a bunch of temporary solutions living |
| 58 | +in production at my previous company without any optimization or |
| 59 | +refactoring. And the people are still proud, still adding features and saying it's |
| 60 | +normal. They laugh when you cause a bug because of optimizations. Maybe they're |
| 61 | +right. Maybe I should go along with it and have a happy life. |
| 62 | + |
| 63 | +Maybe, maybe... |
| 64 | + |
| 65 | +I went for a new method. I pipe all the tile load and unload events into a |
| 66 | +queue. Once moveend is triggered, all the events are processed, or filtered |
| 67 | +out if they are duplicates or cancel each other out. I don't care about any other interactions, |
| 68 | +just the data. I also offload the work to a web worker, reducing the effect on the UI. |
| 69 | +This time I'm relieved. The truth lies in the data. I don't have to guess or try one |
| 70 | +scenario and then forget another. |
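The queue idea can be sketched roughly as below. This is not the actual implementation: the `TileEvent` shape, the `"z/x/y"` key format, and the `settleTileEvents` name are assumptions for illustration. The last event queued for a tile decides its fate, and anything that would not change the currently loaded set is dropped.

```typescript
type TileEventKind = "load" | "unload";

interface TileEvent {
  kind: TileEventKind;
  key: string; // assumed tile id format, e.g. "15/26000/14000"
}

interface TileWork {
  toLoad: string[];
  toUnload: string[];
}

// Collapse the queued events into the net work needed per tile.
// `loaded` holds the keys of tiles whose points are currently on the map.
function settleTileEvents(
  queue: ReadonlyArray<TileEvent>,
  loaded: ReadonlySet<string>
): TileWork {
  // The last event seen for a tile wins; earlier ones are duplicates
  // or have been cancelled by a later opposite event.
  const finalKind = new Map<string, TileEventKind>();
  for (const ev of queue) {
    finalKind.set(ev.key, ev.kind);
  }

  const work: TileWork = { toLoad: [], toUnload: [] };
  finalKind.forEach((kind, key) => {
    if (kind === "load" && !loaded.has(key)) work.toLoad.push(key);
    if (kind === "unload" && loaded.has(key)) work.toUnload.push(key);
  });
  return work;
}
```

Because the function only takes data in and returns data out, it should be straightforward to call it from the moveend handler or move it into a web worker. The worker wiring itself is not shown here.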
| 71 | + |
| 72 | +I feel it is the answer. There are a few cases where bugs happen, so I still need to write |
| 73 | +tests to verify my specs. Feeling is relative, yes. I feel this function is not |
| 74 | +correct. I feel the class has multiple responsibilities. But I think with |
| 75 | +enough experience, observing a few criteria gives me a feeling about right or |
| 76 | +wrong, possible or impossible. From that I dig deeper to prove my hunch. |
| 77 | + |
| 78 | +And I feel happy today. |
| 79 | + |