---
layout: post
title: "To escape the maze of optimizations"
---

In the past I worked with map data visualization, and one of the painful tasks
was loading GeoJSON files. The amount of data was large enough to make browser
interactions lag, and all of it was loaded from the beginning regardless of the
zoom level. One of the many lessons we learn as web developers is to use
pagination, but at that time I wondered what criteria I should paginate by.
You know, the map boundary is determined by a center point and a zoom level,
and the center point has its coordinates as floating-point numbers. They can't
be used for chunked, cacheable endpoints. Then one day I learned about Uber's
hexagon grid[0], which helped me create endpoints that split the data into
chunks. Nevertheless, the project schedule was short and came to an end, but I
have carried the feeling that the implementation wasn't optimal.

The first thing to do is split the data into small chunks, but with the
standard XYZ tile grid instead of Uber's hexagon grid. Since the endpoints now
only take x/y/z, which are integers, the responses are easy to cache. A rough
sketch of the tile math is below.
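
To illustrate, a minimal sketch of the standard XYZ (slippy map) tile math and
the kind of integer-keyed URL it enables, with a placeholder endpoint path:

```ts
// Which XYZ tile contains a given point at a given zoom level (Web Mercator math).
function lonLatToTile(lon: number, lat: number, zoom: number): { x: number; y: number } {
  const n = 2 ** zoom;
  const latRad = (lat * Math.PI) / 180;
  const x = Math.floor(((lon + 180) / 360) * n);
  const y = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n,
  );
  return { x, y };
}

// Integer x/y/z give a stable, cacheable URL (hypothetical endpoint shape).
const zoom = 12;
const { x, y } = lonLatToTile(105.85, 21.03, zoom);
const url = `/api/points/${zoom}/${x}/${y}`; // easy to cache on a CDN or with Cache-Control
```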
The points are assigned cluster ids so that not all of them are loaded at a
high view (small zoom value), thanks to the `ST_ClusterDBSCAN` function. I only
have three layers of clusters for now and I don't think that's a good fit.
Like, at zoom level 15, which is still pretty far, all the points are already
loaded. A sketch of the clustering step is below.
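
For context, `ST_ClusterDBSCAN` is a PostGIS window function that assigns a
cluster id to each row. A minimal sketch of precomputing one cluster layer,
with made-up table and column names:

```ts
// Precompute a cluster id per point with PostGIS (assumed table/column names).
// ST_ClusterDBSCAN groups points within `eps` distance of each other, requiring
// at least `minpoints` neighbours; points outside any cluster get NULL.
const clusterSql = `
  UPDATE points p
  SET cluster_id = c.cid
  FROM (
    SELECT id,
           ST_ClusterDBSCAN(geom, eps := 0.01, minpoints := 5) OVER () AS cid
    FROM points
  ) AS c
  WHERE p.id = c.id;
`;
// Re-running it with a larger eps into another column would build the coarser
// cluster layers used at smaller zoom levels.
```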
The event listeners for tile load (when a tile starts to load) and tile unload
are plugged in. I want to use the unload events because when users zoom out,
the number of points is reduced based on their cluster ids, which means some
points get removed. As simple as it seemed, I started coding, and once the web
page opened, the dev tools were flooded with API calls and the map interactions
stuttered. The naive wiring looked roughly like the sketch below.
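
Assuming a Leaflet-style grid layer (which fires `tileloadstart` and
`tileunload` events carrying the tile's x/y/z coordinates), the naive version
could look roughly like this:

```ts
import L from "leaflet";

// Hypothetical helpers standing in for the real rendering code.
declare function addPointsToMap(points: unknown[]): void;
declare function removePointsForTile(x: number, y: number, z: number): void;

// Naive wiring: fetch a tile's points as soon as it starts loading,
// and drop them as soon as the tile is unloaded.
const layer = L.gridLayer();

layer.on("tileloadstart", (e: L.TileEvent) => {
  const { x, y, z } = e.coords;
  fetch(`/api/points/${z}/${x}/${y}`)
    .then((res) => res.json())
    .then((points: unknown[]) => addPointsToMap(points));
});

layer.on("tileunload", (e: L.TileEvent) => {
  const { x, y, z } = e.coords;
  removePointsForTile(x, y, z);
});
```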
There are two main reasons:

- First, the network latency is high, about one and a half seconds. The points
  for a tile that is still loading arrive late and get rendered even though the
  tile itself has already been unloaded. Or the unload event is blocked until
  it finally gets a chance to run, and then it wipes out points that should
  still be there.
- Second, it fetches the same set of points for every tile, which wastes a lot
  of resources both on the web page and on the API server.

To reduce the API calls, I set a rule: when users zoom in, the web page won't
process the events if a zoom level in the same cluster range has already been
loaded. It's not a significant reduction, but I feel good having it. A sketch
of the rule is below.
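
A minimal sketch of that rule, with guessed zoom break points for the three
cluster layers:

```ts
// Hypothetical zoom ranges for the three cluster layers.
const CLUSTER_RANGES: Array<[number, number]> = [
  [0, 7],   // coarse clusters
  [8, 12],  // medium clusters
  [13, 22], // individual points
];

const loadedRanges = new Set<number>(); // indexes of ranges already fetched
let prevZoom = 0;

function rangeIndex(zoom: number): number {
  return CLUSTER_RANGES.findIndex(([min, max]) => zoom >= min && zoom <= max);
}

// Skip tile events when zooming in within a cluster range that is already loaded.
function shouldProcess(zoom: number): boolean {
  const zoomingIn = zoom > prevZoom;
  prevZoom = zoom;
  const idx = rangeIndex(zoom);
  if (zoomingIn && loadedRanges.has(idx)) return false;
  loadedRanges.add(idx);
  return true;
}
```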
The first issue is the reason I put the title - the maze of optimizations. I
did a lot of trials, putting listeners here and there, and nothing seemed to
work. Dealing with user interactions is extremely hard because we don't know
what users are going to do with the application, especially when it's a map.
Not only does it listen to those two events, there are also zoom and move
events, and the zoom event triggers a moveend event as well.

I spent days trying. I thought to myself what a loser I am. The load-all
approach works pretty well, and there might be no issue at all because my
website will never gain that much data. There are plenty of temporary solutions
living in production at my previous company without any optimization or
refactoring. And the people there are still proud, still adding features, and
say it's normal. They laugh when you cause a bug because of an optimization.
Maybe they're right. Maybe I should just go along and have a happy life.

Maybe, maybe...

I went for a new method. I pipe all the tile load and unload events into a
queue. Once moveend fires, the queued events are processed: duplicates are
filtered out and events that cancel each other are discarded. I don't care
about any other interactions, just the data. I also offload the processing into
a web worker to reduce the effect on the UI. This time I'm relieved. The truth
lies in the data. I don't have to guess or handle one scenario and then forget
another. A rough sketch of the queue is below.
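
A rough sketch of that queue, again assuming Leaflet-style events and
simplifying the rule to "a load and an unload of the same tile cancel out":

```ts
import L from "leaflet";

type TileKey = string; // "z/x/y"
type TileEventKind = "load" | "unload";

// Hypothetical worker that fetches and removes tile data off the main thread.
const worker = new Worker(new URL("./tiles.worker.ts", import.meta.url));

const queue: Array<{ kind: TileEventKind; key: TileKey }> = [];

const map = L.map("map");
const layer = L.gridLayer().addTo(map);

layer.on("tileloadstart", (e: L.TileEvent) => {
  queue.push({ kind: "load", key: `${e.coords.z}/${e.coords.x}/${e.coords.y}` });
});
layer.on("tileunload", (e: L.TileEvent) => {
  queue.push({ kind: "unload", key: `${e.coords.z}/${e.coords.x}/${e.coords.y}` });
});

// Only touch the data once the interaction settles.
map.on("moveend", () => {
  const pending = new Map<TileKey, TileEventKind>();
  for (const event of queue) {
    const previous = pending.get(event.key);
    if (previous && previous !== event.kind) {
      pending.delete(event.key); // a load followed by an unload (or vice versa) cancels out
    } else {
      pending.set(event.key, event.kind); // duplicates collapse into one entry
    }
  }
  queue.length = 0;
  worker.postMessage([...pending.entries()]); // the fetch/remove work happens in the worker
});
```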
I feel it is the answer. There are still a few cases where bugs happen, so I
need to write tests to pin down my specs. Feeling is relative, yes. I feel this
function is not correct. I feel this class has multiple responsibilities. But I
think with enough experience, observing a few criteria gives me a feeling about
right or wrong, possible or impossible. From there I dig deeper to prove my
objection.

And I feel happy today.
