blog/2025/Higher-Level Design Patterns.md
Lines changed: 3 additions & 1 deletion
@@ -73,8 +73,9 @@ DSL are useful when it's high in abstraction level, and new requirements mostly
73
73
Mutation can be represented as data. Data can be interpreted as mutation.
74
74
75
75
- Instead of just doing in-place mutation, we can enqueue a command (or event) to do the mutation later. The command is then processed to do the actual mutation. (It's also moving computation between stages.)
76
-
- **Event sourcing**. Derive the latest state from events (log, mutations). Express the latest state as a view of old state + mutations. The idea is adopted by database WAL, data replication, Lambda architecture, etc.
77
76
- Layered filesystem (in Docker). Mutating or adding a file creates a new layer. The unchanged previous layers can be cached and reused.
77
+
- **Event sourcing**. Derive the latest state from events (log, mutations). Express the latest state as a view of old state + mutations. The idea is adopted by database WAL, data replication, Lambda architecture, etc.
78
+
- [Command Query Responsibility Segregation](https://en.wikipedia.org/wiki/Command_Query_Responsibility_Segregation). The system has two facades: the query facade doesn't allow mutation, and the command facade only accepts commands and doesn't return data.
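A minimal sketch of "mutation as data", assuming a toy account balance as the state. The `Event` type and the `apply` and `replay` functions are hypothetical names made up for illustration; they are not from the article.

```rust
/// A mutation represented as data (hypothetical event type for this sketch).
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Deposit(u64),
    Withdraw(u64),
}

/// Interpret one event as a mutation of the state.
pub fn apply(balance: u64, event: &Event) -> u64 {
    match event {
        Event::Deposit(amount) => balance + amount,
        // saturating_sub keeps the sketch total; a real system would
        // probably reject the command instead.
        Event::Withdraw(amount) => balance.saturating_sub(*amount),
    }
}

/// Latest state = old state (snapshot) + the events logged since then.
pub fn replay(snapshot: u64, log: &[Event]) -> u64 {
    log.iter().fold(snapshot, |state, event| apply(state, event))
}
```

Calling `replay(100, &[Event::Deposit(50), Event::Withdraw(30)])` derives the current balance from a snapshot plus the log, without any in-place mutation.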
78
79
79
80
The benefits:
80
81
@@ -271,6 +272,7 @@ About transitive rule: if X and Y both follow invariant, then result of "merging
271
272
- Dijkstra's algorithm. The visited nodes are the nodes whose shortest paths from the source node are known. Using the nodes whose shortest paths are known, it "expands" on the graph, learning new nodes' shortest paths from the source. The algorithm iteratively adds new nodes into the invariant, until it expands to the destination node.
272
273
- Dynamic programming. The problem is separated into sub-problems. There is no cyclic dependency between sub-problems. One problem's result can be quickly calculated from its sub-problems' results (e.g. max, min).
273
274
- Querying hash map can skip data because $\text{hash}(a) \neq \text{hash}(b)$ implies $a \neq b$. Querying ordered search tree can skip data because $(a < b) \land (b < c)$ implies $a < c$.
275
+
- Parallelization often utilizes associativity: $a * (b * c) = (a * b) * c$. For example, $a*(b*(c*d))=(a * b) * (c * d)$, where $a*b$ and $c*d$ don't depend on each other and can be computed in parallel. Examples: sum, product, max, min, max-by, min-by, list concat, set union, function composition, logical and, logical or. (An associative operation with an identity element forms a monoid.)
274
276
- ......
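The associativity item above can be sketched as a two-way parallel reduction. This is only an illustration under stated assumptions: `parallel_reduce` is a hypothetical helper, it splits the input into just two halves, and it relies on the caller passing an associative `op` with a matching `identity`.

```rust
use std::thread;

/// Reduce `items` with an associative `op`, folding the two halves on
/// separate threads. `identity` must satisfy op(identity, x) == x.
pub fn parallel_reduce<T, F>(items: &[T], identity: T, op: F) -> T
where
    T: Copy + Send + Sync,
    F: Fn(T, T) -> T + Sync,
{
    let (left, right) = items.split_at(items.len() / 2);
    // Sequential fold of one half, starting from the identity element.
    let fold = |part: &[T]| part.iter().copied().fold(identity, &op);
    thread::scope(|s| {
        let l = s.spawn(|| fold(left));
        let r = s.spawn(|| fold(right));
        // Associativity is what makes regrouping into (left) op (right) valid.
        op(l.join().unwrap(), r.join().unwrap())
    })
}
```

For example, `parallel_reduce(&[1u64, 2, 3, 4, 5], 0, |a, b| a + b)` computes the sum of the first two and the last three elements independently, then combines them.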
275
277
276
278
blog/2025/How to Avoid Fighting Rust Borrow Checker.md
Lines changed: 13 additions & 6 deletions
@@ -330,7 +330,7 @@ Data-oriented design:
330
330
331
331
- Try to pack data into contiguous arrays (instead of objects laid out sparsely, managed by the allocator).
332
332
- Use handle (e.g. array index) or ID to replace reference.
333
-
-**Decouple object ID with memory address**. An ID can be save to disk and sent via network, but a pointer cannot (the same address cannot be used in another process or after process restart, because there may be other data in the same address).
333
+
-**Decouple object ID with memory address**. An ID can be saved to disk and sent via network, but a pointer cannot (the same address cannot be used in another process or after process restart, because there may be other data in the same address).
334
334
- The different fields of the same object don't necessarily need to be together in memory. The same field of many objects can be put together (parallel array).
335
335
- Manage memory based on **arenas**.
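A small sketch of the handle-based approach from the list above, assuming a tree where each node refers to its parent. `NodeId`, `Node` and `Arena` are hypothetical names for illustration; the "arena" here is just a `Vec`, which is the simplest possible variant.

```rust
/// A handle is just an index into the arena, not a memory address.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NodeId(pub usize);

#[derive(Debug)]
pub struct Node {
    pub name: String,
    /// Refers to another node by ID instead of by reference.
    pub parent: Option<NodeId>,
}

/// All nodes live in one contiguous Vec.
#[derive(Debug, Default)]
pub struct Arena {
    nodes: Vec<Node>,
}

impl Arena {
    pub fn add(&mut self, name: &str, parent: Option<NodeId>) -> NodeId {
        self.nodes.push(Node { name: name.to_string(), parent });
        NodeId(self.nodes.len() - 1)
    }

    pub fn get(&self, id: NodeId) -> &Node {
        &self.nodes[id.0]
    }

    pub fn get_mut(&mut self, id: NodeId) -> &mut Node {
        &mut self.nodes[id.0]
    }

    /// Count the ancestors of a node by following parent IDs.
    pub fn depth(&self, id: NodeId) -> usize {
        let mut depth = 0;
        let mut current = self.get(id).parent;
        while let Some(parent) = current {
            depth += 1;
            current = self.get(parent).parent;
        }
        depth
    }
}
```

Because a `NodeId` is `Copy` and holds no borrow, a child can keep its parent's ID while the parent is mutated through `get_mut`, which is the situation that is awkward with plain references.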
336
336
@@ -595,10 +595,10 @@ fn main() {
595
595
596
596
It will panic with `RefCell already borrowed` error.
597
597
598
-
Rust assumes that, if you have a mutable borrow `&mut T`, you can use it at any time. **But holding the reference is different to using reference**. There are use cases that I have two mutable references to the same object, but I only use one at a time. This is the use case that `RefCell` solves.
599
-
600
598
`RefCell` still follows the mutable borrow exclusiveness rule. In the previous contagious borrow example, the `Parent` is borrowed once immutably and once mutably, thus `RefCell` will still panic at runtime.
601
599
600
+
Rust assumes that, if you have a mutable borrow `&mut T`, you can use it at any time. **But holding the reference is different to using reference**. There are use cases that I have two mutable references to the same object, but I only use one at a time. This is the use case that `RefCell` solves.
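The difference between holding and using can be sketched with two small functions. These are hypothetical helpers for illustration only; `try_borrow_mut` is used so that the conflict shows up as an `Err` instead of a panic.

```rust
use std::cell::RefCell;

/// Returns true when a second mutable borrow is rejected while the first
/// one is still alive (the exclusiveness rule, checked at runtime).
pub fn overlapping_borrow_is_rejected(cell: &RefCell<Vec<i32>>) -> bool {
    let _first = cell.borrow_mut();
    // borrow_mut() would panic here; try_borrow_mut() reports the conflict.
    cell.try_borrow_mut().is_err()
}

/// Two handles to the same RefCell, used one at a time: each mutable
/// borrow ends before the next one begins, so no runtime check fails.
pub fn push_through_two_handles(cell: &RefCell<Vec<i32>>) {
    let handle_a = cell;
    let handle_b = cell;
    handle_a.borrow_mut().push(1); // temporary RefMut dropped at end of statement
    handle_b.borrow_mut().push(2);
}
```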
601
+
602
602
Another problem: It's hard to return a reference borrowed from `RefCell`.
603
603
604
604
As the previous example can be fixed by `Cell`, without `RefCell`, here is another contagious borrow example:
@@ -682,7 +682,7 @@ help: consider borrowing here
682
682
| +
683
683
```
684
684
685
-
Because the reference borrowed from `RefCell` is not normal reference, it's actually `Ref`. `Ref` implements `Deref` so it can be used similar to a normal borrow. But it's different to a normal borrow.
685
+
The borrow obtained from `RefCell` is not a normal borrow; it's actually a `Ref`. `Ref` implements `Deref`, so it can be used similarly to a normal borrow.
686
686
687
687
The "help: consider borrowing here" suggestion won't solve the compiler error. Don't blindly follow compiler's suggestions.
688
688
@@ -848,7 +848,12 @@ No matter what the definition of "GC" is, reference counting is different from t
848
848
| Propagates "death". (freeing one object may cause its children to be freed) | Propagates "live". (a living object cause its children to live, except for weak reference) |
849
849
| Cloning and dropping a reference involves atomic operation (except single-threaded `Rc`) | Reading/writing an on-heap reference may involve read/write barrier |
850
850
| Cannot automatically handle cycles. Need to use weak reference to cut cycle | Can handle cycles automatically |
851
-
| Cost is roughly O(how much time reference count change) | Cost is roughly O(count of living objects) |
851
+
| Cost is roughly O(how many times the reference count changes) [^reference_counting_cost] | Cost is roughly O(count of living objects) [^generational_gc] |
852
+
853
+
[^reference_counting_cost]: Contended reference counting is slower than non-contended.
854
+
855
+
[^generational_gc]: In generational GC, a minor GC only scans the young generation, so its cost is roughly the count of living young generation objects. But it still needs to occasionally do a full GC.
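The reference counting column of the table can be illustrated with `Rc` and `Weak`. This is a sketch with hypothetical `Parent` and `Child` types: the child points back at the parent through a weak reference to cut the cycle, and dropping the parent propagates "death" to the child.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

pub struct Parent {
    pub children: RefCell<Vec<Rc<Child>>>,
}

pub struct Child {
    /// Weak back-reference: does not keep the parent alive, so no cycle.
    pub parent: Weak<Parent>,
}

/// Builds a parent with one child. Returns the parent and a weak observer
/// of the child, so the caller can check when the child is freed.
pub fn build() -> (Rc<Parent>, Weak<Child>) {
    let parent = Rc::new(Parent { children: RefCell::new(Vec::new()) });
    let child = Rc::new(Child { parent: Rc::downgrade(&parent) });
    let observer = Rc::downgrade(&child);
    parent.children.borrow_mut().push(child);
    (parent, observer)
}
```

After `drop(parent)`, upgrading the observer returns `None`: freeing the parent freed its only child. If `Child::parent` were an `Rc<Parent>` instead, neither count would reach zero and both objects would leak.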
856
+
852
857
853
858
## Bump allocator
854
859
@@ -1117,7 +1122,9 @@ Rust's constraints also helps catch bugs other than memory safety and thread saf
1117
1122
- The receiver of [single-receiver channel](https://doc.rust-lang.org/std/sync/mpsc/fn.channel.html) cannot be copied, ensuring uniqueness of receiver.
1118
1123
- ......
1119
1124
1120
-
As previously mentioned, Rust creates obstacles for things like sharing mutable data and circular reference. Also Rust is harder to learn and compiles slower. Apart from that, there are other things that can sometimes be obstacles:
1125
+
As previously mentioned, Rust creates obstacles for things like sharing mutable data and circular reference. Also Rust is harder to learn and compiles slower.
1126
+
1127
+
Apart from borrow checker, there are other things that can sometimes be obstacles:
1121
1128
1122
1129
- Unit testing can be an obstacle. When the business logic changes, the unit tests also need to be adjusted, which can feel annoying, because this work wouldn't be needed if there were no unit tests.
1123
1130
- Type system can be obstacle, especially in unexpressive type systems.
blog/2025/Traps to Developers.md
Lines changed: 2 additions & 1 deletion
@@ -169,7 +169,7 @@ This article spans a wide range of knowledge. If you find a mistake or have a su
169
169
- `std::remove` doesn't remove but just rearranges elements. `erase` actually removes.
170
170
- Literal number starting with 0 will be treated as octal number. (`0123` is 83)
171
171
- Undefined behaviors. Compiler optimizations aim to keep defined behavior the same, but can freely change undefined behavior. Relying on undefined behavior can make a program break under optimization. [See also](https://russellw.github.io/undefined-behavior)
172
-
- Accessing uninitialized memory is undefined behavior. Converting a `char*` to struct pointer can be seem as accessing uninitialized memory, because the object lifetime hasn't started. It's recommended to put the struct elsewhere and use `memcpy` to initialize it.
172
+
- Accessing uninitialized memory is undefined behavior. Converting a `char*` to struct pointer can be seen as accessing uninitialized memory, because the object lifetime hasn't started. It's recommended to put the struct elsewhere and use `memcpy` to initialize it.
173
173
- Accessing invalid memory (e.g. null pointer) is undefined behavior.
174
174
- Signed integer overflow/underflow is undefined behavior. Unsigned integer arithmetic is defined to wrap around, so subtracting below 0 silently yields a large value.
175
175
- Aliasing.
@@ -243,6 +243,7 @@ This article spans a wide range of knowledge. If you find a mistake or have a su
243
243
- Reentrant lock:
244
244
- Reentrant means one thread can lock twice (and unlock twice) without deadlocking. Java's `synchronized` and `ReentrantLock` are reentrant.
245
245
- Non-reentrant means if one thread locks twice, it will deadlock. Rust `Mutex` and Golang `sync.Mutex` are not reentrant.
246
+
-[False sharing](https://en.wikipedia.org/wiki/False_sharing) of the same cache line costs performance.
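One common way to avoid false sharing is to pad per-thread data so that each piece sits on its own cache line. The sketch below assumes a 64-byte cache line, which is typical but not universal; `PaddedCounter` and `count_in_parallel` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Aligned (and therefore padded) to 64 bytes, so two counters in an array
/// never share a cache line. 64 is an assumed cache line size.
#[repr(align(64))]
pub struct PaddedCounter(pub AtomicU64);

/// Each thread increments only its own counter. Without the padding, the
/// two counters would likely share a cache line and slow each other down.
pub fn count_in_parallel(iterations: u64) -> (u64, u64) {
    let counters = [
        PaddedCounter(AtomicU64::new(0)),
        PaddedCounter(AtomicU64::new(0)),
    ];
    thread::scope(|s| {
        for counter in &counters {
            s.spawn(move || {
                for _ in 0..iterations {
                    counter.0.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    (
        counters[0].0.load(Ordering::Relaxed),
        counters[1].0.load(Ordering::Relaxed),
    )
}
```

The result is the same with or without the alignment attribute; only the throughput differs, and how much depends on the hardware.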
blog/2025/WebAsembly Limitations.md
Lines changed: 1 addition & 1 deletion
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -122,7 +122,7 @@ The first solution, manually implementing GC encounters difficulties:
122
122
- Multi-threaded GC often uses a store barrier or load barrier to ensure scanning correctness. It also increases binary size and costs runtime performance.
123
123
- Cannot collect a cycle where a JS object and an in-Wasm object reference each other.
124
124
125
-
[^safepoint_mechanism]: Safepoint mechanism allows a thread to pause at specific points. It can force the paused thread to expose all local variables on stack. When a thread is running, a local variable may be in register that cannot be scanned by another thread. And scanning a running thread's stack is not reliable due to memory order issues and race conditions. One way to implement safepoint is to have a global safepoint flag. The code frequently reads the safepoint flag and pause if flag is true. There exists optimizations such as using OS page fault handler.
125
+
[^safepoint_mechanism]: Safepoint mechanism allows a thread to cooperatively pause at specific points. Scanning a running thread's stack is not reliable, due to memory order issues and race conditions, and some pointers may be in registers rather than on the stack. If a thread is cooperatively paused, its stack can be reliably scanned. One way to implement safepoints is to have a global safepoint flag. The code frequently reads the safepoint flag and pauses if the flag is true. There exist optimizations such as using an OS page fault signal handler.
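The flag-based safepoint described in the footnote can be sketched as follows. This is a simplified illustration, not how any particular runtime does it: there is one mutator thread, the `Safepoint` struct and all function names are hypothetical, and a counter stands in for the stack that a real collector would scan.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

#[derive(Default)]
pub struct Safepoint {
    pub requested: AtomicBool, // set by the collector to ask the mutator to pause
    pub parked: AtomicBool,    // set by the mutator once it has actually paused
    pub stop: AtomicBool,      // ends the demo mutator
    pub work_done: AtomicU64,  // stands in for state the collector wants to scan
}

/// Called by the mutator at safe points (here: once per loop iteration).
fn poll(sp: &Safepoint) {
    if sp.requested.load(Ordering::Acquire) {
        sp.parked.store(true, Ordering::Release);
        while sp.requested.load(Ordering::Acquire) {
            thread::yield_now();
        }
        sp.parked.store(false, Ordering::Release);
    }
}

pub fn run_mutator(sp: Arc<Safepoint>) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        while !sp.stop.load(Ordering::Acquire) {
            sp.work_done.fetch_add(1, Ordering::Relaxed);
            poll(&sp);
        }
    })
}

/// Collector side: request a pause, wait until the mutator is parked,
/// run `scan`, then resume and wait until the mutator has left the safepoint.
/// Must only be called while the mutator thread is still running.
pub fn with_mutator_paused<R>(sp: &Safepoint, scan: impl FnOnce() -> R) -> R {
    sp.requested.store(true, Ordering::Release);
    while !sp.parked.load(Ordering::Acquire) {
        thread::yield_now();
    }
    let result = scan();
    sp.requested.store(false, Ordering::Release);
    while sp.parked.load(Ordering::Acquire) {
        thread::yield_now();
    }
    result
}
```

While `scan` runs, the mutator is spinning inside `poll`, so `work_done` cannot change. A real implementation would block instead of spinning and would track every mutator thread.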
126
126
127
127
What about using Wasm's built-in GC functionality? It requires mapping the data structure to Wasm GC data structure. Wasm's GC data structure allows Java-like class (with object header), Java-like prefix subtyping, and Java-like arrays.