-
|
Thanks!
You're spot on. I will happily extend your list, so we can address the issues one by one. For context, I didn't want to publish any crates. I was and am worried that:
I'm very interested in making things better. I did go back on my initial stance because in this early state, TB is mostly interesting to enthusiasts, and enthusiasts are the most valuable users in terms of feedback and shaping the product. There's a bit of tension, since enthusiasts are more inclined toward framework use, while longer-term I would expect more users to rely on the JS/TS runtime, since it's a lot more approachable, can support a rich ecosystem, is still very fast and doesn't suffer from atrocious compile times. Sorry for the long blurb, I just felt like I owed you the context. Now to your specific points:
Yes. Ideally there would be a stricter separation between public, stable and internal APIs rather than everything bunched together on AppState with a generic AppState::get_user_state().
That's another reason why I think stricter splitting would be beneficial. Yes, TrailBase has a bunch of dependencies and I don't love the compile times either. I haven't done enough profiling to say exactly what contributes how much, but you seem to think it's the C-library dependencies. How did you conclude that? FWIW, I do think you're probably right, i.e. the linking stage is probably the choke point, and given that V8 makes up 70+% of the binary size it will probably contribute significantly. I've given the dependencies several sweeps trying to eliminate as many as possible, but there isn't much low-hanging fruit left, especially since the transitive dependencies of a few crates make up the major chunk. FWIW, even w/o restructuring, I do see nice improvements by using cranelift and mold.
I've changed the execution model in the past; this is what I settled on after extensively benchmarking various options, from pools to locking, ... (https://github.com/ignatz/libsql_bench). A pool may help for very, very read-heavy workloads but is quickly overshadowed by write-lock contention. Of all the things I tried, the current implementation performed the most consistently. The existing TrailBase benchmarks do test high-concurrency write and high-concurrency read workloads. When you say "severe concurrency issues", I'm guessing that maybe you're thinking of the pathological case: mostly reads with some very expensive queries. FWIW, the current setup and async APIs allow us to evolve the execution model as we identify better compromises. I've certainly toyed with a more involved RW-lock approach with concurrent reads and serialized writes. |
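To make the "concurrent reads, serialized writes" idea concrete, here is a minimal std-only sketch. It is not TrailBase's implementation: `Store`, `increment` and `run_workload` are hypothetical names, and a `HashMap` stands in for the database. Note that `std::sync::RwLock` also blocks readers while a write is in progress, which is stricter than SQLite in WAL mode, where readers can proceed alongside the single writer.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for a database: a key/value map behind a RW-lock.
#[derive(Clone, Default)]
pub struct Store {
    inner: Arc<RwLock<HashMap<String, i64>>>,
}

impl Store {
    /// Reads take the shared lock, so many of them can run at once.
    pub fn read(&self, key: &str) -> Option<i64> {
        self.inner.read().unwrap().get(key).copied()
    }

    /// Writes take the exclusive lock, so they are serialized.
    pub fn increment(&self, key: &str) -> i64 {
        let mut guard = self.inner.write().unwrap();
        let value = guard.entry(key.to_string()).or_insert(0);
        *value += 1;
        *value
    }
}

/// Spawns `writers` threads that each increment a counter `per_thread` times,
/// plus `readers` threads that read it concurrently. Returns the final value.
pub fn run_workload(writers: usize, readers: usize, per_thread: usize) -> i64 {
    let store = Store::default();
    let mut handles = Vec::new();
    for _ in 0..writers {
        let s = store.clone();
        handles.push(thread::spawn(move || {
            for _ in 0..per_thread {
                s.increment("counter");
            }
        }));
    }
    for _ in 0..readers {
        let s = store.clone();
        handles.push(thread::spawn(move || {
            for _ in 0..per_thread {
                // Readers may observe any intermediate value of the counter.
                let _ = s.read("counter");
            }
        }));
    }
    for handle in handles {
        handle.join().unwrap();
    }
    store.read("counter").unwrap_or(0)
}
```

Because every increment happens under the exclusive lock, no update is lost regardless of how many readers run alongside the writers.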
-
|
Thanks for such a receptive, detailed answer. You don't have to be sorry for anything. I can see your focus is to deliver an alternative to PocketBase with a nice end-user experience, while mine is more of an "easier server boilerplate", with JS and permissions as a bonus. My comments:
About the Axum AppState issue
About the single conn vs pool issue
About the compile times/linking
Btw, a couple more suggestions.
|
-
|
Quick update, I updated the synthetic benchmarks to also incorporate r2d2 as well as a TB flavor with multiple connections: https://github.com/ignatz/libsql_bench/blob/pool/vendor/trailbase-sqlite/src/connection.rs#L59 . I'm certainly keen to increase read concurrency. Should probably improve the benchmark first to be a bit less synthetic. |
-
|
Also did a quick PoC for the TB server with multiple connections: https://github.com/trailbaseio/trailbase/tree/multiconn |
-
|
Regarding build times, I just moved the JS assets into a separate crate to reduce false-positive rebuilds, making the build in many cases go from: to: At this point, the build is very much dominated by link times. Building w/o V8 saves roughly a third. But the best course of action for now is probably to use mold. |
-
|
Hi there. Sorry for my absence. Crazy week here. Let me try to comment on some of your points.
```rust
use rusqlite::{functions::FunctionFlags, types::Null, Connection};
use std::thread::sleep;
use std::time::Duration;

fn add_sleep_function(conn: &Connection) -> rusqlite::Result<()> {
    conn.create_scalar_function(
        "sleep", // SQL function name
        1,       // Number of arguments
        FunctionFlags::SQLITE_UTF8 | FunctionFlags::SQLITE_DETERMINISTIC,
        |ctx| {
            let ms = ctx.get::<i64>(0)?; // Get 1st arg as milliseconds
            sleep(Duration::from_millis(ms.max(0) as u64)); // Treat negatives as 0
            Ok(Null) // Return NULL in SQL
        },
    )
}
```
then call it:
```rust
let conn = Connection::open(":memory:")?;
add_sleep_function(&conn)?;
// `execute` rejects statements that return rows, so use `query_row` for a SELECT.
conn.query_row("SELECT sleep(1000);", [], |_row| Ok(()))?;
```
```sql
WITH RECURSIVE delay(n) AS (
  SELECT 1
  UNION ALL
  SELECT n + 1 FROM delay WHERE n < 10000000 -- Adjust n if needed
)
SELECT count(*) FROM delay; -- Consume the CTE so the recursion actually runs
```
or 3) a join between two huge unindexed tables. I know this is not the correct way to do it, but someone will do it and then blame TB when some parallel query does not return.

Then run any of these options in a bench, in parallel, from a thread pool against a TL/single connection, and you'll see what I mean about the concurrency issues. I'm sure you already get the point: if other connections are available in a pool, the concurrency issues are mitigated.

Some notes about benchmarks in general

I think if you're into benchmarks and this is a big selling point of the product, they should be more comprehensive and include parallel reads and writes, joins, slow queries, tables with more columns, etc., to simulate a more realistic workload.

About criterion: I like it. It's a great tool for benchmarks and covers the math/metrics very well; it does warm-ups, flags outliers, compares with the last run, etc. It's quite a bit more accurate than a simple Instant-based test. I have some SQLite benchmarks I made using criterion, and I might post them as a template later if you want, just as an example, but I'm pretty sure you can come up with something better. Btw, criterion has group/throughput modes which could be very useful for this.

Another nice selling point from a marketing point of view could be comparing performance against PostgreSQL on a per-machine basis. SQLite wins by a large margin in this scenario, yet 90% of developers have no idea. There are embedded PostgreSQL distributions for Rust that you could use in benches, again, if you are into benchmarking as a selling point and comparing TB with other solutions.

I saw the new r2d2 benchmark; performance looks similar to TL/TL3 (just a bit slower). There are other pools as well, but I don't see the pool as a bottleneck compared to the I/O itself.

Optimizing binary size is optional, as I said. I'm much more into building my own binary, so the official distro size does not concern me personally at all. But 1/3 of the size looks good for deployment, especially on several machines.

About the compile times: I couldn't test yet. This version has a lot of dependencies that weren't there last time: prettier, prettier-astro, dart, etc. I guess there is a new example which uses them and the deps aren't handled automatically yet. I'll fix them and try later. |
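The blocking scenario described above can be simulated without a database at all. The sketch below is an assumption-laden toy, not a benchmark of TB: each worker thread stands in for one connection, `thread::sleep` stands in for query execution, and `completion_order` is a made-up helper that submits one slow and one fast "query" and reports which finishes first.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::Duration;

/// A simulated query: a label plus how long it "runs".
type Job = (&'static str, Duration);

/// Runs one slow and one fast simulated query on `workers` threads (each
/// standing in for one connection) and returns the order in which they finish.
pub fn completion_order(workers: usize) -> Vec<&'static str> {
    let (tx, rx) = mpsc::channel::<Job>();
    let rx = Arc::new(Mutex::new(rx));
    let done = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            let done = Arc::clone(&done);
            thread::spawn(move || loop {
                // The lock guard is dropped at the end of this statement, so
                // other workers can fetch jobs while this one is busy.
                let job = rx.lock().unwrap().recv();
                match job {
                    Ok((label, cost)) => {
                        thread::sleep(cost); // Stand-in for query execution
                        done.lock().unwrap().push(label);
                    }
                    Err(_) => break, // Channel closed: no more work
                }
            })
        })
        .collect();

    // The slow query is queued first, the fast one right behind it.
    tx.send(("slow", Duration::from_millis(300))).unwrap();
    tx.send(("fast", Duration::ZERO)).unwrap();
    drop(tx); // Close the queue so workers exit once it is drained

    for handle in handles {
        handle.join().unwrap();
    }
    let order = done.lock().unwrap().clone();
    order
}
```

With a single worker the fast query has to wait behind the slow one; with two workers it is picked up by the idle one and should finish first. The same effect is what a second connection buys when one query runs long.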
-
|
I should have explained the PostgreSQL thing better. I mean app (TB/SQLite) vs. app (PostgreSQL) in terms of performance per cost: the macro benchmark, not the micro benchmark.

Many developers believe that for their application to be fast, they need PostgreSQL, which btw is indeed an incredible database. But I don't know if you ever tried Supabase: it's really, really slow. I mean, it was like 200-400 reqs/sec, because there is the network overhead. Compare that with the ~10-20k reqs/sec you can get from SQLite with simple queries using TB, and there is your selling point: on a 5-dollar VPS you get 25-100x the speed of PostgreSQL at 1/5 of the cost (Supabase is US$ 25/month), without having to manage servers, with no external dependencies and no database management, just your TB binary. We're talking about a huge cut in server expenses, no dependency on a third party (that's a dealbreaker for me), and easier management. Of course, if your application is huge, or you need consistency, stored procedures or other specific features, PostgreSQL might be needed, but that's another story. Managed PostgreSQL performs better than Supabase, but nowhere near the numbers a simple SQLite server can deliver.

But as I was saying, if your app can manage, say, 10k reqs/sec, 99% of applications don't need anything else, right? 10240 * 60 * 60 * 24 ≈ 884 million reqs/day. On a budget 5-dollar server. How many apps do you know that need more than this? And one can easily scale this up vertically on a better server, or add sharding to SQLite (see turso, libsql, litefs). Mileage can vary depending on the application and server, for sure. But I dare someone to beat the cost/performance of SQLite for this kind of server. I have built a server infrastructure similar to TB, using axum and SQLite, and I get 100k reads per second, 30 writes/sec (synthetic) on a 13-year-old CPU. I mean, the speed is ridiculous.

One difference between my app's server and TB is that I've used cached session IDs instead of JWTs, to avoid the overhead of decoding the JWT on every request. There is a small gain in this; you can benchmark it if you want. Since SQLite is usually served from a single server, I see no need for JWTs. Besides, JWTs are kinda hard to revoke (you either have to wait for them to expire or blacklist them). All of these issues are gone using a HashMap(session_id, session).

Back on topic: PocketBase and TB are Supabase/Firebase killers. People haven't realized that yet. So here is your selling point: do some good benchmarking, show that the application is resilient/trustworthy, add a cost-performance analysis against the competition, and you'll see TB usage skyrocket. And if you really want to raise a flag, benchmark TB against a guest Supabase account and do the performance vs. cost analysis. You'll get something like, "TB is 100-500x cheaper than supabase hosting on the same hardware, without the lock in.". |
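For anyone who wants to plug in their own numbers, the back-of-the-envelope arithmetic above is just a sustained rate multiplied by the seconds in a day. A tiny sketch (the function name is made up for illustration, and it assumes the rate holds around the clock, which real traffic never does):

```rust
/// Requests served in one day at a sustained rate of `reqs_per_sec`.
/// Assumes the rate holds 24 hours a day, which is an idealization.
pub const fn requests_per_day(reqs_per_sec: u64) -> u64 {
    reqs_per_sec * 60 * 60 * 24
}
```

At 10,240 reqs/sec this gives 884,736,000, i.e. the roughly 884 million per day quoted above.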
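A rough sketch of the cached-session idea, for illustration only: `Session`, `SessionStore` and their fields are hypothetical, not the commenter's code and not a TB API. It shows why revocation is trivial compared to JWTs, since removing the map entry ends the session immediately.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical session record; the fields are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct Session {
    pub user_id: u64,
    pub expires_at: Instant,
}

/// In-memory session cache keyed by an opaque session id.
#[derive(Default)]
pub struct SessionStore {
    sessions: HashMap<String, Session>,
}

impl SessionStore {
    /// Registers a session that stays valid for `ttl`.
    pub fn insert(&mut self, id: &str, user_id: u64, ttl: Duration) {
        let session = Session { user_id, expires_at: Instant::now() + ttl };
        self.sessions.insert(id.to_string(), session);
    }

    /// Returns the user id if the session exists and has not expired.
    pub fn lookup(&self, id: &str) -> Option<u64> {
        self.sessions
            .get(id)
            .filter(|s| s.expires_at > Instant::now())
            .map(|s| s.user_id)
    }

    /// Revocation is a plain removal; returns whether a session was removed.
    pub fn revoke(&mut self, id: &str) -> bool {
        self.sessions.remove(id).is_some()
    }
}
```

In a multi-threaded server the store would need to sit behind something like `Arc<RwLock<SessionStore>>`, and the trade-off is that sessions live in one process's memory, which fits the single-server setup discussed here but not a horizontally scaled one.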
-
|
Quick update: inspired by our discussion here, I added some of the benchmarks from the separate repo to the trailbase-sqlite crate with a criterion harness. More representative setups with a broader set of queries are still TBD. I'm bringing this up because I wanted to document a limitation (which I've run into in the past and blissfully keep forgetting about). The default criterion sampling approach doesn't work very reliably for the current setup because we heavily rely on sustained FS performance. Concretely, SQLite's in-memory DBs don't support WAL mode, so they're not a great representation, i.e. we have to write to some filesystem. If we write to temp files (which is what we currently do), we're non-hermetic and subject to hysteresis from the FS, SSD, ... . In other words, running the benchmark a few times gives quite a spread, and the order in which we run the individual benches also affects the outcome. Ideally, we'd write to an in-memory FS; however, I'm not aware of an easy, portable way of making this work. We could shell out, but then benchmarks would only run on Linux (which may be fine). Alternatively, we could just settle for SQLite's in-memory mode as representative enough as an optimization target. |
-
|
Ah ok, you're right about rust-vfs. I guess tmpfs could be a better solution for isolating FS/OS instabilities, but as you previously pointed out, it's Linux-only. I guess that's fair for benchmarking. I don't know what queries produced these numbers, but they look quick. Btw, I usually set up criterion to run benches in groups and show throughput, like:
```rust
use criterion::{Criterion, Throughput};

// The `benchmark_*` functions are defined elsewhere and take `&mut Bencher`.
fn bench_pools(c: &mut Criterion) {
    let mut group = c.benchmark_group("pools");
    group.measurement_time(std::time::Duration::from_secs(10)); // Measure for ~10 seconds
    group.sample_size(5000);
    group.throughput(Throughput::Elements(1)); // One element per iteration
    group.bench_function("tx inserts", benchmark_tx_inserts);
    group.bench_function("(single prep) reads", benchmark_prep_reads);
    group.bench_function("(rusqlite cached) prep reads ", benchmark_prep_reads_cached);
    group.bench_function("unprep reads", benchmark_unprep_reads);
    // ...
    group.finish();
}
```
It outputs like this:

I prefer this format over plain timings (just saying, see what you prefer). Throughput (i.e. reqs/sec) is the final number we want to see and compare to judge whether the solution is a good cost/benefit.

It would be nice, as I said, to have a benchmark comparing TB's performance against a PostgreSQL server on the same hardware. Or PB. That would mitigate any differences from hardware, FS, etc. As I said before, it would be wonderful to see the benchmark run on a cheap VPS, comparing TB/SQLite against PB/SQLite, PostgreSQL or other solutions, because that would be the typical case after all, right? A web server on a VPS. The cost of TL vs. pooling is not negligible, but it will be a small fraction of such a benchmark.

Looking at your numbers, there's no doubt TL is (much!) faster than a pool, as your benches are showing, but what about performance with parallel, highly concurrent, slow queries? That would be the bottleneck that a pool should fix. I don't know what exact queries produced the values above, but I'm curious about your numbers. This is getting even more interesting. |
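For reference, the throughput figure is simply elements processed divided by elapsed time. A minimal std-only sketch of that conversion (the `throughput` helper is hypothetical and is not part of criterion's API):

```rust
use std::time::Duration;

/// Elements processed per second, given a count and the time it took.
pub fn throughput(elements: u64, elapsed: Duration) -> f64 {
    elements as f64 / elapsed.as_secs_f64()
}
```

So 1,000 reads completing in half a second correspond to 2,000 reads/sec, which is easier to compare across setups than a raw per-iteration time.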
-
|
Btw, you can override the TL limitations with the likes of:
```rust
// `N` is a placeholder for the desired number of worker threads.
#[tokio::main(flavor = "multi_thread", worker_threads = N)]
async fn main() {
    // app
}
```
Or with a custom runtime:
```rust
tokio::runtime::Builder::new_multi_thread()
    .worker_threads(N)
    .enable_all()
    .build()
    .unwrap()
    .block_on(async {
        // `axum::Server` is the axum 0.6 API; axum 0.7+ uses `axum::serve(listener, app)`.
        axum::Server::bind(&addr).serve(app.into_make_service()).await.unwrap();
    });
```
but I'm not sure this would be a good way performance-wise. |
-
|
Coming back to the original request before we went deep down the optimization path (which has been very productive): you were asking about custom state and middleware. I just broke up the API to allow for this: https://github.com/trailbaseio/trailbase/blob/dev/examples/custom-binary/src/main.rs#L9. I'm by no means convinced that this is ideal (just as I think the public AppState API is a hot mess), but at least it should unblock your use cases. |
-
|
The upstream version now allows for parallel reads. Currently the parallelism is limited between 2 and 4 threads, though happy to make this configurable in the future. Thanks for pushing |
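As an illustration of what a 2-to-4 bound could look like in code, here is a sketch that clamps a thread count into that range. Deriving the count from the number of cores is purely an assumption made for this example, and both function names are hypothetical; how TrailBase actually picks the number is not stated here.

```rust
use std::thread;

/// Clamps a core count into the 2..=4 range mentioned above.
pub fn reader_threads_for(cores: usize) -> usize {
    cores.clamp(2, 4)
}

/// Same, based on what the OS reports; falls back to 1 core if unknown.
pub fn reader_threads() -> usize {
    let cores = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    reader_threads_for(cores)
}
```

Making this configurable would then amount to replacing the fixed bounds with user-supplied values.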
-
|
Still working through this backlog. I just looked at apalis as a queue system based on your recommendation, thanks. Looks great, I'll tentatively give it a shot. It mostly looks pretty modular. I hope I can provide my own SQL integration for consistency, but also to avoid sqlx. |
-
|
Ok, I had a very busy week. I had a quick look at the new release, congrats. Nice to see the parallelism improvement. I have yet to test the new version. I was building a criterion benchmark to demonstrate my point about concurrency issues, but maybe it won't be needed, and time is a real issue here lately. About apalis: yeah, I don't like sqlx very much either (not for SQLite, I mean), but there are trade-offs in this case, and I'll just leave my opinion because I messed with those in the past, so you might benefit from that. First thing, yes, as you know, sqlx is not as performant as some claim and it increases compilation times; all those issues I'm sure you know quite well. I'm much more into rusqlite. I think we're in the same boat here.
For a built-in queue, sqlx's performance, even if subpar compared to rusqlite, should be just enough.
Advantages:
Problems:
Advantages:
Problems:
So it's really a trade-off: use the default sqlx impl or do a rusqlite port? |


-
Hello,
Amazing project here, thanks. I have some questions about the Rust integration. I know it's not documented yet, so I have nowhere else to look.