Add name resolution, descriptor fetching and descriptor leasing to architecture docs #9218

@jordanlewis

Jordan Lewis (jordanlewis) commented:

The architecture docs (https://www.cockroachlabs.com/docs/stable/architecture/overview.html) currently have no documentation for name resolution or descriptor fetching/leasing. This is unfortunate, because understanding how these pieces work is critical to understanding CockroachDB.

This ticket tracks adding information on these phases to the SQL architecture docs.

Copying some notes from a Slack conversation I had with @jseldess:

it covers something related, but it details just the resolution part. what we need is to explain to users that, in the query path, we need to resolve the names, but that the names are cached using leases that expire every 5 minutes. it’s pretty important to have this part documented, because it does come up from time to time: if a node that is holding a lease dies unexpectedly, that can lead to some unexpected waits when doing a schema change on the object that was leased

jesse  5:18 PM
OK, yes, that sounds important. Leases are covered to some extent in the replication layer docs, but I imagine we’re incomplete on this particular topic. https://www.cockroachlabs.com/docs/v20.2/architecture/replication-layer.html#leases

jordan  5:18 PM
well, and this is a commonly mistaken point which is again why we should be documenting it, that lease is an entirely different sort of lease than the schema lease that i’m talking about
5:19
it’s pretty confusing!
5:19
nodes can be leaseholders for raft groups, which is about replication.
but nodes can also hold leases on individual schema objects, to prevent them from having to re-lookup those schema objects over and over again on each query. they store them in memory. this is safe because of this “leasing” system.
5:20
we should probably call these things Raft Leases vs Schema Leases, or something.

jesse  5:21 PM
Yeah, that sounds like a good distinction.
5:23
How do descriptors come into play? I’d love to understand this better.

jordan  5:25 PM
there are 2 important system tables:
system.namespace, which tells you what the id of a descriptor is for a given name.
system.descriptor, which holds the descriptor for every id.
5:25
in a world with no schema leases, to serve a query like this:
select * from t where a = 10
we would have to do 2 things before we can do any actual data lookups:
5:26
1. select * from system.namespace where name = 't' (slightly simplified - first we also actually would have to lookup the database name and the schema name)
5:26
2. if that gave back id 105, select * from system.descriptor where id = 105
5:27
then, we would have the descriptor for table t, containing its table layout and everything we need to actually read data from it. only then could we do the rest of the query
3. plan and execute the original query, returning results finally to the user
5:28
this would be very expensive! it would mean that, on every query, you’d need to potentially do 2 entire extra round trips to those system tables before you could get data.
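
A minimal sketch of the lease-free path described above, assuming toy in-memory dictionaries in place of system.namespace and system.descriptor. The `ToyCatalog` class, its method names, and the column list are hypothetical and only illustrate the two lookups; they are not CockroachDB code.

```python
# Toy model of resolving a table name with no caching at all.
# The dictionaries stand in for system.namespace and system.descriptor;
# the id 105 comes from the example in the conversation, the rest is made up.

class ToyCatalog:
    def __init__(self):
        # Stand-in for system.namespace: name -> descriptor id.
        self.namespace = {"t": 105}
        # Stand-in for system.descriptor: descriptor id -> descriptor.
        self.descriptor = {105: {"name": "t", "columns": ["a", "b"]}}
        # Each lookup below counts as one simulated round trip.
        self.round_trips = 0

    def lookup_id(self, name):
        self.round_trips += 1
        return self.namespace[name]

    def lookup_descriptor(self, desc_id):
        self.round_trips += 1
        return self.descriptor[desc_id]


def resolve_without_leases(catalog, table_name):
    desc_id = catalog.lookup_id(table_name)    # step 1: name -> id
    return catalog.lookup_descriptor(desc_id)  # step 2: id -> descriptor


catalog = ToyCatalog()
for _ in range(3):  # three queries against table t
    resolve_without_leases(catalog, "t")
print(catalog.round_trips)  # 6: two extra round trips per query
```

In this model the cost grows linearly with the number of queries, which is the overhead that caching is meant to remove.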
5:28
so, we need a way to cache the descriptor information (the map from t to 105, and the map from 105 to its descriptor).
5:29
this is done via “leasing”. a lease on a descriptor means that we write some more data to another system table (system.lease) that informs the cluster that a given node has cached the descriptor. using this information, it is possible to know which nodes must be notified when we do scarier, mutating operations on the descriptors (aka schema changes)
5:29
if it weren’t for schema changes, every node could cache every descriptor forever, since they’d be immutable! so we’d have no problem. but, since people can change their schemas, we have to have this extra system in place.
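
To make the bookkeeping concrete, here is a rough sketch of what recording leases buys us, assuming a toy in-memory table in place of system.lease. `ToyLeaseTable`, its methods, and the node ids are hypothetical; the 5-minute duration is the figure mentioned in this thread, and the real schema change protocol is more involved than this.

```python
LEASE_DURATION_S = 5 * 60  # the 5-minute expiry mentioned in this thread


class ToyLeaseTable:
    """Stand-in for system.lease: (descriptor id, node id) -> expiration."""

    def __init__(self):
        self.expirations = {}

    def acquire(self, desc_id, node_id, now):
        # A node records that it has cached the descriptor until `now + 5m`.
        self.expirations[(desc_id, node_id)] = now + LEASE_DURATION_S

    def holders(self, desc_id, now):
        # Nodes a schema change on `desc_id` would have to account for.
        return sorted(
            node
            for (d, node), exp in self.expirations.items()
            if d == desc_id and exp > now
        )

    def worst_case_wait(self, desc_id, dead_nodes, now):
        # If a lease holder died, nobody can release its lease early, so in
        # this model the schema change waits until that lease expires.
        remaining = [
            exp - now
            for (d, node), exp in self.expirations.items()
            if d == desc_id and node in dead_nodes and exp > now
        ]
        return max(remaining, default=0)


leases = ToyLeaseTable()
leases.acquire(105, node_id=1, now=0)
leases.acquire(105, node_id=2, now=60)
leases.acquire(105, node_id=3, now=100)

holders = leases.holders(105, now=120)                        # [1, 2, 3]
wait = leases.worst_case_wait(105, dead_nodes={2}, now=120)   # 240 seconds
```

Several nodes hold a lease on the same descriptor at once here, and the `worst_case_wait` result is the kind of unexpected delay a dead lease holder can cause for a schema change.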
5:29
does that make sense?

jesse  5:30 PM
Reading now
5:31
And this is separate from tracking down the range location, right?

jordan  5:31 PM
correct. this has nothing to do with ranges at all. it’s a higher level of the system.

jesse  5:32 PM
So you say, "only then could we do the rest of the query".
5:35
I see now. So descriptors are cached via what we call “leasing”, but there can be multiple leases on a descriptor, right, unlike on a range?

jordan  5:35 PM
yes, correct
5:35
in the steady state of a workload, every node will have a lease on all of the descriptors it has to touch

jesse  5:36 PM
I see now. So is it such that once a node is the gateway for a query that touches table a, it will have a lease on that table’s descriptor?

jordan  5:36 PM
yes, correct

jesse  5:36 PM
I get it. Essentially, it just means the descriptor is cached on that node.

jordan  5:37 PM
yep

jesse  5:37 PM
Curious why we called that mechanism a “lease”. It is confusing in contrast to the notion of range leases.

jordan  5:37 PM
I don’t really know. it predated my joining the company :slightly_smiling_face:
5:37
there are some more details for sure. like, you can only have so many leases total at once, to limit the size of the cache. and, the leases expire after 5 minutes. and, the schema change process interacts with the leases in a complex way.
but, you have the fundamental idea.
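
The two details above (a cap on the number of leases and a 5-minute expiry) can be sketched as a bounded cache with timed entries. This is only an illustration: `ToyDescriptorCache` is hypothetical, and the least-recently-used eviction policy is an assumption made for the example, not something stated in this conversation.

```python
from collections import OrderedDict


class ToyDescriptorCache:
    """Per-node cache of descriptors with a size cap and timed expiry."""

    def __init__(self, fetch, max_entries, lease_duration_s=300):
        self.fetch = fetch                  # slow path: read the system tables
        self.max_entries = max_entries      # cap on cached descriptors
        self.lease_duration_s = lease_duration_s
        self.entries = OrderedDict()        # name -> (descriptor, expiration)
        self.fetches = 0                    # how often the slow path was taken

    def get(self, name, now):
        entry = self.entries.get(name)
        if entry is not None and entry[1] > now:
            self.entries.move_to_end(name)  # mark as most recently used
            return entry[0]
        # Miss or expired lease: take the slow path and start a new lease.
        self.fetches += 1
        descriptor = self.fetch(name)
        self.entries[name] = (descriptor, now + self.lease_duration_s)
        self.entries.move_to_end(name)
        # ASSUMPTION: evict least-recently-used entries when over the cap.
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)
        return descriptor


cache = ToyDescriptorCache(fetch=lambda name: {"name": name}, max_entries=2)
cache.get("t", now=0)    # miss: slow path
cache.get("t", now=10)   # hit: served from memory
cache.get("t", now=301)  # lease expired at 300: slow path again
cache.get("u", now=302)  # miss
cache.get("v", now=303)  # miss, and "t" is evicted to stay under the cap
print(cache.fetches)     # 4
```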

jesse  5:42 PM
OK, that’s super helpful. And to come back to your point, people might be confused about why queries are somewhat slower when there’s no lease or it has expired. I know when I was writing https://www.cockroachlabs.com/docs/stable/performance-tuning.html, I had a hard time with that fact, so I decided to write a CLI that repeats a query 20 times and then returns the median time.
5:43
That’s an old tutorial that probably should be removed in favor of https://www.cockroachlabs.com/docs/stable/demo-low-latency-multi-region-deployment.html anyway.
5:43
In any case, thanks for the explanation, Jordan. I think it’s a great topic to document. It might also be a good SQL FAQ.

jordan  5:46 PM
yes! exactly to your first point

@jseldess also mentions that it might be important to tie this in to a SQL FAQ. Specifically, the implementation of SQL name resolution, descriptor fetching and leasing can cause a performance anomaly the first time a gateway node serves a query that touches a given table. This anomaly can be confusing to users and could be a good FAQ entry.
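
The "repeat the query 20 times and report the median" approach mentioned in the conversation can be sketched as follows. The `median_latency` helper, the injectable `clock` parameter, and the fake gateway timings (50 ms cold, 2 ms warm) are all hypothetical and chosen only to show why the median hides the slow first run; this is not the CLI from the tutorial.

```python
import statistics
import time


def median_latency(run_query, repetitions=20, clock=time.perf_counter):
    """Run `run_query` repeatedly; return the median elapsed time in clock units."""
    timings = []
    for _ in range(repetitions):
        start = clock()
        run_query()
        timings.append(clock() - start)
    return statistics.median(timings)


class FakeClock:
    """Deterministic clock (in milliseconds) so the example is repeatable."""

    def __init__(self):
        self.now = 0

    def __call__(self):
        return self.now

    def advance(self, ms):
        self.now += ms


clock = FakeClock()
state = {"warm": False}


def fake_query():
    # Made-up costs: the first run pays for name resolution and descriptor
    # fetching (50 ms); later runs use the cached descriptor (2 ms).
    clock.advance(2 if state["warm"] else 50)
    state["warm"] = True


result = median_latency(fake_query, clock=clock)
print(result)  # 2.0: the single slow first run does not affect the median
```

Against a real cluster, `run_query` would execute the statement through a SQL driver and `clock` would be left at its default.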

Jira Issue: DOC-877
