Commit 357d2a6

Centralize informer management with InformerSet
Replace factory-based informer creation with a centralized InformerSet that:

- Creates base SharedIndexInformers directly without factories
- Manages lifecycle and synchronization in one place
- Provides filtered views for ProviderConfig-specific controllers
- Reduces boilerplate and improves maintainability

Using informers directly without factories simplifies the logic and eliminates potential mistakes from unnecessary factory usage, such as accidentally creating duplicate informers or scoping a factory incorrectly.
1 parent 0f9248e commit 357d2a6

15 files changed: +826 -1016 lines changed

cmd/glbc/main.go

Lines changed: 4 additions & 25 deletions
@@ -27,14 +27,11 @@ import (
 
 	firewallcrclient "github.com/GoogleCloudPlatform/gke-networking-api/client/gcpfirewall/clientset/versioned"
 	networkclient "github.com/GoogleCloudPlatform/gke-networking-api/client/network/clientset/versioned"
-	informernetwork "github.com/GoogleCloudPlatform/gke-networking-api/client/network/informers/externalversions"
 	nodetopologyclient "github.com/GoogleCloudPlatform/gke-networking-api/client/nodetopology/clientset/versioned"
-	informernodetopology "github.com/GoogleCloudPlatform/gke-networking-api/client/nodetopology/informers/externalversions"
 	k8scp "github.com/GoogleCloudPlatform/k8s-cloud-provider/pkg/cloud"
 	flag "github.com/spf13/pflag"
 	crdclient "k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset"
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
-	informers "k8s.io/client-go/informers"
 	"k8s.io/client-go/kubernetes"
 	restclient "k8s.io/client-go/rest"
 	"k8s.io/client-go/tools/leaderelection"
@@ -54,7 +51,6 @@ import (
 	serviceattachmentclient "k8s.io/ingress-gce/pkg/serviceattachment/client/clientset/versioned"
 	"k8s.io/ingress-gce/pkg/svcneg"
 	svcnegclient "k8s.io/ingress-gce/pkg/svcneg/client/clientset/versioned"
-	informersvcneg "k8s.io/ingress-gce/pkg/svcneg/client/informers/externalversions"
 	"k8s.io/ingress-gce/pkg/systemhealth"
 	"k8s.io/ingress-gce/pkg/utils"
 	"k8s.io/klog/v2"
@@ -252,19 +248,6 @@ func main() {
 	if err != nil {
 		klog.Fatalf("Failed to create ProviderConfig client: %v", err)
 	}
-	informersFactory := informers.NewSharedInformerFactory(kubeClient, flags.F.ResyncPeriod)
-	var svcNegFactory informersvcneg.SharedInformerFactory
-	if svcNegClient != nil {
-		svcNegFactory = informersvcneg.NewSharedInformerFactory(svcNegClient, flags.F.ResyncPeriod)
-	}
-	var networkFactory informernetwork.SharedInformerFactory
-	if networkClient != nil {
-		networkFactory = informernetwork.NewSharedInformerFactory(networkClient, flags.F.ResyncPeriod)
-	}
-	var nodeTopologyFactory informernodetopology.SharedInformerFactory
-	if nodeTopologyClient != nil {
-		nodeTopologyFactory = informernodetopology.NewSharedInformerFactory(nodeTopologyClient, flags.F.ResyncPeriod)
-	}
 	ctx := context.Background()
 	if flags.F.LeaderElection.LeaderElect {
 		err := multiprojectstart.StartWithLeaderElection(
@@ -274,13 +257,11 @@ func main() {
 			rootLogger,
 			kubeClient,
 			svcNegClient,
+			networkClient,
+			nodeTopologyClient,
 			kubeSystemUID,
 			eventRecorderKubeClient,
 			providerConfigClient,
-			informersFactory,
-			svcNegFactory,
-			networkFactory,
-			nodeTopologyFactory,
 			gceCreator,
 			namer,
 			stopCh,
@@ -294,13 +275,11 @@ func main() {
 			rootLogger,
 			kubeClient,
 			svcNegClient,
+			networkClient,
+			nodeTopologyClient,
 			kubeSystemUID,
 			eventRecorderKubeClient,
 			providerConfigClient,
-			informersFactory,
-			svcNegFactory,
-			networkFactory,
-			nodeTopologyFactory,
 			gceCreator,
 			namer,
 			stopCh,

pkg/multiproject/README.md

Lines changed: 179 additions & 0 deletions
@@ -0,0 +1,179 @@
# Multi-Project Controller Architecture

## Overview

The multi-project controller enables Kubernetes ingress-gce to manage Network Endpoint Groups (NEGs) across multiple Google Cloud Platform (GCP) projects. This allows for multi-tenant scenarios where different namespaces or services can be associated with different GCP projects through ProviderConfig resources.

## Architecture

### Core Components

```
┌─────────────────────────────────────────────────────────────┐
│                        Main Process                         │
│                                                             │
│  ┌────────────────────────────────────────────────────────┐ │
│  │                     start.Start()                      │ │
│  │  - Creates base SharedIndexInformers                   │ │
│  │  - Starts informers with globalStopCh                  │ │
│  │  - Creates ProviderConfigController                    │ │
│  └────────────────────┬───────────────────────────────────┘ │
│                       │                                     │
│  ┌────────────────────▼───────────────────────────────────┐ │
│  │                ProviderConfigController                │ │
│  │  - Watches ProviderConfig resources                    │ │
│  │  - Manages lifecycle of per-PC controllers             │ │
│  └────────────────────┬───────────────────────────────────┘ │
│                       │                                     │
│  ┌────────────────────▼───────────────────────────────────┐ │
│  │            ProviderConfigControllersManager            │ │
│  │  - Starts/stops NEG controllers per ProviderConfig     │ │
│  │  - Manages controller lifecycle                        │ │
│  └───────────┬────────────────────────┬───────────────────┘ │
│              │                        │                     │
│    ┌─────────▼──────────┐   ┌─────────▼──────────┐          │
│    │ NEG Controller #1  │   │ NEG Controller #2  │ ...      │
│    │ (ProviderConfig A) │   │ (ProviderConfig B) │          │
│    └────────────────────┘   └────────────────────┘          │
└─────────────────────────────────────────────────────────────┘
```

### Key Design Principles

1. **Shared Informers**: Base informers are created once and shared across all ProviderConfig controllers
2. **Filtered Views**: Each NEG controller gets a filtered view of resources based on ProviderConfig
3. **Lifecycle Management**: Controllers can be started/stopped independently as ProviderConfigs are added/removed
4. **Channel Management**: Proper channel lifecycle ensures clean shutdown and resource cleanup

## Component Details

### start/start.go
Main entry point that:
- Creates base SharedIndexInformers via InformerSet (no factories)
- Starts all informers with the global stop channel
- Creates the ProviderConfigController
- Manages leader election (when enabled)

### controller/controller.go
ProviderConfigController that:
- Watches ProviderConfig resources
- Enqueues changes for processing
- Delegates to ProviderConfigControllersManager

### manager/manager.go
ProviderConfigControllersManager that:
- Maintains a map of active controllers per ProviderConfig
- Starts NEG controllers when ProviderConfigs are added
- Stops NEG controllers when ProviderConfigs are deleted
- Manages finalizers for cleanup

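The bookkeeping described for `manager/manager.go` can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the package's actual code: `controllersManager`, `start`, `stop`, and the `run` callback are hypothetical names, and finalizer handling is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// controllersManager (hypothetical) tracks one stop channel per ProviderConfig.
type controllersManager struct {
	mu      sync.Mutex
	stopChs map[string]chan struct{} // keyed by ProviderConfig name
	run     func(providerConfigName string, stopCh <-chan struct{})
}

func newControllersManager(run func(string, <-chan struct{})) *controllersManager {
	return &controllersManager{stopChs: make(map[string]chan struct{}), run: run}
}

// start launches a controller for the ProviderConfig unless one is already
// running. It reports whether a new controller was started.
func (m *controllersManager) start(name string) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, running := m.stopChs[name]; running {
		return false
	}
	stopCh := make(chan struct{})
	m.stopChs[name] = stopCh
	go m.run(name, stopCh)
	return true
}

// stop closes the ProviderConfig's stop channel and forgets the controller.
// It reports whether a controller was running.
func (m *controllersManager) stop(name string) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	stopCh, running := m.stopChs[name]
	if !running {
		return false
	}
	close(stopCh)
	delete(m.stopChs, name)
	return true
}

func main() {
	done := make(chan string)
	m := newControllersManager(func(name string, stopCh <-chan struct{}) {
		<-stopCh // a real controller would do its work until this closes
		done <- name
	})
	m.start("provider-config-a")
	m.start("provider-config-a") // second call is a no-op
	m.stop("provider-config-a")
	fmt.Println("stopped controller for", <-done)
}
```

Keeping `start` idempotent means a repeated add or resync event for the same ProviderConfig does not spawn a second controller.
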
### neg/neg.go
NEG controller factory that:
- Wraps base SharedIndexInformers with provider-config filters via ProviderConfigFilteredInformer
- Sets up the NEG controller with proper GCE client
- Manages channel lifecycle (globalStopCh vs providerConfigStopCh)

### filteredinformer/
Filtered informer implementation that:
- Wraps base SharedIndexInformers
- Filters resources based on ProviderConfig labels
- Provides filtered cache/store views

## Channel Lifecycle

The implementation uses three types of channels:

1. **globalStopCh**: Process-wide shutdown signal
   - Closes on leader election loss or process termination
   - Used by base informers and shared resources

2. **providerConfigStopCh**: Per-ProviderConfig shutdown signal
   - Closed when a ProviderConfig is deleted
   - Used to stop PC-specific controllers

3. **joinedStopCh**: Combined shutdown signal
   - Closes when either globalStopCh OR providerConfigStopCh closes
   - Used by PC-specific resources that should stop in either case

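One way to derive a joined channel from the other two is a small goroutine that waits on both. The sketch below uses only the standard library; `joinStopChannels` is a hypothetical helper name, and the package's real implementation may differ.

```go
package main

import "fmt"

// joinStopChannels (hypothetical name) returns a channel that is closed as
// soon as either input channel is closed.
func joinStopChannels(globalStopCh, providerConfigStopCh <-chan struct{}) <-chan struct{} {
	joined := make(chan struct{})
	go func() {
		defer close(joined)
		select {
		case <-globalStopCh:
		case <-providerConfigStopCh:
		}
	}()
	return joined
}

func main() {
	globalStopCh := make(chan struct{})
	providerConfigStopCh := make(chan struct{})
	joinedStopCh := joinStopChannels(globalStopCh, providerConfigStopCh)

	// Deleting a ProviderConfig closes only its own channel.
	close(providerConfigStopCh)
	<-joinedStopCh // the joined channel closes as a result

	select {
	case <-globalStopCh:
		fmt.Println("global channel closed")
	default:
		fmt.Println("global channel still open; shared informers keep running")
	}
}
```

The helper only reads from its inputs, so stopping one ProviderConfig can never close `globalStopCh` and take the shared informers down with it.
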
## Resource Filtering

Resources are associated with ProviderConfigs through labels:
- Services, Ingresses, etc. have a label indicating their ProviderConfig
- The filtered informer only passes through resources matching the PC name
- This ensures each controller only sees and manages its own resources

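The label check behind this filtering can be illustrated with a small standalone sketch. The `object` type, `belongsTo`, and `filteredList` are hypothetical stand-ins for Kubernetes objects and the filtered informer's store view; only the label key is taken from the documented default of `--provider-config-name-label-key`.

```go
package main

import "fmt"

// Default value documented for --provider-config-name-label-key.
const providerConfigLabelKey = "cloud.gke.io/provider-config-name"

// object is a stand-in for a Kubernetes object; only name and labels matter here.
type object struct {
	Name   string
	Labels map[string]string
}

// belongsTo reports whether obj carries the label for the given ProviderConfig.
// Unlabeled objects never match, even for an empty ProviderConfig name.
func belongsTo(obj object, providerConfigName string) bool {
	value, ok := obj.Labels[providerConfigLabelKey]
	return ok && value == providerConfigName
}

// filteredList returns only the objects that belong to the ProviderConfig,
// similar in spirit to listing from a filtered store view.
func filteredList(all []object, providerConfigName string) []object {
	var out []object
	for _, obj := range all {
		if belongsTo(obj, providerConfigName) {
			out = append(out, obj)
		}
	}
	return out
}

func main() {
	services := []object{
		{Name: "svc-a", Labels: map[string]string{providerConfigLabelKey: "pc-a"}},
		{Name: "svc-b", Labels: map[string]string{providerConfigLabelKey: "pc-b"}},
		{Name: "svc-unlabeled"},
	}
	for _, obj := range filteredList(services, "pc-a") {
		fmt.Println(obj.Name)
	}
}
```

Because the filter is applied on read, every controller can share one underlying cache while still seeing only its own objects.
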
## Informer Lifecycle

### Creation
1. Base informers are created via `InformerSet` using `NewXInformer()` functions
2. Base informers are started by `InformerSet.Start` with `globalStopCh`
3. Filtered wrappers are created per ProviderConfig using `ProviderConfigFilteredInformer`

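As a rough sketch of the start step, the code below runs every informer that was actually created with one shared stop channel and waits until all of them report synced. It is an assumption-laden illustration rather than the real `InformerSet`: the `informer` interface lists only the `Run` and `HasSynced` methods that client-go's `SharedIndexInformer` also provides, while `startAndSync` and `fakeInformer` are hypothetical names used so the example runs without a cluster.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// informer lists the two methods this sketch needs.
type informer interface {
	Run(stopCh <-chan struct{})
	HasSynced() bool
}

// startAndSync (hypothetical) runs every non-nil informer with the shared stop
// channel and blocks until all report synced. It returns false if stopCh
// closes first. Only untyped nil values are skipped; a typed nil pointer
// wrapped in the interface would not be detected by this check.
func startAndSync(stopCh <-chan struct{}, informers ...informer) bool {
	var started []informer
	for _, inf := range informers {
		if inf == nil {
			continue // optional informer, for example when its client is not configured
		}
		go inf.Run(stopCh)
		started = append(started, inf)
	}

	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()
	for {
		synced := true
		for _, inf := range started {
			if !inf.HasSynced() {
				synced = false
				break
			}
		}
		if synced {
			return true
		}
		select {
		case <-stopCh:
			return false
		case <-ticker.C:
		}
	}
}

// fakeInformer stands in for a real informer.
type fakeInformer struct{ synced atomic.Bool }

func (f *fakeInformer) Run(stopCh <-chan struct{}) {
	f.synced.Store(true) // pretend the initial list completed immediately
	<-stopCh
}

func (f *fakeInformer) HasSynced() bool { return f.synced.Load() }

func main() {
	globalStopCh := make(chan struct{})
	defer close(globalStopCh)

	services, ingresses := &fakeInformer{}, &fakeInformer{}
	var svcNeg informer // nil: an optional informer that was not created

	fmt.Println("synced:", startAndSync(globalStopCh, services, svcNeg, ingresses))
}
```

Starting everything from one place with one channel is what removes the per-factory `Start` calls that the old code needed.
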
### Synchronization
- `InformerSet.Start` waits for base informer caches to sync
- Filtered informers rely on the synced base caches
- Controllers use `CombinedHasSynced()` from filtered informers before processing

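The idea of a combined sync check can be sketched as a function that is true only when every underlying check is true. The real `CombinedHasSynced()` signature is not shown in this README, so `combinedHasSynced` and `hasSyncedFunc` below are hypothetical; the `func() bool` shape matches how client-go informers expose `HasSynced`.

```go
package main

import "fmt"

// hasSyncedFunc has the same shape as an informer's HasSynced method.
type hasSyncedFunc func() bool

// combinedHasSynced (hypothetical) returns a check that reports true only when
// every supplied check reports true. With no checks it reports true.
func combinedHasSynced(checks ...hasSyncedFunc) hasSyncedFunc {
	return func() bool {
		for _, synced := range checks {
			if !synced() {
				return false
			}
		}
		return true
	}
}

func main() {
	serviceSynced := func() bool { return true }
	ingressSynced := func() bool { return false } // still waiting for its initial list

	ready := combinedHasSynced(serviceSynced, ingressSynced)
	fmt.Println("ready to process:", ready())
}
```

Returning a function rather than a boolean lets a controller poll the same check repeatedly until it turns true.
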
### Shutdown
- Base informers stop when globalStopCh closes
- Filtered informers are just wrappers (no separate shutdown)
- Controllers stop when their providerConfigStopCh closes

## Configuration

Key configuration flags:
- `--provider-config-name-label-key`: Label key for PC association (default: cloud.gke.io/provider-config-name)
- `--multi-project-owner-label-key`: Label key for PC owner
- `--resync-period`: Informer resync period

## Testing

### Unit Tests
- Controller logic testing
- Filter functionality testing
- Channel lifecycle testing

### Integration Tests
- Multi-ProviderConfig scenarios
- Controller start/stop sequencing
- Resource cleanup verification

### Key Test Scenarios
1. Single ProviderConfig with services
2. Multiple ProviderConfigs
3. ProviderConfig deletion and cleanup
4. Shared informer survival across PC changes

## Common Operations

### Adding a ProviderConfig
1. Create ProviderConfig resource
2. Controller detects addition
3. Manager starts NEG controller
4. NEG controller creates filtered informers
5. NEGs are created in target GCP project

### Removing a ProviderConfig

The deletion process follows a specific sequence to ensure proper cleanup:

1. **External automation initiates deletion**:
   - Server-side automation triggers the deletion process
   - All namespaces belonging to the ProviderConfig are deleted first

2. **Namespace cleanup**:
   - Kubernetes deletes all resources within the namespaces
   - Services are deleted, triggering NEG cleanup
   - NEG controller removes NEGs from GCP as services are deleted

3. **Wait for namespace deletion**:
   - External automation waits for all namespaces to be fully deleted
   - This ensures all NEGs and other resources are cleaned up

4. **ProviderConfig deletion**:
   - Only after namespaces are gone, ProviderConfig is deleted
   - Controller stops the NEG controller for this ProviderConfig
   - Finalizer is removed from ProviderConfig
   - ProviderConfig resource is removed from Kubernetes

**Important**: NEGs are not automatically deleted when a ProviderConfig is removed. They are cleaned up as part of the namespace/service deletion process that happens before ProviderConfig deletion.
