feat(consul): filter nodes in upstream with metadata #12448

jizhuozhi · 2025-07-20T05:57:24Z

Description

This PR introduces metadata-based node filtering for upstreams that use Consul service discovery.

Motivation

Currently, APISIX selects upstream nodes based on the service name from discovery, without any additional filtering logic. In real-world scenarios such as canary release or swimlane routing, users often tag backend instances with custom metadata (e.g., version, env, lane, dc, etc.) and expect the gateway to route only to specific subsets.

This change allows users to define a metadata_match field in discovery_args configuration, which filters nodes before load balancing based on their metadata values.

Changes

Consul discovery:

Include Service.Meta in the node definition and respect its weight if available (aligned with Eureka).
Filter with discovery_args when fetching upstream nodes.

Tests: Add test cases to cover:
- discovery-based upstream (Consul) with metadata_match

Example Usage

upstream:
  type: roundrobin
  scheme: http
  discovery_type: consul
  discovery_args:
    metadata_match:
      lane:
      - prod
      - canary
      dc:
      - us-east-1
      - us-east-2

Only nodes with metadata.lane in [prod, canary] and metadata.dc in [us-east-1, us-east-2] will be used for load balancing.
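
The matching rule described above can be sketched in plain Lua. This is a minimal illustration and not the code from this PR: the names `match_metadata` and `filter_nodes` and the sample node list are hypothetical, and the sketch assumes that every key in metadata_match must match and that each key maps to a list of accepted values.

```lua
-- Hypothetical helper: true when `metadata` satisfies every key in
-- `metadata_match` (each key maps to a list of accepted values).
local function match_metadata(metadata, metadata_match)
    if metadata_match == nil then
        return true                      -- no rule configured: node passes
    end
    metadata = metadata or {}            -- nodes without metadata match nothing
    for key, accepted in pairs(metadata_match) do
        local found = false
        for _, value in ipairs(accepted) do
            if metadata[key] == value then
                found = true
                break
            end
        end
        if not found then
            return false                 -- one unmatched key rejects the node
        end
    end
    return true
end

-- Hypothetical helper: keep only the nodes that satisfy the rule.
local function filter_nodes(nodes, metadata_match)
    local result = {}
    for _, node in ipairs(nodes) do
        if match_metadata(node.metadata, metadata_match) then
            result[#result + 1] = node
        end
    end
    return result
end

-- Sample data (invented for illustration).
local nodes = {
    {host = "10.0.0.1", port = 8080, weight = 1,
     metadata = {lane = "prod", dc = "us-east-1"}},
    {host = "10.0.0.2", port = 8080, weight = 1,
     metadata = {lane = "canary", dc = "eu-west-1"}},
    {host = "10.0.0.3", port = 8080, weight = 1},   -- no metadata at all
}
local rule = {lane = {"prod", "canary"}, dc = {"us-east-1", "us-east-2"}}

local selected = filter_nodes(nodes, rule)
-- Only 10.0.0.1 satisfies both the lane and the dc condition.
print(#selected, selected[1].host)
```

Passing `nil` as the rule returns every node, which mirrors the backward-compatibility claim in the checklist below.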

Fixes

Fixes #12464

Checklist

I have explained the need for this PR and the problem it solves
I have explained the changes or the new features added to this PR
I have added tests corresponding to this change
I have updated the documentation to reflect this change
I have verified that this change is backward compatible (no breaking changes)

Baoyuantop · 2025-07-21T02:43:23Z

Hi @jizhuozhi, thanks for your contribution.

I think it is useful for node filtering of Consul discovery. But I don't understand static upstream filtering. It seems that I need to mark metadata for each node in the upstream object, and then use metadata_match to configure filtering? Because each node is manually defined, if it is not needed, can I just add or delete the node?

jizhuozhi · 2025-07-21T02:47:08Z

> Hi @jizhuozhi, thanks for your contribution.
>
> I think it is useful for node filtering of Consul discovery. But I don't understand static upstream filtering. It seems that I need to mark metadata for each node in the upstream object, and then use metadata_match to configure filtering? Because each node is manually defined, if it is not needed, can I just add or delete the node?

This is a unified approach: I only check whether a filtering rule exists, without distinguishing between service discovery and a static list.

However, in my previous experience as a gateway administrator, there will be corresponding business developers who temporarily add rules for some debugging considerations but do not want to change the original instance list (for quick adjustment).

Baoyuantop · 2025-07-21T03:04:46Z

> there will be corresponding business developers who temporarily add rules for some debugging considerations but do not want to change the original instance list

Thanks for your reply. Could you please describe the scenario in detail? Why can't the existing methods solve this problem?

jizhuozhi · 2025-07-21T11:21:47Z

> Thanks for your reply. Could you please describe the scenario in detail? Why can't the existing methods solve this problem?

This is not a production scenario, but it is common in the testing and verification phase. We frequently need to target specific instances (for example, to capture flame graphs for performance analysis). If we delete the other instances from the list, we have to add them back afterwards, so we prefer to select instances by filtering.

In fact, we also match on dynamically colored metadata at load-balancing time rather than predefining routes, similar to https://github.com/kitex-contrib/loadbalance-tagging (I am also using Lua to implement the same capability, but that is outside the scope of this discussion).

Baoyuantop · 2025-07-22T05:59:30Z

> > Thanks for your reply. Could you please describe the scenario in detail? Why can't the existing methods solve this problem?
>
> This is not a production scenario, but it is common in the testing and verification phase. We frequently need to target specific instances (for example, to capture flame graphs for performance analysis). If we delete the other instances from the list, we have to add them back afterwards, so we prefer to select instances by filtering.
>
> In fact, we also match on dynamically colored metadata at load-balancing time rather than predefining routes, similar to https://github.com/kitex-contrib/loadbalance-tagging (I am also using Lua to implement the same capability, but that is outside the scope of this discussion).

I still have doubts about what is in Example Usage. Do you mean that if I need to adjust the nodes used, I don't need to change the content of the nodes list, but adjust metadata_match?
By the way, are you using static nodes or service discovery?

jizhuozhi · 2025-07-22T06:20:00Z

> I still have doubts about what is in Example Usage. Do you mean that if I need to adjust the nodes used, I don't need to change the content of the nodes list, but adjust metadata_match?

Yes, just adjust metadata_match (but the discussion of this use case has been separated from this PR). For the runtime, it is a unified filtering rule for the service list that does not need to distinguish the source.

> By the way, are you using static nodes or service discovery?

We are currently using Consul on Kubernetes. When I was working at another company a few years ago, we were using cloud virtual machines (EC2 and similar). The cloud platform did not provide an API, so we used scripts to synchronize static instance lists at regular intervals. In that setup, the static list was effectively a form of dynamic discovery. (Why not filter in the script? Because we were lazy :)

Baoyuantop · 2025-07-25T05:41:34Z

Hi @jizhuozhi, There is currently no modification to the upstream schema, which means that the current modifications in the upstream only serve consul. Is it more appropriate to put all these logics into the consul module?

jizhuozhi · 2025-07-25T05:55:27Z

Hello, @Baoyuantop, thanks for your reply.

> Hi @jizhuozhi, There is currently no modification to the upstream schema, which means that the current modifications in the upstream only serve consul. Is it more appropriate to put all these logics into the consul module?

This applies not only to Consul: Eureka (which already supports metadata in APISIX) will inherit this function as well. My forked dashboard already supports configuring metadata_match for Consul and Eureka:
https://github.com/jizhuozhi/apisix-dashboard/blob/9bd72c82e4fcbfa0d2bf34420280028c2ca853c8/web/src/components/Upstream/components/ServiceDiscovery.tsx#L30-L37

(The examples in the PR description are just examples, because this allows testing without the registry, and we don't need to care about service discovery or static nodes.)

Also, the discovery package is currently responsible for pulling all instances, so filtering configured there applies to every service name and upstream. That means I could only configure a general filtering rule, not differentiated matching for different routes and upstreams. This is our current online effect.

jizhuozhi · 2025-07-25T06:13:54Z

Hello @Baoyuantop, I see. The upstream already passes discovery_args to nodes(), so the filtering can be handled entirely inside the discovery module. I will modify it.

        local new_nodes, err = dis.nodes(up_conf.service_name, up_conf.discovery_args)
        if not new_nodes then
            return HTTP_CODE_UPSTREAM_UNAVAILABLE, "no valid upstream node: " .. (err or "nil")
        end
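
Because the filter arrives as an argument to nodes(), two upstreams can resolve the same service name with different rules. The sketch below illustrates that idea under stated assumptions: the `registry` table is an invented stand-in for the instances pulled from Consul, and this `nodes` function is a simplified stand-in, not the module's actual implementation.

```lua
-- Hypothetical in-memory stand-in for instances pulled from the registry.
local registry = {
    service_a = {
        {host = "10.0.0.1", port = 8080, weight = 1, metadata = {lane = "prod"}},
        {host = "10.0.0.2", port = 8080, weight = 1, metadata = {lane = "canary"}},
    },
}

-- Simplified stand-in for a discovery module's nodes() function.
local function nodes(service_name, discovery_args)
    local all = registry[service_name]
    if not all then
        return nil, "service not found: " .. service_name
    end

    local rule = discovery_args and discovery_args.metadata_match
    if not rule then
        return all                       -- no filter: behaviour is unchanged
    end

    local kept = {}
    for _, node in ipairs(all) do
        local ok = true
        for key, accepted in pairs(rule) do
            local hit = false
            for _, value in ipairs(accepted) do
                if node.metadata and node.metadata[key] == value then
                    hit = true
                    break
                end
            end
            if not hit then
                ok = false
                break
            end
        end
        if ok then
            kept[#kept + 1] = node
        end
    end
    return kept
end

-- Same service name, different per-upstream arguments.
local canary = nodes("service_a", {metadata_match = {lane = {"canary"}}})
local everything = nodes("service_a", nil)
local missing, err = nodes("service_b")
print(#canary, canary[1].host, #everything, missing, err)
```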

jizhuozhi · 2025-07-25T06:44:10Z

Hello @Baoyuantop , PTAL, thanks :)

We also have Spring Cloud applications with Eureka, but I don't have time to write test cases right now, so I will create a new PR for Eureka later.

Baoyuantop · 2025-07-25T07:08:50Z

Hi @jizhuozhi, we are still discussing whether to accept the feature of this PR, and we need to reach a consensus before we can start the review. Since there is no separate issue to discuss this issue, you need to clearly tell the maintainer what this feature does and why it is needed in the PR description (the current description already exists).

> The examples in the PR description are just examples, because this allows testing without the registry, and we don't need to care about service discovery or static nodes.

This is inappropriate and you need to replace it with an example from a real scenario. The current example will confuse other maintainers. In the latest changes, I see that you have cancelled the upstream related code. The current PR seems to focus on the filtering of consul services. Please update the PR description to reflect this. Thanks again for your contribution.

jizhuozhi · 2025-07-25T07:20:51Z

> This is inappropriate and you need to replace it with an example from a real scenario. The current example will confuse other maintainers.

Thank you for the reminder. The PR description has been updated.

Baoyuantop · 2025-07-28T06:45:56Z

Please fix the failed CI.

docs/zh/latest/discovery/consul.md

apisix/discovery/consul/init.lua

apisix/utils/discovery.lua

t/discovery/consul2.t

SkyeYoung · 2025-07-31T07:49:50Z

t/discovery/consul_dump.t

@@ -95,7 +95,7 @@ discovery:
 --- request
 GET /t
 --- response_body
-{"service_a":[{"host":"127.0.0.1","port":30511,"weight":1}],"service_b":[{"host":"127.0.0.1","port":8002,"weight":1}]}
+{"service_a":[{"host":"127.0.0.1","metadata":{"service_a_version":"4.0"},"port":30511,"weight":1}],"service_b":[{"host":"127.0.0.1","metadata":{"service_b_version":"4.1"},"port":8002,"weight":1}]}


Why does it affect this place? I don't think this test should be changed.

jizhuozhi · 2025-08-04

Since we now rely on metadata, we need to include it when persisting nodes.

SkyeYoung · 2025-08-04

@jizhuozhi OK. Another question: with this added, do we now lack a test case for the "no metadata" situation?

jizhuozhi · 2025-08-04

I will add it :)

jizhuozhi · 2025-08-04T04:24:59Z

> @jizhuozhi
>
> Please try not to use force-push. It forces me to review the entire PR from the beginning, rather than just the parts you've changed since my last review.
>
> Thanks 😸

Thanks for the reminder, I will pay attention to it. The earlier problem was caused by committing from different environments (I test on EC2 and code on my office computer), which resulted in multiple different, invalid committers. This will not happen again.

SkyeYoung

Others LGTM

Pls complete these two things, and then we can ask other maintainers to review it:

- add a "no metadata" test case
- fix CI

apisix/utils/discovery.lua

SkyeYoung

LGTM

Copilot

Pull Request Overview

This PR introduces metadata-based node filtering for Consul service discovery in APISIX, enabling users to route traffic to specific service instance subsets based on metadata criteria. This supports use cases like canary releases and swimlane routing.

Key changes include:

Adding metadata support to Consul discovery by including Service.Meta in node definitions
Implementing a metadata filtering mechanism through discovery_args.metadata_match configuration
Creating shared utility functions for metadata matching across discovery types

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.

File	Description
apisix/discovery/consul/init.lua	Core implementation adding metadata support and filtering to Consul discovery
apisix/utils/discovery.lua	New utility module providing metadata matching functions
apisix/schema_def.lua	Schema definition for metadata_match configuration in discovery_args
docs/en/latest/discovery/consul.md	Documentation for the new metadata filtering feature
t/discovery/consul.t	Test case verifying metadata filtering functionality
t/discovery/consul_dump.t	Updated test expectations to include metadata in responses
t/discovery/consul2.t	Updated test expectations to include metadata in responses

docs/en/latest/discovery/consul.md

jizhuozhi · 2025-08-06T07:39:25Z

Hello @SkyeYoung. Please help retry the CI; the failure seems to be unrelated to this change.

membphis · 2025-08-08T01:45:22Z

apisix/discovery/consul/init.lua

+                    local metadata = service.Meta
+                    -- ensure that metadata is an accessible table,
+                    -- avoid userdata likes `null` returned by cjson
+                    if type(metadata) ~= "table" then


Please take a look: skip the current service if the metadata is invalid.

if type(metadata) == "cdata" then
    metadata = nil
elseif type(metadata) ~= "table" then
    core.log.error("wrong meta data, ...", ...)
    goto CONTINUE
end

Hello @membphis. I will add the cdata check and the error log, but I am not sure whether it is appropriate to skip this node.

With a container template release, skipping would drop every container released in that batch, even though the validity of the metadata only matters to users who actually filter on it. Skipping the nodes outright would hurt service quality, so I prefer to keep these nodes, log the problem, and leave their metadata empty.
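
The "keep the node, log, leave metadata empty" option argued for above could look roughly like the following. This is a sketch under assumptions, not the merged code: `normalize_metadata` and the injected `log` callback are hypothetical names, and treating `userdata`/`cdata` values as a decoded JSON null follows the remarks in this thread.

```lua
-- Hypothetical helper: returns a usable metadata table or nil, never an
-- error, so the caller can keep the node even when its metadata is bad.
-- `log` is any function that takes a message string.
local function normalize_metadata(metadata, log)
    if metadata == nil then
        return nil
    end
    local kind = type(metadata)
    if kind == "table" then
        return metadata
    end
    -- A JSON null may decode to a non-table sentinel (userdata or cdata,
    -- depending on the JSON library); treat it as "no metadata" silently.
    if kind ~= "userdata" and kind ~= "cdata" then
        log("invalid metadata of type " .. kind .. ", ignoring it")
    end
    return nil
end

-- Collect log messages in a table so the example is self-contained.
local messages = {}
local function log(msg)
    messages[#messages + 1] = msg
end

-- The node is kept; only its metadata is dropped.
local node = {host = "10.0.0.1", port = 8080, weight = 1}
node.metadata = normalize_metadata("not-a-table", log)
print(node.host, node.metadata, #messages)
```

With this choice, a node whose metadata is malformed still receives traffic on upstreams that configure no metadata_match, and is only excluded where a rule actually depends on the missing keys.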

SkyeYoung · 2025-08-11T08:13:24Z

@jizhuozhi Please fix the errors reported in the CI.

SkyeYoung

LGTM

dosubot bot added the size:L and enhancement labels Jul 20, 2025

jizhuozhi mentioned this pull request Jul 21, 2025

feat(nacos): add metadata filtering support to nacos discovery #12445

Open

5 tasks

moonming requested a review from Copilot July 21, 2025 08:31

jizhuozhi force-pushed the master branch from fa69d85 to 043128e Compare July 25, 2025 06:38

jizhuozhi changed the title ~~feat(upstream): filter nodes in upstream with metadata~~ feat(consul): filter nodes in upstream with metadata Jul 25, 2025

jizhuozhi mentioned this pull request Jul 26, 2025

feat: I hope we could filter nodes via metadata when config upstream #12464

Open

dosubot bot added the size:XL label and removed the size:L label Jul 27, 2025

jizhuozhi force-pushed the master branch from dd1fcfd to 70e3a82 Compare July 27, 2025 18:10

Baoyuantop added this to ⚡️ Apache APISIX Roadmap Jul 28, 2025

Baoyuantop moved this to 👀 In review in ⚡️ Apache APISIX Roadmap Jul 28, 2025

Baoyuantop assigned jizhuozhi Jul 28, 2025

SkyeYoung reviewed Jul 31, 2025

Baoyuantop added the wait for update label Aug 1, 2025

feat(consul): filter nodes in upstream with metadata

0e23e48

github-actions bot added the user responded label Aug 4, 2025

SkyeYoung reviewed Aug 4, 2025

SkyeYoung removed the user responded label Aug 4, 2025

jizhuozhi added 2 commits August 5, 2025 00:21

feat(consul): filter nodes in upstream with metadata

3867b84

feat(consul): filter nodes in upstream with metadata

2e7a2df

SkyeYoung added the user responded label and removed the wait for update label Aug 5, 2025

Baoyuantop requested a review from SkyeYoung August 5, 2025 10:17

SkyeYoung previously approved these changes Aug 6, 2025

SkyeYoung requested review from membphis, bzp2010, nic-6443, Baoyuantop, Revolyssup and Copilot August 6, 2025 01:10

Copilot AI reviewed Aug 6, 2025

feat(consul): filter nodes in upstream with metadata

9cd9451

jizhuozhi dismissed SkyeYoung’s stale review via 9cd9451 August 6, 2025 03:30

SkyeYoung previously approved these changes Aug 8, 2025

SkyeYoung requested a review from AlinsRan August 8, 2025 01:20

membphis reviewed Aug 8, 2025

feat(consul): filter nodes in upstream with metadata

dfd4ad2

jizhuozhi dismissed SkyeYoung’s stale review via dfd4ad2 August 10, 2025 17:27

jizhuozhi added 2 commits August 13, 2025 00:07

feat(consul): filter nodes in upstream with metadata

225cb4a

feat(consul): filter nodes in upstream with metadata

d0243d9

SkyeYoung approved these changes Aug 14, 2025

SkyeYoung requested a review from membphis August 14, 2025 06:30
