rewrite eradius#241

Open
RoadRunnr wants to merge 23 commits into master from
feature/next-major

Conversation

@RoadRunnr
Member

v3 rewrite of eradius, major changes:

  • clients and servers are now started and configured through APIs, not
    app env settings any more
  • IPv6 support
  • supports multiple server and client instances
  • metrics are optional and callback based (allows the easy use of other
    metrics frameworks)
  • distributed handlers are no longer supported; use erpc to replicate in
    use case specific code if needed.
  • removed proxy support (use freeradius or similar instead)

@RoadRunnr RoadRunnr force-pushed the feature/next-major branch from 0ab0e31 to e6f81cb Compare May 6, 2024 07:32
Comment thread README.md Outdated
Comment thread dicts_compiler.erl Outdated
Comment thread dicts_compiler.erl Outdated
@RoadRunnr RoadRunnr force-pushed the feature/next-major branch from e6f81cb to a2c9a75 Compare May 6, 2024 08:15
@0xAX
Member

0xAX commented May 6, 2024

removed proxy support (use freeradius or similar instead)

Was it approved for the use cases where it is used?

clients and servers are now started and configured through APIs, not
app env settings any more

What does this give us? Projects that use the old configuration will have to handle the previous eradius configuration themselves (or change or remove it completely, which is not always possible) and start servers and clients 'manually' through the API instead of having eradius do it automatically. What is the purpose? To be able to change things at runtime? The old version has a reconfigure API; why not add APIs or wrappers around set_env, such as add_server, remove_server and so on, instead of breaking the existing behaviour?

@RoadRunnr RoadRunnr force-pushed the feature/next-major branch 3 times, most recently from 022331f to dc3bc8e Compare May 6, 2024 10:20
@RoadRunnr
Member Author

removed proxy support (use freeradius or similar instead)

Was it approved for the use cases where it is used?

There is no need to move existing use cases to the new code. If something uses proxying, it can continue to use the old version.
However, I would strongly recommend moving those projects to something else.

clients and servers are now started and configured through APIs, not
app env settings any more

What does this give us? Projects that use the old configuration will have to handle the previous eradius configuration themselves (or change or remove it completely, which is not always possible) and start servers and clients 'manually' through the API instead of having eradius do it automatically. What is the purpose? To be able to change things at runtime? The old version has a reconfigure API; why not add APIs or wrappers around set_env, such as add_server, remove_server and so on, instead of breaking the existing behaviour?

Again, existing projects can stay with the version they are using. Nothing forces them to move.
The changes in configuration are just too big to provide sensible compatibility mappers. It is saner for projects that want to use the new version to migrate to the new API.

@RoadRunnr RoadRunnr force-pushed the feature/next-major branch 2 times, most recently from ec43b3c to 5a4066d Compare May 7, 2024 15:41
@RoadRunnr RoadRunnr marked this pull request as ready for review May 8, 2024 07:02
@RoadRunnr RoadRunnr requested a review from a team as a code owner May 8, 2024 07:02
ebengt
ebengt previously approved these changes May 8, 2024

@ebengt ebengt left a comment

looks good

Comment thread dicts_compiler.erl Outdated
Comment thread src/eradius_client_socket.erl Outdated
Comment thread src/eradius_client_socket.erl
Comment thread src/eradius_server.erl
Comment thread src/eradius_server.erl
RoadRunnr added 12 commits April 2, 2026 07:36
Separate concerns better. Socket managing is not in the mngr module,
while the sending logic is in the client module.
There was no logic in place to remove entries from the pending
registry once the timeout hits. The client side timeout would only
handle re-transmission, but not the cleanup.

Move the timeout logic to the sender and let it also cleanup the
pending registry.
Fully prepare client suite for full IPv6 support and add some minor
tweaks to the others to move them closer to IPv6 support.
v3 rewrite of eradius, major changes:
* clients and servers are now started and configured through APIs, not
  app env settings any more
* IPv6 support
* supports multiple server and client instances
* metrics are optional and callback based (allows the easy use of other
  metrics frameworks)
* distributed handlers are no longer supported; use erpc to replicate in
  use case specific code if needed.
* removed proxy support (use freeradius or similar instead)
…ent field

- packet/1: pre-generate random authenticator for 'request' cmd before
  encoding attributes, so that scramble encryption (User-Password) and
  Message-Authenticator HMAC computation both use the same authenticator
  value that will be written into the packet header

- encode_body/2 for 'request': use pre-generated authenticator from packet/1
  if present, instead of always generating a new one

- encode_message_authenticator/2: fix typo 'reqid' -> 'req_id' in pattern
  match, so Message-Authenticator HMAC is actually computed when msg_hmac=true

- request/4: make 'client' field optional in NAS map (fall back to <<>>),
  since NAS maps often only carry 'secret' and omitting 'client' caused a
  function_clause crash on the server side when decoding incoming packets
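
The reason the authenticator has to be generated before the attributes are encoded can be shown with a small sketch. This is not the eradius implementation: the module name `msg_auth_sketch` and the function `access_request/3` are hypothetical, and `Attrs` is assumed to be an already encoded attribute binary. It follows the general RADIUS rule that Message-Authenticator (attribute type 80) is an HMAC-MD5 over the whole packet, including the Request Authenticator, with the attribute value zeroed during the computation.

```erlang
-module(msg_auth_sketch).
-export([access_request/3]).

%% Build an Access-Request (code 1) whose Message-Authenticator is computed
%% over the same random authenticator that ends up in the packet header.
%% Hypothetical helper for illustration; Attrs is an encoded attribute binary.
-spec access_request(0..255, binary(), binary()) -> binary().
access_request(ReqId, Secret, Attrs) ->
    %% Generate the authenticator once, up front.
    Authenticator = crypto:strong_rand_bytes(16),
    %% 20 byte header + existing attributes + 18 byte Message-Authenticator.
    Len = 20 + byte_size(Attrs) + 18,
    Head = <<1, ReqId, Len:16, Authenticator/binary, Attrs/binary>>,
    %% HMAC-MD5 over the packet with the attribute value set to 16 zero bytes.
    Mac = crypto:mac(hmac, md5, Secret, <<Head/binary, 80, 18, 0:128>>),
    <<Head/binary, 80, 18, Mac/binary>>.
```

If a second authenticator were generated after the HMAC had been computed, the receiver would recompute the HMAC over a different header and reject the packet, which is the failure mode the first two bullets describe.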
- handle_call({reconfigure, ...}): also update #state.servers when
  reconfiguring, so that new servers are actually used for sending requests
  (previously only config was updated, leaving servers empty → no_active_servers)

- start_client/1: return the client manager pid instead of the supervisor pid,
  so callers can pass it directly to eradius_client:send_request/3,4
  (previously the supervisor pid was returned, causing send_request to crash
  with {wanna_send,...} arriving at the supervisor gen_server)
RoadRunnr and others added 11 commits April 2, 2026 07:37
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix `secret := binary` typo to `secret := binary()` in server_opts()
  and server() types — `binary` without parens is the atom, not the type
- Add missing `inet_backend => inet | socket` to client_opts() type,
  which is already used by eradius_client_socket
- Fix test secrets from "secret" (string) to <<"secret">> (binary)
  to match the server_opts() contract

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
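
The difference between `binary` and `binary()` in a type position can be illustrated with a minimal sketch. The module `opts_type_sketch`, the trimmed-down `server_opts()` type, and `valid_secret/1` are hypothetical and only mirror the shape described in the commit message above; the real eradius types carry more keys.

```erlang
-module(opts_type_sketch).
-export([valid_secret/1]).
-export_type([server_opts/0]).

%% Hypothetical, reduced options type for illustration only.
%% binary() is the built-in type of all binaries. Writing a bare
%% `binary` here would instead denote the single atom 'binary'.
-type server_opts() :: #{secret := binary()}.

%% Runtime check matching the type above: the secret must be a binary,
%% so the string "secret" (a list) is rejected.
-spec valid_secret(map()) -> boolean().
valid_secret(#{secret := Secret}) -> is_binary(Secret);
valid_secret(_) -> false.
```

With this check, `#{secret => <<"secret">>}` passes while `#{secret => "secret"}` does not, which is why the test secrets had to change from strings to binaries.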
- Change GH workflow from `rebar3 as dialyzer do dialyzer` to
  `rebar3 as dialyzer dialyzer` to match the documented command
- Restore `{plt_apps, all_deps}` to dialyzer config — was dropped
  during the rewrite, required for full dependency PLT coverage

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove unused `mode` binding from the udp handle_info clause.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Convert all %% @doc / %% @moduledoc EDoc comments to OTP 27 native
-doc / -moduledoc attributes throughout the codebase. Remove
-feature(maybe_expr, enable) directives now that maybe is standard.

Drop support for OTP releases prior to 27.3. Update minimum_otp_vsn
in rebar.config, the CI matrix, and the README version support section.

Add comprehensive inline documentation to all public API modules:
eradius, eradius_server, eradius_client, eradius_client_mngr, and
eradius_req. Document the handler callback, all public functions,
and key types. Add module-level usage examples.

Expand README with a v3 quickstart covering server implementation,
client setup, and failover, and a "Why a major, breaking rewrite?"
section explaining the rationale for the API incompatibility.

Update ex_doc configuration with groups_for_modules for cleaner
navigation in generated docs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
prometheus 6.x returns label values as proplists with atom keys
(e.g. {server_name, good}) instead of string keys ({"server_name", good})
used in 4.x. Update all check_metric/check_metric_multi calls to use
atom-keyed label filters.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OTP 28 deprecated try...catch clauses that omit the exception class
(which silently defaulted to catching only throw); OTP 29 removes them.

- eradius_eap_packet: `catch _` → `catch _:_` to catch all classes
- eradius_client_socket: `{nodedown, _}` → `exit:{nodedown, _}` to
  correctly match the exit exception thrown by gen_server:call on node down

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace `case catch l2i(Id) of` with `try l2i(Id) of ... catch error:_ ->`
as the standalone `catch Expr` expression is deprecated in OTP 28 and
removed in OTP 29. `list_to_integer` raises error:badarg on failure,
so the explicit `error:_` class is correct.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
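
The migration away from `case catch Expr of` described in the commit messages above can be sketched as follows. The module `catch_migration_sketch` and the function `parse_id/1` are hypothetical stand-ins, not code from the PR; `list_to_integer/1` is used directly in place of the `l2i` helper named in the commit.

```erlang
-module(catch_migration_sketch).
-export([parse_id/1]).

%% Hypothetical helper: parse a decimal string and return a tagged result
%% instead of relying on the legacy `case catch Expr of` idiom.
-spec parse_id(string()) -> {ok, integer()} | {error, badarg}.
parse_id(Id) ->
    try list_to_integer(Id) of
        N -> {ok, N}
    catch
        %% list_to_integer/1 raises error:badarg on malformed input.
        %% Naming the class explicitly matters: a clause without a class
        %% would only match exceptions of class `throw`.
        error:badarg -> {error, badarg}
    end.
```

The explicit class also avoids swallowing unrelated `exit` or `throw` exceptions, which the old `catch Expr` form would have turned into ordinary return values.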
prometheus 6.1.2 uses the plain `catch Expr` form in
prometheus_quantile_summary.erl which OTP 29 made a hard error.
Add a rebar3 override to pass nowarn_deprecated_catch when compiling
prometheus until upstream ships a fix.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

This PR is a v3 ground-up rewrite of eradius that replaces v2’s application-environment-driven configuration with explicit APIs for starting/configuring clients and servers, adds IPv6 support, supports multiple concurrent client/server instances, and restructures metrics/logging integration.

Changes:

  • Introduces API-driven server startup (eradius:start_server/3,4) and new supervision layout for dynamic server/client instances.
  • Reworks client/server request representation around eradius_req:req() (maps), updating core logic and test suites accordingly.
  • Removes legacy features (proxy, distributed handlers, counter aggregator) and updates documentation/CI for OTP 27.3+ and ex_doc.

Reviewed changes

Copilot reviewed 44 out of 52 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| `test/eradius_test.hrl` | Test macro formatting adjustments. |
| `test/eradius_test_lib.erl` | Adds test helpers for IPv4/IPv6 loopback and inet-family selection. |
| `test/eradius_test_handler.erl` | Updates test handler to new client/server APIs and map-based requests. |
| `test/eradius_proxy_SUITE.erl` | Removes proxy test suite (proxy support removed). |
| `test/eradius_metrics_SUITE.erl` | Updates metrics suite to callback-based Prometheus metrics and new request type. |
| `test/eradius_logtest.erl` | Removes legacy log test module. |
| `test/eradius_lib_SUITE.erl` | Migrates tests from record-based APIs to eradius_req APIs + stdlib asserts. |
| `test/eradius_config_SUITE.erl` | Removes config suite tied to env-driven server configuration (deprecated model). |
| `test/eradius_client_SUITE.erl` | Updates client suite for multiple families/backends and new client manager API. |
| `test/eradius_client_socket_test.erl` | Removes legacy socket test helper (socket subsystem refactored). |
| `test/eradius_auth_SUITE.erl` | Formatting-only adjustments in auth suite list layout. |
| `src/eradius.erl` | Adds public API for starting servers; removes env/config_change driven behavior. |
| `src/eradius.app.src` | Updates registered processes list and app metadata formatting. |
| `src/eradius_sup.erl` | Replaces old server/client supervision tree with new server + client top supervisors. |
| `src/eradius_server_top_sup.erl` | Removes obsolete top supervisor layer. |
| `src/eradius_server_sup.erl` | Simplifies server supervisor interface for starting instances via child specs. |
| `src/eradius_server_mon.erl` | Removes env-based server/NAS registry/manager. |
| `src/eradius_proxy.erl` | Removes built-in proxy implementation. |
| `src/eradius_node_mon.erl` | Removes distributed handler/node monitoring subsystem. |
| `src/eradius_log.erl` | Replaces file-based logger gen_server with helper functions for logger metadata + formatting. |
| `src/eradius_internal.hrl` | Adds internal defaults for retries/timeout. |
| `src/eradius_eap_packet.erl` | Refactors EAP registry API + encoding/decoding paths and doc attributes. |
| `src/eradius_dict.erl` | Modernizes module docs/types and table load/unload structure. |
| `src/eradius_counter.erl` | Removes legacy counter subsystem. |
| `src/eradius_counter_aggregator.erl` | Removes counter aggregation across nodes. |
| `src/eradius_config.erl` | Removes env-driven configuration validation module. |
| `src/eradius_client_top_sup.erl` | Adds supervisor for dynamically started client instances. |
| `src/eradius_client_sup.erl` | Adds per-client supervision tree (socket sup + client manager). |
| `src/eradius_client_socket.erl` | Refactors client socket worker to map-based config and request tracking. |
| `src/eradius_client_socket_sup.erl` | Adds socket supervisor for per-client socket workers. |
| `src/eradius_auth.erl` | Migrates auth helpers to map-based request type and adds docs/xref/dialyzer annotations. |
| `sample/src/eradius_server_sample.erl` | Minor formatting update in sample benchmark loop. |
| `sample/src/eradius_server_sample.app.src` | Formatting-only changes. |
| `sample/rebar.config` | Formatting-only changes. |
| `sample/example.config` | Formatting-only changes (note: still uses legacy env-style config). |
| `rebar.config` | Raises OTP baseline to 27.3, adds ex_doc/dialyzer config, updates deps and xref checks. |
| `README.md` | Rewrites docs for v3 API model, removes v2 config/proxy guidance, updates OTP support text. |
| `priv/dev.config` | Formatting-only changes to dev sys.config content. |
| `include/eradius_lib.hrl` | Updates type alias and removes legacy record definitions. |
| `include/eradius_dict.hrl` | Formatting-only changes to record declarations. |
| `dicts_compiler.erl` | Style/robustness tweaks (comments, try/catch cleanup, formatting). |
| `applications/eradius_prometheus_collector/src/eradius_prometheus_collector.erl` | Removes legacy embedded collector app (metrics now callback-based). |
| `applications/eradius_prometheus_collector/src/eradius_prometheus_collector.app.src` | Removes collector application spec. |
| `applications/eradius_prometheus_collector/rebar.config` | Removes collector app rebar config. |
| `.github/workflows/main.yml` | Updates CI matrix to OTP 27.3/28/29, adds dialyzer and artifact upload. |
| `.github/workflows/hex.yaml` | Switches Hex publishing doc generation from edoc to ex_doc. |

uses: actions/upload-artifact@v4
with:
name: test-output-${{ matrix.otp }}
path: ct-logs-${{ matrix.otp }}.tar.xz
Copilot AI Apr 2, 2026

The compressed test logs file is created as ct-logs-${{ matrix.otp }}.tar.zst but the upload-artifact step references ct-logs-${{ matrix.otp }}.tar.xz, so the artifact upload will fail. Update the path to match the produced filename (or change the compression command/extension).

Suggested change
- path: ct-logs-${{ matrix.otp }}.tar.xz
+ path: ct-logs-${{ matrix.otp }}.tar.zst

Copilot uses AI. Check for mistakes.
Comment on lines +66 to 77
-doc "decode a EPA message".
decode(<<Code:8, Id:8, Len:16, Rest/binary>>) ->
    DataLen = Len - 4,
    case Rest of
        <<Data:DataLen/bytes, _/binary>> ->
            do_decode_payload(code(Code), Id, Data);
        _ ->
            {error, invalid_length}
    end.

-doc "endecode a EPA message".
encode(Code, Id, Msg) ->
Copilot AI Apr 2, 2026

Docstrings say "EPA" and "endecode"; these look like typos for "EAP" and "encode". Correcting them will improve generated docs/searchability.

Comment thread src/eradius_dict.erl
terminate(_Reason, _State) -> ok.

%% @private
code_change(_OldVsn, _NewVsn, _State) -> {ok, state}.
Copilot AI Apr 2, 2026

code_change/3 should return {ok, State} (and its second argument is the current State per the gen_server callback contract). Returning {ok, state} (an atom) will corrupt the server state during upgrade and likely crash on the next call.

Suggested change
- code_change(_OldVsn, _NewVsn, _State) -> {ok, state}.
+ code_change(_OldVsn, State, _Extra) -> {ok, State}.

end;

handle_call(_Request, _From, State) ->
{noreply, State}.
Copilot AI Apr 2, 2026

For unexpected gen_server:call/2 messages, handle_call/3 returns {noreply, State} without ever replying, so the caller blocks until the call times out (5 seconds by default for gen_server:call/2) and then exits with a timeout. Return {reply, {error, bad_call}, State} (or similar) for unknown calls.

Suggested change
- {noreply, State}.
+ {reply, {error, bad_call}, State}.
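
Both gen_server review comments above (the `code_change/3` return value and the catch-all `handle_call/3` clause) can be illustrated in one minimal sketch. The module `gen_server_sketch` and its `ping` call are hypothetical and unrelated to the eradius modules under review; only the standard gen_server callback contract is assumed.

```erlang
-module(gen_server_sketch).
-behaviour(gen_server).
-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2, code_change/3]).

start_link() -> gen_server:start_link(?MODULE, [], []).

init([]) -> {ok, #{}}.

handle_call(ping, _From, State) ->
    {reply, pong, State};
handle_call(_Request, _From, State) ->
    %% Always answer unknown calls so the caller is not left waiting
    %% for its call timeout.
    {reply, {error, bad_call}, State}.

handle_cast(_Msg, State) -> {noreply, State}.

%% The state is the second argument; it is returned unchanged here.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

Calling `gen_server:call(Pid, unknown)` against this server returns `{error, bad_call}` immediately instead of exiting the caller after the timeout.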

Comment on lines +199 to +202
P, A) ->
{ok, PA} = parse_ip(A),
test(PA == NA, "Adress not configured") and
case NA of
OA ->
{_, Rest} = split(NP, array:to_list(OS)),
test(P == NP,"Ports not configured") and
test(maps:fold( fun(_Peer, {NextPortIdx, _NextReqId}, Akk) ->
Akk and (NextPortIdx =< NP)
end, true, NC), "Invalid port counter") and
test(getSocketCount() =< NP, "Sockets not closed") and
test(array:size(NS) =< NP, "Socket array not resized") and
test(lists:all(fun(Pid) -> testSocket(Pid) end, Rest), "Sockets still available");
_ ->
test(array:size(NS) == 0, "Socket array not cleaned") and
test(getSocketCount() == 0, "Sockets not closed") and
test(lists:all(fun(Pid) -> testSocket(Pid) end, array:to_list(OS)), "Sockets still available")
end.
case NA of
Copilot AI Apr 2, 2026

Typo in assertion message: "Adress not configured" -> "Address not configured".

Comment thread src/eradius.app.src
]},
{maintainers, ["Andreas Schultz", "Vladimir Tarasenko", "Yury Gargay"]},
{licenses, ["MIT"]},
{links, [{"Github", "https://github.com/travelping/eradius"}]},
Copilot AI Apr 2, 2026

The link label uses "Github"; the proper capitalization is "GitHub".

Suggested change
- {links, [{"Github", "https://github.com/travelping/eradius"}]},
+ {links, [{"GitHub", "https://github.com/travelping/eradius"}]},

Comment thread src/eradius_log.erl
Comment on lines +64 to +66
line(#{cmd := Cmd, req_id := ReqId, server_addr := {IP, Port}} = Req) ->
StatusType = format_acct_status_type(Req),
io_lib:format("~s:~p [~p]: ~s ~s", [inet:ntoa(IP), Port, ReqId, format_cmd(Cmd), StatusType]).
Copilot AI Apr 2, 2026

line/1 formats the prefix as <Client-IP>:<Client-Port>, but it uses server_addr (the listening server address per eradius_req docs) rather than client_addr. This will log the wrong endpoint for server-side requests. Consider using client_addr when present (server-side), and falling back to server_addr (client-side) if client_addr is missing.

Comment thread src/eradius_log.erl
Comment on lines +101 to +102
format_sender(#{req_id := ReqId, server_addr := {IP, Port}}) ->
<<(format_ip(IP))/binary, $:, (i2b(Port))/binary, " [", (i2b(ReqId))/binary, $]>>.
Copilot AI Apr 2, 2026

format_sender/1 uses server_addr, which means format_req/1 will log the server listening address instead of the peer address for server-side requests. If the intention is to log the packet sender, prefer client_addr when available (and fall back to server_addr for client-side requests).
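
The fallback suggested in the two eradius_log comments above could look roughly like this. It is only a sketch: the module `sender_addr_sketch` and the function `sender_addr/1` are hypothetical, and it assumes that `client_addr`, when present in the request map, holds an `{IP, Port}` tuple like `server_addr` does.

```erlang
-module(sender_addr_sketch).
-export([sender_addr/1]).

%% Prefer the peer address when the request map carries one (server side);
%% otherwise fall back to the server address (client side).
%% Clause order matters: the first clause wins when both keys exist.
-spec sender_addr(map()) -> {inet:ip_address(), inet:port_number()}.
sender_addr(#{client_addr := Addr}) -> Addr;
sender_addr(#{server_addr := Addr}) -> Addr.
```

Both `line/1` and `format_sender/1` could then format the tuple returned by such a helper instead of matching on `server_addr` directly.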

4 participants