Skip to content

Conversation

pyropy
Copy link
Contributor

@pyropy pyropy commented Aug 19, 2025

Add support for retrieval of pieces from PDP data sets that don't have withCDN flag enabled. Changes do not include testing of such deals, it only extends the support with ability to do so.

Related issue https://www.notion.so/filecoindev/Deal-bot-24cdc41950c1809484d7f2741cdddf53?source=copy_link

@pyropy pyropy self-assigned this Aug 19, 2025
@pyropy pyropy requested a review from Copilot August 19, 2025 09:50
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Contributor

@bajtos bajtos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it difficult to review this pull request when we don't know the high-level design for checking the retrievability of PDP deals.

For example, depending on that design, we may not need to support all three possible values for the withCDN filter. Maybe all we need is two modes:

  1. Sample all PDP deals & retrieve directly from SP
  2. Sample CDN deals only & retrieve via FilCDN

Then we need to consider how to report the outcomes. Currently, filcdn-bot is highly tailored to our needs:

  1. Produce synthetic traffic to populate our dashboard.
  2. Alert us when data is not retrievable (but ignore failures caused by SPs.

What outputs does FilOz need from the checker testing direct retrievals of all PDP deals?

@pyropy
Copy link
Contributor Author

pyropy commented Aug 21, 2025

I find it difficult to review this pull request when we don't know the high-level design for checking the retrievability of PDP deals.

In the linked document it's stated that "For non-CDN proofsets, run retrieval checks at a lower frequency (e.g., every hour at random)". We're already doing a similar thing where we're testing a random deals (only in our case we're doing it every second, but that's adjustable).

https://github.com/filcdn/bot/blob/a064225ba5da95752c9cc84abfbd648a6b50b677/bin/bot.js#L40-L52

However this PR does not introduce testing of non-CDN deals, it only enables the support for it.

For example, depending on that design, we may not need to support all three possible values for the withCDN filter. Maybe all we need is two modes:

  1. Sample all PDP deals & retrieve directly from SP
  2. Sample CDN deals only & retrieve via FilCDN

This update introduces support for two (similar) modes:

  1. Sample CDN disabled PDP deals & retrieve directly from SP
  2. Sample CDN enabled PDP deals only & retrieve via FilCDN

However, we could change the fist model to sample for first random PDP deal if withCDN flag is set to false.

Then we need to consider how to report the outcomes. Currently, filcdn-bot is highly tailored to our needs:

  1. Produce synthetic traffic to populate our dashboard.
  2. Alert us when data is not retrievable (but ignore failures caused by SPs.

At the moment of publishing this PR it was unclear who would operate this bot and collect data for non-CDN deals. Now when we know that FilCDN will continue to operate the bot in extended capacity we'd need to discuss on how we should collect the stats. This could be also done is some follow-up PRs.

What outputs does FilOz need from the checker testing direct retrievals of all PDP deals?

@jennijuju would you be able to comment on this?

proofSetId: setId,
})
if (!baseUrl) {
console.error(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
console.error(
throw new Error(

Otherwise withCDN=false tests can silently fail

index.js Outdated
if (withCDN) {
baseUrl = `https://${clientAddress}.${CDN_HOSTNAME}`
} else {
baseUrl = await maybeGetResolvedProofSetRetrievalUrl({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This repo needs to be updated to new naming (data set, fwss, piece, etc)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will take care of this now

@bajtos
Copy link
Contributor

bajtos commented Sep 11, 2025

I propose to close this PR as no longer relevant.

See https://github.com/FilOzone/dealbot

CDN Performance: A/B test results comparing CDN vs non-CDN deals

@pyropy
Copy link
Contributor Author

pyropy commented Sep 11, 2025

I propose to close this PR as no longer relevant.

See https://github.com/FilOzone/dealbot

CDN Performance: A/B test results comparing CDN vs non-CDN deals

@bajtos You're right, the deal bot now has retrieval component built-in. Closing in favor of https://github.com/FilOzone/dealbot.

@pyropy pyropy closed this Sep 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants