add infrastructure-alerting blog post #2912

Merged
merged 6 commits into from
Jul 8, 2025

Conversation

@jcudit (Contributor) commented Jun 20, 2025

Congratulations! You've made it this far! Once merged, the article will appear at https://huggingface.co/blog. Official articles
require additional reviews. Alternatively, you can write a community article following the process here.

Preparing the Article

You're not quite done yet, though. Please make sure to follow this process (as documented here):

  • Add an entry to _blog.yml.
  • Add a thumbnail. There are no requirements here, but there is a template if it's helpful.
  • Check you use a short title and blog path.
  • Upload any additional assets (such as images) to the Documentation Images repo. This is to reduce bloat in the GitHub base repo when cloning and pulling. Try to have small images to avoid a slow or expensive user experience.
  • Add metadata (such as authors) to your md file. You can also specify guest or org for the authors.
  • Ensure the publication date is correct.
  • Preview the content. A quick way is to paste the markdown content in https://huggingface.co/new-blog. Do not click publish, this is just a way to do an early check.

Here is an example of a complete PR: #2382

Getting a Review

Please make sure to get a review from someone on your team or a co-author.
Once this is done and once all the steps above are completed, you should be able to merge.
There is no need for additional reviews if you and your co-authors are happy and meet all of the above.

Feel free to add @pcuenca as a reviewer if you want a final check. Keep in mind he'll be biased toward light reviews
(e.g., check for proper metadata) rather than content reviews unless explicitly asked.

@jcudit requested review from XciD and Michellehbn on June 20, 2025 16:27
@merveenoyan (Contributor) left a comment

I think it's pretty cool! I don't have any nits. You might want to direct readers toward an action you want them to take at the end of the blog, imo.

@pcuenca (Member) left a comment

Light review, checked format and structure, approving to unblock.

_blog.yml Outdated
author: jcudit
thumbnail: /blog/assets/infrastructure-alerting/thumbnail.png
date: June 25, 2025
Member

Reminder to update


The Infrastructure team at Hugging Face is excited to share a behind-the-scenes look at the inner workings of Hugging Face's production infrastructure, which we’ve had the privilege of helping to build and maintain. Our team's dedication to designing and implementing a robust monitoring and alerting system has been instrumental in ensuring the stability and scalability of our platforms. We’re constantly reminded of the impact that our alerts have on our ability to identify and respond to potential issues before they become major incidents.

In this blog post, we’ll dive into the details of three mighty alerts that play their unique role in supporting our production infrastructure, and explore how they've helped us maintain the high level of performance and uptime that our community relies on.
Member

Maybe it would be cool to mention a couple of nice metrics.

jcudit and others added 2 commits July 7, 2025 09:56
Co-authored-by: Pedro Cuenca <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>
@XciD (Member) left a comment

LGTM

@jcudit merged commit 0ba9557 into main on Jul 8, 2025
1 check passed
@jcudit deleted the infra-alerting branch on July 8, 2025 13:18
5 participants