
Conversation

@Aaraviitkgp commented Nov 20, 2025

Enhanced cache resolution in cached_files() to properly locate models downloaded in a subprocess when loading offline.

Changes:

  • Check multiple cache directories (HF_HOME, TRANSFORMERS_CACHE, HF_HUB_CACHE)
  • Search snapshot directories when refs are missing
  • Return cached files early to avoid network access
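
Roughly, the lookup described above would fall back through cache locations like this (a hypothetical sketch; find_cached_file and the exact resolution logic in cached_files() are illustrative, not the PR's actual code):

# Hypothetical sketch of the fallback described above; the real helper
# names and resolution logic in cached_files() differ.
import os
from pathlib import Path

def find_cached_file(repo_id, filename):
    # Candidate hub caches: HF_HUB_CACHE and TRANSFORMERS_CACHE point at the
    # hub cache directly; HF_HOME points at its parent (default layout assumed).
    candidates = [os.environ.get("HF_HUB_CACHE"), os.environ.get("TRANSFORMERS_CACHE")]
    if os.environ.get("HF_HOME"):
        candidates.append(os.path.join(os.environ["HF_HOME"], "hub"))
    repo_dir = "models--" + repo_id.replace("/", "--")
    for cache in filter(None, candidates):
        # When refs/ is missing, scan the snapshot directories directly.
        for snapshot in Path(cache, repo_dir, "snapshots").glob("*"):
            candidate = snapshot / filename
            if candidate.is_file():
                return str(candidate)  # return early: no network access needed
    return None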

Tests:

  • test_subprocess_warm_cache_then_offline_load
  • test_pipeline_offline_after_subprocess_warm
  • Both tests verify no network access with socket blocking

@Wauplin @LysandreJik

@Wauplin (Contributor) commented Nov 21, 2025

Hi @Aaraviitkgp, maintainer of the underlying huggingface_hub library here. IMO this issue is more of a misunderstanding over how to enable offline mode than a bug in transformers. If a socket connection is attempted, it means that transformers/huggingface_hub did not know about HF_HUB_OFFLINE being switched to 1. This is because the HF_HUB_OFFLINE environment variable is evaluated once at import time. If you really want to update it at runtime, you need to patch huggingface_hub.constants.HF_HUB_OFFLINE.
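
For illustration, a minimal sketch of such a runtime patch, assuming that patching the huggingface_hub constant is sufficient (transformers may keep its own copy of the flag depending on version, so setting the env var before import remains the robust option):

# Minimal sketch (not an existing API): toggle offline mode at runtime by
# patching the already-evaluated constant.
import huggingface_hub.constants

def enable_offline_mode():
    huggingface_hub.constants.HF_HUB_OFFLINE = True

def disable_offline_mode():
    huggingface_hub.constants.HF_HUB_OFFLINE = False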

We can think about adding helpers like enable_offline_mode/disable_offline_mode to properly manage that if needed. But IMO there shouldn't be any transformers-side code changes.

(unless I missed something here...)

@Wauplin (Contributor) commented Nov 21, 2025

I've just tried running the scripts from test_subprocess_warm_cache_then_offline_load locally (with the latest huggingface_hub version + transformers from the main branch):

1.py

# 1.py
import os

os.environ["HF_HOME"] = "./tmp_cache_dr"  # to adapt

from transformers import AutoConfig, AutoModel, AutoTokenizer

config = AutoConfig.from_pretrained("hf-internal-testing/tiny-random-bert")
model = AutoModel.from_pretrained("hf-internal-testing/tiny-random-bert")
tokenizer = AutoTokenizer.from_pretrained("hf-internal-testing/tiny-random-bert")
print("CACHE_WARMED")
2.py
# 2.py
import os
os.environ["HF_HOME"] = "./tmp_cache_dr" # to adapt
os.environ["HF_HUB_OFFLINE"] = "1"

# Import transformers first
# Then block sockets to ensure no network access
import socket
from transformers import AutoConfig, AutoModel, AutoTokenizer

original_socket = socket.socket

def guarded_socket(*args, **kwargs):
    raise RuntimeError("Network access attempted in offline mode!")

socket.socket = guarded_socket

try:
    config = AutoConfig.from_pretrained("hf-internal-testing/tiny-random-bert")
    model = AutoModel.from_pretrained("hf-internal-testing/tiny-random-bert")
    tokenizer = AutoTokenizer.from_pretrained("hf-internal-testing/tiny-random-bert")
    print("OFFLINE_SUCCESS")
except RuntimeError as e:
    if "Network access" in str(e):
        print(f"NETWORK_ATTEMPTED: {e}")
        exit(1)
    raise
except Exception as e:
    print(f"FAILED: {e}")
    import traceback

    traceback.print_exc()
    exit(1)

And both succeeded for me, without the need for this PR:

➜ python 1.py         
model.safetensors: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 520k/520k [00:00<00:00, 7.33MB/s]
Loading weights: 100%|██████████████████████████████████████████████████████████████████████████████████| 87/87 [00:00<00:00, 6066.68it/s, Materializing param=pooler.dense.weight]
BertModel LOAD REPORT from: (...)
tokenizer_config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 321/321 [00:00<00:00, 2.87MB/s]
vocab.txt: 4.68kB [00:00, 649kB/s]
tokenizer.json: 12.9kB [00:00, 34.3MB/s]
special_tokens_map.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 112/112 [00:00<00:00, 799kB/s]
CACHE_WARMED
➜ python 2.py 
Loading weights: 100%|██████████████████████████████████████████████████████████████████████████████████| 87/87 [00:00<00:00, 7663.97it/s, Materializing param=pooler.dense.weight]
BertModel LOAD REPORT from: (...)
OFFLINE_SUCCESS

Could you try it on your side as well and let me know if you spot any issue?

@Aaraviitkgp (Author) commented

@Wauplin You are right about that, I just went through the code and realized it was a bit of a misunderstanding 😁, but we could pivot this PR to add helpers like enable_offline_mode/disable_offline_mode to properly manage that if needed.

@Aaraviitkgp (Author) commented

> Could you try it on your side as well and let me know if you spot any issue?

It works for me as well.

@Wauplin (Contributor) commented Nov 21, 2025

@Aaraviitkgp I'm open to reviewing a PR for that, yes, but first I'd like to understand the need. In which case would you want a script to download files and only then switch on offline mode? In general it's more robust not to change env variables / environment behavior in the middle of a process.

@Aaraviitkgp (Author) commented

@Wauplin I think you are right; it is not ideal to change an env variable in the middle of a process. I think I will close this PR.

@fr1ll commented Nov 22, 2025

@Wauplin thanks for explaining this, and @Aaraviitkgp thanks for looking into this.

> the HF_HUB_OFFLINE environment variable is evaluated once at import time. If you really want to update it at runtime, you need to patch huggingface_hub.constants.HF_HUB_OFFLINE

tl;dr: When possible, set the environment variable before importing transformers.
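
For example, reusing the tiny test model from the scripts above:

import os
os.environ["HF_HUB_OFFLINE"] = "1"  # must be set BEFORE the import below

# The flag is read once when transformers/huggingface_hub are imported.
from transformers import AutoModel

model = AutoModel.from_pretrained("hf-internal-testing/tiny-random-bert")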

I will update / close my issue #42197. (I honestly thought I had set the variable before the import; not sure how I missed testing this.)

Also, there's a related issue #42269 and PR #42318 that can affect these kinds of "load while online, then access while offline" use cases. The local_files_only argument already gives a way in some settings to prevent Hub access at runtime; it just appears from that issue that it needs to be fixed for pipeline. A usage example follows below.
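
For reference, the per-call form looks like this (same tiny test model as above):

from transformers import AutoModel

# local_files_only=True resolves everything from the local cache and raises
# an error instead of contacting the Hub when a file is missing.
model = AutoModel.from_pretrained(
    "hf-internal-testing/tiny-random-bert", local_files_only=True
)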

@Aaraviitkgp deleted the fix-offline-cache-clean branch November 23, 2025 19:04
@Wauplin (Contributor) commented Nov 24, 2025

Thanks for flagging, @fr1ll. The pipeline not respecting local_files_only is indeed a bug.
