
fix: migration overwrites stale profile cookies with fresh login data #247

Merged
teng-lin merged 2 commits into teng-lin:main from LittleBitPlanet:fix/migration-stale-overwrite
May 13, 2026

Conversation

Contributor

@LittleBitPlanet LittleBitPlanet commented Apr 5, 2026

Summary

  • Migration from legacy flat layout to profiles/default/ skipped copying storage_state.json when the destination already existed — regardless of whether the source was newer
  • This caused a recurring auth failure cycle: login writes fresh cookies to the legacy root path, migration deletes the fresh file but keeps the stale profile copy, auth fails
  • Fix: compare st_mtime before skipping — if the legacy root file is newer, overwrite the profile copy

Root Cause

_legacy_fallback() in paths.py resolves get_storage_path() to the root ~/.notebooklm/storage_state.json when it exists (for backwards compat). So login writes there. But ensure_profiles_dir() runs on every CLI invocation and triggers migrate_to_profiles() whenever legacy files exist at root. The migration copied root → profile on first run, but on subsequent runs it saw the profile copy already existed and skipped the copy — then deleted the (newer) root file anyway.

The original comment even said "skip if destination already exists and is newer" but the and is newer part was never implemented in the condition.
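The skip condition described above can be sketched as follows. This is a minimal illustration and not the project's actual code: `migrate_file` is a hypothetical helper standing in for the per-file step of `migrate_to_profiles()`, and the assumption that the legacy file is removed in both branches is taken from the description in this PR.

```python
import os
import shutil
import tempfile
from pathlib import Path


def migrate_file(src: Path, dst: Path) -> bool:
    """Copy src over dst unless dst is the same age or newer; always remove src.

    Hypothetical helper for illustration. Returns True when a copy happened.
    """
    if dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime:
        copied = False  # profile copy is the same age or newer: keep it
    else:
        shutil.copy2(src, dst)  # copy2 also carries over the source mtime
        copied = True
    src.unlink()  # assumption: the legacy file is removed in both branches
    return copied


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    profile_dir = root / "profiles" / "default"
    profile_dir.mkdir(parents=True)

    legacy = root / "storage_state.json"
    profile = profile_dir / "storage_state.json"

    # Stale profile copy, then a newer legacy file written by a later login.
    profile.write_text('{"cookies": ["old"]}')
    os.utime(profile, (1_000_000, 1_000_000))
    legacy.write_text('{"cookies": ["fresh"]}')
    os.utime(legacy, (2_000_000, 2_000_000))

    overwritten = migrate_file(legacy, profile)
    result = profile.read_text()
    legacy_removed = not legacy.exists()
```

With the pre-fix condition (`dst.exists()` alone), the same setup would leave the old cookies in place while still deleting the fresh legacy file.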

Test plan

  • Verify ruff format, ruff check, mypy pass (confirmed locally)
  • Scenario: fresh login → next CLI command → auth still works (fresh cookies preserved in profile)
  • Scenario: profile copy is already up-to-date → migration skips correctly (no unnecessary overwrites)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes
    • Profile migration now compares file modification times and only replaces profile data when the source is newer, avoiding unintended overwrites.
  • Tests
    • Added regression tests to verify timestamp-based precedence during migration (overwrite when legacy is newer; keep when profile is newer).



coderabbitai Bot commented Apr 5, 2026

📝 Walkthrough

Walkthrough

Migration now compares source and destination modification times when copying legacy files into profiles/default, overwriting the profile only if the legacy file is newer and skipping (with updated log) when the profile is the same age or newer. Two tests cover both precedence cases.

Changes

Migration timestamp precedence

| Layer / File(s) | Summary |
| --- | --- |
| Compare mtimes and copy/skip logic (src/notebooklm/migration.py) | The per-file migration loop now compares st_mtime and copies legacy files into profiles/default only when the source is newer; the skip debug message was updated to state the profile copy is the same age or newer. |
| Regression tests for mtimes (tests/unit/test_migration.py) | Added two tests: one verifies overwriting a stale profiles/default/storage_state.json when legacy is newer; the other verifies keeping a newer profile when legacy is older. Both assert the legacy file is removed. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hopped through timestamps, soft and spry,
I watched the newer file outshine the dry.
If legacy's fresh, the profile will yield,
Else the newer stay firm upon the field.
Hooray for tidy migration, carrot sealed! 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: migration now overwrites stale profile files with fresh data by comparing modification times, which directly addresses the auth failure issue caused by stale cookies. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates the migration logic in src/notebooklm/migration.py to overwrite destination files if the source file is newer, ensuring that fresh data written to legacy paths is correctly migrated. I have no feedback to provide.


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
src/notebooklm/migration.py (1)

82-87: Please add explicit tests for both mtime branches.

The new condition introduces two critical paths (dst >= src skip, src > dst overwrite) that are not directly asserted in current migration tests. Add focused cases to prevent regressions in this auth-critical flow.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/notebooklm/migration.py` around lines 82 - 87, Add two focused unit tests
that exercise the file-copy mtime branches in migration.py's loop over
legacy_files: (1) create a legacy source file and a destination file where
dst.stat().st_mtime >= src.stat().st_mtime, run the migration routine that
iterates legacy_files, and assert the destination file was not overwritten and
the logger emitted the "Skipping %s (profile copy is same age or newer)"
message; (2) create a legacy source file and a destination file where
src.stat().st_mtime > dst.stat().st_mtime, run the same migration routine, and
assert the destination was overwritten (content changed) by the copy. Use
tmp_path (or tempfile) and os.utime to set mtimes deterministically, locate the
files used by the migration via the same path logic that computes dst =
default_dir / src.name, and verify behavior for the legacy_files -> dst copy
branch governed by the dst.exists() and mtime comparison (dst.stat().st_mtime >=
src.stat().st_mtime).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dfdd0822-04f3-48a2-bc81-3cb3288a81f2

📥 Commits

Reviewing files that changed from the base of the PR and between abeae92 and 72f8fb2.

📒 Files selected for processing (1)
  • src/notebooklm/migration.py

Owner

@teng-lin teng-lin left a comment


Thanks for this fix, @LittleBitPlanet! The root cause analysis in the PR description is excellent — the comment/code mismatch where "skip if destination already exists and is newer" was never actually implemented is a great catch.

The fix itself is correct and minimal: >= with shutil.copy2's mtime preservation gives you idempotency for free. All 17 CI checks pass and the code reads cleanly.

Multi-model review summary

I ran this through several review passes (including error-handling, test-coverage, and type-design analysis). Here's the consensus:

✅ Strengths

  • Fix implements what the original comment always intended — clean, minimal 4-line diff
  • >= comparison correctly handles idempotency via shutil.copy2's mtime preservation
  • Excellent PR description with clear root cause analysis
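The idempotency point can be checked in isolation. The sketch below is illustrative only and the file names are made up. It relies on the documented behavior of `shutil.copy2`, which attempts to preserve file metadata including the modification time, so a repeated comparison sees equal mtimes and `>=` skips where a strict `>` would copy again.

```python
import os
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "storage_state.json"
    dst = Path(tmp) / "profile_copy.json"  # hypothetical destination name
    src.write_text("{}")
    os.utime(src, (1_000_000, 1_000_000))  # pin the source mtime

    shutil.copy2(src, dst)  # copies content and tries to preserve the mtime

    src_mtime = src.stat().st_mtime
    dst_mtime = dst.stat().st_mtime

    # With >=, an immediate rerun treats the destination as up to date.
    would_skip = dst_mtime >= src_mtime
    # With a strict >, the same rerun would copy the file again.
    strict_would_skip = dst_mtime > src_mtime
```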

📝 Recommendation: add mtime tests before merge

The PR changes a branch condition but doesn't add a test that would fail if the change were reverted. Two focused tests using os.utime would lock in the fix:

def test_overwrites_when_source_is_newer(self, tmp_path):
    """Source file newer than profile copy triggers overwrite (the bug fix)."""
    default_dir = tmp_path / "profiles" / "default"
    default_dir.mkdir(parents=True)
    dst = default_dir / "storage_state.json"
    dst.write_text('{"cookies": ["old"]}')
    os.utime(dst, (1_000_000, 1_000_000))

    src = tmp_path / "storage_state.json"
    src.write_text('{"cookies": ["fresh"]}')
    os.utime(src, (2_000_000, 2_000_000))

    with patch.dict(os.environ, {"NOTEBOOKLM_HOME": str(tmp_path)}, clear=True):
        migrate_to_profiles()
    assert json.loads(dst.read_text()) == {"cookies": ["fresh"]}


def test_skips_when_destination_is_newer(self, tmp_path):
    """Profile copy newer than legacy source is preserved."""
    default_dir = tmp_path / "profiles" / "default"
    default_dir.mkdir(parents=True)
    src = tmp_path / "storage_state.json"
    src.write_text('{"cookies": ["stale"]}')
    os.utime(src, (1_000_000, 1_000_000))

    dst = default_dir / "storage_state.json"
    dst.write_text('{"cookies": ["current"]}')
    os.utime(dst, (2_000_000, 2_000_000))

    with patch.dict(os.environ, {"NOTEBOOKLM_HOME": str(tmp_path)}, clear=True):
        migrate_to_profiles()
    assert json.loads(dst.read_text()) == {"cookies": ["current"]}

💡 Minor observations (not blocking)

  1. Directory migration inconsistency — Lines 94-101 still use the old if dst.exists(): skip pattern for browser_profile/. If browser profile data can be regenerated at the legacy root after a previous migration, the same stale-overwrite issue could apply. Might be worth a follow-up issue.

  2. FAT32 mtime granularity — On FAT32/exFAT, mtime has 2-second precision, so a login within the same 2-second window as a previous copy could produce a false tie. Extremely unlikely in practice but a brief code comment would be nice.
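The false tie mentioned in the second observation can be illustrated with plain arithmetic. This is a simulation under an assumption, not filesystem code: `coarse_mtime` is a hypothetical helper that rounds timestamps down to a 2-second boundary, and real rounding behavior depends on the operating system and filesystem driver.

```python
def coarse_mtime(seconds: float) -> float:
    """Round a timestamp down to a 2-second boundary (simulated coarse mtime)."""
    return float(int(seconds) // 2 * 2)


profile_copied_at = 1_000_000.2  # profile copy written first
fresh_login_at = 1_000_001.7     # legacy file written 1.5 s later

# With full precision the profile is older, so the legacy file would win.
exact_skip = profile_copied_at >= fresh_login_at

# With 2-second granularity both round to the same value, so >= skips the copy.
coarse_skip = coarse_mtime(profile_copied_at) >= coarse_mtime(fresh_login_at)
```

Here `exact_skip` is `False` while `coarse_skip` is `True`, which is the tie the comment describes.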

Overall this is a solid, well-motivated fix. Just the two tests to add and it's good to go. 🎉

🤖 Generated with Claude Code

Owner

teng-lin commented Apr 5, 2026

Follow-up: deeper investigation

After a more thorough investigation of the code flow, I want to flag some concerns for @teng-lin's consideration.

The described scenario may be unreachable

The PR describes a "recurring auth failure cycle" where login writes fresh cookies to the legacy root path via _legacy_fallback(), then migration deletes the fresh file. However, tracing the actual code paths:

  1. _legacy_fallback() (paths.py:232) returns the root path only when not profile_path.exists() and resolved_profile == "default" and the root path exists
  2. After a successful migration, the root file is deleted and the profile file exists
  3. get_storage_path() → _legacy_fallback() → profile path exists → returns profile path
  4. login (session.py:210, 330) writes to get_storage_path() → writes to profile path, not root
  5. I checked all write paths (context.storage_state() in Playwright login, storage_path.write_text() in --browser-cookies); all go through get_storage_path() → _legacy_fallback() → profile path when it exists

There is no code path in this codebase that recreates storage_state.json at the root after a successful migration. For the described scenario to occur, both root and profile copies would need to exist simultaneously with root being newer — but login always writes to wherever get_storage_path() points, which is the profile path once it exists.
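The resolution order traced in the steps above can be sketched like this. It is a reading of the conditions quoted in this comment, not the contents of paths.py: `resolve_storage_path` and its `home` parameter are hypothetical names introduced for illustration.

```python
import tempfile
from pathlib import Path


def resolve_storage_path(home: Path, profile: str = "default") -> Path:
    """Hypothetical stand-in for get_storage_path() with the legacy fallback."""
    profile_path = home / "profiles" / profile / "storage_state.json"
    legacy_path = home / "storage_state.json"
    # Fall back to the root file only for the default profile, and only
    # while the profile copy does not exist yet.
    if not profile_path.exists() and profile == "default" and legacy_path.exists():
        return legacy_path
    return profile_path


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    legacy = home / "storage_state.json"
    profile = home / "profiles" / "default" / "storage_state.json"

    legacy.write_text("{}")
    before_migration = resolve_storage_path(home)  # only the root file exists

    profile.parent.mkdir(parents=True)
    profile.write_text("{}")
    with_both = resolve_storage_path(home)  # both exist: profile path wins

    legacy.unlink()
    after_migration = resolve_storage_path(home)  # root removed: profile path
```

Under these assumed conditions, a login that writes to the resolved path stops targeting the root file as soon as the profile copy exists, which is the basis for the reachability concern raised here.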

The only ways I can see this triggering are:

  • Manual/external creation of ~/.notebooklm/storage_state.json by a user or external tool
  • A crash during migration (but then both files have identical content from shutil.copy2)

The fix itself is correct but the narrative may be overstated

The code change is harmless and technically sound — implementing the mtime check that the original comment always described. The >= comparison with shutil.copy2's mtime preservation is correct for idempotency. It makes the migration more robust against edge cases involving external file manipulation.

However, the "recurring auth failure cycle" framing suggests a critical production bug, when the actual impact appears limited to scenarios involving external file creation outside this tool's control.

Contributor investigation

The contributor account (created 2026-01-21) has no prior activity. The fork → branch → PR was completed in ~61 seconds. The commit is co-authored with "Claude Opus 4.6 (1M context)". This appears to be an AI-generated contribution.

Recommendation

The fix is correct and harmless — I'd still accept it (with the mtime tests I suggested earlier), but wanted to flag these findings for transparency. @teng-lin, you're the best judge of whether there's a scenario I'm missing.

🤖 Generated with Claude Code

@teng-lin teng-lin added the bot-generated Likely AI/bot-generated contribution label Apr 5, 2026
@teng-lin teng-lin added the bug Something isn't working label May 3, 2026
Owner

teng-lin commented May 3, 2026

Plausible bug fix and the change is small. Before merge, please add a regression test that exercises the migration path: a stale profile cookie should be overwritten by fresh login data. A unit test in tests/unit/ that drives the migration helper directly would be sufficient.

LittleBitPlanet and others added 2 commits May 12, 2026 21:26
The migration from legacy flat layout to profiles/ skipped copying when
the destination file already existed, regardless of timestamps. This
caused a recurring auth failure cycle:

1. `notebooklm login` writes fresh cookies to the legacy root path
   (because _legacy_fallback resolves there when the root file exists)
2. Next CLI command triggers ensure_profiles_dir() → migrate_to_profiles()
3. Migration sees profiles/default/storage_state.json already exists,
   skips the copy, then deletes the fresh root file
4. The stale profile copy (from a prior migration) is now the only auth
   source → auth fails

Fix: compare st_mtime before skipping. If the legacy root file is newer
than the profile copy, overwrite it. This matches the original comment
intent ("skip if destination already exists and is newer") which was
never implemented in the condition.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Cover both directions of the new st_mtime check:
- Stale profile copy is overwritten when legacy root is newer
- Up-to-date profile copy is preserved when legacy root is older

The first case is the regression target — verified to fail against the
pre-fix migrate_to_profiles.
@teng-lin teng-lin force-pushed the fix/migration-stale-overwrite branch from 72f8fb2 to 15a2c82 Compare May 13, 2026 01:31
@teng-lin
Owner

Took this over and added the regression tests. New commit 15a2c82:

  • test_overwrites_stale_profile_when_legacy_is_newer — newer legacy root file overwrites a stale profile copy (the regression target; verified to fail against the pre-fix migrate_to_profiles)
  • test_keeps_newer_profile_when_legacy_is_older — up-to-date profile copy is preserved when the legacy root is older

Branch was 131 commits behind main, so I rebased onto cff4250 first. Diff is unchanged from the original two-line fix plus the two new test cases.

Local pre-commit suite is green: ruff format --check, ruff check, mypy, and full pytest (2665 passed).


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
tests/unit/test_migration.py (1)

140-181: ⚡ Quick win

Add an equal-mtime regression test to lock in >= semantics.

These two tests cover newer/older precedence well, but they don’t exercise the “same timestamp” branch that the migration logic now relies on for idempotency with shutil.copy2. A small third test with equal mtimes would prevent regressions in that exact contract.
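A self-contained sketch of the equal-mtime case might look like the following. It does not call the project's `migrate_to_profiles()`: `migrate_one` is a hypothetical stand-in that follows the `>=` rule described in this PR, and `check_keeps_profile_when_mtimes_equal` is an illustrative name, so a real test would target the actual function instead.

```python
import os
import shutil
import tempfile
from pathlib import Path


def migrate_one(src: Path, dst: Path) -> None:
    """Hypothetical stand-in for the per-file migration step."""
    if not (dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime):
        shutil.copy2(src, dst)
    src.unlink()  # assumption: the legacy file is removed either way


def check_keeps_profile_when_mtimes_equal(home: Path) -> None:
    default_dir = home / "profiles" / "default"
    default_dir.mkdir(parents=True)
    dst = default_dir / "storage_state.json"
    src = home / "storage_state.json"

    dst.write_text('{"cookies": ["profile"]}')
    src.write_text('{"cookies": ["legacy"]}')
    # Identical mtimes exercise the tie branch of the >= comparison.
    os.utime(dst, (1_000_000, 1_000_000))
    os.utime(src, (1_000_000, 1_000_000))

    migrate_one(src, dst)

    assert dst.read_text() == '{"cookies": ["profile"]}'
    assert not src.exists()


with tempfile.TemporaryDirectory() as tmp:
    check_keeps_profile_when_mtimes_equal(Path(tmp))
    equal_mtime_case_passed = True
```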

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/test_migration.py` around lines 140 - 181, Add a third unit test
to assert the migration's "equal mtime" branch preserves the profile copy
(locking in >= semantics): create a default profile file "storage_state.json"
and a legacy "storage_state.json" with identical mtimes, call
migrate_to_profiles(), then assert the profile file still contains the profile
data and the legacy file was removed; name the test something like
test_keeps_profile_when_mtimes_equal and reference migrate_to_profiles to locate
the migration logic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 615acd16-1fa8-4cb4-b04c-a323b901f94d

📥 Commits

Reviewing files that changed from the base of the PR and between 72f8fb2 and 15a2c82.

📒 Files selected for processing (2)
  • src/notebooklm/migration.py
  • tests/unit/test_migration.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/notebooklm/migration.py

@teng-lin teng-lin merged commit 8a57a0e into teng-lin:main May 13, 2026
19 checks passed