savetheclocktower
Contributor

This is one of those PRs that may sit here for a while without much attention. But that’s OK, because the alternative is that it sits on my hard drive or in one of my Git stashes.

In theory, we have a pluggable file-watching architecture that handles some of our file-watching tasks. The original PR that added it to Atom explained that various packages were using various file-watching libraries and probably duplicating efforts, so it made sense for it to be part of Atom’s API offering.

nsfw was picked to back the initial implementation, but Atom then forked nsfw and, in parallel, was working on its own @atom/watcher library. (Not to be confused with node-pathwatcher, which dates back to the early Atom days and is more challenging to remove from our codebase than asbestos is from buildings.)

We don’t want to maintain our own file-watcher library, but neither do we want to be beholden to someone else’s library. It’s good to have options. A while back I poked around to see how VS Code was handling file watching, and it turns out they’re using @parcel/watcher. It’s built by the Parcel folks and offers a similar feature set to nsfw.

I have weird ideas of “fun”

I gave myself an evening to integrate this into Pulsar as an alternative to nsfw. At first, this PR was designed to point to the master branch — @parcel/watcher claims to support Node versions all the way back to v10 — but it failed during the electron-rebuild stage. That suggests to me that it installs fine on Node 16 (the version I run in my pulsar folder) but not on Node 14 (the major version against which it builds when it does electron-rebuild targeting v12.2.3), hence its engines field may be inaccurate.

That was the first hiccup.

I wrote a wrapper around @parcel/watcher that was nearly identical to the one around nsfw. All worked great until I needed to reload the window; when I did, I got a renderer process crash on a consistent basis. Sentry’s stack trace points to the new library but gives me a pretty generic error:

Screenshot 2025-06-26 at 12 58 05 AM

Pretty much all crashes in N-API are due to thread-safe functions, so that’s not much help! I wonder if it’s not context-aware in practice; N-API gives you the framework to be context-aware, but you still have to do the hard work of not accidentally sharing things between contexts that should not be shared.

I vaguely recalled that VS Code ran their file-watcher in a separate process, so I thought I’d give that a shot. It went great!

Task: the underappreciated core class

We’ve got a class whose purpose is to simplify the job of spawning another process to run a worker. It’s called Task and, though there aren’t many usages within the Pulsar codebase, it was easier to figure out how to use it than to start from scratch. I recalled some of the design decisions we made while building linter-eslint-node and that made things go even faster.

The short version:

  • One process is all you need. No matter how many different folders you want to monitor for changes, you can hold all of their file-watchers in one worker process.
  • It’s the same sort of bi-directional communication you might be accustomed to if you’ve ever worked with WebSockets. In other words: sometimes you send a message to a worker and you want to wait around for the reply, much like an HTTP model. And other times, the worker wants to push stuff to you whenever it arrives — like filesystem events.
  • Ultimately, even though it’s a lot more code than NSFWNativeWatcher, it’s not bad at all. It means one more process per window taking up space in your task manager, but that process uses hardly any CPU and is basically hibernating most of the time.
  • Like most file-watchers (other than nsfw), @parcel/watcher doesn’t have explicit support for renaming as a filesystem event. But since it delivers events in batches, it’s not hard to envision that we could detect renames by looking for two files in the same batch of events — one with type deleted and the other with type created.
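
The rename idea in the last bullet could look something like the sketch below. It is a heuristic, not something this PR implements. It assumes each batch is an array of `{ type, path }` objects whose `type` is `'create'`, `'update'`, or `'delete'`, and it only pairs events when a batch holds exactly one deletion and one creation. `coalesceRenames` and the `'rename'` event shape are made-up names for illustration.

```javascript
// Heuristic sketch: treat a lone delete + create pair in one batch as a rename.
// The event shape ({ type, path }) is assumed; `coalesceRenames` is hypothetical.
function coalesceRenames(events) {
  const deletes = events.filter((e) => e.type === 'delete');
  const creates = events.filter((e) => e.type === 'create');

  // Only pair when the batch is unambiguous: exactly one of each.
  if (deletes.length !== 1 || creates.length !== 1) return events;

  const renamed = {
    type: 'rename',
    oldPath: deletes[0].path,
    path: creates[0].path
  };

  // Keep unrelated events (e.g. updates) and replace the pair with one rename.
  const rest = events.filter((e) => e !== deletes[0] && e !== creates[0]);
  return [...rest, renamed];
}
```

Anything more ambitious (several files moved at once, or a delete and an unrelated create landing in the same batch) would need extra evidence, such as matching file sizes, before it could be trusted.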

Why did I do this?

There’s someone on Discord who often experiences a Pulsar-related process using 100% of CPU (on a single core) while they’ve got it open. Louder fans, lower battery life, et cetera. The mystery goes pretty deep and we’re not sure exactly where the cost is being paid, but at this point we’ve got it narrowed down to file-watching.

This user is on Linux and often has a project window open to a path called /user/home — which, regardless of the particular Linux distro, has the potential to be a busy directory with lots of stuff in it.

On top of which: Linux uses inotify for filesystem watchers. Unlike the FSEvents API on macOS — which is basically a metadata firehose that can easily be filtered down to only the filesystem events you’re interested in — inotify is not recursive and can monitor one directory at a time. So to approximate a recursive watcher, nsfw crawls a directory tree and creates one inotify watcher for each directory.
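
To get a feel for what one-watch-per-directory means for a given project, a rough estimate is simply the number of directories under the root. The helper below is a sketch for that purpose only; `countDirectories` is a made-up name, it uses plain Node `fs` calls rather than anything from nsfw, and it makes no attempt to handle unreadable directories.

```javascript
const fs = require('fs');
const path = require('path');

// Count the directories under `root` (including `root` itself). With a
// one-watch-per-directory strategy, this approximates the number of inotify
// watches a recursive watcher on `root` would need.
function countDirectories(root) {
  let count = 1;
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    // Dirent#isDirectory() is false for symlinks, so links are not followed.
    if (entry.isDirectory()) {
      count += countDirectories(path.join(root, entry.name));
    }
  }
  return count;
}
```

On Linux, the result can be compared against the per-user limit in `/proc/sys/fs/inotify/max_user_watches`.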

By itself that isn’t enough to explain how a process ends up using 100% of CPU, but it’s a hard data point to ignore. I asked the same user to test what happens if they open a new Pulsar project in a new empty folder; the user says that they’re not able to reproduce the high CPU usage in this test.

Anyway, it felt like a good time to audition an alternative file-watching library, if only to keep our options open! In support situations like the one I just described, I don’t have much to offer for troubleshooting outside of “build Pulsar from scratch, generate a debug build of nsfw, and step through it in your favorite C/C++ debugger” — and that’s rough. It’d be great if I could say something like “try switching to a different filesystem watcher in your settings and see if that makes a difference.”

There are other thoughts I eventually want to capture about recursive vs. non-recursive watchers, but it’s late and I can do that tomorrow.

What should I do with this PR?

Play around with it a bit! There’s not much urgency, since this is feeling like a post-PulsarNext task. My eventual goal is to introduce this as a new option while keeping the same default, much like we did with the SQL state store vis-a-vis the IndexedDB state store. Folks could switch to it to try it out, and if it doesn’t work great, the worst that happens to them is that they’d have to switch back.

That reminds me: the setting to change is under Core -> File System Watcher in the settings GUI. There’s currently only one option (“native operating system APIs”), and the name is misleading, but I envision three options: NSFW, @parcel/watcher, and “default,” where you just let us pick one. Amazingly, you can change this setting on the fly in the middle of a session — the PathWatcher class handles all the work of tearing down the old watchers and creating new ones!

Screenshot 2025-06-26 at 1 23 43 AM

No tests or anything yet, so this is a draft to start out. I'll probably ignore it for a while now, but at least it's on public display.

@savetheclocktower added the pulsar-next label (Related to the version of Pulsar that runs on the latest Electron) on Jun 26, 2025
@mauricioszabo
Contributor

Ok, that is kind of amazing. One thing that I want is to check if we could, somehow, use the same API to remove pathwatcher too in the future (considering this code already does some interesting stuff with tasks).

I'll merge this branch into my own fork and try things out so I can report whether anything changes. (I do see some weirdness with Pulsar's "normal" file watcher, but I can't easily reproduce what's wrong.)

@asiloisad
Contributor

I also experienced some problems with file-watcher in pulsar-next (but it's still much better than in pulsar stable). I'd be happy to test the new option.

The first problem is that after switching to @parcel/watcher, stylesheet.less stopped working. It isn't read at startup, and later updates aren't picked up either.

@savetheclocktower
Contributor Author

I also experienced some problems with file-watcher in pulsar-next (but it's still much better than in pulsar stable). I'd be happy to test the new option.

The first problem is that after switching to @parcel/watcher, stylesheet.less stopped working. It isn't read at startup, and later updates aren't picked up either.

Do you mean styles.less, the user stylesheet? If so, that's odd; I would've thought we used pathwatcher for that instead of nsfw. But I'll take a look. Thanks!

@savetheclocktower
Contributor Author

Ok, that is kind of amazing. One thing that I want is to check if we could, somehow, use the same API to remove pathwatcher too in the future (considering this code already does some interesting stuff with tasks).

pathwatcher is doable, too; it'll just take longer and be more painful because it involves API changes.

There are lots of places in the codebase where there are subtle and annoying assumptions that, once I set up a watcher via pathwatcher, it's ready to watch on the very next line. No modern file-watchers work that way; the method that sets up the file-watching is always async. At some point last year I tried just wrapping pathwatcher’s API around watchPath, but the unit tests in particular started failing left and right.

The File and Directory classes might be the most annoying ones. When you do…

let file = new File(`/foo/bar/baz.txt`)
file.onDidChange(() => console.log('did change!'))

…it sets up a watcher. But which of those methods could we make async? Probably not the onDidChange; everyone would trip over that, since we have plenty of other onDidFoo methods in the codebase and all of them are simple synchronous methods that register event listeners. And definitely not the new File, since constructors can't go async!

So this would probably need to change to something like…

let file = new File(`/foo/bar/baz.txt`)
await file.watch(() => console.log('did change!'))

…and I suppose that's not that bad, once I type it out. Same with Directory — we could add an async watch method.

Of course, they can support multiple listeners, so we could instead do…

let file = new File(`/foo/bar/baz.txt`)
file.onDidChange(() => console.log('did change!'))
await file.watch()

…and the file watcher function just emits did-change when it detects filesystem events.

Only trouble is that this will probably break some code in subtle ways, since File and Directory can be imported into packages and used as-is; and we can't change the API too much, or else the existing usages will break entirely. So if we do this version, we'd want:

let file = new File(`/foo/bar/baz.txt`)
file.onDidChange(() => console.log('did change!')) // <-- still sets up the file watcher…
await file.watch() // <-- and this method merely waits until the file-watcher promise resolves

That's probably good enough.
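
To make that last variant concrete, here is a small self-contained sketch. It is not the real File class: `WatchableFile` is a made-up name, and `startWatcher` is a hypothetical stand-in for whatever async call sets up the underlying watcher and resolves once it is ready.

```javascript
// Hypothetical sketch; this is not the real File class from Pulsar.
// `startWatcher(path, onEvent)` is assumed to return a promise that resolves
// once the underlying watcher is ready.
class WatchableFile {
  constructor(filePath, startWatcher) {
    this.path = filePath;
    this.startWatcher = startWatcher;
    this.listeners = new Set();
    this.watcherPromise = null;
  }

  // Still synchronous, and still kicks off the watcher as a side effect.
  onDidChange(callback) {
    this.listeners.add(callback);
    // Swallow errors here so a failed start is not an unhandled rejection;
    // callers that care can still observe the failure through watch().
    this.watch().catch(() => {});
    return { dispose: () => this.listeners.delete(callback) };
  }

  // Starts the watcher if needed; otherwise hands back the pending (or
  // already settled) promise so callers can wait for readiness.
  watch() {
    if (!this.watcherPromise) {
      this.watcherPromise = this.startWatcher(this.path, () => {
        for (const listener of this.listeners) listener();
      });
    }
    return this.watcherPromise;
  }
}
```

Existing callers that only use onDidChange keep working unchanged, while new code can add `await file.watch()` when it needs to know the watcher is actually live.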

So the solution is to change the expectations of everything that still uses pathwatcher by converting it to a modern async API with as much backward compatibility as possible. Once that happens, we can swap in whatever we want in its place.

@savetheclocktower changed the title from "Experiment: Add @pulsar/watcher as a file-watcher" to "Experiment: Add @parcel/watcher as a file-watcher" on Jun 28, 2025
@savetheclocktower
Contributor Author

OK, made a dumb typo. @asiloisad, the stylesheet issue should be fixed now. (Surprised it wasn't broken even more than it was!) @mauricioszabo, if you were testing on this branch, please grab latest and disregard any prior findings.

@asiloisad
Contributor

asiloisad commented Jun 30, 2025

styles.less is working fine now, and I'm going to keep testing. There is one more inconvenience: a console window pops up and instantly closes after editor startup or after changing the setting to @parcel/watcher. Win10x64

PulsarNext_7SNnaiXlGg.mp4

image

* The `nsfw` watcher is now also running in a separate process.
* Both watchers now inherit `core.ignoredNames` in an effort to limit the cost of recursive watchers on Linux.
* Watchers update automatically when `core.ignoredNames` changes.
@savetheclocktower
Contributor Author

A console window pops up and instantly closes after editor startup or after changing the setting to @parcel/watcher. Win10x64

Oof! Not my favorite side effect. @asiloisad, if you get a chance, could you install Watchman and see if that changes this symptom?

@savetheclocktower
Contributor Author

OK, this PR is updated to reflect further experimentation.

As I ranted about in Discord and in #1306, both of these watcher implementations make it easy to exhaust a limited number of allowed inotify watches on Linux simply through adding a watcher to one or two directories. This happens because directory watchers in both these libraries are automatically recursive.

The least intrusive way to fix this is to apply the user's core.ignoredNames setting when creating the watcher. In both libraries, as far as I can tell, this helps by preventing the watcher from spidering its way through a bunch of descendant paths and adding an inotify watch to each one indiscriminately. It should skip directories like .git and node_modules, for instance. (.git is part of the default list for core.ignoredNames, but node_modules is not; we might want to consider adding it as a default.)

It was hard to make this work in nsfw because, frustratingly, it doesn't allow you to specify ignored paths via glob. Instead you've got to specify every single folder you want to exclude as an absolute path. Libraries like glob and fast-glob exist primarily for this use case (turning globs into lists of actual paths) but I wanted something that would stop searching below any path that matched an exclusion, since it'd just be useless work. So I built something with fdir and micromatch.
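
The "stop searching below a match" behavior is the important part. The sketch below illustrates it with plain Node `fs` calls and exact-name matching; it is not the fdir and micromatch implementation from this PR, and `findExcludedDirectories` is a made-up name.

```javascript
const fs = require('fs');
const path = require('path');

// Collect absolute paths of directories whose name is in `ignoredNames`,
// without descending into a directory once it has matched.
// Exact-name matching stands in for real glob matching here.
function findExcludedDirectories(root, ignoredNames) {
  const ignored = new Set(ignoredNames);
  const excluded = [];
  const pending = [root];

  while (pending.length > 0) {
    const dir = pending.pop();
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      if (!entry.isDirectory()) continue;
      const fullPath = path.join(dir, entry.name);
      if (ignored.has(entry.name)) {
        excluded.push(fullPath); // match: record it and stop searching below
      } else {
        pending.push(fullPath);
      }
    }
  }
  return excluded.sort();
}
```

Because a matched node_modules is never entered, nested node_modules folders inside it cost nothing to skip and never show up in the list.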

It's costly to turn globs into actual file paths! And, as you'd expect, it's more costly when the directory tree is deeper. My “typical” use case test was the pulsar codebase itself (with node_modules added to core.ignoredNames); my nightmare scenario use case test was the .pulsar-next folder (with node_modules and compile_cache added to core.ignoredNames). (@mauricioszabo fixed this in #1305, but I haven't updated this branch yet with that fix because it's handy for testing. :))

fdir can go async — and strongly urges that you use it that way! — but generating all the path exclusions for .pulsar-next took about 4.5 seconds on the renderer process. Strangely, if I went synchronous (against fdir’s recommendations), it took only ~75ms. But that's still far too long to monopolize the renderer process.

So I did what I've already done for @parcel/watcher: I moved nsfw usage into a worker. This freed me to choose the fastest fdir approach without locking up the renderer process. Yet, now that it's in a worker, it turns out that the async approach is a bit faster! The two use cases together took ~330ms to generate their path exclusions, which is nearly half the time it took when they ran synchronously one after the other. (The earlier tests were done in mainline Pulsar, but I switched to PulsarNext when I migrated nsfw to a worker.)

For @parcel/watcher, it's much easier, since it can accept an ignore option that is treated as an array of globs. It handles the glob-matching logic in its native code. Simple tests confirm that it does not respond to touch ./node_modules/foo.txt run from my Pulsar root directory, but does respond to touch foo.txt.
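
For illustration, bare names from a setting like core.ignoredNames could be expanded into depth-independent globs along these lines. `ignoredNamesToGlobs` is a made-up helper, and whether a watcher needs both the `**/name` and `**/name/**` forms is an assumption of this sketch rather than something confirmed here.

```javascript
// Turn bare names (as found in a setting like core.ignoredNames) into globs
// that match the name at any depth, plus everything beneath it.
// Whether a given watcher needs both forms is an assumption of this sketch.
function ignoredNamesToGlobs(ignoredNames) {
  const globs = new Set();
  for (const name of ignoredNames) {
    if (!name) continue;
    if (name.includes('/')) {
      // Already looks like a path or path-style glob; pass it through as-is.
      globs.add(name);
      continue;
    }
    globs.add(`**/${name}`);
    globs.add(`**/${name}/**`);
  }
  return [...globs];
}
```

The resulting array is the kind of value that would be handed to the watcher as its `ignore` option when subscribing to a directory.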

The watchers themselves should also update their lists of exclusions whenever core.ignoredNames is changed, no matter which implementation is used.

I also tweaked the description in config-schema.js to conform to what I had in mind — rather than native vs. emulated, the new choices offered would be all the file-watcher implementations plus “let Pulsar pick.”

@mauricioszabo and @asiloisad, I'm also curious about how the presence of Watchman affects perceived performance and other side effects. @parcel/watcher’s README makes it sound like it's vastly superior, though it may have the biggest benefit when it comes to querying historical changes (something we don't need).

In Linux's case, I wonder whether there's somehow a further reduction in inotify watches used when Watchman is running compared to when it isn't.

In Windows's case, I wonder whether the symptom @asiloisad described above (brief appearance of a terminal window) is something inherent to @parcel/watcher or something inherent to spawning a background task for file-watching (in which case this change would mean it'd start happening no matter which file-watcher you've chosen). I strongly suspect it's the latter, and if I'm right, that would suck righteously and need to be addressed somehow.

@savetheclocktower
Contributor Author

The hypothetical visitor to this PR (someone who wasn't in Discord for the discussion) may wonder why I haven't pursued using a project's .gitignore (or other VCS ignore files) for watchPath. After all, node_modules is nearly always specified in a .gitignore file, so if we incorporated VCS exclusions we wouldn't need to add node_modules to core.ignoredNames.

Here's why:

Right now, there isn't a 1:1 correlation between usages of watchPath (each of which creates an instance of a PathWatcher class) and “native” watchers created by either of our implementations (each of which is an instance of a subclass of NativeWatcher). Pulsar tries very hard to keep the PathWatcher:NativeWatcher ratio below 1 by reusing an existing watcher if possible (like if the existing watcher is an ancestor of the one being added) or moving an existing watcher up one level (if the existing watcher path is a sibling path of the new one we're trying to watch).
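
A heavily simplified picture of the reuse logic is sketched below. It is not the real NativeWatcherRegistry: `TinyWatcherRegistry` and `createNativeWatcher` are made-up names, and only the "reuse an ancestor" case is covered, not the "move an existing watcher up one level" case.

```javascript
const path = require('path');

// True when `ancestor` is `descendant` itself or one of its parent directories.
function isSameOrAncestor(ancestor, descendant) {
  const relative = path.relative(ancestor, descendant);
  if (relative === '') return true;
  return relative.split(path.sep)[0] !== '..' && !path.isAbsolute(relative);
}

// Hypothetical, much-simplified stand-in for the real registry.
class TinyWatcherRegistry {
  constructor(createNativeWatcher) {
    this.createNativeWatcher = createNativeWatcher;
    this.watchers = new Map(); // watched directory -> native watcher
  }

  // Return an existing watcher that already covers `requestedPath`,
  // or create (and remember) a new one.
  attach(requestedPath) {
    const normalized = path.resolve(requestedPath);
    for (const [watchedPath, watcher] of this.watchers) {
      if (isSameOrAncestor(watchedPath, normalized)) return watcher;
    }
    const watcher = this.createNativeWatcher(normalized);
    this.watchers.set(normalized, watcher);
    return watcher;
  }
}
```

The sketch also shows why per-call exclusions are awkward: a second `attach` can return a watcher that was created with someone else's options.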

It's therefore currently impossible for us to use .gitignore to reduce the costliness of a NativeWatcher without those “custom” exclusions possibly getting in the way of another watcher — or failing to be applied because Pulsar didn't actually create a new NativeWatcher instance and instead reused an existing one. To do this, we'd have to prevent NativeWatcher reuse either partially (if a call to watchPath specifies custom exclusions) or entirely (abandoning the whole NativeWatcherRegistry). We could maintain custom exclusions by making it a concern of the PathWatcher instance, but that's far less useful, as it would still attach inotify watches to the directories listed in .gitignore (and all their descendants).

So the options here are:

  1. The status quo of this PR: apply core.ignoredNames as exclusions for all native watchers and don't use VCS exclusions at all.
  2. Add the ability to exclude custom paths via glob as an option to watchPath, but handle it entirely within the PathWatcher instance by skipping notification on any file activity that matches an exclusion.
  3. Add the ability to exclude custom paths via glob as an option to watchPath, and support it by mandating that any PathWatcher instance with a custom exclusion list has its own private NativeWatcher instance that cannot be shared with any other PathWatcher.
  4. Like 3, but use some sort of sophisticated logic to discern whether specified custom exclusions are identical to those of an existing watcher so that there can be at least some reuse of NativeWatchers in this use case. (Painful.)
  5. Abandon the entire NativeWatcherRegistry system as overengineered and make it so that PathWatcher instances always have their own NativeWatcher instances — unless they share the exact same path with another PathWatcher instance and the two agree on their exclusions. (Tempting, but I wonder how it would affect resource utilization in the real world. I should look more at how VS Code handles consolidation of file-watchers to see if there are other strategies we can use that aren't as limiting as our NativeWatcherRegistry.)

…ar into pulsar-next-experimental-parcel-watcher
…when asked to watch a file rather than a directory.
…when switching between file-watcher implementations.
@savetheclocktower
Contributor Author

I found a bug in the spec code for watchPath. It's meant to run a describe block twice (once for each of the built-in implementations) — but whichever one we tested first ended up persisting through both runs. This would explain why the watchPath specs didn't catch @parcel/watcher’s inability to watch individual files.

Still, that doesn't explain all the test failures in CI. I can't reproduce the macOS failures locally. I might try to run the watchPath specs in my Linux VM as a sanity check, but I'm willing to bet they'd pass locally, too. Meanwhile, stuff seems to be failing for @asiloisad on Windows. I am not well-equipped to test on Windows (2016 gaming PC is not an ideal candidate), but I might give it a shot eventually if I can't figure something else out.

@savetheclocktower
Contributor Author

Actually, I just realized that fdir wasn't in the package.json. That'd explain it. I added it for the NSFW worker's “ignore stuff in core.ignoredNames” logic. Hopefully that addresses the CI failures.

@savetheclocktower
Contributor Author

Thinking more about our options for ignored names (as outlined in this comment), I'm gravitating toward a sixth option:

Add a core.ignoredFileWatcherNames setting that would function much like core.ignoredNames, but apply only to file-watching. When we watch a new path, we combine the globs from the two settings.

This would be useful for adding paths that should not necessarily be part of core.ignoredNames, but which are good candidates for file-watcher pruning. .cache, node_modules, and tmp would be good candidates for default values.
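
Combining the two settings would be a small amount of code. In this sketch, `core.ignoredFileWatcherNames` is the proposed (not yet existing) setting, `watcherIgnoredNames` is a made-up helper, and `getSetting` is a hypothetical lookup function passed in so the helper does not depend on any particular config API.

```javascript
// Merge the general ignore list with the proposed watcher-only list,
// dropping duplicates while keeping first-seen order.
function watcherIgnoredNames(getSetting) {
  const general = getSetting('core.ignoredNames') || [];
  const watcherOnly = getSetting('core.ignoredFileWatcherNames') || [];
  return [...new Set([...general, ...watcherOnly])];
}
```

Inside the editor, this might be called as `watcherIgnoredNames((key) => atom.config.get(key))`.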

@savetheclocktower
Contributor Author

There’s someone on Discord who often experiences a Pulsar-related process using 100% of CPU (on a single core) while they’ve got it open. Louder fans, lower battery life, et cetera. The mystery goes pretty deep and we’re not sure exactly where the cost is being paid, but at this point we’ve got it narrowed down to file-watching.

This user is on Linux and often has a project window open to a path called /user/home — which, regardless of the particular Linux distro, has the potential to be a busy directory with lots of stuff in it.

Oh, also: this mystery got solved! Since the project window was the user's entire home directory, this was always going to involve a lot of inotify watches — but there was also a symlink within ~/.wine that targeted /! So the practice of spidering through descendant directories and attaching an inotify watch to each one meant, in practice, that the entire drive was being observed for changes. The user was able to move this symlink to another directory outside of /user/home and this fixed the problem for them.
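
For anyone chasing a similar symptom, a quick diagnostic is to look for symlinks that lead out of the project root. The sketch below uses only Node's `fs` and `path` modules; `findEscapingSymlinks` is a made-up name, and it does not handle unreadable directories.

```javascript
const fs = require('fs');
const path = require('path');

// List symlinks under `root` whose targets resolve to a directory outside
// `root`. Symlinks are reported but never followed, so a link to `/` cannot
// send the crawl across the whole drive.
function findEscapingSymlinks(root) {
  const realRoot = fs.realpathSync(root);
  const found = [];
  const pending = [realRoot];

  while (pending.length > 0) {
    const dir = pending.pop();
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const fullPath = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        pending.push(fullPath);
      } else if (entry.isSymbolicLink()) {
        let target;
        try {
          target = fs.realpathSync(fullPath);
        } catch {
          continue; // dangling link; nothing to report
        }
        const relative = path.relative(realRoot, target);
        const outside =
          relative.split(path.sep)[0] === '..' || path.isAbsolute(relative);
        if (outside && fs.statSync(target).isDirectory()) {
          found.push({ link: fullPath, target });
        }
      }
    }
  }
  return found;
}
```

Each entry in the result names the link and the directory it resolves to, which is enough to spot something like a link pointing at `/`.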

@asiloisad
Contributor

styles.less works again! However, there are a couple of small problems:

  1. The nsfw watcher produces some logs:
image
  2. The @parcel/watcher watcher still causes an empty cmd window to flash briefly.

These watchers are much better than the previous one!

@savetheclocktower
Contributor Author

savetheclocktower commented Aug 3, 2025

styles.less works again! However, there are a couple of small problems:

  1. The nsfw watcher produces some logs:
image
  2. The @parcel/watcher watcher still causes an empty cmd window to flash briefly.

These watchers are much better than the previous one!

Excellent! The logging is easy to fix, so I just pushed a new commit.

I've got no earthly idea about the empty command prompt window, but I figure there must be some way around that, since VS Code is using this same watcher. I'll come back to it at some point.

@savetheclocktower
Contributor Author

I just discovered another regression on this branch: when using the github package's Git sidebar, it seems not to be able to detect when I've switched branches.

I'm nearly certain that this is because of the core.ignoredNames change, and the fact that we're no longer recursively watching within .git folders. If I'm right, this will be pretty straightforward to work around if I can find where the branch detection was actually happening.

@savetheclocktower
Contributor Author

My hypothesis was correct. The github package assumes it can call watchPath on the Git repository's root directory (which usually aligns with the project root) and be able to detect changes that happen anywhere within the .git directory. This is thwarted by the default presence of .git in the core.ignoredNames setting.


As an aside: one quick-and-dirty option here would be to stop using core.ignoredNames and invent a new setting that was entirely for file-watching. There are tons of reasons for the editor to ignore .git directories specifically, but they're not major offenders in terms of filesystem depth. So we could have core.ignoredFileWatcherNames and make it include most of the stuff that's already in core.ignoredNames, but not all of it.

That would be one way to fix this without changing anything about the github package's code. But it's not a great experience for the user.


The good news is that the basic functionality of detecting branch changes can be replicated with a simple non-recursive fs.watch-based watcher on the .git directory.
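
A rough sketch of that idea follows. It assumes a standard repository where .git is a real directory and the current branch is recorded in .git/HEAD; `parseHead` and `watchBranch` are made-up names, and this is not code from the github package.

```javascript
const fs = require('fs');
const path = require('path');

// Extract the branch name from the contents of .git/HEAD.
// Returns null for a detached HEAD (where the file holds a raw commit hash).
function parseHead(contents) {
  const match = /^ref: refs\/heads\/(.+)$/.exec(contents.trim());
  return match ? match[1] : null;
}

// Watch only the top level of `gitDir` (no `recursive` option) and report
// whenever HEAD may have changed. Returns the fs.FSWatcher so the caller
// can close() it.
function watchBranch(gitDir, onBranch) {
  return fs.watch(gitDir, (eventType, filename) => {
    // `filename` can be null on some platforms; re-check HEAD in that case.
    if (filename !== null && filename !== 'HEAD') return;
    fs.readFile(path.join(gitDir, 'HEAD'), 'utf8', (err, contents) => {
      if (!err) onBranch(parseHead(contents));
    });
  });
}
```

Since only one directory is watched, this costs a single inotify watch on Linux no matter how large the repository's object store is. The callback may fire more than once per checkout, so a caller would likely want to compare against the last branch it saw.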

The bad news is that such a change would have to be made in the github package. Since the fork, we've largely handled that package with big rubber gloves and tongs, touching it only when its dependencies need to be updated. The code conventions are vastly different (it was written by Facebook people, I think?), it uses an old version of React, and it generally gives off a mildly distasteful smell.

Rewriting that package is on my long-term to-do list, but for the shorter term I'll have to come up with a more targeted patch if the core.ignoredNames optimization is ever going to land.

@asiloisad
Contributor

Is there a chance of merging this branch into the pulsar-next branch, behind a beta flag for example? I'm sticking with this branch, as it works much better.

@savetheclocktower
Contributor Author

Is there a chance of merging this branch into the pulsar-next branch, behind a beta flag for example? I'm sticking with this branch, as it works much better.

I might have to do it without the core.ignoredNames stuff (or have that hide behind a second experimental flag) because the github package needs modification if it can't recursively watch under .git. But it's on my list! First priority is to get PulsarNext shipped and then we'll see.
