Skip to content

cmd-diff: a few misc enhancements #4253

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

dustymabe
Copy link
Member

See individual commit messages.

We now have a --metal-ls and --metal-du and also a --difftool to tell cosa diff to output using git difftool versus git diff so we can use vimdiff.

It had some common elements so let's use a loop.
Since we could be operating on directories or files change
file_from -> path_from and file_to -> path_to.

Also change the temporary output filenames to more properly indicate
they are outputs.
This strategy simply changes directory into the given path before
running the provided command rather than replacing a templated `{}`
with the path.

Useful for commands that operate more cleanly when operated on in
the directory where you want the operation to occur.
Having vimdiff is a lot more powerful to me because you can manually
do a few things in the terminal to massage the files on each side of the
diff to give you more information (i.e. narrow in on exactly what diff
you are interested in). Let's add a --difftool boolean flag to trigger
`git difftool`, which will allow diffs to be displayed using vimdiff.

Additionally include vim-enhanced so we have `vimdiff` installed and
at our disposal.
This way we can have more commands that can leverae this code for
different "diffs" on the resulting mounted filesystems. Prep for
a future commit.
A few more views of differences between two metal disk image mounted
filesystems.
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces several enhancements to cmd-diff, including new --metal-ls and --metal-du differs and a --difftool option to use an external diff tool like vimdiff. The refactoring to support the new differs is well-executed, with shared logic extracted into a reusable generator.

My review focuses on a couple of areas for improvement. The implementation of --difftool relies on a global variable, which can affect maintainability. I've suggested alternatives to make this dependency explicit. I also found a minor issue with a superfluous attribute access in the diff_cmd_outputs function. Overall, these are good additions to the tool.

Comment on lines +59 to +61
# Modify the USE_DIFFTOOL global based on the --difftool argument
global USE_DIFFTOOL
USE_DIFFTOOL = args.difftool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using a global variable like USE_DIFFTOOL makes the code harder to reason about and test due to hidden dependencies. It's better to make dependencies explicit.

A preferred approach would be to pass the difftool setting down through function arguments. While this might require a larger refactoring, it would improve code quality. A less disruptive alternative could be to encapsulate settings into a configuration object that is created in main() and passed to the functions that need it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A preferred approach would be to pass the difftool setting down through function arguments. While this might require a larger refactoring, it would improve code quality. A less disruptive alternative could be to encapsulate settings into a configuration object that is created in main() and passed to the functions that need it.

indeed. I started this way, but passing that around to all the functions proved a bit more "unclean" than I wanted. It was part me being lazy and part just not wanting to pass it all around because it's ugly.

Though, this is ugly too.

Let's see what other people think.

@HuijingHei
Copy link
Member

I tried to build cosa with the patch on aarch64 and failed, not sure if I missed something.

[coreos-assembler]$ sudo dnf install -y vim-enhanced
[coreos-assembler]$ git clone -b dusty-cmd-diff-enhancements https://github.com/dustymabe/coreos-assembler.git
[coreos-assembler]$ cd coreos-assembler/
[coreos-assembler]$ git branch -v
* dusty-cmd-diff-enhancements 8ef5b0e2d cmd-diff: support --metal-du and --metal-ls
[coreos-assembler]$ make && sudo make install

[coreos-assembler]$ cosa buildfetch --build=42.20250807.20.0 --artifact=metal --arch=aarch64
[coreos-assembler]$ cosa buildfetch --build=43.20250801.91.0 --artifact=metal --arch=aarch64

[coreos-assembler]$ cosa diff --from=42.20250807.20.0 --to=43.20250801.91.0 --arch=aarch64 --metal-du
Error in guestfs process for builds/43.20250801.91.0/aarch64/fedora-coreos-43.20250801.91.0-metal.aarch64.raw.xz: findfs_label: findfs exited with status 1: findfs: unable to resolve 'LABEL=root'            
Error in guestfs process for builds/42.20250807.20.0/aarch64/fedora-coreos-42.20250807.20.0-metal.aarch64.raw.xz: findfs_label: findfs exited with status 1: findfs: unable to resolve 'LABEL=root'
2025-08-12 10:23:57,952 INFO - Running command: ['fusermount', '-u', 'tmp/diff-cache/metal/42.20250807.20.0']
fusermount: entry for /home/jenkins/agent/workspace/debug-pod/tmp/diff-cache/metal/42.20250807.20.0 not found in /etc/mtab
2025-08-12 10:23:57,954 INFO - Running command: ['fusermount', '-u', 'tmp/diff-cache/metal/43.20250801.91.0']
fusermount: entry for /home/jenkins/agent/workspace/debug-pod/tmp/diff-cache/metal/43.20250801.91.0 not found in /etc/mtab
Traceback (most recent call last): 
  File "/usr/lib/coreos-assembler/cmd-diff", line 500, in <module>
    main()
    ~~~~^^                                                                                                                         
  File "/usr/lib/coreos-assembler/cmd-diff", line 103, in main
    differ.function(diff_from, diff_to)
    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/coreos-assembler/cmd-diff", line 420, in diff_metal_du
    for mount_dir_from, mount_dir_to in diff_metal_helper(diff_from, diff_to):
                                        ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/coreos-assembler/cmd-diff", line 393, in diff_metal_helper
    raise Exception(f"A guestfs process for {os.path.basename(d)} died unexpectedly.")
Exception: A guestfs process for 42.20250807.20.0 died unexpectedly.
failed to execute cmd-diff: exit status 1

@dustymabe
Copy link
Member Author

I tried to build cosa with the patch on aarch64 and failed, not sure if I missed something.

Potentially running out of memory? Or maybe the machine you are on doesn't support virtualization (i.e. does /dev/kvm exist)?

@dustymabe
Copy link
Member Author

[coreos-assembler]$ cosa buildfetch --build=42.20250807.20.0 --artifact=metal --arch=aarch64
[coreos-assembler]$ cosa buildfetch --build=43.20250801.91.0 --artifact=metal --arch=aarch64

Make sure to also:

cosa decompress --build=42.20250807.20.0
cosa decompress --build=43.20250801.91.0

The .stdout at the end of this line is unnecessary and has no effect.
When stdout is redirected to a file object, the stdout attribute of
the returned CompletedProcess object is None.

Assisted-By <gemini-code-assist>

coreos#4253 (comment)
Copy link
Member

@jlebon jlebon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I like new differs.

src/cmd-diff Outdated
Comment on lines 408 to 411
c = list(cmd)
if '{}' not in c:
c += ['{}']
idx = c.index('{}')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't need to be in the loop. Probably cleaner to still clone the list for each iteration though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right. it doesn't need to be in the loop but keeping the code grouped together with the strategy was nice rather than having multiple if strategy= branches.

@@ -401,16 +401,21 @@ def diff_metal(diff_from, diff_to):
shutdown_process(p_to)


def diff_cmd_outputs(cmd, path_from, path_to):
def diff_cmd_outputs(cmd, path_from, path_to, strategy='template'):
workingdir = os.getcwd()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor/optional: in situations like this, I prefer to use the signature default so that it's equivalent to not passing an argument at all in the default case. Often (like here), that's None.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e.

Suggested change
workingdir = os.getcwd()
workingdir = None

? I can make that change if you like.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, exactly.

@@ -401,16 +401,21 @@ def diff_metal(diff_from, diff_to):
shutdown_process(p_to)


def diff_cmd_outputs(cmd, path_from, path_to):
def diff_cmd_outputs(cmd, path_from, path_to, strategy='template'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels a bit weird to use a string for this. Couldn't it just be a boolean?

That avoids the assert also.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of liked leaving this open to different strategies in the future and also being more explicit in the naming.

i.e. what would the boolean argument's name be? use_cd_not_template=False ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe just chdir?

Or we could make it into a proper enum too like OSTreeImport if we want to keep this structure.

@@ -110,3 +110,6 @@ python3-libguestfs

# For generating kubernetes YAML files (e.g Konflux resources)
kustomize

# For vimdiff
vim-enhanced
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, how can we structure this so that this functionality is not dependent on the $preferred_tool shipping in cosa? One option is that cosa diff --difftool in that case just outputs the command and you run it.

Or... it could also do something like systemd-run --user -t --same-dir --wait --collect git difftool ... maybe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I like git-delta so being able to plug my own tool would be nice. How about we juste write the git diff output to a file and then we can invoke the tool we want?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cosa diff --difftool in that case just outputs the command and you run it.

Only thing about this is it implies not cleaning up after itself.

Or... it could also do something like systemd-run --user -t --same-dir --wait --collect git difftool ... maybe.

This implies functionality we don't necessarily have today. i.e. in the run via bash function case where we run inside the COSA container, we don't have things mounted into the container that would allow it to run a systemd unit on the host.

It definitely sucks installing this into the COSA container too, but it was the easiest thing to do. We don't have to do that, i'd just continue to do what I've been doing, which is cosa shell and then sudo dnf install vim before running cosa diff --difftool

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only thing about this is it implies not cleaning up after itself.

Indeed. I think it's OK in that case to tell the user they're responsible for cleanup.

This implies functionality we don't necessarily have today. i.e. in the run via bash function case where we run inside the COSA container, we don't have things mounted into the container that would allow it to run a systemd unit on the host.

Right yeah, the idea would be to add another option to the alias.

It definitely sucks installing this into the COSA container too, but it was the easiest thing to do. We don't have to do that, i'd just continue to do what I've been doing, which is cosa shell and then sudo dnf install vim before running cosa diff --difftool

Not strongly against to be clear, but I think it'd be a larger win to try to make this work for everyone.

# Now that the mounts are live, we can diff them
git_diff(mount_dir_from, mount_dir_to)
# Allow the caller to operate on these values
yield mount_dir_from, mount_dir_to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't a more appropriate model for the single yield case be to just take a function as argument?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess. Like a callback?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked into this a little, but time boxed myself on it and ran out of time. Can we leave it and improve later?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically I mean something like

def diff_metal_ls(diff_from, diff_to):
	cmd = ['find', '.']
	def differ(mount_dir_from, mount_dir_to):
		diff_cmd_outputs(cmd, mount_dir_from, mount_dir_to, strategy='cd')
	diff_metal_helper(diff_from, diff_to, differ)

or did that not work?

@HuijingHei
Copy link
Member

Sorry for the late reply, test on x86_64, and it works.
(One minor question, not sure if can make the list sorted)

cosa buildfetch --build=42.20250807.20.0 --artifact=metal --arch=x86_64
cosa decompress --build=42.20250807.20.0
cosa buildfetch --build=43.20250801.91.0 --artifact=metal --arch=x86_64
cosa decompress --build=43.20250801.91.0

[coreos-assembler]$ cosa diff --from=42.20250807.20.0 --to=43.20250801.91.0 --arch=x86_64 --metal-du --difftool
2025-08-13 06:37:16,694 INFO - Running command: ['git', 'difftool', '--no-index', '/tmp/find-qoyoxybk', '/tmp/find-qy6vimkn']
  3.5G    .                                                          |  3.6G    .                                                         
  149M    ./boot                                                     |  150M    ./boot                                                    
  4.0K    ./boot/coreos                                              |  132M    ./boot/ostree                                             
  3.0K    ./boot/loader.1                                            |  132M    ./boot/ostree/fedora-coreos-a2102009fc58957d7daf010ca0db7c
  2.0K    ./boot/loader.1/entries                                    |  ------------------------------------------------------------------
  131M    ./boot/ostree                                              |  ------------------------------------------------------------------
  131M    ./boot/ostree/fedora-coreos-09e2567a4a559a26f3d9971d1999efe|  ------------------------------------------------------------------
  7.5M    ./boot/efi                                                 |  7.5M    ./boot/efi
  7.5M    ./boot/efi/EFI                                             |  7.5M    ./boot/efi/EFI
  1016K   ./boot/efi/EFI/BOOT                                        |  1016K   ./boot/efi/EFI/BOOT
  6.5M    ./boot/efi/EFI/fedora                                      |  6.5M    ./boot/efi/EFI/fedora
  -------------------------------------------------------------------|  3.0K    ./boot/loader.1                                           
  -------------------------------------------------------------------|  2.0K    ./boot/loader.1/entries                                   
  -------------------------------------------------------------------|  4.0K    ./boot/coreos                                             
  11M     ./boot/grub2                                               |  11M     ./boot/grub2
  3.0M    ./boot/grub2/i386-pc                                       |  ------------------------------------------------------------------
  5.3M    ./boot/grub2/locale                                        |  5.3M    ./boot/grub2/locale
  2.3M    ./boot/grub2/fonts                                         |  2.3M    ./boot/grub2/fonts
  -------------------------------------------------------------------|  3.0M    ./boot/grub2/i386-pc                                      
  12K     ./boot/lost+found                                          |  12K     ./boot/lost+found
  3.3G    ./ostree                                                   |  3.4G    ./ostree 

@dustymabe
Copy link
Member Author

(One minor question, not sure if can make the list sorted)

The answer to this is nuanced. We could potentially add a pipe here to sort the output, but with vimdiff (i.e. when using --difftool) it's so powerful because when the diff view comes up it's just vim so I can pipe what's in either pane through any program.

When I was doing this I'd do just that. I'd pipe the left and right panes through sort -k 2 and that would sort based on the filepath.

I'd also do things like remove the unique hashes from the output so that the diffs were normalized (i.e. removing unique things that aren't important to the diff).

So I view this as something that can be solved by using a powerful difftool rather than having to bake all that functionality directly into cosa diff.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants