Skip to content

add: gotree reformat #7052

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Open

add: gotree reformat #7052

wants to merge 9 commits into from

Conversation

ammaraziz
Copy link
Contributor

FOR CONTRIBUTOR:

  • I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • License permits unrestricted use (educational + commercial)
  • This PR adds a new tool or tool collection
  • This PR updates an existing tool or tool collection
  • This PR does something else (explain below)

Copy link
Member

@bgruening bgruening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ammaraziz

Copy link
Contributor

@pvanheus pvanheus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the wrapper - I've noted some areas of suggested improvement (in addition to the .shed.yml that is missing)

@pvanheus
Copy link
Contributor

pvanheus commented Jun 15, 2025

On the question of "nextstrain format", it might be worth adding this to Galaxy. Here is a description of this format: https://docs.nextstrain.org/projects/auspice/en/stable/releases/v2.html and AuspiceJson would be a subclass of the Json datatype. On adding a datatype: https://docs.galaxyproject.org/en/master/dev/data_types.html

@pvanheus
Copy link
Contributor

pvanheus commented Jun 15, 2025

On the question of "nextstrain format", it might be worth adding this to Galaxy. Here is a description of this format: https://docs.nextstrain.org/projects/auspice/en/stable/releases/v2.html and AuspiceJson would be a subclass of the Json datatype. On adding a datatype: https://docs.galaxyproject.org/en/master/dev/data_types.html

Here's my PR for adding the "auspicejson" format: galaxyproject/galaxy#20466

@ammaraziz
Copy link
Contributor Author

Apologies, I forgot to switch branches when I committed the above fixes for another tool

@bgruening
Copy link
Member

@ammaraziz do you see our comments above?

@ammaraziz
Copy link
Contributor Author

@bgruening I do see them! and will work on them today. I made an error pushing code changes from another branch into this one causing issues I had to revert, thus the above comment (but it seems like my fixes remove the commit history, which is good).

@ammaraziz
Copy link
Contributor Author

ammaraziz commented Jun 20, 2025

On the question of "nextstrain format", it might be worth adding this to Galaxy. Here is a description of this format: https://docs.nextstrain.org/projects/auspice/en/stable/releases/v2.html and AuspiceJson would be a subclass of the Json datatype. On adding a datatype: https://docs.galaxyproject.org/en/master/dev/data_types.html

Here's my PR for adding the "auspicejson" format: galaxyproject/galaxy#20466

Amazing thanks Peter. I'll add nextstrain format as an output for now in anticipation of acceptance.

edit: scratch that, might be best to keep json format for now.

@ammaraziz
Copy link
Contributor Author

ammaraziz commented Jun 23, 2025

@pvanheus @bgruening @SaimMomin12 All comments incorporated.

Ready for final review?

Copy link
Contributor

@pvanheus pvanheus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just this one last change, and then I'm happy.

nexus) TREEFORMAT="nexus" ;;
phyloxml) TREEFORMAT="phyloxml" ;;
json) TREEFORMAT="nextstrain" ;;
*) echo "Unknown format: \$TREEFORMAT" && exit 1 ;;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be echo "Unknown format: $format" because at this point TREEFORMAT has not been set.

Copy link
Contributor

@SaimMomin12 SaimMomin12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ammaraziz for amending changes. Some minor changes inline and I think then we are ready for the merge.

Comment on lines +1 to +3
---
description: gotree toolkit
categories:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
---
description: gotree toolkit
categories:
description: gotree toolkit
categories:

<param name="input_tree" value="test.nex" ftype="nex"/>
<param name="output_format" value="phyloxml" />
<output name="output_tree" file="newick-to.xml">
<assert_contents> <has_text text="phylogeny rooted="/> </assert_contents>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<assert_contents> <has_text text="phylogeny rooted="/> </assert_contents>
<assert_contents>
<has_text text="phylogeny rooted="/>
</assert_contents>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check indentation here

Comment on lines +80 to +82
<output name="output_tree" file="phylo-to.newick" >
<assert_contents> <has_text text="(((hCoV-19"/> </assert_contents>
</output>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<output name="output_tree" file="phylo-to.newick" >
<assert_contents> <has_text text="(((hCoV-19"/> </assert_contents>
</output>
<output name="output_tree" file="phylo-to.newick" >
<assert_contents>
<has_text text="(((hCoV-19"/>
</assert_contents>
</output>

<xml name="citations">
<citations>
<citation type="doi">10.1093/nargab/lqab075</citation>
<yield />
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<yield />

Copy link
Member

@wm75 wm75 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for contributing @ammaraziz 💛

esac &&

gotree reformat $output_format
--input $input_tree
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
--input $input_tree
--input '$input_tree'

gotree reformat $output_format
--input $input_tree
--input-format \$TREEFORMAT
--output ./output.tree
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couldn't you write to '$output_tree' directly and save the mv?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or, more modern style: keep writing to ./output.tree, but let Galaxy handle the file transfer by using from_work_dir in <outputs><data>.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see @bgruening 's original suggestion.

</xml>
<xml name="xrefs">
<xrefs>
<xref type="bio.tools">Gotree</xref>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<xref type="bio.tools">Gotree</xref>
<xref type="bio.tools">gotree</xref>

It's the unique bio.tools ID that you need to refer to, not the displayed name. No idea whether this is case-sensitive, but better play it safe.

<test expect_num_outputs="1">
<param name="input_tree" value="test.xml" ftype="phyloxml"/>
<param name="output_format" value="newick" />
<output name="output_tree" file="phylo-to.newick" >
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from https://docs.galaxyproject.org/en/latest/dev/schema.html#tool-tests-test-output:

The functional test framework will execute the tool using the parameters defined in the tag sets and generate a temporary file, which will either be compared with the file named in the file attribute value or checked against assertions made by a child assert_contents tag

Since you're already doing a file-based exact comparison in this and all other tests, you shouldn't use any <assert_contents>.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you could add, however, to <output> is the ftype attribute to check that not only the right content is produced but that the expected output format is set by Galaxy, too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<output name="output_tree" file="phylo-to.newick" >
<output name="output_tree" file="phylo-to.newick" ftype="phyloxml" />

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just saw that you've been using sim_size before. So if an exact match is not what you want, you can remove the file check and keep only the <assert_contents> (just don't use both simultaneously).

You should then also remove the test files from the PR altogether since they won't be used anyway.

Depending on why exactly you needed sim_size a check with re_match_multiline as the comparison method may also be helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants