Description
I've observed that defining molecule types with names ending in a zero (e.g., ZAP70) can produce unexpected, separate nodes in visualization graphs (contact map, regulatory graph) in which the trailing zero is truncated (e.g., a node ZAP7 appears alongside ZAP70). Although this node can easily be removed by editing the corresponding graphml files, I wanted to find out why it was being created.
Steps to Reproduce & Observations:
- Define a molecule type ZAP70(...) and a rule for its production, e.g., Prod_ZAP70: 0 -> ZAP70(...).
- Generate a contact map (visualize({type=>"contactmap"})).
- The resulting contact map shows two distinct nodes: one for ZAP70 and another for ZAP7.
- This behavior was replicated with other names:
  - ZAP700 (molecule type) led to an additional ZAP70 node.
  - ZAP701 (molecule type) behaved as expected (no truncation, no unexpected nodes).
Root Cause Analysis (based on debugging BioNetGen Perl scripts):
The issue appears to stem from the unprettify subroutine in Visualization/NetworkGraph.pm.
- During the processing of production rules, the molecule name is used to form a transformation string like ->ZAP70.
- This string is then passed to unprettify by the getReactantsProducts subroutine.
- The unprettify subroutine contains the line: $string =~ s/0$//g; (with an associated comment "ATOM: take out trailing zero").
- This regex removes the trailing zero from ->ZAP70, changing it to ->ZAP7.
- Consequently, getReactantsProducts extracts ZAP7 as the product, which then becomes a distinct "AtomicPattern" and subsequently a separate node in the contact map.
- Degradation rules (e.g., ZAP70 -> 0) result in the transformation string ZAP70->. The s/0$//g; substitution does not affect this string, because the 0 is not its final character, so ZAP70 is correctly identified as the reactant. This leads to the ZAP70 node appearing as expected.
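The behavior described above can be checked in isolation. The following is a minimal sketch that assumes unprettify consists only of the substitutions quoted in this issue (the real subroutine may contain more code); the name unprettify_original is hypothetical and used only for this demonstration.

```perl
use strict;
use warnings;

# Stand-alone copy of the substitutions quoted in this issue.
# Assumption: the real unprettify may do more than this.
sub unprettify_original {
    my $string = shift @_;
    $string =~ s/\s//g;      # remove all whitespace
    $string =~ s/\(\)//g;    # remove empty component lists "()"
    $string =~ s/^0//g;      # remove a leading zero
    $string =~ s/0$//g;      # remove a trailing zero (the line under suspicion)
    return $string;
}

# Transformation strings of the form described in the root cause analysis.
for my $input ('->ZAP70', '->ZAP700', '->ZAP701', 'ZAP70->') {
    printf "%-10s => %s\n", $input, unprettify_original($input);
}
```

Only the strings whose last character is 0 are altered: ->ZAP70 loses one zero and ->ZAP700 loses exactly one zero, while ->ZAP701 and ZAP70-> pass through unchanged, which matches the nodes seen in the contact map.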
Fix Implemented Locally:
I was able to resolve this issue by commenting out the problematic line in Visualization/NetworkGraph.pm:
sub unprettify {
    # ... other code ...
    my $string = shift @_;
    $string =~ s/\s//g;
    $string =~ s/\(\)//g;
    $string =~ s/^0//g;
    # Original problematic line:
    # $string =~ s/0$//g;
    return $string;
}

With this change, ->ZAP70 remains ->ZAP70, getReactantsProducts extracts ZAP70, and the extraneous ZAP7 node no longer appears in the contact map. This fix also correctly handles other names with trailing zeros like "ERK20" (preventing them from becoming "ERK2"). While commenting out s/0$//g; fixes the issue for molecule names ending in zero, I am unsure if this line had another intended purpose.
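If the line was meant to strip a standalone null species 0 from the end of a transformation string (an assumption on my part, not something confirmed from the BioNetGen source), a narrower substitution could keep that behavior without touching names that merely end in zero. The sketch below is hypothetical: unprettify_narrow is not an existing BioNetGen subroutine, and I do not know which other string formats reach unprettify in practice.

```perl
use strict;
use warnings;

# Hypothetical variant: remove a trailing "0" only when it stands alone,
# i.e. when it is not preceded by a letter, digit, or underscore.
sub unprettify_narrow {
    my $string = shift @_;
    $string =~ s/\s//g;                  # remove all whitespace
    $string =~ s/\(\)//g;                # remove empty component lists "()"
    $string =~ s/^0//g;                  # unchanged from the quoted code
    $string =~ s/(?<![A-Za-z0-9_])0$//;  # trailing "0" only if not part of a name
    return $string;
}

for my $input ('->ZAP70', '0 -> ZAP70()', 'ZAP70 -> 0', '->ERK20') {
    printf "%-14s => %s\n", $input, unprettify_narrow($input);
}
```

With this variant a string such as ZAP70 -> 0 still reduces to ZAP70->, while ->ZAP70 and ->ERK20 are left intact. The negative lookbehind is the design choice here: it distinguishes a bare 0 token from a zero that ends a molecule name.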
Additional Context on Running Modified Perl Scripts:
During debugging, I found that my edits to the .pm files only took effect when I ran BioNetGen by invoking perl BNG2.pl /path/to/model.bngl directly.
When I ran the model via the icon provided by the PyBioNetGen extension, which executes a command like bionetgen -req "0.5.0" run -i "...", my local modifications to the Perl scripts seemed to be disregarded. This suggests that the bionetgen CLI wrapper does not run the same BNG2.pl and .pm files that I edited. This is only an observation from my debugging process, but it might be relevant for others trying to debug or modify the Perl backend.