Unintended Node Truncation in Contact Maps / Regulatory Graphs #276

@akutuva21

I've observed that defining a molecule type whose name ends in a zero (e.g., ZAP70) can produce an unexpected, separate node in the visualization graphs (contact map, regulatory graph) in which the trailing zero is truncated (e.g., a node ZAP7 appears alongside ZAP70). Although this node can easily be removed by editing the corresponding graphml files, I wanted to find out why it was being created.

Steps to Reproduce & Observations:

  1. Define a molecule type ZAP70(...) and a rule for its production, e.g., Prod_ZAP70: 0 -> ZAP70(...).
  2. Generate a contact map (visualize({type=>"contactmap"})).
  3. The resulting contact map shows two distinct nodes: one for ZAP70 and another for ZAP7.
  4. This behavior was replicated with other names:
    • ZAP700 (molecule type) led to an additional ZAP70 node.
    • ZAP701 (molecule type) behaved as expected (no truncation, no unexpected nodes).

Root Cause Analysis (based on debugging BioNetGen Perl scripts):

The issue appears to stem from the unprettify subroutine in Visualization/NetworkGraph.pm.

  • During the processing of production rules, the molecule name is used to form a transformation string like ->ZAP70.
  • This string is then passed to unprettify by the getReactantsProducts subroutine.
  • The unprettify subroutine contains the line: $string =~ s/0$//g; (with an associated comment "ATOM: take out trailing zero").
  • This regex removes the trailing zero from ->ZAP70, changing it to ->ZAP7.
  • Consequently, getReactantsProducts extracts ZAP7 as the product, which then becomes a distinct "AtomicPattern" and subsequently a separate node in the contact map.
  • Degradation rules (e.g., ZAP70 -> 0) result in the transformation string ZAP70->. The s/0$//g; substitution does not affect this string because it ends in >, not 0, so ZAP70 is correctly identified as the reactant. This is why the ZAP70 node appears as expected.
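The behavior described in the bullets above can be reproduced outside BioNetGen with a small standalone script. This is only a sketch: the sub name unprettify_sketch is hypothetical, and its body contains just the four substitutions quoted in this issue, not the full unprettify from Visualization/NetworkGraph.pm.

```perl
use strict;
use warnings;

# Hypothetical stand-in for unprettify: only the substitutions quoted
# in this issue are reproduced; the real subroutine may do more.
sub unprettify_sketch {
    my ($string) = @_;
    $string =~ s/\s//g;      # remove all whitespace
    $string =~ s/\(\)//g;    # remove empty component lists "()"
    $string =~ s/^0//g;      # strip a leading zero
    $string =~ s/0$//g;      # strip a trailing zero (the line in question)
    return $string;
}

# Transformation strings of the kind described above.
for my $input ('->ZAP70', '->ZAP700', '->ZAP701', 'ZAP70->') {
    printf "%-10s => %s\n", $input, unprettify_sketch($input);
}
```

Run with plain perl, this turns ->ZAP70 into ->ZAP7 and ->ZAP700 into ->ZAP70, while ->ZAP701 and ZAP70-> come back unchanged, which matches the nodes seen in the contact map.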

Fix Implemented Locally:

I was able to resolve this issue by commenting out the problematic line in Visualization/NetworkGraph.pm:

sub unprettify {
    # ... other code ...
    my $string = shift @_;
    $string =~ s/\s//g;
    $string =~ s/\(\)//g;
    $string =~ s/^0//g;
    
    # Original problematic line:
    # $string =~ s/0$//g;

    return $string;
}

With this change, ->ZAP70 remains ->ZAP70, getReactantsProducts extracts ZAP70, and the extraneous ZAP7 node no longer appears in the contact map. This fix also correctly handles other names with trailing zeros like "ERK20" (preventing them from becoming "ERK2"). While commenting out s/0$//g; fixes the issue for molecule names ending in zero, I am unsure if this line had another intended purpose.
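If the original line was meant to drop a standalone null pattern 0 on the product side (for example in a string like A->0), a narrower substitution might preserve that behavior without touching molecule names. This is a guess at the intent, not something confirmed by the BioNetGen source, and whether such strings ever reach unprettify is not established here. The sub name unprettify_narrow is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical variant of unprettify: the trailing zero is removed only
# when it directly follows the arrow, i.e. when it is a standalone "0"
# product rather than the last character of a molecule name.
sub unprettify_narrow {
    my ($string) = @_;
    $string =~ s/\s//g;          # remove all whitespace
    $string =~ s/\(\)//g;        # remove empty component lists "()"
    $string =~ s/^0//;           # leading zero, as in the original snippet
    $string =~ s/(->)0$/$1/;     # drop "0" only when it follows "->"
    return $string;
}

for my $input ('->ZAP70', 'A->0', 'ZAP70->', '0->A') {
    printf "%-10s => %s\n", $input, unprettify_narrow($input);
}
```

With this variant ->ZAP70 is left intact, A->0 becomes A->, and 0->A becomes ->A. Simply commenting out the line, as above, remains the simpler change if the null pattern never appears in these strings.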

Additional Context on Running Modified Perl Scripts:

During debugging, I found that I needed to run BioNetGen by directly invoking perl BNG2.pl /path/to/model.bngl. This allowed my edits to the .pm files to take effect.

When I instead ran the model via the icon provided by the PyBioNetGen extension, which executes a command like bionetgen -req "0.5.0" run -i "...", my local modifications to the Perl scripts appeared to be ignored. This suggests the bionetgen CLI wrapper might use a different mechanism than directly running perl BNG2.pl. This is only an observation from my debugging process, but it might be relevant for others trying to debug or modify the Perl backend.
