Tuesday 24 July 2012

Week 8: updated Project plan and other enhancements

This week:

1. Previously, clicking on a result row in "Interaction Comparison Results" only highlighted the matching interactions in each pathway (pathway1 and pathway2) without focusing on them. Now it also focuses on the interaction and zooms out if the interaction is too big to fit into the view.

2. Highlighting all the interaction matches in "Interaction Comparison Results" is now possible. However, the view does not yet zoom out to fit all the highlighted interactions in the pathway.


3. Working (not yet finished) on many-to-many mapping of the Datanode matches from pathway1 to pathway2. Earlier this was a one-to-one mapping, with each match appearing as an individual row in the "Datanode Comparison Results" table.


Let me explain. Consider Gene A and Gene B, two identical Datanodes in pathway1 (i.e. they have equivalent Xrefs), and Gene C and Gene D, two identical Datanodes in pathway2. Suppose the two Datanodes A, B in pathway1 match the two Datanodes C, D in pathway2.

Earlier, with one-to-one mapping, the comparison results looked like this:
Gene A -> Gene C
Gene A -> Gene D
Gene B -> Gene C
Gene B -> Gene D
Clicking on any of these results highlighted a Datanode in pathway1 and the corresponding matching Datanode in pathway2. For instance, clicking on row 1 (Gene A -> Gene C) highlighted Gene A in pathway1 and Gene C in pathway2.


With many-to-many mapping of the Datanode matches, the four individual results above can be represented as a single result: "Gene A, Gene B -> Gene C, Gene D". Clicking on it should highlight Datanodes Gene A and Gene B in pathway1 and Gene C and Gene D in pathway2. This is taking time because Interaction Comparison uses the results from Datanode Comparison, so the Interaction Comparison results will also have to be modified.


Also, if there are multiple instances of a Datanode with the same label, e.g. two instances of Gene A in pathway1 that match Gene B and Gene C in pathway2, then the result would be represented as "Gene A -> Gene B, Gene C".
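To make the grouping idea concrete, here is a minimal sketch in Java of how the one-to-one rows could be collapsed into many-to-many rows. It is not the plugin's actual code: the class name `ManyToManyGrouping`, the method `group`, and the use of plain labels as keys are assumptions made for illustration (the real implementation would more likely key on Xrefs or GraphIds).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper, not part of PathVisio.
class ManyToManyGrouping {

    /** Collapses one-to-one rows {pathway1 label, pathway2 label} into many-to-many rows. */
    static List<String> group(List<String[]> pairs) {
        // Step 1: for every pathway1 Datanode, collect the set of pathway2 Datanodes it matches.
        Map<String, TreeSet<String>> rightByLeft = new LinkedHashMap<>();
        for (String[] pair : pairs) {
            rightByLeft.computeIfAbsent(pair[0], k -> new TreeSet<>()).add(pair[1]);
        }
        // Step 2: merge pathway1 Datanodes that share exactly the same set of matches.
        Map<TreeSet<String>, TreeSet<String>> leftByRight = new LinkedHashMap<>();
        for (Map.Entry<String, TreeSet<String>> e : rightByLeft.entrySet()) {
            leftByRight.computeIfAbsent(e.getValue(), k -> new TreeSet<>()).add(e.getKey());
        }
        // Step 3: format each group as one result row.
        List<String> rows = new ArrayList<>();
        for (Map.Entry<TreeSet<String>, TreeSet<String>> e : leftByRight.entrySet()) {
            rows.add(String.join(", ", e.getValue()) + " -> " + String.join(", ", e.getKey()));
        }
        return rows;
    }
}
```

Fed the four rows from the example above, `group` returns the single row "Gene A, Gene B -> Gene C, Gene D". Because this sketch keys on labels, two instances of Gene A collapse into one entry, which gives the "Gene A -> Gene B, Gene C" form.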

4. Storing the comparison results (Datanode Comparison and Interaction Comparison). For this, I was supposed to come up with a format (CSV, TSV, etc.) that would best represent the comparison results when stored in a file. I think for Interaction Comparison results we could just store the Datanodes' labels and GraphIds (I am not sure the GraphIds need to be stored) for each interaction. I am also not sure whether the lines in the interaction (the lines' GraphIds) should be stored as well. As lines don't have labels, storing their GraphIds wouldn't tell us much if we read the file ourselves.


Delimiter format for storing Interaction Comparison results in a file: 
<DN1 Label> <colon separation: between a DN's Label and its GraphId> <DN1 GraphId> <comma> <DN2 Label>  <colon separation>  <DN2 GraphId>  <tab separation: between Interaction in pathway1 and its matching counter-part in pathway2> <DN3 Label> : <DN3 GraphId> , <DN4 Label> : <DN4 GraphId> <DN5 Label> : <DN5 GraphId> <new-line: between each Interaction Comparison result>
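As a rough illustration of this layout, the sketch below builds one result line in Java. The class and method names (`InteractionResultFormat`, `node`, `resultLine`) and the GraphId values in the usage note are made up for the example. It assumes a comma between every pair of Datanodes within an interaction, and it does not escape labels that themselves contain a colon, comma, or tab.

```java
import java.util.List;

// Hypothetical helper, not part of PathVisio.
class InteractionResultFormat {

    /** Formats one Datanode as "Label:GraphId". */
    static String node(String label, String graphId) {
        return label + ":" + graphId;
    }

    /** Joins the Datanodes of one interaction with commas. */
    static String interaction(List<String> nodes) {
        return String.join(",", nodes);
    }

    /** One result line: the pathway1 interaction, a tab, then its match in pathway2. */
    static String resultLine(List<String> pathway1Nodes, List<String> pathway2Nodes) {
        return interaction(pathway1Nodes) + "\t" + interaction(pathway2Nodes);
    }
}
```

For example, an interaction between Gene A (GraphId `aa1`) and Gene B (GraphId `bb2`) that matches one between Gene C (`cc3`) and Gene D (`dd4`) would be written as `Gene A:aa1,Gene B:bb2`, a tab, then `Gene C:cc3,Gene D:dd4`, with one such line per Interaction Comparison result.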


For Datanode Comparison results, the format could be similar, but I will settle on it after the many-to-many mapping of Datanode Comparison results is finished.


Updated Project plan: 


1. Scoring system: Generate a score from the comparison results that indicates how similar the two compared pathways are. Scoring would be based on the results of Datanode Comparison or Interaction Comparison. A simple scoring system such as the one in org.pathvisio.core.gpmlDiff.BasicSim.java could be used.
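One possible shape for such a score, sketched in Java purely as an illustration: the fraction of Datanodes (or interactions) across both pathways that found a match. This is a stand-in chosen for simplicity and does not reproduce what BasicSim does. The class name `PathwaySimilarity` and its parameters are hypothetical.

```java
// Hypothetical scoring helper, not part of PathVisio and not BasicSim's method.
class PathwaySimilarity {

    /**
     * Returns a value in [0, 1]: matched elements in both pathways
     * divided by the total number of elements in both pathways.
     */
    static double score(int matched1, int matched2, int total1, int total2) {
        int total = total1 + total2;
        if (total == 0) {
            return 0.0; // two empty pathways: treat the score as 0 rather than divide by zero
        }
        return (double) (matched1 + matched2) / total;
    }
}
```

With this definition, two pathways of four Datanodes each, where two Datanodes on each side have a match, would score 0.5, and identical pathways would score 1.0.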


2. Considering line arrow types in interactions: Right now, the types of the arrows at the line ends are ignored when comparing interactions in the pathways. But they might be taken into account for MIM line arrows.


3. Integrating Comparison pop-up window inside PathVisio's main-view: This would probably be done after finishing up 1 and 2 above.
