Monday, 14 May 2012

GSOC 2012 - Pathway Comparison - Project Idea



My ideas for the project
The goal of the project is to create a PathVisio plugin that compares pathways based on their DataNodes and the interactions between them.
Proposed working model:
The plugin will provide file selection fields that let users load two pathway files into PathVisio, and a button to compare them. Clicking "Compare" will pop up a Difference Viewer window showing the two pathways side by side (this could serve as a workaround until it is possible to load and display two pathways in PathVisio's main window). The two pathways could be drawn on two separate panels inside the main Difference Viewer window (a JSplitPane could be used for the main window). Another partition of the main window will list the differences between the two pathways. I borrowed this idea of including the Difference List* in the Difference Viewer from the proposal of Rianne Fitjen (a fellow GSOC applicant for this project). Earlier I planned to show the Difference List in the plugin's view itself, but including it in the Difference Viewer window seems more natural and intuitive from a user's standpoint. Clicking an item in the Difference List will highlight the corresponding element in both pathways.
*Difference List: despite the name, it is actually a list of the DataNodes and interactions that are present in both pathways.
I am currently working on a prototype of the project, which so far draws two pathways in two separate, adjacent internal windows contained inside a main window. I could have an improved version of this prototype ready before the GSOC program starts.
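The window layout described above could be sketched roughly as follows. This is only an illustration under assumptions, not the prototype's actual code: the class name DifferenceViewerLayoutSketch and the method build are hypothetical, and empty JPanels stand in for the panels on which the pathways would be drawn.

```java
import java.awt.GraphicsEnvironment;
import javax.swing.JFrame;
import javax.swing.JList;
import javax.swing.JPanel;
import javax.swing.JScrollPane;
import javax.swing.JSplitPane;
import javax.swing.SwingUtilities;
import javax.swing.WindowConstants;

// Hypothetical sketch of the Difference Viewer layout; not PathVisio code.
class DifferenceViewerLayoutSketch {

    /** Two pathway panels side by side, with the Difference List underneath. */
    static JSplitPane build(JPanel left, JPanel right, JList<String> differences) {
        // Inner split: the two pathways next to each other.
        JSplitPane pathways = new JSplitPane(JSplitPane.HORIZONTAL_SPLIT, left, right);
        pathways.setResizeWeight(0.5); // both pathways get equal space on resize

        // Outer split: pathways on top, the list in its own partition below.
        JSplitPane root = new JSplitPane(
                JSplitPane.VERTICAL_SPLIT, pathways, new JScrollPane(differences));
        root.setResizeWeight(0.8); // most extra space goes to the pathways
        return root;
    }

    public static void main(String[] args) {
        if (GraphicsEnvironment.isHeadless()) {
            return; // no display available, nothing to show
        }
        SwingUtilities.invokeLater(() -> {
            JList<String> list = new JList<>(new String[] {"TP53", "MDM2"});
            JFrame frame = new JFrame("Difference Viewer (sketch)");
            frame.setDefaultCloseOperation(WindowConstants.DISPOSE_ON_CLOSE);
            frame.setContentPane(build(new JPanel(), new JPanel(), list));
            frame.setSize(800, 600);
            frame.setVisible(true);
        });
    }
}
```

Because build returns a plain component, the same layout could later be placed in a different container if the viewer moves into PathVisio's main window.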

Timeline for the project: (April 24 to August 13 ~ 16 weeks)
Week 1,2:
1. Load the two pathways from inside the plugin (drawing them is not required for this step) and obtain references to two Java objects, VPathway and Pathway, for each of the two pathways. I have already looked into the PathVisio code for this and should be able to do it in a day or two.
VPathway object: SwingEngine.getEngine().getActiveVPathway() returns this object, which represents the view (the graphics) of the loaded pathway. This object could be used to draw pathways in the aforementioned Difference Viewer pop-up and also to highlight certain nodes/lines in the pathway.
Pathway object: SwingEngine.getEngine().getActivePathway() returns the Pathway object, which represents the parsed GPML data model that PathVisio uses to hold pathway information. This object would be used for the comparison of the pathways.
2. Compare the two pathways using the two Pathway objects, one per pathway (the outcome of step 1). The comparison would identify the DataNodes and interactions (lines connected to DataNodes) that are present in both pathways. Comparing DataNodes shouldn't be difficult, whereas comparing the interactions in the two pathways might take a little extra time, i.e. it could extend into week 2.
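The comparison in step 2 amounts to a set intersection, which could be sketched as below. This is a simplified illustration and not PathVisio code: the class CommonElementsSketch is hypothetical, each DataNode is reduced to an identifier string, and each interaction to a "source->target" string, which is an assumption about how interactions might be keyed.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: plain strings stand in for the real model objects.
class CommonElementsSketch {

    /** Returns the elements present in both lists, in the order of the first list. */
    static Set<String> common(List<String> first, List<String> second) {
        Set<String> result = new LinkedHashSet<>(first);
        result.retainAll(new LinkedHashSet<>(second)); // keep only shared elements
        return result;
    }

    public static void main(String[] args) {
        // DataNodes, reduced to identifiers (illustrative values).
        List<String> nodesA = Arrays.asList("TP53", "MDM2", "CDKN1A");
        List<String> nodesB = Arrays.asList("MDM2", "TP53", "BAX");

        // Interactions, keyed as "source->target" (an assumed encoding).
        List<String> edgesA = Arrays.asList("TP53->MDM2", "TP53->CDKN1A");
        List<String> edgesB = Arrays.asList("TP53->MDM2", "TP53->BAX");

        System.out.println("Common DataNodes:    " + common(nodesA, nodesB));
        System.out.println("Common interactions: " + common(edgesA, edgesB));
    }
}
```

Keying interactions by their end points is one reason the interaction comparison may take longer: two lines only match if the nodes at both ends have already been recognized as the same.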
3. One thing to keep in mind (as suggested by mentor Martina) is establishing the identifier mapping between the DataNodes of the two pathways: if the same gene is present in both pathways under different IDs (e.g. one uses an Entrez Gene identifier while the other uses an Ensembl ID), the two nodes should be recognized as the same. So we have to use BridgeDb to map the identifiers. Right now, I am not entirely sure how this could be done.
Here is what I have in mind: before starting the comparison, we should first identify the DataNodes that use different IDs in the two pathways but actually refer to the same entity. For this, we could run a BridgeDb mapping from the genes/metabolites of pathway #1 to the genes/metabolites of pathway #2, and then filter out the ones that map to the same ID. These filtered DataNodes will bypass the comparison process and be added directly to the Difference List*.
A little of this could spill over into week three, as I would need to learn how to work with BridgeDb.
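The mapping idea above could look roughly like this. It is a sketch under stated assumptions only: it does not use the BridgeDb API, and a hand-written Map stands in for the identifier lookup. The class IdMappingSketch, the "system:id" string format and the mapping table are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a Map replaces the real identifier mapping service.
class IdMappingSketch {

    /** Translates an ID to the common system; unmapped IDs are returned unchanged. */
    static String normalize(String id, Map<String, String> toCommon) {
        String mapped = toCommon.get(id);
        return mapped != null ? mapped : id;
    }

    /** Returns the normalized IDs that occur in both pathways. */
    static Set<String> sameAfterMapping(List<String> pathway1, List<String> pathway2,
                                        Map<String, String> toCommon) {
        Set<String> normalized2 = new LinkedHashSet<>();
        for (String id : pathway2) {
            normalized2.add(normalize(id, toCommon));
        }
        Set<String> shared = new LinkedHashSet<>();
        for (String id : pathway1) {
            String normalized = normalize(id, toCommon);
            if (normalized2.contains(normalized)) {
                shared.add(normalized);
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        // Illustrative table: map an Ensembl ID to the Entrez Gene ID of the same gene.
        Map<String, String> toCommon = new HashMap<>();
        toCommon.put("ensembl:ENSG00000141510", "entrez:7157");

        List<String> pathway1 = Arrays.asList("entrez:7157", "entrez:4193");
        List<String> pathway2 = Arrays.asList("ensembl:ENSG00000141510", "entrez:581");

        System.out.println(sameAfterMapping(pathway1, pathway2, toCommon));
    }
}
```

Normalizing both pathways to one identifier system first means the later comparison step can stay a plain equality check.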
Week 3,4:
1. Once we have computed the Difference List, we could go ahead and focus on drawing the two loaded pathways onto the Difference Viewer, a window which shows the two pathways next to each other, along with another partition which shows the list of differences.
I have a partial prototype ready, as mentioned earlier in the proposal, so this part should not be as difficult as I had thought at first. During this time I could therefore also work on some additional features that would make the Difference Viewer's UI look better and more accessible to the user.
But these improvements are only worthwhile if they can be integrated into PathVisio's main view, which would eventually be able to display two pathways in comparison mode. So any improvements to the Difference Viewer should be made with this in mind.
Hence another option (instead of working on Difference Viewer UI improvements) is to work on PathVisio's core to make it possible to load and display two pathways in PathVisio in comparison mode. For this I would need a lot of help from the mentor and the developers in the PathVisio community.
Week 5, 6:
1. Work on displaying the Difference List (comparison data) in a viewable, clickable format, with the items shown row by row in the partition inside the Difference Viewer. Also keep this flexible enough that it can easily be moved into the plugin's tab view (JPanel) later on. This should help once PathVisio's main view is ready to display two pathways in comparison mode.
2. Handle click events from the Difference Viewer partition that contains the Difference List. This means extracting information from the item that was clicked in the Difference List and propagating it to the pathways drawn in the Difference Viewer, to highlight the corresponding DataNode/interaction in both pathways.
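The click handling in step 2 could be sketched with a standard Swing selection listener, as below. This is only an illustration: the class DifferenceClickSketch and the Highlighter interface are hypothetical, and the highlighters here merely record what they were asked to highlight instead of touching a drawn pathway.

```java
import java.util.ArrayList;
import java.util.List;
import javax.swing.JList;

// Hypothetical sketch: shows how a selection could be propagated to both panels.
class DifferenceClickSketch {

    /** Assumed callback, one implementation per pathway panel. */
    interface Highlighter {
        void highlight(String elementId);
    }

    /** Forwards every completed selection in the list to all pathway panels. */
    static void wire(JList<String> differenceList, List<Highlighter> panels) {
        differenceList.addListSelectionListener(event -> {
            if (event.getValueIsAdjusting()) {
                return; // ignore intermediate events while the user is still dragging
            }
            String selected = differenceList.getSelectedValue();
            if (selected == null) {
                return; // selection was cleared
            }
            for (Highlighter panel : panels) {
                panel.highlight(selected);
            }
        });
    }

    public static void main(String[] args) {
        JList<String> list = new JList<>(new String[] {"TP53", "MDM2"});
        List<String> log = new ArrayList<>();

        List<Highlighter> panels = new ArrayList<>();
        panels.add(id -> log.add("left:" + id));
        panels.add(id -> log.add("right:" + id));

        wire(list, panels);
        list.setSelectedIndex(1); // simulates the user clicking the second row
        System.out.println(log);
    }
}
```

Keeping the list and the panels connected only through a small callback like this is one way to make it easier to move the list into the plugin's tab view later.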
Week 7,8,9:
Discuss with the PathVisio community how to proceed with the changes to PathVisio's core source code that would allow two pathways to be drawn inside PathVisio's main view. Currently the software allows loading and viewing only one pathway at a time.
Once this is done, the Difference List could be displayed in the plugin's tab view itself. The external Difference Viewer window (the workaround until this point) will then no longer be necessary, although it can still be used as a reference.
I have allotted three weeks for this, as these changes will affect PathVisio's core. They may require a lot of discussion beforehand, and I have also allowed time to deal with any side effects on the stable running code that may arise from introducing this feature at the core level.
Week 10 to 14: 
Work on adding other advanced features to the Comparator.
Buffer for exams and other emergencies, if any (1.5 weeks).
Week 15,16:
Two weeks of testing time, one in the middle of the program and the other at the end, to work on bug fixes, code improvements/optimization and documentation.
