PREreview of Unified knowledge-driven network inference from omics data
- Published
- DOI: 10.5281/zenodo.14291902
- License: CC BY 4.0
This was reviewed by my lab group. Overall we really liked this paper. We have computational biology experience but not network biology experience.
We had some difficulty tracking which algorithm or network-solving case is being considered across the figures.
It was unclear how the input and output nodes are selected in all cases, as pictured in Figure 2. Why would they differ across samples, as shown in that figure?
Supplementary Figure 1 seems to be redundantly named as "Figure 1"
How are termini decided in each example? Some cases stated this explicitly, especially Figure 6, but in many other cases it was unclear.
What is the value in Figure 4a?
Use gene names instead of SGD codes.
If they are trying to show that the conditions they are sampling span the diversity of responses, that should be stated in the text.
Figure 4b: I don't have good intuition about what these M/A values mean. Can they show z-scores instead?
Callouts for Figure 4b are also missing from the legend and text.
Figure 4b should also use gene names instead of SGD codes.
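To illustrate the z-score suggestion for Figure 4b, here is a minimal sketch of per-gene standardization. The values and the function name `zscores` are made up for illustration and are not taken from the paper; we assume one list of values per gene across conditions.

```python
from statistics import mean, stdev

def zscores(values):
    """Standardize one gene's values across conditions to mean 0, sample SD 1."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(v - m) / s for v in values]

# Hypothetical values for a single gene across four conditions (illustrative only).
gene_values = [0.2, 1.5, -0.7, 0.0]
z = zscores(gene_values)
```

After this transformation each entry reads as "standard deviations away from that gene's own average", which we would find easier to interpret in a heatmap.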
Figure 4c: Reporting only the improvement in precision and recall hides their absolute values; the actual metric values should be reported. For example, a 30% improvement that takes precision from 10% to 13% may not be meaningful.
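The arithmetic behind this concern can be sketched as follows. The two function names are ours, and the second pair of precision values is hypothetical, chosen only to show that the same relative improvement can correspond to very different absolute gains.

```python
def relative_improvement(baseline, improved):
    """Fractional change relative to the baseline value."""
    return (improved - baseline) / baseline

def absolute_improvement(baseline, improved):
    """Plain difference between the two metric values."""
    return improved - baseline

# The example from our comment: precision going from 10% to 13%.
rel_low = relative_improvement(0.10, 0.13)   # roughly 0.30
abs_low = absolute_improvement(0.10, 0.13)   # roughly 0.03

# Hypothetical contrast: the same relative gain from a higher baseline.
rel_high = relative_improvement(0.60, 0.78)  # roughly 0.30
abs_high = absolute_improvement(0.60, 0.78)  # roughly 0.18
```

Both cases would be reported as a "30% improvement", which is why we ask for the underlying precision and recall values.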
It is not clear how the 20% hold-out experiment was performed. Are they saying that if the selected network included the held-out high-abundance enzymes, those were true positives, and that excluding the low-expressed enzymes gave true negatives, etc.?
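To make our reading of the hold-out evaluation concrete, here is a minimal sketch of how we guess the counts might be defined. This is only our interpretation, not the authors' stated method; the enzyme identifiers, the labels, and the function `confusion_counts` are all hypothetical.

```python
def confusion_counts(held_out_is_high, selected_nodes):
    """Count TP/FP/TN/FN for held-out enzymes against the inferred network.

    held_out_is_high: dict mapping enzyme id -> True if high abundance.
    selected_nodes: set of node ids present in the inferred network.
    """
    tp = fp = tn = fn = 0
    for enzyme, is_high in held_out_is_high.items():
        in_network = enzyme in selected_nodes
        if is_high and in_network:
            tp += 1  # high-abundance enzyme recovered by the network
        elif is_high:
            fn += 1  # high-abundance enzyme missed
        elif in_network:
            fp += 1  # low-abundance enzyme wrongly included
        else:
            tn += 1  # low-abundance enzyme correctly excluded
    return tp, fp, tn, fn

# Hypothetical held-out set (illustrative identifiers, not real data).
held_out = {"E1": True, "E2": True, "E3": False, "E4": False, "E5": True}
selected = {"E1", "E3", "E5", "E9"}

tp, fp, tn, fn = confusion_counts(held_out, selected)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
```

If the evaluation was defined differently, stating the equivalent of these four cases in the methods would resolve our question.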
The metabolic interpretation of Figure 4d lacks detail, for example in the claim that “Most of these genes seem to have an indirect connection with the metabolism of fatty acids.” ACO1 and PFK2, for instance, are core enzymes of central carbon metabolism (the TCA cycle and glycolysis, respectively).
Show more directly the nature of the true positives and false positives from the held-out 20% sets.
Figure 4: We are confused about how the model is set up mathematically here. Enzyme quantities in these pathways are the input and a PKN with reactions is used, but what, for example, are the input and output nodes?
The order of the inset vs heatmap in figure 5a doesn't match
Figure 5: We believe that the PKN contributes the network shape, but it's unclear to us if the signs of the connections are derived from the network as suggested by panel e
Figure 6 in general seems very preliminary, with very little explanation; in the rest of the paper they were more systematic, with comparisons to other tools.
Figure 6: It is unclear why FragPipe needed to be brought up and why they didn’t use the site-level quantities already provided from the paper. State why they needed to re-analyze the raw proteomics data.
Figure 6 panels are explained out of order, which is confusing (a>b>e>d>c).
Figure 6e is missing a sense of what novel information is gained and how it may be different from PHONEMES.
Figure 6d, “We selected the most variable proteins as terminal nodes”: State why this is a reasonable approach. We assume that these may be influenced by multiple factors.
Figure 6: It is unclear whether all these nodes represent transcripts, because only some nodes are labeled as genes.
Figure 1 – Revise typo in diagram: Import Prior Knowledge Network
Figure 3A and 3B – Extend the y-axis up to 200, or add a title with a description like the one in the figure legend.
In the optimization problems, how are inverse relationships handled, such as down-regulated proteins paired with increased mRNA levels (or vice versa)? Could this be a source of false negatives or false positives?
Compared to the shortest-path supplement example, it's unclear how node values would be used in this case. Are they multiplied by edge weights?
Competing interests
The authors declare that they have no competing interests.