Create README.md
README.md
ADDED
@@ -0,0 +1,66 @@
1 |
+
---
|
2 |
+
license: cc-by-4.0
|
3 |
+
datasets:
|
4 |
+
- sfisch/DirectContacts2
|
5 |
+
pipeline_tag: tabular-classification
|
6 |
+
repo: https://github.com/KDrewLab/DirectContacts2_analysis.git
|
7 |
+
---
|
8 |
+
# DirectContacts2: A network of direct physical protein interactions derived from high throughput mass spectrometry experiments
|
9 |
+
Proteins carry out cellular functions by self-assembling into functional complexes, a process that depends on direct physical interactions
|
10 |
+
between components. While tools like AlphaFold and RoseTTAFold have advanced structure prediction, they remain limited in scaling to the full
|
11 |
+
human proteome. DirectContacts2 addresses this challenge by integrating diverse large-scale protein interaction datasets, including AP/MS (BioPlex1–3, Boldt et al., Hein et al.),
|
12 |
+
biochemical fractionation (Wan et al.), proximity labeling (Gupta et al., Youn et al.), and RNA pulldown (Treiber et al.), to predict whether ~26 million
|
13 |
+
human protein pairs interact directly or indirectly.
|
14 |
+
|
15 |
+
## Funding
|
16 |
+
NIH R00, NSF/BBSRC
|
17 |
+
|
18 |
+
## Citation
|
19 |
+
|
20 |
+
Erin R. Claussen, Miles D Woodcock-Girard, Samantha N Fischer, Kevin Drew
|
21 |
+
|
22 |
+
## References
|
23 |
+
Kevin Drew, Christian L. Müller, Richard Bonneau, Edward M. Marcotte (2017) Identifying direct contacts between protein complex subunits from their conditional dependence in proteomics datasets. PLOS Computational Biology 13(10): e1005625. https://doi.org/10.1371/journal.pcbi.1005625
|
24 |
+
Samantha N. Fischer, Erin R Claussen, Savvas Kourtis, Sara Sdelci, Sandra Orchard, Henning Hermjakob, Georg Kustatscher, Kevin Drew. hu.MAP3.0: Atlas of human protein complexes by integration of > 25,000 proteomic experiments. Molecular Systems Biology 1–33 (2025) doi:10.1038/s44320-025-00121-5.
|
25 |
+
Erickson, Nick, Jonas Mueller, Alexander Shirkov, Hang Zhang, Pedro Larroy, Mu Li, and Alexander Smola. "Autogluon-tabular: Robust and accurate automl for structured data." arXiv preprint arXiv:2003.06505 (2020).
|
26 |
+
Huttlin et al. Dual proteome-scale networks reveal cell-specific remodeling of the human interactome. Cell. 2021 May 27;184(11):3022-3040.e28. doi: 10.1016/j.cell.2021.04.011.
|
27 |
+
Huttlin et al. Architecture of the human interactome defines protein communities and disease networks. Nature. 2017 May 25;545(7655):505-509. DOI: 10.1038/nature22366.
|
28 |
+
Treiber et al. A Compendium of RNA-Binding Proteins that Regulate MicroRNA Biogenesis. Mol Cell. 2017 Apr 20;66(2):270-284.e13. doi: 10.1016/j.molcel.2017.03.014.
|
29 |
+
Boldt et al. An organelle-specific protein landscape identifies novel diseases and molecular mechanisms. Nat Commun. 2016 May 13;7:11491. doi: 10.1038/ncomms11491.
|
30 |
+
Youn et al. High-Density Proximity Mapping Reveals the Subcellular Organization of mRNA-Associated Granules and Bodies. Mol Cell. 2018 Feb 1;69(3):517-532.e11. doi: 10.1016/j.molcel.2017.12.020.
|
31 |
+
Gupta et al. A Dynamic Protein Interaction Landscape of the Human Centrosome-Cilium Interface. Cell. 2015 Dec 3;163(6):1484-99. doi: 10.1016/j.cell.2015.10.065.
|
32 |
+
Wan, Borgeson et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015 Sep 17;525(7569):339-44. doi: 10.1038/nature14877. Epub 2015 Sep 7.
|
33 |
+
Hein et al. A human interactome in three quantitative dimensions organized by stoichiometries and abundances. Cell. 2015 Oct 22;163(3):712-23. doi: 10.1016/j.cell.2015.09.053. Epub 2015 Oct 22.
|
34 |
+
Huttlin et al. The BioPlex Network: A Systematic Exploration of the Human Interactome. Cell. 2015 Jul 16;162(2):425-40. doi: 10.1016/j.cell.2015.06.043.
|
35 |
+
Reimand et al. g:Profiler-a web server for functional interpretation of gene lists (2016 update). Nucleic Acids Res. 2016 Jul 8;44(W1):W83-9. doi: 10.1093/nar/gkw199.
|
36 |
+
|
37 |
+
## Associated Code
|
38 |
+
Code examples using the DirectContacts2 model can be found on our
|
39 |
+
[GitHub](https://github.com/KDrewLab/DirectContacts2_analysis.git).
|
40 |
+
All feature matrices and associated files can be found in the [sfisch/DirectContacts2 datasets
|
41 |
+
repo](https://huggingface.co/datasets/sfisch/DirectContacts2).
|
42 |
+
|
43 |
+
# Usage
|
44 |
+
|
45 |
+
## Accessing and using the model
|
46 |
+
DirectContacts2 was constructed using [AutoGluon](https://auto.gluon.ai/stable/index.html), an AutoML tool. The [TabularPredictor](https://auto.gluon.ai/stable/api/autogluon.tabular.TabularPredictor.html) class
|
47 |
+
is used to train, test, and make predictions with the model.
|
48 |
+
|
49 |
+
AutoGluon can be installed using the following:
|
50 |
+
|
51 |
+
$ pip install autogluon==0.4.0
|
52 |
+
|
53 |
+
Then it can be imported as:
|
54 |
+
|
55 |
+
>>> from autogluon.tabular import TabularPredictor
|
56 |
+
Note that AutoGluon **version 0.4.0** must be used to perform operations with our model.
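Because the model depends on a specific AutoGluon release, it can help to check the installed version before loading anything. The following is a minimal sketch using only the Python standard library; the helper names `installed_version` and `is_required_version` are hypothetical and are not part of AutoGluon or of our repository. It also assumes the `autogluon` meta-package was installed, as in the pip command above, rather than a sub-package on its own.

```python
from importlib import metadata

# Version stated in this README; adjust only if the model is retrained.
REQUIRED_VERSION = "0.4.0"


def installed_version(package="autogluon"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_required_version(found, required=REQUIRED_VERSION):
    """True only when the installed version matches the required one exactly."""
    return found == required


found = installed_version()
if is_required_version(found):
    print(f"autogluon {found} is installed; the model should load.")
else:
    print(f"Found autogluon version {found!r}; please install {REQUIRED_VERSION}.")
```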
|
57 |
+
|
58 |
+
To use the model and make predictions, we show two full code examples using the [full feature matrix]()
|
59 |
+
and the [test feature matrix]() in Jupyter notebooks.
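For orientation, here is a rough sketch of what loading the model and scoring protein pairs might look like. It is not taken from the notebooks. The model directory path and the positive class label (`1`) are assumptions, and the helper names are hypothetical. `TabularPredictor.load` and `predict_proba` are AutoGluon calls; the import is done lazily so the helpers can be defined even where AutoGluon is not installed.

```python
def load_predictor(model_dir):
    """Load a saved AutoGluon model directory (requires autogluon==0.4.0)."""
    # Imported lazily so this module can be imported without AutoGluon present.
    from autogluon.tabular import TabularPredictor

    return TabularPredictor.load(model_dir)


def direct_contact_scores(predictor, features, positive_label=1):
    """Return the predicted probability of the positive class for each row.

    `features` is a table of protein-pair features (for example a pandas
    DataFrame). `positive_label` is an assumption: check the label column of
    the feature matrix to see how direct contacts are actually encoded.
    """
    # predict_proba returns one column of probabilities per class label.
    proba = predictor.predict_proba(features)
    return proba[positive_label]


# Example usage (paths are placeholders, not real file names):
#   predictor = load_predictor("path/to/DirectContacts2_model")
#   scores = direct_contact_scores(predictor, feature_matrix_dataframe)
```

Keeping the scoring step separate from model loading makes it easy to reuse one loaded predictor across several feature matrices.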
|
60 |
+
|
61 |
+
All feature matrices can be pulled using the `datasets` library from HuggingFace; examples can be found on our [GitHub]()
|
62 |
+
and on our [HuggingFace dataset repo: sfisch/DirectContacts2](https://huggingface.co/datasets/sfisch/DirectContacts2)
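As a hedged illustration of pulling a feature matrix, the sketch below wraps `datasets.load_dataset`. The repository ID comes from this README, but the `split` name and any `data_files` value are assumptions: consult the dataset repo for the actual file layout. The `loader` argument is a hypothetical hook added here so the function can be exercised without network access.

```python
def load_feature_matrix(repo_id="sfisch/DirectContacts2", data_files=None,
                        split="train", loader=None):
    """Fetch a feature matrix from the Hugging Face Hub.

    `data_files` can name a specific file in the repo (the file names are not
    listed in this README, so none is assumed here). `loader` defaults to
    `datasets.load_dataset` and exists only to allow substitution in tests.
    """
    if loader is None:
        # Requires: pip install datasets
        from datasets import load_dataset
        loader = load_dataset

    kwargs = {"split": split}
    if data_files is not None:
        kwargs["data_files"] = data_files
    return loader(repo_id, **kwargs)


# Example usage (downloads data; the split name is an assumption):
#   dataset = load_feature_matrix()
#   frame = dataset.to_pandas()  # convert to a pandas DataFrame for AutoGluon
```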
|
63 |
+
|
64 |
+
|
65 |
+
## Model card authors
|
66 |
+
Samantha Fischer ([email protected])
|