Tabular Classification
sfisch committed on
Commit
c6a4057
·
verified ·
1 Parent(s): 72f79fd

Create README.md

Files changed (1)
  1. README.md +66 -0
README.md ADDED
@@ -0,0 +1,66 @@
1
+ ---
2
+ license: cc-by-4.0
3
+ datasets:
4
+ - sfisch/DirectContacts2
5
+ pipeline_tag: tabular-classification
6
+ repo: https://github.com/KDrewLab/DirectContacts2_analysis.git
7
+ ---
8
+ # DirectContacts2: A network of direct physical protein interactions derived from high throughput mass spectrometry experiments
9
+ Proteins carry out cellular functions by self-assembling into functional complexes, a process that depends on direct physical interactions
10
+ between components. While tools like AlphaFold and RoseTTAFold have advanced structure prediction, they remain limited in scaling to the full
11
+ human proteome. DirectContacts2 addresses this challenge by integrating diverse large-scale protein interaction datasets, including AP/MS (BioPlex1–3, Boldt et al., Hein et al.),
12
+ biochemical fractionation (Wan et al.), proximity labeling (Gupta et al., Youn et al.), and RNA pulldown (Treiber et al.), to predict whether ~26 million
13
+ human protein pairs interact directly or indirectly.
14
+
15
+ ## Funding
16
+ NIH R00, NSF/BBSRC
17
+
18
+ ## Citation
19
+
20
+ Erin R. Claussen, Miles D Woodcock-Girard, Samantha N Fischer, Kevin Drew
21
+
22
+ ## References
23
+ Kevin Drew, Christian L. Müller, Richard Bonneau, Edward M. Marcotte (2017) Identifying direct contacts between protein complex subunits from their conditional dependence in proteomics datasets. PLOS Computational Biology 13(10): e1005625. https://doi.org/10.1371/journal.pcbi.1005625
24
+ Samantha N. Fischer, Erin R Claussen, Savvas Kourtis, Sara Sdelci, Sandra Orchard, Henning Hermjakob, Georg Kustatscher, Kevin Drew. hu.MAP3.0: Atlas of human protein complexes by integration of > 25,000 proteomic experiments. Molecular Systems Biology 1–33 (2025) doi:10.1038/s44320-025-00121-5.
25
+ Erickson, Nick, Jonas Mueller, Alexander Shirkov, Hang Zhang, Pedro Larroy, Mu Li, and Alexander Smola. "Autogluon-tabular: Robust and accurate automl for structured data." arXiv preprint arXiv:2003.06505 (2020).
26
+ Huttlin et al. Dual proteome-scale networks reveal cell-specific remodeling of the human interactome. Cell. 2021 May 27;184(11):3022-3040.e28. doi: 10.1016/j.cell.2021.04.011.
27
+ Huttlin et al. Architecture of the human interactome defines protein communities and disease networks. Nature. 2017 May 25;545(7655):505-509. DOI: 10.1038/nature22366.
28
+ Treiber et al. A Compendium of RNA-Binding Proteins that Regulate MicroRNA Biogenesis. Mol Cell. 2017 Apr 20;66(2):270-284.e13. doi: 10.1016/j.molcel.2017.03.014.
29
+ Boldt et al. An organelle-specific protein landscape identifies novel diseases and molecular mechanisms. Nat Commun. 2016 May 13;7:11491. doi: 10.1038/ncomms11491.
30
+ Youn et al. High-Density Proximity Mapping Reveals the Subcellular Organization of mRNA-Associated Granules and Bodies. Mol Cell. 2018 Feb 1;69(3):517-532.e11. doi: 10.1016/j.molcel.2017.12.020.
31
+ Gupta et al. A Dynamic Protein Interaction Landscape of the Human Centrosome-Cilium Interface. Cell. 2015 Dec 3;163(6):1484-99. doi: 10.1016/j.cell.2015.10.065.
32
+ Wan, Borgeson et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015 Sep 17;525(7569):339-44. doi: 10.1038/nature14877. Epub 2015 Sep 7.
33
+ Hein et al. A human interactome in three quantitative dimensions organized by stoichiometries and abundances. Cell. 2015 Oct 22;163(3):712-23. doi: 10.1016/j.cell.2015.09.053. Epub 2015 Oct 22.
34
+ Huttlin et al. The BioPlex Network: A Systematic Exploration of the Human Interactome. Cell. 2015 Jul 16;162(2):425-40. doi: 10.1016/j.cell.2015.06.043.
35
+ Reimand et al. g:Profiler-a web server for functional interpretation of gene lists (2016 update). Nucleic Acids Res. 2016 Jul 8;44(W1):W83-9. doi: 10.1093/nar/gkw199.
36
+
37
+ ## Associated Code
38
+ Code examples using the DirectContacts2 model can be found on our
39
+ [GitHub](https://github.com/KDrewLab/DirectContacts2_analysis.git).
40
+ All feature matrices and associated files can be found in the [sfisch/DirectContacts2 datasets
41
+ repo](https://huggingface.co/datasets/sfisch/DirectContacts2).
42
+
43
+ # Usage
44
+
45
+ ## Accessing and using the model
46
+ DirectContacts2 was constructed using [AutoGluon](https://auto.gluon.ai/stable/index.html), an AutoML tool. The [TabularPredictor](https://auto.gluon.ai/stable/api/autogluon.tabular.TabularPredictor.html) class
47
+ is used to train, test, and make predictions with the model.
48
+
49
+ AutoGluon can be installed with pip:
50
+
51
+ $ pip install autogluon==0.4.0
52
+
53
+ Then it can be imported as:
54
+
55
+ >>> from autogluon.tabular import TabularPredictor
56
+ Note that AutoGluon **version 0.4.0** must be used to perform operations with our model.
57
+
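The version requirement and import above can be combined into a small loading helper. The following is a minimal sketch, not the authors' own code: it assumes the trained model has already been downloaded to a local directory, that the `autogluon` package was installed with pip (so its version is visible to `importlib.metadata`, Python 3.8+), and that the input features are a pandas DataFrame whose columns match the training feature matrix. The helper names `version_matches`, `load_predictor`, and `score_pairs` are hypothetical.

```python
from importlib import metadata

REQUIRED_AUTOGLUON = "0.4.0"  # version stated in this README


def version_matches(installed: str, required: str = REQUIRED_AUTOGLUON) -> bool:
    """Return True when the installed version string equals the required one."""
    return installed.strip() == required


def load_predictor(model_dir: str):
    """Load a saved predictor from a local directory (path is an assumption)."""
    installed = metadata.version("autogluon")
    if not version_matches(installed):
        raise RuntimeError(
            f"autogluon {installed} found; {REQUIRED_AUTOGLUON} is required"
        )
    # Imported lazily so the version check runs before AutoGluon is loaded.
    from autogluon.tabular import TabularPredictor

    return TabularPredictor.load(model_dir)


def score_pairs(predictor, features):
    """Return class probabilities, one row per protein pair in `features`."""
    return predictor.predict_proba(features)
```

A call such as `score_pairs(load_predictor("path/to/model_dir"), feature_df)` would then return per-pair probabilities, where `path/to/model_dir` and `feature_df` are placeholders for a local model copy and a feature matrix.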
58
+ To use the model and make predictions, we show two full code examples using the [full feature matrix]()
59
+ and the [test feature matrix]() in Jupyter notebooks.
60
+
61
+ All feature matrices can be pulled using the `datasets` library from Hugging Face; examples are available on our [GitHub]()
62
+ and on our [Hugging Face dataset repo: sfisch/DirectContacts2](https://huggingface.co/datasets/sfisch/DirectContacts2).
63
+
64
+
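Pulling a feature matrix with the `datasets` library might look like the sketch below. It is illustrative only: it assumes `datasets` is installed (`pip install datasets`), and the file and split names inside `sfisch/DirectContacts2` are not listed here, so they are left as optional arguments rather than guessed. The helper names are hypothetical. See the dataset repo for the actual file layout.

```python
DATASET_REPO = "sfisch/DirectContacts2"  # dataset repo named in this README


def build_load_kwargs(data_files=None, split=None):
    """Assemble keyword arguments for datasets.load_dataset, skipping unset ones."""
    kwargs = {"path": DATASET_REPO}
    if data_files is not None:
        kwargs["data_files"] = data_files
    if split is not None:
        kwargs["split"] = split
    return kwargs


def load_feature_matrix(data_files=None, split=None):
    """Download a feature matrix from the Hugging Face Hub.

    With split=None, load_dataset returns a DatasetDict keyed by split name;
    with a split name it returns a single Dataset.
    """
    # Imported lazily so this module can be inspected without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(**build_load_kwargs(data_files, split))


def to_dataframe(dataset_split):
    """Convert a single Dataset (not a DatasetDict) to a pandas DataFrame."""
    return dataset_split.to_pandas()
```

The resulting DataFrame can be passed to the predictor, provided its columns match the features the model was trained on.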
65
+ ## Model card authors
66
+ Samantha Fischer ([email protected])