The protein sets used to train and test the HECTAR prediction method, and the Support Vector Machine used in HECTAR, are listed here. To download a file, right-click the corresponding link.
|
| Training protein sets |
| Protein sets retrieved with the Sequence Retrieval System (SRS) at EBI: |
| Secretory pathway signal peptides |
| Mitochondrion targeted |
| Proteins without an N-terminal target peptide (nucleus/cytoplasm) |
| Experimentally verified protein sets: |
| Heterokont chloroplast targeted |
| Type II signal anchors |
|
| Most of the proteins in the heterokont chloroplast targeted set were kindly provided by Peter Kroth (University of Konstanz, Germany). The reference article for these proteins is "Protein targeting into complex diatom plastids: functional characterisation of a specific targeting motif." (Gruber et al., Plant Mol. Biology, 2007). For a detailed description of how the training sets were retrieved, see the BMC Bioinformatics article "HECTAR: A Method to Predict Subcellular Targeting in Heterokonts" (Gschloessl et al., 2008; PubMed ID 18811941). |
|
| Additional test protein sets: |
| Fucus distichus |
| Oomycete/Cryptophytes |
|
| Support Vector Machine: |
| The multi-class Support Vector Machine incorporated in HECTAR can be downloaded here. The algorithm is described in the article "Combining protein secondary structure prediction models with ensemble methods of optimal complexity." (Guermeur et al., Neurocomputing, 2004). |