Archaeon Extremophile Enzyme Discovery Pipeline

Report generated: 2026-03-31 00:02:45  |  Version 1.0.0  |  Sources: NCBI  |  Total candidates evaluated: 85

Candidates Ranked: 20
Top IAS Score: 94.8
Mean IAS Score: 77.0
Structures Predicted: 5
Top Predicted Family: unknown
Top Candidate Organism: Thermus thermophilus

Research Tool Disclaimer: Archaeon is a computational discovery pipeline. All scores and predictions are theoretical. Candidates require experimental validation before any industrial, clinical, or commercial use. Percent identity to reference enzymes does not guarantee functional equivalence.
Score Distribution and Analysis
Ranked Candidates
| Rank | Accession | Description | Organism | Biome | Predicted Family | IAS | BLAST Identity | Length (aa) |
|---|---|---|---|---|---|---|---|---|
| #1 | WP_462843465.1 | FAD-dependent oxidoreductase [Thermus thermophilus] | Thermus thermophilus | Thermus thermophilus | unknown | 94.79 | 100.00% | 288 |
| #2 | WP_011277367.1 | aminotransferase class I/II-fold pyridoxal phosphate-depende | Sulfolobus acidocaldarius | Sulfolobus acidocaldarius | unknown | 92.94 | 100.00% | 317 |
| #3 | WP_462847055.1 | ArsR family transcriptional regulator [Thermus thermophilus] | Thermus thermophilus | Thermus thermophilus | unknown | 92.83 | 100.00% | 231 |
| #4 | WP_462843577.1 | PhoU domain-containing protein [Thermus thermophilus] | Thermus thermophilus | Thermus thermophilus | unknown | 92.64 | 100.00% | 489 |
| #5 | WP_010867187.1 | pyridoxal-phosphate dependent enzyme [Pyrococcus abyssi] | Pyrococcus abyssi | Pyrococcus abyssi | unknown | 92.53 | 100.00% | 330 |
| #6 | WP_010884169.1 | pyridoxal-phosphate dependent enzyme [Pyrococcus horikoshii] | Pyrococcus horikoshii | Pyrococcus horikoshii | unknown | 72.39 |  | 325 |
| #7 | WP_462847087.1 | MFS transporter [Thermus thermophilus] | Thermus thermophilus | Thermus thermophilus | unknown | 72.27 |  | 367 |
| #8 | VAY86272.1 | Phenylalanyl-tRNA synthetase beta chain [hydrothermal vent m | hydrothermal vent metagenome | water | unknown | 72.10 |  | 775 |
| #9 | VAY86277.1 | BatA (Bacteroides aerotolerance operon) [hydrothermal vent m | hydrothermal vent metagenome | water | unknown | 72.06 |  | 251 |
| #10 | WP_462847059.1 | AAA family ATPase [Thermus thermophilus] | Thermus thermophilus | Thermus thermophilus | unknown | 72.02 |  | 372 |
| #11 | WP_462846924.1 | ABC transporter permease [Thermus thermophilus] | Thermus thermophilus | Thermus thermophilus | unknown | 71.77 |  | 336 |
| #12 | WP_288073270.1 | thiamine pyrophosphate-dependent enzyme [Pyrococcus sp.] | Pyrococcus sp. | Pyrococcus sp. | unknown | 71.58 |  | 288 |
| #13 | VAY86268.1 | FIG00388595: hypothetical protein [hydrothermal vent metagen | hydrothermal vent metagenome | water | unknown | 71.57 |  | 513 |
| #14 | WP_230952556.1 | aminotransferase class III-fold pyridoxal phosphate-dependen | Sulfolobus acidocaldarius | Sulfolobus acidocaldarius | unknown | 71.46 |  | 387 |
| #15 | VAY86280.1 | 2-amino-4-hydroxy-6-hydroxymethyldihydropteridinepyrophospho | hydrothermal vent metagenome | water | unknown | 71.40 |  | 161 |
| #16 | WP_014734129.1 | aminotransferase class V-fold PLP-dependent enzyme [Pyrococc | Pyrococcus sp. ST04 | Pyrococcus sp. ST04 | unknown | 71.40 |  | 396 |
| #17 | WP_168064672.1 | thiamine pyrophosphate-dependent enzyme [Sulfolobus sp. S-19 | Sulfolobus sp. S-194 | Sulfolobus sp. S-194 | unknown | 71.30 |  | 598 |
| #18 | WP_462843723.1 | HoxN/HupN/NixA family nickel/cobalt transporter [Thermus the | Thermus thermophilus | Thermus thermophilus | unknown | 71.27 |  | 220 |
| #19 | WP_462843641.1 | 2Fe-2S iron-sulfur cluster-binding protein [Thermus thermoph | Thermus thermophilus | Thermus thermophilus | unknown | 71.27 |  | 327 |
| #20 | WP_011011122.1 | MULTISPECIES: pyridoxal-phosphate dependent enzyme [Pyrococc | Pyrococcus | Pyrococcus | unknown | 71.18 |  | 329 |

Predicted Structures (ESMFold)

Structure: WP_462843465.1 (Rank #1)

Mean pLDDT: 89.5 | Very High: 65% | High: 32% | Low: 2% | Very Low: 1%
pLDDT bands: Very High (≥90), High (70-90), Low (50-70), Very Low (<50)

Structure: WP_011277367.1 (Rank #2)

Mean pLDDT: 76.8 | Very High: 18% | High: 54% | Low: 23% | Very Low: 5%

Structure: WP_462847055.1 (Rank #3)

Mean pLDDT: 86.7 | Very High: 58% | High: 31% | Low: 11% | Very Low: 0%

Structure: WP_462843577.1 (Rank #4)

Mean pLDDT: 92.3 | Very High: 75% | High: 25% | Low: 0% | Very Low: 0%

Structure: WP_010867187.1 (Rank #5)

Mean pLDDT: 93.3 | Very High: 85% | High: 15% | Low: 0% | Very Low: 0%

Methodology Notes

Data Sources: Protein sequences were retrieved from NCBI Entrez (protein database) and MGnify (metagenomic protein database) using targeted biome queries for extreme environments including hydrothermal vents, hot springs, hypersaline lakes, and acidic environments.
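
As an illustration of the NCBI retrieval step, a query against the Entrez protein database could be assembled roughly as in the sketch below. It only builds an E-utilities esearch URL and does not send it. The organism-based search term, the retmax value, and the function name build_esearch_url are assumptions made for illustration; the report does not state the pipeline's actual query strings.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(organism: str, retmax: int = 20) -> str:
    """Build an esearch URL for protein records of one organism.

    Hypothetical helper: the real pipeline's search terms are not given
    in the report, so an organism filter stands in for a biome query.
    """
    params = {
        "db": "protein",                  # Entrez protein database
        "term": f"{organism}[Organism]",  # assumed search term
        "retmax": retmax,                 # maximum number of IDs returned
        "retmode": "json",
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_esearch_url("Thermus thermophilus"))
```

The URL can be fetched with any HTTP client; the returned ID list would then be passed to efetch to download the sequences.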

IAS Formula: IAS = 0.40 * Thermostability + 0.20 * Quality + 0.40 * BLAST. Thermostability component combines aliphatic index (30%), instability index (25%), charged residue fraction (25%), proline content (10%), and aromaticity (10%). BLAST component uses percent identity to the best match among 8 reference thermostable enzyme families.
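
The weighting above can be sketched as follows. This is a minimal illustration and not the pipeline's code: it assumes every sub-score has already been normalized to a 0-100 scale where higher is better (the report does not say how the raw aliphatic index, instability index, and other properties are normalized), and the names thermostability_score and ias_score are hypothetical.

```python
# Weights taken from the IAS formula stated in this report.
THERMO_WEIGHTS = {
    "aliphatic_index": 0.30,
    "instability_index": 0.25,
    "charged_fraction": 0.25,
    "proline_content": 0.10,
    "aromaticity": 0.10,
}


def thermostability_score(sub_scores: dict) -> float:
    """Weighted sum of the five thermostability sub-scores.

    Assumption: each sub-score is already normalized to 0-100.
    """
    missing = set(THERMO_WEIGHTS) - set(sub_scores)
    if missing:
        raise KeyError(f"missing sub-scores: {sorted(missing)}")
    return sum(w * sub_scores[name] for name, w in THERMO_WEIGHTS.items())


def ias_score(thermostability: float, quality: float, blast: float) -> float:
    """IAS = 0.40 * Thermostability + 0.20 * Quality + 0.40 * BLAST."""
    return 0.40 * thermostability + 0.20 * quality + 0.40 * blast


if __name__ == "__main__":
    # Made-up inputs for illustration only, not values from the report.
    thermo = thermostability_score({name: 80.0 for name in THERMO_WEIGHTS})
    print(round(ias_score(thermo, quality=90.0, blast=100.0), 2))
```

Because the thermostability and BLAST components each carry a weight of 0.40, a 10-point change in either one moves the IAS by 4 points, while the same change in Quality moves it by 2.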

Structure Prediction: ESMFold (Meta AI, Lin et al. 2022) via public REST API. pLDDT scores are read from the B-factor column of the PDB output. Sequences under 400 aa are predicted in full; sequences of 400 aa or more are truncated before prediction.
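
The per-structure summaries in this report (mean pLDDT and the share of residues in each confidence band) could be derived from a PDB file along the lines of the sketch below. It makes several assumptions: one value is taken per residue from the alpha-carbon (CA) ATOM record, B-factors are on a 0-100 scale, and the band boundaries follow the legend used in this report. The function names are hypothetical.

```python
def plddt_from_pdb(pdb_text: str) -> list:
    """Read one pLDDT value per residue from CA atoms in PDB text.

    Fixed-width PDB columns: atom name in 13-16, B-factor in 61-66.
    """
    values = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            values.append(float(line[60:66]))
    return values


def summarize_plddt(values: list) -> tuple:
    """Return (mean pLDDT, percent of residues per confidence band)."""
    if not values:
        raise ValueError("no pLDDT values found")
    counts = {"Very High": 0, "High": 0, "Low": 0, "Very Low": 0}
    for v in values:
        if v >= 90:
            counts["Very High"] += 1
        elif v >= 70:
            counts["High"] += 1
        elif v >= 50:
            counts["Low"] += 1
        else:
            counts["Very Low"] += 1
    n = len(values)
    mean = sum(values) / n
    return mean, {band: 100.0 * c / n for band, c in counts.items()}
```

If the B-factors in a given file turn out to be on a 0-1 scale, they would need to be multiplied by 100 before banding.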

Limitations: IAS scores are theoretical rankings, not experimental thermostability measurements. BLAST identity cutoff for reliable function annotation is approximately 40-60% depending on protein family. pLDDT scores are confidence estimates for structural coordinates, not thermostability predictions.