Key features of the PLM-ARG platform


The spread of antibiotic resistance is one of the most urgent health crises. Accurately identifying antibiotic resistance genes (ARGs) is an effective way to evaluate the risk of that spread. However, the most widely used methods for identifying ARGs rely on homology detection, which yields a high false-negative rate.

PLM-ARG converts protein sequences into embedding representations using the pretrained ESM-1b model, and then performs ARG identification and resistance category classification simultaneously with an XGBoost model. It takes a gene sequence or predicted ORF as input and outputs both the resistance categories (if the query is classified as an ARG) and the corresponding probability.
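The output step described above can be illustrated with a minimal sketch. It is not the PLM-ARG implementation: it assumes the classifier produces one probability per class, that one of those classes is a hypothetical "non-ARG" label, and that only the single most probable category is reported. The function name, the label, and the example category names are all illustrative.

```python
from typing import Dict, Optional, Tuple

# Hypothetical label for the "not a resistance gene" class; the actual
# PLM-ARG label set is not specified in this text.
NON_ARG = "non-ARG"


def summarize_prediction(
    class_probs: Dict[str, float],
) -> Tuple[bool, Optional[str], float]:
    """Return (is_arg, category, probability) for one query sequence.

    class_probs maps each class label to its predicted probability,
    as a multi-class classifier would produce for a single input.
    """
    if not class_probs:
        raise ValueError("class_probs must not be empty")

    # Pick the most probable class.
    best_label = max(class_probs, key=class_probs.get)
    best_prob = class_probs[best_label]

    # A resistance category is reported only when the query is
    # classified as an ARG.
    if best_label == NON_ARG:
        return False, None, best_prob
    return True, best_label, best_prob


# Example with made-up probabilities and category names.
result = summarize_prediction(
    {"non-ARG": 0.05, "beta-lactam": 0.90, "tetracycline": 0.05}
)
print(result)
```

Treating "non-ARG" as just another class is one way to make identification and category classification happen in a single prediction; whether PLM-ARG encodes it this way is an assumption here.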

The PLM-ARG model was evaluated on an independent validation set and achieved a Matthews correlation coefficient (MCC) of 0.838, outperforming other available ARG prediction tools by 51.8% to 107.9%.
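For readers unfamiliar with the metric, the sketch below computes MCC in its standard binary form from confusion-matrix counts. This text does not say whether the reported 0.838 uses the binary or the multi-class generalization, so the binary form is an assumption, and the counts in the example are made up.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Binary Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Illustrative counts only, not PLM-ARG results.
print(matthews_corrcoef(tp=90, tn=85, fp=15, fn=10))
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, which makes it informative when ARGs and non-ARGs are present in unequal numbers.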