IDENTIFICATION OF INTRINSICALLY DISORDERED PROTEINS AND REGIONS BY LENGTH-DEPENDENT PREDICTORS BASED ON CONDITIONAL RANDOM FIELDS

Accurate identification of intrinsically disordered proteins/regions (IDPs/IDRs) is critical for predicting protein structure and function. Previous studies have shown that IDRs of different lengths have different characteristics, and several classification-based predictors have been proposed for predicting different types of IDRs. Compared with these classification-based predictors, the previously proposed predictor IDP-CRF, a sequence labeling model based on conditional random fields (CRFs), exhibits state-of-the-art performance for predicting IDPs/IDRs.

Motivated by these methods, we propose a predictor called IDP-FSP, an ensemble of three CRF-based predictors: IDP-FSP-L, IDP-FSP-S, and IDP-FSP-G. These three predictors are specially designed to predict long, short, and generic disordered regions, respectively, and they are constructed based on different features. To the best of our knowledge, IDP-FSP is the first predictor that combines a sequence labeling algorithm with predictors specialized for IDRs of different lengths.
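
To make the ensemble idea concrete, here is a minimal sketch of how per-residue outputs from three length-specific predictors could be merged into one labeling. The abstract does not say how IDP-FSP actually fuses its three components, so the weighted average, the 0.5 threshold, the function name `ensemble_disorder`, and the "D"/"O" (disordered/ordered) label alphabet are all assumptions made purely for illustration.

```python
from typing import Dict, List


def ensemble_disorder(
    per_model_probs: Dict[str, List[float]],
    weights: Dict[str, float],
    threshold: float = 0.5,
) -> str:
    """Merge per-residue disorder probabilities from several predictors.

    Hypothetical combination rule (NOT taken from the paper): a weighted
    average per residue, labeled "D" (disordered) when the average reaches
    `threshold` and "O" (ordered) otherwise.
    """
    lengths = {len(probs) for probs in per_model_probs.values()}
    if len(lengths) != 1:
        raise ValueError("all predictors must score the same number of residues")
    n_residues = lengths.pop()
    total_weight = sum(weights[name] for name in per_model_probs)

    labels = []
    for i in range(n_residues):
        # Weighted average of the predictors' scores for residue i.
        score = sum(
            weights[name] * probs[i] for name, probs in per_model_probs.items()
        ) / total_weight
        labels.append("D" if score >= threshold else "O")
    return "".join(labels)


# Made-up scores for a four-residue toy sequence (illustrative values only).
toy_probs = {
    "IDP-FSP-L": [0.9, 0.8, 0.2, 0.1],
    "IDP-FSP-S": [0.7, 0.4, 0.3, 0.2],
    "IDP-FSP-G": [0.8, 0.6, 0.4, 0.0],
}
equal_weights = {name: 1.0 for name in toy_probs}
print(ensemble_disorder(toy_probs, equal_weights))
```

With equal weights, the first two residues average above the threshold and the last two fall below it. A real system would obtain these scores from the marginal probabilities of trained CRFs and might weight or select predictors differently.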

Experimental results on two independent test datasets show that IDP-FSP achieves better or at least comparable predictive performance compared with 26 existing state-of-the-art methods in this field, demonstrating the effectiveness of IDP-FSP.

Keywords: intrinsically disordered proteins/regions, ensemble predictor, length-dependent predictors, conditional random fields, CRFs.
