Faculty Publications

SeqNLS: Nuclear Localization Signal Prediction Based on Frequent Pattern Mining and Linear Motif Scoring

J.-R. Lin
Jianjun Hu, University of South Carolina - Columbia

Document Type

Article

Subject Area(s)

Bioinformatics

Abstract

Nuclear localization signals (NLSs) are stretches of residues in proteins that mediate their import into the nucleus. NLSs are known to have diverse patterns, of which only a limited number are covered by currently known NLS motifs. Here we propose SeqNLS, a sequential pattern mining algorithm that effectively identifies potential NLS patterns without being constrained by current knowledge of NLSs. The extracted frequent sequential patterns are used to predict NLS candidates, which are then filtered by a linear motif-scoring scheme based on predicted sequence disorder and by relatively local conservation (IRLC) based masking.
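To make the idea of mining frequent patterns and using them to propose NLS candidates more concrete, here is a minimal sketch. It is not the SeqNLS algorithm: it uses contiguous fixed-length substrings (k-mers) as a stand-in for the sequential patterns mentioned in the abstract, and it omits the disorder-based scoring and conservation masking entirely. The function names `frequent_kmers` and `find_candidates`, the parameters `k` and `min_support`, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def frequent_kmers(sequences, k, min_support):
    """Return k-mers that occur in at least min_support distinct sequences."""
    support = Counter()
    for seq in sequences:
        # A set ensures each sequence counts at most once per k-mer.
        support.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return {kmer: n for kmer, n in support.items() if n >= min_support}


def find_candidates(protein, patterns):
    """Return sorted (start, pattern) pairs for every pattern occurrence."""
    hits = []
    for pattern in patterns:
        start = protein.find(pattern)
        while start != -1:
            hits.append((start, pattern))
            start = protein.find(pattern, start + 1)
    return sorted(hits)


# Toy training segments for illustration only (not from the SeqNLS datasets).
training = ["PKKKRKV", "AKKKRKL", "GKKKRKE", "MSTQLLA"]
patterns = frequent_kmers(training, k=4, min_support=3)

# Hypothetical query protein; hits mark candidate regions to be scored later.
candidates = find_candidates("MAAPKKKRKVDD", patterns)
print(patterns)
print(candidates)
```

In this toy setting, only substrings shared by at least three training segments survive as patterns, and each occurrence in the query protein becomes a candidate that a later filtering step would accept or reject.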

Experimental results on the newly curated Yeast and Hybrid datasets show that SeqNLS is effective in detecting potential NLSs. A performance comparison of SeqNLS with and without the linear motif scoring shows that linear motif features are highly complementary to sequence features in discerning NLSs. On the two independent datasets, SeqNLS not only consistently finds over 50% of NLSs with a prediction precision of at least 0.7, but also outperforms other state-of-the-art NLS prediction methods in terms of F1 score or prediction precision at similar or higher recall rates. The SeqNLS web server is available at http://mleg.cse.sc.edu/seqNLS.
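For readers unfamiliar with the metrics named above, the F1 score is the harmonic mean of precision and recall. The short sketch below shows the calculation. The inputs (precision 0.7, recall 0.5) are simply the threshold figures quoted in the abstract, used here as an illustration; the resulting value is not an F1 score reported by the authors, and the helper name `f1_score` is an assumption.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative inputs taken from the thresholds mentioned in the abstract.
example = f1_score(0.7, 0.5)
print(round(example, 3))
```

Because the harmonic mean is pulled toward the smaller of the two values, a method cannot reach a high F1 score by maximizing precision alone while recall stays low.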

Publication Info

Published in PLoS ONE, Volume 8, Issue 10, 2013, e76864.

Lin, J.-R. & Hu, J. (2013). SeqNLS: nuclear localization signal prediction based on frequent pattern mining and linear motif scoring. PLoS ONE, 8(10), e76864.

http://dx.doi.org/10.1371/journal.pone.0076864

Link to License:

https://creativecommons.org/licenses/by/4.0/

Included in

Bioinformatics Commons
