Efficient known ncRNA search including pseudoknot structures

Cheng Yuan and Yanni Sun

Searching for members of known pseudoknot families is an important component of genome-scale ncRNA annotation. However, state-of-the-art methods for known ncRNA search are based on context-free grammars (CFGs), which cannot effectively model pseudoknots. Existing CFG-based ncRNA identification tools therefore usually ignore pseudoknots during search and, as a result, report dozens of sequences that do not contain the native pseudoknot structures. When pseudoknot structures play important functional roles, these sequences may be false positives. There is thus a need for efficient detection of pseudoknot structures in large databases.
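To make the notion of a pseudoknot concrete, the following is a minimal Python sketch, not the authors' method or code. It assumes a structure written in extended dot-bracket notation, where "()" and "[]" mark two sets of base pairs, and it flags a pseudoknot whenever two base pairs cross (i < k < j < l). The function names parse_pairs and has_pseudoknot are hypothetical and chosen only for illustration. Crossing pairs of this kind are exactly what a properly nested, CFG-style structure cannot represent.

```python
def parse_pairs(structure):
    """Return sorted base pairs (i, j) from a dot-bracket string using () and []."""
    closer_to_opener = {")": "(", "]": "["}
    stacks = {"(": [], "[": []}  # one stack of open positions per bracket type
    pairs = []
    for pos, ch in enumerate(structure):
        if ch in stacks:
            stacks[ch].append(pos)
        elif ch in closer_to_opener:
            stack = stacks[closer_to_opener[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return sorted(pairs)


def has_pseudoknot(pairs):
    """True if any two base pairs cross, i.e. i < k < j < l."""
    return any(i < k < j < l for (i, j) in pairs for (k, l) in pairs)


if __name__ == "__main__":
    nested = "((..))..[[..]]"   # two separate stems, no crossing pairs
    knotted = "((..[[..))..]]"  # the second stem opens inside the first and closes outside it
    print(has_pseudoknot(parse_pairs(nested)))
    print(has_pseudoknot(parse_pairs(knotted)))
```

The pairwise check is quadratic in the number of base pairs, which is fine for illustrating the definition but is not meant as a database-scale search procedure.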

Download source code

The source code comes with models of all sub-structures.