3.6 Input sequence file name selection

At this point, the user is prompted for the name of a file containing one or more sequences. These sequences can be in GenBank [24], EMBL [25], PIR [26], IntelliGenetics [27], or GCG [22] format. Sequences must use upper-case letters. The program recognizes ``A'', ``C'', ``G'' and ``U''; the letter ``T'' is treated as ``U''. In addition, the letters ``B'', ``Z'', ``H'' and ``V'' are recognized as ``A'', ``C'', ``G'' and ``U'', respectively, and bases entered this way are flagged by the program as being accessible to single-strand nuclease cleavage. A flagged base can pair only if its 3′ neighbor is single-stranded. This prevents bases that are accessible to single-strand specific nucleases from being paired in the middle of helical regions. Any other letter will not be allowed to base pair, nor will it be allowed to contribute to single base stacking energies. I do not advise the use of other letters. Bases can also be prevented from pairing by selecting the ``single prohibit'' option (4.6) in the main menu.
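
The letter rules above can be summarized in a short Python sketch. This is only an illustration of the rules as stated, not the program's own code: the names STANDARD, FLAGGED and interpret_base are hypothetical, and treating lower-case input like any other unrecognized letter is an assumption.

```python
# Illustrative sketch of the letter rules described above.
# All names here are hypothetical; this is not the program's source code.

# Letters read as ordinary bases ("T" is treated as "U").
STANDARD = {"A": "A", "C": "C", "G": "G", "U": "U", "T": "U"}

# Letters read as bases flagged as accessible to single-strand nucleases.
FLAGGED = {"B": "A", "Z": "C", "H": "G", "V": "U"}


def interpret_base(letter):
    """Return (base, nuclease_flagged, may_pair) for one input letter.

    For a flagged base, may_pair is True, but pairing is still subject to
    the extra condition that its 3' neighbor is single-stranded.
    """
    if letter in STANDARD:
        return STANDARD[letter], False, True
    if letter in FLAGGED:
        return FLAGGED[letter], True, True
    # Any other letter is kept unchanged and is never allowed to pair.
    # Assumption: lower-case letters fall into this category as well.
    return letter, False, False


def interpret_sequence(sequence):
    """Apply interpret_base to every letter of a sequence string."""
    return [interpret_base(letter) for letter in sequence]
```

For example, interpret_base("H") yields a ``G'' that is flagged as nuclease accessible, while interpret_base("N") yields a letter that may not pair at all.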

Except in ``Multiple molecules'' mode, where each complete sequence is read automatically, the program will display the names of the sequences in the file. The user selects a sequence by number, and is then prompted for the 5′ end and finally the 3′ end of the portion to be folded. Program execution continues with energy file input (3.8).
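
The selection step can be sketched as follows, again in Python and only as an illustration. The function name select_fragment is hypothetical, and the sketch assumes that sequences are numbered from 1 and that the 5′ and 3′ ends are 1-based positions with both ends included in the folded portion.

```python
# Illustrative sketch of choosing a sequence and the portion to fold.
# Assumptions: sequences are numbered from 1, and the 5' and 3' ends are
# 1-based, inclusive positions. The function name is hypothetical.

def select_fragment(sequences, choice, five_prime, three_prime):
    """Return (name, fragment) for the chosen sequence and end positions.

    sequences is a list of (name, sequence_string) pairs, in file order.
    """
    if not 1 <= choice <= len(sequences):
        raise ValueError("no sequence with that number")
    name, sequence = sequences[choice - 1]
    if not 1 <= five_prime <= three_prime <= len(sequence):
        raise ValueError("5' and 3' ends must lie within the sequence")
    # Convert the 1-based inclusive range to a Python slice.
    return name, sequence[five_prime - 1:three_prime]


if __name__ == "__main__":
    entries = [("first", "ACGU"), ("second", "GGGAAACCC")]
    for number, (name, _) in enumerate(entries, start=1):
        print(number, name)
    print(select_fragment(entries, 2, 4, 6))
```

Choosing the 5′ end as 1 and the 3′ end as the sequence length folds the complete sequence.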

Michael Zuker
Thu Nov 2 14:28:14 CST 1995