NAME

hybrid2.pl - simulate a one- or two-sequence ensemble of nucleic acids

SYNOPSIS

hybrid2.pl --A0=A0 --B0=B0 [OPTION]... FILE1 FILE2

DESCRIPTION

hybrid2.pl simulates an ensemble of two RNA or DNA sequences. Such an ensemble consists of five species - one heterodimer, two homodimers and two monomers - but species known to be irrelevant can be excluded to save time. In its default mode, hybrid2.pl operates by running hybrid to simulate each dimer and hybrid-ss to simulate each monomer. It then runs concentration and concentrations.pl to compute mole fractions of each species, followed by ensemble-dg and ensemble-ext to compute the free energy and UV absorption of the ensemble. Finally, hybrid2.pl runs dG2dH, dG2dS and dG2Cp on the ensemble free energy.
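The sequence of subprograms described above can be sketched as a small Python function. This is only an illustration of the order given in the text, not the script's actual implementation. The label "AB" for the heterodimer and the relative order of the dimer and monomer runs are assumptions; the labels A, B, AA and BB follow the --exclude option.

```python
def pipeline_steps(exclude=()):
    """List (program, target) pairs in the order the description gives
    for a two-sequence ensemble in default mode."""
    dimers = ["AB", "AA", "BB"]   # "AB" is an assumed label for the heterodimer
    monomers = ["A", "B"]
    steps = []
    # hybrid simulates each dimer that has not been excluded
    for species in dimers:
        if species not in exclude:
            steps.append(("hybrid", species))
    # hybrid-ss simulates each monomer that has not been excluded
    for species in monomers:
        if species not in exclude:
            steps.append(("hybrid-ss", species))
    # ensemble-level programs, in the order named in the description
    for program in ("concentration", "concentrations.pl",
                    "ensemble-dg", "ensemble-ext",
                    "dG2dH", "dG2dS", "dG2Cp"):
        steps.append((program, "ensemble"))
    return steps


for program, target in pipeline_steps(exclude=("AA", "BB")):
    print(program, target)
```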

In addition to the output of each subprogram it runs, hybrid2.pl produces extra output in the form of PostScript plots produced with gnuplot. The plots are named with a prefix consisting of the prefixes of each file joined by a hyphen. The fraction of each species present is plotted versus temperature in prefix.conc.ps, the ensemble heat capacity (with melting temperature indicated) in prefix.Cp.ps and the ensemble extinction in prefix.ext.ps. In each case, the file of gnuplot commands is also saved with an extension of .gp.
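As a rough illustration of the naming scheme, the following Python sketch derives the plot file names from two input file names. It assumes that a file's prefix is its name with the final extension removed and that each gnuplot command file uses the plot's name with .gp in place of .ps; neither detail is stated explicitly here. The input names a.seq and b.seq are hypothetical.

```python
from pathlib import Path


def plot_files(file1, file2):
    """Map each plot kind to its (PostScript file, gnuplot command file)."""
    # Assumption: the prefix of a file is its name without the last extension.
    prefix = f"{Path(file1).stem}-{Path(file2).stem}"
    return {
        kind: (f"{prefix}.{kind}.ps", f"{prefix}.{kind}.gp")
        for kind in ("conc", "Cp", "ext")
    }


for kind, (plot, commands) in plot_files("a.seq", "b.seq").items():
    print(kind, plot, commands)
```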

If FILE2 is not specified or is the same as FILE1, or if the two files contain the same sequence, hybrid2.pl simulates a one-sequence ensemble instead. In this case, there are two species (monomer and homodimer) and hybrid2.pl runs concentration-same, concentrations-same.pl, ensemble-dg-same and ensemble-ext-same in place of their two-sequence counterparts.
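The rule for choosing between the one-sequence and two-sequence programs can be sketched as follows. The function name and the idea of passing the sequences in as strings are illustrative assumptions; how hybrid2.pl actually reads and compares sequences (for example, whether case matters) is not specified here.

```python
def ensemble_programs(file1, file2=None, seq1=None, seq2=None):
    """Return the ensemble programs for a one- or two-sequence run."""
    one_sequence = (
        file2 is None                              # FILE2 not specified
        or file2 == file1                          # FILE2 is the same as FILE1
        or (seq1 is not None and seq1 == seq2)     # same sequence in both files
    )
    if one_sequence:
        return ["concentration-same", "concentrations-same.pl",
                "ensemble-dg-same", "ensemble-ext-same"]
    return ["concentration", "concentrations.pl",
            "ensemble-dg", "ensemble-ext"]


print(ensemble_programs("a.seq"))
print(ensemble_programs("a.seq", "b.seq", seq1="GGCC", seq2="AATT"))
```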

Under certain circumstances hybrid2.pl may substitute other programs for hybrid and hybrid-ss. If hybrid2.pl is invoked as hybrid2-2s.pl, hybrid-2s.pl and hybrid-ss-2s.pl replace hybrid and hybrid-ss, so that a generalized two-state computation is performed. Additionally, if hybrid2.pl is invoked with "-x" in the name (hybrid2-x.pl or hybrid2-2s-x.pl) all species except the heterodimer are excluded. Thus hybrid2-2s-x.pl performs a "traditional" two-state calculation.

Finally, hybridization with intramolecular basepairs can be enabled by invoking hybrid2.pl as hybrid2-intra.pl. hybrid2-intra-x.pl, hybrid2-intra-2s.pl and hybrid2-intra-2s-x.pl also have their expected meanings.
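The name-based variants can be summarized with a short Python sketch that maps an invocation name to the modes it selects. This only mirrors the naming convention described above; the way the real script inspects its own name is an assumption, and the dictionary keys are hypothetical labels.

```python
import os


def mode_from_name(program_name):
    """Infer the modes implied by the name hybrid2.pl is invoked under."""
    base = os.path.basename(program_name)
    if base.endswith(".pl"):
        base = base[:-3]
    # Tokens after "hybrid2", e.g. ["intra", "2s", "x"]
    parts = base.split("-")[1:]
    return {
        "two_state": "2s" in parts,          # hybrid-2s.pl and hybrid-ss-2s.pl are used
        "heterodimer_only": "x" in parts,    # all species except the heterodimer excluded
        "intramolecular": "intra" in parts,  # intramolecular basepairs enabled
    }


for name in ("hybrid2.pl", "hybrid2-2s-x.pl", "hybrid2-intra-2s.pl"):
    print(name, mode_from_name(name))
```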

OPTIONS

Most of the options below are passed to the subprograms to which they apply, but some (--fraction, --Tmelt, --parallel and --reuse) affect the behavior of hybrid2.pl directly.

-n, --NA=RNA2|RNA|DNA: set nucleic acid type to RNA or DNA. Default is RNA.
-t, --tmin=REAL: set minimum temperature to REAL °C. Default is 0.
-i, --tinc=REAL: set temperature increment to REAL C°. Default is 1.
-T, --tmax=REAL: set maximum temperature to REAL °C. Default is 100.
-N, --sodium=NUMBER: set sodium ion concentration to NUMBER molar. Default is 1.
-M, --magnesium=NUMBER: set magnesium ion concentration to NUMBER molar. Default is 0.
-p, --polymer: use salt corrections for polymers instead of oligomers (the default).
-A, --A0=REAL: set the total concentration of A present to REAL molar.
-B, --B0=REAL: set the total concentration of B present to REAL molar.
-E, --energyOnly: skip computation of probabilities and output only prefix.dG and prefix.run. This mode uses less time and memory.
-I, --noisolate: prohibit all isolated basepairs. Isolated basepairs are helices of length 1; that is, they do not stack on another basepair on either side. (See also the --prefilter and --nopostfilter options below.)
-F, --mfold=P,W,MAX: when used in combination with "two-state" mode, causes hybrid2.pl to compute an mfold-style set of structures for each species.
-m, --maxbp=NUMBER: bases farther apart than NUMBER cannot form a basepair. Default is no limit.
-x, --exclude=A|B|AA|BB: exclude the specified species from consideration. May be used more than once, to exclude multiple species.
--fraction=REAL: assign the fraction REAL of the stacking enthalpy for each sequence with its reverse complement to stacking in the single strands. Default is 0.1. To disable entirely, use --nofraction.
--nofraction: remove stacking in unfolded single strands from consideration.
--Tmelt=REAL: assign entropy to single strands so that melting temperature is REAL °C. Default is 50.
-P, --parallel: run the calculations for each species at the same time, rather than sequentially. This option results in a significant speedup on multiprocessor machines.
-r, --reuse: assume that hybrid and hybrid-ss have already been run, and run only the ensemble computations.
--title=STRING: use STRING as the title for plots.
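As an illustration of how the options combine, the Python sketch below assembles an argument list for hybrid2.pl using only long options documented above. The file names a.seq and b.seq and the concentration values are hypothetical, and the helper itself is not part of UNAFold. The resulting list could be passed to subprocess.run on a system where hybrid2.pl is installed.

```python
import shlex


def build_command(file1, file2, a0, b0, na="RNA", tmin=0, tinc=1, tmax=100,
                  sodium=1, magnesium=0, exclude=()):
    """Assemble an argument list for a two-sequence hybrid2.pl run."""
    cmd = [
        "hybrid2.pl",
        f"--NA={na}",
        f"--tmin={tmin}", f"--tinc={tinc}", f"--tmax={tmax}",
        f"--sodium={sodium}", f"--magnesium={magnesium}",
        f"--A0={a0}", f"--B0={b0}",
    ]
    # --exclude may be given more than once, one species per occurrence
    cmd += [f"--exclude={species}" for species in exclude]
    cmd += [file1, file2]
    return cmd


# Hypothetical example: DNA duplex, both homodimers excluded
cmd = build_command("a.seq", "b.seq", a0=1e-5, b0=1e-5,
                    na="DNA", exclude=("AA", "BB"))
print(shlex.join(cmd))
```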

OBSCURE OPTIONS

--allpairs: allow basepairs to form between any two nucleotides. When --allpairs is not specified, only Watson-Crick and wobble basepairs are allowed.
--maxloop=NUMBER: set the maximum size of bulge/interior loops to NUMBER. Default is 30.
--maxas=NUMBER: set the maximum asymmetry of bulge/interior loops to NUMBER. Default is 30.
--nodangle: remove single-base stacking from consideration.
--simple: make the penalty for multibranch loops constant rather than affine.
--single: only sum extinctions for each nucleotide, rather than for each dinucleotide.
--prefilter=value1[,value2]: sets the prefilter to filter out all basepairs except those in groups of value2 adjacent basepairs of which value1 can form. value2 is the same as value1 if unspecified. Default is 2 of 2. (See also the --noisolate option above.)
--nopostfilter: disable the postfilter. The postfilter, which is enabled by default, removes from consideration all structures that consist of only one basepair.
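The argument syntax of --prefilter can be illustrated with a small parser. This sketch covers only the documented syntax (value1 with an optional value2 that defaults to value1); it does not attempt to reproduce the filtering itself, and the function name is hypothetical.

```python
def parse_prefilter(value="2,2"):
    """Parse a --prefilter argument of the form value1[,value2]."""
    parts = [int(part) for part in value.split(",")]
    if len(parts) == 1:
        # value2 is the same as value1 if unspecified
        parts.append(parts[0])
    if len(parts) != 2:
        raise ValueError(f"expected value1[,value2], got {value!r}")
    value1, value2 = parts
    return value1, value2


print(parse_prefilter())       # the documented default, 2 of 2
print(parse_prefilter("3"))
print(parse_prefilter("2,4"))
```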

ENVIRONMENT

UNAFOLDDAT: an alternate location from which to read the energy rules. The default energy rules can be overridden with files in the current directory or in the directory pointed to by UNAFOLDDAT. hybrid2.pl looks for each file first in the current directory, then in the directory specified by UNAFOLDDAT and last in /usr/local/share/unafold (or wherever the energy rules were installed).
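The lookup order can be sketched in Python as follows. This is an illustration of the three-step search described above, not the code UNAFold uses; the function name, its parameters and the fallback of returning None are assumptions, and the default directory may differ on a given installation.

```python
import os
from pathlib import Path


def find_energy_file(name, default_dir="/usr/local/share/unafold", environ=None):
    """Return the first existing copy of an energy rule file, or None."""
    environ = os.environ if environ is None else environ
    # 1. the current directory
    candidates = [Path.cwd()]
    # 2. the directory named by UNAFOLDDAT, if it is set
    if environ.get("UNAFOLDDAT"):
        candidates.append(Path(environ["UNAFOLDDAT"]))
    # 3. the installation directory for the energy rules
    candidates.append(Path(default_dir))
    for directory in candidates:
        path = directory / name
        if path.is_file():
            return path
    return None
```

Passing environ explicitly makes the function easy to exercise without modifying the real environment.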

REFERENCES

Markham, N. R. and Zuker, M. (2008) UNAFold: software for nucleic acid folding and hybridization. In Keith, J. M., editor, Bioinformatics, Volume II. Structure, Functions and Applications, number 453 in Methods in Molecular Biology, chapter 1, pages 3-31. Humana Press, Totowa, NJ. ISBN 978-1-60327-428-9.

Other potentially useful references are listed at http://www.unafold.org/Dinamelt/dinamelt-references.php

AUTHORS

Nick Markham <markham@alum.rpi.edu> and Michael Zuker <zukerm@alum.mit.edu>

COPYRIGHT

AVAILABILITY

Both commercial and non-commercial use of UNAFold require a license from RPI; see https://ipo.rpi.edu/invention/unafold-version-40.