Models
We offer a catalogue of frequently used models that are already integrated with the framework and ready to be used. We maintain that if a model you'd like to use is not included in the list, you can swiftly define one yourself and leverage our framework nonetheless.
Discrete state models
Here we account for a typical set-up for a discrete state Partition:
MolecularEvolution.DiscretePartition Type
abstract type DiscretePartition <: MultiSitePartition endRepresents a states-by-sites matrix of probabilities. The following fields are loosely required:
state: A matrix of probabilities that are site-wise normalized.states: The number of states.sites: The number of sites.scaling: A vector of log-domain probability scaling factors, one per site.
And here's a list of simple concrete subtypes of DiscretePartition:
And then there are two typical BranchModels that will cooperate with this Partition:
Note
The two above can be regarded as special cases of the more general PModel, which just represents a P matrix.
A typical way of constructing your Q matrix in our ecosystem is by:
MolecularEvolution.reversibleQ Function
reversibleQ(param_vec,eq_freqs)Takes a vector of parameters and equilibrium frequencies and returns a reversible rate matrix. The parameters are the upper triangle of the rate matrix, with the diagonal elements omitted, and the equilibrium frequencies are multiplied column-wise.
sourceMolecularEvolution.nonreversibleQ Function
nonreversibleQ(param_vec)Takes a vector of parameters and returns a nonreversible rate matrix.
sourceCodon models
MolecularEvolution.CodonPartition Type
mutable struct CodonPartition <: DiscretePartitionConstructors
CodonPartition(sites; code = universal_code)
CodonPartition(state, states, sites, scaling; code = universal_code)Description
A DiscretePartition where every state represents a codon. Can be customized to use different genetic codes.
We offer constructors for the following Q matrix parameterizations:
MolecularEvolution.MG94_F3x4- for example usage, see Example 3: FUBARMolecularEvolution.MG94_F61MolecularEvolution.HB98_F61MolecularEvolution.HB98_AAfit
Use help mode, ?, in the REPL to find out more about the above.
Miscellaneous models
MolecularEvolution.InterpolatedDiscreteModel Type
Constructors
InterpolatedDiscreteModel(siz::Int64, generator, tvec::Vector{Float64})
InterpolatedDiscreteModel(Pvec::Array{Float64,3}, tvec::Vector{Float64})generator is a function that takes a time value t and returns a P matrix.
Description
Stores a number (siz) of P matrices, and the time values to which they correspond. For a requested t, the returned P matrix is (element-wise linearly) interpolated between it's two neighbours.
MolecularEvolution.PiQ Type
Constructors
PiQ(r::Float64,pi::Vector{Float64}; normalize=false)
PiQ(pi::Vector{Float64}; normalize=false)Description
The F81 substitution model, but for general dimensions. https://www.diva-portal.org/smash/get/diva2:1878793/FULLTEXT01.pdf
sourceContinuous models
The partition of interest is:
MolecularEvolution.GaussianPartition Type
Constructors
GaussianPartition(mean::Float64, var::Float64, norm_const::Float64)
GaussianPartition(mean::Float64, var::Float64) # norm_const defaults to 0.0
GaussianPartition() # mean, var, norm_const default to 0.0, 1.0, 0.0 respectivelyDescription
A partition representing a (not necessarily normalized) Gaussian distribution. norm_const is the log-domain normalization constant.
And then there are two BranchModels which are compatible with the above partition, namely:
MolecularEvolution.BrownianMotion Type
BrownianMotion(mean_drift::Float64, var_drift::Float64)A 1D continuous Brownian motion model with mean drift and variance drift.
sourceMolecularEvolution.ZeroDriftOrnsteinUhlenbeck Type
function ZeroDriftOrnsteinUhlenbeck(
mean::Float64,
volatility::Float64,
reversion::Float64,
)A 1D continuous Ornstein-Uhlenbeck process with mean, volatility, and reversion.
sourceCompound models
Branch-wise mixture
MolecularEvolution.BWMModel Type
mutable struct BWMModel{M} <: DiscreteStateModel where M <: DiscreteStateModelFields
models::Vector{<:M}: A vector of models.weights::Vector{Float64}: A vector of weights.
Description
Branch-wise mixture model.
Note
forward! and backward! are currently only defined for M<:PMatrixModel.
CAT
MolecularEvolution.CATModel Type
CATModel(models::Vector{<:BranchModel})CAT is something where you split the sites up, and assign each site to a different model (whose "data" gets stored in a contiguous block of memory).
sourceMolecularEvolution.CATPartition Type
Constructors
CATPartition(part_inds::Vector{Vector{Int}})
CATPartition(part_inds::Vector{Vector{Int}}, parts::Vector{PType})Description
A partition for the CATModel.
Covarion
MolecularEvolution.CovarionModel Type
Constructors
CovarionModel(models::Vector{<:DiscreteStateModel}, inter_transition_rates::Matrix{Float64})
CovarionModel(models::Vector{<:DiscreteStateModel}, inter_transition_rate::Float64)Description
The covarion model.
sourceMolecularEvolution.CovarionPartition Type
CovarionPartition(states,sites,models,t)A partition for the CovarionModel.
Site-wise mixture
See Example 2: GTR+Gamma for example usage.
MolecularEvolution.SWMModel Type
Constructors
SWMModel(models::Vector{<:BranchModel})
SWMModel(model::M, rs::Vector{Float64}) where {M <: BranchModel}Description
A site-wise mixture model, for site-to-site "random effects" rate variation.
sourceMolecularEvolution.SWMPartition Type
Constructors
SWMPartition(parts::Vector{PType}) where {PType <: MultiSitePartition}
SWMPartition(part::PType, n_parts::Int) where {PType <: MultiSitePartition}
SWMPartition(parts::Vector{PType}, weights::Vector{Float64}, sites::Int, states::Int, models::Int) where {PType <: MultiSitePartition}Description
A site-wise mixture partition for the SWMModel.
Lazy models
LazyPartition
MolecularEvolution.LazyPartition Type
Constructor
LazyPartition{PType}()Initialize an empty LazyPartition that is meant for wrapping a partition of type PType.
Description
With this data structure, you can wrap a partition of choice. The idea is that in some message passing algorithms, there is only a wave of partitions which need to actualize. For instance, a wave following a root-leaf path, or a depth-first traversal. In which case, we can be more economical with our memory consumption. With a worst case memory complexity of O(log(n)), where n is the number of nodes, functionality is provided for:
log_likelihood!felsenstein!sample_down!
Note
For successive felsenstein! calls, we need to extract the information at the root somehow after each call. This can be done with e.g. total_LL or site_LLs.
Further requirements
Suppose you want to wrap a partition of PType with LazyPartition:
If you're calling
log_likelihood!andfelsenstein!:obs2partition!(partition::PType, obs)that transforms an observation to a partition.
If you're calling
sample_down!:partition2obs(partition::PType)that returns the most likely state from a partition, invertsobs2partition!.
Examples
Example 1: Initializing for an upward pass
Now, we show how to wrap the CodonPartitions from Example 3: FUBAR with LazyPartition:
You simply go from initializing messages like this:
initial_partition = CodonPartition(Int64(length(seqs[1])/3))
initial_partition.state .= eq_freqs
populate_tree!(tree,initial_partition,seqnames,seqs)To this
initial_partition = CodonPartition(Int64(length(seqs[1])/3))
initial_partition.state .= eq_freqs
lazy_initial_partition = LazyPartition{CodonPartition}()
populate_tree!(tree,lazy_initial_partition,seqnames,seqs)
lazyprep!(tree, initial_partition)By this slight modification, we go from initializing and using 554 partitions to 6 during the subsequent log_likelihood! and felsenstein! calls. There is no significant decrease in performance recorded from this switch.
Example 2: Initializing for a downward pass
Now, we show how to wrap the GaussianPartitions from Quick example: Likelihood calculations under phylogenetic Brownian motion: with LazyPartition:
You simply go from initializing messages like this:
internal_message_init!(tree, GaussianPartition())To this (technically we only add 1 LOC)
initial_partition = GaussianPartition()
lazy_initial_partition = LazyPartition{GaussianPartition}()
internal_message_init!(tree, lazy_initial_partition)
lazyprep!(tree, initial_partition, direction=LazyDown(isleafnode))Note
Now, we provided a direction for lazyprep!. The direction is an instance of LazyDown, which was initialized with the isleafnode function. The function isleafnode dictates if a node saves its sampled observation after a down pass. If you use direction=LazyDown(), every node saves its observation.
Surrounding LazyPartition
MolecularEvolution.lazyprep! Function
lazyprep!(tree::FelNode, initial_message::Vector{<:Partition}; partition_list = 1:length(tree.message), direction::LazyDirection = LazyUp())Extra, intermediate step of tree preparations between initializing messages across the tree and calling message passing algorithms with LazyPartition.
Perform a
lazysort!on tree to obtain the optimal tree for a lazyfelsenstein!prop, or asample_down!.Fix
tree.parent_messageto an initial message.Preallocate sufficiently many inner partitions needed for a
felsenstein!prop, or asample_down!.Specialized preparations based on the direction of the operations (
forward!,backward!).LazyDownorLazyUp.
See also LazyDown, LazyUp.
MolecularEvolution.LazyUp Type
Constructor
LazyUp()Description
Indicate that we want to do an upward pass, e.g. felsenstein!.
MolecularEvolution.LazyDown Type
Constructors
LazyDown(stores_obs)
LazyDown() = LazyDown(x::FelNode -> true)Description
Indicate that we want to do a downward pass, e.g. sample_down!. The function passed to the constructor takes a node::FelNode as input and returns a Bool that decides if node stores its observations.