Full API
Index
MolecularEvolution.AbstractUpdate
MolecularEvolution.AminoAcidPartition
MolecularEvolution.BWMModel
MolecularEvolution.BWMModel
MolecularEvolution.BranchlengthSampler
MolecularEvolution.BrownianMotion
MolecularEvolution.CATModel
MolecularEvolution.CATPartition
MolecularEvolution.CladeStats
MolecularEvolution.CodonPartition
MolecularEvolution.CovarionModel
MolecularEvolution.CovarionPartition
MolecularEvolution.CustomDiscretePartition
MolecularEvolution.DiagonalizedCTMC
MolecularEvolution.DiscretePartition
MolecularEvolution.GappyAminoAcidPartition
MolecularEvolution.GappyNucleotidePartition
MolecularEvolution.GaussianPartition
MolecularEvolution.GeneralCTMC
MolecularEvolution.InterpolatedDiscreteModel
MolecularEvolution.LazyDown
MolecularEvolution.LazyPartition
MolecularEvolution.LazyUp
MolecularEvolution.NucleotidePartition
MolecularEvolution.PModel
MolecularEvolution.PiQ
MolecularEvolution.SWMModel
MolecularEvolution.SWMPartition
MolecularEvolution.StandardUpdate
MolecularEvolution.ZeroDriftOrnsteinUhlenbeck
Base.:==
MolecularEvolution.BayesUpdate
MolecularEvolution.HB98_AAfit
MolecularEvolution.HB98_F61
MolecularEvolution.HIPSTR
MolecularEvolution.HIPSTR
MolecularEvolution.MG94_F3x4
MolecularEvolution.MG94_F61
MolecularEvolution.MaxLikUpdate
MolecularEvolution.SWM_prob_grid
MolecularEvolution._mapreduce
MolecularEvolution.allocate!
MolecularEvolution.backward!
MolecularEvolution.backward!
MolecularEvolution.bfs_mapreduce
MolecularEvolution.branchlength_optim!
MolecularEvolution.branchlength_optim!
MolecularEvolution.branchlength_update!
MolecularEvolution.brents_method_minimize
MolecularEvolution.cascading_max_state_dict
MolecularEvolution.cascading_max_state_dict
MolecularEvolution.char_proportions
MolecularEvolution.collect_clades!
MolecularEvolution.collect_leaf_dists
MolecularEvolution.colored_seq_draw
MolecularEvolution.combine!
MolecularEvolution.combine!
MolecularEvolution.copy_tree
MolecularEvolution.count_F3x4
MolecularEvolution.count_F61
MolecularEvolution.deepequals
MolecularEvolution.dfs_mapreduce
MolecularEvolution.discrete_name_color_dict
MolecularEvolution.draw_example_tree
MolecularEvolution.endpoint_conditioned_sample_state_dict
MolecularEvolution.endpoint_conditioned_sample_state_dict
MolecularEvolution.expected_subs_per_site
MolecularEvolution.felsenstein!
MolecularEvolution.felsenstein_down!
MolecularEvolution.felsenstein_roundtrip!
MolecularEvolution.forward!
MolecularEvolution.forward!
MolecularEvolution.gappy_Q_from_symmetric_rate_matrix
MolecularEvolution.get_highlighter_legend
MolecularEvolution.get_max_depth
MolecularEvolution.get_phylo_tree
MolecularEvolution.get_phylo_tree
MolecularEvolution.golden_section_maximize
MolecularEvolution.hash_clade
MolecularEvolution.highlight_seq_draw
MolecularEvolution.highlighter_tree_draw
MolecularEvolution.internal_message_init!
MolecularEvolution.internal_message_init!
MolecularEvolution.internal_nodes
MolecularEvolution.istreeconsistent
MolecularEvolution.ladder_tree_sim
MolecularEvolution.lazyprep!
MolecularEvolution.lazysort!
MolecularEvolution.leaf_distmat
MolecularEvolution.leaf_names
MolecularEvolution.leaf_samples
MolecularEvolution.leaves
MolecularEvolution.linear_scale
MolecularEvolution.log_likelihood
MolecularEvolution.log_likelihood!
MolecularEvolution.longest_path
MolecularEvolution.marginal_state_dict
MolecularEvolution.marginal_state_dict
MolecularEvolution.matrix_for_display
MolecularEvolution.metropolis_sample
MolecularEvolution.metropolis_sample
MolecularEvolution.metropolis_step
MolecularEvolution.midpoint
MolecularEvolution.mix
MolecularEvolution.name2node_dict
MolecularEvolution.newick
MolecularEvolution.newick
MolecularEvolution.nni_optim!
MolecularEvolution.nni_optim!
MolecularEvolution.nni_update!
MolecularEvolution.node_distances
MolecularEvolution.node_names
MolecularEvolution.node_samples
MolecularEvolution.nodes
MolecularEvolution.nonreversibleQ
MolecularEvolution.parent_list
MolecularEvolution.partition2obs
MolecularEvolution.partition2obs
MolecularEvolution.plot_multiple_trees
MolecularEvolution.plot_multiple_trees
MolecularEvolution.populate_tree!
MolecularEvolution.populate_tree!
MolecularEvolution.promote_internal
MolecularEvolution.quadratic_CI
MolecularEvolution.quadratic_CI
MolecularEvolution.read_fasta
MolecularEvolution.read_fasta
MolecularEvolution.read_newick_tree
MolecularEvolution.read_newick_tree
MolecularEvolution.refresh!
MolecularEvolution.reversibleQ
MolecularEvolution.reversibleQ
MolecularEvolution.root2tip_distances
MolecularEvolution.root_optim!
MolecularEvolution.root_optim!
MolecularEvolution.root_update!
MolecularEvolution.sample_down!
MolecularEvolution.sample_down!
MolecularEvolution.sample_from_message!
MolecularEvolution.savefig_tweakSVG
MolecularEvolution.savefig_tweakSVG
MolecularEvolution.savefig_tweakSVG
MolecularEvolution.shortest_path_between_nodes
MolecularEvolution.sibling_inds
MolecularEvolution.siblings
MolecularEvolution.sim_tree
MolecularEvolution.sim_tree
MolecularEvolution.sim_tree
MolecularEvolution.simple_radial_tree_plot
MolecularEvolution.simple_tree_draw
MolecularEvolution.standard_tree_sim
MolecularEvolution.total_LL
MolecularEvolution.tree2distances
MolecularEvolution.tree2shared_branch_lengths
MolecularEvolution.tree_draw
MolecularEvolution.tree_draw
MolecularEvolution.tree_polish!
MolecularEvolution.tree_polish!
MolecularEvolution.unc2probvec
MolecularEvolution.unc2probvec
MolecularEvolution.univariate_maximize
MolecularEvolution.univariate_maximize
MolecularEvolution.univariate_sampler
MolecularEvolution.values_from_phylo_tree
MolecularEvolution.values_from_phylo_tree
MolecularEvolution.weightEM
MolecularEvolution.write_fasta
MolecularEvolution.write_fasta
MolecularEvolution.write_nexus
MolecularEvolution.write_nexus
Docstrings
MolecularEvolution.AbstractUpdate Type
Summary
abstract type AbstractUpdate <: Function
A callable type that typically takes (tree::FelNode, models; partition_list=1:length(tree.message))
, updates tree
and models
, and returns the updated tree
and models
.
Example
Define a new subtype, where foo
and bar
are arbitrary updating functions
struct MyUpdate <: AbstractUpdate end
function (update::MyUpdate)(tree::FelNode, models; partition_list=1:length(tree.message))
tree, models = foo(tree, models, partition_list=partition_list)
tree, models = BayesUpdate(nni=0)(tree, models, partition_list=partition_list)
tree, models = bar(tree, models, partition_list=partition_list)
return tree, models
end
See also: StandardUpdate
MolecularEvolution.AminoAcidPartition Type
mutable struct AminoAcidPartition <: DiscretePartition
Constructors
AminoAcidPartition(sites)
AminoAcidPartition(freq_vec::Vector{Float64}, sites::Int64)
AminoAcidPartition(state, states, sites, scaling)
Description
A DiscretePartition
for amino acid sequences with 20 states representing the standard amino acids.
MolecularEvolution.BWMModel Type
mutable struct BWMModel{M} <: DiscreteStateModel where M <: DiscreteStateModel
Fields
models::Vector{<:M}
: A vector of models.weights::Vector{Float64}
: A vector of weights.
Description
Branch-wise mixture model.
Note
forward!
and backward!
are currently only defined for M<:PMatrixModel
.
MolecularEvolution.BWMModel Method
BWMModel{M}(models::Vector{<:M}) where M <: DiscreteStateModel
Convenience constructor where the weights are uniform.
sourceMolecularEvolution.BranchlengthSampler Type
BranchlengthSampler
A type that allows you to specify a additive proposal function in the log domain and a prior distrubution over the log of the branchlengths. It also stores `acc_ratio` which is a tuple of `(ratio, total, #acceptances)`, where `ratio::Float64` is the acceptance ratio, `total::Int64` is the total number of proposals, and `#acceptances::Int64` is the number of acceptances.
MolecularEvolution.BrownianMotion Type
BrownianMotion(mean_drift::Float64, var_drift::Float64)
A 1D continuous Brownian motion model with mean drift and variance drift.
sourceMolecularEvolution.CATModel Type
CATModel(models::Vector{<:BranchModel})
CAT is something where you split the sites up, and assign each site to a different model (whose "data" gets stored in a contiguous block of memory).
sourceMolecularEvolution.CATPartition Type
Constructors
CATPartition(part_inds::Vector{Vector{Int}})
CATPartition(part_inds::Vector{Vector{Int}}, parts::Vector{PType})
Description
A partition for the CATModel
.
MolecularEvolution.CladeStats Type
Store statistics about a clade: its frequency and observed child pairs.
sourceMolecularEvolution.CodonPartition Type
mutable struct CodonPartition <: DiscretePartition
Constructors
CodonPartition(sites; code = universal_code)
CodonPartition(state, states, sites, scaling; code = universal_code)
Description
A DiscretePartition
where every state represents a codon. Can be customized to use different genetic codes.
MolecularEvolution.CovarionModel Type
Constructors
CovarionModel(models::Vector{<:DiscreteStateModel}, inter_transition_rates::Matrix{Float64})
CovarionModel(models::Vector{<:DiscreteStateModel}, inter_transition_rate::Float64)
Description
The covarion model.
sourceMolecularEvolution.CovarionPartition Type
CovarionPartition(states,sites,models,t)
A partition for the CovarionModel
.
MolecularEvolution.CustomDiscretePartition Type
mutable struct CustomDiscretePartition <: DiscretePartition
Constructors
CustomDiscretePartition(states, sites)
CustomDiscretePartition(freq_vec::Vector{Float64}, sites::Int64)
CustomDiscretePartition(state, states, sites, scaling)
Description
A general-purpose DiscretePartition
that can handle any number of states. Useful for custom discrete state spaces that don't fit into the predefined partition types.
MolecularEvolution.DiagonalizedCTMC Type
Constructors
DiagonalizedCTMC(Q::Array{Float64,2})
DiagonalizedCTMC(Qpre::Array{Float64,2}, pi::Vector{Float64})
Description
Takes in a Q matrix (which can be multiplied onto row-wise by pi
) and diagonalizes it. When computing
Warning
Construction fails if Q
has complex eigenvalues.
MolecularEvolution.DiscretePartition Type
abstract type DiscretePartition <: MultiSitePartition end
Represents a states
-by-sites
matrix of probabilities. The following fields are loosely required:
state
: A matrix of probabilities that are site-wise normalized.states
: The number of states.sites
: The number of sites.scaling
: A vector of log-domain probability scaling factors, one per site.
MolecularEvolution.GappyAminoAcidPartition Type
mutable struct GappyAminoAcidPartition <: DiscretePartition
Constructors
GappyAminoAcidPartition(sites)
GappyAminoAcidPartition(freq_vec::Vector{Float64}, sites::Int64)
GappyAminoAcidPartition(state, states, sites, scaling)
Description
A DiscretePartition
for amino acid sequences with 21 states (20 standard amino acids plus '-' for gaps).
MolecularEvolution.GappyNucleotidePartition Type
mutable struct GappyNucleotidePartition <: DiscretePartition
Constructors
GappyNucleotidePartition(sites)
GappyNucleotidePartition(freq_vec::Vector{Float64}, sites::Int64)
GappyNucleotidePartition(state, states, sites, scaling)
Description
A DiscretePartition
for nucleotide sequences with 5 states (A, C, G, T, -), where '-' represents a gap.
MolecularEvolution.GaussianPartition Type
Constructors
GaussianPartition(mean::Float64, var::Float64, norm_const::Float64)
GaussianPartition(mean::Float64, var::Float64) # norm_const defaults to 0.0
GaussianPartition() # mean, var, norm_const default to 0.0, 1.0, 0.0 respectively
Description
A partition representing a (not necessarily normalized) Gaussian distribution. norm_const
is the log-domain normalization constant.
MolecularEvolution.GeneralCTMC Type
function GeneralCTMC(Q::Array{Float64,2})
Wraps a Q matrix and will compute exp
.
MolecularEvolution.InterpolatedDiscreteModel Type
Constructors
InterpolatedDiscreteModel(siz::Int64, generator, tvec::Vector{Float64})
InterpolatedDiscreteModel(Pvec::Array{Float64,3}, tvec::Vector{Float64})
generator
is a function that takes a time value t
and returns a P matrix.
Description
Stores a number (siz
) of P matrices, and the time values to which they correspond. For a requested t, the returned P matrix is (element-wise linearly) interpolated between it's two neighbours.
MolecularEvolution.LazyDown Type
Constructors
LazyDown(stores_obs)
LazyDown() = LazyDown(x::FelNode -> true)
Description
Indicate that we want to do a downward pass, e.g. sample_down!
. The function passed to the constructor takes a node::FelNode
as input and returns a Bool
that decides if node
stores its observations.
MolecularEvolution.LazyPartition Type
Constructor
LazyPartition{PType}()
Initialize an empty LazyPartition
that is meant for wrapping a partition of type PType
.
Description
With this data structure, you can wrap a partition of choice. The idea is that in some message passing algorithms, there is only a wave of partitions which need to actualize. For instance, a wave following a root-leaf path, or a depth-first traversal. In which case, we can be more economical with our memory consumption. With a worst case memory complexity of O(log(n)), where n is the number of nodes, functionality is provided for:
log_likelihood!
felsenstein!
sample_down!
Note
For successive felsenstein!
calls, we need to extract the information at the root somehow after each call. This can be done with e.g. total_LL
or site_LLs
.
Further requirements
Suppose you want to wrap a partition of PType
with LazyPartition
:
If you're calling
log_likelihood!
andfelsenstein!
:obs2partition!(partition::PType, obs)
that transforms an observation to a partition.
If you're calling
sample_down!
:partition2obs(partition::PType)
that returns the most likely state from a partition, invertsobs2partition!
.
MolecularEvolution.LazyUp Type
Constructor
LazyUp()
Description
Indicate that we want to do an upward pass, e.g. felsenstein!
.
MolecularEvolution.NucleotidePartition Type
mutable struct NucleotidePartition <: DiscretePartition
Constructors
NucleotidePartition(sites)
NucleotidePartition(freq_vec::Vector{Float64}, sites::Int64)
NucleotidePartition(state, states, sites, scaling)
Description
A DiscretePartition
specifically for nucleotide sequences with 4 states (A, C, G, T).
MolecularEvolution.PiQ Type
Constructors
PiQ(r::Float64,pi::Vector{Float64}; normalize=false)
PiQ(pi::Vector{Float64}; normalize=false)
Description
The F81 substitution model, but for general dimensions. https://www.diva-portal.org/smash/get/diva2:1878793/FULLTEXT01.pdf
sourceMolecularEvolution.SWMModel Type
Constructors
SWMModel(models::Vector{<:BranchModel})
SWMModel(model::M, rs::Vector{Float64}) where {M <: BranchModel}
Description
A site-wise mixture model, for site-to-site "random effects" rate variation.
sourceMolecularEvolution.SWMPartition Type
Constructors
SWMPartition(parts::Vector{PType}) where {PType <: MultiSitePartition}
SWMPartition(part::PType, n_parts::Int) where {PType <: MultiSitePartition}
SWMPartition(parts::Vector{PType}, weights::Vector{Float64}, sites::Int, states::Int, models::Int) where {PType <: MultiSitePartition}
Description
A site-wise mixture partition for the SWMModel
.
MolecularEvolution.StandardUpdate Type
Summary
struct StandardUpdate <: AbstractUpdate
A standard update can be a family of calls to nni_update!
, branchlength_update!
, root_update!
, and model updates.
Constructor
StandardUpdate(
nni::Int,
branchlength::Int,
root::Int,
models::Int,
refresh::Bool,
nni_selection::Function,
branchlength_modifier::UnivariateModifier,
root_update::RootUpdate,
models_update::ModelsUpdate
)
Arguments
nni::Int
: the number of times to update the tree bynni_update!
branchlength::Int
: the number of times to update the tree bybranchlength_update!
root::Int
: the number of times to update the tree byroot_update!
models::Int
: the number of times to update the modelrefresh::Bool
: whether to refresh the messages in tree between update operations to ensure message consistencynni_selection::Function
: the function that selects between nni configurationsbranchlength_modifier::UnivariateModifier
: the modifier to update a branchlength bybranchlength_update!
root_update::RootUpdate
: updates the root byroot_update!
models_update::ModelsUpdate
: updates the model parameters
See also: BayesUpdate
, MaxLikUpdate
MolecularEvolution.ZeroDriftOrnsteinUhlenbeck Type
function ZeroDriftOrnsteinUhlenbeck(
mean::Float64,
volatility::Float64,
reversion::Float64,
)
A 1D continuous Ornstein-Uhlenbeck process with mean, volatility, and reversion.
sourceMolecularEvolution.BayesUpdate Method
BayesUpdate(;
nni = 1,
branchlength = 1,
root = 0,
models = 0,
refresh = false,
branchlength_sampler::UnivariateSampler = BranchlengthSampler(
Normal(0, 2),
Normal(-1, 1),
),
root_sampler::RootSample = StandardRootSample(1.0, 1),
models_sampler::ModelsUpdate = StandardModelsUpdate()
)
Convenience constructor for StandardUpdate
. The nni_selection
is fixed to softmax_sampler
. This constructor provides Bayesian updates by sampling from the posterior distribution.
MolecularEvolution.HB98_AAfit Method
function HB98_AAfit(alpha, nuc_matrix, AA_fitness; genetic_code = universal_code)
Compute the (Halpern and Bruno 1998) codon substitution Q matrix parameterized by alpha
, a nucleotide substitution matrix nuc_matrix
, and direct amino acid fitness AA_fitness
vector.
MolecularEvolution.HB98_F61 Method
function HB98_F61(alpha, nuc_matrix, F61; genetic_code = universal_code)
Compute the (Halpern and Bruno 1998) HB98-F61 codon substitution Q matrix parameterized by alpha
, a nucleotide substitution matrix nuc_matrix
, and the F61
codon frequency estimator (see ?MolecularEvolution.count_F61
).
MolecularEvolution.HIPSTR Method
HIPSTR(trees::Vector{FelNode}; set_branchlengths = true)
Construct a Highest Independent Posterior Subtree Reconstruction (HIPSTR) tree from a collection of trees.
Returns a single FelNode representing the HIPSTR consensus tree.
If set_branchlengths = true
, the branch length of a node in the HIPSTR tree will be set to the mean branch length of all nodes from the input trees that have the same clade. (By the same clade, we mean that the set of leaves below the node is the same.) Otherwise, the root branch length is 0.0 and the rest 1.0.
Source: https://www.biorxiv.org/content/10.1101/2024.12.08.627395v1.full.pdf
sourceMolecularEvolution.MG94_F3x4 Method
function MG94_F3x4(alpha, beta, nuc_matrix, F3x4; genetic_code = universal_code)
Compute the (Muse and Gaut 1994) MG94-F3x4 codon substitution Q matrix parameterized by alpha
, beta
, a nucleotide substitution matrix nuc_matrix
, and F3x4
(see ?MolecularEvolution.count_F3x4
).
MolecularEvolution.MG94_F61 Method
function MG94_F61(alpha, beta, nuc_matrix, F61; genetic_code = universal_code)
Compute the (Muse and Gaut 1994) MG94-F61 codon substitution Q matrix parameterized by alpha
, beta
, a nucleotide substitution matrix nuc_matrix
, and the F61
codon frequency estimator (see ?MolecularEvolution.count_F61
).
MolecularEvolution.MaxLikUpdate Method
MaxLikUpdate(;
nni = 1,
branchlength = 1,
root = 0,
models = 0,
refresh = false,
branchlength_optimizer::UnivariateOpt = GoldenSectionOpt(),
root_optimizer = StandardRootOpt(10),
models_optimizer::ModelUpdate = StandardModelUpdate()
)
Convenience constructor for StandardUpdate
. The nni_selection
is fixed to argmax
. This constructor provides Maximum Likelihood updates by optimizing parameters.
MolecularEvolution.SWM_prob_grid Method
SWM_prob_grid(part::SWMPartition{PType}) where {PType <: MultiSitePartition}
Returns a matrix of probabilities for each site, for each model (in the probability domain - not logged!) as well as the log probability offsets
sourceMolecularEvolution._mapreduce Method
Internal function. Helper for bfs_mapreduce and dfs_mapreduce
sourceMolecularEvolution.allocate! Method
allocate!(tree, partition_or_message)
Allocates initial messages for all nodes in the tree, copying the passed-in message template. If passed a partition, then this will assume the message template is a vector containing just that partition.
sourceMolecularEvolution.backward! Method
backward!(dest::Partition, source::Partition, model::BranchModel, node::FelNode)
Propagate the source partition backwards along the branch to the destination partition, under the model. Note: You should overload this for your own BranchModel types.
sourceMolecularEvolution.bfs_mapreduce Method
Performs a BFS map-reduce over the tree, starting at a given node For each node, map_reduce is called as: map_reduce(curr_node::FelNode, prev_node::FelNode, aggregator) where prev_node is the previous node visited on the path from the start node to the current node It is expected to update the aggregator, and not return anything.
Not exactly conventional map-reduce, as map-reduce calls may rely on state in the aggregator added by map-reduce calls on other nodes visited earlier.
sourceMolecularEvolution.branchlength_optim! Method
branchlength_optim!(tree::FelNode, models; <keyword arguments>)
Uses golden section search, or optionally Brent's method, to optimize all branches recursively, maintaining the integrity of the messages. Requires felsenstein!() to have been run first. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another.
Keyword Arguments
partition_list=nothing
: (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over (but you probably want to optimize branch lengths with all models, the default option).tol=1e-5
: absolute tolerance for thebl_optimizer
.bl_optimizer::UnivariateModifier=GoldenSectionOpt()
: the algorithm used to optimize the log likelihood of a branch length. In addition to golden section search, Brent's method can be used by settingbl_optimizer=BrentsMethodOpt()
.sort_tree=false
: determines if alazysort!
will be performed, which can reduce the amount of temporary messages that has to be initialized.traversal=Iterators.reverse
: a function that determines the traversal, permutes an iterable.shuffle=false
: do a randomly shuffled traversal, overridestraversal
.
MolecularEvolution.branchlength_update! Method
branchlength_update!(bl_modifier::UnivariateModifier, tree::FelNode, models; <keyword arguments>)
A more general version of branchlength_optim!
. Here bl_modifier
can be either an optimizer or a sampler (or more generally, a UnivariateModifier).
Keyword Arguments
See branchlength_optim!
.
Note
bl_modifier
is a positional argument here, and not a keyword argument.
MolecularEvolution.brents_method_minimize Method
brents_method_minimize(f, a::Real, b::Real, transform, t::Real; ε::Real=sqrt(eps()))
Brent's method for minimization.
Given a function f with a single local minimum in the interval (a,b), Brent's method returns an approximation of the x-value that minimizes f to an accuaracy between 2tol and 3tol, where tol is a combination of a relative and an absolute tolerance, tol := ε|x| + t. ε should be no smaller 2*eps
, and preferably not much less than sqrt(eps)
, which is also the default value. eps is defined here as the machine epsilon in double precision. t should be positive.
The method combines the stability of a Golden Section Search and the superlinear convergence Successive Parabolic Interpolation has under certain conditions. The method never converges much slower than a Fibonacci search and for a sufficiently well-behaved f, convergence can be exptected to be superlinear, with an order that's usually atleast 1.3247...
Examples
julia> f(x) = exp(-x) - cos(x)
f (generic function with 1 method)
julia> m = brents_method_minimize(f, -1, 2, identity, 1e-7)
0.5885327257940255
From: Richard P. Brent, "Algorithms for Minimization without Derivatives" (1973). Chapter 5.
sourceMolecularEvolution.cascading_max_state_dict Method
cascading_max_state_dict(tree::FelNode, model; partition_list = 1:length(tree.message), node_message_dict = Dict{FelNode,Vector{<:Partition}}())
Takes in a tree and a model (which can be a single model, an array of models, or a function that maps FelNode->Array{<:BranchModel}), and returns a dictionary mapping nodes to their inferred ancestors under the following scheme: the state that maximizes the marginal likelihood is selected at the root, and then, for each node, the maximum likelihood state is selected conditioned on the maximized state of the parent node and the observations of all descendents. A subset of partitions can be specified by partition_list, and a dictionary can be passed in to avoid re-allocating memory, in case you're running this over and over.
sourceMolecularEvolution.char_proportions Method
char_proportions(seqs, alphabet::String)
Takes a vector of sequences and returns a vector of the proportion of each character across all sequences. An example alphabet
argument is MolecularEvolution.AAstring
.
MolecularEvolution.collect_leaf_dists Method
collect_leaf_dists(trees::Vector{<:AbstractTreeNode})
Returns a list of distance matrices containing the distance between the leaf nodes, which can be used to assess mixing.
MolecularEvolution.colored_seq_draw Method
colored_seq_draw(x, y, str::AbstractString; color_dict=Dict(), font_size=8pt, posx=hcenter, posy=vcenter)
Draw an arbitrary sequence. color_dict
gives a mapping from characters to colors (default black). Default options for nucleotide colorings and amino acid colorings are given in the constants NUC_COLORS
and AA_COLORS
. This can be used along with compose_dict
for drawing sequences at nodes in a tree (see tree_draw
). Returns a Compose container.
MolecularEvolution.combine! Method
combine!(dest::P, src::P) where P<:Partition
Combines evidence from two partitions of the same type, storing the result in dest. Note: You should overload this for your own Partititon types.
sourceMolecularEvolution.copy_tree Function
function copy_tree(root::FelNode, shallow_copy=false)
Returns an untangled copy of the tree. Optionally, the flag `shallow_copy` can be used to obtain a copy of the tree with only the names and branchlengths.
MolecularEvolution.count_F3x4 Method
function count_F3x4(seqs::Array{String})
Compute the F3x4 matrix from a set of sequences.
sourceMolecularEvolution.count_F61 Method
function count_F61(seqs::Array{String}; code = universal_code)
Compute the F61 codon frequency vector from a set of sequences.
sourceMolecularEvolution.deepequals Method
deepequals(t1, t2)
Checks whether two trees are equal by recursively calling this on all fields, except :parent
, in order to prevent cycles. In order to ensure that the :parent
field is not hiding something different on both trees, ensure that each is consistent first (see: istreeconsistent
).
MolecularEvolution.dfs_mapreduce Method
Performs a DFS map-reduce over the tree, starting at a given node See bfs_mapreduce for more details.
sourceMolecularEvolution.discrete_name_color_dict Method
discrete_name_color_dict(newt::AbstractTreeNode,tag_func; rainbow = false, scramble = false, darken = true, col_seed = nothing)
Takes a tree and a tag_func, which converts the leaf label into a category (ie. there should be <20 of these), and returns a color dictionary that can be used to color the leaves or bubbles.
Example tag_func: function tag_func(nam::String) return split(nam,"_")[1] end
For prettier colors, but less discrimination: rainbow = true To randomize the rainbow color assignment: scramble = true col_seed is currently set to white, and excluded from the list of colors, to make them more visible.
Consider making your own version of this function to customize colors as you see fit.
Example use: num_leaves = 50 Ne_func(t) = 1*(e^-t).+5.0 newt = sim_tree(num_leaves,Ne_func,1.0,nstart = rand(1:num_leaves)); newt = ladderize(newt) tag_func(nam) = mod(sum(Int.(collect(nam))),7) dic = discrete_name_color_dict(newt,tag_func,rainbow = true); tree_draw(newt,line_width = 0.5mm,label_color_dict = dic)
sourceMolecularEvolution.draw_example_tree Method
draw_example_tree(num_leaves = 50)
Draws a tree and shows the code that draws it.
sourceMolecularEvolution.endpoint_conditioned_sample_state_dict Method
endpoint_conditioned_sample_state_dict(tree::FelNode, model; partition_list = 1:length(tree.message), node_message_dict = Dict{FelNode,Vector{<:Partition}}())
Takes in a tree and a model (which can be a single model, an array of models, or a function that maps FelNode->Array{<:BranchModel}), and draws samples under the model conditions on the leaf observations. These samples are stored in the node_message_dict, which is returned. A subset of partitions can be specified by partition_list, and a dictionary can be passed in to avoid re-allocating memory, in case you're running this over and over.
sourceMolecularEvolution.expected_subs_per_site Method
expected_subs_per_site(Q,mu)
Takes a rate matrix Q and an equilibrium frequency vector, and calculates the expected number of substitutions per site.
sourceMolecularEvolution.felsenstein! Method
felsenstein!(node::FelNode, models; partition_list = nothing)
Should usually be called on the root of the tree. Propagates Felsenstein pass up from the tips to the root. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.
sourceMolecularEvolution.felsenstein_down! Method
felsenstein_down!(node::FelNode, models; partition_list = 1:length(tree.message), temp_message = copy_message(tree.message))
Should usually be called on the root of the tree. Propagates Felsenstein pass down from the root to the tips. felsenstein!() should usually be called first. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.
sourceMolecularEvolution.felsenstein_roundtrip! Method
felsenstein_roundtrip!(tree::FelNode, models; partition_list = 1:length(tree.message), temp_message = copy_message(tree.message[partition_list]))
Should usually be called on the root of the tree. First propagates Felsenstein pass up from the tips to the root, then propagates Felsenstein pass down from the root to the tips, with the direction of time reversed (i.e. forward! = backward!). This is useful when searching for the optimal root (see root_optim!
). models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.
MolecularEvolution.forward! Method
forward!(dest::Partition, source::Partition, model::BranchModel, node::FelNode)
Propagate the source partition forwards along the branch to the destination partition, under the model. Note: You should overload this for your own BranchModel types.
sourceMolecularEvolution.gappy_Q_from_symmetric_rate_matrix Method
gappy_Q_from_symmetric_rate_matrix(sym_mat, gap_rate, eq_freqs)
Takes a symmetric rate matrix and gap rate (governing mutations to and from gaps) and returns a gappy rate matrix. The equilibrium frequencies are multiplied on column-wise.
sourceMolecularEvolution.get_highlighter_legend Method
get_highlighter_legend(legend_colors)
Returns a Compose object given an input dictionary or pairs mapping characters to colors.
sourceMolecularEvolution.get_max_depth Method
get_max_depth(node,depth::Real)
Return the maximum depth of all children starting from the indicated node.
sourceMolecularEvolution.get_phylo_tree Method
get_phylo_tree(molev_root::FelNode; data_function = (x -> Tuple{String,Float64}[]))
Converts a FelNode tree to a Phylo tree. The data_function
should return a list of tuples of the form (key, value) to be added to the Phylo tree data
Dictionary. Any key/value pairs on the FelNode node_data
Dict will also be added to the Phylo tree.
MolecularEvolution.golden_section_maximize Method
Golden section search.
Given a function f with a single local minimum in the interval [a,b], gss returns a subset interval [c,d] that contains the minimum with d-c <= tol.
Examples
julia> f(x) = -(x-2)^2
f (generic function with 1 method)
julia> m = golden_section_maximize(f, 1, 5, identity, 1e-10)
2.0000000000051843
From: https://en.wikipedia.org/wiki/Golden-section_search
sourceMolecularEvolution.highlight_seq_draw Method
highlight_seq_draw(x, y, str::AbstractString, region, basecolor, hicolor; fontsize=8pt, posx=hcenter, posy=vcenter)
Draw a sequence, highlighting the sites given in region
. This can be used along with compose_dict
for drawing sequences at nodes in a tree (see tree_draw
). Returns a Compose container.
MolecularEvolution.highlighter_tree_draw Method
highlighter_tree_draw(tree, ali_seqs, seqnames, master;
highlighter_start = 1.1, highlighter_width = 1,
coord_width = highlighter_start + highlighter_width + 0.1,
scale_length = nothing, major_breaks = 1000, minor_breaks = 500,
tree_args = NamedTuple[], legend_padding = 0.5cm, legend_colors = NUC_colors)
Draws a combined tree and highlighter plot. The vector of seqnames must match the node names in tree
.
kwargs:
tree_args: kwargs to pass to
tree_draw()
legend_colors: Mapping of characters to highlighter colors (default NT_colors)
scale_length: Length of the scale bar
highlighter_start: Canvas start for the highlighter panel
highlighter_width: Canvas width for the highlighter panel
coord_width: Total width of the canvas
major_breaks: Numbered breaks for sequence axis
minor_breaks: Ticks for sequence axis
MolecularEvolution.internal_message_init! Method
internal_message_init!(tree::FelNode, partition::Partition)
Initializes the message template for each node in the tree, as an array of the partition.
MolecularEvolution.internal_message_init! Method
internal_message_init!(tree::FelNode, empty_message::Vector{<:Partition})
Initializes the message template for each node in the tree, allocating space for each partition.
MolecularEvolution.internal_nodes Method
internal_nodes(tree)
Returns the internal nodes of the tree (including the root), as a vector.
sourceMolecularEvolution.istreeconsistent Method
istreeconsistent(root)
Checks whether the :parent
field is set to be consistent with the :child
field for all nodes in the subtree.
MolecularEvolution.ladder_tree_sim Method
ladder_tree_sim(ntaxa)
Simulates a ladder-like tree, using constant population size but heterochronous sampling, under a coalescent model.
sourceMolecularEvolution.lazyprep! Method
lazyprep!(tree::FelNode, initial_message::Vector{<:Partition}; partition_list = 1:length(tree.message), direction::LazyDirection = LazyUp())
Extra, intermediate step of tree preparations between initializing messages across the tree and calling message passing algorithms with LazyPartition
.
Perform a
lazysort!
on tree to obtain the optimal tree for a lazyfelsenstein!
prop, or asample_down!
.Fix
tree.parent_message
to an initial message.Preallocate sufficiently many inner partitions needed for a
felsenstein!
prop, or asample_down!
.Specialized preparations based on the direction of the operations (
forward!
,backward!
).LazyDown
orLazyUp
.
See also LazyDown
, LazyUp
.
MolecularEvolution.lazysort! Method
Should be run on a tree containing LazyPartitions before running
felsenstein!
. Sorts for a minimal count of active partitions during a felsenstein!Returns the minimum length of memoryblocks (-1) required for a
felsenstein!
prop. We need a temporary memoryblock duringbackward!
, hence the '-1'.
Note
Since felsenstein! uses a stack, we want to avoid having long node.children[1].children[1]... chains
MolecularEvolution.leaf_distmat Method
leaf_distmat(tree)
Returns a matrix of the distances between the leaf nodes where the index on the columns and rows are sorted by the leaf names.
sourceMolecularEvolution.leaf_names Method
leaf_names(tree)
Returns the names of the leaves of the tree.
sourceMolecularEvolution.leaf_samples Method
leaf_samples(tree; partition_inds = 1)
Returns the result of partition2obs
for each leaf of the tree. Can be used eg. after sample_down!
is called. If using a eg. codon model, this will extract a string from the CodonPartition on each leaf. Acts upon the first partition by default, but this can be changed by setting partition_inds
, which can also be a vector of indices, in which case the result will be a vector for each leaf.
MolecularEvolution.linear_scale Method
linear_scale(val,in_min,in_max,out_min,out_max)
Linearly maps val which lives in [in_min,in_max] to a value in [out_min,out_max]
sourceMolecularEvolution.log_likelihood! Method
log_likelihood!(tree::FelNode, models; partition_list = nothing)
First re-computes the upward felsenstein pass, and then computes the log likelihood of this tree. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.
sourceMolecularEvolution.log_likelihood Method
log_likelihood(tree::FelNode, models; partition_list = nothing)
Computed the log likelihood of this tree. Requires felsenstein!() to have been run. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.
sourceMolecularEvolution.longest_path Method
Returns the longest path in a tree For convenience, this is returned as two lists of form: [leaf_node, parent_node, .... root] Where the leaf_node nodes are selected to be the furthest away
sourceMolecularEvolution.marginal_state_dict Method
marginal_state_dict(tree::FelNode, model; partition_list = 1:length(tree.message), node_message_dict = Dict{FelNode,Vector{<:Partition}}())
Takes in a tree and a model (which can be a single model, an array of models, or a function that maps FelNode->Array{<:BranchModel}), and returns a dictionary mapping nodes to their marginal reconstructions (ie. P(state|all observations,model)). A subset of partitions can be specified by partition_list, and a dictionary can be passed in to avoid re-allocating memory, in case you're running this over and over.
sourceMolecularEvolution.matrix_for_display Method
matrix_for_display(Q,labels)
Takes a numerical matrix and a vector of labels, and returns a typically mixed type matrix with the numerical values and the labels. This is to easily visualize rate matrices in eg. the REPL.
sourceMolecularEvolution.metropolis_sample Method
metropolis_sample(
update!::AbstractUpdate,
initial_tree::FelNode,
models,
num_of_samples;
partition_list = 1:length(initial_tree.message),
burn_in = 1000,
sample_interval = 10,
collect_LLs = false,
collect_models = false,
midpoint_rooting = false,
ladderize = false,
)
Samples tree topologies from a posterior distribution using a custom update!
function.
Arguments
update!
: A callable that takes (tree::FelNode, models) and updatestree
andmodels
. One call toupdate!
corresponds to one iteration of the Metropolis algorithm.initial_tree
: An initial tree topology with the leaves populated with data, for the likelihood calculation.models
: A list of branch models.num_of_samples
: The number of tree samples drawn from the posterior.partition_list
: (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over (but you probably want to sample with all partitions, the default option).burn_in
: The number of samples discarded at the start of the Markov Chain.sample_interval
: The distance between samples in the underlying Markov Chain (to reduce sample correlation).collect_LLs
: Specifies if the function should return the log-likelihoods of the trees.collect_models
: Specifies if the function should return the models.midpoint_rooting
: Specifies whether the drawn samples should be midpoint rerooted (Important! Should only be used for time-reversible branch models starting in equilibrium).
Note
The leaves of the initial tree should be populated with data and felsenstein! should be called on the initial tree before calling this function.
Returns
samples
: The trees drawn from the posterior. Returns shallow tree copies, which needs to be repopulated before running felsenstein! etc.sample_LLs
: The associated log-likelihoods of the tree (optional).sample_models
: The models drawn from the posterior (optional). The models can be collapsed into it's parameters withcollapse_models
.
MolecularEvolution.metropolis_sample Method
metropolis_sample(
initial_tree::FelNode,
models::Vector{<:BranchModel},
num_of_samples;
bl_sampler::UnivariateSampler = BranchlengthSampler(Normal(0,2), Normal(-1,1))
burn_in=1000,
sample_interval=10,
collect_LLs = false,
midpoint_rooting=false,
)
A convenience method. One step of the Metropolis algorithm is performed by calling nni_update!
with softmax_sampler
and branchlength_update!
with bl_sampler
.
Additional Arguments
bl_sampler
: Sampler used to drawn branchlengths from the posterior.
MolecularEvolution.metropolis_step Method
metropolis_step(LL::Function, modifier, curr_value)
Does a standard metropolis step in an MCMC, i.e. proposes a candidate symmetrically and returns the next state in the chain, decided by the candidate being rejected or not.
Interface
You need a MySampler <: Any
to implement
proposal(modifier::MySampler, curr_value)
log_prior(modifier::MySampler, x)
apply_decision(modifier::MySampler, accept::Bool)
LL
is by default called on curr_value
and the returned value of proposal
. Although, it is possible to transform the current value before proposing a new value, and then take the inverse transform to match the argument LL
expects.
Extended interface
Hastings
To allow for asymmetric proposals, you must overload
log_proposal(modifier::MySampler, x, conditioned_on)
which returns a constant (0.0
in particular) by default.
Transformations
To make proposals in a transformed space, you overload
tr(modifier::MySampler, x)
invtr(modifier::MySampler, x)
which are identity transformations by default.
sourceMolecularEvolution.midpoint Method
Returns a midpoint as a node and a distance above it where the midpoint is
sourceMolecularEvolution.mix Method
mix(swm_part::SWMPartition{PType} ) where {PType <: MultiSitePartition}
mix
collapses a Site-Wise Mixture partition to a single component partition, weighted by the site-wise likelihoods for each component, and the init weights. Specifically, it takes a SWMPartition{Ptype}
and returns a PType
. You'll need to have this implemented for certain helper functionality if you're playing with new kinds of SWMPartitions that aren't mixtures of DiscretePartitions
.
MolecularEvolution.name2node_dict Method
name2node_dict(root)
Returns a dictionary of leaf nodes, indexed by node.name. Can be used to associate sequences with leaf nodes.
sourceMolecularEvolution.newick Method
newick(root)
Returns a newick string representation of the tree.
sourceMolecularEvolution.nni_optim! Method
nni_optim!(tree::FelNode, models; <keyword arguments>)
Considers local branch swaps for all branches recursively, maintaining the integrity of the messages. Requires felsenstein!() to have been run first. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another.
Keyword Arguments
partition_list=nothing
: (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over (but you probably want to optimize tree topology with all models, the default option).selection_rule = x -> argmax(x)
: a function that takes the current and proposed log likelihoods and selects a nni configuration. Note that the current log likelihood is stored at x[1].sort_tree=false
: determines if alazysort!
will be performed, which can reduce the amount of temporary messages that has to be initialized.traversal=Iterators.reverse
: a function that determines the traversal, permutes an iterable.shuffle=false
: do a randomly shuffled traversal, overridestraversal
.
MolecularEvolution.nni_update! Method
nni_update!(selection_rule::Function, tree::FelNode, models; <keyword arguments>)
A more verbose version of nni_optim!
.
Keyword Arguments
See nni_optim!
.
Note
selection_rule
is a positional argument here, and not a keyword argument.
MolecularEvolution.node_distances Method
Compute the distance to all other nodes from a given node
sourceMolecularEvolution.node_names Method
node_names(tree)
Returns the names of the nodes of the tree.
sourceMolecularEvolution.node_samples Method
node_samples(tree; partition_inds = 1)
Returns the result of partition2obs
for each node of the tree (including internal nodes, and the root). Can be used eg. after sample_down!
is called. If using a eg. codon model, this will extract a string from the CodonPartition on each node. Acts upon the first partition by default, but this can be changed by setting partition_inds
, which can also be a vector of indices, in which case the result will be a vector for each node.
MolecularEvolution.nodes Method
nodes(tree)
Returns the nodes of the tree (including internal nodes and the root), as a vector.
sourceMolecularEvolution.nonreversibleQ Method
nonreversibleQ(param_vec)
Takes a vector of parameters and returns a nonreversible rate matrix.
sourceMolecularEvolution.parent_list Method
Provides a list of parent nodes nodes from this node up to the root node
sourceMolecularEvolution.partition2obs Method
partition2obs(part::Partition)
Extracts the most likely state from a Partition, transforming it into a convenient type. For example, a NucleotidePartition will be transformed into a nucleotide sequence of type String. Note: You should overload this for your own Partititon types.
sourceMolecularEvolution.plot_multiple_trees Method
plot_multiple_trees(trees, inf_tree; <keyword arguments>)
Plots multiple phylogenetic trees against a reference tree, inf_tree
. For each tree in trees
, a linear Weighted Least Squares (WLS) problem (parameterized by the weight_fn
keyword) is solved for the x-positions of the matching nodes between inf_tree
and tree.
Keyword Arguments
node_size=4
: the size of the nodes in the plot.line_width=0.5
: the width of the branches fromtrees
.font_size=10
: the font size for the leaf labels.margin=1.5
: the margin between a leaf node and its label.line_alpha=0.05
: the transparency level of the branches fromtrees
.y_jitter=0.0
: the standard deviation of the noise in the y-coordinate.weight_fn=n::FelNode -> ifelse(isroot(n), 1.0, 0.0))
: a function that assigns a weight to a node for the WLS problem.opt_scale=true
: whether to include a scaling parameter for the WLS problem.
MolecularEvolution.populate_tree! Method
populate_tree!(tree::FelNode, starting_message, names, data; init_all_messages = true, tolerate_missing = 1, leaf_name_transform = x -> x)
Takes a tree, and a starting_message
(which will serve as the memory template for populating messages all over the tree). starting_message
can be a message (ie. a vector of Partitions), but will also work with a single Partition (although the tree) will still be populated with a length-1 vector of Partitions. Further, as long as obs2partition
is implemented for your Partition type, the leaf nodes will be populated with the data from data
, matching the names on each leaf. When a leaf on the tree has a name that doesn't match anything in names
, then if
tolerate_missing = 0
, an error will be throwntolerate_missing = 1
, a warning will be thrown, and the message will be set to the uninformative message (requires identity!(::Partition) to be defined)tolerate_missing = 2
, the message will be set to the uninformative message, without warnings (requires identity!(::Partition) to be defined)
A renaming function that can eg. strip tags from the tree when matching leaf names with names
can be passed to leaf_name_transform
MolecularEvolution.promote_internal Method
promote_internal(tree::FelNode)
Creates a new tree similar to the given tree, but with 'dummy' leaf nodes (w/ zero branchlength) representing each internal node (for drawing / evenly spacing labels internal nodes).
sourceMolecularEvolution.quadratic_CI Method
quadratic_CI(f::Function,opt_params::Vector, param_ind::Int; rate_conf_level = 0.99, nudge_amount = 0.01)
Takes a NEGATIVE log likelihood function (compatible with Optim.jl), a vector of maximizing parameters, an a parameter index. Returns the quadratic confidence interval.
sourceMolecularEvolution.quadratic_CI Method
quadratic_CI(xvec,yvec; rate_conf_level = 0.99)
Takes xvec, a vector of parameter values, and yvec, a vector of log likelihood evaluations (note: NOT the negative LLs you) might use with Optim.jl. Returns the confidence intervals computed by a quadratic approximation to the LL.
sourceMolecularEvolution.read_fasta Method
read_fasta(filepath::String)
Reads in a fasta file and returns a tuple of (seqnames, seqs).
sourceMolecularEvolution.read_newick_tree Method
read_newick_tree(treefile)
Reads in a tree from a file, of type FelNode
sourceMolecularEvolution.refresh! Method
refresh!(tree::FelNode, models; partition_list = 1:length(tree.message))
Run felsenstein!
and felsenstein_down!
on tree
with models
to refresh messages.
MolecularEvolution.reversibleQ Method
reversibleQ(param_vec,eq_freqs)
Takes a vector of parameters and equilibrium frequencies and returns a reversible rate matrix. The parameters are the upper triangle of the rate matrix, with the diagonal elements omitted, and the equilibrium frequencies are multiplied column-wise.
sourceMolecularEvolution.root2tip_distances Method
root2tips(root::AbstractTreeNode)
Returns a vector of root-to-tip distances, and a node-to-index dictionary. Be aware that this dictionary will break when any of the node content (ie. anything on the tree) changes.
sourceMolecularEvolution.root_optim! Method
root_optim!(tree::FelNode, models; <keyword arguments>)
Optimizes the root position and root state of a tree. Returns the new, optimal root node. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another.
Keyword Arguments
partition_list=1:length(tree.message)
: (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over (but you probably want to optimize root position and root state with all models, the default option).root_LL!=default_root_LL_wrapper(tree.parent_message[partition_list])
: a function that takes a message and returns a (optimal) parent message and LL (log likelihood). The default option uses the constanttree.parent_message[partition_list]
as parent message for all root-candidates.K=10
: the number of equidistant root-candidate points along a branch. (only to be used in the frequentist framework!?)
MolecularEvolution.root_update! Method
root_update!(root_update::RootUpdate, tree::FelNode, models; partition_list = 1:length(tree.message))
A more general version of root_optim!
. Here root_update
can be either an optimization or a sampling (or more generally, a RootUpdate).
MolecularEvolution.sample_down! Method
sample_down!(root::FelNode,models,partition_list)
Generates samples under the model. The root.parent_message is taken as the starting distribution, and node.message contains the sampled messages. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.
sourceMolecularEvolution.sample_from_message! Method
sample_from_message!(message::Vector{<:Partition})
#Replaces an uncertain message with a sample from the distribution represented by each partition.
sourceMolecularEvolution.savefig_tweakSVG Method
savefig_tweakSVG(fname, plot::Context; width = 10cm, height = 10cm, linecap_round = true, white_background = true)
Saves a figure created using the Compose
approach, but tweaks the SVG after export.
eg. savefig_tweakSVG("export.svg",pl)
MolecularEvolution.savefig_tweakSVG Method
savefig_tweakSVG(fname, plot::Plots.Plot; hack_bounding_box = true, new_viewbox = nothing, linecap_round = true)
Note: Might only work if you're using the GR backend!! Saves a figure created using the Phylo
Plots
recipe, but tweaks the SVG after export. new_viewbox
needs to be an array of 4 numbers, typically starting at [0 0 plot_width*4 plot_height*4]
but this lets you add shifts, in case the plot is getting cut off.
eg. savefig_tweakSVG("export.svg",pl, new_viewbox = [-100, -100, 3000, 4500])
MolecularEvolution.shortest_path_between_nodes Method
Shortest path between nodes, returned as two lists, each starting with one of the two nodes, and ending with the common ancestor
sourceMolecularEvolution.sibling_inds Method
sibling_inds(node)
Returns logical indices of the siblings in the parent's child's vector.
sourceMolecularEvolution.sim_tree Method
sim_tree(add_limit::Int,Ne_func,sample_rate_func; nstart = 1, time = 0.0, mutation_rate = 1.0, T = Float64)
Simulates a tree of type FelNode{T}. Allows an effective population size function (Ne_func), as well as a sample rate function (sample_rate_func), which can also just be constants.
Ne_func(t) = (sin(t/10)+1)*100.0 + 10.0 root = sim_tree(600,Ne_func,1.0) simple_tree_draw(ladderize(root))
sourceMolecularEvolution.sim_tree Method
sim_tree(;n = 10)
Simulates tree with constant population size.
sourceMolecularEvolution.simple_radial_tree_plot Method
simple_radial_tree_plot(root::FelNode; canvas_width = 10cm, line_color = "black", line_width = 0.1mm)
Draws a radial tree. No frills. No labels. Canvas height is automatically determined to avoid distorting the tree.
newt = better_newick_import("((A:1,B:1,C:1,D:1,E:1,F:1,G:1):1,(H:1,I:1):1);", FelNode{Float64}); simple_radial_tree_plot(newt,line_width = 0.5mm,root_angle = 7/10)
sourceMolecularEvolution.simple_tree_draw Method
img = simple_tree_draw(tree::FelNode; canvas_width = 15cm, canvas_height = 15cm, line_color = "black", line_width = 0.1mm)
A line drawing of a tree with very few options.
img = simple_tree_draw(tree)
img |> SVG("imgout.svg",10cm, 10cm)
OR
using Cairo
img |> PDF("imgout.pdf",10cm, 10cm)
MolecularEvolution.standard_tree_sim Method
standard_tree_sim(ntaxa)
Simulates a tree with logistic population growth, under a coalescent model.
sourceMolecularEvolution.total_LL Method
total_LL(p::Partition)
If called on the root, it returns the log likelihood associated with that partition. Can be overloaded for complex partitions without straightforward site log likelihoods.
sourceMolecularEvolution.tree2distances Method
tree2distances(root::AbstractTreeNode)
Returns a distance matrix for all pairs of leaf nodes, and a node-to-index dictionary. Be aware that this dictionary will break when any of the node content (ie. anything on the tree) changes.
sourceMolecularEvolution.tree2shared_branch_lengths Method
tree2distances(root::AbstractTreeNode)
Returns a distance matrix for all pairs of leaf nodes, and a node-to-index dictionary. Be aware that this dictionary will break when any of the node content (ie. anything on the tree) changes.
sourceMolecularEvolution.tree_draw Method
tree_draw(tree::FelNode;
canvas_width = 15cm, canvas_height = 15cm,
stretch_for_labels = 2.0, draw_labels = true,
line_width = 0.1mm, font_size = 4pt,
min_dot_size = 0.00, max_dot_size = 0.01,
line_opacity = 1.0,
dot_opacity = 1.0,
name_opacity = 1.0,
horizontal = true,
dot_size_dict = Dict(), dot_size_default = 0.0,
dot_color_dict = Dict(), dot_color_default = "black",
line_color_dict = Dict(), line_color_default = "black",
label_color_dict = Dict(), label_color_default = "black",
nodelabel_dict = Dict(),compose_dict = Dict()
)
Draws a tree with a number of self-explanatory options. Dictionaries that map a node to a color/size are used to control per-node plotting options. compose_dict
must be a FelNode->function(x,y)
dictionary that returns a compose()
struct.
Example using compose_dict
str_tree = "(((((tax24:0.09731668728575642,(tax22:0.08792233964843627,tax18:0.9210388482867483):0.3200367900275155):0.6948314526087965,(tax13:1.9977212308725611,(tax15:0.4290074347886068,(tax17:0.32928401808187824,(tax12:0.3860215462534818,tax16:0.2197134841232339):0.1399122681886174):0.05744611946245004):1.4686085778061146):0.20724159879522402):0.4539334554156126,tax28:0.4885576926440158):0.002162260013924424,tax26:0.9451873777301325):3.8695419798779387,((tax29:0.10062813251515536,tax27:0.27653633028085006):0.04262434258357507,(tax25:0.009345653929737636,((tax23:0.015832941547076644,(tax20:0.5550597590956172,((tax8:0.6649025646927402,tax9:0.358506423199849):0.1439516404012261,tax11:0.01995439013213013):1.155181296134081):0.17930021667907567):0.10906638146207207,((((((tax6:0.013708993438720255,tax5:0.061144001556547097):0.1395453591567641,tax3:0.4713722705245479):0.07432598428904214,tax1:0.5993347898257291):1.0588025698844894,(tax10:0.13109032492533992,(tax4:0.8517302241963356,(tax2:0.8481963081549965,tax7:0.23754095940676642):0.2394313086297733):0.43596704123297675):0.08774657269409454):0.9345533723114966,(tax14:0.7089558245245173,tax19:0.444897137240675):0.08657675809803095):0.01632062723968511,tax21:0.029535281963725537):0.49502691718938285):0.25829576024240986):0.7339777396780424):4.148878039524972):0.0"
newt = gettreefromnewick(str_tree, FelNode)
ladderize!(newt)
compose_dict = Dict()
for n in getleaflist(newt)
#Replace the rand(4) with the frequencies you actually want.
compose_dict[n] = (x,y)->pie_chart(x,y,MolecularEvolution.sum2one(rand(4)),size = 0.03)
end
tree_draw(newt,draw_labels = false,line_width = 0.5mm, compose_dict = compose_dict)
img = tree_draw(tree)
img |> SVG("imgout.svg",10cm, 10cm)
OR
using Cairo
img |> PDF("imgout.pdf",10cm, 10cm)
MolecularEvolution.tree_polish! Method
tree_polish!(newt, models; tol = 10^-4, verbose = 1, topology = true)
Takes a tree and a model function, and optimizes branch lengths and, optionally, topology. Returns final LL. Set verbose=0
to suppress output. Note: This is not intended for an exhaustive tree search (which requires different heuristics), but rather to polish a tree that is already relatively close to the optimum.
MolecularEvolution.unc2probvec Method
unc2probvec(v)
Takes an array of N-1 unbounded values and returns an array of N values that sums to 1. Typically useful for optimizing over categorical probability distributions.
sourceMolecularEvolution.univariate_maximize Method
univariate_maximize(f, a::Real, b::Real, transform, optimizer::BrentsMethodOpt, t::Real; ε::Real=sqrt(eps))
Maximizes f(x)
using Brent's method. See ?brents_method_minimize
.
MolecularEvolution.univariate_maximize Method
univariate_maximize(f, a::Real, b::Real, transform, optimizer::GoldenSectionOpt, tol::Real)
Maximizes f(x)
using a Golden Section Search. See ?golden_section_maximize
.
Examples
julia> f(x) = -(x-2)^2
f (generic function with 1 method)
julia> m = univariate_maximize(f, 1, 5, identity, GoldenSectionOpt(), 1e-10)
2.0000000000051843
MolecularEvolution.univariate_sampler Method
univariate_sampler(LL, modifier::BranchlengthPeturbation, curr_branchlength)
A MCMC algorithm that draws the next sample of a Markov Chain that approximates the Posterior distrubution over the branchlengths.
MolecularEvolution.values_from_phylo_tree Method
values_from_phylo_tree(phylo_tree, key)
Returns a list of values from the given key in the nodes of the phylo_tree, in an order that is somehow compatible with the order the nodes get plotted in.
MolecularEvolution.weightEM Method
weightEM(con_lik_matrix::Array{Float64,2}, θ; conc = 0.0, iters = 500)
Takes a conditional likelihood matrix (#categories-by-sites) and a starting frequency vector θ (length(θ) = #categories) and optimizes θ (using Expectation Maximization. Maybe.). If conc > 0 then this gives something like variational bayes behavior for LDA. Maybe.
sourceMolecularEvolution.write_fasta Method
write_fasta(filepath::String, sequences::Vector{String}; seq_names = nothing)
Writes a fasta file from a vector of sequences, with optional seq_names.
sourceMolecularEvolution.write_nexus Method
write_nexus(fname::String,tree::FelNode)
Writes the tree as a nexus file, suitable for opening in eg. FigTree. Data in the node_data
dictionary will be converted into annotations. Only tested for simple node_data
formats and types.