DLProteinFormats
Documentation for DLProteinFormats.
DLProteinFormats.flatten
DLProteinFormats.sample_batched_inds
DLProteinFormats.unflatten
ProteinChains.writepdb
DLProteinFormats.flatten — Method
flatten(rec::ProteinStructure; T = Float32)
Takes a ProteinStructure and returns a tuple of the translations, rotations, residue indices, and features for each chain.
DLProteinFormats.sample_batched_inds — Method
sample_batched_inds(flatrecs; l2b = length2batch(1000, 1.9))
Takes a vector of (flattened) protein structures, and returns a vector of indices into the original array, with each batch containing a random sample of one protein from each cluster.
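The "one protein from each cluster" idea can be illustrated with a small standalone sketch. This is not the package's implementation: `sample_one_per_cluster` and `chunk` are hypothetical helpers, the cluster labels are made up, and a fixed batch size stands in for whatever `l2b` does in the real function.

```julia
using Random

# Hypothetical helper (not part of DLProteinFormats):
# pick one random index from each cluster label.
function sample_one_per_cluster(clusters::AbstractVector; rng = Random.default_rng())
    groups = Dict{eltype(clusters), Vector{Int}}()
    for (i, c) in pairs(clusters)
        push!(get!(groups, c, Int[]), i)   # collect the indices belonging to each cluster
    end
    return [rand(rng, inds) for inds in values(groups)]
end

# Hypothetical helper: split indices into batches of at most `batchsize`.
# Assumption: a fixed batch size, used here only to keep the sketch short.
chunk(inds, batchsize) =
    [inds[i:min(i + batchsize - 1, end)] for i in 1:batchsize:length(inds)]

clusters = [1, 1, 2, 3, 3, 3, 4]           # made-up cluster label per record
picked = sample_one_per_cluster(clusters; rng = MersenneTwister(0))
batches = chunk(picked, 2)
```

Each entry of `picked` indexes into the original array, and every cluster contributes exactly one index, so proteins from large clusters are not over-represented.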
DLProteinFormats.unflatten — Method
unflatten(locs, rots, seqints, chainids, resnums)
unflatten(locs, rots, seqhots, chainids, resnums)
unflatten(locs, rots, seq, chainids, resnums)
Converts flattened protein structure data back into ProteinChain objects.
Arguments
- locs: Array of translations/locations (3×1×L, or 3×1×L×B for batched)
- rots: Array of rotations (3×3×L, or 3×3×L×B for batched)
- seqints/seqhots/seq: Sequence data as integers, one-hot encoding, or generic sequence
- chainids: Chain identifiers for each residue
- resnums: Residue numbers for each position
Returns
- Vector of ProteinChain objects (or vector of vectors for batched input)
The function reconstructs protein chains from flattened representations, applying unit scaling to locations and converting sequence integers back to amino acid strings.
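The array shapes listed under Arguments can be made concrete with a toy example that uses only base Julia and does not call the package. The amino acid alphabet below is an assumed ordering for illustration only; the ordering the package uses for sequence integers may differ.

```julia
L = 5                                    # number of residues in the toy example
locs = zeros(Float32, 3, 1, L)           # one 3-vector translation per residue
rots = zeros(Float32, 3, 3, L)           # one 3×3 rotation matrix per residue
for i in 1:L
    rots[:, :, i] .= [1 0 0; 0 1 0; 0 0 1]   # identity rotation
end

# Assumed alphabet ordering (hypothetical, not taken from the package).
alphabet = collect("ACDEFGHIKLMNPQRSTVWY")
seqints = [1, 2, 3, 4, 5]
seq = String(alphabet[seqints])          # sequence integers -> amino acid string

# Batched form: the same arrays with a trailing batch dimension B.
B = 2
locs_batched = repeat(locs, 1, 1, 1, B)  # size (3, 1, L, B)
rots_batched = repeat(rots, 1, 1, 1, B)  # size (3, 3, L, B)
```

The residue dimension L comes third in both layouts, and batching only appends a fourth dimension.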
ProteinChains.writepdb — Function
writepdb(path, chains::AbstractVector{<:ProteinChains.ProteinChain})
Examples
using DLProteinFormats
data = DLProteinFormats.load(PDBSimpleFlat500);
flat_chains = data[1];
chains = DLProteinFormats.unflatten(
    flat_chains.locs,
    flat_chains.rots,
    flat_chains.AAs,
    flat_chains.chainids,
    flat_chains.resinds) # unflatten the flat data
writepdb("chains-1.pdb", chains) # view in e.g. ChimeraX or a VS Code protein viewer extension