mdsa_tools.Data_gen_hbond
A module for creating and manipulating systems representations of molecular dynamics trajectories. Most MD groups typically have access to HPC resources, which makes tasks like high-dimensional clustering tractable. On standard workstations, we recommend down-sampling or masking datasets before use.
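The down-sampling and masking recommended above can be sketched with plain NumPy slicing. This is only an illustration: the `systems` array is a hypothetical stand-in for a systems representation, and the stride and residue block are arbitrary choices, not values the module prescribes.

```python
import numpy as np

# Hypothetical stand-in for a systems representation: 1000 frames of
# 50 x 50 residue-residue adjacency matrices (all zeros, for illustration).
systems = np.zeros((1000, 50, 50), dtype=np.int8)

# Down-sample: keep every 10th frame. Basic slicing returns a view, not a copy.
downsampled = systems[::10]

# Mask: keep only a block of residues (rows and columns 0..19).
keep = np.arange(20)
masked = downsampled[:, keep[:, None], keep]

print(downsampled.shape)  # (100, 50, 50)
print(masked.shape)       # (100, 20, 20)
```

Striding first and masking second keeps the intermediate array small, which matters most on a workstation with limited memory.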
For AMBER users, we also recommend the CPPTRAJ_IMPORT module, which imports the results of the hbond command in series form and maps atomic hbond counts to the residue level.
IMPORTANT NOTE is that this module expects your
See Also
mdsa_tools.Cpptraj_import.cpptraj_hbond_import
Classes
- class mdsa_tools.Data_gen_hbond.TrajectoryProcessor(trajectory_path=None, topology_path=None, one_indexed=None, preloaded_trajectory=None)
Bases: object
- Parameters:
- trajectory_path: str
Path to a trajectory file in any format supported by MDTraj.
- topology_path: str
Path to the topology file that corresponds to the trajectory you would like to load.
- Attributes:
- system_representation: np.ndarray or None
Array of adjacency matrices representing residue–residue interactions for each frame of the trajectory. Shape = (n_frames, n_residues+1, n_residues+1). Initialized as None until create_system_representations is called.
- filtered_representation: np.ndarray or None
Subset of the system representation containing only residues of interest. Generated by create_filtered_representations. Useful for focused analyses.
- feature_matrix: np.ndarray or None
Matrix representation of the system suitable for downstream dimensionality reduction and clustering workflows. Placeholder attribute, populated by analysis routines.
- topology: mdtraj.Topology
MDTraj topology object corresponding to the loaded trajectory. Provides residue and atom indexing used throughout representation building.
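The documentation does not say how feature_matrix is built. One common way to turn a stack of symmetric adjacency matrices into a frames-by-features matrix is to keep the strict upper triangle of each frame. The sketch below assumes symmetric matrices and uses random toy data. It also ignores the extra row and column implied by the n_residues+1 shape above, and it should not be read as the library's own routine.

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_residues = 5, 4

# Hypothetical symmetric contact counts standing in for per-frame adjacency matrices.
upper = np.triu(rng.integers(0, 3, size=(n_frames, n_residues, n_residues)), k=1)
systems = upper + upper.transpose(0, 2, 1)

# Keep each residue pair once: the strict upper triangle of every frame.
rows, cols = np.triu_indices(n_residues, k=1)
feature_matrix = systems[:, rows, cols]

print(feature_matrix.shape)  # (5, 6): one row per frame, one column per residue pair
```

A matrix in this (n_frames, n_features) layout is what most dimensionality reduction and clustering tools expect as input.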
Methods
- Process_trajectory(trajectory, array_template)
Processes an individual frame of template array and fills in hydrogen bonding values.
- create_attributes(trajectory[, granularity, ...])
Returns the atom-to-residue dictionary and template array for processing.
- create_filtered_representations(residues_to_keep)
Filters array representations to contain only residues of interest.
- create_system_representations([trajectory, ...])
Wraps the operations for creating systems representations into a single method.
Notes
Unless the trajectory file is in a format that includes its own topology information, pass the topology as its separate argument top, or else an error will be thrown.
Examples
>>> tp = TrajectoryProcessor("traj.mdcrd", "topology.prmtop")
>>> tp.trajectory
<mdtraj.Trajectory with 1000 frames, 2000 atoms>
- Process_trajectory(trajectory, array_template, atom_to_residue=None, granularity=None, one_indexed=None) ndarray
Processes an individual frame of template array and fills in hydrogen bonding values.
- Parameters:
- trajectory: mdtraj.Trajectory
An MDTraj trajectory object from which the adjacency matrices are computed directly.
- array_template: np.ndarray, shape=(n_frames, n_residues, n_residues)
An empty array holding n_frames adjacency matrices, each of size n_residues × n_residues.
- atom_to_residue: Dict, atom_to_residue[atom_index]=residue_index
Dictionary containing atom-to-residue index mappings.
- Returns:
- array_template: np.ndarray, shape=(n_frames, n_residues, n_residues)
A reference to the original array. The same array is updated in memory; it is returned as well for thoroughness.
Examples
>>> tp = TrajectoryProcessor("traj.mdcrd", "topology.prmtop")
>>> atom_to_residue, template = tp.create_attributes(tp.trajectory)
>>> filled = tp.Process_trajectory(tp.trajectory, template, atom_to_residue)
>>> filled.shape
(1000, 495, 495)
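The Returns entry for Process_trajectory says the returned array is a reference to the template that was passed in. A minimal sketch of that in-place pattern follows; `fill_in_place` is a hypothetical stand-in, not the method itself.

```python
import numpy as np

def fill_in_place(template: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a routine that fills a template in place."""
    template[0, 1, 2] = 1  # write directly into the caller's array
    template[0, 2, 1] = 1
    return template        # return the same object, not a copy

template = np.zeros((2, 3, 3), dtype=int)
filled = fill_in_place(template)

print(filled is template)      # True
print(int(template[0, 1, 2]))  # 1
```

Because no copy is made, keep your own copy of the template (for example with template.copy()) if you need the unfilled array afterwards.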
- create_attributes(trajectory, granularity=None, one_indexed=None) Tuple[ndarray, Dict]
Returns the atom-to-residue dictionary and template array for processing.
- Parameters:
- trajectory:mdtraj.Trajectory
- Returns:
- atom_to_residue:Dict, atom_to_residue[atom_index]=residue_index
Dictionary containing atom to residue mappings
- template_array: np.ndarray, shape=(n_frames, n_residues, n_residues)
Template array with one adjacency matrix per frame. Its shape depends on the number of residues in the trajectory and the number of frames.
Notes
The atom-to-residue dictionary matters because the function used to extract hydrogen bonding information returns hydrogen bonds at the atomic level, while this particular “systems” representation needs them at the residue level.
The template array lets us create a single data structure once and modify it later, which improves efficiency.
Examples
>>> tp = TrajectoryProcessor("traj.mdcrd", "topology.prmtop")
>>> atom_to_residue, template = tp.create_attributes(tp.trajectory)
>>> len(atom_to_residue)
2000
>>> template.shape
(1000, 495, 495)
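The Notes for create_attributes explain why atom-level hydrogen bonds need to be mapped to residues. The sketch below shows one way that mapping could work for a single frame. The (donor, hydrogen, acceptor) triplet layout, the toy indices, and the symmetric counting are all assumptions made for illustration; the source does not specify which hydrogen-bond function or counting rule the module uses.

```python
import numpy as np

# Hypothetical mapping for 6 atoms spread over 3 residues.
atom_to_residue = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}

# Hypothetical atom-level hydrogen bonds for one frame, given as
# (donor, hydrogen, acceptor) atom-index triplets.
hbonds = np.array([[0, 1, 2], [2, 3, 4], [0, 1, 3]])

n_residues = len(set(atom_to_residue.values()))
adjacency = np.zeros((n_residues, n_residues), dtype=int)

for donor, _hydrogen, acceptor in hbonds:
    i = atom_to_residue[int(donor)]
    j = atom_to_residue[int(acceptor)]
    if i != j:  # skip bonds within a single residue
        adjacency[i, j] += 1
        adjacency[j, i] += 1  # assumed symmetric (undirected) counting

print(adjacency)
```

Here two atom-level bonds connect residues 0 and 1, so that pair receives a count of 2, while residues 1 and 2 share a single bond.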
- create_filtered_representations(residues_to_keep, systems_representation=None)
Filters array representations to contain only residues of interest.
- Parameters:
- systems_representation: np.ndarray, shape=(n_frames,n_residues,n_residues)
Array containing adjacency matrices for every frame. Shape is dependent on residues in trajectory and number of frames.
- residues_to_keep:
An array listing the residues of interest.
Examples
>>> tp = TrajectoryProcessor("traj.mdcrd", "topology.prmtop")
>>> tp.create_system_representations()
>>> filtered = tp.create_filtered_representations(residues_to_keep=[10, 20, 30])
>>> filtered.shape
(1000, 4, 4)
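In the example above, three residues of interest produce 4 × 4 matrices, which matches the n_residues+1 shape documented for system_representation. One possible reading, which the source does not confirm, is that the first row and column of each frame hold residue labels. Under that assumption, a filter could be sketched as follows; `filter_frames` and the toy data are hypothetical.

```python
import numpy as np

def filter_frames(systems, residues_to_keep):
    """Hypothetical filter; assumes row 0 and column 0 of each frame hold residue labels."""
    labels = systems[0, 0, 1:]  # labels read from the first frame's header row
    keep = np.flatnonzero(np.isin(labels, residues_to_keep)) + 1  # shift past the header
    idx = np.concatenate(([0], keep))  # always keep the header row and column
    return systems[:, idx[:, None], idx]

# Toy stack: 2 frames, 5 residues labelled 1..5, plus the assumed label row/column.
n_frames, n_res = 2, 5
systems = np.zeros((n_frames, n_res + 1, n_res + 1), dtype=int)
systems[:, 0, 1:] = np.arange(1, n_res + 1)
systems[:, 1:, 0] = np.arange(1, n_res + 1)
systems[0, 2, 4] = systems[0, 4, 2] = 3  # residues 2 and 4 interact in frame 0

filtered = filter_frames(systems, [2, 4, 5])
print(filtered.shape)  # (2, 4, 4)
print(filtered[0])
```

Keeping the label row and column alongside the data means the filtered array still records which residue each row refers to.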
- create_system_representations(trajectory=None, granularity=None)
Wraps the operations for creating systems representations into a single method.
- Parameters:
- trajectory: mdtraj.Trajectory
An MDTraj trajectory object. One should normally have been created when the class was instantiated, but it can also be passed explicitly here.
- granularity:str,default=
- Returns:
- Systems: np.ndarray, shape=(n_frames, n_residues, n_residues)
Array containing adjacency matrices for every frame. Its shape depends on the number of residues in the trajectory and the number of frames.
Examples
>>> tp = TrajectoryProcessor("traj.mdcrd", "topology.pdb")
>>> systems = tp.create_system_representations()
>>> systems.shape
(1000, 495, 495)