mdsa_tools.Viz
The Visualization module collects the plotting operations for the various graphs we create.
It is not a single larger object but a series of complementary functions, including helpers that build different colorbars depending on the values being plotted.
For the most part we expect a maximum of around 256 bins; if any labelling (color coding) of your graphs has more bins than that, the plot automatically switches to a continuous colorbar based on sample index. The exact cutoff is documented per function (for example, 250 unique values in visualize_reduction and 1000 in replicatemap_from_labels).
See Also
mdsa_tools.Analysis : A lot of the results you will probably visualize
mdsa_tools.Data_gen_hbond.create_system_representations : Build residue–residue H-bond adjacency matrices.
Functions
| add_continuous_colorbar | Add a continuous colorbar to a scatter plot. |
| add_discrete_colorbar | Add a discrete (categorical) colorbar to a scatter plot. |
| bubble_grid_manifoldlearning | Scatter "bubble grid" for a UMAP hyperparameter sweep. |
| contour_embedding_space | Plot a density contour map over 2-D embedding coordinates and save to disk. |
| create_2d_color_mappings | Produce a list of colors for 2-D scatter points given discrete labels. |
| create_MDcircos_from_weightsdf | Create and save MD-circos diagrams for PC1 and PC2 magnitudes from a weights table. |
| extract_properties_from_weightsdf | Parse a Systems Analysis weights table into residue IDs and per-PC weight mappings. |
| get_Circos_coordinates | Create chord endpoints anchored at the middle of a residue arc. |
| make_MDCircos_object | Build a PyCircos Gcircle with arcs for the provided residues. |
| mdcircos_graph | Draw chords on a PyCircos circle from pairwise weights and save images. |
| plot_elbow_scores | Plot inertia over k, estimate the elbow via the second derivative, and save. |
| plot_sillohette_scores | Plot silhouette scores over k, mark the maximum, and save. |
| replicatemap_from_labels | Plot a "replicate × frame" map of discrete labels and save to disk. |
| rmsd_lineplots | Create a grouped line plot of RMSD (or similar metric) over a window variable. |
| set_ticks | Set x and y ticks for an axis depending on range. |
| visualize_reduction | Plot a 2-D embedding (e.g., PCA/UMAP) as a scatter with optional coloring and colorbar, and save the figure to disk. |
- mdsa_tools.Viz.add_continuous_colorbar(scatter, labels, cbar_label=None, ax=None, cmap=None, extend='neither', format=None, dpi=600)
Add a continuous colorbar to a scatter plot.
Works for numeric labels directly; for non-numeric labels, maps unique values to an ordinal numeric sequence and normalizes over that range.
- Parameters:
- scatter : matplotlib.collections.PathCollection
The scatter object returned by Axes.scatter(…).
- labels : array-like or None
Values to color by. If None, uses an index-based gradient.
- cbar_label : str or None, default=None
Colorbar label.
- ax : matplotlib.axes.Axes or None, default=None
Target axes. Defaults to plt.gca().
- cmap : str or matplotlib.colors.Colormap or None, default=None
Colormap name or object. Defaults to cm.inferno.
- extend : {'neither', 'both', 'min', 'max'}, default='neither'
Colorbar extension behavior.
- format : str or matplotlib.ticker.Formatter or None, default=None
Formatting for colorbar tick labels.
- dpi : int, default=600
Unused in this helper (kept for API consistency with plotting functions).
- Returns:
- matplotlib.colorbar.Colorbar
The created colorbar.
Notes
This function also applies the computed Normalize and Colormap to the provided scatter so the colorbar reflects the actual plotted data.
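The handling of non-numeric labels described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name `labels_to_unit_interval` is hypothetical, and ordering unique values by first appearance is an assumption (the text above only says they are mapped to an ordinal sequence).

```python
def labels_to_unit_interval(labels):
    """Map labels to floats in [0, 1] for use with a continuous colormap.

    Numeric labels are normalized directly; non-numeric labels are first
    replaced by ordinal codes (assumption: first-appearance order).
    """
    labels = list(labels)
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in labels
    )
    if numeric:
        values = [float(v) for v in labels]
    else:
        codes = {}
        for v in labels:
            codes.setdefault(v, len(codes))  # first appearance -> next ordinal
        values = [float(codes[v]) for v in labels]

    lo, hi = min(values), max(values)
    span = hi - lo
    # A constant input has zero span; map everything to 0.0 in that case.
    return [(v - lo) / span if span else 0.0 for v in values]


numeric_norm = labels_to_unit_interval([10, 20, 30])
categorical_norm = labels_to_unit_interval(["apo", "holo", "apo", "mutant"])
print(numeric_norm)      # [0.0, 0.5, 1.0]
print(categorical_norm)  # [0.0, 0.5, 0.0, 1.0]
```

The resulting values are what a Normalize-style mapping would hand to the colormap, so identical labels always receive identical colors.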
- mdsa_tools.Viz.add_discrete_colorbar(scatter, labels, cbar_label=None, ax=None, cmap=None, dpi=600)
Add a discrete (categorical) colorbar to a scatter plot.
Maps the unique labels to integer IDs and shows a tick per category. For large cardinality (N > 100) it sparsifies ticks every 10 to improve readability.
- Parameters:
- scatter : matplotlib.collections.PathCollection
The scatter object returned by Axes.scatter(…).
- labels : array-like
Categorical labels per point. Converted to strings for tick labels.
- cbar_label : str or None, default=None
Colorbar label.
- ax : matplotlib.axes.Axes or None, default=None
Target axes. Defaults to plt.gca().
- cmap : str or matplotlib.colors.Colormap or None, default=None
Colormap name or object. Defaults to cm.inferno.
- dpi : int, default=600
Unused in this helper (kept for API consistency with plotting functions).
- Returns:
- matplotlib.colorbar.Colorbar
The created colorbar.
Notes
Sets the scatter’s norm to a BoundaryNorm over integer bins matching the number of unique categories so colors align with discrete tick marks.
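The category-to-integer mapping and the tick sparsification described above can be illustrated without Matplotlib. This is only a sketch under stated assumptions: `discrete_ids_and_ticks` is a hypothetical helper, and sorting the string labels to assign IDs is an assumption about ordering that the text does not specify.

```python
def discrete_ids_and_ticks(labels, max_dense=100, step=10):
    """Return per-point integer IDs plus the tick positions/labels to show.

    With more than `max_dense` categories, only every `step`-th tick is kept.
    """
    labels = [str(v) for v in labels]      # tick labels are strings
    categories = sorted(set(labels))       # assumption: sorted order
    ids = {c: i for i, c in enumerate(categories)}
    point_ids = [ids[v] for v in labels]

    n = len(categories)
    if n > max_dense:
        tick_positions = list(range(0, n, step))   # sparsified ticks
    else:
        tick_positions = list(range(n))            # one tick per category
    tick_labels = [categories[i] for i in tick_positions]
    return point_ids, tick_positions, tick_labels


small_ids, small_ticks, small_names = discrete_ids_and_ticks(["b", "a", "b", "c"])
_, large_ticks, _ = discrete_ids_and_ticks(range(150))
print(small_ids, small_ticks, small_names)
print(len(large_ticks))
```

The integer IDs are what a BoundaryNorm with one bin per category would bucket, which is why colors and ticks line up.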
- mdsa_tools.Viz.bubble_grid_manifoldlearning(UMAP_opt_dataframe, xlabel=None, ylabel=None, title=None, cbar_label=None, cmap=None, color_palette=None, savepath=None, dpi=600)
Scatter “bubble grid” for a UMAP hyperparameter sweep.
- Parameters:
- UMAP_opt_dataframe : pandas.DataFrame
Columns required:
'n_neighbors' (int)
'min_dist' (float)
'pearson_r' (float)
'bubble_size' (float): precomputed marker areas (points^2) to use for s=.
- xlabel, ylabel, title : str or None
Axis/title labels. Defaults: 'N. Neighbors', 'Min Dist', 'UMAP Hyperparameter Sweep'.
- cbar_label : str or None
Colorbar label. Default 'Pearson r'.
- cmap : matplotlib Colormap or sequence or None
Base colormap for coloring by 'pearson_r'. Default is cm.magma_r. If a sequence (list/tuple/ndarray) is provided, it is converted to a colormap. Ignored if color_palette is provided.
- color_palette : sequence or matplotlib Colormap, optional
Simple override for cmap. If given a list/tuple/array of colors, a colormap is constructed from it; if given a Colormap, it is used directly.
- savepath : str or None
Full path (including filename) to save the figure. Defaults to the current working directory.
- dpi : int
Dots-per-inch for saving (default 600).
- Returns:
- None
- mdsa_tools.Viz.contour_embedding_space(outfile_path, embeddingspace_coordinates, levels=10, thresh=0, bw_adjust=0.5, title=None, xlabel=None, ylabel=None, gridvisible=False, dpi=600)
Plot a density contour map over 2-D embedding coordinates and save to disk.
- Parameters:
- outfile_path : str or None
Output path (file name). If None, uses the current working directory.
- embeddingspace_coordinates : array-like of shape (n_samples, 2)
The 2-D embedding coordinates (e.g., PCA/UMAP).
- levels : int, default=10
Number of contour levels.
- thresh : float, default=0
Only draw contours where estimated density is greater than this threshold.
- bw_adjust : float, default=0.5
Bandwidth adjustment factor for KDE.
- title : str or None, default=None
Plot title.
- xlabel, ylabel : str or None, default=None
Axis labels.
- gridvisible : bool, default=False
Whether to show the background grid.
- dpi : int, default=600
Dots-per-inch used when saving the figure.
- Returns:
- None
Saves the contour plot and closes the figure.
Notes
Convenience wrapper over sns.kdeplot with filled contours and colorbar.
- mdsa_tools.Viz.create_2d_color_mappings(labels=[80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 160, 160, 160, 160, 160, 160, 160, 160, 160, 160], colors_list=('purple', 'orange', 'green', 'yellow', 'blue', 'red', 'pink', 'cyan', 'grey', 'brown'), dpi=600)
Produce a list of colors for 2-D scatter points given discrete labels.
- Parameters:
- labels : array-like of shape (n_samples,), default=([80]*20)+([160]*10)
Discrete labels per sample (e.g., cluster IDs).
- colors_list : sequence of str, default=('purple', 'orange', 'green', 'yellow', 'blue', 'red', 'pink', 'cyan', 'grey', 'brown')
Palette to cycle through for unique labels.
- dpi : int, default=600
Unused in this helper (kept for API consistency with plotting functions).
- Returns:
- list[str]
A color per sample.
Examples
>>> labels = [0, 0, 1, 2, 2, 2]
>>> colors = create_2d_color_mappings(labels)
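To make "palette to cycle through for unique labels" concrete, here is a small stand-alone sketch. It is not the library's code: `cycle_colors` is a hypothetical name, and assigning palette entries in first-appearance order of the labels is an assumption.

```python
DEFAULT_COLORS = ('purple', 'orange', 'green', 'yellow', 'blue',
                  'red', 'pink', 'cyan', 'grey', 'brown')


def cycle_colors(labels, colors_list=DEFAULT_COLORS):
    """Return one color per sample, reusing the palette when it runs out."""
    unique = []
    for v in labels:
        if v not in unique:
            unique.append(v)          # assumption: first-appearance order
    color_of = {
        v: colors_list[i % len(colors_list)]   # modulo wraps around the palette
        for i, v in enumerate(unique)
    }
    return [color_of[v] for v in labels]


colors = cycle_colors([0, 0, 1, 2, 2, 2])
print(colors)
# ['purple', 'purple', 'orange', 'green', 'green', 'green']

# With more unique labels than palette entries, colors repeat.
wrapped = cycle_colors(list(range(12)))
print(wrapped[10], wrapped[11])  # purple orange
```

Because the palette wraps, more than ten unique labels produce repeated colors, so a longer `colors_list` is worth passing for larger label sets.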
- mdsa_tools.Viz.create_MDcircos_from_weightsdf(PCA_ranked_weights, outfilepath=None, dpi=600)
Create and save MD-circos diagrams for PC1 and PC2 magnitudes from a weights table.
- Parameters:
- PCA_ranked_weights : pandas.DataFrame
Must include 'Comparisons', 'PC1_magnitude', and 'PC2_magnitude' columns.
- outfilepath : str or None, default=None
Output prefix directory/path. If None, uses os.getcwd(). The function appends the stems:
'PC1_magnitudeviz'
'PC2_magnitudeviz'
before adding file extensions.
- dpi : int, default=600
Dots-per-inch used when saving the generated figures.
- Returns:
- None
Saves the figures and colorbars to disk.
Notes
Both PC1 and PC2 visualizations share the same arc layout constructed from the residue IDs present in ‘Comparisons’.
- mdsa_tools.Viz.extract_properties_from_weightsdf(pca_table)
Parse a Systems Analysis weights table into residue IDs and per-PC weight mappings.
- Parameters:
- pca_table : pandas.DataFrame
Must contain at least the columns 'Comparisons' (str, residue pair keys like 'i-j'), 'PC1_magnitude' (float), and 'PC2_magnitude' (float).
- dpi : int, default=600
Unused in this helper (kept for API consistency with plotting functions).
- Returns:
- residues : list of str
Unique residue IDs encountered in 'Comparisons', in first-appearance order.
- PC1_weight_dict : dict[str, float]
Mapping 'i-j' -> PC1 magnitude.
- PC2_weight_dict : dict[str, float]
Mapping 'i-j' -> PC2 magnitude.
Notes
Keys are preserved exactly; inverse pairs ('i-j' vs 'j-i') are not merged.
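The parsing behavior described above (first-appearance residue order, keys kept verbatim, inverse pairs not merged) can be sketched as follows. This is an illustration rather than the library's implementation: `parse_weight_rows` is a hypothetical helper, and a list of dicts stands in for the DataFrame rows so the example needs no third-party packages.

```python
def parse_weight_rows(rows):
    """Collect residue IDs and per-PC weight dicts from weights-table rows."""
    residues, pc1, pc2 = [], {}, {}
    for row in rows:
        key = row["Comparisons"]
        first, second = key.split("-", 1)   # split on the first '-'
        for residue in (first, second):
            if residue not in residues:     # keep first-appearance order
                residues.append(residue)
        pc1[key] = row["PC1_magnitude"]     # key stored exactly as given
        pc2[key] = row["PC2_magnitude"]
    return residues, pc1, pc2


rows = [
    {"Comparisons": "10-20", "PC1_magnitude": 0.40, "PC2_magnitude": 0.10},
    {"Comparisons": "20-30", "PC1_magnitude": -0.20, "PC2_magnitude": 0.30},
    {"Comparisons": "20-10", "PC1_magnitude": 0.05, "PC2_magnitude": 0.00},
]
residues, pc1_weights, pc2_weights = parse_weight_rows(rows)
print(residues)              # ['10', '20', '30']
print(sorted(pc1_weights))   # ['10-20', '20-10', '20-30']
```

Note that '10-20' and '20-10' remain separate entries, matching the statement that inverse pairs are not merged.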
- mdsa_tools.Viz.get_Circos_coordinates(residue, gcircle)
Create chord endpoints anchored at the middle of a residue arc.
- Parameters:
- residue : str or int
Residue arc identifier present in gcircle. If int, it will be looked up as a string (e.g., '42').
- gcircle : py.Gcircle
A PyCircos Gcircle object that already contains arcs.
- dpi : int, default=600
Unused in this helper (kept for API consistency with plotting functions).
- Returns:
- tuple
A 4-tuple (arc_id, start_pos, end_pos, radial) suitable for Gcircle.chord_plot(…), where start and end positions are the arc midpoint and the radial anchor is 550.
Notes
Assumes the arc exists in gcircle._garc_dict. This is a convenience wrapper to place chords at arc midpoints for a tidy symmetric look.
Examples
>>> arc = get_Circos_coordinates('45', circle)
>>> # later: circle.chord_plot(arc, other_arc, linewidth=1.5, facecolor='k')
- mdsa_tools.Viz.make_MDCircos_object(residue_indexes)
Build a PyCircos Gcircle with arcs for the provided residues.
Arc sizing, label size, and figure size are coarsely adapted to the number of residues to keep visuals legible for both small and large sets.
- Parameters:
- residue_indexes : list of (str or int)
Residue identifiers to add as arcs. Stored as strings internally.
- dpi : int, default=600
Unused here (figure dpi is not altered to avoid changing behavior).
- Returns:
- py.Gcircle
A Gcircle with arcs added and set_garcs() already called.
Notes
For small sets (<= 50 residues) a compact 6×6 figure is used; for larger sets a 10×10 figure with narrower labels and bigger arc sizes is used.
- mdsa_tools.Viz.mdcircos_graph(empty_circle, residue_dict, savepath=os.getcwd() + 'mdcircos_graph', scale_factor=5, colormap=matplotlib.cm.magma_r, dpi=600)
Draw chords on a PyCircos circle from pairwise weights and save images.
Creates a chord diagram on empty_circle using residue_dict, where keys are residue pair strings 'i-j' and values are magnitudes (signed allowed). Saves the main diagram as savepath + '.png' and a separate colorbar image as savepath + '_colorbar.png'.
- Parameters:
- empty_circle : py.Gcircle
A Gcircle that already has arcs for all residues referenced by keys in residue_dict.
- residue_dict : dict[str, float]
Mapping from pair key 'i-j' to a numeric magnitude (used for both chord color and line width after normalization).
- savepath : str, default=os.getcwd()+'mdcircos_graph'
Output prefix for image files.
- scale_factor : float, default=5
Multiplier for the normalized chord linewidths.
- colormap : str or matplotlib.colors.Colormap, default=cm.magma_r
Colormap used for chord colors and the separate colorbar.
- dpi : int, default=600
Dots-per-inch used when saving the figures.
- Returns:
- None
Saves figure(s) to disk and closes the colorbar figure.
Notes
Colors are min-max normalized over the raw (signed) values; widths use min-max over absolute values for aesthetics.
Pair keys are split on the first ‘-’ to look up per-residue arc anchors.
Examples
>>> circle = make_MDCircos_object(['10','20','30'])
>>> weights = {'10-20': 0.4, '20-30': -0.2}
>>> mdcircos_graph(circle, weights, savepath='/tmp/example')
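The two normalizations mentioned in the Notes (signed values for color, absolute values for width) can be worked through with a short sketch. This is an assumption-laden illustration, not the library's code: `chord_styles` is a hypothetical helper, and mapping a zero-range input to 0.0 is a choice made here for safety.

```python
def chord_styles(weights, scale_factor=5):
    """Return {pair_key: (color_position, linewidth)} for each chord.

    color_position: min-max of the raw signed value, in [0, 1].
    linewidth: min-max of the absolute value, times scale_factor.
    """
    values = list(weights.values())
    lo, hi = min(values), max(values)
    abs_values = [abs(v) for v in values]
    abs_lo, abs_hi = min(abs_values), max(abs_values)

    styles = {}
    for key, v in weights.items():
        color_pos = (v - lo) / (hi - lo) if hi > lo else 0.0
        width = (abs(v) - abs_lo) / (abs_hi - abs_lo) if abs_hi > abs_lo else 0.0
        styles[key] = (color_pos, scale_factor * width)
    return styles


styles = chord_styles({'10-20': 0.4, '20-30': -0.2, '10-30': 0.1})
for key, (color_pos, width) in styles.items():
    print(key, round(color_pos, 3), round(width, 3))
```

In this example '20-30' sits at the bottom of the color scale because it is the most negative value, yet it is not the thinnest chord, since its absolute magnitude exceeds that of '10-30'.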
- mdsa_tools.Viz.plot_elbow_scores(cluster_range, inertia_scores, outfile_path=None, title=None, xlabel=None, ylabel=None, dpi=600)
Plot inertia over k, estimate the elbow via the second derivative, and save.
- Parameters:
- cluster_range : array-like
Candidate k values.
- inertia_scores : array-like
KMeans inertia per k (same length/order as cluster_range).
- outfile_path : str, default='elbow_method.png'
Path prefix or filename to save the figure. The code appends the suffix 'elbow_plot' to this string.
- title, xlabel, ylabel : str or None
Optional figure/axis labels.
- dpi : int, default=600
Dots-per-inch used when saving the figure.
- Returns:
- int
Estimated elbow k (argmin of the second difference + 1).
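A second-difference elbow estimate can be sketched in a few lines. This is a generic illustration and not necessarily what plot_elbow_scores does internally: the helper names are hypothetical, and picking the largest central second difference is an assumption (a common convention for a decreasing, convex inertia curve). The description above states "argmin of the second difference + 1", so the library's sign or index convention may differ from this sketch.

```python
def second_differences(scores):
    """Discrete second difference: s[i+2] - 2*s[i+1] + s[i]."""
    return [
        scores[i + 2] - 2 * scores[i + 1] + scores[i]
        for i in range(len(scores) - 2)
    ]


def elbow_by_curvature(cluster_range, inertia_scores):
    """Return the k where the inertia curve bends most sharply.

    Entry i of the second difference is centered on position i + 1,
    hence the +1 offset when indexing back into cluster_range.
    """
    d2 = second_differences(inertia_scores)
    idx = max(range(len(d2)), key=d2.__getitem__)
    return list(cluster_range)[idx + 1]


ks = [1, 2, 3, 4, 5, 6]
inertia = [1000, 600, 250, 200, 170, 150]
curvature = second_differences(inertia)
elbow_k = elbow_by_curvature(ks, inertia)
print(curvature)  # [50, 300, 20, 10]
print(elbow_k)    # 3
```

The +1 offset is the part shared with the description above: a second difference has two fewer entries than the input, and each entry belongs to the middle point of its three-point window.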
- mdsa_tools.Viz.plot_sillohette_scores(cluster_range, silhouette_scores, outfile_path=None, title=None, xlabel=None, ylabel=None, dpi=600)
Plot silhouette scores over k, mark the maximum, and save.
- Parameters:
- cluster_range : array-like
Candidate k values.
- silhouette_scores : array-like
Silhouette score per k (same length/order as cluster_range).
- outfile_path : str, default='sillohette_method.png'
Path prefix or filename to save the figure. The code appends the suffix 'sillohuette_plot' (note spelling) to this string.
- title, xlabel, ylabel : str or None
Optional figure/axis labels.
- dpi : int, default=600
Dots-per-inch used when saving the figure.
- Returns:
- int
k with maximum silhouette score.
Notes
The filename suffix used in saving is 'sillohuette_plot' for historical reasons.
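Selecting the k with the maximum silhouette score amounts to an argmax over paired sequences. The following sketch shows that selection on its own; `best_k_by_silhouette` is a hypothetical helper, and returning the first k on ties is an assumption made here.

```python
def best_k_by_silhouette(cluster_range, silhouette_scores):
    """Return the candidate k whose silhouette score is highest."""
    pairs = list(zip(cluster_range, silhouette_scores))
    if not pairs:
        raise ValueError("cluster_range and silhouette_scores must be non-empty")
    # max() keeps the first maximum it sees, so ties resolve to the smaller index.
    best_k, _ = max(pairs, key=lambda pair: pair[1])
    return best_k


best_k = best_k_by_silhouette([2, 3, 4, 5], [0.41, 0.57, 0.52, 0.30])
print(best_k)  # 3
```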
- mdsa_tools.Viz.replicatemap_from_labels(labels, frame_list, savepath=None, title=None, xlabel=None, ylabel=None, cbar_label=None, cmap=None, dpi=600) -> None
Plot a “replicate × frame” map of discrete labels and save to disk.
- Parameters:
- labels : array-like of shape (n_total_frames,)
Label per frame (e.g., k-means cluster or any discrete annotation), concatenated across replicates in the same order as frame_list.
- frame_list : array-like of shape (n_replicates,)
Number of frames in each replicate, in the exact concatenation order used to build labels.
- savepath : str or None, default=None
Directory or path prefix where the plot is saved. If None, uses os.getcwd(). The file name appended is 'replicate_map.png' (i.e., saved at f"{savepath}replicate_map.png").
- title : str or None, default=None
Figure title; if None, a default is used.
- xlabel, ylabel : str or None, default=None
Axis labels. If omitted, defaults are used.
- cbar_label : str or None, default=None
Label for the colorbar.
- cmap : str or matplotlib.colors.Colormap or None, default=None
Colormap for the label values. Defaults to cm.magma_r.
- dpi : int, default=600
Dots-per-inch used when saving the figure.
- Returns:
- None
The figure is saved to disk and closed. Nothing is returned.
Notes
Uses a small square marker per (replicate, frame) and a discrete colorbar for low/medium cardinality labels; switches to a continuous colorbar when unique label count is very large (>= 1000).
Replicate index is placed on the y-axis (top row is replicate 0) and the axis is inverted for a top-down visual.
Examples
>>> labels = [0]*100 + [1]*120 + [2]*90
>>> frames = [100, 120, 90]
>>> replicatemap_from_labels(labels, frames, savepath="/tmp/")
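The relationship between the flat labels array and frame_list is easy to get wrong, so here is a sketch of how concatenated labels map to (replicate, frame) positions. It is an illustration only: `replicate_frame_coordinates` is a hypothetical helper, and the length check is an added safeguard rather than documented behavior.

```python
def replicate_frame_coordinates(labels, frame_list):
    """Expand concatenated labels into (replicate, frame, label) triples."""
    if sum(frame_list) != len(labels):
        raise ValueError("sum(frame_list) must equal len(labels)")
    coords = []
    start = 0
    for replicate, n_frames in enumerate(frame_list):
        for frame in range(n_frames):
            # Frame indices restart at 0 for every replicate.
            coords.append((replicate, frame, labels[start + frame]))
        start += n_frames   # jump to the next replicate's block of labels
    return coords


coords = replicate_frame_coordinates([0, 0, 0, 1, 1], [3, 2])
print(coords)
# [(0, 0, 0), (0, 1, 0), (0, 2, 0), (1, 0, 1), (1, 1, 1)]
```

Each triple corresponds to one marker in the map: frame on the x-axis, replicate on the y-axis, label as the color.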
- mdsa_tools.Viz.rmsd_lineplots(pandasdf=None, title='RMSD plot', xgroupvar='window', ygroupvar='rmsd', xlab='window', ylab='rmsd', groupingvar='cluster', cmap=matplotlib.cm.inferno_r, cmap_is_colormap=True, legendtitle='Cluster', outfilepath=os.getcwd(), dpi=600)
Create a grouped line plot of RMSD (or similar metric) over a window variable.
- Parameters:
- pandasdf : pandas.DataFrame or None, default=None
Data with at least the columns specified by xgroupvar, ygroupvar, and groupingvar.
- title : str, default='RMSD plot'
Plot title.
- xgroupvar : str, default='window'
Column used for the x-axis.
- ygroupvar : str, default='rmsd'
Column used for the y-axis.
- xlab : str, default='window'
X-axis label.
- ylab : str, default='rmsd'
Y-axis label.
- groupingvar : str, default='cluster'
Column used to form separate lines (and legend entries).
- cmap : str or matplotlib.colors.Colormap, default=cm.inferno_r
Palette/colormap to use.
- cmap_is_colormap : bool, default=True
If True, interpret cmap as a Matplotlib colormap.
- legendtitle : str, default='Cluster'
Legend title.
- outfilepath : str, default=os.getcwd()
Output prefix for the saved figure. The function appends '_rmsdlineplot'.
- dpi : int, default=600
Dots-per-inch used when saving the figure.
- Returns:
- None
Saves the line plot and closes the figure.
- mdsa_tools.Viz.set_ticks(ax=None, dpi=600)
Set x and y ticks for an axis depending on range.
If the axis span exceeds 100 units, ticks are placed every 10 units; otherwise, Matplotlib’s default tick locator is preserved.
- Parameters:
- ax : matplotlib.axes.Axes or None, default=None
Axis to apply tick settings. Defaults to the current axis.
- dpi : int, default=600
Unused in this helper (kept for API consistency with plotting functions).
- Returns:
- None
Modifies the axis in place.
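The span rule described above (ticks every 10 units once the axis span exceeds 100) can be sketched independently of Matplotlib. This is a hedged illustration: `tick_positions` is a hypothetical helper, returning None stands in for "leave the default locator alone", and aligning ticks to multiples of 10 is an assumption.

```python
import math


def tick_positions(lower, upper, threshold=100, step=10):
    """Return tick positions every `step` units for wide axes, else None."""
    if upper - lower <= threshold:
        return None                          # keep the default tick locator
    first = math.ceil(lower / step) * step   # first multiple of step >= lower
    return list(range(first, math.floor(upper) + 1, step))


wide = tick_positions(0, 250)
narrow = tick_positions(0, 50)
offset = tick_positions(-5.5, 120)
print(len(wide), wide[:3], wide[-1])   # 26 [0, 10, 20] 250
print(narrow)                          # None
print(offset[0], offset[-1])           # 0 120
```

In Matplotlib, the equivalent effect is typically obtained with a fixed-interval locator such as matplotlib.ticker.MultipleLocator(10) on each axis.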
- mdsa_tools.Viz.visualize_reduction(embedding_coordinates, cbar_type=None, color_mappings=None, savepath=os.getcwd(), title=None, cmap=None, axis_one_label=None, axis_two_label=None, cbar_label=None, gridvisible=False, color_palette=None, dpi=600)
Plot a 2-D embedding (e.g., PCA/UMAP) as a scatter with optional coloring and colorbar, and save the figure to disk.
- Parameters:
- embedding_coordinates : array-like of shape (n_samples, 2)
The 2D coordinates to plot.
- cbar_type : {'discrete', 'continuous'} or None, default=None
Desired colorbar behavior. If None, defaults to 'discrete'. When 'discrete' is selected but the number of unique values in color_mappings is large (>= 250), the function automatically falls back to a continuous colorbar.
- color_mappings : array-like of shape (n_samples,) or None, default=None
Values used to color points.
If provided (non-empty) and treated as categorical (i.e., cbar_type='discrete' and < 250 unique values), a discrete colorbar is drawn.
If provided but either cbar_type='continuous' or >= 250 unique values, a continuous colorbar is drawn.
If None or empty, points are colored by their index (0..n_samples-1) with a continuous colorbar.
- savepath : str, default=os.getcwd()
Full output path including filename. No extension is appended automatically. The figure is saved at 500 DPI.
- title : str or None, default='Dimensional Reduction of Systems'
Figure title.
- cmap : str or matplotlib.colors.Colormap or sequence, default=cm.magma_r
Base colormap. If a sequence is passed, it is converted to a Colormap. Ignored when color_palette is provided.
- axis_one_label : str or None, default='Embedding Space Axis 1'
X-axis label.
- axis_two_label : str or None, default='Embedding Space Axis 2'
Y-axis label.
- cbar_label : str or None, default='Value'
Colorbar label.
- gridvisible : bool, default=False
If True, show a background grid.
- color_palette : sequence of color specs or matplotlib.colors.Colormap, default=None
User-supplied palette that overrides cmap.
With categorical coloring: builds a ListedColormap from the sequence.
With continuous coloring or when color_mappings is None: builds a LinearSegmentedColormap from the sequence.
If a Colormap object is supplied, it is used directly.
- dpi : int, default=600
Dots-per-inch used for the created figure and when saving.
- Returns:
- None
Saves the plot to savepath and closes the figure.
Notes
Figure size is 16×12 inches at 300 DPI (saved at 500 DPI).
Axes spines are hidden; tick density is coarsened via set_ticks.
Automatically switches from discrete to continuous colorbar when unique categories >= 250 to keep the legend readable.
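The colorbar selection rules spread across cbar_type and color_mappings can be condensed into one decision function. The sketch below restates those documented rules in code; `choose_colorbar` is a hypothetical helper and not part of mdsa_tools.

```python
def choose_colorbar(color_mappings=None, cbar_type=None, max_discrete=250):
    """Return 'discrete' or 'continuous' following the rules described above."""
    if color_mappings is None or len(color_mappings) == 0:
        return "continuous"              # index-based coloring
    requested = cbar_type or "discrete"  # None defaults to 'discrete'
    n_unique = len(set(color_mappings))
    if requested == "discrete" and n_unique < max_discrete:
        return "discrete"
    return "continuous"                  # explicit request or too many categories


few = choose_colorbar([0, 1, 1, 2])
many = choose_colorbar(list(range(300)))
empty = choose_colorbar(None)
forced = choose_colorbar([0, 1], cbar_type="continuous")
print(few, many, empty, forced)
# discrete continuous continuous continuous
```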