chromo package

This package contains the main components of the polymer system being simulated. Each module represents a different scale of chromatin. At the largest scale are fields, which are implicit representations of polymers and their interactions. Taking a field theoretic approach, we can efficiently simulate whole chromosomes at nucleosome-scale resolution. Next, the mixture module deals with explicit interactions between beads on multiple polymer chains. The polymer module provides classes for different polymer modles and methods for evaluating elastic energy. The beads module provides classes to represent the discrete monomeric units of the polymer with various levels of detail. Finally, at the smallest scale, the binders module provides chemical properties of proteins or other bound material that dictate their interaction with the polymer and with each other.

chromo.fields

Fields discretize space to efficiently calculate change in binder energy.

Notes

Creates a field object that contains parameters for the field calculations and functions to generate the densities.

Cythonized to accomodate expensive compute_dE method.

To profile runtime of cython functions, add the following comment to the top of this file: # cython: profile=True

class chromo.fields.FieldBase

Bases: object

A discretization of space for computing energies.

Notes

Must be subclassed to be useful.

As implemented so far, this codebase accepts only one polymer in the field

polymers

Polymers contained in the field; for now, this codebase accepts only one polymer in the field

Type:

List[PolymerBase]

n_polymers

Number of polymers in the field

Type:

long

binders

Table containing reader proteins bound to the polymer and their relevant properties

Type:

pd.DataFrame

name

Print the name of the field.

Notes

For now, there’s only one field per sim, so classname works.

class chromo.fields.NullField

Bases: FieldBase

A field with no energy contributions.

from_file()

Load Field description.

to_file()

Save Field description.

class chromo.fields.Reconstructor(cls, **kwargs)

Bases: object

Defer defining Field until after PolymerBase/Binder instances.

Notes

Constructs a kwargs object that can be re-passed to the appropriate Field constructor when they become available.

Parameters:
  • field_constructor (cls) – Class from which the field will be instantiated

  • kwargs (Dict) – Keyword arguments used to instantiate the field

finalize(polymers, binders) FieldBase

Finish construction of appropriate Field object.

Parameters:
  • polymers (List[PolymerBase]) – List of polymers contained in the field

  • binders (pd.DataFrame) – Table representing reader proteins bound to polymers in the field

Returns:

Field representation of discretized space containing polymers

Return type:

FieldBase

classmethod from_file(path: Path) FieldBase

Assume class name is encoded in file name.

Parameters:

path (Path) – File path object directed to the file defining the field

Returns:

Field representation of discretized space containing polymers

Return type:

FieldBase

class chromo.fields.UniformDensityField

Bases: FieldBase

Rectilinear discretization of space as a rectangular box.

Notes

Each bead contributes “mass” to the eight nearest voxels containing it. The mass is linearly interpolated based on the distance of the bead from the center of each voxel. This is much more stable numerically than just using the voxels as “bins,” as would be done in e.g., finite-difference discretization.

Many attributes are updated during each calculation of field energy change. We choose to track and update these attributes to avoid costly array initialization during each iteration of the MC simulation. Affected attributes are identified with the “TEMP” flag in their description.

_field_descriptors

Names of attributes which describe the geometric layout of the field as a grid of voxels

Type:

List[str]

x_width, y_width, z_width

Dimensions of the discretized field in the x, y, and z directions

Type:

double

width_xyz

Vectorized representation of the x, y, and z widths of the discretized field

Type:

double[:]

nx, ny, nz

Number of voxels in each of the x, y, and z directions

Type:

long

dx, dy, dz

Width of each voxel in the x, y, and z directions

Type:

double

dxyz

Vectorized representation of the voxel widths in the x, y, and z directions

Type:

double[:]

n_bins

Number of voxels in the field

Type:

long

vol_bin

Volume of each voxel; equal to dx*dy*dz

Type:

double

bin_index

For each of the nx*ny*nz voxels in the field, stores indices identifying eight vertices at the corners of the voxel

Type:

long[:, ::1]

nbr_indices_with_trial

Super-indices of eight neighboring voxels containing a bead’s position in space (dim1) from the polymer’s current configuration (dim1=0) and proposed configuration (dim1=1); TEMP

Type:

long[:, ::1]

nbr_inds

Super-indices of eight neighboring voxels containing a bead’s position in space; TEMP

Type:

long[:]

index_xyz_with_trial

Vector containing a bead’s bin index in the x, y, and z directions (dim1) for the polymer’s current configuration (dim0=0) and trial configuration (dim0=1); TEMP

Type:

long[:, ::1]

index_xyz

Vector containing a bead’s bin index in the x, y, z directions; TEMP

Type:

long[:]

wt_vec_with_trial

Bead’s weight linearly interpolated between the eight nearest voxels surrounding it (dim1) for a polymer’s current configuration (dim0=0) and trial configuration (dim0=1); TEMP

Type:

double [:]

wt_vec

Vector containing a bead’s weight linearly interpolated between the eight nearest voxels surrounding it; TEMP

Type:

double[:]

xyz_with_trial

Position of the bead in the x, y, z directions (dim1), shifted during linear interpolation of voxel weights for the polymer’s current configuration (dim0=0) and trial configuration (dim0=1); TEMP

Type:

double[:, ::1]

xyz

Position of the bead in the x, y, z directions, shifted during linear interpolation of voxel weights; TEMP

Type:

double[:]

weight_xyz_with_trial

Vector of a bead’s linearly interpolated weights in a voxel in the x, y, z directions as determined for the voxel surrounding the bead with the lowest x, y, z indices (dim1) for the polymer’s current configuration (dim0=0) and tiral configuration (dim0=1); TEMP

Type:

double[:, ::1]

weight_xyz

Vector of a bead’s linearly interpolated weights in a voxel in the x, y, z directions as determined for the voxel surrounding the bead with the lowest x, y, z indices; TEMP

Type:

double[:]

num_binders

Number of reader proteins bound to the polymer in the simulation

Type:

long

doubly_bound, doubly_bound_trial

Vectors indicating whether or not a bead is doubly bound by each tracked reader protein in the current polymer configuration (doubly_bond) and trial configuration (doubly_bound_trial); TEMP

Type:

long[:]

confine_type

Name of the confining boundary around the polymer; if the polymer is unconfined, confine_type is a blank string

Type:

str

confine_length

Length scale of the confining boundary

Type:

double

density, density_trial

Current (density) and proposed change (density_trial) in density of beads (column 0) and each reader protein (columns 1…) in each voxel of the discretized space; voxels are sorted by super-index and arranged down the rows the the density arrays

Type:

double[:, ::1]

access_vols

Mapping of voxel super-index (keys) to volume of the voxel inside the confining boundary (values).

Type:

Dict[long, double]

chi

Negative local Flory-Huggins parameter dictating non-specific bead interaction, in units of (kT / bead vol. nondim. by bin vol.)

Type:

double

dict_

Dictionary of key attributes defining the field and their values

Type:

dict

binder_dict

Dictionary representation of each reader protein and their properties

Type:

dict

half_width_xyz

Half the width of the full discretized space in the x, y, and z directions; equivalent to np.array([self.x_width/2, self.y_width/2, self.z_width/2])

Type:

double[:]

half_step_xyz

Half the length of the voxel in single x, y, and z directions; equivalent to np.array([self.dx/2, self.dy/2, self.dz/2])

Type:

double[:]

n_xyz_m1

One less than the number of voxels that fit in the x, y, and z directions of the field.

Type:

long[:]

inds_xyz_to_super

Array containing the super-index of each voxel (stored in the array values) for each combination of x-voxel position (dim0), y-voxel position (dim1) and z-voxel position (dim2).

Type:

long[:, :, ::1]

vf_limit

Volume fraction limit in a voxel.

Type:

float

fast_field

If this value is 1, a coarse-grained density field will be evaluated using pre-computed sub-bin weightings; otherwise, bin weightings will be interpolated from bead positions during each iteration

Type:

bint

n_points

Number of sub-bins to precompute in each dimension when the ``fast field’’ is active; should be an even number

Type:

long

compute_E()

Compute total field energy for the current polymer configuration.

Notes

Load the positions and states of the entire polymer. Then determine the number density of beads in each bin of the field. Use these densities and binder states on each bead to determine an overall field energy.

Parameters:

poly (poly.PolymerBase) – Polymer object undergoing transformation in field

Returns:

Total energy associated with the field

Return type:

double

from_file()

Recover field saved with .to_file().

Notes

The zeroths column is an index column, and the first column stores the data specifying the field.

Parameters:
  • path (str) – Path to the CSV file representing the field

  • polymers (List[PolymerBase]) – Polymers contained in the field

  • binders (pd.DataFrame) – Table describing reader proteins affecting the field

Returns:

Discrete density field represented by the CSV file

Return type:

UniformDensityField

get_accessible_volumes()

Numerically find accessible volume of voxels at confinement edge.

Notes

Right now, accessible volume calculations are only relevant to spherical confinements.

Parameters:
  • n_side (long) – When a voxel is determined to be partially contained by the confinement, we will numerically determine the accessible volume in that voxel. To do this, we will place a cubic grid of points within the voxel, and we will count the number of grid points within the confinement. The parameter n_side gives the number of grid points to include on each side of the cubic grid. For example, if n_side = 10, then we will place a 10x10x10 grid of points in every partially contained voxel to determine the accessible volume of that voxel.

  • assume_fully_accessible (bint) – Flag indicating whether to assume all voxels are fully accessible. Assume voxels are fully accessible if the voxel volumes are far less than the confinement volume. Value of 1 indicates that all voxels are assumed to be fully accessible, bypassing the calculation of accessible volume.

Returns:

access_vol – Mapping of bin super index to accessible volume inside the confining boundary

Return type:

dict

get_dict()

Dictionary representation of the UniformDensityField object.

Returns:

Dictionary of key attributes representing the field

Return type:

dict

init_field_energy_prefactors()

Initialize the field energy prefactor for each reader protein.

init_grid()

Initialize the discrete grid containing the field.

Notes

The field will be represented by a grid of rectangular prism (typically cubical) voxels, each of dimensions dx by dy by dz. To calculate interaction energies, density will be determined inside each voxel and a mean field approach will be applied.

nonspecific_interact_E()

Get nonspecific interaction energy for the full polymer.

Notes

This method assumes that all beads are of the same volume.

Parameters:

poly (poly.PolymerBase) – Polymer affected by the current MC move

Returns:

Absolute energy associated with the nonspecific interactions between segments of the polymer chain

Return type:

double

to_file()

Save Field description + Polymer/ReaderProtein names to CSV.

Parameters:

path (str) – File path at which to save the CSV file representing the field

Returns:

Returns None if path is not None; otherwise returns the CSV representing the field as a string

Return type:

None or str

update_all_densities()

Update the density of the field for a single polymer.

Notes

Updates the voxel densities stored in the field object. See notes for self.get_change_in_density() for details on implementation.

Parameters:
  • poly (poly.PolymerBase) – Polymer for which densities are calculated

  • inds (long[:] by reference) – Array of bead indices for the entire polymer (to avoid re-computing this)

  • n_inds (long) – Number of beads in the polymer

update_all_densities_for_all_polymers()

Update the density map for every polymer in the field.

Notes

Updates the voxel densities stored in the field object. See notes for self.get_change_in_density() for details on implementation.

Requires that binders are listed in the same order on each polymer, as listed in self.binders.

chromo.fields.assign_beads_to_bins()

Create a mapping from bin indices to list of associated beads.

Parameters:
  • r_poly (double[:, ::1]) – Array of bead positions

  • n_inds (long) – Number of beads - must be equivalent in length to r_poly

  • nx (long) – Number of bins in the x direction

  • ny (long) – Number of bins in the y direction

  • nz (long) – Number of bins in the z direction

  • x_width (double) – Widths of space in the x direction

  • y_width (double) – Widths of space in the y direction

  • z_width (double) – Widths of space in the z direction

Returns:

Mapping of bin indices to associated beads, where keys indicate the bin indices and values are lists of beads contained in the bin

Return type:

dict

chromo.fields.get_blocks()

Coarse grain individual beads into blocks of linearly adjacent beads.

Notes

Used to improve computational tractibility of contact map generation.

Parameters:
  • num_beads (long) – Number of beads in full-resolution polymer

  • block_size (long) – Number of beads to place into a single block in course-grained model

Returns:

Mapping of each bead ID to its block.

Return type:

dict[long, long]

chromo.fields.get_neighboring_bins()

Generate map of bin indices to all immediately neighboring bin indices.

Parameters:
  • nx (long) – Number of bins in the x direction

  • ny (long) – Number of bins in the y direction

  • nz (long) – Number of bins in the z direction

Returns:

Dictionary where keys indicate bin indices and values provide a list of all immediately neighboring bin indices, including the key.

Return type:

Dict[long, np.ndarray(ndims=1, dtype=long)]

chromo.fields.get_neighbors_at_ind()

Identify the bin indices immediately neighboring a specific bin index.

Parameters:
  • nx (long) – Number of bins in the x direction

  • ny (long) – Number of bins in the y direction

  • nz (long) – Number of bins in the z direction

  • ind (long) – Super-index of bin for which neighbors are desired

  • num_bins (long) – Number of bins contained in the field

Returns:

Array of nine bin super-indices immediately neighboring and containing the bin specified by ind

Return type:

np.ndarray(ndims=1, dtype=long)

chromo.fields.super_ind_to_inds()

Calculate the three-dimensional indices from a super-index.

Notes

Given the super-index of a voxel and the number of voxels in the x and y directions, calculate the three-dimensional indices of the voxel.

Parameters:
  • super_ind (long) – Super-index of a voxel

  • nx (long) – Number of bins in the x direction

  • ny (long) – Number of bins in the y direction

Returns:

long[ – Indices of bin in x, y, and z directions

Return type:

]

chromo.mixtures

Mixtures represent collections of interacting polymers.

class chromo.mixtures.Mixture(polymers: List[PolymerBase])[source]

Bases: object

Class representation of a mixture of multiple polymers.

count_beads() int[source]

Calculate the total number of nucleosomes in all polymers.

Returns:

count – Number of nucleosomes in all polymers

Return type:

int

get_neighbors(radius: float) List[Tuple[Bead, Bead]][source]

Get neighboring beads in polymer mixture.

NOTE: Determination of pairwise neighbors requires pairwise calculation of distances between all beads on all polymers. As such, this method is computationally intensive and recommended for use only on polymers with low numbers of total beads.

Parameters:

radius (float) – Cut-off distance used to specify a neighboring bead pair

Returns:

List of neighboring bead pairs falling in specified distance

Return type:

List[Tuple[Bead, Bead]]

chromo.polymers

Polymers that will make up our simulation.

class chromo.polymers.Chromatin

Bases: SSWLC

Stretchable, sharable WLC model of chromatin.

Notes

The persistence length of DNA is 53 nm. Each base pair of DNA is 0.34 nm along the double helix.

Please see documentation for PolymerBase and SSWLC for parameter descriptions.

TODO: This should be instantiated directly from SSWLC. This does not have the same level of abstraction as SSTWLC, which is also a subclass of SSWLC. Subclasses at the same level should have the same level of abstraction. To implement chromatin as an instance of the SSWLC, we need to adjust the construct_beads method to accept an arbitrary bead type.

class chromo.polymers.DetailedChromatin

Bases: SSTWLC

Class representation of a chromatin fiber with detailed nucleosomes.

class chromo.polymers.DetailedChromatin2

Bases: DetailedChromatin

Chromatin fiber with detailed nucleosomes without nucleosome diameter.

Notes

This class is identical to DetailedChromatin except that the nucleosome diameter is not included in the energy calculations. This is useful for comparison with theoretical end-to-end distances of a kinked wormlike chain.

Parameters:
  • parameter (See documentation for DetailedChromatin class for details and) –

  • descriptions.

class chromo.polymers.DetailedChromatinWithSterics

Bases: DetailedChromatin

Chromatin fiber with detailed nucleosomes and steric interactions.

Notes

Because we are computing pairwise distances explicitly, we can evaluate binder protein interactions specifically. For this reason, we do not need to use a field. The field is never active. We can specify a null field as a place-holder. Beware that the field will not contribute to the energy associated with an instance of this class.

check_steric_clashes()

Check for steric clashes between nucleosomes.

Returns:

Number of Steric clashes.

Return type:

long

compute_E()

Compute the overall polymer energy at the current configuration.

Notes

This method loops over each bond in the polymer and calculates the energy change of that bond.

Returns:

Configurational energy of the polymer.

Return type:

double

compute_E_detailed()

Compute the overall polymer energy at the current configuration.

Notes

This method loops over each bond in the polymer and calculates the energy change of that bond. The method returns the energy as a dictionary with the elastic, steric, and interaction components of the total energy.

Returns:

Dictionary of the elastic, steric, interaction, and total energy of a configuration.

Return type:

Dict[str, float]

compute_E_detailed_reverse()

Compute the overall polymer energy at the current configuration.

Notes

This method computes the total energy of a configuration in the opposite direction of compute_E_detailed. It is intended just to show that the CHANGE in energy is dependent on the way that energy is calculated. We need to calculate total energy with the same indexing scheme that we use to calculate the change in energy with each MC step.

Returns:

Dictionary of the elastic, steric, interaction, and total energy of a configuration.

Return type:

Dict[str, float]

compute_dE()

Compute change in energy for proposed transformation.

Notes

This method accounts for the elastic energy of the polymer, the steric interactions between nucleosomes, and the interactions between binders.

Parameters:
  • move_name (str) – Name of the MC move for which the energy change is being calculated

  • inds (array_like (N,)) – Indices of N beads affected by the MC move

  • n_inds (long) – Number of beads affected by the MC move

Returns:

delta_energy_poly – Change in polymer elastic energy associated with the trial move

Return type:

double

eval_E_steric_clashes()

Compute the Energy associated with steric clashes.

Returns:

Energy associated with steric clashes in the current configuration.

Return type:

double

eval_delta_steric_clashes()

Get the change in steric clashes associated with a move.

Notes

This method computes the steric clashes, but does so only by checking the beads that were affected by an MC move. There is not need to re-check for steric clashes between unmoved beads.

Returns:

Change in the number of steric clashes associated with the move.

Return type:

double

evaluate_binder_interactions()

Evaluate change in energy associated with binding states.

Notes

Pairwise distances should be updated before running this method. To avoid double counting cross-interactions, the cross-interaction energy associated with two binders should only be listed in the properties of one of the binders. The cross-interaction energy should be listed as 0 for the pair in the properties of the other binder.

Returns:

Change in energy associated with binding states and reader protein interactions.

Return type:

double

get_E_bind()

Evaluate energy associated with current binding states.

Notes

Pairwise distances should be updated before running this method. To avoid double counting cross-interactions, the cross-interaction energy associated with two binders should only be listed in the properties of one of the binders. The cross-interaction energy should be listed as 0 for the pair in the properties of the other binder.

Returns:

Change in energy associated with binding states and reader protein interactions.

Return type:

double

get_delta_distances()

Get changes in distances, rather than every distance

Notes

This function operates a lot like get_distances, but only computes the pairwise distances only between beads affected by a move and all other beads in the system. This is useful for computing the change in energy associated with a proposed move in a more efficient manner.

This method only updates the distances_trial attribute of the class.

Parameters:
  • ind0 (long) – Indices of the first and one-past-the-last beads affected by the move

  • indf (long) – Indices of the first and one-past-the-last beads affected by the move

get_distances()

Get the distances between all pairs of nucleosomes.

is_field_active()

The field is never active for this class.

Notes

All interactions are handled using explicit pairwise interactions.

class chromo.polymers.LoopedSSTWLC

Bases: SSTWLC

Class representation of a looped SSTWLC.

looped_confined_gaussian_walk()

Initialize a polymer to a looped, confined Gaussian random walk.

Parameters:
  • name (str) – Name of the polymer

  • num_beads (int) – Number of monomer units on the polymer

  • bead_length (float) – The amount of polymer path length between this bead and the next bead. For now, a constant value is assumed (the first value if an array is passed).

  • confine_type (str) – Name of the confining boundary. To indicate model w/o confinement, enter a blank string for this argument

  • confine_length (double) – The lengthscale associated with the confining boundary. Length representation specified in function associated w/ confine_type

Returns:

Object representing a SSTWLC initialized as a confined, looped Gaussian random walk

Return type:

SSTWLC

class chromo.polymers.PolymerBase

Bases: TransformedObject

Base class representation of an arbitrary polymer.

Notes

See TransformedObject documentation for additional notes and attributes.

name

A name of the polymer for convenient representation; should be a valid filename

Type:

str

log_path

Path to configuration tracker log (if empty, flags for no configuration log to be generated)

Type:

str

beads

Monomeric unit forming the polymer

Type:

beads.Bead

num_beads

Number of monomeric units forming the polymer

Type:

long

configuration_tracker

Object tracking configuration of the polymer over the course of the MC simulation, used for purposes of tracking simulation convergence

Type:

mc_stat.ConfigurationTracker

lp

Persistence length of the polymer

Type:

double

num_binders

Number of reader protein types bound to the polymer which are tracked by the simulator

Type:

long

n_binders_p1

One more than num_binders to avoid repeated calculation

Type:

long

binder_names

Name of M bound chemical states tracked in states; length of binder_names must match number of columns in states

Type:

array_like (M,) of str

all_inds

Indices corresponding to each bead in the polymer

Type:

array_like (N,) of long

r, t3, t2

Current position (r), tangent vector (t3) and material normal (t2) in global coordinates (x, y, z) for each of N beads in the polymer

Type:

array_like (N, 3) of double

r_trial, t3_trial, t2_trial

Proposed position (r), tangent vector (t3) and material normal (t2) in global coordinates (x, y, z) for each of N beads in the polymer; evaluated for acceptance using energy change determined by compute_dE

Type:

array_like (N, 3) of double

states

Current state of each of the M reader proteins being tracked for each of the N beads in the polymer

Type:

array_like (N, M) of long

states_trial

Trial state of each of the M reader proteins being tracked for each of the N beads in the polymer

Type:

array_like (N, M) of long

chemical_mods

States of M chemical modifications (e.g., counts of H3K9me3) on each of N beads in the polymer

Type:

array_like (N, M) of long

chemical_mod_names

Name of M chemical modifications made to the polymer tracked for each bead, corresponding to columns in chemical_mods

Type:

array_like (M,) of str

direction

Vector describing the direction of a transformation during a move in the MC simulation; stored as attribute to avoid costly array generation during inner loop of simulation

Type:

array_like (3,) of double

point

Used to describe a reference point (typically a fulcrum) for a move in the MC simulation; stored as attribute to avoid costly array generation during inner loop of simulation

Type:

array_like (3,) of double

last_amp_move

Most recent move amplitude during simulation step

Type:

double

last_amp_bead

Most recent bead amplitude during simulation step

Type:

long

required_attrs, _arrays, _3d_arrays

Minimal required attributes (required_attrs), arrays tracked (_arrays), and multi-index arrays tracked (_3d_arrays) by the polymer

Type:

array_like (L,) of str

dr, dr_perp, bend

Change in position (dr), perpendicular component of the change in position (dr_perp), and bend vector (bend) used to compute the elastic energy of a bond in a polymer; stored as attribute to avoid costly array generation during inner loop of simulation

Type:

array_like (3,) of double

dr_test, dr_perp_test, bend_test

Change in position (dr_test), perpendicular component of the change in position (dr_perp_test), and bend vector (bend_test) of proposed configuration used to compute the elastic energy of a bond in a polymer; stored as attribute to avoid costly array generation during inner loop of simulation

Type:

array_like (3,) of double

densities_temp

Densities for current (dim0 = 0) and trial (dim0 = 1) bins containing bead during MC move; the densities are tracked for the bead itself and each bound reader proteins (stored in dim1); the densities are tracked for 8 bins containing the bead before and after the move (stored in dim2); stored as attribute to avoid costly array generation during inner loop of simulation

Type:

array_like (2, n_binders_p1, 8) of double

max_binders

Indicates the maximum number of total binders of any type that can bind a single bead. This attribute accounts for steric limitations to the total number of binders that can attach to any give bead in the polymer. A value of -1 is reserved for no limit to the total number of binders bound to a bead. Any move which proposes more than the maximum number of binders is assigned an energy cost of 1E99 kT per binder beyond the limit.

Type:

long

Notes

Do not instantiate instances of PolymerBase – this class functions as a cython equivalent of an abstract base class. PolymerBase specifies the attributes and methods required to define a new polymer class. The class also implements utility methods common to all polymer subclasses.

check_attrs()

Check that the required attributes are specified in the class.

Notes

The required attributes of any polymer are stored in required_attrs. All required attributes must be initialized for any polymer to be instantiated. This method is run during the initialization of the polymer to verify that all required attributes are specified.

fill_missing_arguments()

Fill empty arguments in __init__ call.

Fills t3, t2, states, binder_names, chemical_mods, chemical_mod_names if not defined during polymer initialization.

from_csv()

Construct Polymer from CSV file.

Parameters:

csv_file (str) – Path to CSV file from which to construct polymer

Returns:

Object representation of a polymer

Return type:

PolymerBase

from_dataframe()

Construct Polymer object from DataFrame; inverts .to_dataframe.

Notes

TODO: For the DetailedChromatinWithSterics class, this method does not TODO: load binders from file. Binders must be specified separately as TODO: kwargs. Perhaps a binders file can be handled in the future.

Parameters:
  • df (pd.DataFrame) – Data frame representation of the polymer

  • name (Optional str) – Name of the polymer to be formed (default is None)

Returns:

Polymer object reflecting the dataframe

Return type:

PolymerBase

from_file()

Construct Polymer object from string representation.

Notes

TODO: For the DetailedChromatinWithSterics class, this method does not TODO: load binders from file. Binders must be specified separately as TODO: kwargs. Perhaps a binders file can be handled in the future.

Parameters:
  • path (str) – Path to CSV file representing the polymer

  • name (Optional str) – Name of the polymer to be formed (default is None)

Returns:

Polymer object reflecting the dataframe

Return type:

PolymerBase

get_all()

Get some bead property value from all beads.

Notes

Generates an array from a specified attribute separately stored in each bead object representing the polymer. This utility function is useful for accessing attributes which are separately stored in bead objects and not aggregated in the polymer object.

Parameters:

prop (str) – Name of the property to obtain from all beads

Returns:

Data frame of property values from all beads (default = None)

Return type:

np.ndarray

get_num_beads()

Return number of beads in the polymer.

Returns:

Number of beads forming the polymer

Return type:

long

get_num_binders()

Return number of states tracked per bead.

Returns:

Number of binder types bound to the polymer

Return type:

long

get_prop()

Get specified property of beads at listed indices.

Parameters:
  • inds (array_like (L,) of long) – Bead indices at which to isolate multi-indexed attributes (e.g., position, orientation) of the polymer

  • prop (str) – Property of the bead to return at specified indices

Returns:

Specified property of beads at specified indices

Return type:

np.ndarray (M, 3)

is_field_active()

Evaluate if the polymer is affected by a field.

Notes

In the future, you can make the code more efficient by activating the field only when necessary. The field is necessary only when the polymer is bound by reader proteins OR is in a confinement OR has a volume fraction limit OR has a non-specific chi interaction. For now, to be safe, we will assume that the field always needs to be assessed.

Returns:

Flag indicating whether the polymer is affected by the field (True) or is agnostic to the field (False)

Return type:

bint

static load_seqs()

Load sequence of all chemical modifications on beads of polymer.

Parameters:

paths_to_seqs (array_like (M,) of str) – Paths to the sequence of chemical modifications or binders on the beads. A single path should be provided per epigenetic mark.

Returns:

States of M chemical modifications (e.g., counts of H3K9me3) on each of N beads in the polymer

Return type:

array_like (N, M) of long

to_csv()

Save Polymer object to CSV file as DataFrame.

Parameters:

path (str) – Path to file at which to save CSV representation of the polymer

Returns:

Data frame representation of the polymer, as recorded in the CSV file that was generated

Return type:

pd.DataFrame

to_dataframe()

Write canonical CSV representation of the Polymer to file.

Notes

The top-level multi-index values for the columns should correspond to the kwargs, to simplify unpacking the structure (and to make for easy accessing of the dataframe).

First get a listing of all the parts of the polymer that have to be saved. These are defined in the class’s _arrays attribute.

Next, separate out the arrays into two types: “vector” arrays are composed of multiple columns (like r) and so we need to build a multi- index for putting them into the data frame correctly.

All other arrays can be added as a single column to the data frame.

ReaderProtein names is a special case because it does not fit with dimensions of the other properties, which are saved per bead.

Construct the parts of the DataFrame that need a multi-index.

TODO: remove “list” after numpy fixes issue 17305: https://github.com/numpy/numpy/issues/17305

Replace:

vector_arr = np.concatenate(list(vector_arrs.values()), axis=1)

with:

vector_arr = np.concatenate(vector_arrs.values(), axis=1)

After adding multi-index properties, add remaining arrays one-by-one.

Returns:

Data frame representation of the polymer, as recorded in the CSV file that was generated

Return type:

pd.DataFrame

to_file()

Synonym for to_csv to conform to make_reproducible spec.

Notes

See documentation for to_csv for details.

update_log_path()

Update the path to the configuration tracker log.

Notes

The configuration tracker is used during the simulation to evaluate structural convergence of the polymer.

Parameters:

log_path (str) – Path to the configuration tracker log. If None, no configuration tracker will be initialized. (default = None)

update_prop()

Update bead property for each bead in polymer.

Notes

This utility function copies an attribute of the polymer to each bead forming the polymer.

Parameters:

prop (str) – Name of the bead property being updated

class chromo.polymers.Rouse

Bases: PolymerBase

Class representation of Rouse polymer.

Notes

The Rouse model represents flexible chains of beads connected by harmonic springs. The model inherits from the PolymerBase class, which provides the basic attributes and functionality shared by all polymers. Please see PolymerBase documentation for descriptions of attributes.

class chromo.polymers.SSTWLC

Bases: SSWLC

Class representation of stretchable, shearable wormlike chain w/ twist.

Notes

Please see documentation for PolymerBase and SSWLC for shared parameter descriptions.

TODO: It does not make sense for Chromatin to be on the same level of abstraction as a SSTWLC.

Parameters:
  • lt (double) – Twist persistence length of the polymer (in nm)

  • eps_twist (double) – Twist modulus of the polymer, divided by dimensionless delta, as appears in the equation for SSTWLC elastic energy (units of kbT)

compute_E()

Compute the overall polymer energy at the current configuration.

Notes

This method loops over each bond in the polymer and calculates the energy change of that bond.

Returns:

Configurational energy of the polymer.

Return type:

double

compute_E_no_twist()

Compute the overall polymer energy at the current configuration.

Notes

This method loops over each bond in the polymer and calculates the energy change of that bond.

Returns:

Configurational energy of the polymer.

Return type:

double

class chromo.polymers.SSWLC

Bases: PolymerBase

Class representation of a stretchable, shearable wormlike chain.

Notes

The SSWLC class describes a discrete, semiflexible polymer with attributes for position, orientation, chemical modification, and reader protein binding state.

The polymer carries around a set of coordinates r of shape (num_beads, 3), sets of material normals t3 and t2, some number of chemical modifications, and some number of chemical binding states per bead. A third set of material normals t1 can be derived from the cross-product of the t2 and t3.

Polymer also carries around a dictionary of beads of length num_beads. Each bead has a set of coordiantes r of length (3,). The beads also have material normals t3 and t2. The beads optionally carry around some number of chemical modifications and binding states.

If this codebase is used to simulate DNA, information about the chemical properties of each reader protein are stored in ReaderProtein objects. To model arbitrary SSWLC’s, the different types of Binder objects may be specified to characterize bound components.

See documentation for PolymerBase class for additional paramete details.

TODO: allow differential discretization, and decay to constant discretization naturally.

Parameters:
  • delta (double) – Spacing between beads of the polymer, non-dimensionalized by persistence length

  • eps_bend (double) – Bending modulus of polymer (obtained from _find_parameters)

  • eps_par (double) – Stretch modulus of polymer (obtained from _find_parameters)

  • eps_perp (double) – Shear modulus of polymer (obtained from _find_parameters)

  • gamma (double) – Ground-state segment compression (obtained from _find_parameters)

  • eta (double) – Bend-shear coupling (obtained from _find_parameters)

  • bead_length (np.ndarray (N-1,) of float) – Dimensional spacing between beads of the discrete polymer (in nm)

  • bead_rad (double) – Dimensional radius of each bead in the polymer (in nm)

arbitrary_path_in_x_y()

Construct a polymer initialized as y = f(x) from x = 0.

Notes

As is, this code does not support variable linker lengths.

Parameters:
  • name (str) – Name of the polymer

  • num_beads (int) – Number of monomer units on the polymer

  • bead_length (float) – Dimensional distance between subsequent beads of the polymer (in nm)

  • shape_func (Callable[[float], float]) – Shape of the polymer where z = 0 and y = f(x)

  • step_size (float) – Step size for numerical evaluation of contour length when building the polymer path

Returns:

Object representing a polymer following path y = f(x)

Return type:

SSWLC

arbitrary_path_in_x_y_z()

Construct a polymer initialized as y = f(x) from x = 0.

Notes

As is, this code does not support variable linker lengths.

Parameters:
  • name (str) – Name of the polymer

  • num_beads (int) – Number of monomer units on the polymer

  • bead_length (float) – Dimensional distance between subsequent beads of the polymer (in nm)

  • shape_func_x (Callable[[float], float]) – Parametric functions to obtain the x coordinates of the path

  • shape_func_y (Callable[[float], float]) – Parametric functions to obtain the y coordinates of the path

  • shape_func_z (Callable[[float], float]) – Parametric functions to obtain the z coordinates of the path

  • step_size (float) – Step size for numerical evaluation of contour length when building the polymer path

Returns:

Object representing a polymer following path x = X(t), y = Y(t), z = Z(t)

Return type:

SSWLC

compute_E()

Compute the overall polymer energy at the current configuration.

Notes

This method loops over each bond in the polymer and calculates the energy change of that bond.

Returns:

Configurational energy of the polymer.

Return type:

double

confined_gaussian_walk()

Initialize a polymer to a confined Gaussian random walk.

Parameters:
  • name (str) – Name of the polymer

  • num_beads (int) – Number of monomer units on the polymer

  • step_lengths (np.ndarray (N-1,) float) – Dimensional distance between subsequent beads of the polymer (in nm)

  • confine_type (str) – Name of the confining boundary; to indicate model w/o confinement, enter a blank string for this argument

  • confine_length (double) – The lengthscale associated with the confining boundary. Length representation specified in function associated w/ confine_type

Returns:

Object representing a polymer initialized as a confined Gaussian random walk

Return type:

Polymer

confined_uniform_random_walk()

Initialize a polymer to a confined uniform random walk.

Parameters:
  • name (str) – Name of the polymer

  • num_beads (int) – Number of monomer units on the polymer

  • step_lengths (float) – Dimensional distance between subsequent beads of the polymer (in nm)

  • confine_type (str) – Name of the confining boundary; to indicate model w/o confinement, enter a blank string for this argument

  • confine_length (double) – The lengthscale associated with the confining boundary. Length representation specified in function associated w/ confine_type

Returns:

Object representing a polymer initialized as a confined uniform random walk

Return type:

Polymer

gaussian_walk_polymer()

Initialize a polymer to a Gaussian random walk.

Parameters:
  • name (str) – Name of the polymer

  • num_beads (int) – Number of monomer units on the polymer

  • step_lengths (np.ndarray (N-1,) float) – Dimensional distance between subsequent beads of the polymer (in nm)

Returns:

Object representing a polymer initialized as Gaussian random walk

Return type:

SSWLC

straight_line_in_x()

Construct polymer initialized uniformly along the positve x-axis.

Parameters:
  • name (str) – Name of polymer being constructed

  • step_sizes (np.ndarray (N-1,) of float) – Dimensional distance between subsequent beads of the polymer (in nm) represented as an (N-1,) array where N is the number of beads in the polymer.

Returns:

Object representation of a polymer currently configured as a straight line

Return type:

SSWLC

class chromo.polymers.TransformedObject

Bases: object

Represents any object undergoing physical transformations during MC sim.

Notes

This class provides the transformation matrix attribute. The transformation matrix is updated with each iteration of the MC simulation. It’s inclusion avoids costly matrix initialization with each iteration, drastically improving runtime.

transformation_mat

Homogeneous transformation matrix for an arbitrary transformation in the MC simulation; stored as attribute to avoid costly array generation during inner loop of simulation

Type:

array_like (4, 4) of double

chromo.polymers.compute_twist_angle_omega()

Compute twist angle between two sets of orientation vectors.

Parameters:
  • t2_0 (double[:] of shape (3,)) – T2 and T3 orientation vectors of the first bead

  • t3_0 (double[:] of shape (3,)) – T2 and T3 orientation vectors of the first bead

  • t2_1 (double[:] of shape (3,)) – T2 and T3 orientation vectors of the second bead

  • t3_1 (double[:] of shape (3,)) – T2 and T3 orientation vectors of the second bead

Returns:

Twist angle (in radians)

Return type:

double

chromo.polymers.helix_parametric_x()

Parametric equation for x-coordinates of a helix.

Parameters:

t (double) – Parameter input to the shape function

Returns:

Output to the shape function

Return type:

double

chromo.polymers.helix_parametric_y()

Parametric equation for y-coordinates of a helix.

Parameters:

t (double) – Parameter input to the shape function

Returns:

Output to the shape function

Return type:

double

chromo.polymers.helix_parametric_z()

Parametric equation for z-coordinates of a helix.

Parameters:

t (double) – Parameter input to the shape function

Returns:

Output to the shape function

Return type:

double

chromo.polymers.sin_func()

Sine function to which the polymer will be initialized.

Parameters:

x (double) – Input to the shape function

Returns:

Output to the shape function

Return type:

double

chromo.beads

Beads represent monomeric units forming the polymer.

class chromo.beads.Bead(id_: int, r: ndarray, t3: Optional[ndarray] = None, t2: Optional[ndarray] = None, states: Optional[ndarray] = None, binder_names: Optional[Sequence[str]] = None)[source]

Bases: ABC

Abstract class representation of a bead of a polymer.

id

Bead identifier; typically the beads index position in the chain

Type:

int

r

Position of the bead

Type:

array_like (3,) of double

t3

Tangential orientation vector defining the bead

Type:

array_like (3,) of double

t2

Tangential orientation vector defining the bead; orthogonal to t3

Type:

array_like (3,) of double

states

State of each bound protein on the bead

Type:

array_like (M,) of int

binder_names

Name of each bound protein on the bead

Type:

array_like (M,) of str

abstract print_properties()[source]

Print properties of the bead.

abstract test_collision(**kwargs)[source]

Test collisions with another bead or a confinement.

class chromo.beads.DetailedNucleosome(id_: int, r: ndarray, *, t3: ndarray, t2: ndarray, bp_wrap: int, states: Optional[ndarray] = None, binder_names: Optional[Sequence[str]] = None)[source]

Bases: Nucleosome

A nucleosome with fixed entry/exit positions and orientations.

Notes

The entry and exit positions and orientations are defined relative to the position and t3/t2 vectors of the nucleosome. We assume at this stage that the relative entry and exit positions and orientations are fixed for all nucleosomes.

The nucleosome is defined such that the orientation of the entering DNA is fixed by the t3 orientation. The orientation of the exiting DNA is dictated by the amount of DNA wrapping around the nucleosome. This class includes methods that return the positions and orientations for entering and exiting DNA strands based on the current position and orientation of the nucleosome.

get_entry_exit_orientations()[source]

Get entry and exit orientations of DNA in global reference frame.

get_entry_exit_positions()[source]

Get entry and exit positions of DNA in the global reference frame.

update_configuration(r, t3, t2)[source]

Update the position and orientations of the nucleosome.

Parameters:
  • r (np.ndarray (3,) of float) – Position of nucleosome in global reference frame

  • t3 (np.ndarray (3,) of float) – Orthogonal unit vectors defining the orientation of the nucleosome in the global reference frame.

  • t2 (np.ndarray (3,) of float) – Orthogonal unit vectors defining the orientation of the nucleosome in the global reference frame.

Returns:

  • r_enter (np.ndarray (3,) of float) – Entry position of DNA defined in global reference frame

  • r_exit (np.ndarray (3,) of float) – Exit position of DNA defined in global reference frame

  • t3 (np.ndarray (3,) of float) – t3 tangent of entering DNA in global reference frame

  • t3_exit (np.ndarray (3,) of float) – t3 tangent of exiting DNA in global reference frame

  • t2 (np.ndarray (3,) of float) – t2 tangent of entering DNA in global reference frame

  • t2_exit (np.ndarray (3,) of float) – t2 tangent of exiting DNA in global reference frame

class chromo.beads.GhostBead(id_: int, r: ndarray, *, t3: Optional[ndarray] = None, t2: Optional[ndarray] = None, states: Optional[ndarray] = None, binder_names: Optional[Sequence[str]] = None, rad: Optional[float] = 5, **kwargs)[source]

Bases: Bead

Class representation of a “ghost bead” for which collisions are ignored.

Notes

See documentation for Bead class for additional attribute definitions.

rad

Radius of the spherical ghost bead

Type:

double

vol

Volume of the spherical ghost bead

Type:

double

kwargs

Additional named properties of the ghost bead (for later use)

Type:

Dict

print_properties()[source]

Print properties of the ghost bead.

test_collision()[source]

Collisions are neglected for a ghost bead.

class chromo.beads.Nucleosome(id_: int, r: ndarray, *, t3: ndarray, t2: ndarray, rad: Optional[float] = 5, states: Optional[ndarray] = None, binder_names: Optional[Sequence[str]] = None)[source]

Bases: Bead

Class representation of a nucleosome bead.

Notes

See documentation for Bead class for additional attribute definitions.

rad

Radius of spherical excluded volume around nucleosome

Type:

double

vol

Volume of spherical excluded volume around nucleosome

Type:

double

print_properties()[source]

Print properties of the current nucleosome.

test_collision(point: ndarray) bool[source]

Test collision with the nucleosome bead.

Parameters:

point (array_like (3,) of double) – Point at which to test for collision with the nucleosomes

Returns:

Flag for collision with the nucleosome (True = collision, False = no collision)

Return type:

bool

class chromo.beads.Prism(id_: int, r: ndarray, *, t3: ndarray, t2: ndarray, vertices: ndarray, states: Optional[ndarray] = None, binder_names: Optional[Sequence[str]] = None)[source]

Bases: Bead

Class representation of prism-shaped monomer, enabling sterics.

Notes

The Prism objects allow for careful evaluation of collisions between nucleosome beads.

This class is not yet tested.

See documentation for Bead class for additional attribute definitions.

vertices

Vertices representing a mesh of the nucleosome bead. The verticies are defined around the origin, with orientations such that t3 coincides with the positive x axis and t2 coincides with the positive z axis.

Type:

array_like (M, 3)

classmethod construct_nucleosome(id_: int, r: ndarray, *, t3: ndarray, t2: ndarray, num_sides: int, width: float, height: float, states: Optional[ndarray] = None, binder_names: Optional[Sequence[str]] = None)[source]

Construct nucleosome as a prism w/ specified position & orientation.

Parameters:
  • id (int) – Identifier for the nucleosome

  • r (np.ndarray (3,)) – Coordinates of the nucleosome in form (x, y, z)

  • t3 (np.ndarray (3,)) – Tangent vector defining orientation of nucleosome

  • t2 (np.ndarray (3,)) – Tangent vector orthogonal to t3 defining orientation of nucleosome; a third orthogonal tangent vector can be obtained from the cross product of t2 and t3

  • num_sides (int) – Number of sides on the face of the prism used to represent the nucleosome’s geometry; this determines the locations of verticies of the Prism

  • width (float) – Determines the shape of the prism defining the location of the nucleosome’s verticies. The width gives the diameter of the circle circumscribing the base of the prism in the simulation’s distance units.

  • height (float) – Determines the shape of the prism defining the location of the nucleosome’s verticies. The height gives the height of the prism in the simulation’s distance units.

  • states (Optional[np.ndarray] (M,) of int) – State of each of the M epigenetic binders being tracked

  • binder_names (Optional[Sequence[str]] (M, )) – The name of each bound component tracked in states; binder names are how the properties of the binders are loaded, default = None

Returns:

Instance of a prism matching included specifications

Return type:

Prism

print_properties()[source]

Print properties of the current nucleosome.

test_collision(vertices: ndarray, max_iters: int) bool[source]

Test collision with the current nucleosome.

Parameters:
  • vertices (np.ndarray (M, 3)) – Vertices representing a mesh of the nucleosome bead

  • max_iters (int) – Maximum iterations of the GJK algorithm to evaluate when testing for collision

Returns:

Flag for collision with the nucleosome (True = collision, False = no collision)

Return type:

bool

transform_vertices() ndarray[source]

Transform the verticies of the nucleosome based on r, t2, & t3.

Notes

Begin by translating the position of the verticies to match the position of the nucleosome in space.

Then rotate the nucleosome so that the x-axis on which the verticies are defined aligns with the t3 tangent of the nucleosome.

Finally, rotate the nucleosome so that the z-axis on which the vertices are defined aligns with the t2 tangent of the nucleosome.

Returns:

Transformed vertices representing a mesh of the nucleosome bead positioned and oriented in space.

Return type:

np.ndarray (M, 3)

chromo.beads.get_verticies_rot_mat(current_vec, target_vec, fulcrum)[source]

Rotate verticies defined relative to current_vec.

Parameters:
  • current_vec (array_like (3,) of double) – Current vector relative to which verticies are defined

  • target_vec (array_like (3,) of double) – Target vector relative to which verticies should be defined

  • fulcrum (array_like (3,) of double) – Point about which rotation will take place

Returns:

Homogeneous rotation matrix with which to rotate verticies

Return type:

array_like (4, 4) of double

chromo.binders

Modifications to polymer’s reader protein binding state that affect energy.

Notes

Each simulation will typically involve a polymer bound by some Binder.

For common, physically characterized binding components, such as those bound to known epigenetic modifications or well-characterized modifications on other real polymers, the best known parameters for that binder should be documented here as instances of the appropriate Binder subclass.

class chromo.binders.Binder

Bases: object

Class representation of an arbitrary components binding to a polymer.

Notes

For this code to work, all binders need a string name.

name

Name of the binding component

Type:

str

sites_per_bead

Number of sites to which the binding component can bind on each monomeric unit of the polymer

Type:

int

class chromo.binders.ReaderProtein

Bases: Binder

Information about the chemical properties of a reader protein.

dict()

Represent a class instance as a dictionary (exclude derived values).

Returns:

Dictionary with key attributes representing the reader protein

Return type:

dict

chromo.binders.get_by_name()

Look up saved reader protein by name.

Parameters:

name (str) – Name of the pre-defined reader protein to retrieve

Returns:

Object representing the reader protein that was queried.

Return type:

Binder

chromo.binders.make_binder_collection()

Construct summary DataFrame from sequence of binders.

Parameters:

binders (Binder or str or Sequence[Binder] or Sequence[str]) – The binders to be summarized by the DataFrame.

Returns:

Columns are the properties of each Binders.

Return type:

pd.DataFrame

chromo.__init__