mira.rp.NITE_Model

class mira.rp.NITE_Model(*, expr_model, accessibility_model, genes, learning_rate=1, counts_layer=None, initialization_model=None, search_reps=1)

Container for multiple regulatory potential (RP) NITE models. NITE models learn the relationship between a gene’s expression, the accessibility of nearby cis-regulatory elements (CREs), and the cell-wide chromatin landscape.

How well local versus cell-wide chromatin predicts a gene’s expression state determines the gene’s NITE score, which indicates whether that gene is primarily regulated by local or nonlocal mechanisms.

Parameters

expr_model: mira.topics.ExpressionTopicModel: Trained MIRA expression topic model.
accessibility_model: mira.topics.AccessibilityTopicModel: Trained MIRA accessibility topic model.
genes: np.ndarray[str], list[str]: List of genes for which to learn RP models.
learning_rate: float>0: Learning rate for the L-BFGS optimizer.
counts_layer: str, default=None: Layer in AnnData that contains raw counts for modeling.
initialization_model: mira.rp.LITE_Model, mira.rp.NITE_Model, None: Initialize parameters of the RP model using the provided model before further optimization with L-BFGS. This is used when training the NITE model, which is initialized with the LITE model parameters learned for the same genes, then retrained to optimize the NITE model’s extra parameters. This procedure speeds up training.

Examples

Setup requires RNA and ATAC AnnData objects with shared cell barcodes and trained topic models for both modes:

>>> rp_args = dict(expr_adata = rna_data, atac_adata = atac_data)

Instantiating a NITE model (local and cell-wide chromatin accessibility), initialized from a trained LITE model:

>>> nitemodel = mira.rp.NITE_Model(
...     expr_model = rna_model, 
...     accessibility_model = atac_model,
...     counts_layer = 'counts',
...     genes = litemodel.genes,
...     initialization_model = litemodel
... )
>>> nitemodel.fit(**rp_args)

Attributes

genes: np.ndarray[str]: Array of gene names for models
features: np.ndarray[str]: Array of gene names for models
models: list[mira.rp.GeneModel]: List of trained RP models
model_type: {"NITE", "LITE"}

Methods

`fit`([callback, n_workers, atac_topic_comps_key])	Optimize parameters of RP models to learn cis-regulatory relationships.
`get_model`(gene)	Gets model for gene
`join`(rp_model)	Merge RP models from two model containers.
`load`(prefix)	Load RP models saved with prefix.
`load_dir`([counts_layer])	Load directory of RP models.
`predict`(*, expr_adata, atac_adata[, ...])	Predicts the expression of genes given their cis-accessibility state.
`probabilistic_isd`([n_samples, checkpoint, ...])	For each gene, calculate association scores with each transcription factor.
`save`(prefix)	Save RP models.
`subset`(genes)	Return a subset container of RP models.

classmethod load_dir(counts_layer=None, *, expr_model, accessibility_model, prefix)

Load directory of RP models. Adds all available RP models into a container.

Parameters

expr_model: mira.topics.ExpressionTopicModel: Trained MIRA expression topic model.
accessibility_model: mira.topics.AccessibilityTopicModel: Trained MIRA accessibility topic model.
counts_layer: str, default=None: Layer in AnnData that contains raw counts for modeling.
prefix: str: Prefix under which RP models were saved.

Examples

>>> litemodel = mira.rp.LITE_Model.load_dir(
...     counts_layer = 'counts',
...     expr_model = rna_model, 
...     accessibility_model = atac_model,
...     prefix = 'path/to/rpmodels/'
... )

subset(genes)

Return a subset container of RP models.

Parameters

genes: np.ndarray[str], list[str]: List of genes to subset from the RP model container.

Examples

>>> less_models = litemodel.subset(['LEF1','WNT3'])

join(rp_model)

Merge RP models from two model containers.

Parameters

rp_model: mira.rp.LITE_Model, mira.rp.NITE_Model: RP model container from which to append new RP models.

Examples

>>> model1.genes
['LEF1','WNT3']
>>> model2.genes
['CTSC','EDAR']
>>> merged_model = model1.join(model2)
>>> merged_model.genes
['LEF1','WNT3','CTSC','EDAR']

__getitem__(gene)

Alias for get_model(gene).

Examples

>>> rp_model["LEF1"]
<mira.rp_model.rp_model.GeneModel at 0x7fa07af1cf10>

save(prefix)

Save RP models.

Parameters

prefix: str: Prefix under which to save RP models. May be a filename prefix or a directory. RP models are saved with the format: {prefix}_{LITE/NITE}_{gene}.pth
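The naming pattern above can be illustrated with a small helper. This is only a sketch: `rp_model_path` is a hypothetical function, not part of MIRA, and it assumes the pieces are joined by plain string formatting exactly as the documented pattern reads.

```python
def rp_model_path(prefix: str, model_type: str, gene: str) -> str:
    """Compose a file name following the documented pattern
    {prefix}_{LITE/NITE}_{gene}.pth.

    Hypothetical helper for illustration; not a MIRA function.
    """
    if model_type not in ("LITE", "NITE"):
        raise ValueError("model_type must be 'LITE' or 'NITE'")
    return f"{prefix}_{model_type}_{gene}.pth"


# Example with a made-up prefix and gene name.
example_path = rp_model_path("rpmodels/run1", "NITE", "LEF1")
print(example_path)
```

Under this assumption, a directory-style prefix that ends in a slash would yield file names that begin with an underscore inside that directory.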

get_model(gene)

Gets the model for a gene.

Parameters

gene: str: Fetch RP model for this gene.

load(prefix)

Load RP models saved with prefix.

Parameters

prefix: str: Prefix under which RP models were saved.

fit(callback=None, *, expr_adata, atac_adata, n_workers=1, atac_topic_comps_key='X_topic_compositions')

Optimize parameters of RP models to learn cis-regulatory relationships.

Parameters

expr_adata: anndata.AnnData: AnnData of expression features.
atac_adata: anndata.AnnData: AnnData of accessibility features. Must be annotated with mira.tl.get_distance_to_TSS.

Returns

rp_model: mira.rp.LITE_Model, mira.rp.NITE_Model: RP model with optimized parameters.

predict(*, expr_adata, atac_adata, n_workers=1, atac_topic_comps_key='X_topic_compositions')

Predicts the expression of genes given their cis-accessibility state. Also evaluates the probability of that prediction for LITE/NITE evaluation.

Parameters

expr_adata: anndata.AnnData: AnnData of expression features.
atac_adata: anndata.AnnData: AnnData of accessibility features. Must be annotated with mira.tl.get_distance_to_TSS.

Returns

anndata.AnnData

.layers['LITE_prediction'] or .layers['NITE_prediction']: np.ndarray[float] of shape (n_cells, n_features): Predicted relative frequencies of features using the LITE or NITE model, respectively.
.layers['LITE_logp'] or .layers['NITE_logp']: np.ndarray[float] of shape (n_cells, n_features): Probability of observed expression given the posterior predictive estimate of the LITE or NITE model, respectively.
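As a rough illustration of how the two probability layers might be compared for a single gene, the sketch below sums per-cell values under each model and takes the difference. It assumes the layers hold log-probabilities, as their names suggest; the numbers are made up, `logp_gain` is a hypothetical name, and this is not MIRA's NITE score calculation.

```python
def logp_gain(lite_logp, nite_logp):
    """Difference in summed per-cell log-probability (NITE minus LITE) for one gene.

    Hypothetical helper; a positive value means the NITE-style values
    explain the observed expression better in this toy comparison.
    """
    if len(lite_logp) != len(nite_logp):
        raise ValueError("both inputs must cover the same cells")
    return sum(nite_logp) - sum(lite_logp)


# Made-up per-cell log-probabilities for one gene across three cells.
lite = [-2.0, -1.5, -3.0]
nite = [-1.0, -1.5, -2.0]
gain = logp_gain(lite, nite)
print(gain)
```

With real output, the per-gene columns of the two layers would take the place of the toy lists.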

probabilistic_isd(n_samples=1500, *, checkpoint=None, hits_matrix, metadata, expr_adata, atac_adata, n_workers=1, atac_topic_comps_key='X_topic_compositions', factor_type='motifs')

For each gene, calculate association scores with each transcription factor (TF). Association scores detect when a TF binds within cis-regulatory elements (CREs) that are influential to expression predictions for that gene. CREs that influence the RP model’s expression prediction are near a gene’s TSS and have accessibility that correlates with expression. This model assumes these attributes indicate that a factor is more likely to regulate a gene.
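The idea described above can be sketched with a toy model. This is not MIRA's pISD implementation: the exponential distance weighting, the decay constant, and both function names are assumptions chosen for illustration. The sketch masks the CREs bound by a factor and measures how much a simple distance-weighted prediction drops.

```python
import math


def rp_prediction(accessibility, distances, decay=10_000.0):
    """Toy regulatory potential: accessibility weighted by an exponential
    decay with distance to the TSS (assumed form, for illustration only)."""
    return sum(a * math.exp(-abs(d) / decay) for a, d in zip(accessibility, distances))


def deletion_effect(accessibility, distances, bound):
    """Drop in the toy prediction when CREs bound by a factor are masked."""
    full = rp_prediction(accessibility, distances)
    masked = [0.0 if is_bound else a for a, is_bound in zip(accessibility, bound)]
    return full - rp_prediction(masked, distances)


# Two equally accessible CREs: one at the TSS, one 100 kb away.
accessibility = [1.0, 1.0]
distances = [0, 100_000]

# Hypothetical factor A binds the nearby CRE, factor B binds the distal one.
effect_near = deletion_effect(accessibility, distances, bound=[True, False])
effect_far = deletion_effect(accessibility, distances, bound=[False, True])
print(effect_near, effect_far)
```

In this toy setting, masking the CRE at the TSS changes the prediction far more than masking the distal one, which mirrors the assumption that factors binding influential CREs are more likely regulators.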

Parameters

expr_adata: anndata.AnnData: AnnData of expression features.
atac_adata: anndata.AnnData: AnnData of accessibility features. Must be annotated with TSS and factor binding data using mira.tl.get_distance_to_TSS and mira.tl.get_motif_hits_in_peaks/mira.tl.get_CHIP_hits_in_peaks.
n_samples: int>0, default=1500: Downsample cells to this number for calculations. Speeds up computation. Cells are sampled by stratifying over expression levels.
checkpoint: str, default=None: Path to checkpoint h5 file. pISD calculations can be slow, and saving a checkpoint ensures progress is not lost if calculations are interrupted. To resume from a checkpoint, pass the path to the h5 file.
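Downsampling by stratifying over expression levels, as described for n_samples, can be sketched as follows. This is a generic illustration rather than MIRA's sampling code: the function name, the number of strata, and the equal-size binning by rank are all assumptions.

```python
import random


def stratified_downsample(expression, n_samples, n_bins=4, seed=0):
    """Return sorted cell indices sampled evenly across expression strata.

    Hypothetical helper. May return slightly fewer than n_samples when a
    stratum holds fewer cells than its quota.
    """
    n_cells = len(expression)
    if n_samples >= n_cells:
        return list(range(n_cells))
    rng = random.Random(seed)
    # Rank cells by expression, then cut the ranking into roughly equal strata.
    order = sorted(range(n_cells), key=lambda i: expression[i])
    strata = [
        order[b * n_cells // n_bins:(b + 1) * n_cells // n_bins]
        for b in range(n_bins)
    ]
    chosen = []
    for b, stratum in enumerate(strata):
        # Split the quota across strata; the first strata absorb any remainder.
        quota = n_samples // n_bins + (1 if b < n_samples % n_bins else 0)
        chosen.extend(rng.sample(stratum, min(quota, len(stratum))))
    return sorted(chosen)


# Toy expression values for 100 cells; keep 20 of them.
expression = [float(i) for i in range(100)]
subsample = stratified_downsample(expression, n_samples=20)
print(len(subsample))
```

Sampling within strata keeps both low- and high-expression cells in the subsample, which plain random sampling does not guarantee for rarely expressed genes.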

Returns

anndata.AnnData

.varm['motifs-prob_deletion'] or .varm['chip-prob_deletion']: np.ndarray[float] of shape (n_genes, n_factors): Association scores for each gene-TF combination. Higher scores indicate greater predicted association/regulatory influence.

property parameters_: Returns parameters of all contained RP models.