mira.rp.NITE_Model

class mira.rp.NITE_Model(*, expr_model, accessibility_model, genes, learning_rate=1, counts_layer=None, initialization_model=None, search_reps=1)

Container for multiple regulatory potential (RP) NITE models. NITE models learn the relationship between a gene’s expression and both the accessibility of nearby cis-regulatory elements (CREs) and the cell-wide chromatin landscape.

How well local versus cell-wide chromatin predicts a gene’s expression state determines that gene’s NITE score, which indicates whether the gene is primarily regulated by local or nonlocal mechanisms.

Parameters
expr_model: mira.topics.ExpressionTopicModel

Trained MIRA expression topic model.

accessibility_model: mira.topics.AccessibilityTopicModel

Trained MIRA accessibility topic model.

genes: np.ndarray[str], list[str]

List of genes for which to learn RP models.

learning_rate: float>0

Learning rate for the L-BFGS optimizer.

counts_layer: str, default=None

Layer in AnnData that contains raw counts for modeling.

initialization_model: mira.rp.LITE_Model, mira.rp.NITE_Model, None

Initialize parameters of the RP model using the provided model before further optimization with L-BFGS. This is used when training the NITE model, which is initialized with the LITE model parameters learned for the same genes, then retrained to optimize the NITE model’s extra parameters. This procedure speeds up training.

Examples

Setup requires RNA and ATAC AnnData objects with shared cell barcodes, and trained topic models for both modalities:

>>> rp_args = dict(expr_adata = rna_data, atac_adata = atac_data)
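Both AnnData objects need to describe the same cells. The following is a minimal sketch of checking for shared barcodes, assuming the barcodes are available as plain lists of strings (for AnnData objects these would typically come from `obs_names`). The helper name `shared_barcodes` and the example barcodes are hypothetical and not part of MIRA:

```python
def shared_barcodes(rna_barcodes, atac_barcodes):
    """Return the barcodes present in both lists, in RNA order."""
    atac_set = set(atac_barcodes)  # set lookup keeps the check linear in size
    return [bc for bc in rna_barcodes if bc in atac_set]


# Illustrative barcodes only.
rna_barcodes = ["AAAC-1", "AAAG-1", "AACT-1"]
atac_barcodes = ["AACT-1", "AAAC-1", "TTTG-1"]

common = shared_barcodes(rna_barcodes, atac_barcodes)
print(common)
```

If the result is shorter than either input, the two datasets would need to be subset to the common cells before fitting.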

Instantiating a NITE model (local and cell-wide chromatin accessibility), initialized from a trained LITE model:

>>> nitemodel = mira.rp.NITE_Model(
...     expr_model = rna_model, 
...     accessibility_model = atac_model,
...     counts_layer = 'counts',
...     genes = litemodel.genes,
...     initialization_model = litemodel
... )
>>> nitemodel.fit(**rp_args)
Attributes
genes: np.ndarray[str]

Array of gene names for models

features: np.ndarray[str]

Array of gene names for models

models: list[mira.rp.GeneModel]

List of trained RP models

model_type: {“NITE”, “LITE”}

Methods

fit([callback, n_workers, atac_topic_comps_key])

Optimize parameters of RP models to learn cis-regulatory relationships.

get_model(gene)

Gets model for gene

join(rp_model)

Merge RP models from two model containers.

load(prefix)

Load RP models saved with prefix.

load_dir([counts_layer])

Load directory of RP models.

predict(*, expr_adata, atac_adata[, ...])

Predicts the expression of genes given their cis-accessibility state.

probabilistic_isd([n_samples, checkpoint, ...])

For each gene, calculate association scores with each transcription factor.

save(prefix)

Save RP models.

subset(genes)

Return a subset container of RP models.

classmethod load_dir(counts_layer=None, *, expr_model, accessibility_model, prefix)

Load directory of RP models. Adds all available RP models into a container.

Parameters
expr_model: mira.topics.ExpressionTopicModel

Trained MIRA expression topic model.

accessibility_model: mira.topics.AccessibilityTopicModel

Trained MIRA accessibility topic model.

counts_layer: str, default=None

Layer in AnnData that contains raw counts for modeling.

prefix: str

Prefix under which RP models were saved.

Examples

>>> litemodel = mira.rp.LITE_Model.load_dir(
...     counts_layer = 'counts',
...     expr_model = rna_model, 
...     accessibility_model = atac_model,
...     prefix = 'path/to/rpmodels/'
... )
subset(genes)

Return a subset container of RP models.

Parameters
genes: np.ndarray[str], list[str]

List of genes to subset from the RP model container.

Examples

>>> less_models = litemodel.subset(['LEF1','WNT3'])
join(rp_model)

Merge RP models from two model containers.

Parameters
rp_model: mira.rp.LITE_Model, mira.rp.NITE_Model

RP model container from which to append new RP models.

Examples

>>> model1.genes
['LEF1','WNT3']
>>> model2.genes
['CTSC','EDAR']
>>> merged_model = model1.join(model2)
>>> merged_model.genes
['LEF1','WNT3','CTSC','EDAR']
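The container behavior described here (lookup by gene, `subset`, and `join`) can be pictured with a toy stand-in. This is only a sketch built on an ordered dict, not MIRA's implementation: the class `ToyContainer`, the string placeholders used as models, and the choice to keep the first container's model when a gene appears in both are all assumptions made for illustration.

```python
class ToyContainer:
    """Toy stand-in for an RP model container (hypothetical, not MIRA code)."""

    def __init__(self, models):
        # models: mapping of gene name -> model object; dicts keep insertion order
        self._models = dict(models)

    @property
    def genes(self):
        return list(self._models)

    def get_model(self, gene):
        return self._models[gene]

    def __getitem__(self, gene):
        # container[gene] is an alias for get_model(gene)
        return self.get_model(gene)

    def subset(self, genes):
        return ToyContainer({g: self._models[g] for g in genes})

    def join(self, other):
        merged = dict(self._models)
        for gene in other.genes:
            # Assumption: on duplicate genes, keep this container's model.
            merged.setdefault(gene, other[gene])
        return ToyContainer(merged)


model1 = ToyContainer({"LEF1": "m_LEF1", "WNT3": "m_WNT3"})
model2 = ToyContainer({"CTSC": "m_CTSC", "EDAR": "m_EDAR"})
merged = model1.join(model2)
print(merged.genes)
```

The merged gene order follows the first container and then the second, matching the order shown in the example above.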
__getitem__(gene)

Alias for get_model(gene).

Examples

>>> rp_model["LEF1"]
<mira.rp_model.rp_model.GeneModel at 0x7fa07af1cf10>
save(prefix)

Save RP models.

Parameters
prefix: str

Prefix under which to save RP models. May be filename prefix or directory. RP models will save with format: {prefix}_{LITE/NITE}_{gene}.pth
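To make the naming pattern concrete, here is a small sketch that builds a path following the `{prefix}_{LITE/NITE}_{gene}.pth` format stated above. The helper `rp_model_filename` and the prefix `path/to/rpmodels/run1` are hypothetical; MIRA constructs these paths internally when `save` is called.

```python
def rp_model_filename(prefix, model_type, gene):
    """Build a path following the {prefix}_{LITE/NITE}_{gene}.pth pattern."""
    if model_type not in ("LITE", "NITE"):
        raise ValueError("model_type must be 'LITE' or 'NITE'")
    return f"{prefix}_{model_type}_{gene}.pth"


# Illustrative prefix and gene only.
path = rp_model_filename("path/to/rpmodels/run1", "NITE", "LEF1")
print(path)
```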

get_model(gene)

Gets model for gene

Parameters
gene: str

Fetch RP model for this gene

load(prefix)

Load RP models saved with prefix.

Parameters
prefix: str

Prefix under which RP models were saved.

fit(callback=None, *, expr_adata, atac_adata, n_workers=1, atac_topic_comps_key='X_topic_compositions')

Optimize parameters of RP models to learn cis-regulatory relationships.

Parameters
expr_adata: anndata.AnnData

AnnData of expression features

atac_adata: anndata.AnnData

AnnData of accessibility features. Must be annotated with mira.tl.get_distance_to_TSS.

Returns
rp_model: mira.rp.LITE_Model, mira.rp.NITE_Model

RP model with optimized parameters

predict(*, expr_adata, atac_adata, n_workers=1, atac_topic_comps_key='X_topic_compositions')

Predicts the expression of genes given their cis-accessibility state. Also evaluates the probability of that prediction for LITE/NITE evaluation.

Parameters
expr_adata: anndata.AnnData

AnnData of expression features

atac_adata: anndata.AnnData

AnnData of accessibility features. Must be annotated with mira.tl.get_distance_to_TSS.

Returns
anndata.AnnData
.layers[‘LITE_prediction’] or .layers[‘NITE_prediction’]: np.ndarray[float] of shape (n_cells, n_features)

Predicted relative frequencies of features using LITE or NITE model, respectively

.layers[‘LITE_logp’] or .layers[‘NITE_logp’]: np.ndarray[float] of shape (n_cells, n_features)

Log-probability of observed expression given the posterior predictive estimate of the LITE or NITE model, respectively.
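One way to read the two `logp` layers side by side is to compare them per gene. The sketch below sums the per-cell difference between NITE and LITE values for a single gene, assuming both hold per-cell log-probabilities as the layer names suggest. This is a plain log-likelihood comparison for illustration, not MIRA's NITE score formula; the function name `logp_gain` and the numbers are made up.

```python
def logp_gain(lite_logp, nite_logp):
    """Sum over cells of (NITE logp - LITE logp) for one gene."""
    if len(lite_logp) != len(nite_logp):
        raise ValueError("both inputs must cover the same cells")
    return sum(n - l for l, n in zip(lite_logp, nite_logp))


# Illustrative per-cell values for one gene.
lite = [-2.0, -3.5, -1.0]
nite = [-1.5, -2.0, -1.0]

gain = logp_gain(lite, nite)
print(gain)
```

A larger positive value means the model that also sees cell-wide chromatin explains the observed expression better than the local-only model in this toy comparison.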

probabilistic_isd(n_samples=1500, *, checkpoint=None, hits_matrix, metadata, expr_adata, atac_adata, n_workers=1, atac_topic_comps_key='X_topic_compositions', factor_type='motifs')

For each gene, calculate association scores with each transcription factor (TF). Association scores detect when a TF binds within cis-regulatory elements (CREs) that are influential to expression predictions for that gene. CREs that influence the RP model’s expression prediction are near a gene’s TSS and have accessibility that correlates with expression. This model assumes these attributes indicate that a factor is more likely to regulate a gene.

Parameters
expr_adata: anndata.AnnData

AnnData of expression features

atac_adata: anndata.AnnData

AnnData of accessibility features. Must be annotated with TSS and factor binding data using mira.tl.get_distance_to_TSS and mira.tl.get_motif_hits_in_peaks/mira.tl.get_CHIP_hits_in_peaks.

n_samples: int>0, default=1500

Downsample cells to this number for calculations, which reduces computation time. Cells are sampled by stratifying over expression levels.
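Stratified downsampling over expression levels can be sketched as follows: rank cells by expression, cut the ranking into contiguous strata, and draw a roughly equal number of cells from each. This is an illustration of the general idea under assumed choices (ten rank-based strata, a fixed seed), not the procedure MIRA uses internally; `stratified_downsample` and its parameters are hypothetical.

```python
import random


def stratified_downsample(expression, n_samples, n_strata=10, seed=0):
    """Return sorted indices of up to n_samples cells spread over expression strata."""
    n_cells = len(expression)
    if n_samples >= n_cells:
        return list(range(n_cells))  # nothing to downsample

    rng = random.Random(seed)
    # Rank cells by expression, then cut the ranking into contiguous strata.
    order = sorted(range(n_cells), key=lambda i: expression[i])
    n_strata = min(n_strata, n_samples)
    strata = [
        order[k * n_cells // n_strata:(k + 1) * n_cells // n_strata]
        for k in range(n_strata)
    ]

    # Split the sample budget across strata as evenly as possible.
    base, extra = divmod(n_samples, n_strata)
    chosen = []
    for k, stratum in enumerate(strata):
        quota = base + (1 if k < extra else 0)
        chosen.extend(rng.sample(stratum, min(quota, len(stratum))))
    return sorted(chosen)


# 100 cells with distinct, shuffled expression values between 0.0 and 9.9.
expression = [((i * 37) % 100) / 10 for i in range(100)]
picked = stratified_downsample(expression, 20)
print(len(picked))
```

Sampling within strata keeps both low- and high-expression cells represented, which a uniform random draw would not guarantee for a skewed expression distribution.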

checkpoint: str, default=None

Path to checkpoint h5 file. pISD calculations can be slow, and saving a checkpoint ensures progress is not lost if calculations are interrupted. To resume from a checkpoint, just pass the path to the h5.

Returns
anndata.AnnData
.varm[‘motifs-prob_deletion’] or .varm[‘chip-prob_deletion’]: np.ndarray[float] of shape (n_genes, n_factors)

Association scores for each gene-TF combination. Higher scores indicate greater predicted association/regulatory influence.

property parameters_

Returns parameters of all contained RP models.