mira.rp.NITE_Model
- class mira.rp.NITE_Model(*, expr_model, accessibility_model, genes, learning_rate=1, counts_layer=None, initialization_model=None, search_reps=1)
Container for multiple regulatory potential (RP) NITE models. A NITE model learns the relationship between a gene’s expression and two inputs: accessibility in nearby cis-regulatory elements (CREs) and the cell-wide chromatin landscape.
Comparing how well local versus cell-wide chromatin predicts a gene’s expression state determines that gene’s NITE score, which indicates whether the gene is primarily regulated by local or nonlocal mechanisms.
- Parameters
- expr_model: mira.topics.ExpressionTopicModel
Trained MIRA expression topic model.
- accessibility_model: mira.topics.AccessibilityTopicModel
Trained MIRA accessibility topic model.
- genes: np.ndarray[str], list[str]
List of genes for which to learn RP models.
- learning_rate: float>0, default=1
Learning rate for the L-BFGS optimizer.
- counts_layer: str, default=None
Layer in AnnData that contains raw counts for modeling.
- initialization_model: mira.rp.LITE_Model, mira.rp.NITE_Model, None
Initialize the RP model parameters from the provided model before further optimization with L-BFGS. This is used when training the NITE model, which is initialized with the LITE model parameters learned for the same genes and then retrained to optimize the NITE model’s extra parameters. This procedure speeds up training.
Examples
Setup requires RNA and ATAC AnnData objects with shared cell barcodes and trained topic models for both modes:
>>> rp_args = dict(expr_adata = rna_data, atac_adata = atac_data)
Instantiating a NITE model (local chromatin accessibility plus cell-wide chromatin state), initialized from a trained LITE model:
>>> nitemodel = mira.rp.NITE_Model(
...     expr_model = rna_model,
...     accessibility_model = atac_model,
...     counts_layer = 'counts',
...     genes = litemodel.genes,
...     initialization_model = litemodel
... )
>>> nitemodel.fit(**rp_args)
- Attributes
- genes: np.ndarray[str]
Array of gene names for models
- features: np.ndarray[str]
Array of gene names for models
- models: list[mira.rp.GeneModel]
List of trained RP models
- model_type: {"NITE", "LITE"}
Methods
- fit([callback, n_workers, atac_topic_comps_key]): Optimize parameters of RP models to learn cis-regulatory relationships.
- get_model(gene): Get the RP model for a gene.
- join(rp_model): Merge RP models from two model containers.
- load(prefix): Load RP models saved with prefix.
- load_dir([counts_layer]): Load directory of RP models.
- predict(*, expr_adata, atac_adata[, ...]): Predict the expression of genes given their cis-accessibility state.
- probabilistic_isd([n_samples, checkpoint, ...]): For each gene, calculate association scores with each transcription factor.
- save(prefix): Save RP models.
- subset(genes): Return a subset container of RP models.
- classmethod load_dir(counts_layer=None, *, expr_model, accessibility_model, prefix)
Load directory of RP models. Adds all available RP models into a container.
- Parameters
- expr_model: mira.topics.ExpressionTopicModel
Trained MIRA expression topic model.
- accessibility_model: mira.topics.AccessibilityTopicModel
Trained MIRA accessibility topic model.
- counts_layer: str, default=None
Layer in AnnData that contains raw counts for modeling.
- prefix: str
Prefix under which RP models were saved.
Examples
>>> litemodel = mira.rp.LITE_Model.load_dir(
...     counts_layer = 'counts',
...     expr_model = rna_model,
...     accessibility_model = atac_model,
...     prefix = 'path/to/rpmodels/'
... )
- subset(genes)
Return a subset container of RP models.
- Parameters
- genes: np.ndarray[str], list[str]
Genes whose RP models are kept in the returned container.
Examples
>>> less_models = litemodel.subset(['LEF1','WNT3'])
- join(rp_model)
Merge RP models from two model containers.
- Parameters
- rp_model: mira.rp.LITE_Model, mira.rp.NITE_Model
RP model container from which to append new RP models
Examples
>>> model1.genes
['LEF1','WNT3']
>>> model2.genes
['CTSC','EDAR']
>>> merged_model = model1.join(model2)
>>> merged_model.genes
['LEF1','WNT3','CTSC','EDAR']
- __getitem__(gene)
Alias for get_model(gene).
Examples
>>> rp_model["LEF1"]
<mira.rp_model.rp_model.GeneModel at 0x7fa07af1cf10>
- save(prefix)
Save RP models.
- Parameters
- prefix: str
Prefix under which to save RP models. May be filename prefix or directory. RP models will save with format: {prefix}_{LITE/NITE}_{gene}.pth
- get_model(gene)
Get the RP model for a gene.
- Parameters
- gene: str
Gene whose RP model to fetch.
- load(prefix)
Load RP models saved with prefix.
- Parameters
- prefix: str
Prefix under which RP models were saved.
- fit(callback=None, *, expr_adata, atac_adata, n_workers=1, atac_topic_comps_key='X_topic_compositions')
Optimize parameters of RP models to learn cis-regulatory relationships.
- Parameters
- expr_adata: anndata.AnnData
AnnData of expression features
- atac_adata: anndata.AnnData
AnnData of accessibility features. Must be annotated with mira.tl.get_distance_to_TSS.
- Returns
- rp_model: mira.rp.LITE_Model, mira.rp.NITE_Model
RP model with optimized parameters
- predict(*, expr_adata, atac_adata, n_workers=1, atac_topic_comps_key='X_topic_compositions')
Predicts the expression of genes given their cis-accessibility state. Also evaluates the probability of that prediction for LITE/NITE evaluation.
- Parameters
- expr_adata: anndata.AnnData
AnnData of expression features
- atac_adata: anndata.AnnData
AnnData of accessibility features. Must be annotated with mira.tl.get_distance_to_TSS.
- Returns
- anndata.AnnData
- .layers['LITE_prediction'] or .layers['NITE_prediction']: np.ndarray[float] of shape (n_cells, n_features)
Predicted relative frequencies of features using the LITE or NITE model, respectively.
- .layers['LITE_logp'] or .layers['NITE_logp']: np.ndarray[float] of shape (n_cells, n_features)
Log-probability of the observed expression given the posterior predictive estimate of the LITE or NITE model, respectively.
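The two logp layers make it possible to compare how well each model explains the observed counts. The following is a minimal sketch of such a comparison, not MIRA's NITE score calculation. It assumes both layers have already been extracted as nested lists of shape (n_cells, n_features); the helper name per_gene_logp_gain and the toy values are hypothetical.

```python
def per_gene_logp_gain(lite_logp, nite_logp):
    """Sum (NITE - LITE) log-probabilities over cells, giving one total per gene.

    Hypothetical helper, not part of MIRA. Both inputs are nested lists
    of shape (n_cells, n_features).
    """
    if len(lite_logp) != len(nite_logp):
        raise ValueError("inputs must cover the same cells")
    n_genes = len(lite_logp[0])
    gains = [0.0] * n_genes
    for lite_row, nite_row in zip(lite_logp, nite_logp):
        for j in range(n_genes):
            gains[j] += nite_row[j] - lite_row[j]
    return gains


# Toy values (made up): 3 cells x 2 genes.
lite = [[-2.0, -1.0], [-3.0, -1.5], [-2.5, -1.0]]
nite = [[-1.0, -1.0], [-1.5, -1.5], [-1.0, -1.0]]

gains = per_gene_logp_gain(lite, nite)
print(gains)  # [4.0, 0.0]
```

In this toy case the first gene is explained better once cell-wide information is included, while the second gene gains nothing, which would be consistent with mostly local regulation.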
- probabilistic_isd(n_samples=1500, *, checkpoint=None, hits_matrix, metadata, expr_adata, atac_adata, n_workers=1, atac_topic_comps_key='X_topic_compositions', factor_type='motifs')
For each gene, calculate association scores with each transcription factor (TF). Association scores detect when a TF binds within cis-regulatory elements (CREs) that are influential to expression predictions for that gene. CREs that influence the RP model’s expression prediction are near the gene’s TSS and have accessibility that correlates with expression. The model assumes that these attributes indicate a factor is more likely to regulate a gene.
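The intuition behind in-silico deletion can be sketched with a toy linear model. This is an illustration only: it assumes a hypothetical per-peak weight vector in place of a trained RP model, and it measures a plain difference in predictions, whereas MIRA's scoring is probabilistic and is not reproduced here. All names and numbers below are made up.

```python
def predict_expression(accessibility, weights):
    """Toy stand-in for an RP prediction: weighted sum of peak accessibility."""
    return sum(a * w for a, w in zip(accessibility, weights))


def deletion_effect(accessibility, weights, factor_hits):
    """Drop in predicted expression when peaks bound by a factor are masked out."""
    masked = [0.0 if hit else a for a, hit in zip(accessibility, factor_hits)]
    full = predict_expression(accessibility, weights)
    return full - predict_expression(masked, weights)


# Four peaks near one gene (made-up values).
accessibility = [1.0, 2.0, 0.5, 1.0]
weights = [0.5, 0.25, 0.0, 1.0]            # larger weight = more influential peak
hits_tf_a = [True, False, False, True]     # TF A binds two influential peaks
hits_tf_b = [False, False, True, False]    # TF B binds only a peak with no influence

effect_a = deletion_effect(accessibility, weights, hits_tf_a)
effect_b = deletion_effect(accessibility, weights, hits_tf_b)
print(effect_a, effect_b)  # 1.5 0.0
```

Masking the peaks bound by TF A removes most of the predicted expression, so TF A would receive the higher association with this gene; TF B binds only an uninformative peak and has no effect.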
- Parameters
- expr_adata: anndata.AnnData
AnnData of expression features
- atac_adata: anndata.AnnData
AnnData of accessibility features. Must be annotated with TSS and factor binding data using mira.tl.get_distance_to_TSS and mira.tl.get_motif_hits_in_peaks/mira.tl.get_CHIP_hits_in_peaks.
- n_samples: int>0, default=1500
Downsample cells to this number for calculations, which reduces computation time. Cells are sampled by stratifying over expression levels.
- checkpoint: str, default=None
Path to checkpoint h5 file. pISD calculations can be slow, and saving a checkpoint ensures progress is not lost if calculations are interrupted. To resume from a checkpoint, pass the path to the h5 file.
- Returns
- anndata.AnnData
- .varm['motifs-prob_deletion'] or .varm['chip-prob_deletion']: np.ndarray[float] of shape (n_genes, n_factors)
Association scores for each gene-TF combination. Higher scores indicate greater predicted association/regulatory influence.
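The n_samples parameter above downsamples cells by stratifying over expression levels. The exact scheme MIRA uses is not described here, so the sketch below shows one common approach under stated assumptions: cells are ranked by expression, the ranking is cut into equal-sized bins, and a proportional share is drawn from each bin. The function name stratified_downsample and the n_bins and seed arguments are hypothetical.

```python
import random


def stratified_downsample(expression, n_samples, n_bins=4, seed=0):
    """Pick about n_samples cell indices, spread evenly across expression-rank bins.

    Hypothetical helper, not part of MIRA.
    """
    n_cells = len(expression)
    if n_samples >= n_cells:
        return list(range(n_cells))
    rng = random.Random(seed)
    # Rank cells by expression, then cut the ranking into contiguous bins.
    order = sorted(range(n_cells), key=lambda i: expression[i])
    chosen = []
    for b in range(n_bins):
        members = order[b * n_cells // n_bins:(b + 1) * n_cells // n_bins]
        # Each bin receives a proportional share of the sample budget.
        quota = (b + 1) * n_samples // n_bins - b * n_samples // n_bins
        chosen.extend(rng.sample(members, min(quota, len(members))))
    return sorted(chosen)


# Toy data: 100 cells whose expression equals their index.
expression = list(range(100))
picked = stratified_downsample(expression, 20)
print(len(picked))  # 20
```

Compared with uniform random sampling, this keeps low- and high-expressing cells represented in the subsample even when one group is rare.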
- property parameters_
Returns parameters of all contained RP models.