mira.tl.get_ChIP_hits_in_peaks
- mira.tl.get_ChIP_hits_in_peaks(adata, chrom='chr', start='start', end='end', species='mm10', *, factor_type='chip')
Find ChIP hits that overlap with accessible regions using CistromeDB’s catalogue of publicly available datasets.
- Parameters
- adata : anndata.AnnData
AnnData of accessibility features
- species : {“hg38”, “mm10”}, default = “mm10”
Organism. CistromeDB’s catalogue contains samples for hg38 and mm10.
- chrom : str, default = “chr”
The column in adata.var corresponding to the chromosome of peaks
- start : str, default = “start”
The column in adata.var corresponding to the start coordinate of peaks
- end : str, default = “end”
The column in adata.var corresponding to the end coordinate of peaks
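When peak coordinates exist only as peak names of the form chr1:9778-10670 (the index format shown under Examples), the three columns can be derived from those names. The following is a minimal sketch, not part of MIRA: parse_peak_name is a hypothetical helper, and the "chrom:start-end" naming scheme is an assumption about the input.

```python
import re

# Hypothetical helper (not part of MIRA): split "chrom:start-end" peak names
# into the fields that the chrom, start, and end columns would hold.
PEAK_NAME = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_peak_name(name):
    """Return (chromosome, start, end) parsed from a peak name."""
    match = PEAK_NAME.match(name)
    if match is None:
        raise ValueError(f"Unrecognized peak name: {name!r}")
    start, end = int(match["start"]), int(match["end"])
    if start >= end:
        raise ValueError(f"Start must be smaller than end: {name!r}")
    return match["chrom"], start, end


peak_names = ["chr1:9778-10670", "chr1:180631-181281"]

# Transpose the per-peak tuples into one list per column.
chroms, starts, ends = (
    list(column) for column in zip(*map(parse_peak_name, peak_names))
)
```

The resulting lists could then be assigned to columns of adata.var (for instance adata.var["chr"] = chroms), and those column names passed as chrom, start, and end.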
- Returns
- adata : anndata.AnnData
- .varm[“chip_hits”] : scipy.spmatrix[float] of shape (n_motifs, n_peaks)
Called ChIP hits for each peak. Non-significant hits are left empty in the sparse matrix.
- .uns[‘chip’] : dict of type {str : list}
Dictionary of metadata for ChIP samples. Each key is an attribute. The attributes recorded for each ChIP sample are the ID, name, parsed factor name (for lookup in expression data), and whether expression data exists for that factor. They are stored under the keys id, name, parsed_name, and in_expr_data, respectively.
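Because the metadata is keyed by attribute rather than by sample, it can help to regroup it into one record per ChIP sample. The sketch below uses a toy dictionary in place of adata.uns[‘chip’]; all values in it are invented for illustration, and the real entries may be stored as arrays rather than plain lists.

```python
# Toy stand-in for adata.uns["chip"]; every value here is invented.
chip_meta = {
    "id": ["1001", "1002", "1003"],
    "name": ["CTCF", "GATA1", "SPI1"],
    "parsed_name": ["CTCF", "GATA1", "SPI1"],
    "in_expr_data": [True, False, True],
}

# Regroup the attribute-keyed dictionary into one record per ChIP sample.
records = [dict(zip(chip_meta, values)) for values in zip(*chip_meta.values())]

# Keep the factors that were found in the expression data.
with_expression = [r["parsed_name"] for r in records if r["in_expr_data"]]
```

Each record holds the four attributes for a single sample, so filtering on in_expr_data or looking up a sample by id becomes a plain list comprehension.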
Note
To retrieve the ChIP metadata, use mira.utils.fetch_factor_meta(adata, factor_type = “chip”). Methods that interact with binding site data always have a factor_type parameter. This parameter defaults to “motifs”, so when using ChIP data, specify factor_type = “chip”.
Examples
>>> atac_data.var
...                      chr   start     end
... chr1:9778-10670     chr1    9778   10670
... chr1:180631-181281  chr1  180631  181281
... chr1:183970-184795  chr1  183970  184795
... chr1:190991-191935  chr1  190991  191935
>>> mira.tl.get_ChIP_hits_in_peaks(atac_data,
...     chrom = "chr", start = "start", end = "end",
...     species = "hg38")
... Grabbing hg38 data (~15 minutes):
... Downloading from database
... Done
... Loading gene info ...
... Validating user-provided regions ...
... WARNING: 71 regions encounted from unknown chromsomes: KI270728.1,GL000194.1,GL000205.2,GL000195.1,GL000219.1,KI270734.1,GL000218.1,KI270721.1,KI270726.1,KI270711.1,KI270713.1
... INFO:mira.adata_interface.regulators:Added key to varm: chip_hits
... INFO:mira.adata_interface.regulators:Added key to uns: chip
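The warning in the output above is triggered by peaks on unplaced contigs such as KI270728.1. One way to see which peaks are affected is to separate canonical chromosomes from everything else before calling the function. This is a rough sketch under stated assumptions: split_by_chromosome is a hypothetical helper, and treating only chr1 to chr22, chrX, and chrY as known is an assumption for hg38; the set of chromosomes MIRA accepts is not stated here, and mm10 would need a different pattern.

```python
import re

# Assumption: only chr1-chr22, chrX, and chrY count as "known" (hg38).
# The set of chromosomes MIRA actually accepts is not documented here.
CANONICAL = re.compile(r"^chr([1-9]|1[0-9]|2[0-2]|X|Y)$")


def split_by_chromosome(chromosomes):
    """Return (indices of peaks to keep, sorted names of dropped chromosomes)."""
    keep, dropped = [], set()
    for index, chrom in enumerate(chromosomes):
        if CANONICAL.match(chrom):
            keep.append(index)
        else:
            dropped.add(chrom)
    return keep, sorted(dropped)


chromosomes = ["chr1", "KI270728.1", "chrX", "GL000194.1", "chr1"]
keep, dropped = split_by_chromosome(chromosomes)
```

With an AnnData object, the kept indices could be used to subset the features (for example atac_data[:, keep]) if peaks on unknown chromosomes should be excluded up front.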