CRISPR Screen Assays

Assays module

class screenpro.assays.GImaps

Bases: object

class screenpro.assays.PooledScreens(adata, test='ttest', n_reps=3, verbose=False)

Bases: object

PooledScreens class for processing CRISPR screen datasets

Parameters

adata (AnnData) – AnnData object with adata.X as a matrix of sgRNA counts
test (str) – statistical test to use for calculating phenotype scores
n_reps (int) – number of replicates to use for calculating phenotype scores
verbose (bool) – whether to print verbose output

buildPhenotypeData(run_name='auto', db_rate_col='pop_doubling', **kwargs)

calculateDrugScreen(score_level: Literal['compare_reps', 'compare_guides'], untreated: str, treated: str, t0: Optional[str] = None, db_rate_col: str = 'pop_doubling', run_name: Optional[str] = None, count_filter_threshold: int = 40, count_filter_type: Literal['mean', 'both', 'either'] = 'mean', **kwargs)

Calculate gamma, rho, and tau phenotype scores for a drug screen dataset at a given score_level. This function is a wrapper around runPhenoScore; check the arguments of runPhenoScore carefully before using it.

For each phenotype score, runPhenoScore applies a count filter. By default, any guide or target whose mean count across the replicates being compared is below 40 is set to NA. Because this can lead to unexpected behavior when the user relies on filterLowCounts, both the count_filter_threshold and count_filter_type arguments of runPhenoScore are specified explicitly here.
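A minimal sketch of the default 'mean' filter described above, written in plain Python rather than taken from the library. The helper name mean_count_filter and the guide names are hypothetical, and the sketch assumes the mean is taken over all replicate counts entering the comparison; the actual runPhenoScore implementation may differ.

```python
import math


def mean_count_filter(counts_by_guide, threshold=40):
    """Replace a guide's counts with NaN when its mean count is below threshold.

    counts_by_guide maps a guide name to the list of its counts across the
    replicates being compared (assumption: both conditions pooled together).
    """
    filtered = {}
    for guide, counts in counts_by_guide.items():
        mean_count = sum(counts) / len(counts)
        if mean_count < threshold:
            # Low-count guide: every value becomes NA
            filtered[guide] = [math.nan] * len(counts)
        else:
            filtered[guide] = list(counts)
    return filtered


# Hypothetical counts for two guides across four compared replicates
counts = {"sgA": [120, 95, 130, 110], "sgB": [10, 35, 60, 20]}
result = mean_count_filter(counts)
```

Here sgB has a mean count of 31.25, so all of its values become NA, while sgA passes through unchanged. Guides removed this way are in addition to anything already dropped by filterLowCounts.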

self.adata.obs must have a ‘condition’ column. If doubling information is provided, it also needs a ‘replicate’ column.

Parameters

score_level (str) – name of the score level. Must be “compare_reps” or “compare_guides”
untreated (str) – name of the untreated condition
treated (str) – name of the treated condition
t0 (str) – name of the t0 (time-zero) condition; optional, default is None
db_rate_col (str) – column name for the doubling rate, default is ‘pop_doubling’
run_name (str) – name for the phenotype calculation run
count_filter_threshold (int) – filter threshold for counts across compared replicates. Default is 40.
count_filter_type (str) – type of filter for counts across replicates. Must be ‘mean’, ‘both’, or ‘either’. Default is ‘mean’.
**kwargs – additional arguments to pass to runPhenoScore
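As an illustration of how the three condition arguments relate to the three scores, the sketch below pairs each score with a (reference, test) comparison. The pairing used here (gamma: t0 vs. untreated, tau: t0 vs. treated, rho: untreated vs. treated) follows the usual convention for these scores and is an assumption, as is computing only rho when t0 is omitted. The helper drug_screen_comparisons and the condition names are hypothetical and not part of the library.

```python
def drug_screen_comparisons(untreated, treated, t0=None):
    """Map each phenotype score to its assumed (reference, test) conditions."""
    # rho compares treated against untreated and does not need a t0 sample
    comparisons = {"rho": (untreated, treated)}
    if t0 is not None:
        # gamma and tau both use the t0 sample as the reference (assumption)
        comparisons["gamma"] = (t0, untreated)
        comparisons["tau"] = (t0, treated)
    return comparisons


# Hypothetical condition names, as they would appear in adata.obs['condition']
comparisons = drug_screen_comparisons(untreated="DMSO", treated="drug", t0="T0")
for score, (reference, test) in sorted(comparisons.items()):
    print(f"{score}: {test} vs. {reference}")
```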

calculateDrugScreenDESeq(untreated, treated, t0=None, run_name='pyDESeq2', **kwargs)

Calculate DESeq2 results for a given drug screen dataset.

Parameters

design (str) – design matrix for DESeq2-based analysis
untreated (str) – name of the untreated condition
treated (str) – name of the treated condition
t0 (str) – name of the t0 (time-zero) condition; optional, default is None
run_name (str) – name for the phenotype calculation run
**kwargs – additional arguments to pass to runDESeq

calculateFlowBasedScreen(low_bin, high_bin, score_level, run_name=None, **kwargs)

Calculate phenotype scores for a flow-based screen dataset.

Parameters

low_bin (str) – name of the low bin condition
high_bin (str) – name of the high bin condition
score_level (str) – name of the score level
run_name (str) – name for the phenotype calculation run
**kwargs – additional arguments to pass to runPhenoScore

copy()

countNormalization(pseudo_count_value=0.5)

Preprocess and normalize the counts data in adata.X

Steps:

Add pseudocount to counts
Normalize counts by sequencing depth
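The two steps above can be sketched in plain Python as follows. This is not the library's implementation: the helper normalize_counts is hypothetical, and rescaling every sample to the mean total depth is an assumption, since the page does not state which target depth is used.

```python
def normalize_counts(counts, pseudo_count_value=0.5):
    """Add a pseudocount, then rescale each sample to a common depth.

    counts is a list of samples, each a list of raw per-guide counts.
    """
    # Step 1: add the pseudocount to every entry
    shifted = [[c + pseudo_count_value for c in sample] for sample in counts]

    # Step 2: normalize by sequencing depth (total counts per sample)
    depths = [sum(sample) for sample in shifted]
    target_depth = sum(depths) / len(depths)  # assumption: mean depth
    return [
        [c * target_depth / depth for c in sample]
        for sample, depth in zip(shifted, depths)
    ]


# Two hypothetical samples with three guides each
raw = [[10, 0, 30], [20, 5, 55]]
normalized = normalize_counts(raw)
```

After normalization both samples sum to the same total, so per-guide values can be compared across samples. The pseudocount keeps zero counts from producing undefined values in later log-ratio calculations.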

drawVolcano(ax, phenotype_name, threshold, dot_size=1, run_name='auto', score_col='score', pvalue_col='pvalue', xlabel='auto', ylabel='-log10(pvalue)', xlims='auto', ylims='auto', ctrl_label='negative_control', resistance_hits=None, sensitivity_hits=None, size_txt=None, t_x=0, t_y=0, **args)

filterLowCounts(filter_type='all', minimum_reads=1)

Filter low counts in adata.X

getPhenotypeScores(phenotype_name, threshold, run_name='auto', **kwargs)

Get phenotype scores for a given phenotype_name

Parameters

phenotype_name (str) – name of the phenotype score
run_name (str) – name of the phenotype calculation run to retrieve

listPhenotypeScores(run_name='auto')

List available phenotype scores for a given run_name

Parameters

run_name (str) – name of the phenotype calculation run to retrieve