CRISPR Screen Assays

Assays module

class screenpro.assays.GImaps[source]

Bases: object

class screenpro.assays.PooledScreens(adata, test='ttest', n_reps=3, verbose=False)[source]

Bases: object

PooledScreens class for processing CRISPR screen datasets.

Parameters
  • adata (AnnData) – AnnData object with adata.X as a matrix of sgRNA counts

  • test (str) – statistical test to use for calculating phenotype scores

  • n_reps (int) – number of replicates to use for calculating phenotype scores

  • verbose (bool) – whether to print verbose output

buildPhenotypeData(run_name='auto', db_rate_col='pop_doubling', **kwargs)[source]

calculateDrugScreen(score_level: Literal['compare_reps', 'compare_guides'], untreated: str, treated: str, t0: Optional[str] = None, db_rate_col: str = 'pop_doubling', run_name: Optional[str] = None, count_filter_threshold: int = 40, count_filter_type: Literal['mean', 'both', 'either'] = 'mean', **kwargs)[source]

Calculate gamma, rho, and tau phenotype scores for a drug screen dataset at a given score_level. This function is a wrapper around runPhenoScore; check the arguments of runPhenoScore carefully before using it.

For a given phenotype score, runPhenoScore applies a count filter. By default, any guide or target whose mean count across the replicates being compared is below 40 is set to NA. Because this can lead to unexpected behavior when the user relies on filterLowCounts, the count_filter_threshold and count_filter_type arguments of runPhenoScore are specified explicitly here.
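The default 'mean' filter described above can be illustrated with a small, library-free sketch. It is not the runPhenoScore implementation: the function name apply_mean_count_filter, the guide names, and the use of None as a stand-in for NA are all assumptions made for illustration, and the sketch pools every replicate count for a guide into a single list.

```python
from statistics import mean

def apply_mean_count_filter(counts_by_guide, threshold=40):
    """Mask guides whose mean count is below the threshold.

    counts_by_guide maps a guide name to the counts of the replicates
    being compared (hypothetical layout, chosen for this sketch).
    """
    filtered = {}
    for guide, counts in counts_by_guide.items():
        # Low-count guides are masked with None (standing in for NA),
        # not dropped, so every guide keeps its entry.
        filtered[guide] = counts if mean(counts) >= threshold else None
    return filtered

# Hypothetical guides and counts
example = {
    'sgA': [120, 95, 110],  # mean well above 40: kept
    'sgB': [12, 70, 20],    # mean is 34: masked
    'sgC': [40, 40, 40],    # mean is exactly 40, not below it: kept
}
masked = apply_mean_count_filter(example)
```

Note that sgB is masked even though one of its replicates has 70 reads, which is the kind of result that may surprise a user who has already filtered with filterLowCounts.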

self.adata.obs must have a ‘condition’ column. If doubling information is provided, it also needs a ‘replicate’ column.

Parameters
  • score_level (str) – name of the score level. Must be “compare_reps” or “compare_guides”

  • untreated (str) – name of the untreated condition

  • treated (str) – name of the treated condition

  • t0 (str) – name of the t0 condition (optional; default is None)

  • db_rate_col (str) – column name for the doubling rate, default is ‘pop_doubling’

  • run_name (str) – name for the phenotype calculation run

  • count_filter_threshold (int) – filter threshold for counts across compared replicates. Default is 40.

  • count_filter_type (str) – type of filter for counts across replicates. Must be ‘mean’, ‘both’, or ‘either’. Default is ‘mean’.

  • **kwargs – additional arguments to pass to runPhenoScore
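A call using the parameters listed above might look like the following sketch. The keyword names come from the signature; the condition labels 'DMSO', 'Drug', and 'T0' and the helper run_drug_screen are hypothetical and must be replaced with values that actually appear in the ‘condition’ column of adata.obs. The helper assumes it receives an already constructed PooledScreens object.

```python
# Hypothetical condition labels; the keyword names follow the
# calculateDrugScreen signature documented above.
drug_screen_args = dict(
    score_level='compare_reps',   # or 'compare_guides'
    untreated='DMSO',
    treated='Drug',
    t0='T0',
    db_rate_col='pop_doubling',
    count_filter_threshold=40,    # stated explicitly, matching the default
    count_filter_type='mean',     # 'mean', 'both', or 'either'
)

def run_drug_screen(screen, **overrides):
    """Call calculateDrugScreen on an existing PooledScreens object.

    Any keyword passed in overrides replaces the value above.
    """
    screen.calculateDrugScreen(**{**drug_screen_args, **overrides})
    return screen
```

Passing count_filter_threshold and count_filter_type explicitly, even at their default values, keeps the filtering behavior visible at the call site.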

calculateDrugScreenDESeq(untreated, treated, t0=None, run_name='pyDESeq2', **kwargs)[source]

Calculate DESeq2 results for a given drug screen dataset.

Parameters
  • design (str) – design matrix for DESeq2-based analysis

  • untreated (str) – name of the untreated condition

  • treated (str) – name of the treated condition

  • t0 (str) – name of the t0 condition (optional; default is None)

  • run_name (str) – name for the phenotype calculation run

  • **kwargs – additional arguments to pass to runDESeq

calculateFlowBasedScreen(low_bin, high_bin, score_level, run_name=None, **kwargs)[source]

Calculate phenotype scores for a flow-based screen dataset.

Parameters
  • low_bin (str) – name of the low bin condition

  • high_bin (str) – name of the high bin condition

  • score_level (str) – name of the score level

  • run_name (str) – name for the phenotype calculation run

  • **kwargs – additional arguments to pass to runPhenoScore

copy()[source]

countNormalization(pseudo_count_value=0.5)[source]

Preprocess and normalize the counts data in adata.X

Steps:
  1. Add pseudocount to counts

  2. Normalize counts by sequencing depth
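The two steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the function name normalize_counts is hypothetical, samples are assumed to be rows (as in adata.X, where adata.obs describes samples), and rescaling every sample to the median depth is one common choice for depth normalization that the documentation does not confirm.

```python
from statistics import median

def normalize_counts(counts, pseudo_count_value=0.5):
    """Add a pseudocount, then rescale each sample to a common depth.

    counts is a list of samples, each a list of raw sgRNA counts.
    """
    # Step 1: add the pseudocount to every count
    shifted = [[c + pseudo_count_value for c in sample] for sample in counts]

    # Step 2: normalize by sequencing depth. Each sample is rescaled so
    # that its total equals the median total across samples (assumption).
    depths = [sum(sample) for sample in shifted]
    target_depth = median(depths)
    return [
        [c * target_depth / depth for c in sample]
        for sample, depth in zip(shifted, depths)
    ]

# Two hypothetical samples; the second was sequenced about twice as deeply
raw = [[10, 30, 60], [20, 60, 120]]
normalized = normalize_counts(raw)
```

After normalization both samples have the same total, so differences between them reflect relative sgRNA abundance and not sequencing depth. The pseudocount keeps zero counts from producing undefined values in later log-ratio calculations.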

drawVolcano(ax, phenotype_name, threshold, dot_size=1, run_name='auto', score_col='score', pvalue_col='pvalue', xlabel='auto', ylabel='-log10(pvalue)', xlims='auto', ylims='auto', ctrl_label='negative_control', resistance_hits=None, sensitivity_hits=None, size_txt=None, t_x=0, t_y=0, **args)[source]

filterLowCounts(filter_type='all', minimum_reads=1)[source]

Filter low counts in adata.X.

getPhenotypeScores(phenotype_name, threshold, run_name='auto', **kwargs)[source]

Get phenotype scores for a given phenotype_name

Parameters
  • phenotype_name (str) – name of the phenotype score

  • run_name (str) – name of the phenotype calculation run to retrieve

listPhenotypeScores(run_name='auto')[source]

List available phenotype scores for a given run_name

Parameters
  • run_name (str) – name of the phenotype calculation run to retrieve