DeepSpCas9 Activity Prediction

Deep learning–based SpCas9 guide RNA editing efficiency prediction

Upstream: 4 nt (pos 1–4)
Protospacer: 20 nt (pos 5–24)
PAM (NGG): 3 nt (pos 25–27)
Downstream: 3 nt (pos 28–30)



Total: exactly 30 nucleotides · A/T/G/C only
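The input layout above can be checked and sliced mechanically. The following is a minimal sketch, not the tool's own code: the function name `split_target` and the `REGIONS` table are hypothetical, and the only rules enforced are the ones stated above (exactly 30 nucleotides, A/T/G/C only).

```python
import re

# Region boundaries from the layout above, converted from 1-based inclusive
# positions to 0-based half-open slices. Names are illustrative only.
REGIONS = {
    "upstream": (0, 4),      # pos 1-4
    "protospacer": (4, 24),  # pos 5-24
    "pam": (24, 27),         # pos 25-27
    "downstream": (27, 30),  # pos 28-30
}


def split_target(seq: str) -> dict:
    """Validate a 30-nt target sequence and split it into its four regions."""
    seq = seq.strip().upper()
    if len(seq) != 30:
        raise ValueError(f"expected exactly 30 nucleotides, got {len(seq)}")
    if not re.fullmatch(r"[ACGT]+", seq):
        raise ValueError("sequence may contain only A, T, G, and C")
    return {name: seq[start:end] for name, (start, end) in REGIONS.items()}


parts = split_target("ACCCCCTCCACCCCGCCTCCGGGACTGCGA")
print(parts)
```

The sketch deliberately does not test whether positions 25–27 match NGG; whether the tool itself enforces that is not stated here.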

ACCCCCTCCACCCCGCCTCCGGGACTGCGA

GTCGCCCTCGAACTTCACCTCGGCGCGGGG

ATAGAATACTCAAGCTATGCATCAAGCTTG

High: ≥ 60%
Medium: 30% to < 60%
Low: < 30%

Predicted indel frequency (%): expected fraction of alleles with a SpCas9-induced insertion or deletion at the target site.
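As an illustration of how a predicted indel frequency maps onto the efficiency classes, here is a small sketch. The function name `efficiency_class` is hypothetical. The boundaries follow the legend: 60% and above is High, below 30% is Low, and everything in between is Medium.

```python
def efficiency_class(indel_pct: float) -> str:
    """Map a predicted indel frequency (in percent) to High / Medium / Low."""
    if indel_pct >= 60:
        return "High"
    if indel_pct >= 30:
        return "Medium"
    return "Low"


for value in (72.5, 60.0, 45.0, 30.0, 12.3):
    print(value, efficiency_class(value))
```

Because the model output is an unbounded regression value, the sketch does not reject predictions outside 0–100; it simply classifies them by the same thresholds.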

DeepSpCas9 predicts SpCas9 editing efficiency (indel frequency) from a 30-nt target sequence, using a deep CNN trained on 12,832 target sequences measured in human cells.

3-branch CNN (inception-style)
100, 70, and 40 filters with kernel widths of 3, 5, and 7 nt, respectively

Avg Pool → Flatten → Concat
Outputs of all 3 branches are concatenated and fed into the fully connected layers

FC(80) → FC(60) → Linear output
Dropout 0.3 at each layer, regression target = indel %
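To make the layer sizes concrete, the sketch below works out how wide the concatenated feature vector would be under some stated assumptions. The filter counts, kernel widths, and 30-nt input length come from the description above. The rest is assumed for illustration and is not confirmed here: unpadded ("valid") convolutions with stride 1, and average pooling with window 2 and stride 2.

```python
SEQ_LEN = 30  # input length in nucleotides (from the input format)

# (kernel width in nt, number of filters) for the three branches
BRANCHES = [(3, 100), (5, 70), (7, 40)]

POOL = 2  # ASSUMPTION: average-pool window and stride; not stated in the text


def branch_width(seq_len: int, kernel: int, filters: int, pool: int = POOL) -> int:
    """Flattened output size of one branch: conv -> avg pool -> flatten."""
    conv_positions = seq_len - kernel + 1  # ASSUMPTION: valid padding, stride 1
    pooled_positions = conv_positions // pool  # non-overlapping windows
    return pooled_positions * filters


widths = [branch_width(SEQ_LEN, k, f) for k, f in BRANCHES]
concat_width = sum(widths)

print(widths)        # [1400, 910, 480]
print(concat_width)  # 2790
```

Under these assumptions, the first fully connected layer, FC(80), would receive a 2,790-dimensional input. A different padding or pooling choice changes these numbers, so treat them as an example of the arithmetic rather than the model's actual dimensions.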

Training data
12,832 SpCas9 target sequences in human cells

Performance note: The first request loads the TensorFlow model (about 2–5 s). Subsequent predictions are fast (≪1 s per batch).
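One common way to get this behavior is to load the model lazily and cache it for the life of the process. The sketch below shows the pattern only. `StubModel` and `get_model` are hypothetical names, and the stub stands in for the real TensorFlow model, so it returns no actual predictions.

```python
import functools


class StubModel:
    """Stand-in for the real TensorFlow model (hypothetical, no real output)."""

    def predict(self, seqs):
        # A real model would return one indel-frequency value per sequence.
        return [None for _ in seqs]


@functools.lru_cache(maxsize=1)
def get_model() -> StubModel:
    """Load the model on first use; later calls reuse the cached instance."""
    # In a real service, the slow model load (seconds) would happen here.
    return StubModel()


first = get_model()   # pays the one-time load cost
second = get_model()  # returns the cached object immediately
print(first is second)  # True
```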