DeepSpCas9 Activity Prediction
Deep learning–based SpCas9 guide RNA editing efficiency prediction
30-nt Input Format
Upstream: 4 nt (pos 1–4)
Protospacer: 20 nt (pos 5–24)
PAM (NGG): 3 nt (pos 25–27)
Downstream: 3 nt (pos 28–30)
ACCCCCTCCACCCCGCCTCCGGGACT
Total: exactly 30 nucleotides · A/T/G/C only
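The layout above maps directly onto string slices. The following is a minimal sketch, not the tool's own code: `split_target` and `REGIONS` are hypothetical names, and the sequence used is an arbitrary made-up illustration rather than measured data.

```python
# 1-based positions from the layout, converted to 0-based Python slices.
REGIONS = {
    "upstream": slice(0, 4),       # pos 1-4
    "protospacer": slice(4, 24),   # pos 5-24
    "pam": slice(24, 27),          # pos 25-27
    "downstream": slice(27, 30),   # pos 28-30
}


def split_target(seq: str) -> dict:
    """Split a 30-nt target into its four named regions (hypothetical helper)."""
    if len(seq) != 30:
        raise ValueError(f"expected exactly 30 nt, got {len(seq)}")
    if set(seq) - set("ATGC"):
        raise ValueError("only A/T/G/C are allowed")
    return {name: seq[region] for name, region in REGIONS.items()}


# Arbitrary illustrative sequence: 4 nt + 20 nt + TGG + 3 nt.
demo = "ACGT" + "GATTACAGATTACAGATTAC" + "TGG" + "ACT"
parts = split_target(demo)
print(parts)
```

Keeping the boundaries in one table means the 1-based positions shown to users and the 0-based slices used in code cannot drift apart silently.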
Sequence Input
Example Sequences
ACCCCCTCCACCCCGCCTCCGGGACTGCGA
GTCGCCCTCGAACTTCACCTCGGCGCGGGG
ATAGAATACTCAAGCTATGCATCAAGCTTG
Activity Classification
High: ≥ 60%
Medium: 30% to < 60%
Low: < 30%
Predicted indel frequency (%): expected fraction of alleles with a SpCas9-induced insertion or deletion at the target site.
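The classification thresholds can be written as a small function. This is a sketch under the assumption that a score of exactly 30% counts as Medium (Low is strictly below 30%) and exactly 60% counts as High; `classify_activity` is a hypothetical name, not part of the tool.

```python
def classify_activity(indel_pct: float) -> str:
    """Map a predicted indel frequency (%) to an activity class."""
    if indel_pct >= 60:
        return "High"
    if indel_pct >= 30:
        return "Medium"
    return "Low"


for score in (72.5, 45.0, 12.3):
    print(score, classify_activity(score))
```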
About DeepSpCas9
DeepSpCas9 predicts SpCas9 editing efficiency with a deep convolutional neural network (CNN) trained on 12,832 target sequences measured in human cells.
Architecture
3-branch CNN (inception-style)
100 / 70 / 40 filters at 3 / 5 / 7 nt widths
Avg Pool → Flatten → Concat
All 3 branches merged into fully connected layers
FC(80) → FC(60) → Linear output
Dropout 0.3 at each layer, regression target = indel %
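To make the branch-and-merge structure concrete, the sketch below works out how many features each branch would contribute to the concatenated vector. The filter counts, kernel widths, and 30-nt input length come from the description above. The unpadded ("valid") convolution and the average-pooling width of 2 are assumptions made for illustration, since neither is stated here.

```python
SEQ_LEN = 30      # input length in nucleotides
POOL_WIDTH = 2    # ASSUMPTION: pooling width and stride are not stated above

# kernel width (nt) -> number of filters, as listed in the architecture
BRANCHES = {3: 100, 5: 70, 7: 40}


def branch_features(kernel_width: int, n_filters: int) -> int:
    """Flattened feature count of one branch after conv + average pooling."""
    conv_len = SEQ_LEN - kernel_width + 1   # ASSUMPTION: no padding
    pooled_len = conv_len // POOL_WIDTH     # non-overlapping pooling windows
    return pooled_len * n_filters


per_branch = {w: branch_features(w, f) for w, f in BRANCHES.items()}
concat_size = sum(per_branch.values())
print(per_branch, concat_size)
```

Under these assumptions the three branches contribute 1,400, 910, and 480 features, so FC(80) would receive a 2,790-dimensional input. A different padding or pooling choice changes these numbers.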
Training data
12,832 SpCas9 target sequences in human cells
Input Rules
- Exactly 30 nucleotides per sequence
- Only A / T / G / C — no ambiguous bases
- PAM region (pos 25–27) should be NGG
- One sequence per line, or standard FASTA format
- Up to 100 sequences per run
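The rules above could be checked before submission with a short parser. This is a minimal sketch, not the tool's actual validation: `parse_input` and `has_ngg_pam` are hypothetical names, each FASTA record is assumed to hold its 30-nt sequence on a single line, and a non-NGG PAM is treated as a warning rather than an error because the rule says "should".

```python
import re

MAX_SEQUENCES = 100
VALID_TARGET = re.compile(r"[ATGC]{30}")


def parse_input(text: str) -> list:
    """Parse one-per-line or FASTA input into (name, sequence) pairs."""
    records = []
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):          # FASTA header names the next sequence
            name = line[1:].strip() or None
            continue
        if not VALID_TARGET.fullmatch(line):
            raise ValueError(f"need exactly 30 nt of A/T/G/C, got {line!r}")
        records.append((name or f"seq{len(records) + 1}", line))
        name = None
    if len(records) > MAX_SEQUENCES:
        raise ValueError(f"at most {MAX_SEQUENCES} sequences per run")
    return records


def has_ngg_pam(seq: str) -> bool:
    """True if pos 26-27 (1-based) are GG; pos 25 may be any base."""
    return seq[25:27] == "GG"


# Arbitrary illustrative sequences, not measured data.
demo_text = ">demo\nACGTGATTACAGATTACAGATTACTGGACT\nACGTGATTACAGATTACAGATTACTACACT\n"
for rec_name, rec_seq in parse_input(demo_text):
    print(rec_name, rec_seq, "PAM ok" if has_ngg_pam(rec_seq) else "warning: PAM is not NGG")
```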
Performance note:
The first request loads the TensorFlow model (~2–5 s). Subsequent predictions are fast (≪1 s per batch).
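A slow first request followed by fast ones is what a load-once, reuse-afterwards pattern produces. The sketch below illustrates that pattern with a stand-in loader. It is an assumption about how such a service might be structured, not the tool's actual code, and `get_model` is a hypothetical name.

```python
import time
from functools import lru_cache


@lru_cache(maxsize=1)
def get_model():
    """Load the model on first use; later calls return the cached object."""
    # Stand-in for the real, slow load step (e.g. restoring TensorFlow weights).
    time.sleep(0.1)
    return object()


start = time.perf_counter()
first = get_model()              # pays the load cost
first_elapsed = time.perf_counter() - start

start = time.perf_counter()
second = get_model()             # served from the cache
second_elapsed = time.perf_counter() - start

print(f"first call: {first_elapsed:.3f} s, second call: {second_elapsed:.6f} s")
```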
AI Assistant
Hello! How can I help you?