List of DNA sequences

Molecular dynamics (MD) data for sequences 1-36 and 54-56 below were produced by the ABC consortium. We extended the original ABC data set with the oligomers labelled 37-53, which, when both the reference and complementary strands are considered, contain all 16 possible end dimers. The MD simulation protocol can be found in:

Oligomers 53-56 were excluded from our training data set because, after filtering for broken hydrogen bonds, relatively few snapshots remained for those sequences.

No.  Sequence              No.  Sequence
1    GCTATATATATATATAGC    29   GCTAGATAGATAGATAGC
2    GCATTAATTAATTAATGC    30   GCGCGGGCGGGCGGGCGC
3    GCGCATGCATGCATGCGC    31   GCGTGGGTGGGTGGGTGC
4    GCCTAGCTAGCTAGCTGC    32   GCACTAACTAACTAACGC
5    GCCGCGCGCGCGCGCGGC    33   GCGCTGGCTGGCTGGCGC
6    GCGCCGGCCGGCCGGCGC    34   GCTATGTATGTATGTAGC
7    GCTACGTACGTACGTAGC    35   GCTGTGTGTGTGTGTGGC
8    GCGATCGATCGATCGAGC    36   GCGTTGGTTGGTTGGTGC
9    GCAAAAAAAAAAAAAAGC    37   AAACAATAAGAA
10   GCCGAGCGAGCGAGCGGC    38   AAAGAACAATAA
11   GCGAAGGAAGGAAGGAGC    39   AAATAACAAGAA
12   GCGTAGGTAGGTAGGTGC    40   GGGAGGTGGCGG
13   GCTGAGTGAGTGAGTGGC    41   GGGCGGAGGTGG
14   GCAGCAAGCAAGCAAGGC    42   GGGCGGTGGAGG
15   GCAAGAAAGAAAGAAAGC    43   GGGTGGAGGCGG
16   GCGAGGGAGGGAGGGAGC    44   GGGTGGCGGAGG
17   GCGGGGGGGGGGGGGGGC    45   AAATAAAAATAAGAACAA
18   GCAGTAAGTAAGTAAGGC    46   AAATAACAATAAGAACAA
19   GCGATGGATGGATGGAGC    47   GGGAGGGGGAGGCGGTGG
20   GCTCTGTCTGTCTGTCGC    48   GACATGGTACAG
21   GCACAAACAAACAAACGC    49   ACGATCCTAGCA
22   GCAGAGAGAGAGAGAGGC    50   ATGCTAATCGTA
23   GCGCAGGCAGGCAGGCGC    51   AGCTGAAGTCGA
24   GCTCAGTCAGTCAGTCGC    52   CGAACTTCAAGC
25   GCATCAATCAATCAATGC    53   GTCTACCATCTG
26   GCGTCGGTCGGTCGGTGC    54   GCATAAATAAATAAATGC
27   GCTGCGTGCGTGCGTGGC    55   GCATGAATGAATGAATGC
28   GCACGAACGAACGAACGC    56   GCGACGGACGGACGGAGC
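
The statement that oligomers 37-53 contain all 16 possible end dimers can be checked with a short script. The sketch below is illustrative only and is not part of the original analysis. It assumes that an "end dimer" means the first two bases of a strand read 5' to 3', taken once for the reference strand and once for its Watson-Crick complement. The names `OLIGOMERS_37_53`, `reverse_complement`, `end_dimers` and `covered_end_dimers` are hypothetical.

```python
# Oligomers 37-53, copied from the table above.
OLIGOMERS_37_53 = {
    37: "AAACAATAAGAA", 38: "AAAGAACAATAA", 39: "AAATAACAAGAA",
    40: "GGGAGGTGGCGG", 41: "GGGCGGAGGTGG", 42: "GGGCGGTGGAGG",
    43: "GGGTGGAGGCGG", 44: "GGGTGGCGGAGG",
    45: "AAATAAAAATAAGAACAA", 46: "AAATAACAATAAGAACAA",
    47: "GGGAGGGGGAGGCGGTGG",
    48: "GACATGGTACAG", 49: "ACGATCCTAGCA", 50: "ATGCTAATCGTA",
    51: "AGCTGAAGTCGA", 52: "CGAACTTCAAGC", 53: "GTCTACCATCTG",
}

# Watson-Crick base pairing used to build the complementary strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# All 16 dimers over the alphabet {A, C, G, T}.
ALL_DIMERS = {a + b for a in "ACGT" for b in "ACGT"}


def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def end_dimers(seq):
    """Assumed definition: the 5'-end dimer of each of the two strands."""
    return {seq[:2], reverse_complement(seq)[:2]}


def covered_end_dimers(sequences):
    """Union of the end dimers over a collection of sequences."""
    covered = set()
    for seq in sequences:
        covered |= end_dimers(seq)
    return covered


if __name__ == "__main__":
    missing = ALL_DIMERS - covered_end_dimers(OLIGOMERS_37_53.values())
    print("missing end dimers:", sorted(missing))
```

Under this definition the script reports no missing dimers for oligomers 37-53. By contrast, a sequence such as number 1, which starts and ends with GC, contributes only the GC end dimer.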

Modified sequences, obtained as point mutations of sequence 1 in the table above.

S1_6T        GCTATTTATATATATAGC
S1_6T_13A    GCTATTTATATAAATAGC
S1_9C        GCTATATACATATATAGC
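
The labels of the modified sequences appear to encode the mutations directly: `S1` for sequence 1, followed by one or more `<position><new base>` tokens, with positions counted from 1 at the 5' end of the reference strand. This convention is inferred from the three entries above and is not stated explicitly in the source. A minimal sketch that rebuilds the mutated sequences under that assumption (the function name `apply_point_mutations` is hypothetical):

```python
SEQUENCE_1 = "GCTATATATATATATAGC"


def apply_point_mutations(seq, label):
    """Apply the substitutions encoded in a label such as 'S1_6T_13A'.

    Assumed convention: the first '_'-separated field names the parent
    sequence; each later field is a 1-based position followed by the new base.
    """
    bases = list(seq)
    for token in label.split("_")[1:]:
        position, new_base = int(token[:-1]), token[-1]
        if new_base not in "ACGT":
            raise ValueError("unknown base in token: " + token)
        if not 1 <= position <= len(bases):
            raise ValueError("position out of range in token: " + token)
        bases[position - 1] = new_base  # convert to 0-based index
    return "".join(bases)


if __name__ == "__main__":
    for label in ("S1_6T", "S1_6T_13A", "S1_9C"):
        print(label, apply_point_mutations(SEQUENCE_1, label))
```

Run on sequence 1, the three labels reproduce the three modified sequences listed above.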