Mathematical modelling of DNA
1st and 2nd year courses in math or physics (or with teacher's permission)
Helpful although not required
This course is designed to be an introduction, within the particular context of DNA, to the interplay between analysis, computation and experiment that makes up the process called mathematical modelling. In addition to students whose primary interest is in DNA, the syllabus is intended for students wishing an introduction to the modelling process in general, and the course will describe a number of widely encountered mathematical and computational techniques.
The course will be a detailed introduction to the cgDNA sequence-dependent coarse grain model of DNA. It covers both how to use the model, together with an associated Monte Carlo code, to predict various biologically pertinent sequence-dependent expectations, and the underlying applied mathematics needed to estimate cgDNA parameter sets from a library of Molecular Dynamics simulations. The cgDNA model is a research tool that has its own web page. The course will work through the details of the publications described on that page, specifically the cgDNA references listed below.
The course has five chapters.
0) Introduction to DNA and a brief overview of its coarse grain models.
1) The sequence-dependent, rigid-base cgDNA model.
2) Monte Carlo methods for sampling cgDNA model equilibrium distributions and application to DNA persistence lengths.
3) Parameter estimation for the cgDNA model from Molecular Dynamics time series.
4) Equality constrained nonlinear optimisation with application to computing cgDNA equilibria.
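As a small illustration of the sampling problem named in Chapter 2, the following Python sketch draws samples from a Gaussian density proportional to exp(-(w - mu)^T K (w - mu) / 2) by means of a Cholesky factorisation of the stiffness matrix K. It is a minimal sketch under stated assumptions, not a description of the course's Monte Carlo code: the function name `sample_gaussian`, the tridiagonal toy matrix and the toy mean are all hypothetical and carry no cgDNA parameter values.

```python
import numpy as np


def sample_gaussian(mu, K, n_samples, rng):
    """Draw samples from the density proportional to exp(-0.5 (w-mu)^T K (w-mu)).

    K is a symmetric positive definite stiffness (inverse covariance) matrix.
    Returns an array with one sample per column.
    """
    # Factorise K = L L^T with L lower triangular.
    L = np.linalg.cholesky(K)
    # If z ~ N(0, I), then y solving L^T y = z has covariance (L L^T)^{-1} = K^{-1}.
    z = rng.standard_normal((mu.size, n_samples))
    y = np.linalg.solve(L.T, z)
    return mu[:, None] + y


# Toy stiffness (hypothetical values): symmetric, tridiagonal and strictly
# diagonally dominant, hence positive definite.
n = 6
K = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
mu = np.linspace(0.0, 1.0, n)

rng = np.random.default_rng(0)
samples = sample_gaussian(mu, K, 200_000, rng)

# The empirical mean and covariance should approach mu and K^{-1}.
mean_error = np.max(np.abs(samples.mean(axis=1) - mu))
cov_error = np.max(np.abs(np.cov(samples) - np.linalg.inv(K)))
print(mean_error, cov_error)
```

The generic dense solver is used here only for brevity; for a banded stiffness matrix a banded or triangular solver would exploit the structure.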
|Week 1 (20.2)||Description of the basic structure of DNA, and multiscaling (or coarse graining) approaches. The need for a tertiary structure model of DNA, i.e. a sequence-dependent coarse grain model. Overview of the cgDNA coarse grain model to predict a Gaussian PDF for the configuration distribution of a DNA fragment of given sequence. (Three periods lecture, one period exercises.) The supplementary material for this first lecture is linked here.|
|Week 2 (27.2)||Coarse graining groups of atoms (in our case the atoms forming a base) to a rigid body or frame (R, r), with the data structure R ∈ SO(3), r ∈ R^3. The Lie group SE(3) of rigid body displacements and its 4x4 matrix representation.|
|Week 3 (6.3)||Relative coordinates of a double chain of rigid bodies. Definition of Watson (or reading) and Crick strands, and the re-embedding of frames on the Crick strand. Here you can find the supplementary material of  , see Bibliography at the bottom of the page. This week we covered pages 1-2 until Figure S3.|
|Week 4 (13.3)||Completion of the cgDNA internal coordinates. Watson or reading strand, and definition of base, base-pair and junction frames. cgDNA model configuration coordinates: translations expressed in mid-frames (the base-pair frame between two base frames for intras, the junction frame between two base-pair frames for inters) and Cayley vectors of relative rotations for both intra and inter relative rotations (with matrix multiplication on the right). Sketch of the transformation of frames under a Crick-Watson change of reading strand, and the associated transformation rules for cgDNA coordinates (detailed treatment in an exercise session later in the semester). Start of the cgDNA model, which constructs a Gaussian PDF approximation to the equilibrium distribution for a DNA fragment in solution, given its sequence and a cgDNA parameter set. (Much of the material of these lectures is covered in pages 2-5 of the PDF linked under the Week 3 summary.)|
|Week 5 (20.3)||Completion of the definition and assumptions underlying the cgDNA model free energy and Gaussian PDF. Nearest-neighbour interactions and bandedness. Localised sequence-dependence of stiffness and sigmas, and non-local sequence dependence of mu. Structure of a cgDNA parameter set. End of Chapter 1.|
|Weeks 6 - 7 - 8 (27.3 - 3.4 - 10.4)||What can be done with the cgDNA model? Discussion of i) expectations, chain correlations and persistence lengths, and ii) probabilities and looping experiments. Numerical approximations of both via the cgDNAmc Monte Carlo code. Definition and analytical computation of persistence lengths in a simplified model (a version of the Helical Worm Like Chain or HWLC model), and relation to numerics for the cgDNA model. Lecture notes (polycopies) are available for the material of Weeks 6 and 7 and for the Monte Carlo part of Week 8. The shape factorised persistence length was introduced and is treated in Exercise Session 7.|
|Week 9 (17.4)||Easter break.|
|Week 10 (24.4)||Start of Chapter 3, estimation of a cgDNA parameter set. Recap: definition and assumptions underlying the cgDNA model free energy and Gaussian PDF. Nearest-neighbour interactions and bandedness. Localised sequence-dependence of stiffness and sigmas. Sufficient conditions for a) the stiffness matrix to be positive definite for all sequences, and b) parameters for palindromic sequences to satisfy Crick-Watson symmetry conditions. Count of the number of independent scalar parameters in a cgDNA parameter set.|
|Week 11 (1.5)||Maximum likelihood and start of maximum entropy parameter estimation, giving rise to Gaussians with both unbanded and banded stiffness matrices. Jensen's inequality. Entropy and Kullback-Leibler relative entropy.|
|Week 12 (8.5)||Computation of first-order necessary conditions for maximum entropy fits, and use of the constraints to determine the associated Lagrange multipliers.|
|Week 13 (15.5)||How to design a good sequence library. How to better estimate the oligomer-based ground state mu(S) and covariances from MD time series for palindromic sequences. Units of cgDNA internal coordinates and the internal rescaling chosen in the cgDNA model between rotation and translation coordinates.|
|Week 14 (22.5)||Extraction of a cgDNAparamset using a sum of Kullback-Leibler divergences as the objective fitting functional. Some details about cgDNAparamset2 and the ABC molecular dynamics simulation project. A palindromic sequence library. The simpler case study of fitting with an L^2 norm objective functional and the associated least squares approach to computing a paramset: an illustration of how to deal with triple overlap blocks in cgDNA stiffness matrices, and the associated null space in the parameter set. For the least squares system we refer to paragraphs 7.4 and 7.5 of .|
|Week 15 (29.5)||Monday exercise session and Friday exercise session/demo of cgDNAeq. Complementary notes that may be useful for Session 13 can be found here.|
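The Week 2 entry in the schedule above mentions the 4x4 matrix representation of SE(3). The Python sketch below shows one common way to store a frame (R, r) as a homogeneous matrix, so that composition of rigid body displacements becomes matrix multiplication. The helper names `to_homogeneous`, `inverse` and `rot_z`, and the numerical values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def to_homogeneous(R, r):
    """Pack a rotation R in SO(3) and a translation r in R^3 into a 4x4 matrix."""
    g = np.eye(4)
    g[:3, :3] = R
    g[:3, 3] = r
    return g


def inverse(g):
    """Inverse of a rigid body displacement: (R, r)^{-1} = (R^T, -R^T r)."""
    R, r = g[:3, :3], g[:3, 3]
    return to_homogeneous(R.T, -R.T @ r)


def rot_z(theta):
    """Rotation by angle theta (radians) about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Two example frames (hypothetical values).
g1 = to_homogeneous(rot_z(0.3), np.array([1.0, 0.0, 0.0]))
g2 = to_homogeneous(rot_z(0.5), np.array([0.0, 2.0, 3.4]))

# Composition: (R1, r1)(R2, r2) = (R1 R2, R1 r2 + r1).
g12 = g1 @ g2

# Relative displacement from frame 1 to frame 2, expressed in frame 1.
g_rel = inverse(g1) @ g2
```

Storing frames this way keeps the group operations of SE(3) as plain matrix products, at the cost of carrying a constant bottom row (0, 0, 0, 1).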
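The Week 4 entry above describes relative rotations parametrised by Cayley vectors, with matrix multiplication on the right. As a hedged sketch, the Python code below computes the relative rotation Q = R1^T R2 (so that R2 = R1 Q) and its Cayley vector. It assumes the normalisation in which the Cayley vector has length 2 tan(theta/2) for a rotation through angle theta; the scaling actually used in the cgDNA model may differ, and the function names are hypothetical.

```python
import numpy as np


def hat(v):
    """Skew-symmetric matrix such that hat(v) @ w equals the cross product v x w."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])


def cayley(eta):
    """Rotation matrix with Cayley vector eta (assumed scaling |eta| = 2 tan(theta/2))."""
    A = 0.5 * hat(eta)
    I = np.eye(3)
    return (I + A) @ np.linalg.inv(I - A)


def cayley_vector(Q):
    """Inverse of cayley(); undefined for rotations through the angle pi."""
    I = np.eye(3)
    S = 2.0 * (Q - I) @ np.linalg.inv(Q + I)
    return np.array([S[2, 1], S[0, 2], S[1, 0]])


def rot_z(theta):
    """Rotation by angle theta (radians) about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Two example frame orientations (hypothetical values).
R1, R2 = rot_z(0.2), rot_z(0.5)

# Relative rotation with multiplication on the right: R2 = R1 @ Q.
Q = R1.T @ R2
eta = cayley_vector(Q)
```

Here Q is a rotation through 0.3 radians about the z axis, so under the assumed scaling `eta` points along z with length 2 tan(0.15). The translations expressed in mid-frames are not covered by this sketch.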
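Weeks 11 and 14 above refer to the Kullback-Leibler divergence between Gaussians as a fitting functional. The Python sketch below evaluates the standard closed-form expression for the divergence between two Gaussians specified by their means and stiffness (inverse covariance) matrices. The function name `kl_gaussian` and the choice of argument order are assumptions made for illustration; the schedule does not state in which order the two distributions enter the course's objective functional.

```python
import numpy as np


def kl_gaussian(mu0, K0, mu1, K1):
    """Kullback-Leibler divergence KL(N0 || N1) between two Gaussians.

    N0 has mean mu0 and stiffness K0, N1 has mean mu1 and stiffness K1,
    where a stiffness matrix is the inverse of the covariance matrix.
    """
    n = mu0.size
    d = mu1 - mu0
    # tr(Sigma1^{-1} Sigma0) = tr(K1 K0^{-1}) = tr(K0^{-1} K1).
    trace_term = np.trace(np.linalg.solve(K0, K1))
    # Mahalanobis term (mu1 - mu0)^T Sigma1^{-1} (mu1 - mu0).
    mean_term = d @ K1 @ d
    # ln(det Sigma1 / det Sigma0) = ln det K0 - ln det K1.
    _, logdet_K0 = np.linalg.slogdet(K0)
    _, logdet_K1 = np.linalg.slogdet(K1)
    return 0.5 * (trace_term + mean_term - n + logdet_K0 - logdet_K1)


# One-dimensional check (hypothetical values): N0 = N(0, 1), N1 = N(1, 4).
kl_1d = kl_gaussian(np.array([0.0]), np.array([[1.0]]),
                    np.array([1.0]), np.array([[0.25]]))

# The divergence of a distribution from itself vanishes.
K = np.array([[2.0, -0.5], [-0.5, 1.0]])
mu = np.array([0.3, -0.1])
kl_self = kl_gaussian(mu, K, mu, K)
```

In the one-dimensional check the result agrees with the textbook formula ln(sigma1/sigma0) + (sigma0^2 + (mu0 - mu1)^2) / (2 sigma1^2) - 1/2, which here equals ln 2 - 1/4.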
The following references for the cgDNA model are available on the cgDNA web page.
- A DNA Coarse-Grain Rigid Base Model and Parameter Estimation from Molecular Dynamics Simulations, D. Petkevičiūtė, Thesis #5520, EPFL (2012).
- cgDNA: a software package for the prediction of sequence-dependent coarse-grain free energies of B-form DNA, D. Petkevičiūtė, M. Pasi, O. Gonzalez and J. H. Maddocks, Nucleic Acids Research 42, no. 20 (2014), p. e153.
- A sequence-dependent rigid-base model of DNA, O. Gonzalez, D. Petkevičiūtė and J. H. Maddocks, Journal of Chemical Physics 138, no. 5 (2013), p. 055122, 1-28.
References for general books on DNA.
 Understanding DNA: The Molecule & How It Works, C. R. Calladine, H. R. Drew, B. F. Luisi, A. A. Travers, Third Edition, 2004, Academic Press, ISBN 9780121550893.
Summary: Understanding DNA explains, step by step, why DNA forms specific structures, the form of these structures and how they fundamentally affect the biological processes of transcription and replication.
 Unraveling DNA: The Most Important Molecule of Life, M. D. Frank-Kamenetskii, Revised and Updated Edition, 1997, Perseus Publishing, ISBN 9780201155846.
Summary: A curious blend of history and biographical details, covering the development of molecular biology from the influence of physicists earlier in the century, through the central dogma of molecular biology, to a discussion of the social issues raised by genetic engineering.
 DNA topology A. D. Bates & A. Maxwell, 2005, Oxford University Press, ISBN 9780198506553.
Summary: A clear, concise explanation of the relevance of supercoiling and catenation in the context of biological activity of the DNA molecule.
 DNA Structure and Function, R. R. Sinden, 1994, Academic Press, ISBN 9780126457506.
Summary: A timely resource that provides a simple yet comprehensive introduction to nearly all aspects of DNA structure. It also explains current ideas on the biological significance of classic and alternative DNA conformations.