| |
Lectures by the
Computational Molecular
Biology Department at the
Max Planck Institute for
Molecular Genetics
Statistics, Probability, and Algorithms in
Bioinformatics (SPAß)
(Algorithmen
für Stochastik und Stochastik für Algorithmen)
Lecture and Exercises
|
Vorlesung (Lecture) |
Übung (Exercises) |
| Nr. |
19 706 |
19 706 |
| SWS |
2 |
2 |
| Credits |
3 |
3 |
| Time |
Wed 14:00 |
Wed 15:30 |
| Place |
MATH 025/026 Arnimallee 2-6 |
MATH 025/026 Arnimallee 2-6 |
| Dozenten |
Sven Rahmann
Steffen Grossmann
|
Sven Rahmann
Steffen Grossmann
|
| |
11.02. In the week before the Klausur, an uncorrected
version of the lecture notes sources and the
resulting pdf is available.
03.12. Because of the ongoing strike, today's lecture will
take place at the MPI for Molecular Genetics, Ihnestr. 73, 3rd floor
in the small library room.
26.11. Because of the strike, the lecture could not take
place today. The problem sets that were due today can be handed in
next week. Also, be prepared for more and more difficult problems next
week. You might also start reading about the accept-reject method for
random number generation. (Even though there is a strike, there is no
reason to learn less.)
19.11. About note-taking: I will say in class how to obtain
the most recent version of the lecture notes and the source files.
Please do not change any of those files or hand in modified
versions. Instead, please send me your notes in a file
notes_yourname.tex. It should contain a chapter and
any new commands that you might need (but please use the provided
commands as far as possible). Look at previous lecture notes for
examples.
31.10. In problem 6 of the exercises, the file
intronlengths.txt that you can download from the website,
contains the lengths of individual introns, not the length
distribution! So the mean intron length is approximately 882. Thanks
to Jonas, and sorry about the confusion.
28.10. Lectures start each Wednesday at 14:00,
not 14:15. They take place in Room 025/026 in the Pi-building,
Arnimallee 2-6. The exercises start immediately after the
lecture, at 15:30.
16.10. I put the first set of exercises online beforehand,
so you know what to expect...
This lecture is about fun (Spaß): Statistics, Probability,
and Algorithms in Bioinformatics. In other words, we'll be looking at
the many situations where stochastics and algorithms meet. After this
course, the successful participant will be well prepared with a
toolbox of methods from both worlds and their applications to sequence
analysis.
| Date |
Topics |
Materials |
| 22.10. | Distributions and their representation. Parametric and non-parametric distributions. Discrete and continuous distributions. The IEEE floating point standard (you may use this document as a reference). | Problem Set 1 printieee.c
| | 29.10. | Important parametric distributions and their pdf (pmf, ff), cdf, qf. Moments of distributions. These pages at MathWorld contain all the information you'll ever need to know (and more). | Problem Set 2 Intron lengths
|
| 5.11. | Computing with small probabilities. The lngamma function. See Numerical Recipes in C.
| Problem Set 3
|
| 12.11. | Generation of uniform random numbers. | Problem Set 4
|
| 19.11. | Generation of random numbers with given distributions: Distribution-specific methods
| Problem Set 5
|
| 26.11. | The lecture could not take place because of the strike
|
| 3.12. | Accept-reject method for random number
generation. Reminder on statistical testing. Location: MPI for Molecular Genetics, Ihnestr. 73, 3rd floor, small library.
| Problem Sets 6+7
|
| 10.12. | Quality tests for random number generators.
| Problem Set 8
|
| 17.12. | Sampling of combinatorial structures and objects with
specified properties.
| Problem Set 9
|
| 7.1. | Statistics of sequence alignment I (Steffen Grossmann).
| Problem Set 10
|
| 14.1. | Statistics of sequence alignment II (Steffen Grossmann).
| Problem Set 11
|
| 21.1. | MCMC methods.
| Problem Set 12
|
| 28.1. | MCMC examples. Simulated annealing.
| Problem Set 13
|
| 4.2. | Importance Sampling. Generating functions.
|
| 11.2. | Review.
|
| 18.2. | Final exam (Klausur).
|
Vorlesung und Übung können nur zusammen belegt werden. Die
beiden Teilnoten ergeben gemittelt die Gesamtnote.
- Vorlesung (3 credits):
-
Ausarbeiten eines Skripts zu einer Vorlesungsstunde mit LaTeX (50% der
Note); Bestehen der Klausur am Ende (50% der Note). Beide Noten
müssen mindestens ausreichend (4.0) sein.
- Übungen (3 credits):
-
Die Note richtet sich nach der erreichten Gesamtpunktzahl in den
Übungsaufgaben. Zum Bestehen sind mindestens 50% zu erreichen.
Ein großer Teil der Aufgaben wird aus Projekten
bestehen. Eventuell lassen sich aus einzelnen Projekten Themen
für Master-Arbeiten ableiten.
-
William H. Press, Brian P. Flannery, Saul A. Teukolsky und William T. Vetterling
Numerical Recipes in C, Second Edition
Cambridge University Press
A classic.
-
John A. Rice
Mathematical Statistics and Data Analysis, Second Edition
Duxbury Press
Relatively expensive, but easy-to-read introductory statistics text.
-
W.N. Venables, D.M. Smith, and the R Development Core Team
An introduction to R (Notes on R: A Programming Environment for Data Analysis and Graphics, Version 1.8.0, 2003-10-08)
Official web page of the R project: http://www.r-project.org
R is a free version of the commercial Splus software. Look here
for further books on R.
-
Christian P. Robert und George Casella
Monte Carlo Statistical Methods
Springer
Not very easy to read. Has a lot of material about random number generation and MCMC methods. Contains quite a number of misprints for a SPringer statistics text.
-
Ronald L. Graham, Donald E. Knuth und Oren Patashnik
Concrete Mathematics
Addison-Wesley
Contains all the mathematics a programmer or algorithmician will
ever need -- and more. Not very easy material, but extremely well
written..
-
Robert Sedgewick und Philippe Flajolet
Analysis of Algorithms
Addison-Wesley
Has many examples about how to use generating functions for the analysis of algorithms.
Weitere folgen. In der Vorlesung werde ich jeweils einzelne Themen
und Beispiele aus den genannten Büchern herausgreifen.
Zur Vorlesung soll von den Teilnehmern ein Skript erstellt
werden. Deswegen wird jeder Teilnehmer mindestens einmal eine
Vorlesungsstunde als LaTeX Skript nachbereiten. Dies macht 50% der
Vorlesungsnote aus.
Homepage der Abteilung Computational Molecular Biology am Max-Plack-Institut für Molekulare Genetik.
Zum Studiengang Bioinformatik an der FU Berlin.
Anmerkungen und Fragen zu dieser Seite bitte an Sven Rahmann. Letzte Änderung: Wednesday, 11-Feb-2004 18:54:38 CET
|
|