Schedule UCSC BME 205 Fall 2012

Bioinformatics: models and algorithms

(Last Update: 19:49 PST 9 December 2012)

Lecture and Homework Schedule

The lecture schedule reflects the material actually delivered. To get an idea of what is coming up, see the Fall 2010 schedule.

Date Lecture Topic(s) Due
Fri 26 Sept 2012 administrivia, texts, assignments, structure of course. Due: intake survey
Mon 1 Oct 2012 Guest Lecture, Andrew Uzilov, Python tutorial
Wed 3 Oct 2012 Discussion of UNIX permissions, chmod, how to submit programs. What to comment in a program (meanings of variables, external view of functions and whole program). Why sequences are key to much of bioinformatics (low cost of DNA sequencing). Error rates of information copying (polymerases):
DNA->DNA: Taq 1/125,000; Pfu 1/2,300,000; mitochondrial pol gamma 1/300,000-1/500,000; pol eta 1/18-1/38
RNA->DNA: HIV 1/1700-1/8000; SIV 1/19,000; Accuscript RT 1/62,000
RNA->RNA: flu 1/10,000; Qbeta 1/3,000.
Ribosome: E. coli 1/3000; generally 1/2000-1/20,000.
Fri 5 Oct 2012 error rates and read lengths in sequencing. Notion of stochastic model as computable probability function. Due: Scaffolding
Mon 8 Oct 2012 Review of returned homework: re.findall vs. re.split (keep re.compile outside loop), indent comments same as code, format() now preferred for Python, __doc__. Review definition of probability function, uniform distribution, i.i.d., event spaces, distribution over sequences of length n, lead up to length distribution * conditional distribution.
Wed 10 Oct 2012 Discussion of FASTA/FASTQ format details. Conditional probability, i.i.d. model, log probability, adding probabilities stored as log probabilities.
Fri 12 Oct 2012 Mainly about fellowship applications. Due: FASTA/FASTQ
Mon 15 Oct 2012 Review of returned homework: suggestions about argparse (store_const and file opening), list comprehensions, izip_longest, translate. Markov chains: sum to 1 over strings of fixed length. Markov chains with stop: sum to 1 over all strings (left as homework).
Wed 17 Oct 2012 Markov chains with stop do sum to 1. Generating and parsing sequences with Markov model. Training Markov model. Higher-order Markov model. MLE and MAP (pseudocounts) estimates of model parameters.
Fri 19 Oct 2012 entropy, encoding cost, information gain. Due: fellowship
Mon 22 Oct 2012 feedback on writing. Somewhat sloppy introduction to classifiers and converting P(data) stochastic models into classifiers.
Wed 24 Oct 2012 Details on Markov chains. Using (scaled) lower-order Markov chains for pseudocounts. Started "Better than Chance" talk, digression into P-values and E-values.
Fri 26 Oct 2012 middle of "Better than Chance" talk. Log scales for histograms. Log normal distribution for lengths. Local vs. global composition for null model. Due: Markov chains
Mon 29 Oct 2012 review of homework. Palindrome null as product of conditionally independent models of lower order. Contact prediction null (uniform versus conditioned on separation)
Wed 31 Oct 2012 Review of E-value and p-value. Markov's inequality proven. Arbitrary score function and null model converted to stochastic model. Substitution matrix as arbitrary score and as log( P_aligned(x,y)/(P(x)P(y)))
Fri 2 Nov 2012 Guest Lecture, Katie Fortney, Science Library. Due: palindromes
Mon 5 Nov 2012 Go over palindrome homework: efficient palindrome generation, citations, fellow students as audience for writing, handling wild cards in center of odd palindrome (SW vs. X, using sum of expected counts). Started alignments: alignment as pairing, as edit operations (and other views).
Wed 7 Nov 2012 some clarifications about null models, brief discussion of plotting and curve fitting. Alignment scoring, global vs. local, different gap cost models (# gaps, linear, affine, arbitrary increasing), alignment graph.
Fri 9 Nov 2012 dynamic programming and memoized algorithms. algorithm for 2D arbitrary gap cost O(m^2 n^2). algorithm for 1D arbitrary gap cost O(mn (m+n)). example for 1D algorithm. Due: null models
Mon 12 Nov 2012 Veterans Day, no class
Wed 14 Nov 2012 discussed homework returned. Looked at Pog contig graphs, for which I was not adequately prepared (I'd forgotten some of Bernick's notation). Referred students to Bernick's thesis.
Fri 16 Nov 2012 Cleaned up presentation of the Pog contig graphs. Had students help derive Needleman-Wunsch alignment.
Mon 19 Nov 2012 Further discussion of Pog contig graphs. Derivation by class of local alignment with linear gap cost.
Wed 21 Nov 2012 Derivation by students of Smith-Waterman. Presentation of traceback algorithm. Due: research paper
Fri 23 Nov 2012 Thanksgiving, no class
Mon 26 Nov 2012 re-presentation of traceback algorithm (example of global traceback). Intro to HMMs.
Wed 28 Nov 2012 return of research papers, feedback on writing, explanation of Bernick's computation of inversion frequencies (cancellation of mapping biases), derivation of HMM forward algorithm
Fri 30 Nov 2012 Due: affine-gap alignment
Mon 3 Dec 2012 Lagrange multipliers, Expectation Maximization, Backward algorithm, Baum-Welch training
Wed 5 Dec 2012 list of topics we could talk about. Genome assembly (read pairs and libraries, seqprep, k-mer cleanup, overlap graph)
Fri 7 Dec 2012 de Bruijn graph, k-mer counting tricks in Jellyfish (lock-free parallel hashing, packing key and count into one word). Due: degenerate codons
Wed 12 Dec 2012 8 a.m.–11 a.m. exam slot, not used
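
The 10 Oct entry mentions adding probabilities while working in log probability. One common way to do this is sketched below; it is an illustration only, not the code presented in lecture. The function name log_add is made up for this example, natural logarithms are assumed, and the code is written for Python 3.

```python
import math

def log_add(log_p, log_q):
    """Return log(p + q), given log_p = log(p) and log_q = log(q).

    The larger term is factored out so that exp() is only applied to a
    non-positive number, which avoids underflow to zero for tiny p and q.
    """
    # log(0) is represented as -inf; adding zero leaves the other term unchanged.
    if log_p == -math.inf:
        return log_q
    if log_q == -math.inf:
        return log_p
    hi, lo = max(log_p, log_q), min(log_p, log_q)
    # log(p + q) = hi + log(1 + exp(lo - hi))
    return hi + math.log1p(math.exp(lo - hi))
```

Calling log_add(math.log(0.25), math.log(0.5)) should agree with math.log(0.75) up to rounding, and the function still behaves sensibly for inputs such as -1000.0, where exp() of the input alone would underflow.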
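
The 15 and 17 Oct entries state that a Markov chain with a stop state sums to 1 over all strings. A small numerical check of that claim is sketched below. The two-letter alphabet, the transition probabilities in TRANS, and the function names are all invented for this illustration (Python 3); the only requirement is that each row sums to 1 and that stopping is possible from every state.

```python
from itertools import product

START, STOP = "^", "$"

# Hypothetical first-order transition probabilities; each row sums to 1.
TRANS = {
    START: {"A": 0.5, "B": 0.4, STOP: 0.1},
    "A": {"A": 0.3, "B": 0.4, STOP: 0.3},
    "B": {"A": 0.2, "B": 0.5, STOP: 0.3},
}

def string_probability(s):
    """Probability of emitting exactly the symbols in s and then stopping."""
    prob = 1.0
    prev = START
    for ch in s:
        prob *= TRANS[prev][ch]
        prev = ch
    return prob * TRANS[prev][STOP]

def total_probability(max_len):
    """Sum of string_probability over every string of length 0..max_len."""
    total = 0.0
    for n in range(max_len + 1):
        for chars in product("AB", repeat=n):
            total += string_probability(chars)
    return total
```

With these made-up numbers the chain continues with probability 0.9 from the start and 0.7 from either letter, so the mass still missing after length n is 0.9 * 0.7**n. The partial sums from total_probability therefore climb toward 1 as max_len grows, which is the behavior the lecture entry describes.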
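
The 19 Nov entry covers local alignment with linear gap cost. A minimal score-only sketch of that recurrence follows; it is not the derivation done in class. The function name and the match, mismatch, and gap values are placeholders chosen for this example (Python 3), and no traceback is performed.

```python
def local_alignment_score(x, y, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of x and y with a linear gap cost.

    Each cell holds the best score of an alignment ending at (i, j),
    floored at 0 so that an alignment may start anywhere.
    Only two rows are kept, so memory use is O(len(y)).
    """
    best = 0
    prev_row = [0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        row = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            sub = match if x[i - 1] == y[j - 1] else mismatch
            row[j] = max(
                0,                      # start a new local alignment here
                prev_row[j - 1] + sub,  # align x[i-1] with y[j-1]
                prev_row[j] + gap,      # x[i-1] aligned to a gap
                row[j - 1] + gap,       # y[j-1] aligned to a gap
            )
            best = max(best, row[j])
        prev_row = row
    return best
```

Keeping the full matrix instead of two rows would be needed for the traceback discussed on 21 and 26 Nov; the two-row version is enough when only the score is wanted.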



Questions about page content should be directed to Kevin Karplus
Biomolecular Engineering
University of California, Santa Cruz
Santa Cruz, CA 95064
USA
karplus@soe.ucsc.edu
1-831-459-4250
318 Physical Sciences Building
