CCB Courses

Selected Hopkins Courses in Genomics and Bioinformatics

Term | Course Name | Course Number | Instructor
Fall | Computational Genomics: Sequences | EN.601.447/647 | Ben Langmead
Spring | Genomic Data Science | EN.601.350 | Steven Salzberg
Spring | Sketching & Indexing for Sequences | EN.601.446/646 | Ben Langmead
Spring | Advanced Topics in Genome Data Analysis | EN.580.743 | Alexis Battle
Spring | Computational Genomics: Data Analysis | EN.601.448/648 | Alexis Battle
Spring | Computational Genomics: Applied Comparative Genomics | EN.601.749 | Mike Schatz
Spring | Genomic Technologies: Tools for Illuminating Biology and Dissecting Disease | ME.710.744 | Mike Beer
Spring | Nonlinear Dynamics of Biological Systems | EN.580.244 | Mike Beer
Fall | Computational Biomedical Research & Advanced Biomedical Research | EN.601.452 / AS.020.415 | Mike Schatz
Spring | Foundations of Computational Biology and Bioinformatics | EN.580.488/688 | Rachel Karchin
Spring | Computing the Transcriptome | EN.580.458/658 | Ela Pertea
Spring (2nd half) | Systems Biology of the Cell | EN.580.248 | Joel Bader
Spring (2nd half) | Methods in Nucleic Acid Sequencing | EN.580.454 | Winston Timp
Summer (1 week) | BCMB Computational Biology Bootcamp | ME.800.806 | Winston Timp

Course Descriptions

To register for courses, visit JHU's SIS site.

EN.601.447/647 Computational Genomics: Sequences

Your genome is the blueprint for the molecules in your body. It's also a string of letters (A, C, G and T) about 3 billion letters long. How does this string give rise to you? Your heart, your brain, your health? This, broadly speaking, is what genomics research is about. This course will familiarize you with a breadth of topics from the field of computational genomics. The emphasis is on current research problems, real-world genomics data, and efficient software implementations for analyzing data. Topics will include string matching, sequence alignment and indexing, assembly, and sequence models. The course will involve significant programming projects.

Prerequisite(s): EN.600.120/EN.601.220 AND EN.600.226/EN.601.226
Note: Students may receive credit for only one of EN.600.439, EN.600.639, EN.601.447, EN.601.647.

EN.601.350 Genomic Data Science

This course will use a project-based approach to introduce undergraduates to research in computational biology and genomics. During the semester, students will take a series of large data sets, all derived from recent research, and learn all the computational steps required to convert raw data into a polished analysis. Data challenges might include the DNA sequences from a bacterial genome project, the RNA sequences from an experiment to measure gene expression, the DNA from a human microbiome sequencing experiment, and others. Topics may vary from year to year. In addition to computational data analysis, students will learn to do critical reading of the scientific literature by reading high-profile research papers that generated groundbreaking or controversial results. [Applications] Recommended Course Background: Knowledge of the Unix operating system and programming expertise in a language such as Perl or Python.


EN.601.446/646 Sketching & Indexing for Sequences

Many of the world's largest and fastest-growing datasets are text, e.g. DNA sequencing data, web pages, logs and social media posts. Such datasets are useful only to the degree we can query, compare and analyze them. Here we discuss two powerful approaches in this area. First, we will cover sketching, which enables us to summarize very large texts in small structures that allow us to measure the sizes of sets and of their unions and intersections. This in turn allows us to measure similarity and find near neighbors. Second, we will discuss indexing (succinct and compressed indexes in particular), which enables us to efficiently search inside very long strings, especially in highly repetitive texts. The course will involve significant programming projects.
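
As a rough illustration of the sketching idea described above, the following is a minimal MinHash sketch in Python. It is not taken from the course materials; MinHash is one common sketching technique, and the function names, the k-mer length, and the number of hash functions are all assumptions chosen for illustration. Each sequence is reduced to a short list of minimum hash values, and the fraction of positions where two lists agree estimates the Jaccard similarity (intersection size divided by union size) of the underlying k-mer sets.

```python
import hashlib


def kmers(text, k):
    """Return the set of all length-k substrings of text."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def _hash(item, seed):
    """Hash item to a 64-bit integer; each seed acts as a different hash function."""
    data = f"{seed}:{item}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_sketch(items, num_hashes=256):
    """Summarize a non-empty set by its minimum hash value under each hash function."""
    return [min(_hash(x, seed) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sketch_a, sketch_b):
    """Estimate Jaccard similarity as the fraction of matching sketch positions."""
    matches = sum(1 for x, y in zip(sketch_a, sketch_b) if x == y)
    return matches / len(sketch_a)


def exact_jaccard(set_a, set_b):
    """Exact Jaccard similarity, for comparison with the estimate."""
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    # Two made-up sequences (illustrative data only).
    a = kmers("ACGTACGTTAGCCGATAGCTAGGCTAACGTTGCA", 4)
    b = kmers("ACGTACGTTAGCCGTTAGCTAGGCTAACGATGCA", 4)
    print("exact:   ", exact_jaccard(a, b))
    print("estimate:", estimate_jaccard(minhash_sketch(a), minhash_sketch(b)))
```

The sketch length stays fixed at `num_hashes` values no matter how long the input text is, which is what makes comparisons between very large datasets cheap; the price is that the similarity is an estimate whose accuracy improves as `num_hashes` grows.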

Prerequisite(s): EN.601.220 AND EN.601.226

EN.580.743 Advanced Topics in Genome Data Analysis

Genomic data is becoming available in large quantities, but understanding how genetics contributes to human disease and other traits remains a major challenge. Machine learning and statistical approaches allow us to automatically analyze and combine genomic data, build predictive models, and identify genetic elements important to disease and cellular processes. This course will cover current uses of statistical methods and machine learning in diverse genomic applications including new genomic technologies. Students will present and discuss current literature. Topics include personal genomics, integrating diverse genomic data types, new technologies such as single cell sequencing and CRISPR, and other topics guided by student interest. The course will include a project component with the opportunity to explore publicly available genomic data. Recommended Course Background: coursework in data science or machine learning.


EN.601.448/648 Computational Genomics: Data Analysis

Genomic data has the potential to reveal causes of disease, novel drug targets, and relationships among genes and pathways in our cells. However, identifying meaningful patterns from high-dimensional genomic data has required development of new computational tools. This course will cover current approaches in computational analysis of genomic data with a focus on statistical methods and machine learning. Topics will include disease association, prediction tasks, clustering and dimensionality reduction, data integration, and network reconstruction. There will be some programming and a project component.

Prerequisites: EN.601.226 or other programming experience, probability and statistics, linear algebra or calculus.
Note: Students may receive credit for only one of EN.600.438, EN.600.638, EN.601.448, EN.601.648.

EN.601.749 Computational Genomics: Applied Comparative Genomics

The goal of this course is to study the leading computational and quantitative approaches for comparing and analyzing genomes starting from raw sequencing data. The course will focus on human genomics and human medical applications, but the techniques will be broadly applicable across the tree of life. The topics will include genome assembly & comparative genomics, variant identification & analysis, gene expression & regulation, personal genome analysis, and cancer genomics. The grading will be based on assignments, a midterm & final exam, class presentations, and a significant class project. [Applications] Expected course background: familiarity with UNIX scripting and/or programming.


ME.710.744 Genomic Technologies: Tools for Illuminating Biology and Dissecting Disease



EN.580.244 Nonlinear Dynamics of Biological Systems

Analysis and simulation of nonlinear behavior in biological systems: bifurcations (cell-fate decision), limit cycles (cell-cycle, neuronal excitations), chaos, and maps. MATLAB will be used to simulate these systems and motivate nonlinear analytic tools and stability analysis.

Recommended course background: AS.110.201 Linear Algebra, AS.110.302 Differential Equations, or EN.553.292 Linear Algebra and Differential Equations.

EN.601.452 / AS.020.415 Computational Biomedical Research & Advanced Biomedical Research

This course for advanced undergraduates includes classroom instruction in interdisciplinary research approaches and lab work on an independent research project in the lab of a Bloomberg Distinguished Professor and other distinguished faculty. Lectures will focus on cross-cutting techniques such as data visualization, statistical inference, and scientific computing. In addition to two 50-minute classes per week, students will commit to working approximately 3 hours per week in the lab of one of the professors. The student and professor will work together to schedule the research project. Students will present their work at a symposium at the end of the semester.

Recommended course background: AS.110.201 Linear Algebra, AS.110.302 Differential Equations, or EN.553.292 Linear Algebra and Differential Equations.

EN.580.488/688 Foundations of Computational Biology and Bioinformatics

This course is designed to give students a foundation in the basics of statistical and algorithmic approaches developed in computational biology/bioinformatics over the past 30 years, while emphasizing the need to extend these approaches to emerging problems in the field. Topics covered include probabilistic modeling applied to biological sequence analysis, supervised machine learning, interpretation of genetic variants, cancer genomics bioinformatic workflows and computational immuno-oncology. Attending the lab section "Annotate Your Genome" is required.

Prerequisite(s): EN.601.220

EN.580.458/658 Computing the Transcriptome

This course will introduce computational tools used in the field of transcriptomics to analyze the genes and transcripts expressed in a living cell. Lectures will cover different practical ways to analyze large data sets generated by high-throughput RNA sequencing (RNA-seq) experiments, including alignment, assembly, and quantification. Students will learn how to use RNA-seq to answer questions such as: What is the complete set of human genes? How do we reconstruct the splice variants that are transcribed in different cell types and conditions? How do we compute which genes are differentially expressed between different RNA-seq datasets?

Prerequisite(s): (1) familiarity with Python or Perl, (2) familiarity with the Unix command-line environment, and (3) a basic understanding of programming in R.

EN.580.248 Systems Biology of the Cell

Cellular systems biology provides a theoretical and quantitative understanding of the interactions between DNA, RNA, and proteins that create the well-regulated system we call life. This course develops first-principles models for the central dogma of molecular biology: information flow through protein signal transduction pathways, gene regulation by protein-DNA physical interactions, transcription of DNA to RNA, translation of RNA to protein, and feedback regulation that closes the cycle. Topics include complex analysis and contour integrals, spectral transforms, linear models for cell signaling, positive and negative feedback, non-linearities introduced by saturation and cooperativity, information content and combinatorial regulation, and instabilities leading to cell fate specification.

Recommended Course Background: Linear Algebra, Systems and Controls, and programming.

EN.580.454 Methods in Nucleic Acid Sequencing

Sequencing technology is a rapidly progressing field that requires experience in both wet (molecular biology) and dry (computational analysis) techniques. This laboratory course will consist of three experimental modules that will provide students with valuable hands-on experience in DNA sequencing and analysis. Students will learn basic sequencing library preparation, perform sequencing experiments and analyze the resulting data. Experiments include human targeted sequencing, metagenomic sequencing and genome assembly.

Prerequisite(s): Students must have completed Lab Safety training prior to registering for this class.

ME.800.806 BCMB Computational Biology Bootcamp

This intensive one-week course is meant to immerse students in computation and to provide them with the foundational tools to apply modern computational techniques and appropriate statistics to their data.
