The “new biology” of the 21st century is increasingly dependent on mathematics, and preparing high school students to have both strong science and math skills has created major challenges for both disciplines. Researchers and educators in biology and mathematics have been working on a project to create high school teaching modules suitable for biology classrooms, mathematics classrooms, and jointly held classes, so that teachers have materials for crossing the boundaries that often separate the two disciplines. Biology topics such as evolution, ecology, bioinformatics, and epidemiology are interwoven with a variety of mathematical topics, including algebraic equations, matrix multiplication, algorithms, dynamic programming, probability, and graphing. These modules will be free to educators for a few years. We give an overview of the modules and describe how to obtain them.

Imagine walking into a high school classroom. Students are sitting in groups with their heads together, busily typing numbers into their graphing calculators. Short bursts of discussion occur, a few matrices (refer to Table 1 for a glossary of terms) are sketched, and more data are entered into the calculator. A casual passerby would likely assume that this is a math class, but as the teacher brings the students back into a discussion about the evolution of proteins, it becomes clear that this is a biology class. Perhaps it is not a typical biology class. But modules developed by the BioMath Connection project at DIMACS, the Center for Discrete Mathematics and Theoretical Computer Science at Rutgers University, and sponsored by the National Science Foundation, are helping to make such scenes more common in biology classrooms.

**Table 1.**

Term | Definition |
---|---|
Algebraic equation | Equations involving variables such as x, y, and t. |
Algorithm | A well-defined sequence of steps for the solution of some problem. Long division is an algorithm. |
Deductive reasoning | The process of drawing conclusions from given data, using accepted rules of logical deduction. Sufficient for proving assertions. |
Directed graph | Also called a “directed vertex-edge graph.” A set of objects called “vertices” and a set of “edges” that are connections from one vertex to another. Drawings of these often show vertices as circles and edges as arrows from one circle to another. Not all pairs of vertices need to be connected. |
Dynamic programming | An algorithm that solves large problems by solving many smaller problems, remembering their solutions, and reusing and combining those solutions to create a solution to the large problem. |
Inductive reasoning | The process of drawing conclusions from given data by generalizing the data and making statements that are reasonable to believe on the basis of the data. Not sufficient for rigorous proofs of assertions. |
Integers | The positive and negative whole numbers, together with zero: {…, −2, −1, 0, 1, 2, 3, …}. |
Iteration | Repetition, particularly of some step in an algorithm. |
Line of best fit | A line that most closely matches data points plotted in the plane, typically measured by summing the squares of the distances of the points to the line. |
Markov chains | A random process whereby some system moves from state to state, in which the probability of moving into any state depends only on the current state and not on any other past states. |
Matrices | Rectangular arrays of numbers, with a commonly accepted way of adding, subtracting, and multiplying them. |
Optimal local alignment | The best way of aligning two given strings, according to some criterion for measuring “goodness” of an alignment. |
Permutation | An arrangement or rearrangement of a set of objects. |
Predictive value | A measure of an object's usefulness for predicting some outcome. |
Rate of incidence | How frequently some event occurs. |

But wait, there's more! In a different part of the building, a math class is double checking the optimal local alignments they generated for two distantly related protein sequences, including a discussion of mutations and evolution. Here, the biological framework provides important context for the dynamic programming that forms the mathematical portion of the lesson. This module uses spider silk as a way to introduce proteins and evolution and then develops the alignment algorithms step by step, implementing them first by hand and then using Biology Student Workbench. This last step grabs the math students' attention. Students recognize that this free Internet resource puts at their fingertips all the databases and algorithms that scientists at the very frontiers of biology research are using. Their subsequent involvement is convincing evidence that this topic is indeed appropriate for a calculus class.

Classrooms such as these are the vision of National Science Foundation grant NSF-0628091. The primary investigator, Fred Roberts, is acutely aware of the need to integrate biology and math at the high school level. As he has written,

Modern biology has changed dramatically in the past two decades. Driven by large scientific endeavors such as the human genome project, it has become very much an information science, closely tied to tools and methods of the mathematical sciences. New algorithms and mathematical models played a central role in sequencing the human genome and continue to play a crucial role as biology develops models of information processing in biological organisms. Increasingly, undergraduate and graduate students are being exposed to this interplay between the mathematical and biological sciences. In the high schools, the biology curriculum has made some advances by including such things as genetics and the human genome project, and even some of the mathematics in the Mendelian genetics model. There are also a few isolated efforts to bring biological examples into the mathematics classroom. But for the most part, high schools have done little to develop connections between the biological and the mathematical sciences. Current efforts need to be supported and new efforts developed to bring high school education up to speed in the integration of mathematics and biology. Students need to be exposed to the excitement of modern biology from both the biological and mathematical points of view. They need to be informed of the new educational and career opportunities that are arising from the interconnections between these disciplines. Introducing high school students to the interconnections between the biological and mathematical sciences will enhance both the study of biology and the study of mathematics. Students interested in studying biology will realize the importance of understanding modern mathematics. New horizons will be opened for those who find mathematics interesting, but wonder how it might be useful. 
There is the potential for all students (not just high-achieving students) to study mathematics both longer and more seriously because they are aware early of its importance in applications such as protecting us from bioterrorism, responding to public health crises, and understanding modern diseases. (Roberts, 2005)

Given the complexities of biological systems, intuition alone is not sufficient to gain understanding of these systems. Math allows biologists to create models that can make predictions about these complex systems. Therefore, biologists must learn to understand and speak the language of mathematics if they are to work with mathematicians and computer scientists to develop such models. In Teaching Young Biologists New Tricks, Marc Wortman (2008) stated that

Perhaps the most important factor driving the increased emphasis on quantitative skills is the changing nature of biology. From discoveries about neural networks, genetics, and cardiac blood flow to understanding disease pathways within cells and throughout entire populations, many of the most important advances in the field now rely on mathematical modeling, quantitative analysis, and bioinformatics.

As a result, “Students have to realize that they won't do well without some quantitative competencies,” says Fernan Jaramillo, a neuroscientist (Wortman, 2008). It is now time to increase the exposure of high school students to this interface and improve their quantitative competencies. Exposing students to these exciting topics will increase their appreciation of biology, even as it opens up new realms of study and prepares them for a wider range of career opportunities. Unfortunately, very few high school biology textbooks emphasize the roles of mathematics in biology.

Too frequently, these roles have been unappreciated in biology curricula because textbook authors assume that biology students have an inadequate mathematical preparation. This practice (1) deskills many biology students, (2) is inconsistent with our requirements, (3) misrepresents contemporary biological research, and hence, (4) underprepares students to read many articles or to contribute to many areas of biology. (Jungck, 1997)

Even more unfortunate is the fact that we are deceiving students into believing that mathematics does not play a role in their biological learning. “Over time, the prevalent biological course materials are those that omit mathematics, even in instances where mathematics has an essential role to play” (Edelstein-Keshet, 2005). Fortunately, this series of modules takes a palatable first step in the right direction, as we will describe in the next section.

## Much to Gain

There is some potential for pain here, at both ends of the hallway. There are math teachers who feel that “applications” only mess up their beautiful mathematics, particularly biology applications in which nice, neat formulas typically fall short of capturing the complexities of any “real life” biological system. And there are biology teachers for whom mathematics is not a particularly strong subject or only a part of their distant educational past. For such biology and math teachers, these modules will represent serious excursions outside their comfort zones; it is our claim that the trip is worth it. For example, during the piloting of a module that examines the evolution of proteins while developing the skill of matrix multiplication, one of the present authors found the math somewhat intimidating. I had never been taught how to multiply matrices, and while I learned how to do it, it did not come easily for me. I mentioned this in one of my classes where students were also struggling with matrix multiplication. One student, Bill, told the class how he had struggled with it too in a previous math class. He then asked if he could show me a trick that someone had shown him to make it easier. Bill came up to the board, showed me and the class how to do it, and it immediately became clear to me. For the rest of the day I showed my classes Bill's method and gave him credit. Having always struggled with biology, Bill finally held his head high in my class.

Was it worth the effort and discomfort? Definitely! It took the students' understanding of evolution to a much higher level. Classroom discussions showed a deeper understanding of evolution than in any previous class, and the students experienced firsthand how math could be used to take a scientist from just making observations in nature to making predictions and developing theories about how things work.

## Description of the Project

As part of this grant, modules that integrate biology and math are being developed over a five-year period. Every module is intended to be appropriate in a math class as well as in a biology class. The topics covered by the modules are topics that are often covered in biology classrooms, so as not to increase the burden on an already overloaded curriculum. The main topics are evolution, ecology, and epidemiology. As the biology topic is developed, the appropriate mathematical skills are woven into the module, making these modules appropriate for biology teachers who want to expose their students to the bio–math interface and for math teachers who desire real-life applications of the tools and methods of mathematics.

The modules are developed by teams of content experts active in research in partnership with high school biology and math teachers. They are then piloted, field tested, and assessed in a variety of math and biology classrooms in different communities with different academic levels, socioeconomic levels, and ethnic makeups. The first set of three modules will be available soon for teachers to use. Each year, another three modules will become available. Teachers interested in receiving the modules as they become available should sign up at http://www.comap-math.com/biomath/form/index.php.

The modules consist of self-contained text and problem material that can be used in high school mathematics and biology courses, in individually or team taught courses, and in undergraduate college classes. They cover up to six (or more, in a few cases) class meetings of 40 minutes each. The modules can be used in their entirety, or teachers can choose to insert one or two lessons into an already crowded curriculum. Because it is vitally important to start exposing students (and teachers) to the bio–math interface, even such a small “foot in the door” is important.

Each module has teacher materials that start with a preface on the general topic and its interest, the mathematics required (e.g., algebraic operations, matrices, deductive and inductive reasoning), the biology required (e.g., DNA, natural selection, immune response), whether calculators or computers are necessary or desirable, and what is provided in the module (e.g., assessments, transparencies). The next section, “How to Use the Module,” discusses grade levels appropriate for the material, level of student background, the format of the module, references, expected number of days and amount of material per day, where to implement it within existing math or biology material, relevant math and science standards, and so on. The main body of the module is based on lessons assumed to take approximately 35 minutes plus questions from the preceding lesson. It has an introduction, giving the history of the problem or area, application interests, basic definitions needed, and preparatory reading required. The lessons include sections that set the stage, motivation (e.g., with a short activity or game), guidance for the teacher as a sidebar, questions and extensions for advanced students, and a homework assignment. The emphasis is on setting up models and working through them. The module closes with extensions, generalizations, wrap-up assignments, references, and a glossary.

## Brief Descriptions of 14 Modules

### Genetic Inversions

Students explore the basic concepts of DNA and chromosomal inversions. The module starts with a game that introduces the idea of gene rearrangements and then gradually leads the students through a series of improved algorithms designed to rearrange one genome into another in the least number of steps. Topics covered include chromosomes and genes, mutations, inversions, permutations, and algorithms.
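
To give a flavor of the kind of algorithm students might start from, here is a minimal Python sketch of sorting a genome by inversions. It is not taken from the module: the function name `greedy_reversal_sort` and the representation of a genome as a permutation of the numbers 1 to n are assumptions made for illustration. The strategy is a simple greedy one and is not guaranteed to use the fewest inversions, which is exactly the gap that improved algorithms try to close.

```python
def greedy_reversal_sort(genome):
    """Rearrange a permutation of 1..n into 1, 2, ..., n by reversing segments.

    Returns (sorted_genome, steps), where steps lists the (start, end)
    index pairs of the segments that were reversed, in order.
    This greedy strategy is simple but NOT guaranteed to be the shortest.
    """
    genome = list(genome)                    # work on a copy
    steps = []
    for i in range(len(genome)):
        target = i + 1                       # gene that belongs at position i
        j = genome.index(target)             # where that gene currently sits
        if j != i:
            # Reverse the segment from i to j so the target lands at i.
            genome[i:j + 1] = genome[i:j + 1][::-1]
            steps.append((i, j))
    return genome, steps


sorted_genome, steps = greedy_reversal_sort([3, 1, 2, 4])
print(sorted_genome, steps)   # [1, 2, 3, 4] [(0, 1), (1, 2)]
```

Each pass places one gene correctly, so this approach never needs more than n − 1 inversions for n genes.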

### Spider Silk

This module invites students to pose and answer the fundamental question: “What alignment of two sequences is biologically most meaningful?” Students explore the rapidly emerging field of bioinformatics by developing the basic mathematical principles that underlie computer programs used to align nucleotide and peptide sequences. As students become researchers, they begin to understand how mathematical modeling, computing, and biology can work together to answer important scientific questions. Students will work through the ideas of mutation and selection, gene and protein sequences and homology, alignment, dynamic programming, and iteration.
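
As a rough illustration of the dynamic programming idea behind local alignment, the Python sketch below fills in a score table one cell at a time, in the style of the Smith-Waterman algorithm. It is not taken from the module; the function name `local_alignment_score` and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for illustration, and real tools use more elaborate scoring schemes.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of strings a and b.

    Each table cell reuses the already-solved neighbors (diagonal, up,
    left), which is the "remember and reuse" idea of dynamic programming.
    The scoring values are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]   # row 0 and column 0 stay 0
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            diagonal = score[i - 1][j - 1] + pair   # align a[i-1] with b[j-1]
            up = score[i - 1][j] + gap              # gap in b
            left = score[i][j - 1] + gap            # gap in a
            # A local alignment may restart anywhere, so never go below 0.
            score[i][j] = max(0, diagonal, up, left)
            best = max(best, score[i][j])
    return best


print(local_alignment_score("XXACGTXX", "YYACGTYY"))   # 8
```

The shared stretch `ACGT` contributes four matches at 2 points each, and the surrounding mismatched letters are simply left out of the local alignment.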

### Evolution by Substitution

Students relate DNA changes and resulting amino acid substitutions to evolution, analyze the various pathways of change that could occur in a single amino acid position, and, using transition matrices, develop a powerful model for explaining and predicting long-term mutation probabilities. Topics covered include evolution, DNA and amino acids, synonymous and nonsynonymous substitutions, probability, matrices, and Markov chains.
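
The way a transition matrix predicts long-term probabilities can be sketched in a few lines of Python. The two-state matrix `P` below uses made-up probabilities, not values from the module, and the helper names `mat_mul` and `mat_power` are hypothetical. Entry `P[i][j]` is taken to be the probability of moving from state i to state j in one step, so multiplying the matrix by itself n times gives the n-step probabilities.

```python
def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def mat_power(P, n):
    """Return P multiplied by itself n times (the identity when n is 0)."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)]
              for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, P)
    return result


# Hypothetical one-step substitution probabilities between two states.
P = [[0.9, 0.1],
     [0.2, 0.8]]

two_steps = mat_power(P, 2)     # approximately [[0.83, 0.17], [0.34, 0.66]]
long_run = mat_power(P, 50)     # both rows approach [2/3, 1/3]
print(two_steps)
print(long_run)
```

After many steps the rows become nearly identical, which shows that the long-term probabilities no longer depend on the starting state for this example matrix.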

### Competition in Disease Evolution

This module examines infectious diseases from the perspective of evolutionary biology on a basic level. Students will gain an understanding of how different methods of pathogen reproduction can greatly affect the evolutionary fitness of a disease. After learning to compute simple and conditional probabilities, students calculate probable levels of exposure to a disease in a population, probabilities of infection given exposure, and expected rates of disease incidence. Students practice rounding real numbers to integers and converting among fractions, decimal representations, and percentages while discussing methods of disease transmission, evolutionary fitness, natural selection, and evolutionary competition.

### Computer Modeling of Disease Outbreaks

This module uses two hypothetical infectious disease outbreaks, which students simulate, to introduce and develop mathematical models for disease spread. As models are constructed the students have the opportunity to interactively see how changes in the parameters of the model change the pattern of the disease outbreak, thereby investigating how effective various intervention strategies can be and witnessing how powerful a good model can be.
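
The module's own models are not reproduced here. As a stand-in, the Python sketch below uses a standard discrete-time SIR (susceptible, infected, recovered) model to show how changing one parameter changes the shape of an outbreak. The function name `simulate_sir`, the parameters `beta` (transmission rate per day) and `gamma` (recovery rate per day), and all numerical values are assumptions for illustration.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Simulate a simple daily-step SIR outbreak.

    Returns a list of (susceptible, infected, recovered) tuples, one per day.
    Assumes 0 <= beta <= 1 and 0 <= gamma <= 1 so counts stay non-negative.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / population   # contacts that transmit
        new_recoveries = gamma * i                   # infected who recover
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


def peak_infected(history):
    """Largest number of people infected at the same time."""
    return max(infected for _, infected, _ in history)


# Compare a baseline outbreak with one where an intervention halves beta.
baseline = simulate_sir(1000, 1, beta=0.5, gamma=0.2, days=200)
intervention = simulate_sir(1000, 1, beta=0.25, gamma=0.2, days=200)
print(peak_infected(baseline), peak_infected(intervention))
```

Lowering `beta` stands in for any intervention that reduces transmission, and comparing the two peaks gives a first sense of how effective such a strategy could be under this simplified model.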

### Sensitivity/Specificity

This module uses an interrupted case-study approach to answer the following two questions: What do the results of an imperfect medical test actually mean? How does this information affect public policy or personal decision making? The students are presented with the case of an adult female who learns that her mammography test is positive. They then discuss the possible implications or outcomes of a positive test result given the properties of the test, including its sensitivity and specificity, and explore the predictive value of a test for a single individual.
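
The arithmetic behind the predictive value of a positive test can be sketched in Python using Bayes' rule. The function name `positive_predictive_value` and the numbers in the example are hypothetical; they are not the mammography figures used in the module.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a person who tests positive actually has the condition.

    sensitivity: P(test positive | condition present)
    specificity: P(test negative | condition absent)
    prevalence:  P(condition present) in the tested population
    """
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical test: 90% sensitive, 90% specific, condition present in 1%.
ppv = positive_predictive_value(0.90, 0.90, 0.01)
print(round(ppv, 3))   # 0.083
```

Under these assumed numbers, only about 1 in 12 positive results is a true positive, because the few true cases are outnumbered by false positives from the much larger healthy group.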

### Habitat Selection

Understanding and predicting species abundance is of fundamental concern to nearly every aspect of population ecology. Whether trying to preserve habitat for endangered species or trying to engineer control strategies for invasive pests, it is important to understand the impact of habitat preference on species abundance. This module has students develop a mathematical method to infer habitat preferences based on species abundance measures and use this method to predict changes in population distributions as land use changes over time. In the process, students learn about biotic and abiotic factors, niche, data categorization, dependent and independent variables, and line of best fit.

### Food Webs

Food webs are abstract representations of feeding relationships in communities. Discrete mathematics provides a model for a food web using a directed graph (digraph) whose vertices are the species, and an arc goes from “a” to “b” if “a” is food for “b.” Digraphs representing food webs make understanding predator–prey relationships easier, and various properties of digraphs provide insight into properties of the food web and the species contained within. Keystone species, species trophic levels, status, dominance, and other concepts can all be recognized from the digraph.
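
A small Python sketch shows how such a digraph can be stored and queried. The five-species web, the helper names, and the choice to define trophic level as the length of the longest food chain ending at a species (with producers at level 1) are all assumptions for illustration; other definitions of trophic level are in use. The sketch also assumes the web has no cycles.

```python
# Each key is a species; its list holds the species that eat it,
# so an arc goes from "a" to "b" when "a" is food for "b".
food_web = {
    "grass": ["rabbit", "mouse"],
    "rabbit": ["fox"],
    "mouse": ["fox", "owl"],
    "fox": [],
    "owl": [],
}


def prey_of(web):
    """Reverse the arcs: map each species to the list of species it eats."""
    prey = {species: [] for species in web}
    for food, eaters in web.items():
        for eater in eaters:
            prey[eater].append(food)
    return prey


def trophic_levels(web):
    """Level 1 for species that eat nothing; otherwise 1 + highest prey level."""
    prey = prey_of(web)
    levels = {}

    def level(species):
        if species not in levels:
            levels[species] = 1 + max(
                (level(p) for p in prey[species]), default=0)
        return levels[species]

    for species in web:
        level(species)
    return levels


top_predators = [s for s, eaters in food_web.items() if not eaters]
print(top_predators)              # ['fox', 'owl']
print(trophic_levels(food_web))
```

Species with no outgoing arcs are the top predators, and species with no incoming arcs sit at the base of the web.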

### Carbon Footprint

This module helps students see themselves and humans in general as intimately connected to the environment. To foster understanding of humans' impact on the environment, basic mathematics is used to quantify aspects of our ecological impact. Ecological footprinting is developed as a tool for assessing humans' impact and as a decision-making tool. This module enables students to be more aware of humans' roles in threats to the environment and enables them to make more informed decisions about behaviors that affect the environment. Students will gain a foundation in mathematical modeling that will inform future creation or use of mathematical models.

### Evolutionary Games: The Game of Life

This module examines the role that behavior plays in evolutionary fitness. Students will develop an understanding of natural selection as organisms compete for limiting resources (e.g., food, water, space, mates, safety). Traditionally, this idea is developed in the high school curriculum by focusing on adaptations due to the physical phenotype of the organisms. This module will look specifically at the behavioral choices organisms make to obtain the resources they need for survival and for reproduction. This module builds independent concepts in game theory that can be taught at any grade level within the mathematics or biology curricula.

### Array of Hope

This module is written as seen through the eyes of a doctor. This doctor has a patient who is diagnosed with melanoma. To find the best treatment for this patient, a microarray is done. The doctor wants to know more, so he begins to educate himself on the topic of microarrays. He begins by learning how a microarray is done. He becomes very interested, so he visits a friend from med school, who now works in a research lab specializing in microarrays. The doctor learns not only how to run a microarray, but how to read a microarray and analyze the individual results mathematically, taking into consideration variability of tests and standard deviation. This doctor's exploration causes him to look at larger data sets of many patients to see whether different genes could be involved in different forms of the same disease. Ultimately, he understands which variation his patient has and which treatment is most likely to be successful.

### Drawing Lines: Animals & Their Territories

This module examines a single unifying principle governing the partitioning of a space in a wide range of ecological contexts. Students will come to understand how the minimization of energy expenditure results in a widely applicable “nearest neighbor” dynamic governing the use of space. This dynamic will be seen to apply in both animate and inanimate situations. Students will choose to examine the implications of this principle in one of several different contexts and will share their findings with each other.

### CrIME: Criminal Investigations through Mathematical Examination

This module uses the forensics of fingerprint analysis to introduce students to some of the basic mathematical concepts in the areas of graph theory, probability, and geometry, and the biological concept of species identification. The module begins with an activity in which the students are given a scenario in which they discuss how one investigates fingerprint evidence that is found at a crime scene. This is followed by a series of activities in which the students are introduced to terminology and procedures used in fingerprint analysis, mathematical procedures used in fingerprint identification, and its application to species identification. Graph theory terminology and techniques are thoroughly explained so that students with no prior experience in graph theory will understand the basic concepts. The geometry, which concentrates on triangle congruence and similarity axioms and theorems, is presented as additional material that can be included or not, depending on time and student background. The fingerprint classification techniques also use the probabilistic concepts of ratio and proportion, which are covered in the activities.

### Genetic Epidemiology: Finding Disease Susceptibility Alleles in Presence of Population Stratification

Personalized medicine based on known risk factors, including genetic risk factors, is a major focus of research. In order to include genetic risk factors in these predictions, the genetic risk factors need to be identified. This module explores the potential for falsely identifying a genetic factor as increasing the risk (or odds) of disease when the individuals chosen for study are not genetically homogeneous. After the problem is identified, potential solutions are introduced, including the use of patterns of allele frequencies to reclassify individuals into genetically homogeneous groups. The module will also explore the useful information gained in studying the patterns of linkage disequilibrium across different populations.

## Conclusion

In reporting the results of a series of workshops, Hastings and Palmer (2003) concluded

that a vital next step will be to promote the training of scientists with expertise in both biology and mathematics. A new generation of empiricists with stronger quantitative skills and of theoreticians with an appreciation for the empirical structure of biological processes will facilitate a bright future for the application of mathematics to solving biological problems.

As teachers of the next generation of biologists, we bear the responsibility for training these future scientists well, which means preparing our students for the real world that awaits them. Today that requires teaching new topics bravely and old topics in new ways. These DIMACS modules enable us to blaze that trail with confidence, to broaden our comfort zone even as we step beyond it.