In this session, I will first talk about what a phylogenetic tree is, and then about how we use phylogenetic trees to study an epidemic of an infectious disease. Phylogenetics is the field that studies the evolutionary relationships of organisms using their biomarkers, such as DNA sequences. A phylogenetic tree is a tree-like diagram illustrating these evolutionary relationships. Here's an example of a phylogenetic tree built from the DNA sequences of these organisms. The tree shows a long evolutionary distance between humans and cattle, along a path that passes through the root of the tree. In contrast, a human is evolutionarily closer to a chimpanzee, with which it shares a more recent common ancestor. The evolutionary relationships and the branch lengths in the tree largely reflect the similarity between the DNA sequences of these organisms.
A phylogenetic tree can be drawn in different visual styles. Tree branches can be straight and slanted, rectangular, or even circular. Although these drawings look different, they illustrate the same evolutionary relationships.
There are a number of computational methods for building a phylogenetic tree from genetic sequences, such as neighbour-joining, maximum likelihood search, and Bayesian inference. Let's go through the neighbour-joining algorithm. We compare each pair of DNA sequences to compute their genetic distance. The pair of sequences that are closest neighbours, after adjusting for how far each of them is from all the other sequences, is joined first, and the distance from the joined pair to each of the other sequences is recomputed. Then the next closest pair is joined, and the algorithm iterates until all the sequences have been joined into a complete tree.
Like higher organisms, pathogens such as viruses mutate and evolve as they infect and transmit among their hosts. Their evolutionary relationships therefore reflect their transmission history in the host population, and building the phylogenetic tree of the pathogens in an epidemic helps us understand how the disease developed in the past.
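As a rough illustration of the neighbour-joining idea described earlier, here is a minimal Python sketch. It is not the code behind the tree shown in the lecture. The function names `hamming` and `neighbour_joining`, the nested-tuple tree representation, and the sequences in `toy_seqs` are all assumptions made up for this example; the sequences are invented strings, not real DNA from these species. Distances are simple counts of differing sites, whereas real analyses usually apply a substitution model and also estimate branch lengths.

```python
from itertools import combinations


def hamming(a, b):
    """Count the sites at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def neighbour_joining(seqs):
    """Return an unrooted tree topology as nested tuples (no branch lengths).

    `seqs` maps a sequence name to an aligned sequence string.
    """
    nodes = list(seqs)
    # Pairwise distance table, keyed by the unordered pair of nodes.
    dist = {
        frozenset(pair): hamming(seqs[pair[0]], seqs[pair[1]])
        for pair in combinations(nodes, 2)
    }

    def d(a, b):
        return dist[frozenset((a, b))]

    # Keep joining until three subtrees remain; they meet at one internal node.
    while len(nodes) > 3:
        n = len(nodes)
        # Total distance from each node to all the others.
        total = {a: sum(d(a, b) for b in nodes if b != a) for a in nodes}
        # Neighbour-joining criterion: pick the pair with the smallest
        # Q(i, j) = (n - 2) * d(i, j) - total[i] - total[j].
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d(*p) - total[p[0]] - total[p[1]],
        )
        joined = (i, j)
        rest = [k for k in nodes if k != i and k != j]
        # Distance from the new internal node to every remaining node.
        for k in rest:
            dist[frozenset((joined, k))] = (d(i, k) + d(j, k) - d(i, j)) / 2
        nodes = rest + [joined]
    return tuple(nodes)


# Invented toy alignment (hypothetical data, for illustration only).
toy_seqs = {
    "human": "TCACACGTACGTG",
    "chimp": "AGACACGTACGTG",
    "mouse": "ACGTCTTTACGTA",
    "cattle": "ACGTACGGCAGTA",
    "sheep": "ACGTACGGCCCAA",
}

tree = neighbour_joining(toy_seqs)
print(tree)
```

The result is unrooted: the three items in the outer tuple are subtrees meeting at a single internal node, so the nesting should not be read as showing where the root is. With this toy data, the first pair to be joined is human and chimp, which differ at only two sites.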
When an infectious disease starts from the index patient and spreads through the population, some of the infected cases are registered by public health surveillance. If specimens containing the pathogen are also sampled, their genome sequences can be obtained in the laboratory. For example, these sequences were obtained through surveillance and built into a phylogenetic tree. The tree shows that the sequences cluster into two separate lineages, suggesting that they resulted from two separate clusters of transmission.
As a conventional approach, we can also trace the contact and travel history of the patients to learn about the transmission history of the disease. However, when the surveillance system misses some infected patients in the transmission chain, it becomes very difficult to reconstruct the complete transmission history with this method. This often happens for diseases that cause no symptoms or only mild symptoms in some individuals. In contrast, the phylogenetic approach is less affected by this problem, because transmission, even through symptomless patients, has already left its genetic footprint in the pathogen genomes. The transmission history can therefore still be reconstructed even when a relatively low proportion of cases is sampled by surveillance.
Phylogenetic analysis has been widely used to study the development of epidemics. An example is the SARS outbreak in 2003. This epidemic appears to have started in Guangdong, then spread to Hong Kong and soon to other countries, including Vietnam, Canada, Singapore, and Taiwan. Phylogenetic analysis of the virus gives clues to the source of the outbreaks in different regions and to how they are related. SARS virus genome sequences from different regions were built into the phylogenetic tree shown here. Below it is a transmission history inferred from the contact and travel history of the SARS patients. Soon after the initial outbreaks in Guangdong, the virus spread to Beijing, which is supported by the early divergence of the Beijing sequences in the tree.
In Hong Kong, some early cases were traced and found to be sporadic cases imported from Guangdong. The disease did not spread widely until an index patient from Guangdong traveled to Hong Kong and passed the virus to a number of people staying in the same hotel. These secondary patients, many of whom were visitors, subsequently carried the virus to their home countries when they traveled back. The phylogenetic tree supports this large-scale spread: in the tree, the viruses from these countries cluster into the same lineage. One infected Singaporean visitor may have been the source of the local outbreak in his country, because in the tree his virus sequence clusters with the other viruses from that outbreak. Taiwan also had SARS outbreaks. In the tree, some Taiwan viruses cluster with the viruses from Amoy Gardens residents in Hong Kong, which is in line with the finding that an infected Amoy Gardens resident had traveled to Taiwan during that time. Some other Taiwan viruses form separate clusters in the tree, suggesting additional introductions of the disease.
An epidemic can arise when a novel pathogen jumps from animals to humans and establishes efficient transmission. Tracing the animal source of such a zoonotic outbreak tells us which animal we should reduce our exposure to, so that we can prevent our population from acquiring the pathogen again. This is important for controlling an epidemic caused by zoonosis. To achieve this, in addition to the pathogen sequences obtained from the epidemic, we also need to survey the animals that may carry the same or related pathogens. With the pathogen sequences from both the human patients and the animals, we can build a phylogenetic tree to identify which animal pathogen is most closely related to the human ones. That animal is the likely origin of the epidemic. This kind of phylogenetic analysis was applied to study the human infections of novel H1N1 influenza in 2009.
In April, a 10-year-old boy in California tested positive for influenza. Molecular diagnosis and sequencing showed that the influenza virus was new to humans. Further cases were reported in California and Texas, and soon after that the disease spread all over the world. To set up a phylogenetic analysis to trace the origin of the disease, we take the novel H1N1 virus sequences and also include sequences of human seasonal influenza and of the influenza viruses found in birds, pigs, and horses. First, the tree shows that the novel H1N1 virus is genetically distant from seasonal human influenza, which means that this virus did not evolve directly from the preexisting human influenza. Second, in the tree, the novel virus clusters with the swine influenza viruses, suggesting that the virus came from pigs. Phylogenetic analysis of all the influenza gene sequences leads to the same conclusion of a swine origin, which is why this novel disease is also called swine flu.
In summary, a phylogenetic tree is used to delineate the evolutionary relationships of pathogens from their genetic sequences. It is a useful tool for studying the transmission history and tracing the animal source of an epidemic.
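To make the source-tracing step a little more concrete, here is a minimal Python sketch that ranks candidate animal hosts by how close their pathogen sequences are to the outbreak sequences. It is a deliberately crude stand-in for the tree-based analysis described in this session, since it compares raw sequence differences instead of building a phylogenetic tree. The function name `rank_candidate_sources`, the host labels `host_A`, `host_B` and `host_C`, and all the sequences are hypothetical placeholders invented for this example, not real influenza data.

```python
def hamming(a, b):
    """Count the sites at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def rank_candidate_sources(outbreak, candidates):
    """Rank host groups by mean distance to the outbreak sequences.

    `outbreak` is a list of aligned sequences from human cases.
    `candidates` maps a host label to a list of aligned sequences.
    Returns (host, mean_distance) pairs, closest host first.
    """
    scores = {}
    for host, host_seqs in candidates.items():
        pairs = [hamming(o, h) for o in outbreak for h in host_seqs]
        scores[host] = sum(pairs) / len(pairs)
    return sorted(scores.items(), key=lambda item: item[1])


# Invented placeholder sequences (hypothetical data, for illustration only).
outbreak = ["ACGTTGCA", "ACGTTGCT"]
candidates = {
    "host_A": ["ACGTTGAA", "ACGATGCA"],
    "host_B": ["TCGAAGCT", "TGGAAGCA"],
    "host_C": ["GGCTTACA"],
}

ranking = rank_candidate_sources(outbreak, candidates)
for host, mean_distance in ranking:
    print(host, mean_distance)
```

A ranking like this is only a first screen. A real analysis would place the human and animal sequences in one tree and check which animal lineage the outbreak sequences cluster with, as was done for the 2009 H1N1 virus.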