I’m honored to introduce KOBAS 2.0, a web server for the annotation and identification of metabolic pathways that is based mainly on KEGG. To begin with, let’s briefly review KOBAS 1.0. It is a software package and web server that annotates an input set of genes or proteins by mapping them to genes with known pathways in KEGG PATHWAY.
High-throughput experimental technologies such as next-generation sequencing, microarray profiling, and proteomics profiling are widely used in current biological research. They produce a flood of data and often identify dozens to hundreds of genes related to a biological or pathological process. Analyzing this flood effectively and interpreting it accurately at the systems level, that is, working out how genes are involved in metabolic and signaling pathways and how they influence phenotypes, is a major challenge in the post-genomic era. Meeting this challenge requires identifying statistically significantly enriched pathways quickly and accurately, and then discovering the biological meaning of the pathways in which the input genes or proteins are most enriched, or which are statistically significant relative to a random background.
Professor Liping Wei’s group developed KOBAS 1.0 in 2005 for this purpose. Despite its achievements, KOBAS 1.0 has weaknesses, and an updated version was needed to meet the demands of growing data volumes and more varied research topics. Hence, Professor Liping Wei’s group released KOBAS 2.0 in 2011. Instead of relying on the single pathway database KEGG, KOBAS 2.0 incorporates knowledge from five pathway databases (KEGG PATHWAY, NaturePID, BioCyc, Reactome, and Panther) and five human disease databases (OMIM, KEGG DISEASE, FunDO, GAD, and the NHGRI GWAS Catalog). KOBAS 2.0 supports both ID mapping and sequence similarity mapping.
This unprecedented function allows users to characterize the relatedness of metabolic and signaling pathways in human and other species.
KOBAS 2.0 has two consecutive programs, “annotate” and “identify”. The first program annotates each input gene with putative pathways and diseases by mapping the gene to genes in the backend databases. For ID mapping, input IDs are mapped directly to genes from KEGG GENES. For sequence similarity mapping, each input sequence is BLASTed against all sequences in KEGG GENES. The default cutoffs are BLAST E-value < 10^-5 and rank ≤ 5, and users can customize these thresholds to suit their needs. The second program identifies statistically significantly enriched pathways and diseases by comparing the results from the first program against a background, which is usually the genes of the whole genome or all probe sets on a microarray.
As mentioned earlier, one advantage of KOBAS 2.0 is that it can use sequence similarity mapping to annotate input genes from species that are not yet well represented in existing pathway databases. It can also map genes from other species to human diseases, to predict whether those genes may be good candidates for studying a human disease. In the paper, the authors illustrate how to do this in KOBAS 2.0. The researchers analyzed microarray expression profiles in rhesus monkeys in two major hippocampal subdivisions that are critical for memory and cognitive function: the cornu ammonis (CA) and the dentate gyrus (DG). They identified 371 up-regulated genes and then used both DAVID and KOBAS 2.0 to identify enriched pathways and diseases. DAVID can only perform ID mapping to rhesus genes in its two pathway databases, KEGG PATHWAY and Panther, and as a result it identified no statistically significantly enriched pathways or diseases.
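As an aside on the sequence similarity mapping step, the default cutoffs mentioned above (BLAST E-value < 10^-5 and rank ≤ 5) can be sketched in Python as follows. This is only an illustration and not KOBAS code: the function name `filter_hits`, the tuple layout, and the sample hits are hypothetical, and rank is assumed here to mean a hit’s position when a query’s hits are sorted by ascending E-value.

```python
from collections import defaultdict


def filter_hits(hits, max_evalue=1e-5, max_rank=5):
    """Keep, per query, hits with E-value < max_evalue and rank <= max_rank.

    `hits` is an iterable of (query_id, subject_id, evalue) tuples.
    Rank is assumed to be the 1-based position among a query's hits
    sorted by ascending E-value (an assumption, not KOBAS's definition).
    """
    by_query = defaultdict(list)
    for query_id, subject_id, evalue in hits:
        by_query[query_id].append((evalue, subject_id))

    kept = {}
    for query_id, query_hits in by_query.items():
        query_hits.sort()  # ascending E-value, best hit first
        kept[query_id] = [
            subject_id
            for rank, (evalue, subject_id) in enumerate(query_hits, start=1)
            if evalue < max_evalue and rank <= max_rank
        ]
    return kept


# Hypothetical hits and gene IDs, for illustration only.
example_hits = [
    ("query1", "hsa:0001", 1e-50),
    ("query1", "hsa:0002", 1e-3),   # fails the E-value cutoff
    ("query2", "hsa:0003", 1e-20),
]
print(filter_hits(example_hits))
```

Passing different `max_evalue` or `max_rank` values mirrors the point that users can customize the thresholds.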
On the other hand, KOBAS 2.0 supports sequence similarity mapping by BLAST to annotate the rhesus gene set and can thus take full advantage of the abundant data on human pathways and diseases. Of the 371 genes, 130 were mapped to existing pathways and diseases; after statistical testing, 61 genes were related to pathways and 30 were involved in diseases. These results are consistent with known functional differences between the two regions. The authors also compared KOBAS 2.0 with popular GO enrichment analysis tools, including FuncAssociate 2.0, Ontologizer 2.0, BiNGO, and EASE, and showed that the enriched pathways identified by KOBAS 2.0 are more specific and informative than the results of the other tools. KOBAS 2.0 therefore offers more insight into the underlying biological processes.
In conclusion, KOBAS 2.0 is an excellent tool for annotating and analyzing pathways and diseases. Our references are the KOBAS 2.0 and KOBAS 1.0 papers, and KOBAS can be accessed at the site shown here. Finally, I would like to thank all of our group members, our TA Meng Wang, Dr. Ge Gao, and Dr. Liping Wei. Thank you.
Now I’ll show you how to use the KOBAS website. First, go to the KOBAS homepage and click the “Annotate” button in the left corner, which takes us to the annotation page. To run it correctly, we need to choose the proper input format, which can be gene FASTA sequences, tabular BLAST output, or an ID list. We choose Gene ID as an example and upload a file from the local disk. The species should be Human. Click “Run”, and here is the annotation of the genes. The annotation lists the gene corresponding to every ID in the uploaded file. Click on a specific gene to view more comprehensive information, including the gene name, gene definition, and so on. Cross-links to other databases are also provided. Then we can run a hypothesis test on this annotation information to find the pathways or diseases most likely to be related to the query genes.
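The hypothesis test mentioned here can be illustrated with a small Python sketch. The talk does not say which test is used, so the hypergeometric test below is an assumption (it is one common choice for enrichment analysis), and the function name and all numbers are made up for illustration.

```python
from math import comb


def hypergeom_enrichment_pvalue(k, n, K, N):
    """Upper-tail hypergeometric p-value P(X >= k).

    k: input genes annotated to the term
    n: input genes with any annotation
    K: background genes annotated to the term
    N: background genes with any annotation
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probability of seeing k or more term genes in the input set.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 8 of 100 input genes fall in a term that covers
# 200 of 20000 background genes.
p = hypergeom_enrichment_pvalue(k=8, n=100, K=200, N=20000)
print(f"p-value: {p:.3g}")
```

A small p-value here means the term contains more input genes than would be expected if the input set were drawn at random from the background.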
At the top of the output results, we choose to use this file as the sample input for Identify, and you can click "show available database according to the species used in Sample Input". Here we can see that many databases are available, and users can select the ones they need. Here, we run it with all options left at their defaults. Click "Run". This process may take a few minutes. After the run has finished, we can check the output in the analysis history on the left.
The result table lists, for each term, the source database, ID, sample number, background number, p-value, and corrected p-value. From here we can also click through to check the information related to diseases. In principle, a smaller p-value indicates a more reliable identification result, so we can sort by corrected p-value in ascending order. Users can choose a p-value cutoff according to their needs and find the pathways or diseases that the genes are most likely to be involved in. Then we can click each term for more detailed information, which can in turn guide further research.
That concludes our presentation. Thank you all!
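As a supplementary note to the demo, sorting terms by corrected p-value and applying a cutoff can be sketched in Python. The talk does not say how the corrected p-value is computed; the Benjamini-Hochberg procedure used here is an assumption, and the term names, p-values, and cutoff are invented for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical terms and raw p-values, for illustration only.
terms = ["term A", "term B", "term C", "term D"]
raw = [0.01, 0.04, 0.03, 0.20]
corrected = benjamini_hochberg(raw)

# Sort ascending by corrected p-value and keep terms under a chosen cutoff.
cutoff = 0.05
ranked = sorted(zip(terms, raw, corrected), key=lambda row: row[2])
significant = [row[0] for row in ranked if row[2] < cutoff]

for term, p_raw, p_corrected in ranked:
    print(f"{term}\t{p_raw:.3f}\t{p_corrected:.3f}")
print("below cutoff:", significant)
```

The cutoff is a user choice, as in the demo; a stricter value keeps fewer terms.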