[MUSIC] Hello everyone, my name is Pimlapas Leekitcharoenphon. I'm a postdoc in the Research Group for Genomic Epidemiology at DTU Food. Today, I'm going to present plasmid replicon identification and plasmid typing: a description of the PlasmidFinder and pMLST tools and their application. Plasmids are double-stranded circular or linear DNA molecules. They can replicate and transfer between different bacterial species or different bacterial clones. Most of the known plasmids have been identified because they confer phenotypes that are subject to positive selection in the bacterial host, for example, the presence of antimicrobial resistance genes. That's why it is important not only to study the molecular epidemiology of different bacterial clones but also to study and understand the molecular epidemiology of transferable plasmids. For this specific purpose, plasmid typing systems are needed. And because it's possible to identify plasmids directly from whole genome sequencing data, we made the PlasmidFinder tool. It's an easy-to-use web-based tool for detection of plasmids, though not the whole plasmid: it detects the plasmid replicon directly from whole genome sequencing data. PlasmidFinder has a database that right now consists of 116 replicon sequences that match, with at least 80% nucleotide identity, all replicon sequences identified in 559 fully sequenced plasmids. And if you have a plasmid, then you can type it. We have another tool for that, called pMLST, which of course is the tool for plasmid multilocus sequence typing. The pMLST database is updated weekly from pubmlst.org. Both tools, PlasmidFinder and pMLST, were evaluated using draft genomes from a collection of Salmonella Typhimurium isolates. PlasmidFinder was able to detect a broad variety of plasmids that are often associated with antimicrobial resistance in clinical bacterial pathogens.
For pMLST, the evaluation also showed that it was able to subtype genomic sequencing data of plasmids, reporting both known sequence types and new alleles and ST variants. Let's start with PlasmidFinder. The idea is that a scientist, a medical doctor, or any researcher has already sequenced the bacterial genomes they are interested in. Once you have the bacterial sequences, you upload them to PlasmidFinder, which contains the plasmid replicons, and then we give the plasmid replicons back to you. The tool, PlasmidFinder, contains a set of genes. Sorry, not genes, replicons. When you have your unknown genome, you submit it to the tool, and the tool will BLAST your unknown genome against all the replicons in its database. Your unknown sequences can be either raw reads or assembled genomes. If they are raw reads, the tool will do a de novo assembly of your raw reads into contigs before it BLASTs your genome against the replicons. What you get back is the plasmid replicons found in your unknown sequence. Here is a link to PlasmidFinder, and here is the PlasmidFinder website. Right now, we only have plasmid databases for Enterobacteriaceae and some Gram-positive bacteria, so you choose the database that matches your sample. Then you choose the threshold for percent identity. Percent identity is the percentage of nucleotides that are identical between the replicon in the database and the sequence in your genome. Then you have to choose the type of your reads. If your sample is contigs or an assembled genome, then the program expects you to upload a FASTA file, that is, sequences in FASTA format. A FASTA record starts with a greater-than sign and the sequence ID, followed by the sequence. If your genome is raw reads that came from the sequencing machine, then you are going to have FASTQ. FASTQ is FASTA plus quality scores. It contains four lines per record instead of two.
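As a rough illustration of the two ideas above, the FASTA layout and percent identity, here is a minimal Python sketch. The function names `parse_fasta` and `percent_identity` and the two short sequences are made up for this example. This is not PlasmidFinder's own code, which relies on BLAST to produce the alignment; the sketch only counts identical positions in two sequences that are assumed to be already aligned and of equal length.

```python
def parse_fasta(text):
    """Return a dict mapping each sequence ID to its sequence.

    A record starts with '>' followed by the sequence ID; the lines
    after it, up to the next '>', hold the sequence itself.
    """
    records = {}
    seq_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            seq_id = header[0] if header else ""
            records[seq_id] = ""
        elif seq_id is not None:
            records[seq_id] += line.upper()
    return records


def percent_identity(aligned_a, aligned_b):
    """Percentage of positions that are identical in two aligned sequences."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned, non-empty, and equal in length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b)
    return 100.0 * matches / len(aligned_a)


# Made-up toy data: a 10-base "replicon" and a "contig" with one mismatch.
example = (
    ">replicon_demo made-up example\n"
    "ACGTACGTAC\n"
    ">contig_demo\n"
    "ACGTACGTAA\n"
)
records = parse_fasta(example)
identity = percent_identity(records["replicon_demo"], records["contig_demo"])
print(f"percent identity: {identity:.1f}")
```

With nine identical positions out of ten, a hit like this one would pass an 80% identity threshold but not a 95% one.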
In a FASTQ record, the first two lines are the sequence data and the last two lines are the quality scores. If your genome is raw reads, then you have to choose whether they are single-end or paired-end, and you have to choose the corresponding technology that you used for sequencing. Once you have selected the cutoff and the options over here, you are ready to upload your genome. Click Isolate File, select your unknown genome, and then click Upload. The tool will upload your genome to the server and start doing the analysis. Optionally, you can put your email over here and click Notify Me Via Email. Once the program is done with the analysis, it will send you an email with the output. So you can start another submission or close this page, and you will still get the output if you put your email here. Here is an example of the output that you will get from PlasmidFinder. It tells you the plasmid replicons that were found in your unknown sample, the percent identity, and the query and HSP lengths. The query length is the length of the plasmid replicon, and the HSP length is the length of the alignment between the plasmid replicon and the sequence in your sample. So in this case, if the length of the replicon is 450 and the alignment length is 450, that means every position in the replicon aligns with the sequence from your genome. If you also have 100% identity, that means the replicon and the sequence of your genome are identical at every aligned position. That's the perfect match. But sometimes you also see results in a different color. Green represents the perfect match, and sometimes you have light green and also gray. Light green means that the alignment length is equal to the size of the replicon but the percent identity is not 100%, so you have some mismatches in the alignment.
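The way the output table is read, by comparing the query length, the HSP length, and the percent identity, can be sketched in a few lines of Python. The function names, the category labels, and the 99.5% value are assumptions made for this illustration, not names or data from PlasmidFinder.

```python
def coverage(query_length, hsp_length):
    """Percentage of the replicon (query) that is covered by the alignment."""
    return 100.0 * hsp_length / query_length


def classify_hit(identity, query_length, hsp_length):
    """Describe a hit using the three values shown in the output table."""
    full_length = hsp_length == query_length
    if full_length and identity == 100.0:
        return "perfect match"
    if full_length:
        return "full-length alignment with mismatches"
    return "partial alignment"


# The 450/450 case from the talk, with 100% identity.
print(classify_hit(100.0, 450, 450), coverage(450, 450))

# Hypothetical hit: same lengths, but identity below 100%.
print(classify_hit(99.5, 450, 450))
```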
A gray result means that the query length and the alignment length are not equal, even if the identity is nearly 100%. In this case, the size of the replicon is 535 but the alignment length is 534, so there is one position that could not be aligned. But it doesn't mean that your genome doesn't have this replicon. It still has this replicon, because the percent identity is very high and the alignment length is also very high. You can also see the alignment details: if you click Extended Output, you can see all of them. In this case, you can see one mismatch over here. Now, if your genome contains a plasmid, then you have an opportunity to type your plasmid using our tool pMLST. pMLST has a database of the allele sequences of the plasmids. Once you have your unknown sequences, the program will BLAST them against all the plasmid alleles in the database and report the sequence type of your plasmid. Here is the link to pMLST, and this is the pMLST webpage. To start, you have to choose the pMLST configuration. You have to know which plasmid you have in your genome, so you can run PlasmidFinder before you start with pMLST. Then you choose the corresponding plasmid that you have in your genome. Then you choose the type of reads of your unknown sample, either assembled genomes in contigs or raw reads. Then you select your genome and click Upload. Here is the output from pMLST. Of course, you have the sequence type of your plasmid. In this case, it's ST-31 of your IncI1 plasmid. You can see the allele numbers that correspond to ST-31 in your sample, and of course it shows you the percent identity, the alignment length, and the length of each allele. At the end of the page, you will see a section for technical problems.
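The step from allele numbers to a sequence type can be pictured as a table lookup. The following Python sketch is only an illustration under stated assumptions: the locus names, allele numbers, and ST labels are invented placeholders and are not taken from the pMLST database, and the function `assign_sequence_type` is a hypothetical helper, not part of the tool.

```python
def assign_sequence_type(allele_calls, profiles):
    """Return the ST whose allele profile equals the allele calls, else None.

    allele_calls: dict mapping locus name to the allele number found.
    profiles: dict mapping ST label to its locus-to-allele-number dict.
    """
    for st, profile in profiles.items():
        if profile == allele_calls:
            return st
    return None  # no known ST has this combination of alleles


# Invented placeholder profiles (not real pMLST data).
profiles = {
    "ST-demo-1": {"locus_a": 1, "locus_b": 2, "locus_c": 1},
    "ST-demo-2": {"locus_a": 1, "locus_b": 3, "locus_c": 4},
}

known = assign_sequence_type({"locus_a": 1, "locus_b": 3, "locus_c": 4}, profiles)
novel = assign_sequence_type({"locus_a": 2, "locus_b": 3, "locus_c": 4}, profiles)
print(known, novel)
```

A combination of alleles that is not in the table returns `None` here, which corresponds loosely to the new ST variants mentioned in the evaluation.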
If you have any technical problems using the tools, you can click the link at the bottom of the page and email us. Thank you very much for watching. [MUSIC]