Hi, everyone. Today, we're going to learn how to use PATRIC's Protein Family Sorter. This is a tool that allows you to compare protein families across hundreds of genomes. In today's demonstration, we'll learn how to find this tool and how to launch it.
The Protein Family Sorter uses protein families, of course, but anybody who does any comparative genomics is justified in asking: which protein families? Well, any genome that is available in PATRIC, be it a public or a private one, as long as it's been annotated in PATRIC, is assigned two types of protein families, which we call PATtyFams. This is the manuscript that describes them. It's a k-mer-based approach that uses the function-based assignments in PATRIC. For every genome annotated in PATRIC, we provide both types of protein families. One is specific at the genus level, which also includes the species and the strains. We call these our local families, or PLfams. We also have protein families that cross the genus boundary. These are our global protein families, or PGfams.
Now that you know about the protein families, let's talk about finding and using the tool in PATRIC. Let's go into Services, and over here, under Protein Tools, click on Protein Family Sorter. This will refresh the page, and you can see that you can add genomes individually or as groups of genomes. Now, obviously, if you want to compare 400 genomes, it would be a real pain to add genomes individually. You'd want to create a group for those, which we've done in a different video that you can watch to see how to do that.
But let's say you did want to add a genome individually. Notice this filter icon. If I click on that, it allows me to filter my search. Let's say I don't want to use my private genomes. PATRIC also has over 300,000 public genomes, and maybe I don't want to search all of those either. The genome I'm looking for, I know it was designated as a reference or representative genome by GenBank, and we use those designations in PATRIC.
That's about 8,000 genomes. But the search box here is a smart search box. I know the strain name is NVSL. Look, I just start typing that and, ta-da, this genome pops up. I click on that. Now I've selected this genome and I want to use it, but it has to get over here into the selected genomes box to be included in the tool. Notice that I can't click on the search button yet; the tool has to have some genomes to search on. I can move the genome over into the selected genomes box by clicking here.
Now I have one genome. All it takes is one, but we want to add a few more. Under Select Genome Group, you can start typing the names of groups that you've created. If it's something you did recently, you can click here and find the genome group that you want to include; the list generally starts with the most recently created group and goes all the way back. I'll put that in there and move it over. Let me also note that you can get into your workspace by clicking the folder here, and then you can see other groups. If you've created genome groups in a different folder system, you can use those.
Remember, we've talked about the two different types of protein families. Actually, there are three. We still have FIGfams, which are functional families, but those aren't populated broadly across PATRIC, so I wouldn't use those. If you have genomes from two different genera, you'd need to use the cross-genus families. But for this example, all I have are genomes within the genus Brucella, so I'm going to click on the genus-specific families. I could use the global families too, and that should work just as well, but I just want to demonstrate the genus-specific families. Then I'm ready to click the search button and see the tool and how it works in the protein table. We'll be showing that in the next instructional video, which covers manipulating the Protein Family Sorter tool. Thanks for watching.
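To make the difference between the family types concrete, here is a minimal Python sketch of why genus-specific families (PLfams) only line up within one genus, while global families (PGfams) can be compared across genera. Everything in it is an assumption for illustration: the genome names, the family IDs, and the helper functions `family_presence` and `shared_by_all` are invented, and this is not how PATRIC itself computes or stores protein families.

```python
from collections import defaultdict

# Hypothetical toy annotations. The genome names and family IDs are
# invented for illustration and do not come from PATRIC.
genomes = {
    "Brucella_A": {
        "plfams": {"PLF_bru_1", "PLF_bru_2"},
        "pgfams": {"PGF_1", "PGF_2"},
    },
    "Brucella_B": {
        "plfams": {"PLF_bru_1", "PLF_bru_3"},
        "pgfams": {"PGF_1", "PGF_3"},
    },
    "Ochrobactrum_C": {
        "plfams": {"PLF_och_1"},
        "pgfams": {"PGF_1", "PGF_2"},
    },
}


def family_presence(genomes, family_type):
    """Map each family ID to the set of genomes that contain it."""
    presence = defaultdict(set)
    for name, record in genomes.items():
        for family_id in record[family_type]:
            presence[family_id].add(name)
    return dict(presence)


def shared_by_all(genomes, family_type, names):
    """Return the sorted family IDs present in every genome listed in names."""
    wanted = set(names)
    presence = family_presence(genomes, family_type)
    return sorted(fam for fam, members in presence.items() if wanted <= members)


# Within one genus, genus-specific families can be shared.
within_genus = shared_by_all(genomes, "plfams", ["Brucella_A", "Brucella_B"])

# Across genera, genus-specific IDs never match, so nothing is shared.
across_genera_local = shared_by_all(genomes, "plfams", ["Brucella_A", "Ochrobactrum_C"])

# Global families are assigned across the genus boundary, so they can match.
across_genera_global = shared_by_all(genomes, "pgfams", ["Brucella_A", "Ochrobactrum_C"])
```

In this toy data the two Brucella genomes share a local family, but the Brucella and Ochrobactrum genomes share only global families, which mirrors the advice to pick the cross-genus families whenever more than one genus is in the comparison.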
Now, you finally get to submit a Protein Family Sorter job. Remember that each of the assignments builds on the previous one. In the first one, we had you look for what I hoped you'd find: your private genomes, which you assembled with Unicycler. Secondly, we had you create two groups, one with the Unicycler genomes and one with the Canu assemblies of the same reads. Keep that in mind: the same reads were used with each of these assembly strategies. Now, we're going to launch a job. I want you to submit a Protein Family Sorter job using the two genome groups. First, add both groups, select the PLfams, and launch the job. Did you get any protein families? If not, why? Then secondly, select the PGfams as the protein families. Did you get any protein families? If not, why? Keep that tab open. There's a lot more after this.