What I'm very excited by is the advent of population genomics: improvements in binning techniques coupled with deeper sequencing, which allow you to pull out high-quality, near-complete genomes for uncultured organisms. The binning method that is starting to get more traction is called differential coverage binning. It is based on the idea that if you look at a set of related metagenomes, for instance a time series, a spatial series, or even different DNA extraction methods applied to the same sample, you have the same populations, but they're present in different relative abundances. You can use that pattern of relative abundance as a signature. So you do your assembly and you get back anonymous fragments of genomes, and if you look at the coverage pattern for each of those anonymous fragments, you can bin them together by virtue of their coverage. And that method actually works really well.

So what's really exciting me at the moment is using that technology on two fronts. On the evolutionary front, we can now make genome trees using those genomes, and get a very high-resolution map of the microbial tree. These trees are more robust than 16S trees. So my goal at the moment is to replace the 16S-based phylogeny, and the taxonomy derived from it, with genome-based phylogeny. At the moment we've got a genome tree database with about 12,000 genomes in it, of which about two and a half thousand are these population genomes. My prediction is that two or three years from now, when you go to the public databases, you'll find that the dominant form of draft genomes will be these population genomes, because every study of every habitat produces, you know, usually on the order of dozens of these genomes. And it's not just us; other researchers have been working on this as well.
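The coverage-signature idea described above can be illustrated with a minimal Python sketch. This is not the method of any particular binning tool: the contig names, the coverage numbers, the log transform, the greedy grouping, and the `min_corr` threshold are all assumptions made for illustration. Real pipelines typically use many more samples and combine coverage with other signals.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length coverage profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no abundance pattern
    return cov / (sx * sy)

def bin_by_coverage(coverage, min_corr=0.95):
    """Greedily group contigs whose log-coverage profile across samples
    correlates with the first contig (the seed) of an existing bin."""
    bins = []  # list of (seed_profile, [contig names])
    for name, cov in coverage.items():
        profile = [math.log(c + 1.0) for c in cov]
        for seed, members in bins:
            if pearson(seed, profile) >= min_corr:
                members.append(name)
                break
        else:
            # no existing bin matches, so this contig seeds a new bin
            bins.append((profile, [name]))
    return [members for _, members in bins]

# Hypothetical mean read coverage of four contigs in three related samples
# (for example, three time points from the same habitat).
coverage = {
    "contig_1": [10.0, 40.0, 5.0],
    "contig_2": [12.0, 44.0, 6.0],
    "contig_3": [50.0, 8.0, 30.0],
    "contig_4": [48.0, 9.0, 33.0],
}
bins = bin_by_coverage(coverage)
print(bins)
```

On this toy input, contigs 1 and 2 rise and fall together across the samples and end up in one bin, while contigs 3 and 4 follow a different abundance pattern and form a second bin.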
We have been developing tools for checking the quality of those genomes. So we have ways of checking how complete or contaminated they are, and ways of then quickly piping them into genome trees, so we've spent some time on that. I'm very excited by that because, you know, I have an obsessive-compulsive disorder when it comes to classifying life forms, and this very much meets that requirement of my personality to do it in a robust way, so I don't feel like I'm going around in circles.

The other application of being able to pull out high-quality population genomes from environmental samples is that you can now do your ecological analysis much more robustly. When we first started metagenomics, it became apparent that for complex communities we were kind of stuck, not being able to pull out the component populations. So we would do things like gene-centric analysis, where we look at the relative abundance of gene families rather than working from an organismal context. That has been fine, but the problem is that you don't know who's doing what function, so you're only getting a sort of global overview of community function. With the population genomes, in many cases you can pull out the major players from a given ecosystem, and now you can see which organisms are performing which functions, and you can work out the trophic interaction networks. That's very exciting for ecology because it provides a really solid foundation for understanding the ecosystem.

All right, so Greengenes was started in the early to mid 2000s. The original and main developer is Todd DeSantis. He knew that I was curating 16S sequences in order to get a taxonomy based on phylogeny, which is obviously the way we should all be doing it, because phylogeny is a natural grouping of organisms.
Classification, which is a human construction, should be based on that natural grouping. So that's the goal. He developed the Greengenes database as a vehicle for pulling in the public sequences and then annotating them with all the metadata. I have been the main curator of the database since its inception. My job is to go through, and this is a crazy job, and only a crazy person would do it, and there are a couple of crazy people on the planet who do this kind of thing, where you go through and look at the structure of the tree, ideally with some idea of how robust the tree is, and you reconcile that phylogeny with the currently accepted nomenclature for taxonomy. There are good resources for nomenclature. There's a committee that decides on the names of organisms and then the higher ranks. What you find when you do that is that there are numerous instances where the taxonomy doesn't match the phylogeny, so then it's a process of trying to reconcile the two. Another major issue is that, because so much of the diversity is not represented by cultured organisms, there are big swaths of the phylogenetic tree that have no classification at all. So another part of Greengenes is to give some form of classification to these uncharacterized parts of the tree. The main programmer is Daniel McDonald, thanks to the very generous hosting of Rob Knight, who has been so supportive of Greengenes. And others are still involved as well.

My take on the situation is that with whole-genome-based phylogeny, we're at about where 16S was in about 2001. So we're not even that far off the pace. I think that another 10 years from now, genome-tree-based phylogeny and taxonomy will be at about the level, in number of sequences, that we have with 16S. So I predict we're going to grow a long way from the 10,000 to 20,000 or so sequences we have now.
I'd say we'll get to about half a million genomes ten years from now. Then we should have very nice, comprehensive coverage of the tree of life, in a taxonomy that's not compromised by chimeric artifacts. Hallelujah! And after that, I don't know. You know, part of the fun is the journey. I hope this journey will never be over, of course, because there's always going to be more diversity to discover. But that will be a far more solid basis for the taxonomy.

So, during curation of the Greengenes database I noticed, and not only me, other people noticed too, that there was quite a large cluster of environmental sequences grouping with the cyanobacteria. These sequences were coming from habitats that weren't exposed to sunlight. So the nagging question in the back of your mind would be: are these non-photosynthetic cyanobacteria? The dogma in microbiology for decades has been that all cyanobacteria are photosynthetic, so it was an attractive target. We started to make primers and probes that would target these basal cyanobacteria, which we nicknamed Darcy, short for dark cyanobacteria. Other groups were also interested; Ruth Ley's group was looking for them in parallel. We ended up using these population genomic methods to recover the genomes, and as it turned out, they fell right out. We got good-quality Darcy genomes from a range of habitats, including the koala gut, a bioreactor, and a full-scale industrial granular sludge. We looked in those genomes and, sure enough, they had no photosynthetic apparatus. And if you do the phylogeny, there was a very robust clustering of that group with the photosynthetic cyanobacteria. So in terms of taxonomy we wanted to call them cyanobacteria, and that met a huge amount of resistance. Anytime you challenge a dogma, you're going to meet resistance. And so what they ended up doing was classifying it differently.
They classified that group as a sister phylum to the cyanobacteria called the Melainabacteria, which is Greek, named for a Greek nymph, a dark nymph. That was a lot less controversial, because you're still managing to maintain the dogma that all cyanobacteria are photosynthetic. Now, in some ways it's a semantic argument, right? Because taxonomy is human-made. But the point is that this group is reproducibly monophyletic with the cyanobacteria, so there is a last common ancestor from before the introduction of the photosynthetic apparatus, which must have occurred after the two groups divided. So we should be able to learn something about the ancestry of photosynthesis by studying this sister group.

Now, the power of a name, though. This is the interesting thing. The original paper calling them a sister phylum, the Melainabacteria, sort of went out without much fanfare. It was a cool study, but it was another candidate phylum for which we now have genomes, and that's becoming more regular now with the new techniques. Because it didn't make the controversial claim that they were cyanobacteria, it didn't get much attention. We got a fair bit of attention for our paper for calling them cyanobacteria. So you can see the power of the name, and that's something to be remembered, because people say, well, taxonomy is such a dry discipline, but really there is power in names.