So, welcome to our third section.
We're going to talk now about going beyond chemistry,
looking at the biological effects of substances and how we can make use of
these for our Read-Across arguments.
The Good Read-Across Practice Guidance is only one step to Good Read-Across Practice.
We also need data,
we need tools and we hope to use biological support data which is the topic here.
We actually accompanied our document on Good Read-Across Practice with
a second one, written with a subset of the authors, which is dealing
with the use of biological data for Read-Across.
We separated this because we felt that here
even less guidance can be given on how to do it routinely.
We can only give some examples where it has been done successfully so that
others can learn and possibly apply similar strategies.
The four examples I'm going to share, which are documented in this paper,
start with the use of public big data: Hao Zhu and
his team from Rutgers have given some examples here.
We will see some data produced by Nicole Kleinstreuer
from the National Institute of Environmental Health Sciences,
who's been using the ToxCast and Tox21 data to support Read-Across.
We will learn from Stemina who are using
metabolomics and in vitro systems and we will hear from BASF,
the producer of industrial chemicals,
how they are using short term animal tests and metabolomics to do this very same thing.
So let's start with the use of public big data in Rutgers.
Hao Zhu's group is using PubChem.
PubChem is one of the largest databases which
is incorporating more and more biological information.
Already in 2014, when this assessment was done, more than 700,000
bioassays were available, with 200 million biological outcomes.
This included about 1.2 billion data points on
2.8 million small molecules and 1.9 million chemical structures.
So, you can see that this is a tremendous data source.
However, it is a data set which is produced by accumulation not by systematic testing.
So the sources are very varied.
The data sets are sparse.
You cannot expect to have the exact same data for two different molecules.
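To make the sparsity point concrete, here is a minimal sketch of how two compounds can only be compared on the assays both were tested in. This is not the CIIPro algorithm; the assay IDs, the outcomes, and the `min_shared` threshold are invented for illustration.

```python
from typing import Dict, Optional

# Hypothetical sparse bioprofiles: assay ID -> outcome (1 = active, 0 = inactive).
# An assay missing from a profile means "not tested", which is not the same as inactive.
Profile = Dict[str, int]


def shared_assay_similarity(a: Profile, b: Profile, min_shared: int = 3) -> Optional[float]:
    """Fraction of concordant outcomes over the assays both compounds were tested in.

    Returns None when fewer than `min_shared` assays overlap, because a
    similarity based on one or two shared assays carries very little weight.
    """
    shared = set(a) & set(b)
    if len(shared) < min_shared:
        return None
    agree = sum(1 for assay in shared if a[assay] == b[assay])
    return agree / len(shared)


# Made-up example profiles (the assay IDs are placeholders, not real PubChem records).
compound_x = {"AID_1": 1, "AID_2": 0, "AID_3": 1, "AID_7": 1}
compound_y = {"AID_1": 1, "AID_3": 1, "AID_7": 0, "AID_9": 1}
compound_z = {"AID_4": 1, "AID_9": 0}

sim_xy = shared_assay_similarity(compound_x, compound_y)  # three shared assays, two agree
sim_xz = shared_assay_similarity(compound_x, compound_z)  # no shared assays
```

Here `sim_xy` is 2/3, while `sim_xz` is `None`: with no overlapping assays, no statement about biological similarity can be made at all.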
This group produced a tool
which is called the Chemical In Vitro-In Vivo Profiling tool, CIIPro,
and this tool is publicly available on the website at Rutgers,
you see the URL up here on top of this molecule.
We have also just published jointly on this: how CIIPro
is a new Read-Across portal to fill
data gaps using public large-scale chemical and biological data.
This paper is in press,
but the advance access version is already available.
So this approach is trying to make use of
enormous data sets of not necessarily optimal quality,
but where mass does compensate for a possible lack of quality.
The opposite approach was done by Nicole Kleinstreuer,
formerly part of the ToxCast team and
now at the National Institute of Environmental Health Sciences.
She developed a process which she calls Bioactivity Based Read-Across (BaBRA),
and a combined approach,
where structural information is also used, which she calls St. BaBRA.
She was using the ToxCast information,
and you should know by now from this lecture series that ToxCast has
more than 700 endpoints which have been applied to about 2000 chemicals,
and that we have, via a dashboard, an openly available data set of
concentration-response curves with replicates: a highly curated data set
biologically characterizing these 2000 substances.
I don't have the time to go into details,
but she used this successfully for
the example of endocrine disruption by the estrogenic route.
So substances which are exerting effects like an estrogen.
And very impressively she combined this with the uterotrophic database.
The uterotrophic assay is the animal test for estrogenic endocrine disruption.
And by combining structural and biological activity,
the best model was actually showing 97 percent balanced accuracy, which is unheard of.
This means that by identifying three substances which had
similar properties with regard to
biological effects and also similarity in chemical structure,
she achieved this enormous sensitivity and
specificity with only this type of information.
81 compounds could be included in this analysis,
which is very good evidence that such data sets can actually support Read-Across.
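The idea of reading across from the three most similar substances, and of scoring the result by balanced accuracy, can be sketched as follows. This is only an illustration under stated assumptions, not the published model: the feature sets, the equal weighting of structure and bioactivity, the Tanimoto measure, and the majority vote are all choices made here for the example.

```python
from typing import List, Set


def tanimoto(a: Set[str], b: Set[str]) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def combined_similarity(x: dict, y: dict, weight: float = 0.5) -> float:
    """Weighted mix of structural and bioactivity similarity (the weight is an assumption)."""
    s_struct = tanimoto(x["structure"], y["structure"])
    s_bio = tanimoto(x["bioactivity"], y["bioactivity"])
    return weight * s_struct + (1 - weight) * s_bio


def read_across(target: dict, sources: List[dict], k: int = 3) -> int:
    """Predict activity by majority vote over the k most similar source compounds."""
    ranked = sorted(sources, key=lambda s: combined_similarity(target, s), reverse=True)
    votes = [s["active"] for s in ranked[:k]]
    return int(sum(votes) * 2 > len(votes))


def balanced_accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Mean of sensitivity and specificity."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative compound")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Invented source compounds with known in vivo outcome (1 = active, 0 = inactive).
sources = [
    {"structure": {"s1", "s2", "s3"}, "bioactivity": {"b1", "b2"}, "active": 1},
    {"structure": {"s1", "s2"}, "bioactivity": {"b1", "b2", "b3"}, "active": 1},
    {"structure": {"s1", "s3"}, "bioactivity": {"b1"}, "active": 1},
    {"structure": {"s7", "s8"}, "bioactivity": {"b9"}, "active": 0},
    {"structure": {"s8", "s9"}, "bioactivity": {"b8", "b9"}, "active": 0},
]

# Two invented target compounds without in vivo data.
target_t = {"structure": {"s1", "s2", "s3"}, "bioactivity": {"b1", "b2"}}
target_u = {"structure": {"s7", "s8", "s9"}, "bioactivity": {"b9"}}

prediction_t = read_across(target_t, sources)
prediction_u = read_across(target_u, sources)

# Balanced accuracy on a made-up set of true and predicted labels.
ba = balanced_accuracy([1, 1, 1, 0, 0], [1, 1, 0, 0, 0])
```

In this toy data, `target_t` is predicted active and `target_u` inactive. The balanced accuracy of the made-up label set is 5/6: sensitivity is 2/3 and specificity is 1, so neither class can dominate the score.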
Her conclusion was that there's great promise within
certain applicability domains and where well-curated data
exist, because this is, I think, the prerequisite for her work.
The fact that endocrine disruption by the estrogenic route was
a very well covered endpoint, both in the animal studies to compare to
and in the high-throughput screening set that she
was using, was the basis for the success.
We cannot expect that this will work for many many other endpoints.
She concludes that feature selection and
optimization methods need to be explored to further improve
the predictive accuracy and applicability, so that also
a good separation is achieved between the positive and the negative space,
which means the substances which show the toxic properties and those which don't.
And ToxRefDB, the EPA database with the legacy data from
animal studies, needs to have
such endpoints covered with positive and negative data to allow this.
There are a lot of challenges.
She's currently trying to expand this approach to
other health effects, moving from endocrine targets to reproductive impairment or cancer.
So there's follow-up work taking place at the moment.
Similar work is being done at this moment at the Environmental Protection Agency.
So there's a lot to be expected from how
these large data sets are now coming into use to support Read-Across.
Stemina is a biotech company in Madison,
Wisconsin, which is promoting, among others, the devTOX assay.
This devTOX assay is an assay where stem cells are being used,
and these stem cells are growing on 96-well plates,
and then they are allowed to spontaneously differentiate
into some type of embryonal tissues within a very short period of time only.
And by using mass spectrometry,
some biomarkers are being measured,
and they have developed algorithms to analyze the likely teratogenicity,
so the damage to the growing embryo, from these biomarkers.
These methods have been published.
Here, we are only interested in
one aspect, which is how to use this type of information for Read-Across.
And they have been doing this for some conazole fungicides.
For four of these fungicides, data on their teratogenic effects were available.
For the fifth one, Myclobutanil, they were not.
So the question is, are these substances, which are relatively similar with
regard to their physicochemical properties as you can see here in this table,
also behaving similarly with regard to the biological activity in this assay?
And indeed, without going into any detail,
the fifth substance, Myclobutanil, was behaving just like the others
with respect to the teratogenic effect, so that they could conclude that
Myclobutanil has a similar potential for developmental toxicity as
the substances which had been tested both in vitro and in animals.
The last example originates from BASF.
BASF has already studied more than 500 compounds in short-term animal studies,
typically with about seven days of daily treatment.
And then they take a tiny amount of plasma from these animals, blood plasma,
and identify, by metabolomics,
the pattern of small molecules in the blood of these animals.
And they measure about 9000 different features in
this mass spectrometry, and these features, some of them identified,
some not identified, can then be used in order to
establish similarity of biological effect in these animals.
To give you an example of how this approach can be applied,
I would like to use the grouping of
two different substances which are structurally very
similar, which are 2- and 4-Acetylaminofluorene.
These substances, while chemically very
similar, are quite different in their biological effect.
The 2-Acetylaminofluorene is a strong liver enzyme inducer and a liver carcinogen,
also an immunosuppressant and a bladder carcinogen.
The 4-compound is only a slight enzyme inducer, shows no carcinogenicity in the liver,
only some lipid accumulation, and is an immunosuppressant as well.
So the question is, can we distinguish these two?
Are they showing differences in the pattern of
metabolites in the blood of animals after the seven-day treatment?
What you can see here is a comparison of the two.
This is the 2-acetylaminofluorene,
and this is the 4-compound.
And you see that they are very different with
regard to significant changes of metabolites so
that the 4-Acetylaminofluorene is
actually very much distinguished not only from the 2-compound,
but also from other treatments which are shown to
the right which are all liver enzyme inducers.
So you see biologically these two have very little in common.
And if you rank the 500 other treatments by similarity,
4- and 2-Acetylaminofluorene are
actually showing up at very different positions in the ranking.
The most similar treatments which are found for these two compounds have nothing in common.
So in the ranking, the treatment with 4-Acetylaminofluorene only shows up at position
209 for 2-Acetylaminofluorene and,
vice versa, the 2-compound only shows up at position 368 for the 4-compound.
So they absolutely do not rank with each other, which is clear evidence that
the biology says these substances are not similar, although the structure is quite similar.
So they are behaving biologically differently.
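The ranking logic described here can be sketched in a few lines. The similarity measure actually used for the metabolome profiles is not stated in this lecture, so Pearson correlation is an assumption, and the treatment names and profile values are invented toy data with four features instead of thousands.

```python
from math import sqrt
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation between two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def rank_position(query: str, other: str, profiles: Dict[str, List[float]]) -> int:
    """1-based position of `other` when all treatments except `query`
    are sorted by descending correlation with the `query` profile."""
    candidates = [name for name in profiles if name != query]
    ranked = sorted(
        candidates,
        key=lambda name: pearson(profiles[query], profiles[name]),
        reverse=True,
    )
    return ranked.index(other) + 1


# Hypothetical metabolite change profiles (one value per measured feature).
profiles = {
    "compound_A": [1.0, 2.0, 3.0, 4.0],
    "inducer_1": [1.1, 2.1, 2.9, 4.2],
    "inducer_2": [2.0, 1.0, 4.0, 3.0],
    "compound_B": [4.0, 3.0, 2.0, 1.0],
}

pos_b_for_a = rank_position("compound_A", "compound_B", profiles)
pos_a_for_b = rank_position("compound_B", "compound_A", profiles)
```

In this toy set, each of the two compounds ends up last (position 3 of 3) in the other's ranking, which is the same kind of signal as positions 209 and 368 in the real data: the two profiles do not find each other.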
There are many more sources for biological data.
In our report, we give this table, which simply shows
you that there is a broad resource which can be tapped
for establishing biological similarity, in order to then make
a case for this biological similarity as a Read-Across argument.
So this section has hopefully shown you that, while this is not yet
a standard procedure, we are getting, via case studies,
via examples like the four discussed,
more and more evidence that biological similarity
is an interesting approach to complement structural similarity.