Jorge Guerra Pires
Meet the Toxicity Classifier
Google provides a few “ready-to-go” models of varying complexity. One of them, the Toxicity model, is perhaps the most straightforward and useful model for beginners.
As with all programming, a model requires specific input and provides specific
output. To kick things off, let’s take a look at what those are for this model. Toxicity
detects toxic content such as threats, insults, cussing, and generalized hate. Since
those categories aren’t necessarily mutually exclusive, each of these violations gets its
own probability.
The Toxicity model estimates, for each of the following characteristics, the probability
that a given input does or does not qualify as that kind of violation:
• Identity attack
• Insult
• Obscene
• Severe toxicity
• Sexually explicit
• Threat
• Toxicity
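If you’d like to follow along in code, the model ships on npm as @tensorflow-models/toxicity. A minimal loading sketch might look like the following; the 0.9 threshold is only an example value, and load also accepts an optional second argument (omitted here) that restricts the model to a subset of the labels above.

```js
// Browser or Node; assumes both packages below are installed from npm.
import '@tensorflow/tfjs';                        // TensorFlow.js backend
import * as toxicity from '@tensorflow-models/toxicity';

// The threshold is the minimum confidence the model needs before it
// flags a label as a definite "match"; 0.9 is just an example value.
const threshold = 0.9;

toxicity.load(threshold).then((model) => {
  console.log('Toxicity model loaded');
});
```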
When you give the model a string, it returns an array of seven objects, one prediction
per specific violation. Each prediction is represented as two Float32 values between
zero and one: the probability that the input is not a violation and the probability
that it is.
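Here is a rough sketch of what that array looks like in practice; the shape in the comment is my paraphrase of the library’s output, so compare it against what you see in the console.

```js
import '@tensorflow/tfjs';
import * as toxicity from '@tensorflow-models/toxicity';

const model = await toxicity.load(0.9);
const predictions = await model.classify(['you are a total fool']);

// predictions holds seven objects, one per violation, roughly shaped like:
// {
//   label: 'insult',
//   results: [{ probabilities: Float32Array [notViolation, violation], match: true }]
// }
predictions.forEach((p) => {
  console.log(p.label, p.results[0].probabilities);
});
```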
If a sentence is almost surely not a violation, most of the value will sit at index zero
of the Float32 array.
For example, [0.7630404233932495, 0.2369595468044281] reads as a 76% probability that
the input is not a violation of this particular category and a 24% probability that it is. We are going to use this information for our upcoming addition to the course: an insult meter.
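As a preview of that insult meter, here is a hedged sketch that turns the second value of the pair into a percentage; insultMeter is my own helper name, not part of the library, and the exact label strings come from the model itself.

```js
import '@tensorflow/tfjs';
import * as toxicity from '@tensorflow-models/toxicity';

async function insultMeter(sentence) {
  const model = await toxicity.load(0.9);
  const predictions = await model.classify([sentence]);

  // Grab the prediction for the "insult" label and read index 1:
  // the probability that the sentence IS an insult.
  const insult = predictions.find((p) => p.label === 'insult');
  const violationProbability = insult.results[0].probabilities[1];

  return Math.round(violationProbability * 100);
}

insultMeter('you are wrong').then((pct) => console.log(`${pct}% insulting`));
```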
Tell me more!
Reference
Laborde, Gant. Learning TensorFlow.js. O’Reilly Media, 2021.