Reducing Toxicity in Language Models

Lilian Weng·AI·March 21, 2021

Large pretrained language models are trained on sizable collections of online data, so they unavoidably acquire toxic behavior and biases from the Internet. These models are very powerful and have shown great success in many NLP tasks. However, deploying them safely in practical real-world applications demands strong safety control over the model generation process.
