Reddit Inc. is weighing feedback from early meetings with potential investors in its initial public offering that it should consider a valuation of at least $5 billion, according to people familiar with the matter, even as it is estimated below that figure in the volatile market for shares of private companies.
Most LLMs have tonnes of NSFW data in their training.
Typically, if you want that blocked, a secondary RAG or LoRA layer is run on top as a filtering mechanism to catch, block, and regenerate explicit responses.
Furthermore, restricting output to an allowed lexicon is a whole thing of its own.
Unfiltered LLMs without these layers added on are actually quite explicit and very much capable of generating extremely NSFW output by default.
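For anyone curious what "catch, block, and regenerate" looks like in practice, here is a rough sketch. It is not how any particular vendor does it: a plain word blocklist stands in for the RAG/LoRA filter mentioned above, and `BLOCKED_TERMS`, `FALLBACK`, `is_explicit`, `filtered_reply`, and `make_stub` are all made-up names for illustration. The `generate` argument is assumed to be any function that takes a prompt and returns text.

```python
import re
from typing import Callable, Iterable, List

# Hypothetical blocklist; a real filter would be a trained classifier or similar.
BLOCKED_TERMS = {"badword", "worseword"}
# Hypothetical safe reply used when every attempt gets blocked.
FALLBACK = "Sorry, I can't help with that."


def is_explicit(text: str, blocked: Iterable[str] = BLOCKED_TERMS) -> bool:
    """Catch step: True if any blocked term appears as a whole word."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return any(term in words for term in blocked)


def filtered_reply(generate: Callable[[str], str], prompt: str, max_tries: int = 3) -> str:
    """Block and regenerate: retry until a reply passes the filter or tries run out."""
    for _ in range(max_tries):
        candidate = generate(prompt)
        if not is_explicit(candidate):
            return candidate
    return FALLBACK


def make_stub(replies: List[str]) -> Callable[[str], str]:
    """Stand-in for an unfiltered model: returns canned replies in order."""
    it = iter(replies)
    return lambda prompt: next(it)


if __name__ == "__main__":
    model = make_stub(["Some badword answer", "Slice it into a salad."])
    print(filtered_reply(model, "What can I do with this cucumber?"))
```

The first canned reply trips the filter, so the wrapper asks again and returns the second one. If every attempt is blocked, the caller gets the fallback text instead of the explicit output.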
@pixxelkick @ardi60 Well, if anyone wants to buy it for that purpose, then I just hope they remember to screen out the more NSFW parts of Reddit.
Otherwise, their bots are going to start giving some rather unfortunate responses to customer questions…
I am looking forward to the hilarity of it for a while though.
“Cooking bot, I have found this cucumber I need to use before it goes bad. What can I do with it?”
“Shove it up your rectum”
Could lead to a lot of interesting lawsuits and make a lot of MBA bros look rather stupid.