9 May 2023, 18:09, Source: Engadget
Anthropic explains how its Constitutional AI girds Claude against adversarial inputs
It is not hard at all to trick today’s chatbots into discussing taboo topics, regurgitating bigoted content and spreading misinformation. That’s why AI pioneer Anthropic has imbued its generative AI, Claude, with a mix of 10 secret principles of fairness, which it unveiled in March. In a blog post Tuesday, the company further explained how its Constitutional AI system is designed and how it is intended to operate.
Normally, when a generative AI model is being trained, there’s a human in the loop to provide quality control and feedback on the outputs, like when ChatGPT or Bard asks you to rate your conversations with their systems. “For us, this involved having human contractors compare two responses,” the Anthropic team wrote, “from a model and select the one they felt was better according to some principle (for example, choosing the one that was more helpful, or more harmless).”
The problem with this method is that a human also has to be in the loop fo
Read more at Engadget