09.05.2023, 18:09, Source: Engadget

Anthropic explains how its Constitutional AI girds Claude against adversarial inputs

It is not at all hard to trick today’s chatbots into discussing taboo topics, regurgitating bigoted content and spreading misinformation. That’s why AI pioneer Anthropic has imbued its generative AI, Claude, with a mix of 10 secret principles of fairness, which it unveiled in March. In a blog post Tuesday, the company further explained how its Constitutional AI system is designed and how it is intended to operate.

Normally, when a generative AI model is being trained, there’s a human in the loop to provide quality control and feedback on the outputs, as when ChatGPT or Bard asks you to rate your conversations with their systems. “For us, this involved having human contractors compare two responses,” the Anthropic team wrote, “from a model and select the one they felt was better according to some principle (for example, choosing the one that was more helpful, or more harmless).”

The problem with this method is that a human also has to be in the loop for…
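To make the mechanism concrete, here is a minimal Python sketch, not Anthropic’s actual code, of the pairwise comparison the quote describes: a judge picks the better of two responses according to a written principle. Under RLHF that judge is a human contractor; under Constitutional AI it is a model steered by the principle itself. The generate and pick_better functions and the principle text are hypothetical placeholders.

    # Minimal illustrative sketch; every name here is a hypothetical
    # placeholder, not Anthropic's implementation or API.

    PRINCIPLE = "Choose the response that is more helpful and more harmless."

    def generate(prompt: str) -> str:
        """Placeholder for a call to any text-generation model."""
        raise NotImplementedError("plug in a real model call here")

    def pick_better(question: str, response_a: str, response_b: str) -> str:
        """Ask a judge which response better follows the principle.
        In RLHF a human contractor answers; in Constitutional AI the
        model itself answers, guided by the written principle."""
        verdict = generate(
            f"Principle: {PRINCIPLE}\n"
            f"Question: {question}\n"
            f"Response A: {response_a}\n"
            f"Response B: {response_b}\n"
            "Answer with the single letter A or B."
        )
        return response_a if verdict.strip().upper().startswith("A") else response_b

    # The preferred responses become the training signal for a preference
    # model, which then fine-tunes the assistant; moving the judging from
    # humans to a principle-guided model is what removes the human
    # bottleneck the article goes on to discuss.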

Continue reading at Engadget

