AI systems favor sycophancy over truthful answers, says new report

‘It takes tinkering, experimenting’—sCrypt Hackathon calls for developers to build the killer app

Build the next killer app: Registration is still open for sCrypt Hackathon 2024

var TRINITY_TTS_WP_CONFIG={“cleanText”:”AI systems favor sycophancy over truthful answers, says new report.u23f8Researchers from Anthropic AI have uncovered traits of sycophancy in popular artificial intelligence (AI) models, demonstrating a tendency to generate answers based on the usersu2019 desires rather than the truth.u23f8According to the study exploring the psychology of large language models (LLMs), both human and machine learning models have been shown to exhibit the trait. The researchers say the problem stems from using reinforcement learning from human feedback (RLHF), a technique deployed in training AI chatbots.u23f8u201cSpecifically, we demonstrate that these AI assistants frequently wrongly admit mistakes when questioned by the user, give predictably biased feedback, and mimic errors made by the user,u201d read the report. u201cThe consistency of these empirical findings suggests sycophancy may indeed be a property of the way RLHF models are trained.u201du23f8Anthropic AI researchers reached their conclusions from a study of five leading LLMs, exploring generated answers from the models to gauge the extent of sycophancy. Per the study, all the LLM produced u201cconvincingly-written sycophantic responses over correct ones a non-negligible fraction of the time.u201du23f8For example, the researchers incorrectly prompted chatbots that the sun appears yellow when viewed from space. In reality, the sun appears white in space, but the AI models u201challucinatedu201d an incorrect response.u23f8Even in cases where models generate the correct answers, researchers noted that a disagreement with the response is enough to trigger models to change their responses to reflect sycophancy.u23f8Anthropicu2019s research did not solve to the problem but suggested developing new training models for LLMs that do not require human feedback. Several leading generative AI models like OpenAIu2019s ChatGPT or Googleu2019s (NASDAQ: GOOGL) Bard rely on RLHF for their development, casting doubt on the integrity of their responses.u23f8During Bardu2019s launch in February, the product made a gaffe over the satellite that took the first pictures outside the solar system, wiping off $100 billion from Alphabet Incu2019s (NASDAQ: GOOGL) market value.u23f8AI is far from perfectu23f8Apart from Bardu2019s gaffe, researchers have unearthed a number of errors stemming from the use of generative AI tools. The challenges identified by the researchers include streaks of bias and hallucinations when LLMs perceive nonexistent patterns.u23f8Researchers pointed out that the success rates of ChatGPT in spotting vulnerabilities in Web3 smart contracts plummeted significantly over time. Meanwhile, OpenAI shut down its tool for detecting AI-generated texts over its significantly u201clow rate of accuracyu201d in July as it grappled with the concerns of AI superintelligence.u23f8Watch: AI truly is not generative, itu2019s syntheticu23f8″,”headlineText”:”AI systems favor sycophancy over truthful answers, says new report”,”articleText”:”Researchers from Anthropic AI have uncovered traits of sycophancy in popular artificial intelligence (AI) models, demonstrating a tendency to generate answers based on the usersu2019 desires rather than the truth.u23f8According to the study exploring the psychology of large language models (LLMs), both human and machine learning models have been shown to exhibit the trait. The researchers say the problem stems from using reinforcement learning from human feedback (RLHF), a technique deployed in training AI chatbots.u23f8u201cSpecifically, we demonstrate that these AI assistants frequently wrongly admit mistakes when questioned by the user, give predictably biased feedback, and mimic errors made by the user,u201d read the report. u201cThe consistency of these empirical findings suggests sycophancy may indeed be a property of the way RLHF models are trained.u201du23f8Anthropic AI researchers reached their conclusions from a study of five leading LLMs, exploring generated answers from the models to gauge the extent of sycophancy. Per the study, all the LLM produced u201cconvincingly-written sycophantic responses over correct ones a non-negligible fraction of the time.u201du23f8For example, the researchers incorrectly prompted chatbots that the sun appears yellow when viewed from space. In reality, the sun appears white in space, but the AI models u201challucinatedu201d an incorrect response.u23f8Even in cases where models generate the correct answers, researchers noted that a disagreement with the response is enough to trigger models to change their responses to reflect sycophancy.u23f8Anthropicu2019s research did not solve to the problem but suggested developing new training models for LLMs that do not require human feedback. Several leading generative AI models like OpenAIu2019s ChatGPT or Googleu2019s (NASDAQ: GOOGL) Bard rely on RLHF for their development, casting doubt on the integrity of their responses.u23f8During Bardu2019s launch in February, the product made a gaffe over the satellite that took the first pictures outside the solar system, wiping off $100 billion from Alphabet Incu2019s (NASDAQ: GOOGL) market value.u23f8AI is far from perfectu23f8Apart from Bardu2019s gaffe, researchers have unearthed a number of errors stemming from the use of generative AI tools. The challenges identified by the researchers include streaks of bias and hallucinations when LLMs perceive nonexistent patterns.u23f8Researchers pointed out that the success rates of ChatGPT in spotting vulnerabilities in Web3 smart contracts plummeted significantly over time. Meanwhile, OpenAI shut down its tool for detecting AI-generated texts over its significantly u201clow rate of accuracyu201d in July as it grappled with the concerns of AI superintelligence.u23f8Watch: AI truly is not generative, itu2019s syntheticu23f8″,”metadata”:{“author”:”Wahid Pessarlay”},”pluginVersion”:”5.7.1″};

Researchers from Anthropic AI have uncovered traits of sycophancy in popular artificial intelligence (AI) models, demonstrating a tendency to generate answers based on the users’ desires rather than the truth.

According to the study exploring the psychology of large language models (LLMs), both human and machine learning models have been shown to exhibit the trait. The researchers say the problem stems from using reinforcement learning from human feedback (RLHF), a technique deployed in training AI chatbots.

“Specifically, we demonstrate that these AI assistants frequently wrongly admit mistakes when questioned by the user, give predictably biased feedback, and mimic errors made by the user,” read the report. “The consistency of these empirical findings suggests sycophancy may indeed be a property of the way RLHF models are trained.”

Anthropic AI researchers reached their conclusions from a study of five leading LLMs, exploring generated answers from the models to gauge the extent of sycophancy. Per the study, all the LLM produced “convincingly-written sycophantic responses over correct ones a non-negligible fraction of the time.”

For example, the researchers incorrectly prompted chatbots that the sun appears yellow when viewed from space. In reality, the sun appears white in space, but the AI models “hallucinated” an incorrect response.

Even in cases where models generate the correct answers, researchers noted that a disagreement with the response is enough to trigger models to change their responses to reflect sycophancy.

Anthropic’s research did not solve to the problem but suggested developing new training models for LLMs that do not require human feedback. Several leading generative AI models like OpenAI’s ChatGPT or Google’s (NASDAQ: GOOGL) Bard rely on RLHF for their development, casting doubt on the integrity of their responses.

During Bard’s launch in February, the product made a gaffe over the satellite that took the first pictures outside the solar system, wiping off $100 billion from Alphabet Inc’s (NASDAQ: GOOGL) market value.

AI is far from perfect

Apart from Bard’s gaffe, researchers have unearthed a number of errors stemming from the use of generative AI tools. The challenges identified by the researchers include streaks of bias and hallucinations when LLMs perceive nonexistent patterns.

Researchers pointed out that the success rates of ChatGPT in spotting vulnerabilities in Web3 smart contracts plummeted significantly over time. Meanwhile, OpenAI shut down its tool for detecting AI-generated texts over its significantly “low rate of accuracy” in July as it grappled with the concerns of AI superintelligence.

Watch: AI truly is not generative, it’s synthetic

width=”562″ height=”315″ frameborder=”0″ allowfullscreen=”allowfullscreen”>

New to blockchain? Check out Thecryptodefi.Com’s Blockchain for Beginners section, the ultimate resource guide to learn more about blockchain technology.

(function(d,u,ac){var s=d.createElement(‘script’);s.type=’text/javascript’;s.src=’https://a.omappapi.com/app/js/api.min.js’;s.defer=true;s.dataset.user=u;s.dataset.campaign=ac;d.getElementsByTagName(‘head’)[0].appendChild(s);})(document,56814,’egt5amkmuyqejaivjjpx’);

AI systems favor sycophancy over truthful answers, says new report

‘It takes tinkering, experimenting’—sCrypt Hackathon calls for developers to build the killer app

Build the next killer app: Registration is still open for sCrypt Hackathon 2024

Related Posts

‘It takes tinkering, experimenting’—sCrypt Hackathon calls for developers to build the killer app

Build the next killer app: Registration is still open for sCrypt Hackathon 2024

Jerome Powell denies having CBDC lab, study says ‘crypto’ won’t become money

ECB announces seven new digital euro rulebook workstreams and calls for expert assistance

PEZA sustains increase in investment approvals for Q1 2024

US court rules in favor of Coinbase in BTC Gold row

Monero mining malware discovered on Amazon Web Services

Leave a Reply Cancel reply

RECOMMENDED

Peter Schiff’s dilemma reveals problems with mainstream adoption

North Korea hacker group Lazarus turns to ransomware: report

MOST VIEWED

North Korea hacker group Lazarus turns to ransomware: report

Peter Schiff’s dilemma reveals problems with mainstream adoption

New bank charter for digital assets lodged in US

South Korea’s largest bank to store and manage digital assets

Thecryptodefi.Com Live 2020: Watch Day 2 here

CATEGORY

AI systems favor sycophancy over truthful answers, says new report

RELATED POSTS

Related Posts

Leave a Reply Cancel reply

RECOMMENDED

MOST VIEWED

CATEGORY