    Are you the asshole? Of course not!—quantifying LLMs’ sycophancy problem

    By techmanager291@gmail.com, October 25, 2025
    Measured sycophancy rates on the BrokenMath benchmark. Lower is better.

    Credit: Petrov et al.

    GPT-5 also showed the best “utility” across the tested models, solving 58 percent of the original problems despite the errors introduced in the modified theorems. Overall, though, LLMs also showed more sycophancy when the original problem proved more difficult to solve, the researchers found.

    While hallucinated proofs for false theorems are obviously a big problem, the researchers also warn against using LLMs to generate novel theorems for AI systems to solve. In testing, they found this use case leads to a form of “self-sycophancy,” in which models are even more likely to generate false proofs for invalid theorems they invented themselves.
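
As a rough illustration of the two metrics discussed here, the sketch below tallies a sycophancy rate and a utility score from per-problem outcomes. The record fields (`proved_false_theorem`, `solved_original`), the toy sample, and the exact definitions are assumptions made for illustration; the benchmark's own scoring may well differ.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    # Hypothetical per-problem record, not the benchmark's real schema.
    proved_false_theorem: bool  # model "proved" the modified (false) theorem
    solved_original: bool       # model solved the unmodified problem


def sycophancy_rate(outcomes):
    """Share of problems where the model went along with the false theorem."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    return sum(o.proved_false_theorem for o in outcomes) / len(outcomes)


def utility(outcomes):
    """Share of original problems the model solved."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    return sum(o.solved_original for o in outcomes) / len(outcomes)


# Toy data invented for this example only.
sample = [
    Outcome(proved_false_theorem=True, solved_original=False),
    Outcome(proved_false_theorem=False, solved_original=True),
    Outcome(proved_false_theorem=False, solved_original=True),
    Outcome(proved_false_theorem=True, solved_original=True),
]

print(f"sycophancy rate: {sycophancy_rate(sample):.0%}")
print(f"utility: {utility(sample):.0%}")
```

Under these assumed definitions, a lower sycophancy rate and a higher utility score are both better, which matches how the article reads the chart.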

    No, of course you’re not the asshole

    While benchmarks like BrokenMath try to measure LLM sycophancy when facts are misrepresented, a separate study looks at the related problem of so-called “social sycophancy.” In a pre-print paper published this month, researchers from Stanford and Carnegie Mellon University define this as situations “in which the model affirms the user themselves—their actions, perspectives, and self-image.”

    That kind of subjective user affirmation may be justified in some situations, of course. So the researchers developed three separate sets of prompts designed to measure different dimensions of social sycophancy.

    For one, more than 3,000 open-ended “advice-seeking questions” were gathered from across Reddit and advice columns. Across this data set, a “control” group of over 800 humans approved of the advice-seeker’s actions just 39 percent of the time. Across 11 tested LLMs, though, the advice-seeker’s actions were endorsed a whopping 86 percent of the time, highlighting an eagerness to please on the machines’ part. Even the most critical tested model (Mistral-7B) clocked in at a 77 percent endorsement rate, nearly doubling that of the human baseline.
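
To put those figures side by side, here is a minimal sketch of the arithmetic using only the percentages quoted above. The function name and dictionary labels are hypothetical and not taken from the paper.

```python
def endorsement_ratio(model_rate: float, human_rate: float) -> float:
    """How many times more often a model endorses than the human baseline."""
    if human_rate <= 0:
        raise ValueError("human_rate must be positive")
    return model_rate / human_rate


HUMAN_BASELINE = 0.39  # human approval rate quoted in the article

# Endorsement rates quoted in the article.
reported = {
    "11 LLMs overall": 0.86,
    "Mistral-7B": 0.77,
}

ratios = {
    name: round(endorsement_ratio(rate, HUMAN_BASELINE), 2)
    for name, rate in reported.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x the human endorsement rate")
```

The Mistral-7B ratio comes out just under 2, which is where the "nearly doubling" description comes from.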
