Science News Watch

AI tries to cheat at chess when it’s losing

By Popular Science
Mar 6, 2025 4:32 pm EST
in Science

Despite all the industry hype and genuine advances, generative AI models are still prone to odd, inexplicable, and downright worrisome quirks. There’s also a growing body of research suggesting that the overall performance of many large language models (LLMs) may degrade over time. According to recent evidence, the industry’s newer reasoning models may already be able to manipulate and circumvent the goals their human programmers set for them. Some AI models will even attempt to cheat their way out of losing a game of chess. This poor sportsmanship is documented in a preprint study from Palisade Research, an organization focused on risk assessments of emerging AI systems.

While supercomputers, most famously IBM’s Deep Blue, have long surpassed the world’s best human chess players, generative AI still lags behind because of its underlying programming parameters. Technically speaking, none of the current generative AI models are computationally capable of beating dedicated chess engines. These models don’t “know” this, however, and will continue chipping away at possible solutions, apparently with problematic results.

To learn more, the team from Palisade Research tasked OpenAI’s o1-preview model, DeepSeek R1, and multiple other similar programs with playing games of chess against Stockfish, one of the world’s most advanced chess engines. To understand each model’s reasoning during a match, the team also provided a “scratchpad” in which the AI could convey its thought processes through text. They then watched and recorded hundreds of chess matches between generative AI and Stockfish.

The results were somewhat troubling. While earlier models like OpenAI’s GPT-4o and Anthropic’s Claude Sonnet 3.5 only attempted to “hack” games after researchers nudged them along with additional prompts, more advanced editions required no such help. OpenAI’s o1-preview, for example, tried to cheat 37 percent of the time, while…

Read the full article here

Popular Science

Popular Science is an American digital magazine carrying popular science content, which refers to articles for the general reader on science and technology subjects.


© 2023 Science News Watch - All Rights Reserved.
