AI closer than ever to passing Turing test for ‘intelligence’

American computer scientist Alan Turing put forth an experimental approach to the question of” can machines think” in 1950. After five minutes of questioning, he suggested that if a human couldn’t tell whether they were speaking to an artificially intelligent ( AI ) machine or another human, this would prove that AI has intelligence similar to that of humans.

Despite the fact that AI systems never quite passed Turing’s exam during his career, he hypothesized that

It will be possible to program computers to enjoy the imitation activity so well in around fifty years that an average interrogator won’t have more than 70 % of making the correct identification after five minutes of questioning.

More than 70 times after Turing’s plan, no AI has been able to pass the test by meeting the precise requirements he specified. However, a few devices have come very close, as some headlines reflect.

Three big language versions, including GPT – 4( the AI technologies behind ChatGPT ), were tested in a recent study. The members conversed with either another individual or an AI system for two days. The AI was instructed to correct minor spelling errors and to stop if the developer became very intense.

The AI did a great job of deceiving the testing with this suggestion. Testers could only accurately guess whether they were speaking to an AI system 60 % of the time when paired with an artificial intelligence ( AI ) bot.

We might see AI go Turing’s unique test within the next few years given the quick advancements made in the design of natural language processing methods.

But is mimicking people actually a good way to gauge intelligence? If no, what other metrics could we use to assess the skills of AI?

The Turing test’s restrictions

Although a program passing the Turing test provides us with some indication that it is clever, this test does not represent intelligence in its entirety. It can result in” false negatives ,” which is one issue.

The huge language models of today are frequently made to quickly state that they are not human. For instance, ChatGPT frequently begins its response when you ask it a problem with the word” as an AI language type.” This type of programming may overturn the Turing test’s underlying capability, even if AI systems had it.

A computer screen displays the ChatGPT software from OpenAI. Asia Times Files / Getty Images

Additionally, the test runs the risk of some” false positive.” A system was theoretically move the Turing test just by being hard-coded with a human-like response to any potential input, as philosopher Ned Block noted in an article from 1981.

Beyond that, the Turing exam focuses specifically on individual consciousness. An expert investigator will be able to identify a job where AIs and humans perform differently if AI consciousness differs from that of humans.

Regarding this issue, Turing penned:

This is a very strong criticism, but at least we can say that it doesn’t need to bother us if it can be built to perform the emulation game well.

In other words, while passing the Turing check indicates that a program is clever, failing it does not indicate that it is not intelligent.

Furthermore, the test does not accurately reflect whether AIs are conscious, capable of experiencing pleasure and pain, or possess spiritual significance. Some cognitive scientists believe that having a working memory, higher-order thoughts, the capacity to consider one’s environment, and the ability to model how one moves their body around it are all components of consciousness.

The issue of whether or not AI systems possess these capabilities is not addressed by the Turing check.

AI’s expanding skills

A particular logic is the foundation of the Turing check. In other words, since humans are clever, anything that can successfully mimic humans is probably smart.

However, this concept offers no insight into the nature of knowledge. Thinking more analytically about what intellect is is a different way to assess AI’s intelligence.

Now, there isn’t a single test that you definitively assess either people or artificial intelligence.

In its broadest sense, knowledge is the capacity to accomplish a variety of objectives in various contexts. More brilliant systems are those that can accomplish more objectives across more environments.

Therefore, evaluating the performance of general-purpose AI techniques across a range of responsibilities is the best way to keep track of developments in their design. Scientists working on machine learning have created a number of measures for this.

A standard measuring effectiveness on multiple choice tests across a variety of college-level academic subjects, GPT-4, for instance, was able to correctly answer 86 % of queries in massive multitask vocabulary understanding.

It also performed well in AgentBench, a tool that assesses how well an agent behaves by, for instance, browsing the internet, making online purchases, and playing video games.

Is the Darwin test however applicable today?

The Turing evaluation is a check of artificial intelligence’s capacity to mimic human behavior. The ability of huge language models to move the Turing test is now being demonstrated by the fact that they are skilled imitators. But emulation and brains are not the same thing.

There are as many different types of knowledge as there are objectives. Monitoring AI’s development of a variety of crucial features is the best way to comprehend its knowledge.

When it comes to the issue of whether AI is clever, it’s crucial that we don’t maintain” changing the goalposts.” Reviewers of the idea of AI brains are continually discovering new tasks that AI systems may struggle to finish, only to discover that they have jumped over yet another obstacle because AI’s capabilities are rapidly improving.

The key issue in this situation is not whether AI systems are smart, but rather, more specifically, what types of intelligence they might possess.

Cameron Domenico Kirk – Giannini, Assistant Professor of Philosophy at Rutgers University, and Simon Goldstein, Associate Professor at the Dianoia Institute of theory at Australian Catholic University

Under a Creative Commons license, this essay has been republished from The Conversation. Read the article in its entirety.