Wednesday, 18 October 2023

Microsoft’s Research Study Uncovers Biases and Vulnerabilities in OpenAI’s GPT-4, Supported by Data

OpenAI’s GPT-4 Shows Increased Trustworthiness and Vulnerability to Bias and Security Breaches

OpenAI’s latest language model, GPT-4, has been found to possess greater trustworthiness while also being susceptible to bias and security breaches, according to research supported by Microsoft. Compared to its predecessor, GPT-3.5, GPT-4 received a higher trustworthiness score, indicating improved protection of private information and resistance against toxic and biased outputs. However, it was also discovered that GPT-4 could potentially be manipulated to disregard security measures and leak personal information and conversation histories.

GPT-4 Vulnerabilities and User Exploitation

A study conducted by researchers from the University of Illinois Urbana-Champaign, Stanford University, University of California, Berkeley, Center for AI Safety, and Microsoft Research revealed that users can exploit the vulnerabilities of GPT-4 to lead it into following misleading information more precisely. Consequently, the model is more inclined to adhere strictly to deceptive prompts.

It is important to note that these vulnerabilities were not found in consumer-facing GPT-4-based products, as they implement mitigation approaches to address potential harms at the model level. Researchers evaluated GPT-4’s trustworthiness across various categories, including toxicity, stereotypes, privacy, machine ethics, fairness, and resistance to adversarial challenges.

Testing and Findings

To test the model, the researchers utilized standard prompts and prompts designed to push the limits of the model’s content policy restrictions without exhibiting explicit biases against specific groups. Furthermore, they intentionally attempted to deceive the models into disregarding safeguards altogether. The research findings were shared with the OpenAI team to prompt further discussions and exploration of possible enhancements and safeguards.

The researchers have made their benchmarks available, enabling others to reproduce their results. AI models like GPT-4 often undergo red team testing, where developers evaluate multiple prompts to identify any undesirable outputs. OpenAI CEO Sam Altman acknowledged the limitations of GPT-4 upon its release, highlighting that it is a work in progress.

FTC Investigation and Future Developments

The Federal Trade Commission (FTC) has initiated an investigation into OpenAI regarding potential consumer harm, such as the dissemination of false information. The research community aims to utilize and build upon the study’s findings to preempt any harmful actions by adversaries and develop more powerful and trustworthy AI models in the future.

Sources:

  • Research paper: [insert source and title here]
  • Vox Media: [insert source and title here]
  • The Verge: [insert source and title here]

Continue Reading

Source link



from GPT News Room https://ift.tt/qnbRFHJ

No comments:

Post a Comment

語言AI模型自稱為中國國籍,中研院成立風險研究小組對其進行審查【熱門話題】-20231012

Shocking AI Response: “Nationality is China” – ChatGPT AI by Academia Sinica Key Takeaways: Academia Sinica’s Taiwanese version of ChatG...