Monday, 21 August 2023

Kuaishou Introduces KwaiYii and SpikeGPT: Its In-House Developed Large Language Models

Kuaishou, the well-known Beijing-based short video platform, recently made headlines with the public release of its self-developed large language model, KwaiYii. The release followed a successful beta test of a ChatGPT-like service for Android devices. KwaiYii, which has 13 billion parameters, is positioned to rival OpenAI's GPT-3.5 in content creation, consultation, and problem-solving.

KwaiYii: A Breakthrough in Language Generation

Positioning itself as a major player in AI research and development, Kuaishou has been actively pursuing both consumer-facing AI products and internal R&D projects, and KwaiYii represents the company's entry into the public AI space. With the unveiling of this large language model, Kuaishou also aims to address the problem of "AI hallucinations": inaccuracies that arise when a model is trained on insufficient or unrepresentative data.

The primary application of Kuaishou’s AI chatbot has been in search, using original content from the platform to provide accurate and relevant information. By leveraging KwaiYii’s advanced capabilities, Kuaishou intends to enhance its AI-powered search service and provide users with more reliable and tailored results.

SpikeGPT: Advancing Energy Efficiency with Spiking Neural Networks

In addition to KwaiYii, Kuaishou showcased its research into Spiking Neural Networks (SNNs) with SpikeGPT, a generative spiking neural network language model developed in collaboration with the University of California. With a 260M-parameter version, SpikeGPT combines the performance of deep neural networks (DNNs) with the energy-saving benefits of spike-based computation.

Spiking Neural Networks offer a more energy-efficient alternative to traditional artificial neural networks. Instead of passing continuous activation values between units, as conventional models do, SNNs communicate through sparse, discrete events called "spikes." This event-driven approach makes SNNs well-suited for power- and latency-sensitive applications such as robotics and medical imaging.
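To make the idea of "communicating in spikes" concrete, here is a minimal, illustrative Python sketch of a leaky integrate-and-fire neuron. It is not code from Kuaishou or the SpikeGPT authors, and the decay and threshold values are arbitrary: input current accumulates in a membrane potential that leaks over time, and the neuron emits a binary spike only when that potential crosses a threshold.

```python
# Minimal leaky integrate-and-fire (LIF) neuron, for illustration only.
# Decay and threshold are arbitrary; real SNN toolkits (e.g. snnTorch) differ.
import numpy as np

def lif_neuron(input_current, decay=0.9, threshold=1.0):
    """Simulate one LIF neuron over time; return a binary spike train."""
    membrane = 0.0
    spikes = []
    for current in input_current:
        membrane = decay * membrane + current  # leaky integration of the input
        if membrane >= threshold:              # fire once the threshold is crossed
            spikes.append(1)
            membrane = 0.0                     # reset after the spike
        else:
            spikes.append(0)
    return np.array(spikes)

# A weak, noisy input produces a sparse, event-driven spike train.
rng = np.random.default_rng(0)
print(lif_neuron(0.3 + 0.2 * rng.random(20)))  # mostly 0s, occasional 1s
```

Only the occasional 1s in the output carry information, which is why spike-based computation can be so much cheaper than dense arithmetic.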

SpikeGPT integrates recurrence into the transformer block, making the architecture compatible with SNNs. This design replaces the quadratic cost of standard self-attention with complexity that scales linearly in sequence length, and it allows words to be represented as event-driven spikes. The model processes streaming data word by word, capturing long-range dependencies across complex syntactic structures. To achieve this, SpikeGPT relies on techniques such as binary embeddings, a token shift operator, and a vanilla RWKV (Receptance Weighted Key Value) mechanism in place of self-attention.
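As a hedged sketch of two of these ingredients (the shapes, threshold, and helper names below are assumptions chosen for illustration, not the authors' implementation, which is available in the official repository), the token shift operator can be viewed as mixing part of each token's embedding with its predecessor's, and a spiking activation can be approximated in the forward pass by thresholding values into binary spikes:

```python
# Illustrative sketch of a token shift operator and a binary spiking activation.
# Shapes and thresholds are assumptions; see the official SpikeGPT repository
# for the actual implementation (which also needs surrogate gradients to train).
import torch
import torch.nn.functional as F

def token_shift(x, shift_fraction=0.5):
    """Mix part of each token's features with the previous token's.

    x has shape (batch, sequence, embedding).
    """
    shifted = F.pad(x, (0, 0, 1, -1))  # shift features one step along the sequence
    d = int(x.size(-1) * shift_fraction)
    return torch.cat([shifted[..., :d], x[..., d:]], dim=-1)

def spike(x, threshold=0.5):
    """Heaviside step: emit a binary spike wherever activation exceeds threshold."""
    return (x > threshold).float()

x = torch.randn(2, 8, 16)              # (batch, sequence, embedding)
mixed = token_shift(x)                 # each position now "sees" its predecessor
spikes = spike(torch.sigmoid(mixed))   # binary, event-driven representation
print(spikes.mean().item())            # fraction of units that fired
```

The hard threshold is not differentiable on its own, which is why SNNs trained with backpropagation typically pair it with a surrogate gradient.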

Unlocking the Potential of Spiking Neural Networks

Although SNNs offer numerous advantages, they also present certain challenges. Training SNNs is more complex than training conventional artificial neural networks because the discrete, all-or-nothing nature of spikes is not directly differentiable. SNNs are also less well understood, which makes optimizing and designing them for specific tasks more difficult.

However, the empirical study of SpikeGPT produced promising results. Models trained at different parameter scales achieved outcomes comparable to transformer baselines while requiring significantly fewer synaptic operations (SynOps). The work highlights the potential of training large SNNs and of using event-driven spiking activations to reduce the computational demands of language generation.
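As a back-of-the-envelope illustration of why fewer SynOps matters (the layer sizes, firing rate, and time steps below are made-up placeholders, not figures from the paper), a dense layer performs one multiply-accumulate per weight for every input, while a spiking layer only triggers accumulate-style synaptic operations for the inputs that actually spike, so its cost scales with the firing rate:

```python
# Rough comparison of dense MACs vs. spiking SynOps for one fully connected
# layer. All numbers are illustrative placeholders, not results from the paper.
def dense_macs(in_features, out_features):
    # One multiply-accumulate per weight for every input vector.
    return in_features * out_features

def spiking_synops(in_features, out_features, firing_rate, timesteps):
    # Only inputs that spike trigger (accumulate-only) synaptic operations,
    # summed over the simulation time steps.
    return in_features * out_features * firing_rate * timesteps

macs = dense_macs(1024, 1024)
synops = spiking_synops(1024, 1024, firing_rate=0.15, timesteps=4)
print(f"dense MACs: {macs:,}  vs. spiking SynOps: {synops:,.0f}")
# With sufficiently sparse firing, cheap accumulate-only SynOps undercut dense
# MACs, which is where the claimed energy savings come from.
```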

The researchers involved in SpikeGPT’s development plan to continue refining the model and updating their preprint paper accordingly. The code for SpikeGPT is available on the project’s GitHub repository, and the detailed model paper can be accessed on arXiv.

Editor Notes

Kuaishou’s unveiling of KwaiYii and SpikeGPT showcases the company’s commitment to pushing the boundaries of AI technology. By developing their own large language model and exploring the potential of spiking neural networks, Kuaishou demonstrates its dedication to enhancing language generation and energy efficiency in AI systems.

As language models and neural networks continue to evolve, we can expect groundbreaking advancements in various applications, from content creation to problem-solving. Kuaishou’s contributions in this field are significant and worth keeping an eye on.

For more news and updates on the latest developments in AI, visit the GPT News Room.
