Meta AI brings a third llama to the table.
Meta AI has developed and announced Llama 3, a large-scale language model that is making waves in artificial intelligence. This newest addition to the Llama family can generate coherent, fluent text, answer questions, hold conversations, and more.
Llama 3 – Just a revised version?
What makes Llama 3 different from its predecessors? In Meta's human evaluations, the instruction-tuned model achieved a win rate of 59.3% against Mistral Medium and 63.7% against GPT-3.5. These numbers indicate that human evaluators frequently preferred Llama 3's responses over those of strong competing models.
Llama 3 was pre-trained on more than 15 trillion tokens collected from publicly available sources, a dataset seven times larger than the one used for Llama 2 and containing four times more code. More than 5% of the data is high-quality non-English content spanning over 30 languages, though Meta notes that performance in those languages may not reach the levels seen in English. This extensive and diverse training data helps the model generate varied and accurate text.
To ensure data quality, Meta developed a series of data-filtering pipelines, including heuristic filters, NSFW filters, semantic deduplication, and text classifiers that score document quality. Interestingly, Llama 2 was used to help produce the training data for these quality classifiers, so each generation of the model helps improve the next.
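To make the idea concrete, here is a minimal, purely illustrative sketch of such a pipeline in Python. The stage names, thresholds, and toy quality score are assumptions for demonstration only; Meta's actual filters and learned classifiers are far more sophisticated.

```python
import hashlib

# Illustrative multi-stage data-filtering pipeline (NOT Meta's implementation).
# Thresholds and the quality heuristic below are made up for demonstration.

def heuristic_filter(doc: str) -> bool:
    """Reject documents that are too short or mostly non-alphabetic."""
    if len(doc.split()) < 5:
        return False
    alpha_ratio = sum(c.isalpha() or c.isspace() for c in doc) / max(len(doc), 1)
    return alpha_ratio > 0.8

def deduplicate(docs):
    """Exact-hash dedup; real pipelines use *semantic* dedup over embeddings."""
    seen, unique = set(), []
    for doc in docs:
        key = hashlib.sha256(doc.strip().lower().encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique

def quality_score(doc: str) -> float:
    """Stand-in for a learned quality classifier (toy proxy: character variety)."""
    return min(len(set(doc)) / 50.0, 1.0)

def filter_corpus(docs, threshold=0.5):
    docs = [d for d in docs if heuristic_filter(d)]
    docs = deduplicate(docs)
    return [d for d in docs if quality_score(d) >= threshold]

corpus = [
    "The quick brown fox jumps over the lazy dog near the river bank.",
    "The quick brown fox jumps over the lazy dog near the river bank.",  # duplicate
    "!!! ### $$$",  # mostly non-alphabetic
    "aaa bbb",      # too short
]
print(filter_corpus(corpus))  # only the first sentence survives
```

Each stage narrows the corpus: cheap heuristics first, then deduplication, then the (expensive) quality classifier last.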
To scale up pre-training, Meta developed detailed scaling laws that guide how data and compute are combined to optimize performance on downstream benchmarks such as code generation. Notably, the 8B and 70B parameter models continued to improve well past the token counts that compute-optimal training would prescribe, demonstrating the value of training models on far more data than is traditional.
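Meta has not published its Llama 3 scaling-law fit, but the parametric form commonly used for such laws is L(N, D) = E + A/N^alpha + B/D^beta, where N is parameter count and D is training tokens. The sketch below uses the published Chinchilla constants (Hoffmann et al., 2022) purely to illustrate the shape of the curve, not Meta's own numbers.

```python
# Chinchilla-style parametric scaling law: predicted loss as a function of
# model size N and training tokens D. Constants are the published Chinchilla
# fit (Hoffmann et al., 2022), used here only as an illustration.

E, A, B = 1.69, 406.4, 410.7
alpha, beta = 0.34, 0.28

def predicted_loss(n_params: float, n_tokens: float) -> float:
    return E + A / n_params**alpha + B / n_tokens**beta

# An 8B-parameter model: predicted loss keeps falling as the token budget
# grows from a "compute-optimal" scale toward the 15T used for Llama 3.
for tokens in (2e11, 1e12, 1.5e13):
    print(f"{tokens:.0e} tokens -> predicted loss {predicted_loss(8e9, tokens):.3f}")
```

The key observation mirrors Meta's: the loss term in D keeps shrinking as tokens grow, so a fixed-size model continues to benefit from much more data.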
Llama 3 and You
The future of the Llama ecosystem also looks promising, with plans to extend the model's capabilities and make it even more accessible to developers. We can expect to see even more innovative applications of Llama 3 in the coming months and years.
On the practical training side, Meta combined three parallelization strategies, data, model, and pipeline parallelism, to train at unprecedented scale on 16K GPUs. This scale was enabled by custom-built GPU clusters and a new training stack that automates maintenance and optimizes GPU utilization to achieve more than 95% effective training time.
Meta reports that post-training improvement through instruction tuning is essential. Techniques such as supervised fine-tuning, rejection sampling, and policy optimization improve performance on specific tasks by teaching the model to choose the best answer from the candidates it generates. This refined training strategy significantly improves Llama 3's reasoning and coding capabilities, setting new benchmarks for AI model training and applications.
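The rejection-sampling step can be sketched as follows. Everything here, the candidate generator and the reward function alike, is a stand-in for the real language model and a learned reward model; the sketch only illustrates the sample-then-select-the-best pattern.

```python
import random

# Toy sketch of rejection sampling for post-training (illustrative only):
# draw several candidate completions, score each with a reward model, and
# keep the best one for subsequent fine-tuning.

random.seed(0)  # deterministic demo

def generate_candidates(prompt: str, k: int = 4):
    """Stand-in for sampling k completions from the model."""
    return [f"{prompt} -> answer variant {i} ({random.random():.2f} fluency)"
            for i in range(k)]

def reward(completion: str) -> float:
    """Stand-in reward model: reads back the embedded fluency number."""
    return float(completion.rsplit("(", 1)[1].split(" ")[0])

def rejection_sample(prompt: str, k: int = 4) -> str:
    """Return the highest-reward candidate among k samples."""
    candidates = generate_candidates(prompt, k)
    return max(candidates, key=reward)

print(rejection_sample("Explain attention"))
```

In the real pipeline, the winning completions become new supervised training examples, which is how the model "learns to choose the correct answer from the generated possibilities."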
Lastly
Llama 3 arrives in a field crowded with competitors, promising better performance and usability. With its powerful features and extensive training data, it stands to change the way we interact with machines. Whether you're a developer looking to integrate Llama into your next project or just curious about the future of AI, Llama 3 is worth your attention.
Meta AI is available on Facebook, Instagram, WhatsApp, Messenger, and on the web. Meta provides documentation for Meta AI here.
The Llama 3 website has model download information and a getting started guide.
Work with StorageReview