The United Arab Emirates (UAE) has unveiled its latest achievement in language models. Falcon has emerged as the top-performing open source large language model, surpassing Meta's LLaMA on the Hugging Face Open LLM Leaderboard. Available under the Apache 2.0 license for commercial use, Falcon is a pretrained transformer model developed by Abu Dhabi's Technology Innovation Institute (TII). The 40 billion parameter model was trained on 1 trillion tokens of text drawn from sources including curated web data, books, code, academic papers, and online conversations.

Falcon is similar in design to OpenAI's GPT-3 but incorporates features such as the FlashAttention algorithm and multiquery attention. Four versions of Falcon are available, catering to different purposes. Initially, TII imposed an "authorization fee" on commercial applications exceeding $1 million, but it later shifted to a more permissive license.

Falcon's success highlights the growing ability of independent teams and open source licenses to challenge big tech companies. The UAE's achievement exemplifies the global reach of AI talent and underscores the collaborative nature of AI development worldwide.
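To illustrate the multiquery attention idea mentioned above: instead of giving every attention head its own key and value projections (as in standard multi-head attention), all heads share a single key and value projection, which shrinks the inference-time cache. The sketch below is a minimal NumPy illustration of that general technique, not Falcon's actual implementation; all function and variable names here are invented for the example.

```python
import numpy as np

def multiquery_attention(x, wq, wk, wv, num_heads):
    """Minimal multiquery attention sketch.

    Each head gets its own slice of the query projection wq,
    but every head shares the single key (wk) and value (wv)
    projections -- the defining trait of multiquery attention.
    """
    seq_len, d_model = x.shape
    head_dim = d_model // num_heads

    q = x @ wq  # (seq_len, d_model): per-head queries, concatenated
    k = x @ wk  # (seq_len, head_dim): one key set shared by all heads
    v = x @ wv  # (seq_len, head_dim): one value set shared by all heads

    outputs = []
    for h in range(num_heads):
        qh = q[:, h * head_dim:(h + 1) * head_dim]
        scores = qh @ k.T / np.sqrt(head_dim)
        # Numerically stable softmax over the key dimension
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        outputs.append(weights @ v)
    return np.concatenate(outputs, axis=-1)

# Example usage with toy dimensions
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))        # 5 tokens, model width 16
wq = rng.normal(size=(16, 16))      # queries: one slice per head
wk = rng.normal(size=(16, 4))       # keys: shared, head_dim = 4
wv = rng.normal(size=(16, 4))       # values: shared
out = multiquery_attention(x, wq, wk, wv, num_heads=4)
```

Because keys and values are stored once rather than once per head, the memory needed for cached keys and values during generation drops by roughly a factor of the head count, which is why the technique is attractive for serving large models.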
Source: The Batch