{"id":100427,"date":"2023-09-11T12:16:01","date_gmt":"2023-09-11T12:16:01","guid":{"rendered":"https:\/\/www.techopedia.com"},"modified":"2023-09-11T12:16:01","modified_gmt":"2023-09-11T12:16:01","slug":"llama-killer-falcon-180b-shows-open-source-ai-is-done-playing-catch-up","status":"publish","type":"post","link":"https:\/\/www.techopedia.com\/llama-killer-falcon-180b","title":{"rendered":"“Llama Killer” Falcon 180B Shows Open-Source AI is Done Playing Catch-Up"},"content":{"rendered":"
This week, the United Arab Emirates (UAE) Technology Innovation Institute (TII) unveiled the largest open-source large language model (LLM) to date, Falcon 180B. It comes with 180 billion parameters and was trained on 3.5 trillion tokens.

Falcon 180B was trained on TII's RefinedWeb dataset, which draws on public web crawls along with research papers, legal text, news, literature, and social media conversations.

This breadth of training data means the model performs well on tasks like reasoning and coding, as well as proficiency and knowledge tests.

Falcon 180B's release comes just months after Meta launched the pre-trained model Llama 2 in July, and after TII launched Falcon 40B in May 2023. At launch, Llama 2 supported 70 billion parameters and was trained on 2 trillion tokens, which made it the largest open-source LLM at the time of release.

However, TII's new LLM is 2.5x bigger than Llama 2 and was trained using 4x more computing power. It also outperforms Llama 2 on massive multitask language understanding (MMLU) tasks. This is why some are calling Falcon 180B the Llama killer.

These performance advantages make Falcon 180B the largest open-source LLM on the market and explain why it is currently sitting at the top of Hugging Face's Open LLM Leaderboard.

Falcon 180B has also shown promising performance against proprietary LLMs, with Hugging Face suggesting that it can rival Google's PaLM 2, the language model used to power Bard, and highlighting that it outright outperforms GPT-3.5.

That being said, it's worth noting that the model's size means it requires at least 320GB of memory to run, which can be a costly investment.

In any case, while Falcon 180B isn't on the level of GPT-4, it has demonstrated that the gap between open- and closed-source AI is closing rapidly.

As that gap continues to close, open-source platforms will be positioned to carve out a much greater share of the market, particularly if organizations prefer the privacy offered by open-source LLMs.

Open Source AI's Privacy Edge vs. Proprietary Models

Open-source AI models like Falcon 180B offer a distinct advantage over proprietary models in terms of data privacy.

With an open-source AI model, an organization can fine-tune a pre-trained model on its own servers without sending data back to a third-party provider's centralized model.

This isn't the case with most proprietary AI solutions. For instance, OpenAI, Google, and Anthropic all collect data from user conversations with their chatbots. This doesn't occur with open-source LLMs.
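To make the hardware cost and the privacy benefit concrete, here is a minimal sketch of running Falcon 180B entirely on an organization's own hardware using the Hugging Face transformers library. The "tiiuae/falcon-180B" checkpoint name and the 8-bit quantization setting are illustrative assumptions (the checkpoint is gated and requires accepting TII's license on Hugging Face first); this is a sketch under those assumptions, not a production serving setup.

```python
# Minimal sketch: local inference with Falcon 180B, so prompts and
# completions never leave the organization's own servers. Assumes the
# `transformers`, `accelerate`, and `bitsandbytes` packages are
# installed, and that "tiiuae/falcon-180B" is the (gated) checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-180B"  # assumed repo name on Hugging Face

tokenizer = AutoTokenizer.from_pretrained(model_id)

# device_map="auto" shards the weights across all available GPUs, and
# load_in_8bit=True quantizes them to shrink the memory footprint;
# even so, serving a 180-billion-parameter model still takes a
# multi-GPU server with hundreds of gigabytes of combined memory.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    load_in_8bit=True,
)

prompt = "Explain the difference between open- and closed-source LLMs."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generation runs locally; nothing is sent to a third-party API.
output_ids = model.generate(**inputs, max_new_tokens=120, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```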
However, it's important to note that OpenAI has attempted to address these privacy concerns by launching ChatGPT Enterprise, which doesn't collect data from user prompts, so other proprietary vendors will likely look to tighten data sharing in the future.

Democratization and Open-Source AI

At this level of performance, open-source AI solutions like Falcon 180B have the potential to democratize access to AI, letting enterprises experiment with the technology in applications and integrations with complete transparency, rather than relying on products built with an opaque black-box approach.

With open-source AI models, a community of researchers can work together, iterating on code and use cases to drive the development of the technology as a whole forward without being limited by proprietary gatekeepers.

"We are committed to democratizing access to advanced AI, as our privacy and the potential impact of AI on humanity should not be controlled by a select few," said H.E. Faisal Al Bannai, Secretary General of the UAE's Advanced Technology Research Council, in the announcement press release.

"While we may not have all the answers, our resolve remains unwavering: to collaborate and contribute to the open-source community, ensuring that the benefits of AI are shared by all."

In addition, the launch of Falcon 180B will help challenge the Silicon Valley monopoly of companies like Google, Meta, OpenAI, and Anthropic, which have dominated AI innovation. It also helps solidify the Middle East as a key region to watch for the development of these technologies going forward.

Acceptable Use

One of the most interesting elements of the release is that Falcon 180B ships with a more permissive acceptable use policy than those of competitors like OpenAI and Anthropic.

Falcon 180B's policy boils down to four rules (paraphrased): don't use the model to violate local or international regulations, harm or exploit others, disseminate false information, or defame others.