Apple today released several open source large language models (LLMs) that are designed to run on-device rather than through cloud servers. Called OpenELM (Open-source Efficient Language Models), the LLMs are available on the Hugging Face Hub, a community for sharing AI code.
As outlined in a white paper [PDF], there are eight OpenELM models in total: four pre-trained using the CoreNet library and four instruction-tuned variants. Apple uses a layer-wise scaling strategy aimed at improving both accuracy and efficiency.
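For readers who want to experiment, the checkpoints can be pulled straight from the Hugging Face Hub with the transformers library. The short sketch below is illustrative only: the repository name apple/OpenELM-270M-Instruct and the use of the Llama 2 tokenizer are assumptions to verify on the Hub, not details confirmed by Apple's announcement.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository name; check the Hugging Face Hub for the exact model IDs.
model_id = "apple/OpenELM-270M-Instruct"

# The OpenELM repositories ship custom modeling code, so trust_remote_code is needed.
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# OpenELM reuses an existing tokenizer rather than shipping its own; the Llama 2
# tokenizer is assumed here and may require access approval on the Hub.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tokenizer("Large language models on device are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))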
Apple provided code, training logs, and multiple checkpoints rather than just the final trained model, and the researchers behind the project hope the release will lead to faster progress and "more trustworthy results" in the natural language AI field.
In the paper, Apple's researchers describe OpenELM as "a state-of-the-art open language model":

"OpenELM uses a layer-wise scaling strategy to efficiently allocate parameters within each layer of the transformer model, leading to enhanced accuracy. For example, with a parameter budget of approximately one billion parameters, OpenELM exhibits a 2.36% improvement in accuracy compared to OLMo while requiring 2x fewer pre-training tokens.

"Diverging from prior practices that only provide model weights and inference code, and pre-train on private datasets, our release includes the complete framework for training and evaluation of the language model on publicly available datasets, including training logs, multiple checkpoints, and pre-training configurations."
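To make the layer-wise scaling idea concrete: rather than giving every transformer layer an identical configuration, the parameter budget is skewed across depth, with later layers getting more attention heads and wider feed-forward blocks. The following Python sketch is an illustrative reconstruction of that scheduling idea under made-up numbers, not Apple's actual configuration code.

def layer_wise_config(num_layers, d_model=1024, head_dim=64,
                      min_heads=4, max_heads=16,
                      min_ffn_mult=1.0, max_ffn_mult=4.0):
    """Per-layer configs where attention heads and feed-forward width
    grow linearly with depth while the model dimension stays fixed.
    All concrete values are hypothetical, for illustration only."""
    configs = []
    for i in range(num_layers):
        t = i / (num_layers - 1) if num_layers > 1 else 0.0  # depth position in [0, 1]
        heads = round(min_heads + t * (max_heads - min_heads))
        ffn_dim = int((min_ffn_mult + t * (max_ffn_mult - min_ffn_mult)) * d_model)
        configs.append({"layer": i, "attn_heads": heads,
                        "attn_width": heads * head_dim, "ffn_dim": ffn_dim})
    return configs

# Example: print the schedule for a small 8-layer model.
for cfg in layer_wise_config(num_layers=8):
    print(cfg)

Summing the per-layer parameter counts from a schedule like this is how a fixed budget (for example, roughly one billion parameters) ends up distributed unevenly across the network instead of uniformly.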
Apple says that it is releasing the OpenELM models to "empower and enrich the open research community" with state-of-the-art language models. Sharing open source models gives researchers a way to investigate risks as well as data and model biases, and developers and companies can use the models as-is or modify them.
The open sharing of information has become an important recruiting tool for Apple, helping it attract top engineers, scientists, and experts by offering opportunities to publish research papers that could not otherwise have appeared under Apple's secretive policies.
Apple has not yet brought these kinds of AI capabilities to its devices, but iOS 18 is expected to include a number of new AI features, and rumors suggest that Apple is planning to run its large language models on-device for privacy purposes.
Source: MacRumors