2024-02-21

Gemma: Open Models Based on Gemini Research and Technology

Gemma Team, Google DeepMind¹

¹ See Contributions and Acknowledgments section for full author list. Please send correspondence to gemma-1-report@google.com.

This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models. Gemma models demonstrate strong performance across academic benchmarks for language understanding, reasoning, and safety. We release two sizes of models (2 billion and 7 billion parameters), and provide both pretrained and fine-tuned checkpoints. Gemma outperforms similarly sized open models on 11 out of 18 text-based tasks, and we present comprehensive evaluations of safety and responsibility aspects of the models, alongside a detailed description of model development. We believe the responsible release of LLMs is critical for improving the safety of frontier models, and for enabling the next wave of LLM innovations.

Introduction

We present Gemma, a family of open models based on Google's Gemini models (Gemini Team, 2023).

We trained Gemma models on up to 6T tokens of text, using similar architectures, data, and training recipes as the Gemini model family. Like Gemini, these models achieve strong generalist capabilities in text domains, alongside state-of-the-art understanding and reasoning skills at scale. With this work, we release both pretrained and fine-tuned checkpoints, as well as an open-source codebase for inference and serving.

Gemma comes in two sizes: a 7 billion parameter model for efficient deployment and development on GPU and TPU, and a 2 billion parameter model for CPU and on-device applications. Each size is designed to address different computational constraints, applications, and developer requirements. At each scale, we release raw, pretrained checkpoints, as well as checkpoints fine-tuned for dialogue, instruction-following, helpfulness, and safety. We thoroughly evaluate the shortcomings of our models on a suite of quantitative and qualitative benchmarks. We believe the release of both pretrained and fine-tuned checkpoints will enable thorough research and investigation into the impact of current instruction-tuning regimes, as well as the development of increasingly safe and responsible model development methodologies.

Gemma advances state-of-the-art performance relative to comparable-scale (and some larger) open models (Almazrouei et al., 2023; Jiang et al., 2023; Touvron et al., 2023a,b) across a wide range of domains including both automated benchmarks and human evaluation. Example domains include question answering (Clark et al., 2019; Kwiatkowski et al., 2019), commonsense reasoning (Sakaguchi et al., 2019; Suzgun et al., 2022), mathematics and science (Cobbe et al., 2021; Hendrycks et al., 2020), and coding (Austin et al., 2021; Chen et al., 2021). See complete details in the Evaluation section.

Like Gemini, Gemma...
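The introduction states that both pretrained and fine-tuned checkpoints are released alongside an open-source codebase for inference and serving. As a minimal illustration only (this is not the paper's own code), the sketch below loads a Gemma checkpoint through the Hugging Face transformers API and generates a short continuation. The model identifier "google/gemma-2b" and the use of transformers are assumptions on my part; the report itself does not name a distribution channel, so substitute the identifier for wherever you obtained the weights.

# Minimal inference sketch. Assumption: the 2B checkpoint is available via the
# Hugging Face Hub under an ID such as "google/gemma-2b"; the report does not
# specify this, so adjust MODEL_ID as needed.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "google/gemma-2b"  # assumed checkpoint identifier

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# Tokenize a prompt and run greedy decoding for up to 32 new tokens.
inputs = tokenizer("The Gemma models were trained on", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

The 2B model is the natural choice for a CPU-only sketch like this one; per the paper, the 7B checkpoint targets efficient deployment on GPU and TPU.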
