qwq:32b-preview-fp16 - Ollama

qwq

QwQ is an experimental research model focused on advancing AI reasoning capabilities.

tools 32b

153.9K downloads · Updated 2 months ago

44d5ed096b85 · 66GB

{ "stop": [ "<|im_start|>", "<|im_end|>" ] }

You are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-by-step
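The stop sequences and system prompt listed above are the model's packaged defaults, so a client normally does not need to send them. As a minimal sketch of how they map onto a request, the following builds a chat request body for Ollama's `/api/chat` endpoint. It assumes a local Ollama server on its default port with this tag already pulled; `build_chat_payload` and `ask` are hypothetical helper names, and the system prompt is reproduced in English exactly as far as it is shown on this page.

```python
import json
import urllib.request

# Assumption: a local Ollama server listening on its default port.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"
MODEL_TAG = "qwq:32b-preview-fp16"

# Values copied from the params and system sections of this page.
STOP_SEQUENCES = ["<|im_start|>", "<|im_end|>"]
SYSTEM_PROMPT = (
    "You are a helpful and harmless assistant. "
    "You are Qwen developed by Alibaba. You should think step-by-step"
)


def build_chat_payload(question: str) -> dict:
    """Build a non-streaming chat request body for the model."""
    return {
        "model": MODEL_TAG,
        "stream": False,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
        # Redundant with the packaged defaults; shown for illustration.
        "options": {"stop": STOP_SEQUENCES},
    }


def ask(question: str) -> str:
    """Send the request and return the assistant's reply text."""
    body = json.dumps(build_chat_payload(question)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]
```

Calling `ask("How many primes are there below 30?")` would send one request and return the reply text, provided the server is running and the 66GB model fits on the machine.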

{{- if or .System .Tools }}<|im_start|>system {{- if .System }} {{ .System }} {{- end }} {{- if .Too

Apache License Version 2.0, January 2004

Readme

QwQ is a 32B parameter experimental research model developed by the Qwen Team, focused on advancing AI reasoning capabilities.

![image.png](/assets/mchiang0610/mikey3.1/e6d2ac3a-0d55-4e8b-9f53-ee5e269ed521)

![image.png](/assets/mchiang0610/mikey3.1/b56aaf87-c5bf-4249-be99-28930845e48e)

QwQ demonstrates remarkable performance across these benchmarks:

- **65.2% on GPQA**, showcasing its graduate-level scientific reasoning capabilities
- **50.0% on AIME**, highlighting its strong mathematical problem-solving skills
- **90.6% on MATH-500**, demonstrating exceptional mathematical comprehension across diverse topics
- **50.0% on LiveCodeBench**, validating its robust programming abilities in real-world scenarios

These results underscore QwQ’s significant advancement in analytical and problem-solving capabilities, particularly in technical domains requiring deep reasoning.

As a preview release, it demonstrates promising analytical abilities but has several important limitations:

1. **Language Mixing and Code-Switching:** The model may mix languages or switch between them unexpectedly, affecting response clarity.

2. **Recursive Reasoning Loops:** The model may enter circular reasoning patterns, leading to lengthy responses without a conclusive answer.

3. **Safety and Ethical Considerations:** The model requires enhanced safety measures to ensure reliable and secure performance, and users should exercise caution when deploying it.

4. **Performance and Benchmark Limitations:** The model excels in math and coding but has room for improvement in other areas, such as common sense reasoning and nuanced language understanding.
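One way a client might guard against the reasoning loops described in item 2 is to cap the generation length and check the output for a repeating tail. The sketch below is an illustrative heuristic only, not something the Qwen team or Ollama prescribes: `has_repeated_tail` and `capped_options` are hypothetical helpers, the window size, repeat threshold, and token cap are arbitrary assumptions, and `num_predict` is the Ollama option that limits the number of generated tokens.

```python
def has_repeated_tail(text: str, window: int = 40, min_repeats: int = 3) -> bool:
    """Return True if the last `window` characters of `text` occur at
    least `min_repeats` times (non-overlapping) anywhere in `text`.

    This is a crude signal that the model may be looping; it can miss
    loops with varying wording and can flag legitimate repetition.
    """
    if len(text) < window:
        return False
    tail = text[-window:]
    return text.count(tail) >= min_repeats


def capped_options(max_tokens: int = 2048) -> dict:
    """Build an Ollama `options` dict that limits generated tokens."""
    return {"num_predict": max_tokens}


if __name__ == "__main__":
    looping = "abc " * 30
    normal = "The answer is 42 because six times seven equals forty-two."
    print(has_repeated_tail(looping))
    print(has_repeated_tail(normal))
    print(capped_options())
```

A caller could pass `capped_options()` as the `options` field of a request and retry or truncate when `has_repeated_tail` flags the reply.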
