qwq - Ollama

qwq

QwQ is an experimental research model focused on advancing AI reasoning capabilities.

tools 32b

153.9K downloads · Updated 2 months ago

46407beda5c0 · 20GB

parameters: 32.8B

quantization: Q4_K_M
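
The listed file size and parameter count give a rough sense of how many bits each weight takes on disk. The sketch below is a back-of-the-envelope estimate only. It assumes that "20GB" means 20 × 10^9 bytes and that the whole file is weight data; neither assumption is stated on the page, and the listed size is rounded.

```python
# Rough estimate of average storage per weight for this build.
# Assumptions (not from the model page): "20GB" means 20 * 10**9 bytes,
# and the entire file consists of weight data.

def bits_per_weight(file_size_bytes: float, n_params: float) -> float:
    """Average number of bits stored per parameter."""
    return file_size_bytes * 8 / n_params

FILE_SIZE_BYTES = 20e9   # "20GB" as listed, assumed decimal gigabytes
N_PARAMS = 32.8e9        # "32.8B" as listed

estimate = bits_per_weight(FILE_SIZE_BYTES, N_PARAMS)
print(f"{estimate:.2f} bits per weight")  # 4.88 bits per weight
```

An average a little under 5 bits per weight is in the range one would expect from a 4-bit quantization scheme such as Q4_K_M once some per-block overhead is included.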

{ "stop": [ "<|im_start|>", "<|im_end|>" ] }
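
The `stop` parameter lists sequences at which generation should end. As an illustration of the effect, not of how Ollama implements it, here is a minimal sketch that cuts a raw completion at the earliest stop sequence. The helper name `truncate_at_stop` and the sample text are hypothetical.

```python
def truncate_at_stop(text: str, stop: list[str]) -> str:
    """Return text up to (not including) the earliest stop sequence."""
    cut = len(text)
    for seq in stop:
        idx = text.find(seq)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

# Stop sequences as listed in the model's parameters.
STOP = ["<|im_start|>", "<|im_end|>"]

# Hypothetical raw completion that runs past the end of the assistant turn.
raw = "The answer is 4.<|im_end|>\n<|im_start|>user\n"
print(truncate_at_stop(raw, STOP))  # The answer is 4.
```

Text that contains none of the stop sequences is returned unchanged.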

You are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-b

{{- if or .System .Tools }}<|im_start|>system {{- if .System }} {{ .System }} {{- end }} {{- if .Too

Apache License Version 2.0, January 2004

Readme

QwQ is a 32B parameter experimental research model developed by the Qwen Team, focused on advancing AI reasoning capabilities.

![image.png](/assets/mchiang0610/mikey3.1/e6d2ac3a-0d55-4e8b-9f53-ee5e269ed521)

![image.png](/assets/mchiang0610/mikey3.1/b56aaf87-c5bf-4249-be99-28930845e48e)

QwQ demonstrates remarkable performance across these benchmarks:

- **65.2% on GPQA**, showcasing its graduate-level scientific reasoning capabilities
- **50.0% on AIME**, highlighting its strong mathematical problem-solving skills
- **90.6% on MATH-500**, demonstrating exceptional mathematical comprehension across diverse topics
- **50.0% on LiveCodeBench**, validating its robust programming abilities in real-world scenarios

These results underscore QwQ’s significant advancement in analytical and problem-solving capabilities, particularly in technical domains requiring deep reasoning.

As a preview release, it demonstrates promising analytical abilities but has several important limitations:

1. **Language Mixing and Code-Switching:** The model may mix languages or switch between them unexpectedly, affecting response clarity.

2. **Recursive Reasoning Loops:** The model may enter circular reasoning patterns, leading to lengthy responses without a conclusive answer.

3. **Safety and Ethical Considerations:** The model requires enhanced safety measures to ensure reliable and secure performance, and users should exercise caution when deploying it.

4. **Performance and Benchmark Limitations:** The model excels in math and coding but has room for improvement in other areas, such as common sense reasoning and nuanced language understanding.
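
One practical way to limit the cost of the reasoning loops described above is to cap the number of generated tokens per request. The following is a minimal sketch, assuming a local Ollama server on its default address `http://localhost:11434` and the model tag `qwq`. The helper names `build_request` and `generate` and the default cap of 2048 tokens are illustrative choices, not recommendations from the model authors.

```python
import json
import urllib.request

def build_request(prompt: str, max_tokens: int = 2048) -> dict:
    """Build a payload for Ollama's /api/generate endpoint.

    num_predict caps the number of generated tokens, so a response that
    falls into a reasoning loop is cut off instead of running on.
    """
    return {
        "model": "qwq",          # assumed model tag
        "prompt": prompt,
        "stream": False,         # return one JSON object, not a stream
        "options": {"num_predict": max_tokens},
    }

def generate(prompt: str, max_tokens: int = 2048,
             host: str = "http://localhost:11434") -> str:
    """Send the request to a local Ollama server and return the text."""
    data = json.dumps(build_request(prompt, max_tokens)).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With the server running and the model pulled, `generate("How many r's are in 'strawberry'?", max_tokens=1024)` would return the model's answer, truncated at 1024 tokens if it has not finished by then. A capped response may end mid-reasoning, so the cap trades completeness for bounded latency.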
