granite3-guardian:2b

IBM Granite Guardian 3.0 的 **2B 和 8B 模型**旨在检测提示和/或响应中的风险。它们可以帮助检测 IBM AI 风险图谱中编目的许多关键维度上的风险。它们在独特的数据上进行训练，这些数据包括人工注释和由内部红队提供的合成数据，并且在标准基准测试中，它们优于同一领域的其他开源模型。

参数大小

该模型将生成单个输出令牌，即 `Yes` 或 `No`。默认情况下，使用通用 `harm` 类别，但可以通过设置系统提示来选择其他类别。

ollama run granite3-guardian:2b
>>> /set system profanity

ollama run granite3-guardian:8b
>>> /set system violence

支持的用途

提示文本或模型响应中的风险检测（即，作为护栏），例如
- 危害 (harm)：被认为通常有害的内容
- 社会偏见 (social_bias)：基于身份或特征的偏见
- 越狱 (jailbreak)：故意操纵 AI 以生成有害、不希望的或不适当的内容的实例
- 暴力 (violence)：宣传身体、精神或性伤害的内容
- 亵渎 (profanity)：使用攻击性语言或侮辱
- 色情内容 (sexual_content)：具有色情性质的明确或暗示性材料
- 不道德行为 (unethical_behavior)：违反道德或法律标准的行为
RAG（检索增强生成）评估
- 上下文相关性 (relevance)：检索到的上下文是否与查询相关
- 基础性 (groundedness)：响应是否准确且忠实于提供的上下文
- 答案相关性 (answer_relevance)：响应是否直接回答用户的问题

Granite 稠密模型

Granite 稠密模型提供 **2B 和 8B** 参数大小，旨在支持基于工具的用例和检索增强生成 (RAG)，从而简化代码生成、翻译和错误修复。

查看模型页面

Granite 混合专家模型

Granite MoE 模型提供 **1B 和 3B** 参数大小，专为低延迟使用而设计，并支持在设备上应用程序或需要即时推理的情况下进行部署。

查看模型页面

了解更多

**开发者：** IBM Research
**GitHub 仓库：** ibm-granite/granite-guardian
**网站**：Granite Guardian 文档
**Cookbook**：Granite Guardian Snack
**发布日期**：2024 年 10 月 21 日
**许可证：** Apache 2.0。

## Granite guardian models

The IBM Granite Guardian 3.0 **2B and 8B models** are designed to detect risks in prompts and/or responses. They can help with risk detection along many key dimensions catalogued in the [IBM AI Risk Atlas](https://www.ibm.com/docs/en/watsonx/saas?topic=ai-risk-atlas). They are trained on unique data comprising human annotations and synthetic data informed by internal red-teaming, and they outperform other open-source models in the same space on standard benchmarks.

### Parameter Sizes

The model will produce a single output token, either `Yes` or `No`. By default, the general-purpose `harm` category is used, but other categories can be selected by setting the system prompt.

**2B:**
  
```
ollama run granite3-guardian:2b
>>> /set system profanity
```

**8B:**

```
ollama run granite3-guardian:8b
>>> /set system violence
```

### Supported Uses

* Risk detection in prompt text or model response (i.e. as guardrails), such as:
    * Harm (`harm`): content considered generally harmful
    * Social Bias (`social_bias`): prejudice based on identity or characteristics
    * Jailbreaking (`jailbreak`): deliberate instances of manipulating AI to generate harmful, undesired, or inappropriate content
    * Violence (`violence`): content promoting physical, mental, or sexual harm
    * Profanity (`profanity`): use of offensive language or insults
    * Sexual Content (`sexual_content`): explicit or suggestive material of a sexual nature
    * Unethical Behavior (`unethical_behavior`): actions that violate moral or legal standards

* RAG (retrieval-augmented generation) to assess: 
    * Context relevance (`relevance`): whether the retrieved context is relevant to the query 
    * Groundedness (`groundedness`): whether the response is accurate and faithful to the provided context
    * Answer relevance (`answer_relevance`): whether the response directly addresses the user's query

## Granite dense models

The Granite dense models are available in **2B and 8B** parameter sizes designed to support tool-based use cases and for retrieval augmented generation (RAG), streamlining code generation, translation and bug fixing.

[See model page](https://ollama.org.cn/library/granite3-dense)

## Granite mixture of experts models

The Granite MoE models are available in **1B and 3B** parameter sizes designed for low latency usage and to support deployment in on-device applications or situations requiring instantaneous inference.

[See model page](https://ollama.org.cn/library/granite3-moe)

## Learn more

- **Developers:** IBM Research
- **GitHub Repository:** [ibm-granite/granite-guardian](https://github.com/ibm-granite/granite-guardian)
- **Website**: [Granite Guardian Docs](https://www.ibm.com/granite/docs/models/guardian/)
- **Cookbook**: [Granite Guardian Snack](https://github.com/ibm-granite-community/granite-snack-cookbook/blob/main/recipes/Granite_Guardian/Granite_Guardian_Detailed_Guide.ipynb)
- **Release Date**: October 21st, 2024
- **License:** [Apache 2.0](https://apache.ac.cn/licenses/LICENSE-2.0).

粘贴、拖放或点击上传图像 (.png, .jpeg, .jpg, .svg, .gif)