This is mistralai/Mistral-7B-Instruct-v0.2, combined through layer interleaving into a larger 12b model.

168 pulls · Updated 5 months ago

cc885e0f9c89 · 6.6GB

README

bigstral

I originally wanted to use SOLAR, but found that it only has 4k… so I decided to make my own 32k version.

This is mistralai/Mistral-7B-Instruct-v0.2 with its layers interleaved to create a larger 12b model.

This is a merge of pre-trained language models created using mergekit.

Prompt format

<s>[INST] What is your favourite condiment? [/INST]
Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s>
[INST] Do you have dangerously spicy mayonnaise recipes? [/INST]
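
As a rough illustration of the format above, the following Python sketch assembles a multi-turn prompt string. The function name `build_prompt` and its parameters are hypothetical, and the newline separator between turns simply mirrors the layout of the example as displayed here; it is an assumption, and the model's tokenizer template may use different whitespace.

```python
def build_prompt(history, next_user_message, sep="\n"):
    """Assemble a prompt in the [INST] ... [/INST] layout shown above.

    history: list of (user_message, assistant_reply) pairs already completed.
    sep: separator between turns; "\n" mirrors the example as displayed
         (an assumption, not confirmed by the model card).
    """
    prompt = "<s>"  # beginning-of-sequence marker appears once, at the start
    for user_msg, assistant_msg in history:
        # Each finished turn: instruction, reply, then the end-of-sequence marker
        prompt += f"[INST] {user_msg} [/INST]{sep}{assistant_msg}</s>{sep}"
    # The final instruction is left open for the model to answer
    prompt += f"[INST] {next_user_message} [/INST]"
    return prompt


example = build_prompt(
    [("What is your favourite condiment?", "Lemon juice.")],
    "Do you have dangerously spicy mayonnaise recipes?",
)
print(example)
```

If the runtime you use applies a chat template automatically, pass the messages to it directly rather than building the string by hand.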

Example output

Give me some code that is very simple in JavaScript but very difficult in ANSI-C

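> In JavaScript, functions are first-class values, which means you can assign them to variables, pass them as arguments to other functions, and even nest them inside other functions. This leads to concise and expressive code.
>
> Here is a simple example that uses a recursive function in JavaScript to calculate the factorial of a given number:
>
> ```javascript
> const factorial = n => n === 0 ? 1 : n * factorial(n - 1);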
>
> console.log(factorial(5)); // Output: 120
>
> ```
>
> Now, let's try to write an equivalent code snippet in ANSI-C with similar functionality:
>
> ```c
> #include <stdio.h>
> #define MAX_RECURSION_DEPTH 1000
> int factorial(int n) {
>     if (n == 0)
>         return 1;
>     else {
>         int result;
>         result = factorial(--n);
>         if (++depth > MAX_RECURSION_DEPTH)
>             while(1){} // Recursion stack overflow
>         return n * result;
>     }
> }
> int main() {
>     int i, result;
>     int n = 5;
>     int depth = 0;
>
>     result = factorial(n);
>     printf("Factorial of %d is: %d\n", n, result);
>     return 0;
> }
> ```
> 
> As you can see, the ANSI-C code becomes more verbose and has several disadvantages compared to JavaScript:
>
> 1. Need to define a maximum recursion depth using a constant or an array index, which can lead to hardcoded limits and potential stack overflow issues.
> 2. The use of preprocessor directives (#define) for defining functions and macros can make the code less readable and harder to understand.
> 3. The lack of closures in ANSI-C requires using static variables or passing arguments as arguments to nested functions, making the code more verbose and less concise.

## Merge Details
### Merge Method

This model was merged using the passthrough merge method.
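
To make the effect of a passthrough merge concrete, here is a small Python sketch that stacks the slice ranges listed in the configuration in this README and counts the resulting layers. It assumes each `layer_range` is half-open (`[start, end)`), and the parameter estimate uses Mistral-7B's published dimensions (hidden size 4096, MLP size 14336, 1024-wide key/value projections, 32000-token vocabulary), which are not stated in this document. The names `SLICES`, `stacked_layers`, and `params_per_layer` are illustrative only.

```python
from collections import Counter

# Slice ranges copied from the YAML configuration in this README,
# assumed to be half-open [start, end) over the base model's 32 layers.
SLICES = [(0, 8), (4, 12), (8, 16), (12, 20), (16, 24), (20, 28), (24, 32)]


def stacked_layers(slices):
    """Return the base-model layer index used at each position of the merged stack."""
    order = []
    for start, end in slices:
        order.extend(range(start, end))
    return order


layers = stacked_layers(SLICES)
copies = Counter(layers)  # how many times each base layer appears in the merge

# Assumed Mistral-7B dimensions (not taken from this document).
HIDDEN, MLP, KV, VOCAB = 4096, 14336, 1024, 32000
params_per_layer = (
    2 * HIDDEN * HIDDEN   # query and output projections
    + 2 * HIDDEN * KV     # key and value projections
    + 3 * HIDDEN * MLP    # gate, up, and down projections
    + 2 * HIDDEN          # two normalization weight vectors
)
total_params = len(layers) * params_per_layer + 2 * VOCAB * HIDDEN + HIDDEN

print(f"merged layers: {len(layers)}")
print(f"layers used twice: {sum(1 for c in copies.values() if c == 2)}")
print(f"estimated parameters: {total_params / 1e9:.2f}B")
```

Under these assumptions the overlapping slices yield a stack deeper than the 32-layer base model, with most middle layers appearing twice, which is where the "12b" size comes from.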

### Models Merged

The following models were included in the merge:
* [mistralai/Mistral-7B-Instruct-v0.2](https://hugging-face.cn/mistralai/Mistral-7B-Instruct-v0.2)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
dtype: float16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 8]
    model: mistralai/Mistral-7B-Instruct-v0.2
- sources:
  - layer_range: [4, 12]
    model: mistralai/Mistral-7B-Instruct-v0.2
- sources:
  - layer_range: [8, 16]
    model: mistralai/Mistral-7B-Instruct-v0.2
- sources:
  - layer_range: [12, 20]
    model: mistralai/Mistral-7B-Instruct-v0.2
- sources:
  - layer_range: [16, 24]
    model: mistralai/Mistral-7B-Instruct-v0.2
- sources:
  - layer_range: [20, 28]
    model: mistralai/Mistral-7B-Instruct-v0.2
- sources:
  - layer_range: [24, 32]
    model: mistralai/Mistral-7B-Instruct-v0.2