10个月前更新
10个月前
faad68f73fc2 · 47GB
模型
架构llama
·
参数69B
·
量化Q5_0
47GB
参数
{ "stop": [ "<|endoftext|>" ] }
37B
自述文件
SQLCoder是一个拥有150亿参数的模型,基于StarCoder模型微调而来。在sql-eval框架上,它在自然语言到SQL生成的性能上略微优于gpt-3.5-turbo,并优于流行的开源模型。它也显著优于text-davinci-003,后者大小是它的十倍以上。
这个150亿参数的补全模型通常至少需要16GB的RAM。
使用方法
命令行界面
ollama run sqlcoder
尝试使用三引号输入多行提示:
注意:将{question}替换成你想要回答的SQL问题。例如:‘哪些产品产生了最多的销售额?’
"""
### Instructions:
Your task is to convert a question into a SQL query, given a Postgres database schema.
Adhere to these rules:
- **Deliberately go through the question and database schema word by word** to appropriately answer the question
- **Use Table Aliases** to prevent ambiguity. For example, `SELECT table1.col1, table2.col1 FROM table1 JOIN table2 ON table1.id = table2.id`.
- When creating a ratio, always cast the numerator as float
### Input:
Generate a SQL query that answers the question `{question}`.
This query will run on a database whose schema is represented in this string:
CREATE TABLE products (
product_id INTEGER PRIMARY KEY, -- Unique ID for each product
name VARCHAR(50), -- Name of the product
price DECIMAL(10,2), -- Price of each unit of the product
quantity INTEGER -- Current quantity in stock
);
CREATE TABLE customers (
customer_id INTEGER PRIMARY KEY, -- Unique ID for each customer
name VARCHAR(50), -- Name of the customer
address VARCHAR(100) -- Mailing address of the customer
);
CREATE TABLE salespeople (
salesperson_id INTEGER PRIMARY KEY, -- Unique ID for each salesperson
name VARCHAR(50), -- Name of the salesperson
region VARCHAR(50) -- Geographic sales region
);
CREATE TABLE sales (
sale_id INTEGER PRIMARY KEY, -- Unique ID for each sale
product_id INTEGER, -- ID of product sold
customer_id INTEGER, -- ID of customer who made purchase
salesperson_id INTEGER, -- ID of salesperson who made the sale
sale_date DATE, -- Date the sale occurred
quantity INTEGER -- Quantity of product sold
);
CREATE TABLE product_suppliers (
supplier_id INTEGER PRIMARY KEY, -- Unique ID for each supplier
product_id INTEGER, -- Product ID supplied
supply_price DECIMAL(10,2) -- Unit price charged by supplier
);
-- sales.product_id can be joined with products.product_id
-- sales.customer_id can be joined with customers.customer_id
-- sales.salesperson_id can be joined with salespeople.salesperson_id
-- product_suppliers.product_id can be joined with products.product_id
### Response:
Based on your instructions, here is the SQL query I have generated to answer the question `{question}`:
```sql
"""
示例输出
SELECT customers.customer_id,
products.product_id,
(SUM(sales.quantity * product_suppliers.supply_price)) / (SELECT
SUM(quantity)
FROM sales
WHERE
sales.salesperson_id = '{chosen_salesperson}') as profit
FROM sales
INNER JOIN customers on sales.customer_id = customers.customer_id
INNER JOIN products on sales.product_id = products.product_id
INNER JOIN product_suppliers on products.product_id =
product_suppliers.product_id
WHERE sales.salesperson_id = '{chosen_salesperson}'
GROUP BY customers.customer_id, products.product_id
HAVING COUNT(DISTINCT sales.sale_date) >= {min_days}
AND (SELECT SUM(quantity * product_suppliers.supply_price)
FROM sales
WHERE sales.salesperson_id = '{chosen_salesperson}'
and sales.customer_id = customers.customer_id
and sales.product_id = products.product_id) >= {min_total}
ORDER BY profit DESC;
```
### Reasoning
The query is broken down into two parts: the first one fetches information
about each sale made by a specific salesman (`{chosen_salesperson}`), and
groups it by customer ID, product ID, and calculates how much money was
earned from each sale. The second part then filters out these results that
occurred after a certain number of days (determined by the variable
`min_days`) and had an amount of money equal to or more than a certain
value (determined by the variable `min_total`). Finally, the query orders
the results in descending order based on the ratio between earned money
and total money.
The first part relies on several tables from the database: salespeople,
products, customers, sales, and product_suppliers. The salespeople table
contains information about each salesperson's ID; the products table
contains information about each product's name and price; the customers
table contains information about each customer's address; the sales table
contains information about each sale, including the quantity sold, the ID
of the product sold, and the ID of the customer who made the purchase; and
the product_suppliers table contains information about the unit price
charged by each supplier.
The variables in the query can be interpreted as follows:
- `{chosen_salesperson}` is a placeholder for the name of the salesman
whose performance is being evaluated. This value should be replaced with
an actual salesman's name or ID number.
- `min_days` is the minimum number of days that must have passed since the
sale in order to qualify for inclusion in the calculation. For example, if
this variable were set to 30, then only sales made 30 days after the start
date would be included in the calculation. This value should also be
replaced with an integer representing the minimum number of days.
- `min_total` is a threshold amount that must have been earned by the
salesman for the sale to qualify for inclusion in the calculation. For
example, if this variable were set to 1000, then only sales worth at least
$1000 would be included in the calculation. This value should also be
replaced with an integer representing the minimum amount of money that
must have been earned.
The output is a table containing three columns: customer ID, product ID,
and profit per unit price (calculated as the total money earned divided by
the quantity sold). The rows in this table are sorted in descending order
based on the ratio between earned money and total money. Rows that don't
meet both of these conditions will be dropped from the output.
## Example
```
Input:
min_days = 30
min_total = 1000
chosen_salesperson = '<NAME>'
Output:
customer_id | product_id | profit
-------------+----------------------------+-----------------
987 | 456 | 2.1
1029 | 1357 | 1.91
89 | 1234 | 0.59
```
## License
The database is released under the [Open Database
License](https://opendatacommons.org/licenses/odbl/1-0/).