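[Custom Indicators] How should the SQL label for a neural network (DNN) model be written? Prediction always fails with a dimension mismatch because of an extra label column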

Created by bq30zy4n, last updated by small_q on 2025-02-16 01:51. Viewed by 15 users.

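/* Use DAI SQL to generate label data for quantitative model prediction. The label reflects the return over the next 5 days and is discretized into 20 buckets, each covering a range of returns. This way the model can be trained to predict a future return range rather than only a specific return value.

First, a temporary table named label_data is defined to compute and store the future 5-day return, its 1% and 99% quantiles, and the discretized return (split into 20 buckets, each covering a return range).
The future 5-day return is clipped so that only values between the 1% and 99% quantiles are kept.
Rows are selected where the label is not null and the stock is not at limit-up or limit-down (the next day's high is not equal to its low).
The date, instrument, and label fields are then selected from this temporary table for model training. */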

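SELECT
-- Compute the future 5-day return, obtained by dividing the close 5 days ahead by the next day's open. This uses the m_lead function, which returns a field's value on a future day.
-- _future_return is an intermediate variable name; aliased columns starting with _ are not returned in the final result
(m_lead(close, 2) / m_lead(open, 1))*2 AS _future_return,
-- m_lead(close, 2) / m_lead(close, 1) AS _future_return,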

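-- Compute the 1% quantile of the future 5-day return. all_quantile_cont is a quantile function that computes a quantile of a field's values; here it computes the 1% quantile.
c_quantile_cont(_future_return, 0.01) AS _future_return_1pct,

-- Compute the 99% quantile of the future 5-day return. Likewise, all_quantile_cont is used to compute the 99% quantile.
c_quantile_cont(_future_return, 0.99) AS _future_return_99pct,

-- Clip the future 5-day return: values between the 1% and 99% quantiles are kept, and values outside this range are set to the boundary values.
clip(_future_return, _future_return_1pct, _future_return_99pct) AS _clipped_return,

-- Split the clipped future 5-day return into 20 buckets, each covering a range. The all_wbins function discretizes data into multiple buckets.
-- cbins(_clipped_return, 10) AS _binned_return,

-- Use the discretized data as the label; this is the prediction target.
-- _binned_return AS label,
_clipped_return AS label,
-- Date; there is one row per stock per day
date,

-- Instrument code, identifying each stock
instrument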

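-- Select data from the cn_stock_bar1d table, which stores daily bar data for stocks
FROM cn_stock_bar1d
QUALIFY
COLUMNS(*) IS NOT NULL
-- The label is not null, and the stock is not at limit-up or limit-down (the next day's high is not equal to its low)
AND m_lead(high, 1) != m_lead(low, 1)
ORDER BY date, instrument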
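
The clipping step in the query can be illustrated outside SQL. This is a minimal Python sketch, not DAI's implementation: `quantile_cont`, `clip_value`, and `make_labels` are hypothetical helpers, the quantile is assumed to use linear interpolation between order statistics, and the sketch does not address whether the quantiles are taken per date or over the whole table. The sample uses 25% and 75% cut-offs instead of 1% and 99% only so that the effect is visible on five values.

```python
def quantile_cont(values, q):
    """Continuous quantile, assuming linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    pos = q * (len(ordered) - 1)          # fractional index into the sorted data
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def clip_value(value, low, high):
    """Pull a value back to the nearest boundary if it lies outside [low, high]."""
    return max(low, min(value, high))


def make_labels(future_returns, low_q=0.01, high_q=0.99):
    """Clip every return to the [low_q, high_q] quantile range of the sample."""
    low = quantile_cont(future_returns, low_q)
    high = quantile_cont(future_returns, high_q)
    return [clip_value(r, low, high) for r in future_returns]


# Made-up sample values with one outlier at each end.
returns = [0.5, 1.9, 2.0, 2.1, 9.0]
labels = make_labels(returns, low_q=0.25, high_q=0.75)
print(labels)
```

Only the two extreme values are pulled to the boundaries; the values in between pass through unchanged.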

Error:

```
Output exceeds the size limit. Open the full output data in a text editor
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[27], line 1422
   1416 m32 = M.dl_model_init.v1(
   1417     inputs=m3.data,
   1418     outputs=m27.data
   1419 )
   1421 # @module(position="-1457.6939697265625,367.9156188964844", comment='', comment_collapsed=True)
-> 1422 m37 = M.dl_model_train.v1(
   1423     input_model=m32.data,
   1424     training_data=m62.data_1,
   1425     validation_data=m62.data_2,
   1426     optimizer='Adam',
   1427     loss='mean_squared_error',
   1428     metrics='mse',
   1429     batch_size=1024,
   1430     epochs=30,
   1431     custom_objects=m37_custom_objects_bigquant_run,
   1432     n_gpus=1,
   1433     verbose='2:每个epoch输出一行记录',
   1434     m_cached=False
   1435 )
   1436 # </aistudiograph>

File module2/common/modulemanagerv2.py:88, in biglearning.module2.common.modulemanagerv2.BigQuantModuleVersion.call()

File module2/common/moduleinvoker.py:370, in biglearning.module2.common.moduleinvoker.module_invoke()

File module2/common/moduleinvoker.py:292, in biglearning.module2.common.moduleinvoker._invoke_with_cache()

File module2/common/moduleinvoker.py:253, in biglearning.module2.common.moduleinvoker._invoke_with_cache()

File module2/common/moduleinvoker.py:210, in biglearning.module2.common.moduleinvoker._module_run()

File module2/modules/dl_model_train/v1/init.py:129, in biglearning.module2.modules.dl_model_train.v1.init.bigquant_run()

File module2/modules/dl_model_train/v1/init.py:114, in biglearning.module2.modules.dl_model_train.v1.init.bigquant_run.do_run()

File /usr/local/python3/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py:1100, in Model.fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
   1093 with trace.Trace(
   1094     'train',
   1095     epoch_num=epoch,
   1096     step_num=step,
   1097     batch_size=batch_size,
   1098     _r=1):
   1099   callbacks.on_train_batch_begin(step)
-> 1100   tmp_logs = self.train_function(iterator)
   1101   if data_handler.should_sync:

...

    return fn(*args, **kwargs)
/usr/local/python3/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py:788 run_step  **
    outputs = model.train_step(data)
/usr/local/python3/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py:754 train_step
    y_pred = self(x, training=True)
/usr/local/python3/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py:998 call
    input_spec.assert_input_compatibility(self.input_spec, inputs, self.name)
/usr/local/python3/lib/python3.8/site-packages/tensorflow/python/keras/engine/input_spec.py:271 assert_input_compatibility
    raise ValueError('Input ' + str(input_index) +

ValueError: Input 0 is incompatible with layer BigQuantDL: expected shape=(None, 17), found shape=(None, 18)
```
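
The final `ValueError` says the input layer was built for 17 columns but received 18. One plausible cause, matching the suspicion in the title, is that the label column is passed to the model together with the 17 features. The plain-Python sketch below only illustrates that off-by-one and one way to separate the two. It is not BigQuant's API: the column names `feature_0` … `feature_16`, the sample rows, and the helper `split_features_and_label` are all hypothetical.

```python
FEATURE_COUNT = 17  # input width the layer expects, taken from the error message

# Hypothetical column layout: keys, 17 features, then the label from the SQL query.
feature_columns = [f"feature_{i}" for i in range(FEATURE_COUNT)]
all_columns = ["date", "instrument"] + feature_columns + ["label"]

# Two made-up rows in that layout.
rows = [
    ["2025-01-02", "000001.SZ"] + [0.1] * FEATURE_COUNT + [2.0],
    ["2025-01-02", "000002.SZ"] + [0.2] * FEATURE_COUNT + [1.9],
]


def split_features_and_label(rows, columns, feature_columns, label_column="label"):
    """Return (x, y), where x holds only the named feature columns and y the label."""
    index = {name: i for i, name in enumerate(columns)}
    missing = [c for c in feature_columns + [label_column] if c not in index]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    x = [[row[index[c]] for c in feature_columns] for row in rows]
    y = [row[index[label_column]] for row in rows]
    return x, y


# Naive selection: dropping only the key columns leaves features plus label.
naive_width = len([c for c in all_columns if c not in ("date", "instrument")])
print(naive_width)   # 18, the "found" shape in the error

# Explicit selection: only the feature columns go into x.
x, y = split_features_and_label(rows, all_columns, feature_columns)
print(len(x[0]))     # 17, the "expected" shape in the error
```

Selecting features by an explicit list of names, rather than by "everything except the keys", keeps the input width stable when extra columns such as `label` are joined onto the table.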