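Direct use
Open "How to train a DeepFM model with EasyRec" and click "Open in DSW" in the upper-right corner.
Training a DeepFM model with EasyRec
EasyRec aims to be an easy-to-use recommendation framework for industrial use. It implements mainstream retrieval (recall), ranking, and multi-objective algorithms, all of which have been validated in real-world scenarios.
This sample notebook can train any EasyRec model, including DeepFM, MultiTower, DIN, and DSSM. See the EasyRec user manual for details.
DeepFM is a widely used ranking model that consists of three parts: wide, FM, and deep. The rest of this notebook uses DeepFM as the example.
DeepFM improves on WideAndDeep by adding an FM module. The FM module and the DNN module share the same features, that is, the same embeddings.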
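To make the shared-embedding idea concrete, here is a minimal pure-Python sketch of one common formulation of the FM second-order term. It is an illustration under stated assumptions, not EasyRec's actual implementation: the function names and the toy sizes (3 fields, embedding dimension 8) are hypothetical. The same per-field embedding vectors feed both the FM interaction and the input of the deep part.

```python
import itertools
import random


def fm_second_order(embeddings):
    """Second-order FM term for one sample, one value per embedding dimension.

    Uses 0.5 * ((sum_i v_i)^2 - sum_i v_i^2), which equals the sum of
    element-wise products over all pairs of fields.
    """
    dim = len(embeddings[0])
    out = []
    for d in range(dim):
        column = [v[d] for v in embeddings]
        total = sum(column)
        total_of_squares = sum(x * x for x in column)
        out.append(0.5 * (total * total - total_of_squares))
    return out


def fm_pairwise(embeddings):
    """Brute-force reference: explicit sum over all field pairs."""
    dim = len(embeddings[0])
    out = [0.0] * dim
    for a, b in itertools.combinations(embeddings, 2):
        for d in range(dim):
            out[d] += a[d] * b[d]
    return out


def deep_input(embeddings):
    """The deep part consumes the same embeddings, concatenated."""
    return [x for v in embeddings for x in v]


random.seed(0)
# Hypothetical sample: 3 fields, each looked up as an 8-dimensional embedding.
embeddings = [[random.uniform(-1, 1) for _ in range(8)] for _ in range(3)]
fm_out = fm_second_order(embeddings)   # FM branch
deep_in = deep_input(embeddings)       # DNN branch, same embeddings
```

Because both branches read the same vectors, gradients from the FM term and from the DNN update the same embedding table.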
model_config: {
  model_class: "DeepFM"
  feature_groups: {
    group_name: "deep"
    feature_names: "hour"
    feature_names: "c1"
    ...
    feature_names: "site_id_app_id"
    wide_deep: DEEP
  }
  feature_groups: {
    group_name: "wide"
    feature_names: "hour"
    feature_names: "c1"
    ...
    feature_names: "c21"
    wide_deep: WIDE
  }
  deepfm {
    wide_output_dim: 16
    dnn {
      hidden_units: [128, 64, 32]
    }
    final_dnn {
      hidden_units: [128, 64]
    }
    l2_regularization: 1e-5
  }
  embedding_regularization: 1e-7
}
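model_class: 'DeepFM'; no change needed.
feature_groups: two feature groups are required, a wide group and a deep group. The group names must not be changed.
deepfm: DeepFM-specific parameters
  dnn: configuration of the deep part
    hidden_units: number of neurons in each DNN layer
  wide_output_dim: output size of the wide part
  final_dnn: configuration of the DNN that combines the outputs of the wide, FM, and deep parts; using it is optional
embedding_regularization: regularization applied to the embeddings to prevent overfitting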
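As a rough illustration of what hidden_units implies, the sketch below derives the weight-matrix shapes and parameter count of a plain fully connected tower. It assumes the deep part receives the concatenation of 22 feature embeddings of dimension 8 (the numbers used in the full config of this notebook) and counts only weights and biases. The helper names are hypothetical, and anything else the framework may add (for example batch normalization) is ignored.

```python
def dense_layer_shapes(input_dim, hidden_units):
    """Return (in_dim, out_dim) for each fully connected layer in order."""
    shapes = []
    prev = input_dim
    for units in hidden_units:
        shapes.append((prev, units))
        prev = units
    return shapes


def count_parameters(shapes):
    """Weights plus one bias per output unit for every layer."""
    return sum(in_dim * out_dim + out_dim for in_dim, out_dim in shapes)


# Assumption: 22 features, each embedded into 8 dimensions and concatenated.
num_fields, embedding_dim = 22, 8
shapes = dense_layer_shapes(num_fields * embedding_dim, [128, 64, 32])
total = count_parameters(shapes)
print(shapes)  # [(176, 128), (128, 64), (64, 32)]
print(total)   # 32992
```

Halving the layer widths roughly quarters the parameter count of the inner layers, which is one way to judge how much capacity a given hidden_units setting buys.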
1. Environment setup
1.1 Install EasyRec 0.4.7
!pip3 install https://github.com/alibaba/EasyRec/releases/download/v0.4.7/easy_rec-0.4.7-py2.py3-none-any.whl
Looking in indexes: http://yum.tbsite.net/pypi/simple
Collecting easy-rec==0.4.7
  Downloading https://github.com/alibaba/EasyRec/releases/download/v0.4.7/easy_rec-0.4.7-py2.py3-none-any.whl (4.3 MB)
     |████████████████████████████████| 4.3 MB 309 kB/s
Requirement already satisfied: xlrd>=0.9.0 in /home/pai/lib/python3.6/site-packages (from easy-rec==0.4.7) (2.0.1)
Requirement already satisfied: matplotlib in /home/pai/lib/python3.6/site-packages (from easy-rec==0.4.7) (3.3.4)
Requirement already satisfied: pandas in /home/pai/lib/python3.6/site-packages (from easy-rec==0.4.7) (1.1.5)
Requirement already satisfied: future in /home/pai/lib/python3.6/site-packages (from easy-rec==0.4.7) (0.18.2)
Requirement already satisfied: PyYAML in /home/pai/lib/python3.6/site-packages (from easy-rec==0.4.7) (5.4.1)
Requirement already satisfied: psutil in /home/pai/lib/python3.6/site-packages (from easy-rec==0.4.7) (5.9.0)
Requirement already satisfied: scikit-learn in /home/pai/lib/python3.6/site-packages (from easy-rec==0.4.7) (0.24.2)
Requirement already satisfied: numpy>=1.15 in /home/pai/lib/python3.6/site-packages (from matplotlib->easy-rec==0.4.7) (1.16.6)
Requirement already satisfied: cycler>=0.10 in /home/pai/lib/python3.6/site-packages (from matplotlib->easy-rec==0.4.7) (0.11.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.3 in /home/pai/lib/python3.6/site-packages (from matplotlib->easy-rec==0.4.7) (3.0.9)
Requirement already satisfied: python-dateutil>=2.1 in /home/pai/lib/python3.6/site-packages (from matplotlib->easy-rec==0.4.7) (2.8.2)
Requirement already satisfied: pillow>=6.2.0 in /home/pai/lib/python3.6/site-packages (from matplotlib->easy-rec==0.4.7) (8.3.2)
Requirement already satisfied: kiwisolver>=1.0.1 in /home/pai/lib/python3.6/site-packages (from matplotlib->easy-rec==0.4.7) (1.3.1)
Requirement already satisfied: pytz>=2017.2 in /home/pai/lib/python3.6/site-packages (from pandas->easy-rec==0.4.7) (2022.1)
Requirement already satisfied: joblib>=0.11 in /home/pai/lib/python3.6/site-packages (from scikit-learn->easy-rec==0.4.7) (1.0.1)
Requirement already satisfied: scipy>=0.19.1 in /home/pai/lib/python3.6/site-packages (from scikit-learn->easy-rec==0.4.7) (1.5.3)
Requirement already satisfied: threadpoolctl>=2.0.0 in /home/pai/lib/python3.6/site-packages (from scikit-learn->easy-rec==0.4.7) (3.1.0)
Requirement already satisfied: six>=1.5 in /home/pai/lib/python3.6/site-packages (from python-dateutil>=2.1->matplotlib->easy-rec==0.4.7) (1.16.0)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
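After installation it can be useful to confirm which version is actually present. The following is a small sketch that queries package metadata by distribution name. The name "easy-rec" is taken from the pip output above, and the helper installed_version is hypothetical. It uses importlib.metadata where available (Python 3.8+) and falls back to pkg_resources on older interpreters such as the Python 3.6 environment seen in the log.

```python
def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        from importlib import metadata  # available on Python 3.8+
    except ImportError:
        metadata = None

    if metadata is not None:
        try:
            return metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            return None

    # Fallback for older interpreters (requires setuptools).
    import pkg_resources
    try:
        return pkg_resources.get_distribution(dist_name).version
    except pkg_resources.DistributionNotFound:
        return None


version = installed_version("easy-rec")
print("easy-rec:", version or "not installed")
```

If the printed version is not 0.4.7, the later cells may behave differently from the output recorded in this notebook.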
1.2 Download the training set, the test set, and the DeepFM config file
!mkdir -p data/
!wget https://easyrec.oss-cn-beijing.aliyuncs.com/dsw/dwd_avazu_train_1w -O data/dwd_avazu_ctr_deepmodel_train.csv
!wget https://easyrec.oss-cn-beijing.aliyuncs.com/dsw/dwd_avazu_test_5k -O data/dwd_avazu_ctr_deepmodel_test.csv
!wget https://easyrec.oss-cn-beijing.aliyuncs.com/dsw/dwd_avazu_ctr_deepmodel_dsw.config -O data/dwd_avazu_ctr_deepmodel_dsw.config
--2022-11-01 17:05:35-- https://easyrec.oss-cn-beijing.aliyuncs.com/dsw/dwd_avazu_train_1w
Resolving easyrec.oss-cn-beijing.aliyuncs.com... 59.110.185.48
Connecting to easyrec.oss-cn-beijing.aliyuncs.com|59.110.185.48|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1213131 (1.2M) [application/octet-stream]
Saving to: ‘data/dwd_avazu_ctr_deepmodel_train.csv’

data/dwd_avazu_ctr_ 100%[===================>] 1.16M 5.92MB/s in 0.2s

2022-11-01 17:05:36 (5.92 MB/s) - ‘data/dwd_avazu_ctr_deepmodel_train.csv’ saved [1213131/1213131]

--2022-11-01 17:05:36-- https://easyrec.oss-cn-beijing.aliyuncs.com/dsw/dwd_avazu_test_5k
Resolving easyrec.oss-cn-beijing.aliyuncs.com... 59.110.185.48
Connecting to easyrec.oss-cn-beijing.aliyuncs.com|59.110.185.48|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 605691 (591K) [application/octet-stream]
Saving to: ‘data/dwd_avazu_ctr_deepmodel_test.csv’

data/dwd_avazu_ctr_ 100%[===================>] 591.50K 3.62MB/s in 0.2s

2022-11-01 17:05:36 (3.62 MB/s) - ‘data/dwd_avazu_ctr_deepmodel_test.csv’ saved [605691/605691]

--2022-11-01 17:05:36-- https://easyrec.oss-cn-beijing.aliyuncs.com/dsw/dwd_avazu_ctr_deepmodel_dsw.config
Resolving easyrec.oss-cn-beijing.aliyuncs.com... 59.110.185.48
Connecting to easyrec.oss-cn-beijing.aliyuncs.com|59.110.185.48|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 7128 (7.0K) [application/octet-stream]
Saving to: ‘data/dwd_avazu_ctr_deepmodel_dsw.config’

data/dwd_avazu_ctr_ 100%[===================>] 6.96K --.-KB/s in 0s

2022-11-01 17:05:37 (153 MB/s) - ‘data/dwd_avazu_ctr_deepmodel_dsw.config’ saved [7128/7128]
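Before training, a quick sanity check of the downloaded CSV files can catch malformed rows early. The sketch below counts how many rows have each column count. The expected value of 23 (one label plus 22 feature columns) is taken from the input_fields list of the downloaded config. The helper name and the tiny synthetic demo file are hypothetical, so that the example runs without the real data.

```python
import csv
import os
import tempfile

EXPECTED_COLUMNS = 23  # label + 22 feature columns listed under data_config


def column_count_histogram(path, separator=","):
    """Map each observed column count to the number of rows that have it."""
    counts = {}
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter=separator):
            counts[len(row)] = counts.get(len(row), 0) + 1
    return counts


# Demo on a tiny synthetic file so the sketch is self-contained.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.csv")
with open(demo_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["0"] + ["x"] * 22)  # well-formed row: 23 columns
    writer.writerow(["1"] + ["x"] * 21)  # malformed row: 22 columns

demo_counts = column_count_histogram(demo_path)
bad_rows = sum(n for cols, n in demo_counts.items() if cols != EXPECTED_COLUMNS)
print(demo_counts, "rows with unexpected width:", bad_rows)
```

For the real data, the same function can be called with data/dwd_avazu_ctr_deepmodel_train.csv and data/dwd_avazu_ctr_deepmodel_test.csv. A single key equal to 23 would indicate that every row matches the config.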
The model configuration file:
train_input_path: "data/dwd_avazu_ctr_deepmodel_train.csv"
eval_input_path: "data/dwd_avazu_ctr_deepmodel_test.csv"
model_dir: "experiments/dwd_avazu_ctr/"

train_config {
  num_steps: 500
  save_checkpoints_steps: 50
  save_summary_steps: 10
  log_step_count_steps: 10
  optimizer_config: {
    adam_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.0001
          decay_steps: 100000
          decay_factor: 0.5
          min_learning_rate: 0.0000001
        }
      }
    }
    use_moving_average: false
  }
  sync_replicas: true
}

eval_config {
  metrics_set: {
    auc {}
  }
}

data_config {
  label_fields: "label"
  batch_size: 1024
  input_type: CSVInput
  separator: ","
  input_fields: { input_name: "label" input_type: INT64 default_val: "0" }
  input_fields: { input_name: "hour" input_type: STRING default_val: "" }
  input_fields: { input_name: "c1" input_type: STRING default_val: "" }
  input_fields: { input_name: "banner_pos" input_type: STRING default_val: "" }
  input_fields: { input_name: "site_id" input_type: STRING default_val: "" }
  input_fields: { input_name: "site_domain" input_type: STRING default_val: "" }
  input_fields: { input_name: "site_category" input_type: STRING default_val: "" }
  input_fields: { input_name: "app_id" input_type: STRING default_val: "" }
  input_fields: { input_name: "app_domain" input_type: STRING default_val: "" }
  input_fields: { input_name: "app_category" input_type: STRING default_val: "" }
  input_fields: { input_name: "device_id" input_type: STRING default_val: "" }
  input_fields: { input_name: "device_ip" input_type: STRING default_val: "" }
  input_fields: { input_name: "device_model" input_type: STRING default_val: "" }
  input_fields: { input_name: "device_type" input_type: STRING default_val: "" }
  input_fields: { input_name: "device_conn_type" input_type: STRING default_val: "" }
  input_fields: { input_name: "c14" input_type: STRING default_val: "" }
  input_fields: { input_name: "c15" input_type: STRING default_val: "" }
  input_fields: { input_name: "c16" input_type: STRING default_val: "" }
  input_fields: { input_name: "c17" input_type: STRING default_val: "" }
  input_fields: { input_name: "c18" input_type: STRING default_val: "" }
  input_fields: { input_name: "c19" input_type: STRING default_val: "" }
  input_fields: { input_name: "c20" input_type: STRING default_val: "" }
  input_fields: { input_name: "c21" input_type: STRING default_val: "" }
}

feature_config {
  features { input_names: "hour" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 50 }
  features { input_names: "c1" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 10 }
  features { input_names: "banner_pos" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 10 }
  features { input_names: "site_id" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 10000 }
  features { input_names: "site_domain" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 100 }
  features { input_names: "site_category" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 100 }
  features { input_names: "app_id" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 10000 }
  features { input_names: "app_domain" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 1000 }
  features { input_names: "app_category" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 100 }
  features { input_names: "device_id" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 100000 }
  features { input_names: "device_ip" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 100000 }
  features { input_names: "device_model" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 10000 }
  features { input_names: "device_type" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 10 }
  features { input_names: "device_conn_type" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 10 }
  features { input_names: "c14" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 500 }
  features { input_names: "c15" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 500 }
  features { input_names: "c16" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 500 }
  features { input_names: "c17" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 500 }
  features { input_names: "c18" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 500 }
  features { input_names: "c19" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 500 }
  features { input_names: "c20" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 500 }
  features { input_names: "c21" feature_type: IdFeature embedding_dim: 8 hash_bucket_size: 500 }
}

model_config: {
  model_class: "DeepFM"
  feature_groups: {
    group_name: "wide"
    feature_names: "c1"
    feature_names: "banner_pos"
    feature_names: "site_id"
    feature_names: "site_domain"
    feature_names: "site_category"
    feature_names: "app_id"
    feature_names: "app_domain"
    feature_names: "app_category"
    feature_names: "device_id"
    feature_names: "device_ip"
    feature_names: "device_model"
    feature_names: "device_type"
    feature_names: "device_conn_type"
    feature_names: "hour"
    feature_names: "c14"
    feature_names: "c15"
    feature_names: "c16"
    feature_names: "c17"
    feature_names: "c18"
    feature_names: "c19"
    feature_names: "c20"
    feature_names: "c21"
    wide_deep: WIDE
  }
  feature_groups: {
    group_name: "deep"
    feature_names: "c1"
    feature_names: "banner_pos"
    feature_names: "site_id"
    feature_names: "site_domain"
    feature_names: "site_category"
    feature_names: "app_id"
    feature_names: "app_domain"
    feature_names: "app_category"
    feature_names: "device_id"
    feature_names: "device_ip"
    feature_names: "device_model"
    feature_names: "device_type"
    feature_names: "device_conn_type"
    feature_names: "hour"
    feature_names: "c14"
    feature_names: "c15"
    feature_names: "c16"
    feature_names: "c17"
    feature_names: "c18"
    feature_names: "c19"
    feature_names: "c20"
    feature_names: "c21"
    wide_deep: DEEP
  }
  deepfm {
    wide_output_dim: 8
    dnn {
      hidden_units: [128, 64, 32]
    }
    final_dnn {
      hidden_units: [64, 32]
    }
    l2_regularization: 1e-4
  }
  embedding_regularization: 1e-4
}
With the configuration above, training runs for 500 steps, and the features and the label are defined in detail.
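To see what the exponential_decay_learning_rate block means for a 500-step run, the sketch below evaluates the usual exponential-decay formula, lr = initial_learning_rate * decay_factor ^ (step / decay_steps), floored at min_learning_rate. Whether EasyRec applies the exponent in whole steps (staircase) or continuously is treated here as an assumption, so the hypothetical helper exposes both variants.

```python
def exponential_decay_lr(step, initial_learning_rate, decay_steps,
                         decay_factor, min_learning_rate=0.0, staircase=True):
    """Exponentially decayed learning rate, never below min_learning_rate."""
    if staircase:
        exponent = step // decay_steps  # decay only at multiples of decay_steps
    else:
        exponent = step / decay_steps   # smooth decay at every step
    lr = initial_learning_rate * decay_factor ** exponent
    return max(lr, min_learning_rate)


# Values copied from train_config in the configuration file.
schedule = dict(initial_learning_rate=0.0001, decay_steps=100000,
                decay_factor=0.5, min_learning_rate=0.0000001)

lr_stair = exponential_decay_lr(500, staircase=True, **schedule)
lr_smooth = exponential_decay_lr(500, staircase=False, **schedule)
print(lr_stair, lr_smooth)
```

With decay_steps set to 100000 and only 500 training steps, the learning rate stays essentially at its initial value of 0.0001 under either variant, so the decay settings have little practical effect in this short demo run.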
1.3 Train the DeepFM model with the configuration above
!rm -rf experiments/dwd_avazu_ctr/
!python3 -m easy_rec.python.train_eval --pipeline_config_path data/dwd_avazu_ctr_deepmodel_dsw.config
================================================ | PAI Tensorflow powered by Aliyun PAI Team. | ================================================ Please ignore the following import error if you are using tunnel table io. No module named '_common_io' [2022-11-01 17:05:39,669][WARNING] pyhive is not installed. [2022-11-01 17:05:46,539] [WARNING] [4585#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/input/csv_input.py:14: ignore_errors (from tensorflow.contrib.data.python.ops.error_ops) is deprecated and will be removed in a future version. Instructions for updating: Use `tf.data.experimental.ignore_errors()`. [2022-11-01 17:05:46,539][WARNING] From /home/pai/lib/python3.6/site-packages/easy_rec/python/input/csv_input.py:14: ignore_errors (from tensorflow.contrib.data.python.ops.error_ops) is deprecated and will be removed in a future version. Instructions for updating: Use `tf.data.experimental.ignore_errors()`. [2022-11-01 17:05:46,540][WARNING] DataHub is not installed. You can install it by: pip install pydatahub easy_rec version: 0.4.7 Usage: easy_rec.help() [2022-11-01 17:05:46,547] [INFO] [4585#MainThread] [tensorflow/python/util/auto_strategy_utils.py:108] Disable Auto Strategy. [2022-11-01 17:05:46,547][INFO] Disable Auto Strategy. 
[2022-11-01 17:05:46,555] [INFO] [4585#MainThread] [tensorflow/python/estimator/estimator.py:205] Using config: {'_model_dir': 'experiments/dwd_avazu_ctr/', '_tf_random_seed': None, '_save_summary_steps': 10, '_save_checkpoints_steps': 50, '_save_checkpoints_secs': None, '_session_config': device_filters: "/job:ps" gpu_options { } allow_soft_placement: true , '_keep_checkpoint_max': 10, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 10, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fd05d6bcbe0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1} [2022-11-01 17:05:46,555][INFO] Using config: {'_model_dir': 'experiments/dwd_avazu_ctr/', '_tf_random_seed': None, '_save_summary_steps': 10, '_save_checkpoints_steps': 50, '_save_checkpoints_secs': None, '_session_config': device_filters: "/job:ps" gpu_options { } allow_soft_placement: true , '_keep_checkpoint_max': 10, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 10, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fd05d6bcbe0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1} [2022-11-01 17:05:46,562][INFO] Writing protobuf message file to experiments/dwd_avazu_ctr/pipeline.config [2022-11-01 17:05:46,576][INFO] check_mode: False [2022-11-01 17:05:46,577][INFO] check_mode: False [2022-11-01 17:05:46,577][INFO] check_mode: False [2022-11-01 17:05:46,577] [INFO] [4585#MainThread] 
[tensorflow/python/distribute/estimator_training.py:185] Not using Distribute Coordinator. [2022-11-01 17:05:46,577][INFO] Not using Distribute Coordinator. [2022-11-01 17:05:46,578] [INFO] [4585#MainThread] [tensorflow/python/estimator/training.py:612] Running training and evaluation locally (non-distributed). [2022-11-01 17:05:46,578][INFO] Running training and evaluation locally (non-distributed). [2022-11-01 17:05:46,578] [INFO] [4585#MainThread] [tensorflow/python/estimator/training.py:700] Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps 50 or save_checkpoints_secs None. [2022-11-01 17:05:46,578][INFO] Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps 50 or save_checkpoints_secs None. [2022-11-01 17:05:46,594][INFO] train files[1]: data/dwd_avazu_ctr_deepmodel_train.csv [2022-11-01 17:05:46,695] [INFO] [4585#MainThread] [tensorflow/python/estimator/estimator.py:1226] Calling model_fn. [2022-11-01 17:05:46,695][INFO] Calling model_fn. [2022-11-01 17:05:46,696][INFO] shared embeddings[num=0] [2022-11-01 17:05:46,706] [WARNING] [4585#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column.py:206: EmbeddingColumn._get_dense_tensor (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. 
[2022-11-01 17:05:46,706][WARNING] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column.py:206: EmbeddingColumn._get_dense_tensor (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. [2022-11-01 17:05:46,707] [WARNING] [4585#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column_v2.py:3643: HashedCategoricalColumn._get_sparse_tensors (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. [2022-11-01 17:05:46,707][WARNING] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column_v2.py:3643: HashedCategoricalColumn._get_sparse_tensors (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. [2022-11-01 17:05:46,708] [WARNING] [4585#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column.py:2119: HashedCategoricalColumn._transform_feature (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. 
[2022-11-01 17:05:46,708][WARNING] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column.py:2119: HashedCategoricalColumn._transform_feature (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. [2022-11-01 17:05:46,714] [WARNING] [4585#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column_v2.py:3537: HashedCategoricalColumn._num_buckets (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. [2022-11-01 17:05:46,714][WARNING] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column_v2.py:3537: HashedCategoricalColumn._num_buckets (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. [2022-11-01 17:05:46,752] [WARNING] [4585#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column.py:207: EmbeddingColumn._variable_shape (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. 
[2022-11-01 17:05:46,752][WARNING] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column.py:207: EmbeddingColumn._variable_shape (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. [2022-11-01 17:05:49,051][INFO] dnn activation function = tf.nn.relu [2022-11-01 17:05:49,184][INFO] dnn activation function = tf.nn.relu [2022-11-01 17:05:53,080] [INFO] [4585#MainThread] [tensorflow/python/training/optimizer.py:735] AsyncGrad is disable [2022-11-01 17:05:53,080][INFO] AsyncGrad is disable [2022-11-01 17:05:53,493] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:594] Create CheckpointSaverHook. [2022-11-01 17:05:53,493][INFO] Create CheckpointSaverHook. [2022-11-01 17:05:53,493] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:609] Init incremental saver , incremental_save:False, incremental_path:experiments/dwd_avazu_ctr/.incremental_checkpoint/incremental_model.ckpt [2022-11-01 17:05:53,493][INFO] Init incremental saver , incremental_save:False, incremental_path:experiments/dwd_avazu_ctr/.incremental_checkpoint/incremental_model.ckpt [2022-11-01 17:05:53,493] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:618] Optimize start time: False [2022-11-01 17:05:53,493][INFO] Optimize start time: False [2022-11-01 17:05:53,493] [INFO] [4585#MainThread] [tensorflow/python/estimator/estimator.py:1228] Done calling model_fn. [2022-11-01 17:05:53,493][INFO] Done calling model_fn. 
[2022-11-01 17:05:53,495] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:734] [_MonitoredSession __init__][call hook begin] [2022-11-01 17:05:53,495][INFO] [_MonitoredSession __init__][call hook begin] [2022-11-01 17:05:53,495] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:736] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.StepCounterHook object at 0x7fd054bb7ef0>] [2022-11-01 17:05:53,495][INFO] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.StepCounterHook object at 0x7fd054bb7ef0>] [2022-11-01 17:05:54,515] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:736] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.SummarySaverHook object at 0x7fd047019b00>] [2022-11-01 17:05:54,515][INFO] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.SummarySaverHook object at 0x7fd047019b00>] [2022-11-01 17:05:54,516] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:736] [_MonitoredSession __init__][hook begin:<tensorflow.python.estimator.util._DatasetInitializerHook object at 0x7fd05d1f2390>] [2022-11-01 17:05:54,516][INFO] [_MonitoredSession __init__][hook begin:<tensorflow.python.estimator.util._DatasetInitializerHook object at 0x7fd05d1f2390>] [2022-11-01 17:05:54,517] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:736] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.StopAtStepHook object at 0x7fd05d6c8ac8>] [2022-11-01 17:05:54,517][INFO] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.StopAtStepHook object at 0x7fd05d6c8ac8>] [2022-11-01 17:05:54,518] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:736] [_MonitoredSession __init__][hook 
begin:<tensorflow.python.training.basic_session_run_hooks.NanTensorHook object at 0x7fd054d70908>] [2022-11-01 17:05:54,518][INFO] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.NanTensorHook object at 0x7fd054d70908>] [2022-11-01 17:05:54,518] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:736] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.LoggingTensorHook object at 0x7fd054d709b0>] [2022-11-01 17:05:54,518][INFO] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.LoggingTensorHook object at 0x7fd054d709b0>] [2022-11-01 17:05:54,519] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:736] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.LoggingTensorHook object at 0x7fd054d4def0>] [2022-11-01 17:05:54,519][INFO] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.LoggingTensorHook object at 0x7fd054d4def0>] [2022-11-01 17:05:54,519] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:736] [_MonitoredSession __init__][hook begin:<easy_rec.python.utils.estimator_utils.CheckpointSaverHook object at 0x7fd054d70518>] [2022-11-01 17:05:54,519][INFO] [_MonitoredSession __init__][hook begin:<easy_rec.python.utils.estimator_utils.CheckpointSaverHook object at 0x7fd054d70518>] [2022-11-01 17:05:54,520] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:742] [_MonitoredSession __init__][call hook is finish] [2022-11-01 17:05:54,520][INFO] [_MonitoredSession __init__][call hook is finish] [2022-11-01 17:05:54,520] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:914] [_CoordinatedSessionCreator create_session][create_session start] [2022-11-01 17:05:54,520][INFO] [_CoordinatedSessionCreator create_session][create_session start] [2022-11-01 
17:05:54,520] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:629] [ChiefSessionCreator create_session] [2022-11-01 17:05:54,520][INFO] [ChiefSessionCreator create_session] [2022-11-01 17:05:54,520] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:631] [ChiefSessionCreator create_session][scaffold finalize start] [2022-11-01 17:05:54,520][INFO] [ChiefSessionCreator create_session][scaffold finalize start] [2022-11-01 17:05:54,870] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:239] Graph was finalized. [2022-11-01 17:05:54,870][INFO] Graph was finalized. [2022-11-01 17:05:54,870] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:633] [ChiefSessionCreator create_session][scaffold finalize end] [2022-11-01 17:05:54,870][INFO] [ChiefSessionCreator create_session][scaffold finalize end] [2022-11-01 17:05:54,870] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:641] [ChiefSessionCreator create_session][session_manager prepare_session start] [2022-11-01 17:05:54,870][INFO] [ChiefSessionCreator create_session][session_manager prepare_session start] [2022-11-01 17:05:54,871] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:307] [SessionManager prepare_session][_restore_checkpoint begin] [2022-11-01 17:05:54,871][INFO] [SessionManager prepare_session][_restore_checkpoint begin] [2022-11-01 17:05:54.871283] [INFO] [4585#4585] [tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX512F [2022-11-01 17:05:55.052830] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Found device 0 with properties: name: Tesla V100-SXM2-32GB major: 7 minor: 0 memoryClockRate(GHz): 1.53 pciBusID: 0000:4f:00.0 totalMemory: 31.75GiB freeMemory: 30.47GiB [2022-11-01 17:05:55.052869] [INFO] [4585#4585] 
[tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0 [2022-11-01 17:05:55.361994] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix: [2022-11-01 17:05:55.362028] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0 [2022-11-01 17:05:55.362036] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N [2022-11-01 17:05:55.362172] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0) [2022-11-01 17:05:55.365548] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0 [2022-11-01 17:05:55.365577] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix: [2022-11-01 17:05:55.365583] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0 [2022-11-01 17:05:55.365589] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N [2022-11-01 17:05:55.365662] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0) [2022-11-01 17:05:55.457231] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0 [2022-11-01 17:05:55.457271] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix: [2022-11-01 17:05:55.457278] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0 [2022-11-01 
17:05:55.457284] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N [2022-11-01 17:05:55.457351] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0) [2022-11-01 17:05:55,544] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:317] [SessionManager prepare_session][_restore_checkpoint end] [2022-11-01 17:05:55,544][INFO] [SessionManager prepare_session][_restore_checkpoint end] [2022-11-01 17:05:55,544] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:324] [SessionManager prepare_session][init_op begin: name: "group_deps" op: "NoOp" input: "^init" input: "^init_1" ] [2022-11-01 17:05:55,544][INFO] [SessionManager prepare_session][init_op begin: name: "group_deps" op: "NoOp" input: "^init" input: "^init_1" ] [2022-11-01 17:05:56,182] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:326] [SessionManager prepare_session][init_op end] [2022-11-01 17:05:56,182][INFO] [SessionManager prepare_session][init_op end] [2022-11-01 17:05:56,183] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:335] [SessionManager prepare_session][_try_run_local_init_op begin] [2022-11-01 17:05:56,183][INFO] [SessionManager prepare_session][_try_run_local_init_op begin] [2022-11-01 17:05:56,388] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:544] Running local_init_op: name: "group_deps_1" op: "NoOp" input: "^init_2" input: "^init_all_tables" input: "^init_3" [2022-11-01 17:05:56,388][INFO] Running local_init_op: name: "group_deps_1" op: "NoOp" input: "^init_2" input: "^init_all_tables" input: "^init_3" [2022-11-01 17:05:56,456] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:546] Done running local_init_op. 
[2022-11-01 17:05:56,456][INFO] Done running local_init_op. [2022-11-01 17:05:56,457] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:337] [SessionManager prepare_session][_try_run_local_init_op end] [2022-11-01 17:05:56,457][INFO] [SessionManager prepare_session][_try_run_local_init_op end] [2022-11-01 17:05:56,457] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:345] [SessionManager prepare_session][wait for model ready][Tensor("concat_3:0", shape=(?,), dtype=string)] [2022-11-01 17:05:56,457][INFO] [SessionManager prepare_session][wait for model ready][Tensor("concat_3:0", shape=(?,), dtype=string)] [2022-11-01 17:05:56,660] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:347] [SessionManager prepare_session][model is ready] [2022-11-01 17:05:56,660][INFO] [SessionManager prepare_session][model is ready] [2022-11-01 17:05:56,661] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:652] [ChiefSessionCreator create_session][session_manager prepare_session end] [2022-11-01 17:05:56,661][INFO] [ChiefSessionCreator create_session][session_manager prepare_session end] [2022-11-01 17:05:56,661] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:916] [_CoordinatedSessionCreator create_session][create_session end] [2022-11-01 17:05:56,661][INFO] [_CoordinatedSessionCreator create_session][create_session end] [2022-11-01 17:05:56,661] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:924] [_CoordinatedSessionCreator create_session][call hook after_create_session] [2022-11-01 17:05:56,661][INFO] [_CoordinatedSessionCreator create_session][call hook after_create_session] [2022-11-01 17:05:56,661] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:926] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.StepCounterHook object at 0x7fd054bb7ef0>] 
[2022-11-01 17:05:56,661][INFO] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.StepCounterHook object at 0x7fd054bb7ef0>] [2022-11-01 17:05:56,661] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:926] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.SummarySaverHook object at 0x7fd047019b00>] [2022-11-01 17:05:56,661][INFO] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.SummarySaverHook object at 0x7fd047019b00>] [2022-11-01 17:05:56,661] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:926] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.estimator.util._DatasetInitializerHook object at 0x7fd05d1f2390>] [2022-11-01 17:05:56,661][INFO] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.estimator.util._DatasetInitializerHook object at 0x7fd05d1f2390>] [2022-11-01 17:05:56,757] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:926] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.StopAtStepHook object at 0x7fd05d6c8ac8>] [2022-11-01 17:05:56,757][INFO] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.StopAtStepHook object at 0x7fd05d6c8ac8>] [2022-11-01 17:05:56,757] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:926] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.NanTensorHook object at 0x7fd054d70908>] [2022-11-01 17:05:56,757][INFO] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.NanTensorHook object at 0x7fd054d70908>] 
[2022-11-01 17:05:56,757] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:926] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.LoggingTensorHook object at 0x7fd054d709b0>] [2022-11-01 17:05:56,757][INFO] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.LoggingTensorHook object at 0x7fd054d709b0>] [2022-11-01 17:05:56,757] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:926] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.LoggingTensorHook object at 0x7fd054d4def0>] [2022-11-01 17:05:56,757][INFO] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.LoggingTensorHook object at 0x7fd054d4def0>] [2022-11-01 17:05:56,757] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:926] [_CoordinatedSessionCreator create_session][after_create_session:<easy_rec.python.utils.estimator_utils.CheckpointSaverHook object at 0x7fd054d70518>] [2022-11-01 17:05:56,757][INFO] [_CoordinatedSessionCreator create_session][after_create_session:<easy_rec.python.utils.estimator_utils.CheckpointSaverHook object at 0x7fd054d70518>] [2022-11-01 17:06:00,135][INFO] Saving checkpoints for 0 into experiments/dwd_avazu_ctr/model.ckpt. [2022-11-01 17:06:01,171] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:928] [_CoordinatedSessionCreator create_session][call hook after_create_session is finish] [2022-11-01 17:06:01,171][INFO] [_CoordinatedSessionCreator create_session][call hook after_create_session is finish] [2022-11-01 17:06:02.611812] [ERROR] [4585#4585] [tensorflow/core/grappler/optimizers/dependency_optimizer.cc:666] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order. 
[2022-11-01 17:06:02.664183] [ERROR] [4585#4585] [tensorflow/core/grappler/optimizers/dependency_optimizer.cc:666] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order. [2022-11-01 17:06:03.131577] [ERROR] [4585#4585] [tensorflow/core/grappler/optimizers/dependency_optimizer.cc:666] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order. [2022-11-01 17:06:03.172861] [ERROR] [4585#4585] [tensorflow/core/grappler/optimizers/dependency_optimizer.cc:666] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order. [2022-11-01 17:06:04,223] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:309] loss = 0.84836036, step = 0 [2022-11-01 17:06:04,223][INFO] loss = 0.84836036, step = 0 [2022-11-01 17:06:04,224] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:301] lr = 1e-04,step = 0,cross_entropy_loss = 0.8297269,regularization_loss = 0.018633498,total_loss = 0.84836036 [2022-11-01 17:06:04,224][INFO] lr = 1e-04,step = 0,cross_entropy_loss = 0.8297269,regularization_loss = 0.018633498,total_loss = 0.84836036 [2022-11-01 17:06:05.665171] [ERROR] [4585#4585] [tensorflow/core/grappler/optimizers/dependency_optimizer.cc:666] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order. [2022-11-01 17:06:05.716728] [ERROR] [4585#4585] [tensorflow/core/grappler/optimizers/dependency_optimizer.cc:666] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order. [2022-11-01 17:06:06.179800] [ERROR] [4585#4585] [tensorflow/core/grappler/optimizers/dependency_optimizer.cc:666] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order. 
[2022-11-01 17:06:06.219488] [ERROR] [4585#4585] [tensorflow/core/grappler/optimizers/dependency_optimizer.cc:666] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order. [2022-11-01 17:06:07,362] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:813] global_step/sec: 3.18496 [2022-11-01 17:06:07,362][INFO] global_step/sec: 3.18496 [2022-11-01 17:06:07,363] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:307] loss = 0.6983467, step = 10 (3.140 sec) [2022-11-01 17:06:07,363][INFO] loss = 0.6983467, step = 10 (3.140 sec) [2022-11-01 17:06:07,363] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:301] lr = 1e-04,step = 10,cross_entropy_loss = 0.6797235,regularization_loss = 0.018623155,total_loss = 0.6983467 [2022-11-01 17:06:07,363][INFO] lr = 1e-04,step = 10,cross_entropy_loss = 0.6797235,regularization_loss = 0.018623155,total_loss = 0.6983467 [2022-11-01 17:06:07,839] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:813] global_step/sec: 20.9621 [2022-11-01 17:06:07,839][INFO] global_step/sec: 20.9621 [2022-11-01 17:06:07,840] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:307] loss = 0.64717877, step = 20 (0.477 sec) [2022-11-01 17:06:07,840][INFO] loss = 0.64717877, step = 20 (0.477 sec) [2022-11-01 17:06:07,840] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:301] lr = 1e-04,step = 20,cross_entropy_loss = 0.6285608,regularization_loss = 0.01861798,total_loss = 0.64717877 [2022-11-01 17:06:07,840][INFO] lr = 1e-04,step = 20,cross_entropy_loss = 0.6285608,regularization_loss = 0.01861798,total_loss = 0.64717877 [2022-11-01 17:06:08,322] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:813] global_step/sec: 20.7022 [2022-11-01 17:06:08,322][INFO] global_step/sec: 20.7022 [2022-11-01 17:06:08,323] 
[INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:307] loss = 0.595971, step = 30 (0.483 sec) [2022-11-01 17:06:08,323][INFO] loss = 0.595971, step = 30 (0.483 sec) [2022-11-01 17:06:08,323] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:301] lr = 1e-04,step = 30,cross_entropy_loss = 0.5773572,regularization_loss = 0.01861383,total_loss = 0.595971 [2022-11-01 17:06:08,323][INFO] lr = 1e-04,step = 30,cross_entropy_loss = 0.5773572,regularization_loss = 0.01861383,total_loss = 0.595971 [2022-11-01 17:06:08,798] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:813] global_step/sec: 21.0294 [2022-11-01 17:06:08,798][INFO] global_step/sec: 21.0294 [2022-11-01 17:06:08,799] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:307] loss = 0.56887954, step = 40 (0.476 sec) [2022-11-01 17:06:08,799][INFO] loss = 0.56887954, step = 40 (0.476 sec) [2022-11-01 17:06:08,799] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:301] lr = 1e-04,step = 40,cross_entropy_loss = 0.55027056,regularization_loss = 0.018609012,total_loss = 0.56887954 [2022-11-01 17:06:08,799][INFO] lr = 1e-04,step = 40,cross_entropy_loss = 0.55027056,regularization_loss = 0.018609012,total_loss = 0.56887954 [2022-11-01 17:06:09,222][INFO] Saving checkpoints for 50 into experiments/dwd_avazu_ctr/model.ckpt. [2022-11-01 17:06:09,965][INFO] eval files[1]: data/dwd_avazu_ctr_deepmodel_test.csv [2022-11-01 17:06:10,028] [INFO] [4585#MainThread] [tensorflow/python/estimator/estimator.py:1226] Calling model_fn. [2022-11-01 17:06:10,028][INFO] Calling model_fn. 
[2022-11-01 17:06:10,029][INFO] shared embeddings[num=0] [2022-11-01 17:06:12,411][INFO] dnn activation function = tf.nn.relu [2022-11-01 17:06:12,516][INFO] dnn activation function = tf.nn.relu [2022-11-01 17:06:12,754] [INFO] [4585#MainThread] [easy_rec/python/model/easy_rec_estimator.py:374] metric_dict keys: dict_keys(['auc', 'loss/loss/cross_entropy_loss', 'loss/loss/total_loss']) [2022-11-01 17:06:12,754][INFO] metric_dict keys: dict_keys(['auc', 'loss/loss/cross_entropy_loss', 'loss/loss/total_loss']) [2022-11-01 17:06:12,755] [INFO] [4585#MainThread] [easy_rec/python/model/easy_rec_estimator.py:377] eval graph construct finished. Time 2.727s [2022-11-01 17:06:12,755][INFO] eval graph construct finished. Time 2.727s [2022-11-01 17:06:12,755] [INFO] [4585#MainThread] [tensorflow/python/estimator/estimator.py:1228] Done calling model_fn. [2022-11-01 17:06:12,755][INFO] Done calling model_fn. [2022-11-01 17:06:12,775] [INFO] [4585#MainThread] [tensorflow/python/training/evaluation.py:257] Starting evaluation at 2022-11-01-09:06:12 [2022-11-01 17:06:12,775][INFO] Starting evaluation at 2022-11-01-09:06:12 [2022-11-01 17:06:12,775] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:734] [_MonitoredSession __init__][call hook begin] [2022-11-01 17:06:12,775][INFO] [_MonitoredSession __init__][call hook begin] [2022-11-01 17:06:12,775] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:736] [_MonitoredSession __init__][hook begin:<tensorflow.python.estimator.util._DatasetInitializerHook object at 0x7fd04695b6a0>] [2022-11-01 17:06:12,775][INFO] [_MonitoredSession __init__][hook begin:<tensorflow.python.estimator.util._DatasetInitializerHook object at 0x7fd04695b6a0>] [2022-11-01 17:06:12,775] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:736] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.FinalOpsHook object at 0x7fd046b2ff28>] [2022-11-01 
17:06:12,775][INFO] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.FinalOpsHook object at 0x7fd046b2ff28>] [2022-11-01 17:06:12,776] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:742] [_MonitoredSession __init__][call hook is finish] [2022-11-01 17:06:12,776][INFO] [_MonitoredSession __init__][call hook is finish] [2022-11-01 17:06:12,776] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:914] [_CoordinatedSessionCreator create_session][create_session start] [2022-11-01 17:06:12,776][INFO] [_CoordinatedSessionCreator create_session][create_session start] [2022-11-01 17:06:12,776] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:629] [ChiefSessionCreator create_session] [2022-11-01 17:06:12,776][INFO] [ChiefSessionCreator create_session] [2022-11-01 17:06:12,776] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:631] [ChiefSessionCreator create_session][scaffold finalize start] [2022-11-01 17:06:12,776][INFO] [ChiefSessionCreator create_session][scaffold finalize start] [2022-11-01 17:06:13,005] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:239] Graph was finalized. [2022-11-01 17:06:13,005][INFO] Graph was finalized. 
[2022-11-01 17:06:13,005] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:633] [ChiefSessionCreator create_session][scaffold finalize end] [2022-11-01 17:06:13,005][INFO] [ChiefSessionCreator create_session][scaffold finalize end] [2022-11-01 17:06:13,005] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:641] [ChiefSessionCreator create_session][session_manager prepare_session start] [2022-11-01 17:06:13,005][INFO] [ChiefSessionCreator create_session][session_manager prepare_session start] [2022-11-01 17:06:13,005] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:307] [SessionManager prepare_session][_restore_checkpoint begin] [2022-11-01 17:06:13,005][INFO] [SessionManager prepare_session][_restore_checkpoint begin] [2022-11-01 17:06:13.006267] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0 [2022-11-01 17:06:13.006304] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix: [2022-11-01 17:06:13.006310] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0 [2022-11-01 17:06:13.006316] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N [2022-11-01 17:06:13.006394] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0) [2022-11-01 17:06:13.006816] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0 [2022-11-01 17:06:13.006839] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix: [2022-11-01 17:06:13.006846] [INFO] [4585#4585] 
[tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0 [2022-11-01 17:06:13.006854] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N [2022-11-01 17:06:13.006904] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0) [2022-11-01 17:06:13.039314] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0 [2022-11-01 17:06:13.039342] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix: [2022-11-01 17:06:13.039349] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0 [2022-11-01 17:06:13.039356] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N [2022-11-01 17:06:13.039408] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0) [2022-11-01 17:06:13,071] [INFO] [4585#MainThread] [tensorflow/python/training/saver.py:1995] Restoring parameters from experiments/dwd_avazu_ctr/model.ckpt-50 [2022-11-01 17:06:13,071][INFO] Restoring parameters from experiments/dwd_avazu_ctr/model.ckpt-50 [2022-11-01 17:06:13,292] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:317] [SessionManager prepare_session][_restore_checkpoint end] [2022-11-01 17:06:13,292][INFO] [SessionManager prepare_session][_restore_checkpoint end] [2022-11-01 17:06:13,292] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:335] [SessionManager prepare_session][_try_run_local_init_op begin] [2022-11-01 17:06:13,292][INFO] [SessionManager prepare_session][_try_run_local_init_op 
begin] [2022-11-01 17:06:13,368] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:544] Running local_init_op: name: "group_deps_2" op: "NoOp" input: "^init_2" input: "^init_all_tables" input: "^init_3" [2022-11-01 17:06:13,368][INFO] Running local_init_op: name: "group_deps_2" op: "NoOp" input: "^init_2" input: "^init_all_tables" input: "^init_3" [2022-11-01 17:06:13,400] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:546] Done running local_init_op. [2022-11-01 17:06:13,400][INFO] Done running local_init_op. [2022-11-01 17:06:13,401] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:337] [SessionManager prepare_session][_try_run_local_init_op end] [2022-11-01 17:06:13,401][INFO] [SessionManager prepare_session][_try_run_local_init_op end] [2022-11-01 17:06:13,401] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:345] [SessionManager prepare_session][wait for model ready][Tensor("concat_3:0", shape=(?,), dtype=string)] [2022-11-01 17:06:13,401][INFO] [SessionManager prepare_session][wait for model ready][Tensor("concat_3:0", shape=(?,), dtype=string)] [2022-11-01 17:06:13,479] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:347] [SessionManager prepare_session][model is ready] [2022-11-01 17:06:13,479][INFO] [SessionManager prepare_session][model is ready] [2022-11-01 17:06:13,479] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:652] [ChiefSessionCreator create_session][session_manager prepare_session end] [2022-11-01 17:06:13,479][INFO] [ChiefSessionCreator create_session][session_manager prepare_session end] [2022-11-01 17:06:13,480] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:916] [_CoordinatedSessionCreator create_session][create_session end] [2022-11-01 17:06:13,480][INFO] [_CoordinatedSessionCreator create_session][create_session end] [2022-11-01 17:06:13,480] [INFO] [4585#MainThread] 
[tensorflow/python/training/monitored_session.py:924] [_CoordinatedSessionCreator create_session][call hook after_create_session] [2022-11-01 17:06:13,480][INFO] [_CoordinatedSessionCreator create_session][call hook after_create_session] [2022-11-01 17:06:13,480] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:926] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.estimator.util._DatasetInitializerHook object at 0x7fd04695b6a0>] [2022-11-01 17:06:13,480][INFO] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.estimator.util._DatasetInitializerHook object at 0x7fd04695b6a0>] [2022-11-01 17:06:13,524] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:926] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.FinalOpsHook object at 0x7fd046b2ff28>] [2022-11-01 17:06:13,524][INFO] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.FinalOpsHook object at 0x7fd046b2ff28>] [2022-11-01 17:06:13,524] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:928] [_CoordinatedSessionCreator create_session][call hook after_create_session is finish] [2022-11-01 17:06:13,524][INFO] [_CoordinatedSessionCreator create_session][call hook after_create_session is finish] [2022-11-01 17:06:14,612] [INFO] [4585#MainThread] [tensorflow/python/training/evaluation.py:277] Finished evaluation at 2022-11-01-09:06:14 [2022-11-01 17:06:14,612][INFO] Finished evaluation at 2022-11-01-09:06:14 [2022-11-01 17:06:14,613] [INFO] [4585#MainThread] [tensorflow/python/estimator/estimator.py:2003] Saving dict for global step 50: auc = 0.54794496, global_step = 50, loss = 0.6682401, loss/loss/cross_entropy_loss = 0.6682401, loss/loss/total_loss = 0.6682401 [2022-11-01 17:06:14,613][INFO] Saving dict for global step 50: auc = 0.54794496, 
global_step = 50, loss = 0.6682401, loss/loss/cross_entropy_loss = 0.6682401, loss/loss/total_loss = 0.6682401 [2022-11-01 17:06:15,063] [INFO] [4585#MainThread] [tensorflow/python/estimator/estimator.py:2063] Saving 'checkpoint_path' summary for global step 50: experiments/dwd_avazu_ctr/model.ckpt-50 [2022-11-01 17:06:15,063][INFO] Saving 'checkpoint_path' summary for global step 50: experiments/dwd_avazu_ctr/model.ckpt-50 [2022-11-01 17:06:15,119] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:813] global_step/sec: 1.58188 [2022-11-01 17:06:15,119][INFO] global_step/sec: 1.58188 [2022-11-01 17:06:15,121] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:307] loss = 0.54047185, step = 50 (6.322 sec) [2022-11-01 17:06:15,121][INFO] loss = 0.54047185, step = 50 (6.322 sec) [2022-11-01 17:06:15,121] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:301] lr = 1e-04,step = 50,cross_entropy_loss = 0.52186686,regularization_loss = 0.018604979,total_loss = 0.54047185 [2022-11-01 17:06:15,121][INFO] lr = 1e-04,step = 50,cross_entropy_loss = 0.52186686,regularization_loss = 0.018604979,total_loss = 0.54047185 [2022-11-01 17:06:15,595] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:813] global_step/sec: 21.0217 [2022-11-01 17:06:15,595][INFO] global_step/sec: 21.0217 [2022-11-01 17:06:15,596] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:307] loss = 0.5152127, step = 60 (0.476 sec) [2022-11-01 17:06:15,596][INFO] loss = 0.5152127, step = 60 (0.476 sec) [2022-11-01 17:06:15,596] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:301] lr = 1e-04,step = 60,cross_entropy_loss = 0.49661177,regularization_loss = 0.018600943,total_loss = 0.5152127 [2022-11-01 17:06:15,596][INFO] lr = 1e-04,step = 60,cross_entropy_loss = 0.49661177,regularization_loss = 0.018600943,total_loss = 0.5152127 
[2022-11-01 17:06:16,068] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:813] global_step/sec: 21.1563 [2022-11-01 17:06:16,068][INFO] global_step/sec: 21.1563 [2022-11-01 17:06:16,069] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:307] loss = 0.4868728, step = 70 (0.473 sec) [2022-11-01 17:06:16,069][INFO] loss = 0.4868728, step = 70 (0.473 sec) [2022-11-01 17:06:16,069] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:301] lr = 1e-04,step = 70,cross_entropy_loss = 0.46827665,regularization_loss = 0.018596135,total_loss = 0.4868728 [2022-11-01 17:06:16,069][INFO] lr = 1e-04,step = 70,cross_entropy_loss = 0.46827665,regularization_loss = 0.018596135,total_loss = 0.4868728 [2022-11-01 17:06:16,542] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:813] global_step/sec: 21.0787 [2022-11-01 17:06:16,542][INFO] global_step/sec: 21.0787 [2022-11-01 17:06:16,543] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:307] loss = 0.45134756, step = 80 (0.474 sec) [2022-11-01 17:06:16,543][INFO] loss = 0.45134756, step = 80 (0.474 sec) [2022-11-01 17:06:16,543] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:301] lr = 1e-04,step = 80,cross_entropy_loss = 0.4327584,regularization_loss = 0.01858916,total_loss = 0.45134756 [2022-11-01 17:06:16,543][INFO] lr = 1e-04,step = 80,cross_entropy_loss = 0.4327584,regularization_loss = 0.01858916,total_loss = 0.45134756 [2022-11-01 17:06:17,018] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:813] global_step/sec: 21.0082 [2022-11-01 17:06:17,018][INFO] global_step/sec: 21.0082 [2022-11-01 17:06:17,019] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:307] loss = 0.42948166, step = 90 (0.476 sec) [2022-11-01 17:06:17,019][INFO] loss = 0.42948166, step = 90 (0.476 sec) [2022-11-01 
17:06:17,019] [INFO] [4585#MainThread] [tensorflow/python/training/basic_session_run_hooks.py:301] lr = 1e-04,step = 90,cross_entropy_loss = 0.4109,regularization_loss = 0.018581651,total_loss = 0.42948166 [2022-11-01 17:06:17,019][INFO] lr = 1e-04,step = 90,cross_entropy_loss = 0.4109,regularization_loss = 0.018581651,total_loss = 0.42948166 [2022-11-01 17:06:17,448][INFO] Saving checkpoints for 100 into experiments/dwd_avazu_ctr/model.ckpt. [2022-11-01 17:06:18,142] [INFO] [4585#MainThread] [tensorflow/python/estimator/training.py:528] Skip the current checkpoint eval due to throttle secs (10 secs). [2022-11-01 17:06:18,142][INFO] Skip the current checkpoint eval due to throttle secs (10 secs). [2022-11-01 17:06:18,158][INFO] eval files[1]: data/dwd_avazu_ctr_deepmodel_test.csv [2022-11-01 17:06:18,222] [INFO] [4585#MainThread] [tensorflow/python/estimator/estimator.py:1226] Calling model_fn. [2022-11-01 17:06:18,222][INFO] Calling model_fn. [2022-11-01 17:06:18,222][INFO] shared embeddings[num=0] [2022-11-01 17:06:20,651][INFO] dnn activation function = tf.nn.relu [2022-11-01 17:06:20,752][INFO] dnn activation function = tf.nn.relu [2022-11-01 17:06:20,991] [INFO] [4585#MainThread] [easy_rec/python/model/easy_rec_estimator.py:374] metric_dict keys: dict_keys(['auc', 'loss/loss/cross_entropy_loss', 'loss/loss/total_loss']) [2022-11-01 17:06:20,991][INFO] metric_dict keys: dict_keys(['auc', 'loss/loss/cross_entropy_loss', 'loss/loss/total_loss']) [2022-11-01 17:06:20,991] [INFO] [4585#MainThread] [easy_rec/python/model/easy_rec_estimator.py:377] eval graph construct finished. Time 2.770s [2022-11-01 17:06:20,991][INFO] eval graph construct finished. Time 2.770s [2022-11-01 17:06:20,992] [INFO] [4585#MainThread] [tensorflow/python/estimator/estimator.py:1228] Done calling model_fn. [2022-11-01 17:06:20,992][INFO] Done calling model_fn. 
[2022-11-01 17:06:21,011] [INFO] [4585#MainThread] [tensorflow/python/training/evaluation.py:257] Starting evaluation at 2022-11-01-09:06:21 [2022-11-01 17:06:21,011][INFO] Starting evaluation at 2022-11-01-09:06:21 [2022-11-01 17:06:21,011] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:734] [_MonitoredSession __init__][call hook begin] [2022-11-01 17:06:21,011][INFO] [_MonitoredSession __init__][call hook begin] [2022-11-01 17:06:21,011] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:736] [_MonitoredSession __init__][hook begin:<tensorflow.python.estimator.util._DatasetInitializerHook object at 0x7fcfd04f7978>] [2022-11-01 17:06:21,011][INFO] [_MonitoredSession __init__][hook begin:<tensorflow.python.estimator.util._DatasetInitializerHook object at 0x7fcfd04f7978>] [2022-11-01 17:06:21,011] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:736] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.FinalOpsHook object at 0x7fcfd0415588>] [2022-11-01 17:06:21,011][INFO] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.FinalOpsHook object at 0x7fcfd0415588>] [2022-11-01 17:06:21,011] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:742] [_MonitoredSession __init__][call hook is finish] [2022-11-01 17:06:21,011][INFO] [_MonitoredSession __init__][call hook is finish] [2022-11-01 17:06:21,011] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:914] [_CoordinatedSessionCreator create_session][create_session start] [2022-11-01 17:06:21,011][INFO] [_CoordinatedSessionCreator create_session][create_session start] [2022-11-01 17:06:21,011] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:629] [ChiefSessionCreator create_session] [2022-11-01 17:06:21,011][INFO] [ChiefSessionCreator create_session] [2022-11-01 17:06:21,012] [INFO] 
[4585#MainThread] [tensorflow/python/training/monitored_session.py:631] [ChiefSessionCreator create_session][scaffold finalize start] [2022-11-01 17:06:21,012][INFO] [ChiefSessionCreator create_session][scaffold finalize start] [2022-11-01 17:06:21,240] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:239] Graph was finalized. [2022-11-01 17:06:21,240][INFO] Graph was finalized. [2022-11-01 17:06:21,240] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:633] [ChiefSessionCreator create_session][scaffold finalize end] [2022-11-01 17:06:21,240][INFO] [ChiefSessionCreator create_session][scaffold finalize end] [2022-11-01 17:06:21,240] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:641] [ChiefSessionCreator create_session][session_manager prepare_session start] [2022-11-01 17:06:21,240][INFO] [ChiefSessionCreator create_session][session_manager prepare_session start] [2022-11-01 17:06:21,241] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:307] [SessionManager prepare_session][_restore_checkpoint begin] [2022-11-01 17:06:21,241][INFO] [SessionManager prepare_session][_restore_checkpoint begin] [2022-11-01 17:06:21.241360] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0 [2022-11-01 17:06:21.241398] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix: [2022-11-01 17:06:21.241404] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0 [2022-11-01 17:06:21.241412] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N [2022-11-01 17:06:21.241491] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 
0000:4f:00.0, compute capability: 7.0) [2022-11-01 17:06:21.241890] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0 [2022-11-01 17:06:21.241915] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix: [2022-11-01 17:06:21.241921] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0 [2022-11-01 17:06:21.241928] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N [2022-11-01 17:06:21.241983] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0) [2022-11-01 17:06:21.274360] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0 [2022-11-01 17:06:21.274388] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix: [2022-11-01 17:06:21.274395] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0 [2022-11-01 17:06:21.274403] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N [2022-11-01 17:06:21.274460] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0) [2022-11-01 17:06:21,307] [INFO] [4585#MainThread] [tensorflow/python/training/saver.py:1995] Restoring parameters from experiments/dwd_avazu_ctr/model.ckpt-100 [2022-11-01 17:06:21,307][INFO] Restoring parameters from experiments/dwd_avazu_ctr/model.ckpt-100 [2022-11-01 17:06:21,535] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:317] [SessionManager 
prepare_session][_restore_checkpoint end] [2022-11-01 17:06:21,535][INFO] [SessionManager prepare_session][_restore_checkpoint end] [2022-11-01 17:06:21,535] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:335] [SessionManager prepare_session][_try_run_local_init_op begin] [2022-11-01 17:06:21,535][INFO] [SessionManager prepare_session][_try_run_local_init_op begin] [2022-11-01 17:06:21,653] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:544] Running local_init_op: name: "group_deps_2" op: "NoOp" input: "^init_2" input: "^init_all_tables" input: "^init_3" [2022-11-01 17:06:21,653][INFO] Running local_init_op: name: "group_deps_2" op: "NoOp" input: "^init_2" input: "^init_all_tables" input: "^init_3" [2022-11-01 17:06:21,688] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:546] Done running local_init_op. [2022-11-01 17:06:21,688][INFO] Done running local_init_op. [2022-11-01 17:06:21,688] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:337] [SessionManager prepare_session][_try_run_local_init_op end] [2022-11-01 17:06:21,688][INFO] [SessionManager prepare_session][_try_run_local_init_op end] [2022-11-01 17:06:21,688] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:345] [SessionManager prepare_session][wait for model ready][Tensor("concat_3:0", shape=(?,), dtype=string)] [2022-11-01 17:06:21,688][INFO] [SessionManager prepare_session][wait for model ready][Tensor("concat_3:0", shape=(?,), dtype=string)] [2022-11-01 17:06:21,768] [INFO] [4585#MainThread] [tensorflow/python/training/session_manager.py:347] [SessionManager prepare_session][model is ready] [2022-11-01 17:06:21,768][INFO] [SessionManager prepare_session][model is ready] [2022-11-01 17:06:21,768] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:652] [ChiefSessionCreator create_session][session_manager prepare_session end] [2022-11-01 17:06:21,768][INFO] 
[ChiefSessionCreator create_session][session_manager prepare_session end] [2022-11-01 17:06:21,768] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:916] [_CoordinatedSessionCreator create_session][create_session end] [2022-11-01 17:06:21,768][INFO] [_CoordinatedSessionCreator create_session][create_session end] [2022-11-01 17:06:21,768] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:924] [_CoordinatedSessionCreator create_session][call hook after_create_session] [2022-11-01 17:06:21,768][INFO] [_CoordinatedSessionCreator create_session][call hook after_create_session] [2022-11-01 17:06:21,768] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:926] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.estimator.util._DatasetInitializerHook object at 0x7fcfd04f7978>] [2022-11-01 17:06:21,768][INFO] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.estimator.util._DatasetInitializerHook object at 0x7fcfd04f7978>] [2022-11-01 17:06:21,812] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:926] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.FinalOpsHook object at 0x7fcfd0415588>] [2022-11-01 17:06:21,812][INFO] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.FinalOpsHook object at 0x7fcfd0415588>] [2022-11-01 17:06:21,813] [INFO] [4585#MainThread] [tensorflow/python/training/monitored_session.py:928] [_CoordinatedSessionCreator create_session][call hook after_create_session is finish] [2022-11-01 17:06:21,813][INFO] [_CoordinatedSessionCreator create_session][call hook after_create_session is finish] [2022-11-01 17:06:22,908] [INFO] [4585#MainThread] [tensorflow/python/training/evaluation.py:277] Finished evaluation at 2022-11-01-09:06:22 [2022-11-01 17:06:22,908][INFO] 
Finished evaluation at 2022-11-01-09:06:22 [2022-11-01 17:06:22,909] [INFO] [4585#MainThread] [tensorflow/python/estimator/estimator.py:2003] Saving dict for global step 100: auc = 0.5781586, global_step = 100, loss = 0.64041096, loss/loss/cross_entropy_loss = 0.64041096, loss/loss/total_loss = 0.64041096 [2022-11-01 17:06:22,909][INFO] Saving dict for global step 100: auc = 0.5781586, global_step = 100, loss = 0.64041096, loss/loss/cross_entropy_loss = 0.64041096, loss/loss/total_loss = 0.64041096 [2022-11-01 17:06:22,909] [INFO] [4585#MainThread] [tensorflow/python/estimator/estimator.py:2063] Saving 'checkpoint_path' summary for global step 100: experiments/dwd_avazu_ctr/model.ckpt-100 [2022-11-01 17:06:22,909][INFO] Saving 'checkpoint_path' summary for global step 100: experiments/dwd_avazu_ctr/model.ckpt-100 [2022-11-01 17:06:22,910] [INFO] [4585#MainThread] [easy_rec/python/compat/exporter.py:380] Performing the final export in the end of training. [2022-11-01 17:06:22,910][INFO] Performing the final export in the end of training. 
[2022-11-01 17:06:22,926][INFO] input_name: hour, dtype: <dtype: 'string'> [2022-11-01 17:06:22,927][INFO] input_name: c1, dtype: <dtype: 'string'> [2022-11-01 17:06:22,927][INFO] input_name: banner_pos, dtype: <dtype: 'string'> [2022-11-01 17:06:22,928][INFO] input_name: site_id, dtype: <dtype: 'string'> [2022-11-01 17:06:22,928][INFO] input_name: site_domain, dtype: <dtype: 'string'> [2022-11-01 17:06:22,929][INFO] input_name: site_category, dtype: <dtype: 'string'> [2022-11-01 17:06:22,930][INFO] input_name: app_id, dtype: <dtype: 'string'> [2022-11-01 17:06:22,944][INFO] input_name: app_domain, dtype: <dtype: 'string'> [2022-11-01 17:06:22,945][INFO] input_name: app_category, dtype: <dtype: 'string'> [2022-11-01 17:06:22,945][INFO] input_name: device_id, dtype: <dtype: 'string'> [2022-11-01 17:06:22,946][INFO] input_name: device_ip, dtype: <dtype: 'string'> [2022-11-01 17:06:22,947][INFO] input_name: device_model, dtype: <dtype: 'string'> [2022-11-01 17:06:22,947][INFO] input_name: device_type, dtype: <dtype: 'string'> [2022-11-01 17:06:22,948][INFO] input_name: device_conn_type, dtype: <dtype: 'string'> [2022-11-01 17:06:22,948][INFO] input_name: c14, dtype: <dtype: 'string'> [2022-11-01 17:06:22,949][INFO] input_name: c15, dtype: <dtype: 'string'> [2022-11-01 17:06:22,950][INFO] input_name: c16, dtype: <dtype: 'string'> [2022-11-01 17:06:22,950][INFO] input_name: c17, dtype: <dtype: 'string'> [2022-11-01 17:06:22,951][INFO] input_name: c18, dtype: <dtype: 'string'> [2022-11-01 17:06:22,951][INFO] input_name: c19, dtype: <dtype: 'string'> [2022-11-01 17:06:22,952][INFO] input_name: c20, dtype: <dtype: 'string'> [2022-11-01 17:06:22,952][INFO] input_name: c21, dtype: <dtype: 'string'> [2022-11-01 17:06:22,953] [INFO] [4585#MainThread] [tensorflow/python/estimator/estimator.py:1226] Calling model_fn. [2022-11-01 17:06:22,953][INFO] Calling model_fn. 
[2022-11-01 17:06:22,954][INFO] shared embeddings[num=0] [2022-11-01 17:06:25,413][INFO] dnn activation function = tf.nn.relu [2022-11-01 17:06:25,516][INFO] dnn activation function = tf.nn.relu [2022-11-01 17:06:25,598][INFO] building default outputs [2022-11-01 17:06:25,598] [INFO] [4585#MainThread] [easy_rec/python/model/easy_rec_estimator.py:459] output probs shape: [None] type: <dtype: 'float32'> [2022-11-01 17:06:25,598][INFO] output probs shape: [None] type: <dtype: 'float32'> [2022-11-01 17:06:25,599] [INFO] [4585#MainThread] [easy_rec/python/model/easy_rec_estimator.py:459] output logits shape: [None] type: <dtype: 'float32'> [2022-11-01 17:06:25,599][INFO] output logits shape: [None] type: <dtype: 'float32'> [2022-11-01 17:06:25,600] [INFO] [4585#MainThread] [tensorflow/python/estimator/estimator.py:1228] Done calling model_fn. [2022-11-01 17:06:25,600][INFO] Done calling model_fn. [2022-11-01 17:06:25,601] [INFO] [4585#MainThread] [tensorflow/python/estimator/export/export.py:584] Signatures INCLUDED in export for Classify: None [2022-11-01 17:06:25,601][INFO] Signatures INCLUDED in export for Classify: None [2022-11-01 17:06:25,601] [INFO] [4585#MainThread] [tensorflow/python/estimator/export/export.py:584] Signatures INCLUDED in export for Regress: None [2022-11-01 17:06:25,601][INFO] Signatures INCLUDED in export for Regress: None [2022-11-01 17:06:25,601] [INFO] [4585#MainThread] [tensorflow/python/estimator/export/export.py:584] Signatures INCLUDED in export for Predict: ['serving_default'] [2022-11-01 17:06:25,601][INFO] Signatures INCLUDED in export for Predict: ['serving_default'] [2022-11-01 17:06:25,601] [INFO] [4585#MainThread] [tensorflow/python/estimator/export/export.py:584] Signatures INCLUDED in export for Train: None [2022-11-01 17:06:25,601][INFO] Signatures INCLUDED in export for Train: None [2022-11-01 17:06:25,601] [INFO] [4585#MainThread] [tensorflow/python/estimator/export/export.py:584] Signatures INCLUDED in export for Eval: None 
[2022-11-01 17:06:25,601][INFO] Signatures INCLUDED in export for Eval: None [2022-11-01 17:06:25.602077] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0 [2022-11-01 17:06:25.602118] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix: [2022-11-01 17:06:25.602125] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0 [2022-11-01 17:06:25.602131] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N [2022-11-01 17:06:25.602212] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0) [2022-11-01 17:06:25.602597] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0 [2022-11-01 17:06:25.602631] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix: [2022-11-01 17:06:25.602638] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0 [2022-11-01 17:06:25.602646] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N [2022-11-01 17:06:25.602701] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0) [2022-11-01 17:06:25.629937] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0 [2022-11-01 17:06:25.629966] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix: 
[2022-11-01 17:06:25.629972] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0 [2022-11-01 17:06:25.629980] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N [2022-11-01 17:06:25.630035] [INFO] [4585#4585] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0) [2022-11-01 17:06:25,745] [INFO] [4585#MainThread] [tensorflow/python/training/saver.py:1995] Restoring parameters from experiments/dwd_avazu_ctr/model.ckpt-100 [2022-11-01 17:06:25,745][INFO] Restoring parameters from experiments/dwd_avazu_ctr/model.ckpt-100 [2022-11-01 17:06:25,856] [WARNING] [4585#MainThread] [tensorflow/python/util/deprecation.py:487] From /home/pai/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py:1076: calling SavedModelBuilder.add_meta_graph_and_variables (from tensorflow.python.saved_model.builder_impl) with legacy_init_op is deprecated and will be removed in a future version. Instructions for updating: Pass your op to the equivalent parameter main_op instead. [2022-11-01 17:06:25,856][WARNING] From /home/pai/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py:1076: calling SavedModelBuilder.add_meta_graph_and_variables (from tensorflow.python.saved_model.builder_impl) with legacy_init_op is deprecated and will be removed in a future version. Instructions for updating: Pass your op to the equivalent parameter main_op instead. [2022-11-01 17:06:25,857] [INFO] [4585#MainThread] [tensorflow/python/saved_model/builder_impl.py:520] Assets added to graph. [2022-11-01 17:06:25,857][INFO] Assets added to graph. 
[2022-11-01 17:06:25,871] [INFO] [4585#MainThread] [tensorflow/python/saved_model/builder_impl.py:142] Assets written to: experiments/dwd_avazu_ctr/export/final/temp-b'1667293582'/assets [2022-11-01 17:06:25,871][INFO] Assets written to: experiments/dwd_avazu_ctr/export/final/temp-b'1667293582'/assets [2022-11-01 17:06:26,361] [INFO] [4585#MainThread] [tensorflow/python/saved_model/builder_impl.py:474] SavedModel written to: experiments/dwd_avazu_ctr/export/final/temp-b'1667293582'/saved_model.pb [2022-11-01 17:06:26,361][INFO] SavedModel written to: experiments/dwd_avazu_ctr/export/final/temp-b'1667293582'/saved_model.pb [2022-11-01 17:06:26,747] [INFO] [4585#MainThread] [tensorflow/python/estimator/estimator.py:386] Loss for final step: 0.4169205. [2022-11-01 17:06:26,747][INFO] Loss for final step: 0.4169205. [2022-11-01 17:06:26,756][INFO] Train and evaluate finish
1.4 Evaluate the previously trained model
Specify the path of the checkpoint to evaluate and the config file that was used to train the model.
!python3 -m easy_rec.python.eval --pipeline_config_path dwd_avazu_ctr_deepmodel_dsw.config --checkpoint_path experiments/dwd_avazu_ctr/model.ckpt-50
================================================ | PAI Tensorflow powered by Aliyun PAI Team. | ================================================ Please ignore the following import error if you are using tunnel table io. No module named '_common_io' [2022-11-01 17:06:30,382][WARNING] pyhive is not installed. [2022-11-01 17:06:37,116] [WARNING] [5002#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/input/csv_input.py:14: ignore_errors (from tensorflow.contrib.data.python.ops.error_ops) is deprecated and will be removed in a future version. Instructions for updating: Use `tf.data.experimental.ignore_errors()`. 
[2022-11-01 17:06:37,116][WARNING] From /home/pai/lib/python3.6/site-packages/easy_rec/python/input/csv_input.py:14: ignore_errors (from tensorflow.contrib.data.python.ops.error_ops) is deprecated and will be removed in a future version. Instructions for updating: Use `tf.data.experimental.ignore_errors()`. [2022-11-01 17:06:37,117][WARNING] DataHub is not installed. You can install it by: pip install pydatahub easy_rec version: 0.4.7 Usage: easy_rec.help() [2022-11-01 17:06:37,124] [INFO] [5002#MainThread] [tensorflow/python/util/auto_strategy_utils.py:108] Disable Auto Strategy. [2022-11-01 17:06:37,124][INFO] Disable Auto Strategy. [2022-11-01 17:06:37,133] [INFO] [5002#MainThread] [tensorflow/python/estimator/estimator.py:205] Using config: {'_model_dir': 'experiments/dwd_avazu_ctr/', '_tf_random_seed': None, '_save_summary_steps': 10, '_save_checkpoints_steps': 50, '_save_checkpoints_secs': None, '_session_config': device_filters: "/job:ps" gpu_options { } allow_soft_placement: true , '_keep_checkpoint_max': 10, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 10, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fc352f5c898>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1} [2022-11-01 17:06:37,133][INFO] Using config: {'_model_dir': 'experiments/dwd_avazu_ctr/', '_tf_random_seed': None, '_save_summary_steps': 10, '_save_checkpoints_steps': 50, '_save_checkpoints_secs': None, '_session_config': device_filters: "/job:ps" gpu_options { } allow_soft_placement: true , '_keep_checkpoint_max': 10, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 10, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': 
None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fc352f5c898>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1} [2022-11-01 17:06:37,134][INFO] check_mode: False [2022-11-01 17:06:37,134][INFO] check_mode: False [2022-11-01 17:06:37,142][INFO] eval files[1]: data/dwd_avazu_ctr_deepmodel_test.csv [2022-11-01 17:06:37,218] [INFO] [5002#MainThread] [tensorflow/python/estimator/estimator.py:1226] Calling model_fn. [2022-11-01 17:06:37,218][INFO] Calling model_fn. [2022-11-01 17:06:37,219][INFO] shared embeddings[num=0] [2022-11-01 17:06:37,229] [WARNING] [5002#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column.py:206: EmbeddingColumn._get_dense_tensor (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. [2022-11-01 17:06:37,229][WARNING] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column.py:206: EmbeddingColumn._get_dense_tensor (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. [2022-11-01 17:06:37,230] [WARNING] [5002#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column_v2.py:3643: HashedCategoricalColumn._get_sparse_tensors (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. 
Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. [2022-11-01 17:06:37,230][WARNING] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column_v2.py:3643: HashedCategoricalColumn._get_sparse_tensors (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. [2022-11-01 17:06:37,231] [WARNING] [5002#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column.py:2119: HashedCategoricalColumn._transform_feature (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. [2022-11-01 17:06:37,231][WARNING] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column.py:2119: HashedCategoricalColumn._transform_feature (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. [2022-11-01 17:06:37,236] [WARNING] [5002#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column_v2.py:3537: HashedCategoricalColumn._num_buckets (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. 
[2022-11-01 17:06:37,236][WARNING] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column_v2.py:3537: HashedCategoricalColumn._num_buckets (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. [2022-11-01 17:06:37,274] [WARNING] [5002#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column.py:207: EmbeddingColumn._variable_shape (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. [2022-11-01 17:06:37,274][WARNING] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column.py:207: EmbeddingColumn._variable_shape (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead. [2022-11-01 17:06:39,562][INFO] dnn activation function = tf.nn.relu [2022-11-01 17:06:39,663][INFO] dnn activation function = tf.nn.relu [2022-11-01 17:06:39,896] [INFO] [5002#MainThread] [easy_rec/python/model/easy_rec_estimator.py:374] metric_dict keys: dict_keys(['auc', 'loss/loss/cross_entropy_loss', 'loss/loss/total_loss']) [2022-11-01 17:06:39,896][INFO] metric_dict keys: dict_keys(['auc', 'loss/loss/cross_entropy_loss', 'loss/loss/total_loss']) [2022-11-01 17:06:39,896] [INFO] [5002#MainThread] [easy_rec/python/model/easy_rec_estimator.py:377] eval graph construct finished. Time 2.678s [2022-11-01 17:06:39,896][INFO] eval graph construct finished. 
Time 2.678s [2022-11-01 17:06:39,897] [INFO] [5002#MainThread] [tensorflow/python/estimator/estimator.py:1228] Done calling model_fn. [2022-11-01 17:06:39,897][INFO] Done calling model_fn. [2022-11-01 17:06:39,916] [INFO] [5002#MainThread] [tensorflow/python/training/evaluation.py:257] Starting evaluation at 2022-11-01-09:06:39 [2022-11-01 17:06:39,916][INFO] Starting evaluation at 2022-11-01-09:06:39 [2022-11-01 17:06:39,916] [INFO] [5002#MainThread] [tensorflow/python/training/monitored_session.py:734] [_MonitoredSession __init__][call hook begin] [2022-11-01 17:06:39,916][INFO] [_MonitoredSession __init__][call hook begin] [2022-11-01 17:06:39,916] [INFO] [5002#MainThread] [tensorflow/python/training/monitored_session.py:736] [_MonitoredSession __init__][hook begin:<tensorflow.python.estimator.util._DatasetInitializerHook object at 0x7fc352f67978>] [2022-11-01 17:06:39,916][INFO] [_MonitoredSession __init__][hook begin:<tensorflow.python.estimator.util._DatasetInitializerHook object at 0x7fc352f67978>] [2022-11-01 17:06:39,917] [INFO] [5002#MainThread] [tensorflow/python/training/monitored_session.py:736] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.FinalOpsHook object at 0x7fc350a01dd8>] [2022-11-01 17:06:39,917][INFO] [_MonitoredSession __init__][hook begin:<tensorflow.python.training.basic_session_run_hooks.FinalOpsHook object at 0x7fc350a01dd8>] [2022-11-01 17:06:39,918] [INFO] [5002#MainThread] [tensorflow/python/training/monitored_session.py:742] [_MonitoredSession __init__][call hook is finish] [2022-11-01 17:06:39,918][INFO] [_MonitoredSession __init__][call hook is finish] [2022-11-01 17:06:39,918] [INFO] [5002#MainThread] [tensorflow/python/training/monitored_session.py:914] [_CoordinatedSessionCreator create_session][create_session start] [2022-11-01 17:06:39,918][INFO] [_CoordinatedSessionCreator create_session][create_session start] [2022-11-01 17:06:39,918] [INFO] [5002#MainThread] 
[tensorflow/python/training/monitored_session.py:629] [ChiefSessionCreator create_session] [2022-11-01 17:06:39,918][INFO] [ChiefSessionCreator create_session] [2022-11-01 17:06:39,918] [INFO] [5002#MainThread] [tensorflow/python/training/monitored_session.py:631] [ChiefSessionCreator create_session][scaffold finalize start] [2022-11-01 17:06:39,918][INFO] [ChiefSessionCreator create_session][scaffold finalize start] [2022-11-01 17:06:40,141] [INFO] [5002#MainThread] [tensorflow/python/training/monitored_session.py:239] Graph was finalized. [2022-11-01 17:06:40,141][INFO] Graph was finalized. [2022-11-01 17:06:40,141] [INFO] [5002#MainThread] [tensorflow/python/training/monitored_session.py:633] [ChiefSessionCreator create_session][scaffold finalize end] [2022-11-01 17:06:40,141][INFO] [ChiefSessionCreator create_session][scaffold finalize end] [2022-11-01 17:06:40,141] [INFO] [5002#MainThread] [tensorflow/python/training/monitored_session.py:641] [ChiefSessionCreator create_session][session_manager prepare_session start] [2022-11-01 17:06:40,141][INFO] [ChiefSessionCreator create_session][session_manager prepare_session start] [2022-11-01 17:06:40,141] [INFO] [5002#MainThread] [tensorflow/python/training/session_manager.py:307] [SessionManager prepare_session][_restore_checkpoint begin] [2022-11-01 17:06:40,141][INFO] [SessionManager prepare_session][_restore_checkpoint begin] [2022-11-01 17:06:40.141772] [INFO] [5002#5002] [tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX512F [2022-11-01 17:06:40.329141] [INFO] [5002#5002] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Found device 0 with properties: name: Tesla V100-SXM2-32GB major: 7 minor: 0 memoryClockRate(GHz): 1.53 pciBusID: 0000:4f:00.0 totalMemory: 31.75GiB freeMemory: 30.47GiB [2022-11-01 17:06:40.329180] [INFO] [5002#5002] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 
0 [2022-11-01 17:06:40.649111] [INFO] [5002#5002] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix: [2022-11-01 17:06:40.649144] [INFO] [5002#5002] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0 [2022-11-01 17:06:40.649152] [INFO] [5002#5002] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N [2022-11-01 17:06:40.649529] [INFO] [5002#5002] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0) [2022-11-01 17:06:40.653012] [INFO] [5002#5002] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0 [2022-11-01 17:06:40.653040] [INFO] [5002#5002] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix: [2022-11-01 17:06:40.653048] [INFO] [5002#5002] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0 [2022-11-01 17:06:40.653055] [INFO] [5002#5002] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N [2022-11-01 17:06:40.653112] [INFO] [5002#5002] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0) [2022-11-01 17:06:40.690528] [INFO] [5002#5002] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0 [2022-11-01 17:06:40.690557] [INFO] [5002#5002] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix: [2022-11-01 17:06:40.690564] [INFO] [5002#5002] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0 [2022-11-01 17:06:40.690571] [INFO] [5002#5002] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N 
[2022-11-01 17:06:40.690633] [INFO] [5002#5002] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0) [2022-11-01 17:06:40,729] [INFO] [5002#MainThread] [tensorflow/python/training/saver.py:1995] Restoring parameters from experiments/dwd_avazu_ctr/model.ckpt-50 [2022-11-01 17:06:40,729][INFO] Restoring parameters from experiments/dwd_avazu_ctr/model.ckpt-50 [2022-11-01 17:06:40,877] [INFO] [5002#MainThread] [tensorflow/python/training/session_manager.py:317] [SessionManager prepare_session][_restore_checkpoint end] [2022-11-01 17:06:40,877][INFO] [SessionManager prepare_session][_restore_checkpoint end] [2022-11-01 17:06:40,877] [INFO] [5002#MainThread] [tensorflow/python/training/session_manager.py:335] [SessionManager prepare_session][_try_run_local_init_op begin] [2022-11-01 17:06:40,877][INFO] [SessionManager prepare_session][_try_run_local_init_op begin] [2022-11-01 17:06:40,960] [INFO] [5002#MainThread] [tensorflow/python/training/session_manager.py:544] Running local_init_op: name: "group_deps_2" op: "NoOp" input: "^init_2" input: "^init_all_tables" input: "^init_3" [2022-11-01 17:06:40,960][INFO] Running local_init_op: name: "group_deps_2" op: "NoOp" input: "^init_2" input: "^init_all_tables" input: "^init_3" [2022-11-01 17:06:40,992] [INFO] [5002#MainThread] [tensorflow/python/training/session_manager.py:546] Done running local_init_op. [2022-11-01 17:06:40,992][INFO] Done running local_init_op. 
[2022-11-01 17:06:40,993] [INFO] [5002#MainThread] [tensorflow/python/training/session_manager.py:337] [SessionManager prepare_session][_try_run_local_init_op end]
[2022-11-01 17:06:40,993] [INFO] [5002#MainThread] [tensorflow/python/training/session_manager.py:345] [SessionManager prepare_session][wait for model ready][Tensor("concat_3:0", shape=(?,), dtype=string)]
[2022-11-01 17:06:41,071] [INFO] [5002#MainThread] [tensorflow/python/training/session_manager.py:347] [SessionManager prepare_session][model is ready]
[2022-11-01 17:06:41,071] [INFO] [5002#MainThread] [tensorflow/python/training/monitored_session.py:652] [ChiefSessionCreator create_session][session_manager prepare_session end]
[2022-11-01 17:06:41,071] [INFO] [5002#MainThread] [tensorflow/python/training/monitored_session.py:916] [_CoordinatedSessionCreator create_session][create_session end]
[2022-11-01 17:06:41,072] [INFO] [5002#MainThread] [tensorflow/python/training/monitored_session.py:924] [_CoordinatedSessionCreator create_session][call hook after_create_session]
[2022-11-01 17:06:41,072] [INFO] [5002#MainThread] [tensorflow/python/training/monitored_session.py:926] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.estimator.util._DatasetInitializerHook object at 0x7fc352f67978>]
[2022-11-01 17:06:41,116] [INFO] [5002#MainThread] [tensorflow/python/training/monitored_session.py:926] [_CoordinatedSessionCreator create_session][after_create_session:<tensorflow.python.training.basic_session_run_hooks.FinalOpsHook object at 0x7fc350a01dd8>]
[2022-11-01 17:06:41,117] [INFO] [5002#MainThread] [tensorflow/python/training/monitored_session.py:928] [_CoordinatedSessionCreator create_session][call hook after_create_session is finish]
[2022-11-01 17:06:42,435] [INFO] [5002#MainThread] [tensorflow/python/training/evaluation.py:277] Finished evaluation at 2022-11-01-09:06:42
[2022-11-01 17:06:42,435] [INFO] [5002#MainThread] [tensorflow/python/estimator/estimator.py:2003] Saving dict for global step 50: auc = 0.54794496, global_step = 50, loss = 0.6682401, loss/loss/cross_entropy_loss = 0.6682401, loss/loss/total_loss = 0.6682401
[2022-11-01 17:06:43,042] [INFO] [5002#MainThread] [tensorflow/python/estimator/estimator.py:2063] Saving 'checkpoint_path' summary for global step 50: experiments/dwd_avazu_ctr/model.ckpt-50
[2022-11-01 17:06:43,042][INFO] Evaluate finish eval_result = {'auc': 0.54794496, 'loss': 0.6682401, 'loss/loss/cross_entropy_loss': 0.6682401, 'loss/loss/total_loss': 0.6682401, 'global_step': 50}
[2022-11-01 17:06:43,043][INFO] eval_result = {'auc': 0.54794496, 'loss': 0.6682401, 'loss/loss/cross_entropy_loss': 0.6682401, 'loss/loss/total_loss': 0.6682401, 'global_step': 50}
[2022-11-01 17:06:43,043][INFO] save eval result to file experiments/dwd_avazu_ctr/eval_result.txt
[2022-11-01 17:06:43,048][INFO] auc: 0.54794496
[2022-11-01 17:06:43,048][INFO] global_step: 50
[2022-11-01 17:06:43,048][INFO] loss: 0.6682401
[2022-11-01 17:06:43,048][INFO] loss/loss/cross_entropy_loss: 0.6682401
[2022-11-01 17:06:43,048][INFO] loss/loss/total_loss: 0.6682401
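As a rough sanity check on the metrics reported at global step 50, it can help to compare them with a trivial reference: a classifier that always outputs probability 0.5 has a cross-entropy loss of ln 2 and an AUC of 0.5. The sketch below is not part of EasyRec; the variable names are made up for illustration and the two reported values are simply copied from the log above.

```python
import math

# Values copied from the evaluation log above (global step 50).
reported_loss = 0.6682401
reported_auc = 0.54794496

# A model that always outputs p = 0.5 incurs -log(0.5) = ln 2 per example,
# whatever the label distribution is, and its AUC is 0.5 (all scores tie).
baseline_loss = -math.log(0.5)
baseline_auc = 0.5

loss_gain = baseline_loss - reported_loss
auc_gain = reported_auc - baseline_auc
print("baseline loss: %.4f, improvement: %.4f" % (baseline_loss, loss_gain))
print("baseline auc:  %.4f, improvement: %.4f" % (baseline_auc, auc_gain))
```

Both gaps are small, which is what one would expect from a checkpoint taken after only 50 steps.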
2.5 Save and export the model
!rm -rf experiments/deepfm_export/
!python3 -m easy_rec.python.export --pipeline_config_path dwd_avazu_ctr_deepmodel_dsw.config --export_dir experiments/deepfm_export/
================================================
|  PAI Tensorflow powered by Aliyun PAI Team.  |
================================================
Please ignore the following import error if you are using tunnel table io.
No module named '_common_io'
[2022-11-01 17:06:46,350][WARNING] pyhive is not installed.
[2022-11-01 17:06:52,954] [WARNING] [5142#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/input/csv_input.py:14: ignore_errors (from tensorflow.contrib.data.python.ops.error_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.experimental.ignore_errors()`.
[2022-11-01 17:06:52,955][WARNING] DataHub is not installed. You can install it by: pip install pydatahub
easy_rec version: 0.4.7
Usage: easy_rec.help()
[2022-11-01 17:06:52,962] [INFO] [5142#MainThread] [tensorflow/python/util/auto_strategy_utils.py:108] Disable Auto Strategy.
[2022-11-01 17:06:52,975] [INFO] [5142#MainThread] [tensorflow/python/estimator/estimator.py:205] Using config: {'_model_dir': 'experiments/dwd_avazu_ctr/', '_tf_random_seed': None, '_save_summary_steps': 10, '_save_checkpoints_steps': 50, '_save_checkpoints_secs': None, '_session_config': device_filters: "/job:ps" gpu_options { } allow_soft_placement: true , '_keep_checkpoint_max': 10, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 10, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f9d48cb2278>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
[2022-11-01 17:06:52,976][INFO] check_mode: False
[2022-11-01 17:06:52,991][INFO] input_name: hour, dtype: <dtype: 'string'>
[2022-11-01 17:06:52,992][INFO] input_name: c1, dtype: <dtype: 'string'>
[2022-11-01 17:06:52,992][INFO] input_name: banner_pos, dtype: <dtype: 'string'>
[2022-11-01 17:06:52,993][INFO] input_name: site_id, dtype: <dtype: 'string'>
[2022-11-01 17:06:52,993][INFO] input_name: site_domain, dtype: <dtype: 'string'>
[2022-11-01 17:06:52,994][INFO] input_name: site_category, dtype: <dtype: 'string'>
[2022-11-01 17:06:52,995][INFO] input_name: app_id, dtype: <dtype: 'string'>
[2022-11-01 17:06:52,995][INFO] input_name: app_domain, dtype: <dtype: 'string'>
[2022-11-01 17:06:52,996][INFO] input_name: app_category, dtype: <dtype: 'string'>
[2022-11-01 17:06:52,996][INFO] input_name: device_id, dtype: <dtype: 'string'>
[2022-11-01 17:06:52,997][INFO] input_name: device_ip, dtype: <dtype: 'string'>
[2022-11-01 17:06:52,997][INFO] input_name: device_model, dtype: <dtype: 'string'>
[2022-11-01 17:06:52,998][INFO] input_name: device_type, dtype: <dtype: 'string'>
[2022-11-01 17:06:52,998][INFO] input_name: device_conn_type, dtype: <dtype: 'string'>
[2022-11-01 17:06:52,999][INFO] input_name: c14, dtype: <dtype: 'string'>
[2022-11-01 17:06:53,000][INFO] input_name: c15, dtype: <dtype: 'string'>
[2022-11-01 17:06:53,000][INFO] input_name: c16, dtype: <dtype: 'string'>
[2022-11-01 17:06:53,001][INFO] input_name: c17, dtype: <dtype: 'string'>
[2022-11-01 17:06:53,001][INFO] input_name: c18, dtype: <dtype: 'string'>
[2022-11-01 17:06:53,002][INFO] input_name: c19, dtype: <dtype: 'string'>
[2022-11-01 17:06:53,002][INFO] input_name: c20, dtype: <dtype: 'string'>
[2022-11-01 17:06:53,003][INFO] input_name: c21, dtype: <dtype: 'string'>
[2022-11-01 17:06:53,004] [INFO] [5142#MainThread] [tensorflow/python/estimator/estimator.py:1226] Calling model_fn.
[2022-11-01 17:06:53,004][INFO] shared embeddings[num=0]
[2022-11-01 17:06:53,012] [WARNING] [5142#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column.py:206: EmbeddingColumn._get_dense_tensor (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.
[2022-11-01 17:06:53,014] [WARNING] [5142#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column_v2.py:3643: HashedCategoricalColumn._get_sparse_tensors (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.
[2022-11-01 17:06:53,015] [WARNING] [5142#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column.py:2119: HashedCategoricalColumn._transform_feature (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.
[2022-11-01 17:06:53,022] [WARNING] [5142#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column_v2.py:3537: HashedCategoricalColumn._num_buckets (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.
[2022-11-01 17:06:53,062] [WARNING] [5142#MainThread] [tensorflow/python/util/deprecation.py:305] From /home/pai/lib/python3.6/site-packages/easy_rec/python/compat/feature_column/feature_column.py:207: EmbeddingColumn._variable_shape (from easy_rec.python.compat.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.
[2022-11-01 17:06:55,376][INFO] dnn activation function = tf.nn.relu
[2022-11-01 17:06:55,483][INFO] dnn activation function = tf.nn.relu
[2022-11-01 17:06:55,565][INFO] building default outputs
[2022-11-01 17:06:55,565] [INFO] [5142#MainThread] [easy_rec/python/model/easy_rec_estimator.py:459] output probs shape: [None] type: <dtype: 'float32'>
[2022-11-01 17:06:55,565] [INFO] [5142#MainThread] [easy_rec/python/model/easy_rec_estimator.py:459] output logits shape: [None] type: <dtype: 'float32'>
[2022-11-01 17:06:55,566] [INFO] [5142#MainThread] [tensorflow/python/estimator/estimator.py:1228] Done calling model_fn.
[2022-11-01 17:06:55,566] [INFO] [5142#MainThread] [tensorflow/python/estimator/export/export.py:584] Signatures INCLUDED in export for Classify: None
[2022-11-01 17:06:55,566] [INFO] [5142#MainThread] [tensorflow/python/estimator/export/export.py:584] Signatures INCLUDED in export for Regress: None
[2022-11-01 17:06:55,566] [INFO] [5142#MainThread] [tensorflow/python/estimator/export/export.py:584] Signatures INCLUDED in export for Predict: ['serving_default']
[2022-11-01 17:06:55,566] [INFO] [5142#MainThread] [tensorflow/python/estimator/export/export.py:584] Signatures INCLUDED in export for Train: None
[2022-11-01 17:06:55,567] [INFO] [5142#MainThread] [tensorflow/python/estimator/export/export.py:584] Signatures INCLUDED in export for Eval: None
[2022-11-01 17:06:55.567234] [INFO] [5142#5142] [tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX512F
[2022-11-01 17:06:55.754356] [INFO] [5142#5142] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Found device 0 with properties: name: Tesla V100-SXM2-32GB major: 7 minor: 0 memoryClockRate(GHz): 1.53 pciBusID: 0000:4f:00.0 totalMemory: 31.75GiB freeMemory: 30.47GiB
[2022-11-01 17:06:55.754394] [INFO] [5142#5142] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0
[2022-11-01 17:06:56.074414] [INFO] [5142#5142] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix:
[2022-11-01 17:06:56.074448] [INFO] [5142#5142] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0
[2022-11-01 17:06:56.074456] [INFO] [5142#5142] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N
[2022-11-01 17:06:56.074592] [INFO] [5142#5142] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0)
[2022-11-01 17:06:56.078106] [INFO] [5142#5142] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0
[2022-11-01 17:06:56.078135] [INFO] [5142#5142] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix:
[2022-11-01 17:06:56.078142] [INFO] [5142#5142] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0
[2022-11-01 17:06:56.078148] [INFO] [5142#5142] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N
[2022-11-01 17:06:56.078198] [INFO] [5142#5142] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0)
[2022-11-01 17:06:56.108883] [INFO] [5142#5142] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1563] Adding visible gpu devices: 0
[2022-11-01 17:06:56.108910] [INFO] [5142#5142] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1010] Device interconnect StreamExecutor with strength 1 edge matrix:
[2022-11-01 17:06:56.108916] [INFO] [5142#5142] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1016] 0
[2022-11-01 17:06:56.108925] [INFO] [5142#5142] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1029] 0: N
[2022-11-01 17:06:56.108973] [INFO] [5142#5142] [tensorflow/core/common_runtime/gpu/gpu_device.cc:1167] Created TensorFlow device (/device:GPU:0 with 29559 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:4f:00.0, compute capability: 7.0)
[2022-11-01 17:06:56,225] [INFO] [5142#MainThread] [tensorflow/python/training/saver.py:1995] Restoring parameters from experiments/dwd_avazu_ctr/model.ckpt-100
[2022-11-01 17:06:56,360] [WARNING] [5142#MainThread] [tensorflow/python/util/deprecation.py:487] From /home/pai/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py:1076: calling SavedModelBuilder.add_meta_graph_and_variables (from tensorflow.python.saved_model.builder_impl) with legacy_init_op is deprecated and will be removed in a future version.
Instructions for updating:
Pass your op to the equivalent parameter main_op instead.
[2022-11-01 17:06:56,360] [INFO] [5142#MainThread] [tensorflow/python/saved_model/builder_impl.py:520] Assets added to graph.
[2022-11-01 17:06:56,373] [INFO] [5142#MainThread] [tensorflow/python/saved_model/builder_impl.py:142] Assets written to: experiments/deepfm_export/temp-b'1667293612'/assets
[2022-11-01 17:06:56,888] [INFO] [5142#MainThread] [tensorflow/python/saved_model/builder_impl.py:474] SavedModel written to: experiments/deepfm_export/temp-b'1667293612'/saved_model.pb
[2022-11-01 17:06:56,975][INFO] model has been exported to experiments/deepfm_export/1667293612 successfully
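The log above shows that the export lands in a timestamp-named subdirectory of `--export_dir` (here `experiments/deepfm_export/1667293612`), so the exact path changes on every run. A minimal sketch of how one might locate the most recent export is given below. `latest_export_dir` is a hypothetical helper written for this notebook, not an EasyRec API, and it assumes that export subdirectories are named with digits only.

```python
import os


def latest_export_dir(export_root):
    """Return the path of the digit-named subdirectory with the largest number, or None."""
    candidates = [
        name for name in os.listdir(export_root)
        if name.isdigit() and os.path.isdir(os.path.join(export_root, name))
    ]
    if not candidates:
        return None
    # Compare as integers so that the newest timestamp wins.
    return os.path.join(export_root, max(candidates, key=int))


# Only inspect the directory if the export step has actually been run.
if os.path.isdir('experiments/deepfm_export'):
    print(latest_export_dir('experiments/deepfm_export'))
```

Temporary directories such as `temp-b'1667293612'` are skipped because their names are not purely numeric.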
3. Validate the model on the test dataset
from easy_rec.python.inference.predictor import Predictor

# Load the SavedModel exported in step 2.5.
predictor = Predictor('experiments/deepfm_export')

with open('data/dwd_avazu_ctr_deepmodel_test.csv', 'r') as fin:
    batch_input = []
    for line_str in fin:
        line_str = line_str.strip()
        line_tok = line_str.split(',')
        # Drop the first column and keep the remaining columns as model inputs.
        line_tok = line_tok[1:]
        batch_input.append(line_tok)

output = predictor.predict(batch_input, batch_size=1024)

# print first 32 predictions
print(output[:32])
[2022-11-01 17:06:57,797] [INFO] [1487#MainThread] [easy_rec/python/inference/predictor.py:187] loading model from experiments/deepfm_export
[2022-11-01 17:06:57,808][INFO] model find in experiments/deepfm_export/1667293612
[2022-11-01 17:06:58,992] [INFO] [1487#MainThread] [tensorflow/python/training/saver.py:1995] Restoring parameters from experiments/deepfm_export/1667293612/variables/variables
[2022-11-01 17:06:59,194][INFO] Load input binding: c19 -> input_20:0
[2022-11-01 17:06:59,195][INFO] Load input binding: device_type -> input_13:0
[2022-11-01 17:06:59,195][INFO] Load input binding: app_category -> input_9:0
[2022-11-01 17:06:59,196][INFO] Load input binding: app_domain -> input_8:0
[2022-11-01 17:06:59,196][INFO] Load input binding: device_id -> input_10:0
[2022-11-01 17:06:59,196][INFO] Load input binding: site_domain -> input_5:0
[2022-11-01 17:06:59,197][INFO] Load input binding: device_model -> input_12:0
[2022-11-01 17:06:59,197][INFO] Load input binding: app_id -> input_7:0
[2022-11-01 17:06:59,198][INFO] Load input binding: c16 -> input_17:0
[2022-11-01 17:06:59,198][INFO] Load input binding: banner_pos -> input_3:0
[2022-11-01 17:06:59,198][INFO] Load input binding: c14 -> input_15:0
[2022-11-01 17:06:59,199][INFO] Load input binding: hour -> input_1:0
[2022-11-01 17:06:59,199][INFO] Load input binding: c1 -> input_2:0
[2022-11-01 17:06:59,199][INFO] Load input binding: c20 -> input_21:0
[2022-11-01 17:06:59,200][INFO] Load input binding: device_ip -> input_11:0
[2022-11-01 17:06:59,200][INFO] Load input binding: c18 -> input_19:0
[2022-11-01 17:06:59,201][INFO] Load input binding: c15 -> input_16:0
[2022-11-01 17:06:59,201][INFO] Load input binding: device_conn_type -> input_14:0
[2022-11-01 17:06:59,204][INFO] Load input binding: site_category -> input_6:0
[2022-11-01 17:06:59,204][INFO] Load input binding: site_id -> input_4:0
[2022-11-01 17:06:59,204][INFO] Load input binding: c21 -> input_22:0
[2022-11-01 17:06:59,205][INFO] Load input binding: c17 -> input_18:0
[2022-11-01 17:06:59,205][INFO] Load output binding: logits -> Squeeze:0
[2022-11-01 17:06:59,205][INFO] Load output binding: probs -> Sigmoid:0
[2022-11-01 17:06:59,207][INFO] {'pipeline.config': 'experiments/deepfm_export/1667293612/assets/pipeline.config'}
[2022-11-01 17:06:59,208][INFO] all_input_names: []
[{'logits': -0.1952547, 'probs': 0.45134082}, {'logits': -0.11913895, 'probs': 0.47025046}, {'logits': -0.116702095, 'probs': 0.47085756}, {'logits': -0.1305175, 'probs': 0.46741685}, {'logits': -0.18660952, 'probs': 0.45348254}, {'logits': -0.1715414, 'probs': 0.4572195}, {'logits': -0.19325311, 'probs': 0.45183653}, {'logits': -0.17194295, 'probs': 0.45711985}, {'logits': -0.17738754, 'probs': 0.45576906}, {'logits': -0.13334118, 'probs': 0.46671402}, {'logits': -0.12814105, 'probs': 0.46800846}, {'logits': -0.13306546, 'probs': 0.4667826}, {'logits': -0.18848173, 'probs': 0.45301855}, {'logits': -0.1448257, 'probs': 0.46385676}, {'logits': -0.11907825, 'probs': 0.47026554}, {'logits': -0.1274537, 'probs': 0.46817964}, {'logits': -0.21832734, 'probs': 0.44563395}, {'logits': -0.19236861, 'probs': 0.4520556}, {'logits': -0.16542831, 'probs': 0.45873702}, {'logits': -0.13001129, 'probs': 0.46754292}, {'logits': -0.17844774, 'probs': 0.45550612}, {'logits': -0.12868252, 'probs': 0.46787366}, {'logits': -0.13728587, 'probs': 0.4657323}, {'logits': -0.14307421, 'probs': 0.46429232}, {'logits': -0.12963952, 'probs': 0.46763542}, {'logits': -0.14125434, 'probs': 0.464745}, {'logits': -0.17874469, 'probs': 0.45543242}, {'logits': -0.1846297, 'probs': 0.45397323}, {'logits': -0.12691888, 'probs': 0.4683128}, {'logits': -0.19902793, 'probs': 0.4504066}, {'logits': -0.15309595, 'probs': 0.4618006}, {'logits': -0.16902995, 'probs': 0.45784286}]
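Each prediction carries two fields, and the output bindings in the log (`logits -> Squeeze:0`, `probs -> Sigmoid:0`) suggest that `probs` is the sigmoid of `logits`. The small check below illustrates that relationship using three pairs copied from the printed output; the `sigmoid` helper and the variable names are written here for illustration and are not part of EasyRec.

```python
import math


def sigmoid(x):
    """Standard logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


# (logits, probs) pairs copied from the printed predictions above.
samples = [
    (-0.1952547, 0.45134082),
    (-0.11913895, 0.47025046),
    (-0.21832734, 0.44563395),
]

# The model outputs are float32, so expect agreement to about 1e-6, not exactly.
max_abs_diff = max(abs(sigmoid(logit) - prob) for logit, prob in samples)
print("max |sigmoid(logits) - probs| = %.2e" % max_abs_diff)
```

All of the first 32 logits are negative, which is why every listed probability is below 0.5.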