Front-End Intelligence Talks (2): pix2code in Practice

2019-07-31


Getting pix2code running

Let's start with the practical part: the steps to get pix2code running.
1 Download the pix2code source code

git clone https://github.com/tonybeltramelli/pix2code

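On a slow connection this takes a while: the .git directory alone is about 700 MB, and the image data adds roughly another 450 MB.
2 Unpack the data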

cd datasets
zip -F pix2code_datasets.zip --out datasets.zip
unzip datasets.zip

3 Create the training and evaluation sets

cd ../model

./build_datasets.py ../datasets/ios/all_data
./build_datasets.py ../datasets/android/all_data
./build_datasets.py ../datasets/web/all_data

4 Install the Python libraries

pip install opencv-python
pip install tensorflow
pip install keras

5 Convert images to arrays

./convert_imgs_to_arrays.py ../datasets/ios/training_set ../datasets/ios/training_features
./convert_imgs_to_arrays.py ../datasets/android/training_set ../datasets/android/training_features
./convert_imgs_to_arrays.py ../datasets/web/training_set ../datasets/web/training_features

6 Train
Using Android as the example.
For a first training run (the commands start from the repository root):

mkdir bin
cd model
./train.py ../datasets/android/training_features ../bin 1

For a later run that continues from previously saved weights (again starting from the repository root):

cd model
./train.py ../datasets/android/training_features ../bin 1 ../bin/pix2code.h5

7 Inference
Pick an Android, iOS, or Web screenshot (ideally from the same platform as the training data) and try it out:

./sample.py ../bin pix2code ../test_gui.png ../code greedy

8 Generate HTML/Android/iOS source files

cd compiler

# compile .gui file to Android XML UI
./android-compiler.py <input file path>.gui

# compile .gui file to iOS Storyboard
./ios-compiler.py <input file path>.gui

# compile .gui file to HTML/CSS (Bootstrap style)
./web-compiler.py <input file path>.gui
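
If you want to drive step 8 from a script, the choice of compiler can be expressed as a small lookup. This is only a sketch: the script names come from the commands above, while the `compiler_command` helper and the `COMPILERS` table are hypothetical and not part of pix2code.

```python
# Script names are taken from the commands above; everything else is illustrative.
COMPILERS = {
    "android": "android-compiler.py",
    "ios": "ios-compiler.py",
    "web": "web-compiler.py",
}


def compiler_command(platform, gui_path):
    """Return the argv list that compiles a .gui file for the given platform."""
    try:
        script = COMPILERS[platform]
    except KeyError:
        raise ValueError("unknown platform: {}".format(platform))
    # The scripts are invoked relative to the compiler directory, as in step 8.
    return ["./" + script, gui_path]
```

The returned list can be passed to `subprocess.run(..., check=True)` from inside the compiler directory.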

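A walkthrough of the pix2code pipeline

Data

To stay under GitHub's per-file size threshold of 50 MB (Git LFS would probably make this unnecessary today), the training data is split into 10 archive parts.

First we merge the parts into a single archive. The pix2code author does this with zip -F, the zip repair option, which writes a new datasets.zip of 466,387,966 bytes; we then unzip it: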

# reassemble and unzip the data
cd datasets
zip -F pix2code_datasets.zip --out datasets.zip
unzip datasets.zip

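Unpacking produces three directories: android, web, and ios.

Each of them contains a single subdirectory, all_data, which holds 1750 PNG images and the 1750 corresponding .gui files.

Android images are all 688*1070, 3-channel RGB, 144 ppi.
iOS images are 760*1340, 3-channel RGB, 144 ppi.
Web images are 2400*1380, 3-channel RGB, 144 ppi.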
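
To confirm that the archive unpacked correctly, the counts and image sizes quoted above can be checked with a short script. The sketch below uses only the standard library and reads the width and height straight from each PNG header; `png_size` and `summarize` are hypothetical helper names, not part of pix2code.

```python
import os
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Read (width, height) from a PNG file's IHDR chunk without decoding it."""
    with open(path, "rb") as f:
        header = f.read(24)
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR", width, height.
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG file: {}".format(path))
    return struct.unpack(">II", header[16:24])


def summarize(all_data_dir):
    """Count .png/.gui files and collect the distinct image sizes in a directory."""
    names = os.listdir(all_data_dir)
    pngs = sorted(n for n in names if n.endswith(".png"))
    guis = {n[:-4] for n in names if n.endswith(".gui")}
    missing = [n for n in pngs if n[:-4] not in guis]
    sizes = {png_size(os.path.join(all_data_dir, n)) for n in pngs}
    return {"png": len(pngs), "gui": len(guis), "missing_gui": missing, "sizes": sizes}
```

Calling `summarize("../datasets/android/all_data")` from the model directory should report matching image and .gui counts and a single size if the data matches the description above.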

Next we split the all_data directory into a training set and an evaluation set:

# split training set and evaluation set while ensuring no training example in the evaluation set
# usage: build_datasets.py <input path> <distribution (default: 6)>
./build_datasets.py ../datasets/ios/all_data
./build_datasets.py ../datasets/android/all_data
./build_datasets.py ../datasets/web/all_data

build_datasets.py splits all_data into two parts, training_set and eval_set, named by these constants:

TRAINING_SET_NAME = "training_set"
EVALUATION_SET_NAME = "eval_set"

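By default, the 1750 image/DSL pairs are split into 1500 training samples and 250 evaluation samples, which is consistent with the default distribution of 6 read as a 6:1 ratio (1750 / 7 = 250). Taking ios as the example: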

Splitting datasets, training samples: 1500, evaluation samples: 250
Training dataset: ../datasets/ios/training_set
Evaluation dataset: ../datasets/ios/eval_set
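
The 1500/250 numbers follow from simple arithmetic on the distribution value. The sketch below reproduces only that arithmetic and a disjoint split by sample name; `split_names` is a hypothetical function, and it does not attempt whatever extra checks build_datasets.py performs to keep training examples out of the evaluation set.

```python
import random


def split_names(names, distribution=6, seed=None):
    """Split sample names into (training, evaluation) at a distribution:1 ratio."""
    names = sorted(names)
    # A seeded shuffle keeps the split reproducible between runs.
    random.Random(seed).shuffle(names)
    eval_count = len(names) // (distribution + 1)
    evaluation = names[:eval_count]
    training = names[eval_count:]
    return training, evaluation
```

With 1750 names and the default distribution of 6, the evaluation set gets 1750 // 7 = 250 names and the training set gets the remaining 1500.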

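Before training, we normalize the images, which helps the training process.
Resizing the images requires the opencv-python library, so install it first.
RGB values are integers from 0 to 255; dividing by 255 converts them to floats between 0 and 1.
The code is as follows: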

    @staticmethod
    def get_preprocessed_img(img_path, image_size):
        import cv2
        img = cv2.imread(img_path)
        img = cv2.resize(img, (image_size, image_size))
        img = img.astype('float32')
        img /= 255
        return img
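
To make the scaling step concrete without any dependencies, here is the same divide-by-255 idea on a tiny nested list standing in for an image. It is a pure-Python illustration only: `normalize_pixels` is a hypothetical name, the resize step is left out, and the real code above performs the division on a NumPy array in one vectorized operation.

```python
def normalize_pixels(rows):
    """Map 8-bit channel values (0..255) to floats in [0.0, 1.0]."""
    return [
        [[channel / 255.0 for channel in pixel] for pixel in row]
        for row in rows
    ]


# A 1x2 "image" with three channels per pixel.
tiny_image = [[[0, 128, 255], [255, 255, 255]]]
normalized = normalize_pixels(tiny_image)
```

Every value in `normalized` lies between 0.0 and 1.0, with 0 mapping to 0.0 and 255 mapping to 1.0.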

The full logic, including copying the matching .gui file next to each saved array, is as follows:

print("Converting images to numpy arrays...")

for f in os.listdir(input_path):
    if f.find(".png") != -1:
        img = Utils.get_preprocessed_img("{}/{}".format(input_path, f), IMAGE_SIZE)
        file_name = f[:f.find(".png")]

        np.savez_compressed("{}/{}".format(output_path, file_name), features=img)
        retrieve = np.load("{}/{}.npz".format(output_path, file_name))["features"]

        assert np.array_equal(img, retrieve)

        shutil.copyfile("{}/{}.gui".format(input_path, file_name), "{}/{}.gui".format(output_path, file_name))

print("Numpy arrays saved in {}".format(output_path))

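Training

With the data ready, we can start training.

Training is done with train.py, which takes up to four arguments.
The first is the input path and the second is the output path for the training results; both are required. The optional third is the memory option (1 feeds the data through a generator in batches instead of loading everything into memory), and the optional fourth is a file of pretrained weights to load.
A few examples. The most basic case passes only the required input and output directories: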

./train.py ../datasets/web/training_features ../bin

The most complete case also loads pretrained weights:

./train.py ../datasets/android/training_features ../bin 1 ../bin/pix2code.h5

Let's go straight to the code:

if __name__ == "__main__":
    argv = sys.argv[1:]

    if len(argv) < 2:
        print("Error: not enough argument supplied:")
        print("train.py <input path> <output path> <is memory intensive (default: 0)> <pretrained weights (optional)>")
        exit(0)
    else:
        input_path = argv[0]
        output_path = argv[1]
        use_generator = False if len(argv) < 3 else True if int(argv[2]) == 1 else False
        pretrained_weigths = None if len(argv) < 4 else argv[3]

    run(input_path, output_path, is_memory_intensive=use_generator, pretrained_model=pretrained_weigths)
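
For comparison, the same four-argument interface can be described with argparse from the standard library. This is an alternative sketch, not what train.py does: `parse_args` is a hypothetical function, and unlike the code above it rejects any memory option other than 0 or 1 instead of silently treating it as 0.

```python
import argparse


def parse_args(argv):
    """Parse train.py-style arguments: two required and two optional positionals."""
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("input_path")
    parser.add_argument("output_path")
    # Optional positionals are filled left to right when present.
    parser.add_argument("is_memory_intensive", nargs="?", type=int,
                        choices=[0, 1], default=0)
    parser.add_argument("pretrained_weights", nargs="?", default=None)
    return parser.parse_args(argv)


args = parse_args(["../datasets/android/training_features", "../bin", "1"])
use_generator = args.is_memory_intensive == 1
```

A side benefit of this approach is that a missing required argument produces a usage message and a non-zero exit status automatically.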

Inside run, the first step is to set up the Dataset:

def run(input_path, output_path, is_memory_intensive=False, pretrained_model=None):
    np.random.seed(1234)

    dataset = Dataset()
    dataset.load(input_path, generate_binary_sequences=True)
    dataset.save_metadata(output_path)
    dataset.voc.save(output_path)

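While loading, the dataset has to process the text. The first step, naturally, is to pair each .gui text file with its image, either the original .png or a preprocessed .npz array: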

    def load(self, path, generate_binary_sequences=False):
        print("Loading data...")
        for f in os.listdir(path):
            if f.find(".gui") != -1:
                gui = open("{}/{}".format(path, f), 'r')
                file_name = f[:f.find(".gui")]

                if os.path.isfile("{}/{}.png".format(path, file_name)):
                    img = Utils.get_preprocessed_img("{}/{}.png".format(path, file_name), IMAGE_SIZE)
                    self.append(file_name, gui, img)
                elif os.path.isfile("{}/{}.npz".format(path, file_name)):
                    img = np.load("{}/{}.npz".format(path, file_name))["features"]
                    self.append(file_name, gui, img)
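
The excerpt does not show what `self.append` does with the .gui text, but the `partial_sequences` and `next_words` fields used later suggest next-token training pairs. The sketch below shows one common way to build such pairs, a sliding window over the token sequence. It is an assumption for illustration: `make_training_pairs`, the token strings, and the `context_length` parameter are all hypothetical here and should be checked against the pix2code source.

```python
# Assumed marker tokens; the actual values used by pix2code may differ.
START_TOKEN = "<START>"
END_TOKEN = "<END>"
PLACEHOLDER = " "


def make_training_pairs(tokens, context_length):
    """Turn one token sequence into (context window, next token) pairs."""
    sequence = [START_TOKEN] + list(tokens) + [END_TOKEN]
    # Left-pad so that even the first real token has a full-length context.
    padded = [PLACEHOLDER] * context_length + sequence
    pairs = []
    for i in range(len(padded) - context_length):
        context = padded[i:i + context_length]
        next_token = padded[i + context_length]
        pairs.append((context, next_token))
    return pairs


pairs = make_training_pairs(["header", "{", "}"], context_length=2)
```

Under this scheme one GUI file yields one pair per token (including the start and end markers), and every pair is associated with the same image.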

Once the text is found, it is converted to vectors. Let's look at the whole flow first and then at the details:

        print("Generating sparse vectors...")
        self.voc.create_binary_representation()
        self.next_words = self.sparsify_labels(self.next_words, self.voc)
        if generate_binary_sequences:
            self.partial_sequences = self.binarize(self.partial_sequences, self.voc)
        else:
            self.partial_sequences = self.indexify(self.partial_sequences, self.voc)

        self.size = len(self.ids)
        assert self.size == len(self.input_images) == len(self.partial_sequences) == len(self.next_words)
        assert self.voc.size == len(self.voc.vocabulary)

        print("Dataset size: {}".format(self.size))
        print("Vocabulary size: {}".format(self.voc.size))

        self.input_shape = self.input_images[0].shape
        self.output_size = self.voc.size

        print("Input shape: {}".format(self.input_shape))
        print("Output size: {}".format(self.output_size))

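In this flow, create_binary_representation builds a one-hot vector for every token in the vocabulary: all zeros except for a 1 at the token's index.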

    def create_binary_representation(self):
        if sys.version_info >= (3,):
            items = self.vocabulary.items()
        else:
            items = self.vocabulary.iteritems()
        for key, value in items:
            binary = np.zeros(self.size)
            binary[value] = 1
            self.binary_vocabulary[key] = binary
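
The same one-hot construction can be tried on a toy vocabulary. The sketch below uses plain lists instead of NumPy arrays so it runs anywhere; `one_hot_vocabulary` is a hypothetical standalone function and the three tokens are made up for the example.

```python
def one_hot_vocabulary(vocabulary):
    """Map each token to a one-hot list, given a token -> index dict."""
    size = len(vocabulary)
    binary_vocabulary = {}
    for token, index in vocabulary.items():
        vector = [0.0] * size
        vector[index] = 1.0  # the only non-zero entry marks this token
        binary_vocabulary[token] = vector
    return binary_vocabulary


toy = one_hot_vocabulary({"header": 0, "row": 1, "btn": 2})
```

Each vector has exactly as many entries as the vocabulary has tokens, which is why the vocabulary size also becomes the model's output size.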

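Once the data is loaded, the memory option decides how the dataset is fed to the model.
If memory is plentiful, the dataset is converted to arrays and training can start directly: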

    if not is_memory_intensive:
        dataset.convert_arrays()

        input_shape = dataset.input_shape
        output_size = dataset.output_size

        print(len(dataset.input_images), len(dataset.partial_sequences), len(dataset.next_words))
        print(dataset.input_images.shape, dataset.partial_sequences.shape, dataset.next_words.shape)

convert_arrays() does exactly what its name says: it only converts the lists to arrays, calling np.array to do the work:

    def convert_arrays(self):
        print("Convert arrays...")
        self.input_images = np.array(self.input_images)
        self.partial_sequences = np.array(self.partial_sequences)
        self.next_words = np.array(self.next_words)

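If memory is not something you can afford to waste, process the data in batches instead. Remember the BATCH_SIZE parameter from the previous article? This is where it comes in: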

    else:
        gui_paths, img_paths = Dataset.load_paths_only(input_path)

        input_shape = dataset.input_shape
        output_size = dataset.output_size
        steps_per_epoch = dataset.size / BATCH_SIZE

        voc = Vocabulary()
        voc.retrieve(output_path)

        generator = Generator.data_generator(voc, gui_paths, img_paths, batch_size=BATCH_SIZE, generate_binary_sequences=True)
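
The batching idea can be sketched independently of Keras. Note that in Python 3 the expression `dataset.size / BATCH_SIZE` produces a float; the sketch below rounds up to a whole number of steps, which is one reasonable choice and not necessarily what pix2code intends. `batches` and `steps_per_epoch` are hypothetical helpers, and the real `Generator.data_generator` also loads images and encodes tokens, which is omitted here.

```python
import math


def batches(samples, batch_size):
    """Yield consecutive batches forever, wrapping around after each pass."""
    if not samples:
        raise ValueError("samples must not be empty")
    while True:
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]


def steps_per_epoch(dataset_size, batch_size):
    """Number of batches needed to see every sample once (last one may be short)."""
    return math.ceil(dataset_size / batch_size)


gen = batches(list(range(10)), batch_size=4)
first_epoch = [next(gen) for _ in range(steps_per_epoch(10, 4))]
```

The generator never terminates on its own, so the caller decides where an epoch ends by drawing exactly `steps_per_epoch` batches.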

With that, data preprocessing is done and we can build the model:

    model = pix2code(input_shape, output_size, output_path)

If a pretrained model was given, load its weights:

    if pretrained_model is not None:
        model.model.load_weights(pretrained_model)

Finally, the memory option again decides which fit variant is used:

    if not is_memory_intensive:
        model.fit(dataset.input_images, dataset.partial_sequences, dataset.next_words)
    else:
        model.fit_generator(generator, steps_per_epoch=steps_per_epoch)
