Java【算法分享 02】道格拉斯-普克 Douglas-Peucker 抽稀算法分析及15w个坐标点抽稀到3.7w耗时从360s+优化到365ms接近1000倍的速度提升（并行流+多线程+泛型）-阿里云开发者社区

Java【算法分享 02】道格拉斯-普克 Douglas-Peucker 抽稀算法分析及15w个坐标点抽稀到3.7w耗时从360s+优化到365ms接近1000倍的速度提升（并行流+多线程+泛型）

2023-12-23 253

版权

本文内容由阿里云实名注册用户自发贡献，版权归原作者所有，阿里云开发者社区不拥有其著作权，亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容，填写侵权投诉表单进行举报，一经查实，本社区将立刻删除涉嫌侵权内容。

简介： Java【算法分享 02】道格拉斯-普克 Douglas-Peucker 抽稀算法分析及15w个坐标点抽稀到3.7w耗时从360s+优化到365ms接近1000倍的速度提升（并行流+多线程+泛型）

1.分析

算法详细流程可查看《道格拉斯抽稀算法流程图解+使用JDK8方法实现+详细注解源码》经典的 D-P 算法描述如下【红色部分用于辅助理解可忽略】：

连接当前矢量曲线首尾点a、b，该直线AB为当前矢量曲线的弦；
计算首尾点a、b间所有坐标点到该弦AB的距离，并获取最大距离d及其的坐标点c；
比较最大距离d与给定的阈值thresholdVal，小于阈值，当前弦AB可作为曲线的近似【首尾间的坐标点将被抽稀】，该段矢量曲线处理完毕。
大于阈值，使用最大距离坐标点c分割当前矢量曲线，分割后的两段矢量曲线分别使用1~3的流程进行处理。
全部处理完毕后，依次连接各个分割点形成的折线，即可以作为整个曲线的近似。

算法耗时及问题分析：

点 M ( x 0 , y 0 ) M(x_{0},y_{0})M(x0,y0) 到直线 A x + B y + C = 0 Ax+By+C=0Ax+By+C=0 的距离 d ：

坐标点较多时，每个点都要计算 A、B、C 三个参数并求解 d 将消耗不少时间。【耗时】

得到分割点后，递归调用起止点计算标记删除点方法，可能出现栈溢出。【问题】
thresholdVal 值越小保留细节将越多，分割点会呈现指数级增长，耗时会增加，需要找到合理值。【耗时+问题】
返回对象包含字段不统一，抽稀算法无法接收不同类型参数。【问题】

2.优化

1️⃣ 个算法使用的封装类和 1️⃣ 个业务使用封装类用于举例【为了简洁注解未使用DOC规范】：

// 类1 坐标数据封装
@Data
public class PointData {
    // 标记是否删除（0保留 1删除）
    private int isDelete;
    // 数据点的 index 用于优化运算速度（核心优化点）
    private int indexEx;
  // 数据点的经度
    private double longitudeEx;
    // 数据点的纬度
    private double latitudeEx;
}
// 类2 业务查询返回的数据封装【主要是实现了 PointData 类】具体字段不再贴出
@Data
public class Mobile2g extends PointData implements Serializable {
}

线程池不做过多介绍：

public class ThreadAllBaseStation {
    private static int corePoolSize = Runtime.getRuntime().availableProcessors();
    private static ThreadFactory namedFactory = new ThreadFactoryBuilder().setNameFormat("XXX数据查询线程-%d").build();
    /**
     * corePoolSize用于指定核心线程数量
     * maximumPoolSize指定最大线程数
     * keepAliveTime和TimeUnit指定线程空闲后的最大存活时间
     */
    public static ThreadPoolExecutor executor = new ThreadPoolExecutor(corePoolSize, corePoolSize + 1, 10L, TimeUnit.SECONDS,
            new LinkedBlockingQueue<>(1000), namedFactory, new ThreadPoolExecutor.AbortPolicy());
}

优化后的算法核心代码【为了简洁部分注解未使用DOC规范】：

@Slf4j
public class Douglas {
    // 点数据
    private static List<? extends PointData> points = new ArrayList<>();
    // 距离阈值（值越大效果越好但丢失细节就越多）【这个值已经算很小的了】
    private static final double DISTANCE_THRESHOLD = 0.00001;
    // 最大线程数
    private static final int MAX_THREAD_SIZE = 60;
    /**
     * 标记要删除的点（并获取分割点数据）
     *
     * @param from 矢量曲线的起始点
     * @param to   矢量曲线的终止点
     * @return 起始点 终止点 分割点
     */
    private static PointData[] markDeletePointsAndGetSplitPoint(PointData from, PointData to) {
        // 计算由起始点和终止点构成直线方程一般式的系数
        double fromLat = from.getLatitudeEx();
        double fromLng = from.getLongitudeEx();
        double toLat = to.getLatitudeEx();
        double toLng = to.getLongitudeEx();
        // 求解点到直线距离方程的参数
        double fromLatMinusToLat = fromLat - toLat;
        double fromLngMinusToLng = fromLng - toLng;
        double sqrtPowAddVal = Math.sqrt(Math.pow(fromLatMinusToLat, 2) + Math.pow(fromLngMinusToLng, 2));
        double parameterA = (fromLatMinusToLat) / sqrtPowAddVal;
        double parameterB = (toLng - fromLng) / sqrtPowAddVal;
        double parameterC = (fromLng * toLat - toLng * fromLat) / sqrtPowAddVal;
        // 获取点数据下标（并排除相邻两点的情况）【最核心的优化】
        int fromIndex = from.getIndexEx();
        int toIndex = to.getIndexEx();
        if (toIndex == fromIndex + 1) {
            return new PointData[]{};
        }
        // 起止点之间的点到起止点连线的距离数据集合并获取最大距离及分割点
        double distanceMax = 0;
        PointData splitPoint = null;
        double powAddRes = Math.pow(parameterA, 2) + Math.pow(parameterB, 2);
        for (int i = fromIndex + 1; i < toIndex; i++) {
            double lng = points.get(i).getLongitudeEx();
            double lat = points.get(i).getLatitudeEx();
            // 点到直线的距离
            double distance = Math.abs(parameterA * (lng) + parameterB * (lat) + parameterC) / Math.sqrt(powAddRes);
            if (distance > distanceMax) {
                distanceMax = distance;
                splitPoint = points.get(i);
            }
        }
        // 最大距离与距离阈值进行比较
        if (distanceMax < DISTANCE_THRESHOLD) {
            // 小于则标记 points(fromIndex,toIndex) 内的坐标（用于删除）
            for (int i = fromIndex + 1; i < toIndex; i++) {
                points.get(i).setIsDelete(1);
            }
        } else {
            // 返回起止点和分割点坐标
            return new PointData[]{from, to, splitPoint};
        }
        return new PointData[]{};
    }
    /**
     * 进行抽稀：标记要删除的点并筛选
     *
     * @param source 矢量曲线点数据
     * @return 抽稀后的点数据
     */
    public static List<? extends PointData> toDilute(List<? extends PointData> source) throws InterruptedException {
        // 抽稀前的点数据
        points = source;
        // 进行标记
        if (!CollectionUtils.isEmpty(points)) {
            // 首次标记
            long s1 = System.currentTimeMillis();
            PointData[] pointData = markDeletePointsAndGetSplitPoint(points.get(0), points.get(points.size() - 1));
            long s2 = System.currentTimeMillis();
            log.info("首次标记用时：" + (s2 - s1) + "ms");
            // 如果有分割点将进行分割标记
            int pointDataLength = 3;
            if (pointData.length == pointDataLength) {
                CopyOnWriteArrayList<PointData[]> splitPointList = new CopyOnWriteArrayList<>();
                splitPointList.add(pointData);
                // 使用线程池进行分割标记
                do {
                    splitPointList = markDeletePointsAndGetSplitPointByThread(splitPointList);
                } while (!splitPointList.isEmpty());
            }
        }
        // 筛选掉标记为删除的点
        return points.parallelStream().filter(point -> point.getIsDelete() != 1).collect(Collectors.toList());
    }
    /**
     * 使用多线程：标记要删除的点（并获取分割点数据）
     *
     * @param splitPointList 起始点及分割点栈数据
     * @return 新的起始点及分割点栈数据
     * @throws InterruptedException 线程等待可能出现的问题
     */
    private static CopyOnWriteArrayList<PointData[]> markDeletePointsAndGetSplitPointByThread(CopyOnWriteArrayList<PointData[]> splitPointList) throws InterruptedException {
        // 新一轮要处理的分割数据
        CopyOnWriteArrayList<PointData[]> splitPointListRes = new CopyOnWriteArrayList<>();
        int pointDataLength = 3;
        int splitPointListSize = splitPointList.size();
        log.info("当前列表Size：" + splitPointListSize);
        int numberBatch = MAX_THREAD_SIZE;
        double number = splitPointListSize * 1.0 / numberBatch;
        int n = ((Double) Math.ceil(number)).intValue();
        for (int i = 0; i < n; i++) {
            int end = numberBatch * (i + 1);
            if (end > splitPointListSize) {
                end = splitPointListSize;
            }
            List<PointData[]> pointData = splitPointList.subList(numberBatch * i, end);
            log.info("当前执行 " + numberBatch * i + " 到 " + end + "条数据");
            CountDownLatch countDownLatch = new CountDownLatch(pointData.size());
            ThreadPoolExecutor executor = ThreadAllBaseStation.executor;
            pointData.forEach(point -> {
                executor.submit(() -> {
                    PointData[] pointDataLeft = markDeletePointsAndGetSplitPoint(point[0], point[2]);
                    if (pointDataLeft.length == pointDataLength) {
                        splitPointListRes.add(pointDataLeft);
                    }
                    PointData[] pointDataRight = markDeletePointsAndGetSplitPoint(point[2], point[1]);
                    if (pointDataRight.length == pointDataLength) {
                        splitPointListRes.add(pointDataRight);
                    }
                    countDownLatch.countDown();
                });
            });
            countDownLatch.await();
        }
        return splitPointListRes;
    }
}

算法使用结合业务【伪代码】：

@OverrideData
    public void douglasAlgorithmTest() {
        // 抽稀前数据查询（坐标点需要有序）
        List<Mobile2g> beforeDouglas = selectData("queryParam");
        // 抽稀字段处理（坐标点）【优化】
        beforeDouglas.forEach(bd -> {
          bd.setIndexEx(indexEx.getAndIncrement());
            bd.setLongitudeEx(bd.getLongitude());
            bd.setLatitudeEx(bd.getLatitude());
        });
        // 抽稀后保存
        List<Mobile2g> afterDouglas = null;
        try {
          // 调用抽稀算法
            afterDouglas = (List<Mobile2g>) Douglas.toDilute(beforeDouglas);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

3.总结

使用泛型解决了不同对象的抽稀问题。
优化最大距离及分割点的获取方法。
使用多线程和并行流计算缩短了抽稀时间【增大线程MAX_THREAD_SIZE仍然可以提升效率】。
最核心的优化

// 原始代码【这个速度是很慢的 要比对每个对象才能找到 index】
int fromIndex = points.indexOf(from); 
// 事先对 PointData 对象添加了indexEx 属性 结果是惊人的快
int fromIndex = from.getIndexEx();

Java【算法分享 02】道格拉斯-普克 Douglas-Peucker 抽稀算法分析及15w个坐标点抽稀到3.7w耗时从360s+优化到365ms接近1000倍的速度提升（并行流+多线程+泛型）

1.分析

2.优化

3.总结

热门文章

最新文章

相关课程

相关电子书

相关实验场景

热门

活动广场

任务中心

开发者评测

高校计划

乘风者计划

训练营

阿里云MVP

话题

直播

下载

镜像站

技术资料

插件

Java【算法分享 02】道格拉斯-普克 Douglas-Peucker 抽稀算法分析及15w个坐标点抽稀到3.7w耗时从360s+优化到365ms接近1000倍的速度提升（并行流+多线程+泛型）

1.分析

2.优化

3.总结

热门文章

最新文章

相关课程

相关电子书

相关实验场景