PostgreSQL 内核扩展之 - 管理十亿级3D扫描数据(基于Lidar产生的point cloud数据)

本文涉及的产品
云数据库 RDS SQL Server,基础系列 2核4GB
RDS SQL Server Serverless,2-4RCU 50GB 3个月
推荐场景:
RDS MySQL Serverless 基础系列,0.5-2RCU 50GB
简介:

背景知识

还记得成龙演的那部《十二生肖》里用3D扫描和打印技术复制的生肖吗?
screenshot
3D打印是近几年兴起的一种技术,除了存储 物体表面的位置信息,还有颜色,密度等信息
而3D扫描其实在军用领域很早以前就有了。
如果使用普通的数据库来存储,得把这些属性拆开来存。
而在PostgreSQL中,你完全不需要把这些属性拆开,他们本来就是一体的,用好PG的扩展接口就好了。
PostgreSQL 扩展指南 :
https://yq.aliyun.com/articles/55981

什么是Lidar

3D扫描的基础知识,来自百度百科
LiDAR——Light Detection And Ranging,即激光探测与测量。
是利用GPS(Global Position System)和IMU(Inertial Measurement Unit,惯性测量装置)机载激光扫描。
其所测得的数据为数字表面模型(Digital Surface Model, DSM)的离散点表示,数据中含有 空间三维信息激光强度 信息。
应用分类(Classification)技术在这些原始数字表面模型中移除建筑物、人造物、覆盖植物等测点,即可获得数字高程模型(Digital Elevation Model, DEM),并同时得到地面覆盖物的高度。

机载LIDAR技术在国外的发展和应用已有十几年的历史,但是我国在这方面的研究和应用还只是刚刚起步,其中利用航空激光扫描探测数据进行困难地区DEM、DOM、DLG数据产品生产是当今的研究热点之一。
该技术在地形测绘、环境检测、三维城市建模等诸多领域具有广阔的发展前景和应用需求,有可能为测绘行业带来一场新的技术革命。
针对不同的应用领域及成果要求,结合灵活的搭载方式,LiDAR技术可以广泛应用于基础测绘、道路工程、电力电网、水利、石油管线、海岸线及海岛礁、数字城市等领域,提供高精度、大比例尺(1:500至1:10000)的空间数据成果。

例子
激光扫描设备装置可记录一个单发射脉冲返回的首回波﹑中间多个回波与最后回波,通过对每个回波时刻记录,可同时获得多个高程信息,将IMU/DGPS系统和激光扫描技术进行集成,飞机向前飞行时,扫描仪横向对地面发射连续的激光束,同时接受地面反射回波,IMU/DGPS系统记录每一个激光发射点的瞬间空间位置和姿态,从而可计算得到激光反射点的空间位置。
b03533fa828ba61e24f4d8e14134970a314e59e1

激光雷达的激光束扫瞄机场的滑道,探测飞机将遇到的风切变,即逆风的改变。
023b5bb5c9ea15ce520c2114b6003af33b87b2c9

什么是point cloud

https://en.wikipedia.org/wiki/Point_cloud

前面提到了LIDAR扫描的数据包括了位置信息,以及其他传感器在对应这个位置采集的其他信息如 RGB、时间、温度、湿度等等(你可以大开脑洞想,什么都有)。
也就是说,每一个点都包含了大量的信息。

The variables captured by LIDAR sensors varies by sensor and capture process.   
Some data sets might contain only X/Y/Z values.   
Others will contain dozens of variables: X, Y, Z; intensity and return number;   
red, green, and blue values; return times; and many more.
AI 代码解读

point cloud则是很多点的集合,所以信息量是非常庞大的。

A point cloud is a set of data points in some coordinate system.  

In a three-dimensional coordinate system, these points are usually defined by X, Y, and Z coordinates, and often are intended to represent the external surface of an object.  

Point clouds may be created by 3D scanners. These devices measure a large number of points on an object's surface, and often output a point cloud as a data file.   
The point cloud represents the set of points that the device has measured.
AI 代码解读


point cloud如何存入PostgreSQL

PostgreSQL 存储的point cloud格式与PDAL(Point Data Abstraction Library)库一致。
非常好的复用了PDAL的数据结构,操作方法等。
因为PDAL的通用性,所以兼容性也非常的好。

PostgreSQL Pointcloud deals with all this variability by using a "schema document" to describe the contents of any particular LIDAR point.   
Each point contains a number of dimensions, and each dimension can be of any data type, with scaling and/or offsets applied to move between the actual value and the value stored in the database.   
The schema document format used by PostgreSQL Pointcloud is the same one used by the PDAL library.
AI 代码解读

PostgreSQL Point Cloud支持的对象类型

PcPoint

基本类型,样例

{  
    "pcid" : 1,  
      "pt" : [0.01, 0.02, 0.03, 4]  
}
AI 代码解读

PcPatch

PcPoint的聚合类型,一个PcPatch存储的是一些临近PcPoint的聚合。

The structure of database storage is such that storing billions of points as individual records in a table is not an efficient use of resources. 
Instead, we collect a group of PcPoint into a PcPatch. Each patch should hopefully contain points that are near together.
AI 代码解读

样例

{  
    "pcid" : 1,  
     "pts" : [  
              [0.02, 0.03, 0.05, 6],  
              [0.02, 0.03, 0.05, 8]  
             ]  
}
AI 代码解读

PostgreSQL Point Cloud支持的Functions

PC_MakePoint(pcid integer, vals float8[]) returns pcpoint

构造pcpoint

SELECT PC_MakePoint(1, ARRAY[-127, 45, 124.0, 4.0]);  
010100000064CEFFFF94110000703000000400  

INSERT INTO points (pt)  
SELECT PC_MakePoint(1, ARRAY[x,y,z,intensity])  
FROM (  
  SELECT    
  -127+a/100.0 AS x,   
    45+a/100.0 AS y,  
         1.0*a AS z,  
          a/10 AS intensity  
  FROM generate_series(1,100) AS a  
) AS values;  
pcid是唯一标示一个点的ID
AI 代码解读

PC_AsText(p pcpoint) returns text

pcpoint转换成人类可读的文本

SELECT PC_AsText('010100000064CEFFFF94110000703000000400'::pcpoint);  

{"pcid":1,"pt":[-127,45,124,4]}
AI 代码解读

PC_PCId(p pcpoint) returns integer (from 1.1.0)

取pcid

SELECT PC_PCId('010100000064CEFFFF94110000703000000400'::pcpoint));  

1
AI 代码解读

PC_AsBinary(p pcpoint) returns bytea

将pcpoint转换为OGC WKB编码的二进制

SELECT PC_AsBinary('010100000064CEFFFF94110000703000000400'::pcpoint);  

\x01010000800000000000c05fc000000000008046400000000000005f40
AI 代码解读

PC_Get(pt pcpoint, dimname text) returns numeric

读取pcpoint包含的维度(指定属性)信息

SELECT PC_Get('010100000064CEFFFF94110000703000000400'::pcpoint, 'Intensity');  

4
AI 代码解读

PC_Get(pt pcpoint) returns float8[] (from 1.1.0)

读取pcpoint的属性,返回ARRAY

SELECT PC_Get('010100000064CEFFFF94110000703000000400'::pcpoint);  

{-127,45,124,4}
AI 代码解读

PC_Patch(pts pcpoint[]) returns pcpatch

把多个pcpoint聚合成一个pcpatch

INSERT INTO patches (pa)  
SELECT PC_Patch(pt) FROM points GROUP BY id/10;
AI 代码解读

PC_NumPoints(p pcpatch) returns integer

pcpatch中包含多少个pcpoint

SELECT PC_NumPoints(pa) FROM patches LIMIT 1;  

9
AI 代码解读

PC_PCId(p pcpatch) returns integer (from 1.1.0)

返回pcpatch包含的pcpoint的pcid

SELECT PC_PCId(pa) FROM patches LIMIT 1;  

1
AI 代码解读

PC_Envelope(p pcpatch) returns bytea

将pcpatch转换为OGC WKB编码的二进制

SELECT PC_Envelope(pa) FROM patches LIMIT 1;  

\x0103000000010000000500000090c2f5285cbf5fc0e17a  
14ae4781464090c2f5285cbf5fc0ec51b81e858b46400ad7  
a3703dba5fc0ec51b81e858b46400ad7a3703dba5fc0e17a  
14ae4781464090c2f5285cbf5fc0e17a14ae47814640
AI 代码解读

PC_AsText(p pcpatch) returns text

将pcpatch转换为文本

SELECT PC_AsText(pa) FROM patches LIMIT 1;  

{"pcid":1,"pts":[  
 [-126.99,45.01,1,0],[-126.98,45.02,2,0],[-126.97,45.03,3,0],  
 [-126.96,45.04,4,0],[-126.95,45.05,5,0],[-126.94,45.06,6,0],  
 [-126.93,45.07,7,0],[-126.92,45.08,8,0],[-126.91,45.09,9,0]  
]}
AI 代码解读

PC_Summary(p pcpatch) returns text (from 1.1.0)

返回json格式的文本

SELECT PC_Summary(pa) FROM patches LIMIT 1;  

{"pcid":1, "npts":9, "srid":4326, "compr":"dimensional","dims":[{"pos":0,"name":"X","size":4,"type":"int32_t","compr":"sigbits","stats":{"min":-126.99,"max":-126.91,"avg":-126.95}},{"pos":1,"name":"Y","size":4,"type":"int32_t","compr":"sigbits","stats":{"min":45.01,"max":45.09,"avg":45.05}},{"pos":2,"name":"Z","size":4,"type":"int32_t","compr":"sigbits","stats":{"min":1,"max":9,"avg":5}},{"pos":3,"name":"Intensity","size":2,"type":"uint16_t","compr":"rle","stats":{"min":0,"max":0,"avg":0}}]}
AI 代码解读

PC_Uncompress(p pcpatch) returns pcpatch

解压pcpatch

SELECT PC_Uncompress(pa) FROM patches   
   WHERE PC_NumPoints(pa) = 1;  

01010000000000000001000000C8CEFFFFF8110000102700000A00
AI 代码解读

PC_Union(p pcpatch[]) returns pcpatch

多个pcpatch聚合为一个pcpatch

-- Compare npoints(sum(patches)) to sum(npoints(patches))  
SELECT PC_NumPoints(PC_Union(pa)) FROM patches;  
SELECT Sum(PC_NumPoints(pa)) FROM patches;  

100
AI 代码解读

PC_Intersects(p1 pcpatch, p2 pcpatch) returns boolean

判断两个pcpatch是否有交叉

-- Patch should intersect itself  
SELECT PC_Intersects(  
         '01010000000000000001000000C8CEFFFFF8110000102700000A00'::pcpatch,  
         '01010000000000000001000000C8CEFFFFF8110000102700000A00'::pcpatch);  

t
AI 代码解读

PC_Explode(p pcpatch) returns SetOf[pcpoint]

将pcpatch转换为pcpoint

SELECT PC_AsText(PC_Explode(pa)), id   
FROM patches WHERE id = 7;  

              pc_astext               | id   
--------------------------------------+----  
 {"pcid":1,"pt":[-126.5,45.5,50,5]}   |  7  
 {"pcid":1,"pt":[-126.49,45.51,51,5]} |  7  
 {"pcid":1,"pt":[-126.48,45.52,52,5]} |  7  
 {"pcid":1,"pt":[-126.47,45.53,53,5]} |  7  
 {"pcid":1,"pt":[-126.46,45.54,54,5]} |  7  
 {"pcid":1,"pt":[-126.45,45.55,55,5]} |  7  
 {"pcid":1,"pt":[-126.44,45.56,56,5]} |  7  
 {"pcid":1,"pt":[-126.43,45.57,57,5]} |  7  
 {"pcid":1,"pt":[-126.42,45.58,58,5]} |  7  
 {"pcid":1,"pt":[-126.41,45.59,59,5]} |  7
AI 代码解读

PC_PatchAvg(p pcpatch, dimname text) returns numeric

求pcpatch中包含的某个维度的信息的平均值

SELECT PC_PatchAvg(pa, 'intensity')   
FROM patches WHERE id = 7;  

5.0000000000000000
AI 代码解读

PC_PatchMax(p pcpatch, dimname text) returns numeric

求pcpatch中包含的某个维度的信息的最大值

PC_PatchMin(p pcpatch, dimname text) returns numeric

求pcpatch中包含的某个维度的信息的最小值

PC_PatchAvg(p pcpatch) returns pcpoint (from 1.1.0)

求pcpatch中所有pcpoint的所有维度的平均值

PC_PatchMax(p pcpatch) returns pcpoint (from 1.1.0)

求pcpatch中所有pcpoint的所有维度的最大值

PC_PatchMin(p pcpatch) returns pcpoint (from 1.1.0)

求pcpatch中所有pcpoint的所有维度的最小值

PC_FilterGreaterThan(p pcpatch, dimname text, float8 value) returns pcpatch

返回pcpatch中在指定维度上大于指定值的pcpoint

SELECT PC_AsText(PC_FilterGreaterThan(pa, 'y', 45.57))   
FROM patches WHERE id = 7;  

 {"pcid":1,"pts":[[-126.42,45.58,58,5],[-126.41,45.59,59,5]]}
AI 代码解读

PC_FilterLessThan(p pcpatch, dimname text, float8 value) returns pcpatch

返回pcpatch中在指定维度上小于指定值的pcpoint

PC_FilterBetween(p pcpatch, dimname text, float8 value1, float8 value2) returns pcpatch

返回pcpatch中在指定维度上在指定范围的pcpoint

PC_FilterEquals(p pcpatch, dimname text, float8 value) returns pcpatch

返回pcpatch中在指定维度上等于指定值的pcpoint

PC_Compress(p pcpatch,global_compression_scheme text,compression_config text) returns pcpatch (from 1.1.0)

压缩pcpatch

Allowed global compression schemes are:  
auto -- determined by pcid  
ght -- no compression config supported  
laz -- no compression config supported  
dimensional configuration is a comma-separated list of per-dimension compressions from this list:  
  auto -- determined automatically, from values stats  
  zlib -- deflate compression  
  sigbits -- significant bits removal  
  rle -- run-length encoding
AI 代码解读

PC_PointN(p pcpatch, n int4) returns pcpoint

返回pcpatch中第n个pcpoint,正值从头开始计数,负值反向计数。


point cloud和PostGIS结合使用

CREATE EXTENSION postgis;  
CREATE EXTENSION pointcloud;  
CREATE EXTENSION pointcloud_postgis;
AI 代码解读

PC_Intersects(p pcpatch, g geometry) returns boolean

PC_Intersects(g geometry, p pcpatch) returns boolean

判断pcpatch和geometry是否有相交

SELECT PC_Intersects('SRID=4326;POINT(-126.451 45.552)'::geometry, pa)  
FROM patches WHERE id = 7;  

t
AI 代码解读

PC_Intersection(pcpatch, geometry) returns pcpatch

返回pcpatch中与geometry相交的点组成的一个新的pcpatch

SELECT PC_AsText(PC_Explode(PC_Intersection(  
      pa,   
      'SRID=4326;POLYGON((-126.451 45.552, -126.42 47.55, -126.40 45.552, -126.451 45.552))'::geometry  
)))  
FROM patches WHERE id = 7;  

             pc_astext                 
--------------------------------------  
 {"pcid":1,"pt":[-126.44,45.56,56,5]}  
 {"pcid":1,"pt":[-126.43,45.57,57,5]}  
 {"pcid":1,"pt":[-126.42,45.58,58,5]}  
 {"pcid":1,"pt":[-126.41,45.59,59,5]}
AI 代码解读

Geometry(pcpoint) returns geometry

pcpoint::geometry returns geometry

将pcpatch中的位置属性的信息转换为geometry类型

SELECT ST_AsText(PC_MakePoint(1, ARRAY[-127, 45, 124.0, 4.0])::geometry);  

POINT Z (-127 45 124)
AI 代码解读


point cloud的压缩

PostgreSQL point cloud,使用document输入时,可以指定压缩方法。
写法如下,

<pc:metadata>  
  <Metadata name="compression">dimensional</Metadata>  
</pc:metadata>
AI 代码解读

支持的压缩方法如下

None,   
  which stores points and patches as byte arrays using the type and formats described in the schema document.  

Dimensional,   
  which stores points the same as 'none' but stores patches as collections of dimensional data arrays, with an "appropriate" compression applied.   
Dimensional compression makes the most sense for smaller patch sizes, since small patches will tend to have more homogeneous dimensions.  

GHT or "GeoHash Tree",   
  which stores the points in a tree where each node stores the common values shared by all nodes below.   
For larger patch sizes, GHT should provide effective compression and performance for patch-wise operations.   
You must build Pointcloud with libght support to make use of the GHT compression.  

LAZ or "LASZip".   
  You must build Pointcloud with LAZPERF support to make use of the LAZ compression.  

If no compression is declared in <pc:metadata>, then a compression of "none" is assumed.
AI 代码解读


point cloud的二进制格式

The point and patch binary formats start with a common header, which provides:

endianness flag, to allow portability between architectures
pcid number, to look up the schema information in the pointcloud_formats table

Point Binary

byte:     endianness (1 = NDR, 0 = XDR)  
uint32:   pcid (key to POINTCLOUD_SCHEMAS)  
uchar[]:  pointdata (interpret relative to pcid)
AI 代码解读

The patch binary formats have additional standard header information:

the compression number, which indicates how to interpret the data
the number of points in the patch

Patch Binary (Uncompressed)

byte:         endianness (1 = NDR, 0 = XDR)  
uint32:       pcid (key to POINTCLOUD_SCHEMAS)  
uint32:       0 = no compression  
uint32:        npoints  
pointdata[]:  interpret relative to pcid
AI 代码解读

pcpatch的压缩格式的二进制表述请参考
https://github.com/pgpointcloud/pointcloud

如果将数据导入point cloud

有两种格式导入
From WKB

From PDAL

参考
https://github.com/pgpointcloud/pointcloud

pcpoint和pcpatch类型的SQL定义

CREATE TYPE pcpoint (  
    internallength = variable,  
    input = pcpoint_in,  
    output = pcpoint_out,  
    -- send = geometry_send,  
    -- receive = geometry_recv,  
    typmod_in = pc_typmod_in,  
    typmod_out = pc_typmod_out,  
    -- delimiter = ':',  
    -- alignment = double,  
    -- analyze = geometry_analyze,  
    storage = external -- do not try to compress it please  
);  

CREATE TYPE pcpatch (  
    internallength = variable,  
    input = pcpatch_in,  
    output = pcpatch_out,  
    -- send = geometry_send,  
    -- receive = geometry_recv,  
    typmod_in = pc_typmod_in,  
    typmod_out = pc_typmod_out,  
    -- delimiter = ':',  
    -- alignment = double,  
    -- analyze = geometry_analyze,  
    storage = external  
);  

CREATE TYPE pointcloud_abs (  
    internallength = 8,  
    input = pointcloud_abs_in,  
    output = pointcloud_abs_out,  
    alignment = double  
);
AI 代码解读


pcpoint 数据类型 输入输出 对应的C函数

PG_FUNCTION_INFO_V1(pcpoint_in);  
Datum pcpoint_in(PG_FUNCTION_ARGS)  
{  
    char *str = PG_GETARG_CSTRING(0);  
    /* Datum pc_oid = PG_GETARG_OID(1); Not needed. */  
    int32 typmod = 0;  
    uint32 pcid = 0;  
    PCPOINT *pt;  
    SERIALIZED_POINT *serpt = NULL;  

    if ( (PG_NARGS()>2) && (!PG_ARGISNULL(2)) )  
    {  
        typmod = PG_GETARG_INT32(2);  
        pcid = pcid_from_typmod(typmod);  
    }  

    /* Empty string. */  
    if ( str[0] == '\0' )  
    {  
        ereport(ERROR,(errmsg("pcpoint parse error - empty string")));  
    }  

    /* Binary or text form? Let's find out. */  
    if ( str[0] == '0' )  
    {  
        /* Hex-encoded binary */  
        pt = pc_point_from_hexwkb(str, strlen(str), fcinfo);  
        pcid_consistent(pt->schema->pcid, pcid);  
        serpt = pc_point_serialize(pt);  
        pc_point_free(pt);  
    }  
    else  
    {  
        ereport(ERROR,(errmsg("parse error - support for text format not yet implemented")));  
    }  

    if ( serpt ) PG_RETURN_POINTER(serpt);  
    else PG_RETURN_NULL();  
}  

PG_FUNCTION_INFO_V1(pcpoint_out);  
Datum pcpoint_out(PG_FUNCTION_ARGS)  
{  
    PCPOINT *pcpt = NULL;  
    PCSCHEMA *schema = NULL;  
    SERIALIZED_POINT *serpt = NULL;  
    char *hexwkb = NULL;  

    serpt = PG_GETARG_SERPOINT_P(0);  
    schema = pc_schema_from_pcid(serpt->pcid, fcinfo);  
    pcpt = pc_point_deserialize(serpt, schema);  
    hexwkb = pc_point_to_hexwkb(pcpt);  
    pc_point_free(pcpt);  
    PG_RETURN_CSTRING(hexwkb);  
}
AI 代码解读


pcpatch 数据类型 输入输出 对应的C函数

PG_FUNCTION_INFO_V1(pcpatch_in);  
Datum pcpatch_in(PG_FUNCTION_ARGS)  
{  
    char *str = PG_GETARG_CSTRING(0);  
    /* Datum geog_oid = PG_GETARG_OID(1); Not needed. */  
    uint32 typmod = 0, pcid = 0;  
    PCPATCH *patch;  
    SERIALIZED_PATCH *serpatch = NULL;  

    if ( (PG_NARGS()>2) && (!PG_ARGISNULL(2)) )  
    {  
        typmod = PG_GETARG_INT32(2);  
        pcid = pcid_from_typmod(typmod);  
    }  

    /* Empty string. */  
    if ( str[0] == '\0' )  
    {  
        ereport(ERROR,(errmsg("pcpatch parse error - empty string")));  
    }  

    /* Binary or text form? Let's find out. */  
    if ( str[0] == '0' )  
    {  
        /* Hex-encoded binary */  
        patch = pc_patch_from_hexwkb(str, strlen(str), fcinfo);  
        pcid_consistent(patch->schema->pcid, pcid);  
        serpatch = pc_patch_serialize(patch, NULL);  
        pc_patch_free(patch);  
    }  
    else  
    {  
        ereport(ERROR,(errmsg("parse error - support for text format not yet implemented")));  
    }  

    if ( serpatch ) PG_RETURN_POINTER(serpatch);  
    else PG_RETURN_NULL();  
}  

PG_FUNCTION_INFO_V1(pcpatch_out);  
Datum pcpatch_out(PG_FUNCTION_ARGS)  
{  
    PCPATCH *patch = NULL;  
    SERIALIZED_PATCH *serpatch = NULL;  
    char *hexwkb = NULL;  
    PCSCHEMA *schema = NULL;  

    serpatch = PG_GETARG_SERPATCH_P(0);  
    schema = pc_schema_from_pcid(serpatch->pcid, fcinfo);  
    patch = pc_patch_deserialize(serpatch, schema);  
    hexwkb = pc_patch_to_hexwkb(patch);  
    pc_patch_free(patch);  
    PG_RETURN_CSTRING(hexwkb);  
}
AI 代码解读


参考

https://en.wikipedia.org/wiki/Lidar
http://baike.baidu.com/view/2922098.htm
https://github.com/pgpointcloud/pointcloud
http://pointcloud.org/
https://en.wikipedia.org/wiki/Point_cloud
http://www.pdal.io/

相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
目录
打赏
0
0
0
0
20696
分享
相关文章
在Linux环境下备份Docker中的MySQL数据并传输到其他服务器以实现数据级别的容灾
以上就是在Linux环境下备份Docker中的MySQL数据并传输到其他服务器以实现数据级别的容灾的步骤。这个过程就像是一场接力赛,数据从MySQL数据库中接力棒一样传递到备份文件,再从备份文件传递到其他服务器,最后再传递回MySQL数据库。这样,即使在灾难发生时,我们也可以快速恢复数据,保证业务的正常运行。
74 28
【YashanDB知识库】MySQL迁移至崖山char类型数据自动补空格问题
**简介**:在MySQL迁移到崖山环境时,若字段类型为char(2),而应用存储的数据仅为&#39;0&#39;或&#39;1&#39;,查询时崖山会自动补空格。原因是mysql的sql_mode可能启用了PAD_CHAR_TO_FULL_LENGTH模式,导致保留CHAR类型尾随空格。解决方法是与应用确认数据需求,可将崖山环境中的char类型改为varchar类型以规避补空格问题,适用于所有版本。
【YashanDB知识库】字符集latin1的MySQL中文数据如何迁移到YashanDB
本文探讨了在使用YMP 23.2.1.3迁移MySQL Server字符集为latin1的中文数据至YashanDB时出现乱码的问题。问题根源在于MySQL latin1字符集存放的是实际utf8编码的数据,而YMP尚未支持此类场景。文章提供了两种解决方法:一是通过DBeaver直接迁移表数据;二是将MySQL表数据转换为Insert语句后手动插入YashanDB。同时指出,这两种方法适合单张表迁移,多表迁移可能存在兼容性问题,建议对问题表单独处理。
【YashanDB知识库】字符集latin1的MySQL中文数据如何迁移到YashanDB
Redis和Mysql如何保证数据⼀致?
1. 先更新Mysql,再更新Redis,如果更新Redis失败,可能仍然不⼀致 2. 先删除Redis缓存数据,再更新Mysql,再次查询的时候在将数据添加到缓存中 这种⽅案能解决1 ⽅案的问题,但是在⾼并发下性能较低,⽽且仍然会出现数据不⼀致的问题,⽐如线程1删除了 Redis缓存数据,正在更新Mysql,此时另外⼀个查询再查询,那么就会把Mysql中⽼数据⼜查到 Redis中 1. 使用MQ异步同步, 保证数据的最终一致性 我们项目中会根据业务情况 , 使用不同的方案来解决Redis和Mysql的一致性问题 : 1. 对于一些一致性要求不高的场景 , 不做处理例如 : 用户行为数据 ,
SpringBoot 通过集成 Flink CDC 来实时追踪 MySql 数据变动
通过详细的步骤和示例代码,您可以在 SpringBoot 项目中成功集成 Flink CDC,并实时追踪 MySQL 数据库的变动。
516 43
MySQL底层概述—4.InnoDB数据文件
本文介绍了InnoDB表空间文件结构及其组成部分,包括表空间、段、区、页和行。表空间是最高逻辑层,包含多个段;段由若干个区组成,每个区包含64个连续的页,页用于存储多条行记录。文章还详细解析了Page结构,分为通用部分(文件头与文件尾)、数据记录部分和页目录部分。此外,文中探讨了行记录格式,包括四种行格式(Redundant、Compact、Dynamic和Compressed),重点介绍了Compact行记录格式及其溢出机制。最后,文章解释了不同行格式的特点及应用场景,帮助理解InnoDB存储引擎的工作原理。
MySQL底层概述—4.InnoDB数据文件
基于SQL Server / MySQL进行百万条数据过滤优化方案
对百万级别数据进行高效过滤查询,需要综合使用索引、查询优化、表分区、统计信息和视图等技术手段。通过合理的数据库设计和查询优化,可以显著提升查询性能,确保系统的高效稳定运行。
85 9
【YashanDB 知识库】从 PostgreSQL 迁移到 YashanDB 如何进行数据行数比对
【YashanDB 知识库】从 PostgreSQL 迁移到 YashanDB 如何进行数据行数比对
【YashanDB知识库】从PostgreSQL迁移到YashanDB如何进行数据行数比对
本文介绍了通过Oracle视图`vsqlvsql_plan`分析SQL性能的方法。首先,可通过`plan_hash_value`从`v$sql_plan`获取SQL执行计划,结合示例展示了具体查询方式。文章还创建了一个UDF函数`REPEAT`用于格式化输出,便于阅读复杂执行计划。最后,通过实例展示了如何根据`plan_hash_value`获取SQL文本及其内存中的执行计划,帮助优化性能问题。
MySQL和SQLSugar百万条数据查询分页优化
在面对百万条数据的查询时,优化MySQL和SQLSugar的分页性能是非常重要的。通过合理使用索引、调整查询语句、使用缓存以及采用高效的分页策略,可以显著提高查询效率。本文介绍的技巧和方法,可以为开发人员在数据处理和查询优化中提供有效的指导,提升系统的性能和用户体验。掌握这些技巧后,您可以在处理海量数据时更加游刃有余。
226 9

相关产品

  • 云原生数据库 PolarDB
  • 云数据库 RDS PostgreSQL 版
  • 目录
    目录
    AI助理

    你好,我是AI助理

    可以解答问题、推荐解决方案等