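Background
Protobuf and Flatbuffers are widely used serialization libraries that serve a large number of business scenarios. As those scenarios grow more complex, however, Protobuf/Flatbuffers increasingly fail to meet performance requirements and become a system bottleneck. Users are then forced to hand-write large amounts of serialization logic to squeeze out performance, which creates three problems:
- Hand-written serialization logic for a large number of fields is verbose and error-prone;
- Repeatedly hand-writing serialization logic makes development inefficient;
- Forward and backward compatibility is hard to handle when fields change on the sender or the receiver.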
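To make the three problems concrete, here is a minimal sketch of what hand-written serialization tends to look like. The `User` class and its fields are hypothetical and exist only for illustration; the code uses nothing but `java.nio.ByteBuffer` from the JDK and is not taken from Fury or from any real business code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ManualSerializationSketch {
    // Hypothetical business object; not part of Fury or the benchmark.
    static final class User {
        final int id;
        final long createdAt;
        final String name;

        User(int id, long createdAt, String name) {
            this.id = id;
            this.createdAt = createdAt;
            this.name = name;
        }
    }

    // Every field is written by hand, in a fixed order.
    static byte[] write(User user) {
        byte[] name = user.name.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(4 + 8 + 4 + name.length);
        buffer.putInt(user.id);
        buffer.putLong(user.createdAt);
        buffer.putInt(name.length);
        buffer.put(name);
        return buffer.array();
    }

    // The reader has to mirror the writer exactly: same fields, same order, same types.
    static User read(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        int id = buffer.getInt();
        long createdAt = buffer.getLong();
        byte[] name = new byte[buffer.getInt()];
        buffer.get(name);
        return new User(id, createdAt, new String(name, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        User copy = read(write(new User(7, 1_700_000_000_000L, "fury")));
        System.out.println(copy.id + " " + copy.createdAt + " " + copy.name);
    }
}
```

Even with three fields, `write` and `read` must be kept in lockstep. Adding a field on only one side breaks the other side, which is the compatibility problem named in the third bullet.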
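This article describes how Fury, the serialization framework we developed, addresses these problems. Fury is a general-purpose, high-performance, multi-language serialization framework based on JIT. It generates serialization code at runtime from the object's type and uses Unsafe-based high-performance memory operations. With optional forward/backward type compatibility, it provides fully automatic dynamic serialization and, in the benchmarks in this article, reaches up to 11.6x the throughput of Protobuf and up to 8.5x that of Flatbuffers.
Compared with Protobuf/Flatbuffers, Fury not only performs better but can also serialize native Java objects directly and dynamically, with no IDL definition or compilation step, which makes it easier to use.
This article first summarizes Fury's performance relative to Protobuf and Flatbuffers, then shows the code needed to switch quickly from Protobuf/Flatbuffers to Fury, and finally gives the detailed JMH benchmark data.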
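Fury/Protobuf/Flatbuffers performance comparison
The comparison is presented first as TPS charts: the vertical axis is the number of serializations per second (higher is better) and the horizontal axis is the JDK version. Fury reaches up to 11.6x the performance of Protobuf and up to 8.5x that of Flatbuffers:
The detailed figures are as follows:
- JDK 11:
  - SAMPLE serialization: Fury is 7.4x Flatbuffers and 11.6x Protobuf;
  - MEDIA_CONTENT serialization: Fury is 8.5x Flatbuffers and 5x Protobuf;
  - SAMPLE deserialization: Fury is 1.9x Flatbuffers and 5.9x Protobuf;
  - MEDIA_CONTENT deserialization: Fury is 3x Flatbuffers and 3.5x Protobuf;
- JDK 8:
  - SAMPLE serialization: Fury is 6.7x Flatbuffers and 9.8x Protobuf;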
  - MEDIA_CONTENT serialization: Fury is 6.7x Flatbuffers (3320120 vs. 497052 ops/s in the raw data) and 5.7x Protobuf;
  - SAMPLE deserialization: Fury is 2.6x Flatbuffers and 5.2x Protobuf;
  - MEDIA_CONTENT deserialization: Fury is 1.9x Flatbuffers and 2.2x Protobuf.
Getting started with Fury
Install the Fury dependency
<dependency>
  <groupId>io.fury</groupId>
  <artifactId>fury-core</artifactId>
  <version>0.11.0</version>
</dependency>
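Create a Fury instance
// Keep the instance in a global variable to avoid creating it repeatedly
Fury fury = Fury.builder()
    .withLanguage(Language.JAVA)
    // Enable shared/circular reference support; disable it if not needed, for better performance
    .withReferenceTracking(true)
    // Allow serialization of unregistered types
    // .withClassRegistrationRequired(false)
    // Enable int/long compression to reduce serialized size; disable it if not needed, for better performance
    // .withNumberCompressed(true)
    .withCompatibleMode(CompatibleMode.SCHEMA_CONSISTENT)
    // Enable forward/backward type compatibility, which allows the serializing and deserializing sides
    // to have different fields; disable it if not needed, for better performance
    // .withCompatibleMode(CompatibleMode.COMPATIBLE)
    // Enable asynchronous multi-threaded compilation
    .withAsyncCompilationEnabled(true)
    .build();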
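For multi-threaded use, build a thread-safe instance instead:
ThreadSafeFury fury = Fury.builder()
    .withLanguage(Language.JAVA)
    // Enable shared/circular reference support; disable it if not needed, for better performance
    .withReferenceTracking(true)
    // Allow serialization of unregistered types
    // .withClassRegistrationRequired(false)
    // Enable int/long compression to reduce serialized size; disable it if not needed, for better performance
    // .withNumberCompressed(true)
    .withCompatibleMode(CompatibleMode.SCHEMA_CONSISTENT)
    // Enable forward/backward type compatibility, which allows the serializing and deserializing sides
    // to have different fields; disable it if not needed, for better performance
    // .withCompatibleMode(CompatibleMode.COMPATIBLE)
    // Enable asynchronous multi-threaded compilation
    .withAsyncCompilationEnabled(true)
    .buildThreadSafeFury();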
Serialize any object
byte[] bytes = fury.serialize(object);
System.out.println(fury.deserialize(bytes));
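The `withReferenceTracking(true)` option in the builder concerns shared and circular references. As an illustration of what preserving them means, the sketch below round-trips a small object graph. It deliberately uses the JDK's built-in `ObjectOutputStream`, not Fury, so that it runs without any dependency; the `Node` and `Pair` classes are hypothetical. It shows the behavior a reference-tracking serializer is expected to have and says nothing about how Fury implements it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SharedReferenceSketch {
    // Hypothetical graph node that may point to itself.
    static final class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        Node next;
    }

    // Hypothetical holder whose two fields may point to the same Node.
    static final class Pair implements Serializable {
        private static final long serialVersionUID = 1L;
        Node left;
        Node right;
    }

    // Serialize to a byte array and read the object back (JDK serialization, not Fury).
    @SuppressWarnings("unchecked")
    static <T> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Pair pair = new Pair();
        pair.left = new Node();
        pair.right = pair.left;      // shared reference
        pair.left.next = pair.left;  // circular reference

        Pair copy = roundTrip(pair);
        System.out.println(copy.left == copy.right);      // is the shared reference preserved?
        System.out.println(copy.left.next == copy.left);  // is the cycle preserved?
    }
}
```

Tracking references costs extra bookkeeping per object, which is consistent with the advice in the builder comments to disable the option when the data contains no shared or circular references.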
JMH benchmark
Test environment
Machine: MacBook Pro (16-inch, 2019)
CPU: 2.6 GHz 6-Core Intel Core i7
Memory: 16 GB 2667 MHz DDR4
JMH version: 1.33
JDK version:
- JDK 1.8.0_292, OpenJDK 64-Bit Server VM, 25.292-b10
- JDK 11.0.15, OpenJDK 64-Bit Server VM, 11.0.15+10-LTS
Test data
syntax = "proto3";
package protobuf;

option java_package = "io.fury.integration_tests.state.generated";
option java_outer_classname = "ProtoMessage";

message Sample {
  int32 int_value = 1;
  int64 long_value = 2;
  float float_value = 3;
  double double_value = 4;
  int32 short_value = 5;
  int32 char_value = 6;
  bool boolean_value = 7;
  int32 int_value_boxed = 8;
  int64 long_value_boxed = 9;
  float float_value_boxed = 10;
  double double_value_boxed = 11;
  int32 short_value_boxed = 12;
  int32 char_value_boxed = 13;
  bool boolean_value_boxed = 14;
  repeated int32 int_array = 15;
  repeated int64 long_array = 16;
  repeated float float_array = 17;
  repeated double double_array = 18;
  repeated int32 short_array = 19;
  repeated int32 char_array = 20;
  repeated bool boolean_array = 21;
  string string = 22;
}

message MediaContent {
  Media media = 1;
  repeated Image images = 2;
}

message Media {
  string uri = 1;
  optional string title = 2;
  int32 width = 3;
  int32 height = 4;
  string format = 5;
  int64 duration = 6;
  int64 size = 7;
  int32 bitrate = 8;
  bool has_bitrate = 9;
  repeated string persons = 10;
  Player player = 11;
  string copyright = 12;
}

message Image {
  string uri = 1;
  optional string title = 2; // Can be null.
  int32 width = 3;
  int32 height = 4;
  Size size = 5;
  optional Media media = 6; // Can be null.
}

enum Player {
  JAVA = 0;
  FLASH = 1;
}

enum Size {
  SMALL = 0;
  LARGE = 1;
}
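Because Fury serializes native Java objects without an IDL, the `MediaContent` part of this schema has a natural counterpart as ordinary Java classes. The sketch below is a hypothetical mapping written for illustration only: the field names follow the proto definition, but the classes actually used by the benchmark are not shown in this article and may differ.

```java
import java.util.ArrayList;
import java.util.List;

public class MediaModelSketch {
    enum Player { JAVA, FLASH }

    enum Size { SMALL, LARGE }

    // Hypothetical plain-Java counterpart of the Media message.
    static class Media {
        String uri;
        String title;      // optional in the schema, so it may be null
        int width;
        int height;
        String format;
        long duration;
        long size;
        int bitrate;
        boolean hasBitrate;
        List<String> persons = new ArrayList<>();
        Player player;
        String copyright;
    }

    // Hypothetical plain-Java counterpart of the Image message.
    static class Image {
        String uri;
        String title;      // may be null
        int width;
        int height;
        Size size;
        Media media;       // may be null
    }

    // Hypothetical plain-Java counterpart of the MediaContent message.
    static class MediaContent {
        Media media;
        List<Image> images = new ArrayList<>();
    }

    public static void main(String[] args) {
        MediaContent content = new MediaContent();
        content.media = new Media();
        content.media.uri = "http://example.com/video.mp4";
        content.media.player = Player.JAVA;

        Image image = new Image();
        image.uri = "http://example.com/thumb.png";
        image.size = Size.SMALL;
        content.images.add(image);

        System.out.println(content.images.size() + " image(s), title=" + image.title);
    }
}
```

With classes like these there is no generated code and no compile step for a schema file; an instance of `MediaContent` would be the `object` passed to `fury.serialize(object)`.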
JMH benchmark code
JMH parameters:
- Serialization: io.*.integration_tests.UserTypeSerializeSuite.*buffer* -f 3 -wi 5 -i 5 -t 1 -w 2s -r 2s -rf csv
- Deserialization: io.*.integration_tests.UserTypeDeserializeSuite.*buffer.* -f 3 -wi 5 -i 5 -t 1 -w 2s -r 2s -rf csv
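For readers less familiar with the JMH command line: `-f` is the number of forks, `-wi` the number of warmup iterations, `-i` the number of measurement iterations, `-t` the number of threads, `-w` and `-r` the duration of each warmup and measurement iteration, and `-rf` the result file format. The small sketch below (plain Java, with hypothetical helper names) works out what these settings imply for the sample count and for the minimum running time of one benchmark and parameter combination.

```java
public class JmhRunSketch {
    // Measurement samples JMH reports in the "Cnt" column: forks x measurement iterations.
    static int sampleCount(int forks, int measurementIterations) {
        return forks * measurementIterations;
    }

    // Lower bound in seconds for one benchmark/parameter combination,
    // ignoring JVM start-up and JMH overhead.
    static int minSeconds(int forks, int warmupIterations, int warmupSeconds,
                          int measurementIterations, int measurementSeconds) {
        return forks * (warmupIterations * warmupSeconds
                + measurementIterations * measurementSeconds);
    }

    public static void main(String[] args) {
        // -f 3 -wi 5 -i 5 -w 2s -r 2s, as in the parameters above.
        System.out.println("samples: " + sampleCount(3, 5));
        System.out.println("min seconds: " + minSeconds(3, 5, 2, 5, 2));
    }
}
```

Three forks of five measurement iterations give 15 samples, which matches the `Cnt` value of 15 reported for the Flatbuffers and Protobuf rows in the results.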
Detailed results
JDK 11 serialization benchmark
Results:
- SAMPLE serialization: Fury is 7.4x Flatbuffers and 11.6x Protobuf;
- MEDIA_CONTENT serialization: Fury is 8.5x Flatbuffers and 5x Protobuf.
Benchmark | Mode | Samples | Tps | Unit | bufferType | objectType | references | Lib
serialize | thrpt | 15 | 8392449.276432 | ops/s | array | SAMPLE | False | Fury
serialize | thrpt | 15 | 4763064.092995 | ops/s | array | MEDIA_CONTENT | False | Fury
serialize | thrpt | 15 | 6988177.459275 | ops/s | array | SAMPLE | False | Fury_compatible
serialize | thrpt | 15 | 3776080.754596 | ops/s | array | MEDIA_CONTENT | False | Fury_compatible
serialize | thrpt | 15 | 1136577.337596 | ops/s | array | SAMPLE | False | Flatbuffers
serialize | thrpt | 15 | 558153.211617 | ops/s | array | MEDIA_CONTENT | False | Flatbuffers
serialize | thrpt | 15 | 725529.918252 | ops/s | array | SAMPLE | False | Protobuffers
serialize | thrpt | 15 | 959661.012576 | ops/s | array | MEDIA_CONTENT | False | Protobuffers
Raw data:
Benchmark (bufferType) (objectType) (references) Mode Cnt Score Error Units
UserTypeSerializeSuite.fury_serialize array SAMPLE false thrpt 15 8392449.276 ± 719314.595 ops/s
UserTypeSerializeSuite.fury_serialize array MEDIA_CONTENT false thrpt 15 4763064.093 ± 570576.187 ops/s
UserTypeSerializeSuite.fury_serialize_compatible array SAMPLE false thrpt 15 6988177.459 ± 499285.797 ops/s
UserTypeSerializeSuite.fury_serialize_compatible array MEDIA_CONTENT false thrpt 15 3776080.755 ± 564717.770 ops/s
UserTypeSerializeSuite.flatbuffers_serialize array SAMPLE false thrpt 15 1136577.338 ± 112188.874 ops/s
UserTypeSerializeSuite.flatbuffers_serialize array MEDIA_CONTENT false thrpt 15 558153.212 ± 136300.879 ops/s
UserTypeSerializeSuite.protobuffers_serialize array SAMPLE false thrpt 15 725529.918 ± 105701.221 ops/s
UserTypeSerializeSuite.protobuffers_serialize array MEDIA_CONTENT false thrpt 15 959661.013 ± 60868.801 ops/s
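The multiples quoted in the results are simply ratios of the throughput scores. As a worked check, the sketch below recomputes the JDK 11 serialization ratios from the raw scores listed above; the class and method names are hypothetical and only the JDK is used.

```java
import java.util.Locale;

public class SpeedupSketch {
    // Ratio of two throughput scores, rounded to one decimal place.
    static double speedup(double furyOps, double otherOps) {
        return Math.round(furyOps / otherOps * 10.0) / 10.0;
    }

    public static void main(String[] args) {
        // Scores in ops/s, copied from the JDK 11 serialization raw data.
        double furySample = 8392449.276;
        double furyMedia = 4763064.093;
        double flatSample = 1136577.338;
        double flatMedia = 558153.212;
        double protoSample = 725529.918;
        double protoMedia = 959661.013;

        System.out.printf(Locale.ROOT, "SAMPLE: %.1fx Flatbuffers, %.1fx Protobuf%n",
                speedup(furySample, flatSample), speedup(furySample, protoSample));
        System.out.printf(Locale.ROOT, "MEDIA_CONTENT: %.1fx Flatbuffers, %.1fx Protobuf%n",
                speedup(furyMedia, flatMedia), speedup(furyMedia, protoMedia));
    }
}
```

The same calculation can be applied to the other three result sets. Note that the ratios compare mean scores only and do not account for the error margins in the `Error` column.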
JDK 8 serialization benchmark
Results:
- SAMPLE serialization: Fury is 6.7x Flatbuffers and 9.8x Protobuf;
- MEDIA_CONTENT serialization: Fury is 6.7x Flatbuffers (3320120 vs. 497052 ops/s) and 5.7x Protobuf.
Benchmark | Mode | Samples | Tps | Unit | bufferType | objectType | references | Lib
serialize | thrpt | 15 | 7787459.437293 | ops/s | array | SAMPLE | False | Fury
serialize | thrpt | 15 | 3320120.274126 | ops/s | array | MEDIA_CONTENT | False | Fury
serialize | thrpt | 15 | 6728493.639114 | ops/s | array | SAMPLE | False | Fury_compatible
serialize | thrpt | 15 | 2357805.065032 | ops/s | array | MEDIA_CONTENT | False | Fury_compatible
serialize | thrpt | 15 | 1166516.473685 | ops/s | array | SAMPLE | False | Flatbuffers
serialize | thrpt | 15 | 497051.925546 | ops/s | array | MEDIA_CONTENT | False | Flatbuffers
serialize | thrpt | 15 | 796889.828561 | ops/s | array | SAMPLE | False | Protobuffers
serialize | thrpt | 15 | 581706.298343 | ops/s | array | MEDIA_CONTENT | False | Protobuffers
Raw data:
Benchmark (bufferType) (objectType) (references) Mode Cnt Score Error Units
UserTypeSerializeSuite.flatbuffers_serialize array SAMPLE false thrpt 15 1166516.474 ± 148035.186 ops/s
UserTypeSerializeSuite.flatbuffers_serialize array MEDIA_CONTENT false thrpt 15 497051.926 ± 67025.722 ops/s
UserTypeSerializeSuite.protobuffers_serialize array SAMPLE false thrpt 15 796889.829 ± 81090.299 ops/s
UserTypeSerializeSuite.protobuffers_serialize array MEDIA_CONTENT false thrpt 15 581706.298 ± 45340.615 ops/s
UserTypeSerializeSuite.fury_serialize array SAMPLE false thrpt 15 7787459.437 ± 690347.739 ops/s
UserTypeSerializeSuite.fury_serialize array MEDIA_CONTENT false thrpt 15 3320120.274 ± 317451.967 ops/s
UserTypeSerializeSuite.fury_serialize_compatible array SAMPLE false thrpt 15 6728493.639 ± 724034.571 ops/s
UserTypeSerializeSuite.fury_serialize_compatible array MEDIA_CONTENT false thrpt 15 2357805.065 ± 146698.800 ops/s
JDK 11 deserialization benchmark
Results:
- SAMPLE deserialization: Fury is 1.9x Flatbuffers and 5.9x Protobuf;
- MEDIA_CONTENT deserialization: Fury is 3x Flatbuffers and 3.5x Protobuf.
Benchmark | Mode | Samples | Tps | Unit | bufferType | objectType | references | Lib
deserialize | thrpt | 15 | 3269220.525856 | ops/s | array | SAMPLE | False | Fury
deserialize | thrpt | 15 | 2790613.161908 | ops/s | array | MEDIA_CONTENT | False | Fury
deserialize | thrpt | 15 | 3200392.443117 | ops/s | array | SAMPLE | False | Fury_deserialize_compatible
deserialize | thrpt | 15 | 2084558.335255 | ops/s | array | MEDIA_CONTENT | False | Fury_deserialize_compatible
deserialize | thrpt | 15 | 1706566.911191 | ops/s | array | SAMPLE | False | Flatbuffers
deserialize | thrpt | 15 | 914600.559409 | ops/s | array | MEDIA_CONTENT | False | Flatbuffers
deserialize | thrpt | 15 | 550393.656196 | ops/s | array | SAMPLE | False | Protobuffers
deserialize | thrpt | 15 | 804716.689997 | ops/s | array | MEDIA_CONTENT | False | Protobuffers
Raw data:
Benchmark (bufferType) (objectType) (references) Mode Cnt Score Error Units
UserTypeDeserializeSuite.flatbuffers_deserialize array SAMPLE false thrpt 15 1706566.911 ± 235634.962 ops/s
UserTypeDeserializeSuite.flatbuffers_deserialize array MEDIA_CONTENT false thrpt 15 914600.559 ± 399501.643 ops/s
UserTypeDeserializeSuite.protobuffers_deserialize array SAMPLE false thrpt 15 550393.656 ± 76434.813 ops/s
UserTypeDeserializeSuite.protobuffers_deserialize array MEDIA_CONTENT false thrpt 15 804716.690 ± 88169.705 ops/s
UserTypeDeserializeSuite.fury_deserialize array SAMPLE false thrpt 15 3269220.526 ± 257137.052 ops/s
UserTypeDeserializeSuite.fury_deserialize array MEDIA_CONTENT false thrpt 15 2790613.162 ± 334259.574 ops/s
UserTypeDeserializeSuite.fury_deserialize array STRUCT false thrpt 15 7372205.565 ± 225718.455 ops/s
UserTypeDeserializeSuite.fury_deserialize array STRUCT2 false thrpt 15 1382060.978 ± 57011.423 ops/s
UserTypeDeserializeSuite.fury_deserialize_compatible array SAMPLE false thrpt 15 3200392.443 ± 191460.600 ops/s
UserTypeDeserializeSuite.fury_deserialize_compatible array MEDIA_CONTENT false thrpt 15 2084558.335 ± 217782.898 ops/s
UserTypeDeserializeSuite.fury_deserialize_compatible array STRUCT false thrpt 15 2995349.308 ± 124593.720 ops/s
UserTypeDeserializeSuite.fury_deserialize_compatible array STRUCT2 false thrpt 15 1079316.796 ± 70189.820 ops/s
JDK 8 deserialization benchmark
Results:
- SAMPLE deserialization: Fury is 2.6x Flatbuffers and 5.2x Protobuf;
- MEDIA_CONTENT deserialization: Fury is 1.9x Flatbuffers and 2.2x Protobuf.
Benchmark | Mode | Samples | Tps | Unit | bufferType | objectType | references | Lib
deserialize | thrpt | 9 | 2378639.808937 | ops/s | array | SAMPLE | False | Fury
deserialize | thrpt | 9 | 1800751.571585 | ops/s | array | MEDIA_CONTENT | False | Fury
deserialize | thrpt | 9 | 2229307.602252 | ops/s | array | SAMPLE | False | Fury_deserialize_compatible
deserialize | thrpt | 9 | 1520342.611135 | ops/s | array | MEDIA_CONTENT | False | Fury_deserialize_compatible
deserialize | thrpt | 15 | 913254.562922 | ops/s | array | SAMPLE | False | Flatbuffers
deserialize | thrpt | 15 | 956873.841489 | ops/s | array | MEDIA_CONTENT | False | Flatbuffers
deserialize | thrpt | 15 | 455276.384332 | ops/s | array | SAMPLE | False | Protobuffers
deserialize | thrpt | 15 | 819117.894807 | ops/s | array | MEDIA_CONTENT | False | Protobuffers
Raw data:
Benchmark (bufferType) (objectType) (references) Mode Cnt Score Error Units
UserTypeDeserializeSuite.flatbuffers_deserialize array SAMPLE false thrpt 15 913254.563 ± 38973.187 ops/s
UserTypeDeserializeSuite.flatbuffers_deserialize array MEDIA_CONTENT false thrpt 15 956873.841 ± 148809.901 ops/s
UserTypeDeserializeSuite.protobuffers_deserialize array SAMPLE false thrpt 15 455276.384 ± 63507.576 ops/s
UserTypeDeserializeSuite.protobuffers_deserialize array MEDIA_CONTENT false thrpt 15 819117.895 ± 79179.165 ops/s
UserTypeDeserializeSuite.fury_deserialize array SAMPLE false thrpt 9 2378639.809 ± 926405.942 ops/s
UserTypeDeserializeSuite.fury_deserialize array MEDIA_CONTENT false thrpt 9 1800751.572 ± 308042.100 ops/s
UserTypeDeserializeSuite.fury_deserialize array STRUCT false thrpt 9 6853906.995 ± 1416593.099 ops/s
UserTypeDeserializeSuite.fury_deserialize array STRUCT2 false thrpt 9 1435421.480 ± 255227.618 ops/s
UserTypeDeserializeSuite.fury_deserialize_compatible array SAMPLE false thrpt 9 2229307.602 ± 338410.918 ops/s
UserTypeDeserializeSuite.fury_deserialize_compatible array MEDIA_CONTENT false thrpt 9 1520342.611 ± 208019.019 ops/s
UserTypeDeserializeSuite.fury_deserialize_compatible array STRUCT false thrpt 9 2579101.454 ± 741134.380 ops/s
UserTypeDeserializeSuite.fury_deserialize_compatible array STRUCT2 false thrpt 9 1084974.752 ± 212686.247 ops/s
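Open source
Fury has completed the internal open-source approval process and will be open-sourced soon. If you have a use case in an open-source community, feel free to message @慕白 directly.
Contact us
To learn more about Fury, or if you have any questions about using it, message us on DingTalk or join the Fury user group (group number: 35683646).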