标签
PostgreSQL , Oracle , PL/SQL , 聚合函数 , 自定义聚合函数
背景
Oracle的自定义聚合函数的定义方法,在创建函数是,使用AGGREGATE USING Clause关键词。
AGGREGATE USING Clause
Specify AGGREGATE USING to identify this function as an aggregate function, or one that evaluates a group of rows and returns a single row. You can specify aggregate functions in the select list, HAVING clause, and ORDER BY clause.
When you specify a user-defined aggregate function in a query, you can treat it as an analytic function (one that operates on a query result set). To do so, use the OVER analytic_clause syntax available for built-in analytic functions. See "Analytic Functions" for syntax and semantics.
In the USING clause, specify the name of the implementation type of the function. The implementation type must be an object type containing the implementation of the ODCIAggregate routines. If you do not specify schema, Oracle Database assumes that the implementation type is in your own schema.
Restriction on Creating Aggregate Functions
If you specify this clause, you can specify only one input argument for the function.
自定义的聚合函数,与普通聚合函数一样,可以用于聚合、KEEP等操作SQL中。
PostgreSQL 聚合函数用法
https://www.postgresql.org/docs/10/static/functions-aggregate.html
《PostgreSQL aggregate function 1 : General-Purpose Aggregate Functions》
《PostgreSQL aggregate function 2 : Aggregate Functions for Statistics》
《PostgreSQL aggregate function 3 : Aggregate Functions for Ordered-Set》
《PostgreSQL aggregate function 4 : Hypothetical-Set Aggregate Functions》
PostgreSQL 自定义聚合函数
1、自定义普通聚合函数:
《PostgreSQL aggregate function customize》
2、自定义并行聚合函数:
《PostgreSQL Oracle 兼容性之 - 自定义并行聚合函数 PARALLEL_ENABLE AGGREGATE》
《PostgreSQL 10 自定义并行计算聚合函数的原理与实践 - (含array_agg合并多个数组为单个一元数组的例子)》
3、在postgres-xc中自定义多阶段分布式并行聚合函数:
《Postgres-XC customized aggregate introduction》
4、在greenplum中自定义多阶段分布式并行聚合函数:
《Greenplum 最佳实践 - 估值插件hll的使用(以及hll分式聚合函数优化)》
语法:
https://www.postgresql.org/docs/10/static/xaggr.html
https://www.postgresql.org/docs/10/static/sql-createaggregate.html
例子
1、普通聚合函数
目标:将结果聚合,并按某个字段输出为有序数组。
测试表
create table recommendation_mpt (user_id int8, app_id numeric, rating numeric);
insert into recommendation_mpt select generate_series(1,10000), generate_series(1,41), random();
聚合过程中最后一步,数据排序处理函数
create or replace function final_array_agg (i_text text) returns text[] as $$
declare
result text[];
begin
select array_agg(app_id||'_'||rating) into result from
(select split_part(i,'_',1) as app_id,
split_part(i,'_',2) as rating -- 按它排序
from
regexp_split_to_table(i_text,',') t(i)
order by 2 desc) t;
return result;
end;
$$ language plpgsql strict;
create aggregate agg_append (text) (
sfunc = textcat,
stype = text,
FINALFUNC = final_array_agg);
select agg_append(app_id||'_'||rating||',') from recommendation_mpt;
postgres=# select agg_append(app_id||'_'||rating||',') from recommendation_mpt;
-[ RECORD 1 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
agg_append | {24_0.91642474103719,3_0.86293408786878,21_0.824714167509228,41_0.823069900739938,28_0.82022201269865,17_0.800656013656408,33_0.764910507481545,25_0.760074479039758,30_0.757540909573436,13_0.707890411838889,20_0.704598274547607,5_0.675859381910414,40_0.674109968356788,37_0.671832457184792,31_0.666503502987325,35_0.641303175128996,23_0.640862574335188,12_0.639161774888635,10_0.634707988705486,1_0.630520141683519,39_0.589550276752561,7_0.547058736439794,4_0.541917834896594,15_0.535650313366205,34_0.529437590856105,29_0.468865198083222,14_0.456227377057076,36_0.440769889391959,27_0.431988585740328,26_0.408387354109436,22_0.359426050912589,18_0.329283143393695,19_0.266014957334846,38_0.188361912034452,16_0.150509809609503,8_0.148780386894941,6_0.142394866328686,11_0.116577256470919,32_0.0993853402324021,2_0.00736959790810943,9_0.00227751117199659,_}
当然,这个实际上现在PG已经内置了语法来支持,上面只是演示一下自定义聚合函数。
内置ORDER BY,通过string_agg进行聚合:
postgres=# select string_agg(app_id||'_'||rating, ',' order by rating desc) from recommendation_mpt ;
-[ RECORD 1 ]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
string_agg | 24_0.91642474103719,3_0.86293408786878,21_0.824714167509228,41_0.823069900739938,28_0.82022201269865,17_0.800656013656408,33_0.764910507481545,25_0.760074479039758,30_0.757540909573436,13_0.707890411838889,20_0.704598274547607,5_0.675859381910414,40_0.674109968356788,37_0.671832457184792,31_0.666503502987325,35_0.641303175128996,23_0.640862574335188,12_0.639161774888635,10_0.634707988705486,1_0.630520141683519,39_0.589550276752561,7_0.547058736439794,4_0.541917834896594,15_0.535650313366205,34_0.529437590856105,29_0.468865198083222,14_0.456227377057076,36_0.440769889391959,27_0.431988585740328,26_0.408387354109436,22_0.359426050912589,18_0.329283143393695,19_0.266014957334846,38_0.188361912034452,16_0.150509809609503,8_0.148780386894941,6_0.142394866328686,11_0.116577256470919,32_0.0993853402324021,2_0.00736959790810943,9_0.00227751117199659
参考
https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_5009.htm
https://www.postgresql.org/docs/10/static/xaggr.html
《PostgreSQL aggregate function customize》
《PostgreSQL Oracle 兼容性之 - 自定义并行聚合函数 PARALLEL_ENABLE AGGREGATE》
《PostgreSQL 10 自定义并行计算聚合函数的原理与实践 - (含array_agg合并多个数组为单个一元数组的例子)》
《Postgres-XC customized aggregate introduction》