
ValidationException when using a Table AggregateFunction together with ResultTypeQueryable

I am running a local Flink 1.6 cluster that is configured to use the flink-table jar (so my program's jar does not bundle flink-table). With the following code:

import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.operators.DataSource;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.typeutils.ResultTypeQueryable;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.java.BatchTableEnvironment;
import org.apache.flink.table.functions.AggregateFunction;
import org.apache.flink.types.Row;

import java.util.ArrayList;
import java.util.List;

public class JMain {

public static void main(String[] args) throws Exception {
    ExecutionEnvironment execEnv = ExecutionEnvironment.getExecutionEnvironment();
    BatchTableEnvironment tableEnv = TableEnvironment.getTableEnvironment(execEnv);

    tableEnv.registerFunction("enlist", new Enlister());

    DataSource<Tuple2<String, String>> source = execEnv.fromElements(
            new Tuple2<>("a", "1"),
            new Tuple2<>("a", "2"),
            new Tuple2<>("b", "3")
    );

    Table table = tableEnv.fromDataSet(source, "a, b")
            .groupBy("a")
            .select("enlist(a, b)");

    tableEnv.toDataSet(table, Row.class)
            .print();
}

public static class Enlister
        extends AggregateFunction<List<String>, ArrayList<String>>
        implements ResultTypeQueryable<List<String>>
{
    @Override
    public ArrayList<String> createAccumulator() {
        return new ArrayList<>();
    }

    @Override
    public List<String> getValue(ArrayList<String> acc) {
        return acc;
    }

    @SuppressWarnings("unused")
    public void accumulate(ArrayList<String> acc, String a, String b) {
        acc.add(a + ":" + b);
    }

    @SuppressWarnings("unused")
    public void merge(ArrayList<String> acc, Iterable<ArrayList<String>> it) {
        for (ArrayList<String> otherAcc : it) {
            acc.addAll(otherAcc);
        }
    }

    @SuppressWarnings("unused")
    public void resetAccumulator(ArrayList<String> acc) {
        acc.clear();
    }

    @Override
    public TypeInformation<List<String>> getProducedType() {
        return TypeInformation.of(new TypeHint<List<String>>(){});
    }
}

}
I get this strange exception:

org.apache.flink.table.api.ValidationException: Expression Enlister(List('a, 'b)) failed on input check: Given parameters do not match any signature.
Actual: (java.lang.String, java.lang.String)
Expected: (java.lang.String, java.lang.String)
However, if I do not implement ResultTypeQueryable, I get the expected output:

Starting execution of program
[b:3]
[a:1, a:2]
Program execution finished
Job with JobID 20497bd3efe44fab0092a05a8eb7d9de has finished.
Job Runtime: 270 ms
Accumulator Results:

  • 56e0e5a9466b84ae44431c9c4b7aad71 (java.util.ArrayList) [2 elements]

My actual use case needs ResultTypeQueryable, because without it I get this exception:

The return type of function ... could not be determined automatically,
due to type erasure. You can give type information hints by using the
returns(...) method on the result of the transformation call,
or by letting your function implement the 'ResultTypeQueryable' interface
Is there a way to fix this?
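For context, the type erasure that the last message refers to can be reproduced in plain Java without Flink. The sketch below is only an illustration: `ErasureDemo` and `Hint` are hypothetical names, and `Hint` is a simplified stand-in for the anonymous-subclass trick that `new TypeHint<List<String>>(){}` in the code above relies on, not Flink's actual implementation.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {

    // Hypothetical stand-in for a type hint: an anonymous subclass of a
    // generic class records its type argument in the class metadata.
    public abstract static class Hint<T> {
        public Type captured() {
            ParameterizedType superType =
                    (ParameterizedType) getClass().getGenericSuperclass();
            return superType.getActualTypeArguments()[0];
        }
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();

        // After erasure both lists share one runtime class, so the element
        // type cannot be recovered from an instance alone.
        System.out.println(strings.getClass() == numbers.getClass());

        // The anonymous subclass still carries List<String> as the type
        // argument of its generic superclass.
        Type captured = new Hint<List<String>>() {}.captured();
        System.out.println(captured.getTypeName());
    }
}
```

The first line printed is `true` and the second is `java.util.List<java.lang.String>`, which is why a framework needs an explicit hint when it only sees a raw `List` at runtime.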

Asked by flink小助手 on 2018-12-10 13:06:17
1 answer

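    Implementing ResultTypeQueryable is not the right approach in this case, and the exception message is misleading. Instead, override getResultType() and getAccumulatorType() on the AggregateFunction. The reason is that generics often cause problems when type information is generated for the serializers (because of Java's type erasure).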
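A minimal sketch of what this advice could look like for the Enlister from the question, assuming the Flink 1.6 Table API. Several things here are assumptions rather than part of the original answer: the wrapper class name `EnlisterFix` is hypothetical, the accumulator is declared as `List<String>` instead of `ArrayList<String>` so that one type description covers both accumulator and result, and `Types.LIST(Types.STRING)` is one possible choice of type information. It has not been run against a 1.6 cluster.

```java
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.table.functions.AggregateFunction;

import java.util.ArrayList;
import java.util.List;

public class EnlisterFix {

    // Same aggregate as in the question, but without ResultTypeQueryable.
    // Assumption: accumulator and result are both declared as List<String>.
    public static class Enlister
            extends AggregateFunction<List<String>, List<String>> {

        @Override
        public List<String> createAccumulator() {
            return new ArrayList<>();
        }

        @Override
        public List<String> getValue(List<String> acc) {
            return acc;
        }

        // accumulate, merge and resetAccumulator are not declared on the
        // base class, so they carry no @Override.
        public void accumulate(List<String> acc, String a, String b) {
            acc.add(a + ":" + b);
        }

        public void merge(List<String> acc, Iterable<List<String>> it) {
            for (List<String> otherAcc : it) {
                acc.addAll(otherAcc);
            }
        }

        public void resetAccumulator(List<String> acc) {
            acc.clear();
        }

        // Explicit type information for the result, replacing getProducedType().
        @Override
        public TypeInformation<List<String>> getResultType() {
            return Types.LIST(Types.STRING);
        }

        // Explicit type information for the accumulator.
        @Override
        public TypeInformation<List<String>> getAccumulatorType() {
            return Types.LIST(Types.STRING);
        }
    }
}
```

Registration would stay as in the question, for example `tableEnv.registerFunction("enlist", new EnlisterFix.Enlister());`.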

    2019-07-17 23:19:10