Apache Zeppelin系列教程第六篇——Zengine调用Interpreter原理分析

简介: Apache Zeppelin系列教程第六篇——Zengine调用Interpreter原理分析

前文介绍jdbc interpreter和interpreter模块交互代码,本篇文章主要分析Zengine调用Interpreter模块代码。

介绍完这篇文章之后,我们即可将paragraph run的流程串起来(后面会将整个流程进行串讲)

同样,来看下这个测试类

zeppelin-zengine/src/test/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterTest.java

  @Test
  public void testFIFOScheduler() throws InterruptedException, InterpreterException {
    LOGGER.info("===testFIFOScheduler====");
    interpreterSetting.getOption().setPerUser(InterpreterOption.SHARED);
    // by default SleepInterpreter would use FIFOScheduler
    LOGGER.info("===getInterpreter====");
    final Interpreter interpreter1 = interpreterSetting.getInterpreter("user1", note1Id, "sleep");
    LOGGER.info("===createDummyInterpreterContext====");
    final InterpreterContext context1 = createDummyInterpreterContext();
    // run this dummy interpret method first to launch the RemoteInterpreterProcess to avoid the
    // time overhead of launching the process.
    LOGGER.info("111");
    LOGGER.info("=====name:{}=======",interpreter1.getClassName());
    System.out.println(interpreter1.getClassName());
    interpreter1.interpret("10101", context1);
    LOGGER.info("222");
    Thread thread1 = new Thread() {
      @Override
      public void run() {
        try {
          assertEquals(Code.SUCCESS, interpreter1.interpret("100", context1).code());
        } catch (InterpreterException e) {
          e.printStackTrace();
          fail();
        }
      }
    };
    Thread thread2 = new Thread() {
      @Override
      public void run() {
        try {
          assertEquals(Code.SUCCESS, interpreter1.interpret("100", context1).code());
        } catch (InterpreterException e) {
          e.printStackTrace();
          fail();
        }
      }
    };
    long start = System.currentTimeMillis();
    thread1.start();
    thread2.start();
    thread1.join();
    thread2.join();
    long end = System.currentTimeMillis();
    assertTrue((end - start) >= 200);
  }

可以看下这个测试方法,这边加了一些日志

RemoteInterpreterTest 继承 AbstractInterpreterTest 里面的抽象类,会先执行setUp方法对读取配置文件信息interpreter 进行初始化

核心主要是执行RemoteInterpreter里面的 interpret 方法,

  @Override
  public InterpreterResult interpret(final String st, final InterpreterContext context)
      throws InterpreterException {
    LOGGER.info("st:\n{}", st);
    if (LOGGER.isDebugEnabled()) {
      LOGGER.debug("st:\n{}", st);
    }
    final FormType form = getFormType();
    RemoteInterpreterProcess interpreterProcess = null;
    try {
      interpreterProcess = getOrCreateInterpreterProcess();
    } catch (IOException e) {
      throw new InterpreterException(e);
    }
    if (!interpreterProcess.isRunning()) {
      return new InterpreterResult(InterpreterResult.Code.ERROR,
              "Interpreter process is not running\n" + interpreterProcess.getErrorMessage());
    }
    return interpreterProcess.callRemoteFunction(client -> {
          RemoteInterpreterResult remoteResult = client.interpret(
              sessionId, className, st, convert(context));
          Map<String, Object> remoteConfig = (Map<String, Object>) GSON.fromJson(
              remoteResult.getConfig(), new TypeToken<Map<String, Object>>() {
              }.getType());
          context.getConfig().clear();
          if (remoteConfig != null) {
            context.getConfig().putAll(remoteConfig);
          }
          GUI currentGUI = context.getGui();
          GUI currentNoteGUI = context.getNoteGui();
          if (form == FormType.NATIVE) {
            GUI remoteGui = GUI.fromJson(remoteResult.getGui());
            GUI remoteNoteGui = GUI.fromJson(remoteResult.getNoteGui());
            currentGUI.clear();
            currentGUI.setParams(remoteGui.getParams());
            currentGUI.setForms(remoteGui.getForms());
            currentNoteGUI.setParams(remoteNoteGui.getParams());
            currentNoteGUI.setForms(remoteNoteGui.getForms());
          } else if (form == FormType.SIMPLE) {
            final Map<String, Input> currentForms = currentGUI.getForms();
            final Map<String, Object> currentParams = currentGUI.getParams();
            final GUI remoteGUI = GUI.fromJson(remoteResult.getGui());
            final Map<String, Input> remoteForms = remoteGUI.getForms();
            final Map<String, Object> remoteParams = remoteGUI.getParams();
            currentForms.putAll(remoteForms);
            currentParams.putAll(remoteParams);
          }
          return convert(remoteResult);
        }
    );
  }

其中getOrCreateInterpreterProcess()一路点下去 最终是去调用zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/remote/ExecRemoteInterpreterProcess.java 里面的start 方法,通过 commons-exec命令执行shell 或者cmd 脚本(bin/interpreter.sh) 启动一个独立的进程,shell 脚本里面具体执行的类(org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer),和前一篇文章interpreter 原理分析相呼应

  @Override
  public void start(String userName) throws IOException {
    // start server process
    CommandLine cmdLine = CommandLine.parse(interpreterRunner);
    cmdLine.addArgument("-d", false);
    cmdLine.addArgument(getInterpreterDir(), false);
    cmdLine.addArgument("-c", false);
    cmdLine.addArgument(getIntpEventServerHost(), false);
    cmdLine.addArgument("-p", false);
    cmdLine.addArgument(String.valueOf(intpEventServerPort), false);
    cmdLine.addArgument("-r", false);
    cmdLine.addArgument(getInterpreterPortRange(), false);
    cmdLine.addArgument("-i", false);
    cmdLine.addArgument(getInterpreterGroupId(), false);
    if (isUserImpersonated() && !userName.equals("anonymous")) {
      cmdLine.addArgument("-u", false);
      cmdLine.addArgument(userName, false);
    }
    cmdLine.addArgument("-l", false);
    cmdLine.addArgument(getLocalRepoDir(), false);
    cmdLine.addArgument("-g", false);
    cmdLine.addArgument(getInterpreterSettingName(), false);
    interpreterProcessLauncher = new InterpreterProcessLauncher(cmdLine, getEnv());
    interpreterProcessLauncher.launch();
    interpreterProcessLauncher.waitForReady(getConnectTimeout());
    if (interpreterProcessLauncher.isLaunchTimeout()) {
      throw new IOException(
          String.format("Interpreter Process creation is time out in %d seconds", getConnectTimeout() / 1000) + "\n"
              + "You can increase timeout threshold via "
              + "setting zeppelin.interpreter.connect.timeout of this interpreter.\n"
              + interpreterProcessLauncher.getErrorMessage());
    }
    if (!interpreterProcessLauncher.isRunning()) {
      throw new IOException("Fail to launch interpreter process:\n" + interpreterProcessLauncher.getErrorMessage());
    }
    if (isHadoopClientAvailable()) {
      String launchOutput = interpreterProcessLauncher.getProcessLaunchOutput();
      Matcher m = YARN_APP_PATTER.matcher(launchOutput);
      if (m.find()) {
        String appId = m.group(1);
        LOGGER.info("Detected yarn app: {}, add it to YarnAppMonitor", appId);
        YarnAppMonitor.get().addYarnApp(ConverterUtils.toApplicationId(appId), this);
      }
    }
  }

而实际调用thrift server 端服务的client 端代码

上述图片是代码运行的log,可以帮助我们定位代码的运行顺序

({FIFO-RemoteInterpreter-python-shared_process-shared_session-1} ProcessLauncher.java[launch]:96) - Process is launched: [.\\bin\interpreter.cmd, -d, ./\interpreter/python, -c, 10.4.144.223, -p, 52945, -r, :, -i, python-shared_process, -l, ./\local-repo\python, -g, python]
({FIFO-RemoteInterpreter-md-shared_process-shared_session-1} ProcessLauncher.java[launch]:96) - Process is launched: [.\\bin\interpreter.cmd, -d, ./\interpreter/md, -c, 10.4.144.223, -p, 52945, -r, :, -i, md-shared_process, -l, ./\local-repo\md, -g, md]
({FIFO-RemoteInterpreter-jdbc-shared_process-shared_session-1} ProcessLauncher.java[launch]:96) - Process is launched: [.\\bin\interpreter.cmd, -d, ./\interpreter/jdbc, -c, 10.4.144.223, -p, 52945, -r, :, -i, jdbc-shared_process, -l, ./\local-repo\jdbc, -g, jdbc]

参考

程序员的福音 - Apache Commons Exec - 知乎

Apache Thrift系列详解(一) - 概述与入门 - 掘金


相关文章
|
2月前
|
存储 Apache
Apache Hudi Savepoint实现分析
Apache Hudi Savepoint实现分析
38 0
|
2月前
|
Apache 索引
精进Hudi系列|Apache Hudi索引实现分析(五)之基于List的IndexFileFilter
精进Hudi系列|Apache Hudi索引实现分析(五)之基于List的IndexFileFilter
21 0
|
4月前
|
机器学习/深度学习 SQL 分布式计算
Apache Spark 的基本概念和在大数据分析中的应用
介绍 Apache Spark 的基本概念和在大数据分析中的应用
162 0
|
4月前
|
机器学习/深度学习 SQL 分布式计算
介绍 Apache Spark 的基本概念和在大数据分析中的应用。
介绍 Apache Spark 的基本概念和在大数据分析中的应用。
|
2月前
|
Apache
Apache Hudi Rollback实现分析
Apache Hudi Rollback实现分析
27 0
|
2月前
|
Shell Linux Apache
【Shell 命令集合 网络通讯 】Linux 管理Apache HTTP服务器 apachectl命令 使用教程
【Shell 命令集合 网络通讯 】Linux 管理Apache HTTP服务器 apachectl命令 使用教程
166 1
|
2月前
|
存储 SQL 消息中间件
Apache Hudi:统一批和近实时分析的存储和服务
Apache Hudi:统一批和近实时分析的存储和服务
39 0
|
2月前
|
缓存 Apache 索引
Apache Hudi索引实现分析(一)之HoodieBloomIndex
Apache Hudi索引实现分析(一)之HoodieBloomIndex
22 0
|
2月前
|
Apache 索引
Apache Hudi索引实现分析(二)之HoodieGlobalBloomIndex
Apache Hudi索引实现分析(二)之HoodieGlobalBloomIndex
27 0
|
2月前
|
存储 分布式数据库 Apache
Apache Hudi索引实现分析(三)之HBaseIndex
Apache Hudi索引实现分析(三)之HBaseIndex
23 0

推荐镜像

更多