这篇文章主要介绍“Flink区分运行环境的方法是什么”,在日常操作中,相信很多人在Flink区分运行环境的方法是什么问题上存在疑惑,小编查阅了各式资料,整理出简单好用的操作方法,希望对大家解答”Flink区分运行环境的方法是什么”的疑惑有所帮助!接下来,请跟着小编一起来学习吧!
成都创新互联是专业的忻州网站建设公司,忻州接单;提供做网站、网站建设,网页设计,网站设计,建网站,PHP网站建设等专业做网站服务;采用PHP框架,可快速的进行忻州网站开发网页制作和功能扩展;专业做搜索引擎喜爱的网站,专业的做网站团队,希望更多企业前来合作!
Flink判断运行环境(本地、集群)的逻辑如下:
(1)在任务的main方法中,通过 StreamExecutionEnvironment 获取运行环境
StreamExecutionEnvironment environment = StreamExecutionEnvironment.getExecutionEnvironment();
(2)生成运行环境的工厂类放在ThreadLocal中;threadLocalContextEnvironmentFactory 是StreamExecutionEnvironment类的静态属性
/** The ThreadLocal used to store {@link StreamExecutionEnvironmentFactory}. */ private static final ThreadLocalthreadLocalContextEnvironmentFactory = new ThreadLocal<>();
①当是本地IDE直接运行任务main方法时,ThreadLocal中获取到的StreamExecutionEnvironmentFactory为空,此时生成本地运行环境LocalStreamEnvironment
public static StreamExecutionEnvironment getExecutionEnvironment() { return Utils.resolveFactory(threadLocalContextEnvironmentFactory, contextEnvironmentFactory) .map(StreamExecutionEnvironmentFactory::createExecutionEnvironment) .orElseGet(StreamExecutionEnvironment::createLocalEnvironment); }
当ThreadLocal中有StreamExecutionEnvironmentFactory时,则用其createExecutionEnvironment()方法来生成运行环境
②当集群环境时,是如何将StreamExecutionEnvironmentFactory放入到ThreadLocal中?
通过 bin/flink run .... 命令提交jar包到集群运行命令时,该脚本会调用 org.apache.flink.client.cli.CliFrontend 来运行用户程序,如下:
....... ....... # Add HADOOP_CLASSPATH to allow the usage of Hadoop file systems exec $JAVA_RUN $JVM_ARGS $FLINK_ENV_JAVA_OPTS "${log_setting[@]}" -classpath "`manglePathList "$CC_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" org.apache.flink.client.cli.CliFrontend "$@"
在CliFrontend中依次执行以下方法 main() -> parseParameters() -> run() -> executeProgram()
protected void executeProgram(final Configuration configuration, final PackagedProgram program) throws ProgramInvocationException { ClientUtils.executeProgram(new DefaultExecutorServiceLoader(), configuration, program, false, false); }
在org.apache.flink.client.ClientUtils的executeProgram()中调用 StreamContextEnvironment.setAsContext(...),StreamContextEnvironment继承自StreamExecutionEnvironment。setAsContext()代码如下
public static void setAsContext( final PipelineExecutorServiceLoader executorServiceLoader, final Configuration configuration, final ClassLoader userCodeClassLoader, final boolean enforceSingleJobExecution, final boolean suppressSysout) { StreamExecutionEnvironmentFactory factory = () -> new StreamContextEnvironment( executorServiceLoader, configuration, userCodeClassLoader, enforceSingleJobExecution, suppressSysout); initializeContextEnvironment(factory); }
创建生成运行环境的工厂类实例,在initializeContextEnvironment()方法中把实例放到StreamExecutionEnvironment类的静态属性threadLocalContextEnvironmentFactory 中 ,代码如下
protected static void initializeContextEnvironment(StreamExecutionEnvironmentFactory ctx) { contextEnvironmentFactory = ctx; threadLocalContextEnvironmentFactory.set(contextEnvironmentFactory); }
这样在用户程序 StreamExecutionEnvironment.getExecutionEnvironment() 时,获取到的运行环境就是StreamContextEnvironment类的setAsContext()方法中生成的
public static void setAsContext( final PipelineExecutorServiceLoader executorServiceLoader, final Configuration configuration, final ClassLoader userCodeClassLoader, final boolean enforceSingleJobExecution, final boolean suppressSysout) { StreamExecutionEnvironmentFactory factory = () -> new StreamContextEnvironment( executorServiceLoader, configuration, userCodeClassLoader, enforceSingleJobExecution, suppressSysout); ...... }
本地运行环境LocalStreamEnvironment 和 独立集群、flink on yarn等运行环境StreamContextEnvironment的主要区别在于,他们的成员属性 configuration不同。LocalStreamEnvironment中是创建的空键值对(new Configuration()),而StreamContextEnvironment是通过 CliFrontend生成的 Configuration 对象。
到此,关于“Flink区分运行环境的方法是什么”的学习就结束了,希望能够解决大家的疑惑。理论与实践的搭配能更好的帮助大家学习,快去试试吧!若想继续学习更多相关知识,请继续关注创新互联网站,小编会继续努力为大家带来更多实用的文章!