Flink Common APIs: HDFS File Source
package source

import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

object HDFSFileSource {
  def main(args: Array[String]): Unit = {
    val ev = StreamExecutionEnvironment.getExecutionEnvironment
    ev.setParallelism(1)

    // Brings in the implicit TypeInformation needed by the Scala API
    import org.apache.flink.streaming.api.scala._

    // Read the file from HDFS as a bounded stream of lines
    val stream: DataStream[String] = ev.readTextFile("hdfs://mycluster/wc.txt")

    stream.flatMap(_.split(" ")) // split each line into words
      .map((_, 1))               // pair each word with a count of 1
      .keyBy(0)                  // group by the word (tuple field 0)
      .sum(1)                    // running sum of the counts (tuple field 1)
      .print()

    ev.execute("wordcount")
  }
}
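To see what the pipeline computes without a Flink cluster, the same flatMap → map → keyBy → sum word count can be sketched with plain Scala collections. This is only an illustration of the logic (the object name and sample lines are made up here); the Flink job above additionally emits an updated running count per word as each element arrives, whereas this sketch produces only the final totals.

```scala
object LocalWordCount {
  // Tokenize the lines and count word occurrences, mirroring the
  // flatMap -> map -> keyBy -> sum stages of the Flink job.
  def count(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))                      // like flatMap(_.split(" "))
      .groupBy(identity)                          // like keyBy on the word
      .map { case (w, occurrences) => w -> occurrences.size } // like sum(1)

  def main(args: Array[String]): Unit = {
    val sample = Seq("hello tom andy joy", "hello rose")
    println(count(sample))
  }
}
```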
Contents of wc.txt on HDFS:
[root@node1 ~]
21/12/25 14:52:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
hello tom andy joy hello rose hello joy mark andy hello tom andy rose hello joy
Original: https://blog.51cto.com/u_15704423/5434841
Author: wx62be9d88ce294
Title: Flink常用API之HDFS文件Source