[Spark][Python] DataFrame left and right join examples

$ hdfs dfs -cat people.json

$ hdfs dfs -cat pcodes.json

$ pyspark

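# The import is redundant in shells that already expose HiveContext, but harmless
from pyspark.sql import HiveContext

# Load people.json into a DataFrame and preview the first 5 rows
sqlContext = HiveContext(sc)
peopleDF = sqlContext.read.json("people.json")
peopleDF.limit(5).show()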

# Load pcodes.json the same way and preview the first 5 rows
sqlContext = HiveContext(sc)
pcodesDF = sqlContext.read.json("pcodes.json")
pcodesDF.limit(5).show()

# Inner join on the shared pcode column (the default when no join type is given)
mydf000 = peopleDF.join(pcodesDF, "pcode")
mydf000.limit(5).show()

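# Left semi join: keep peopleDF rows whose pcode exists in pcodesDF; only peopleDF columns are returned
mydf001 = peopleDF.join(pcodesDF, "pcode", "leftsemi")
mydf001.limit(5).show()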

# Left outer join: keep every peopleDF row; pcodesDF columns are null where no pcode matches
mydf002 = peopleDF.join(pcodesDF, "pcode", "left_outer")
mydf002.limit(5).show()

# Right outer join: keep every pcodesDF row; peopleDF columns are null where no pcode matches
mydf003 = peopleDF.join(pcodesDF, "pcode", "right_outer")
mydf003.limit(5).show()
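
The post does not show the contents of the two JSON files, so the difference between the four joins is hard to see from the commands alone. The plain-Python sketch below imitates what each join type keeps, using made-up rows. The sample data, the `name` and `city` fields, and the `join` helper are all assumptions for illustration. This is not Spark code and it ignores details such as null keys and row ordering.

```python
# Hypothetical rows standing in for people.json and pcodes.json (not from the post).
people = [
    {"name": "Alice", "pcode": "94304"},
    {"name": "Brayden", "pcode": "94304"},
    {"name": "Carla", "pcode": "10036"},
    {"name": "Diana", "pcode": "87501"},  # no matching row in pcodes
]
pcodes = [
    {"pcode": "94304", "city": "Palo Alto"},
    {"pcode": "10036", "city": "New York"},
    {"pcode": "60601", "city": "Chicago"},  # no matching row in people
]


def join(left, right, key, how="inner"):
    """Toy equi-join over lists of dicts, imitating the four join types used above."""
    left_only = [c for c in left[0] if c != key]
    right_only = [c for c in right[0] if c != key]

    # Index the right side by key so each left row can find its matches.
    by_key = {}
    for rrow in right:
        by_key.setdefault(rrow[key], []).append(rrow)

    out, matched = [], set()
    for lrow in left:
        matches = by_key.get(lrow[key], [])
        if how == "leftsemi":
            # Keep the left row once if any match exists; no right-side columns.
            if matches:
                out.append(dict(lrow))
            continue
        for rrow in matches:
            out.append({**lrow, **rrow})
        if matches:
            matched.add(lrow[key])
        elif how == "left_outer":
            # Unmatched left row: pad the right-side columns with None.
            out.append({**lrow, **{c: None for c in right_only}})

    if how == "right_outer":
        # Unmatched right rows: pad the left-side columns with None.
        for rrow in right:
            if rrow[key] not in matched:
                out.append({**{c: None for c in left_only}, **rrow})
    return out


for how in ("inner", "leftsemi", "left_outer", "right_outer"):
    rows = join(people, pcodes, "pcode", how)
    print(how, len(rows))
    for row in rows:
        print("   ", row)
```

With this sample data, the inner and left semi joins drop Diana, the left outer join keeps her with `city` set to `None`, and the right outer join keeps Chicago with `name` set to `None`.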

Original: https://www.cnblogs.com/gaojian/p/7633001.html
Author: 健哥的数据花园
Title: [Spark][Python] DataFrame left and right join examples
