Spark jar loading order and crossJoin support

Jar loading order

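The application jar makes HTTP requests to an external interface to fetch data, but Spark already ships with its own HTTP client package. Because the two versions differ, some methods have different signatures or do not exist at all, and the job fails with an error.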

Solution

--conf spark.driver.userClassPathFirst=true
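
To confirm which copy of a class actually wins, it can help to print the jar it was loaded from. The following is a minimal sketch, not part of the original fix: the `JarOrigin` object and its `locationOf` helper are hypothetical names, and the default class name `org.apache.http.client.HttpClient` is only an assumption, since the post does not say which HTTP library is involved.

```scala
object JarOrigin {
  // Returns the jar or directory a class was loaded from, or a placeholder
  // when the JVM exposes no code source for it (e.g. core JDK classes).
  def locationOf(cls: Class[_]): String =
    Option(cls.getProtectionDomain)
      .flatMap(pd => Option(pd.getCodeSource))
      .flatMap(cs => Option(cs.getLocation))
      .map(_.toString)
      .getOrElse("<unknown code source>")

  def main(args: Array[String]): Unit = {
    // Assumed class name: replace it with the HTTP client class that fails in your job.
    val className = args.headOption.getOrElse("org.apache.http.client.HttpClient")
    try {
      println(s"$className -> ${locationOf(Class.forName(className))}")
    } catch {
      case _: ClassNotFoundException =>
        println(s"$className is not on the classpath")
    }
  }
}
```

Calling `JarOrigin.locationOf` from inside the Spark job, with and without the flag, should show whether the class comes from the application jar or from Spark's own jars. If the conflicting class is used in code that runs on executors, the analogous setting `spark.executor.userClassPathFirst=true` may be needed as well; Spark documents both settings as experimental.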

The "Detected implicit cartesian product for LEFT OUTER join" problem

First, the code:

spark.sql("select count(1) as wnum from A").createOrReplaceTempView("temp")
spark.sql("select wnum,1 as type from temp")

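The first statement counts the total number of rows. The second adds a type column with the constant value 1, which is meant to serve as the condition for a left join.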

The DataFrame produced by this code is then joined with another DataFrame, and the join fails with the following error:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Detected implicit cartesian product for LEFT OUTER join between logical plans
Aggregate [count(1) AS wnum#429L, 1 AS type#432]

Solution:

spark.conf.set("spark.sql.crossJoin.enabled", "true")
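
A likely explanation for the error is that type is a constant, so the optimizer can reduce the join condition to something that no longer relates the two sides, and Spark 2.x rejects such a join by default (from Spark 3.0 on, `spark.sql.crossJoin.enabled` defaults to true). Since the count query returns exactly one row, an alternative that does not depend on the setting is to state the cross join explicitly with `Dataset.crossJoin`. The sketch below is an illustration under assumptions, not the original job: the sample data, the `other` DataFrame, and the `attachTotal` helper are all hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object CrossJoinSketch {
  // Attaches the single-row total to every row of `rows` via an explicit cross join.
  def attachTotal(rows: DataFrame, total: DataFrame): DataFrame =
    rows.crossJoin(total)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cross-join-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-ins for table A and for the DataFrame being joined.
    Seq(1, 2, 3).toDF("id").createOrReplaceTempView("A")
    val other = Seq(("x", 1), ("y", 1)).toDF("name", "type")

    // Same aggregate as in the post: a single row holding the total count.
    val total = spark.sql("select count(1) as wnum from A")

    // Every row of `other` gains the wnum column; no constant join key is needed.
    attachTotal(other, total).show()

    spark.stop()
  }
}
```

With an explicit cross join the intent is visible in the code, and the session-wide switch can stay off so that accidental cartesian products elsewhere are still reported.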