
Spark error roundup

Source: Internet  Collected by: 自由互联  Published: 2021-06-28
Spark errors
* Null value appeared in non-nullable field
java.lang.NullPointerException: Null value appeared in non-nullable field: top level row object
If the schema is inferred from a Scala tuple/case class, or a Java bean, please try to use scala.Option[_] or other nullable types (e.g. java.lang.Integer instead of int/scala.Int).
Fix: add a filter on the DataFrame that drops any Row that is null.
df.filter(row -> row != null)
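
The error text also suggests using nullable types such as java.lang.Integer instead of int. The following is a minimal plain-Java sketch of both ideas, with no Spark dependency. The Item bean, its rank field, and the dropNullRows helper are hypothetical names used only for illustration. On a real Dataset<Row> the equivalent step is the filter shown above; depending on the Spark and Scala versions, the lambda may need an explicit cast to org.apache.spark.api.java.function.FilterFunction<Row> to compile.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class NullableBeanSketch {

    // Hypothetical bean. The field is a boxed Integer rather than a primitive int,
    // so a missing value can be stored as null.
    static class Item {
        private Integer rank;

        Item() {}

        Item(Integer rank) {
            this.rank = rank;
        }

        public Integer getRank() {
            return rank;
        }

        public void setRank(Integer rank) {
            this.rank = rank;
        }
    }

    // Drops null elements from the list, mirroring the row != null filter.
    // Items whose rank is null are kept, because the field itself is nullable.
    static List<Item> dropNullRows(List<Item> rows) {
        return rows.stream()
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }
}
```

Given a list holding an Item with rank 1, a null element, and an Item with a null rank, dropNullRows returns the two non-null items and leaves the null rank untouched.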

* Compilation problem; changes made to a row in map do not take effect:
ERROR CodeGenerator: failed to compile: org.codehaus.janino.JaninoRuntimeException: Code of method "processNext()V" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator" grows beyond 64 KB
/* 001 */ public Object generate(Object[] references) {
/* 002 */   return new GeneratedIterator(references);
/* 003 */ }
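...... (more than ten thousand lines of generated source omitted)
Cause: the Dataset contains fields whose schema type is null (for example rank), as in the schema below. They appear because the SQL used the null as rank syntax, which gives the column no concrete type.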
 |-- is_merchant_exclusive: integer (nullable = true)
 |-- comment_keywords: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- date: date (nullable = true)
 |-- generate_time: null (nullable = true)
 |-- rank: null (nullable = true)
 |-- prime: null (nullable = true)
 |-- activities: null (nullable = true)
 |-- categories: null (nullable = true)
 |-- total_heart_num: null (nullable = true)
 |-- ad_categories: null (nullable = true)
Fix:
In the Dataset's map method, the schema that is used must first redefine the null-typed fields above with concrete types.
 newFields.set(oldSchema.fieldIndex("rank"), staticSchema.apply("rank"));
 ...
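
The replacement step can be pictured without Spark. The sketch below is a plain-Java model, not the original code: field types are plain strings keyed by field name, and the class name, method name, and both map parameters are hypothetical. In a real job the values would be Spark StructField objects, and the lookup would be the staticSchema.apply(...) call shown above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NullTypedFieldPatch {

    // Returns a copy of the inferred schema in which every field typed "null"
    // takes its type from the reference schema. Field order is preserved.
    static Map<String, String> patch(Map<String, String> inferred,
                                     Map<String, String> reference) {
        Map<String, String> patched = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : inferred.entrySet()) {
            String type = field.getValue();
            if ("null".equals(type)) {
                String declared = reference.get(field.getKey());
                if (declared == null) {
                    // Fail early rather than keep a null-typed field.
                    throw new IllegalArgumentException(
                            "No declared type for field: " + field.getKey());
                }
                type = declared;
            }
            patched.put(field.getKey(), type);
        }
        return patched;
    }
}
```

Looping over every field, rather than patching rank alone, covers the other null-typed columns in the schema listing (generate_time, prime, activities and so on) in one pass.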


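* When Spark saves data to Hive: Caused by: parquet.io.ParquetEncodingException: empty fields are illegal, the field should be ommited completely instead
Cause: the Hive table has map or array columns, and the data being saved contains empty array or empty map values.
Fix: change the empty array or empty map values (empty collections for short) to null; the save then succeeds.
Spark does not support empty collections when saving to Hive, so they have to be changed to null first. Loading data files into Hive directly does not have this problem.
As a result, data read from Hive with Spark can carry empty collections, and these need to be changed to null before saving.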
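
The conversion itself is small. Below is a minimal plain-Java sketch of it, assuming the array and map column values have already been extracted as java.util.List and java.util.Map. The class and method names are hypothetical, and reading the values from a Row and writing them back is left out.

```java
import java.util.List;
import java.util.Map;

class EmptyCollectionToNull {

    // Returns null for a null or empty list, otherwise the same list instance.
    static <T> List<T> emptyListToNull(List<T> list) {
        return (list == null || list.isEmpty()) ? null : list;
    }

    // Returns null for a null or empty map, otherwise the same map instance.
    static <K, V> Map<K, V> emptyMapToNull(Map<K, V> map) {
        return (map == null || map.isEmpty()) ? null : map;
    }
}
```

Calling these helpers on every array or map column value in the map step before the save would replace each empty collection with null and leave non-empty values as they are.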