
Spark write to Elasticsearch fails with "Could not write all entries for bulk operation [47/10813]"


Detailed error log:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 55.0 failed 4 times, most recent failure: Lost task 0.3 in stage 55.0 (TID 4643, 192.168.1.203, executor 3): org.elasticsearch.hadoop.EsHadoopException: Could not write all entries for bulk operation [47/10813]. Error sample (first [5] error messages):
failed to parse [edgeEasyPovertyStartDate]
failed to parse [edgeEasyPovertyStartDate]
failed to parse [edgeEasyPovertyStartDate]
failed to parse [edgeEasyPovertyStartDate]
failed to parse [edgeEasyPovertyStartDate]
Bailing out...
    at org.elasticsearch.hadoop.rest.bulk.BulkProcessor.flush(BulkProcessor.java:475)
    at org.elasticsearch.hadoop.rest.bulk.BulkProcessor.add(BulkProcessor.java:106)
    at org.elasticsearch.hadoop.rest.RestRepository.doWriteToIndex(RestRepository.java:187)
    at org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:168)
    at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:67)
    at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:101)
    at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:101)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:100)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

The key line in the log is "failed to parse [edgeEasyPovertyStartDate]": Spark hit a parse error on the field "edgeEasyPovertyStartDate" while writing to ES, so I checked that field's type in the ES index mapping.

It was mapped as integer, but the data in that field is actually of date type. Recreating the ES index with the edgeEasyPovertyStartDate field type changed from integer to date made the error go away.
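The fix, concretely, is to recreate the index with a corrected mapping for the field. A minimal sketch of the index-creation body follows; the date formats are an assumption here and should match how the column is actually serialized in the data:

```python
import json

# Corrected mapping body for recreating the index: the field that failed
# to parse is declared as date instead of integer. The format string is
# an assumption; adjust it to the actual serialization of the column.
index_body = {
    "mappings": {
        "properties": {
            "edgeEasyPovertyStartDate": {
                "type": "date",
                "format": "yyyy-MM-dd||epoch_millis",
            }
        }
    }
}

# This JSON would be sent as the body of a PUT <index-name> request
# (via curl or an Elasticsearch client) when recreating the index.
print(json.dumps(index_body, indent=2))
```

After recreating the index with this mapping, the same Spark job can be rerun and the bulk writes should no longer fail on that field.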

Many Spark-to-ES write errors like this one come down to a mismatch between the ES index field mapping and the actual type of the data being written.
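One way to catch such mismatches before the job runs is to compare the DataFrame schema against the index mapping up front. The sketch below is illustrative: the compatibility table and the helper function are assumptions, not part of any Spark or Elasticsearch API, and would need extending for real schemas:

```python
# Sketch: flag Spark-SQL column types that look incompatible with the ES
# field mapping, so mismatches surface before the write instead of as
# bulk failures. The pairing table is an assumption; extend as needed.
COMPATIBLE = {
    "integer": {"integer", "long"},
    "long": {"long"},
    "string": {"text", "keyword", "date"},
    "timestamp": {"date"},
    "date": {"date"},
    "double": {"double", "float"},
}

def mismatches(df_schema, es_mapping):
    """Return (field, spark_type, es_type) triples that look incompatible.

    df_schema:  {field_name: spark_sql_type_name}
    es_mapping: {field_name: es_field_type_name}
    """
    bad = []
    for field, spark_type in df_schema.items():
        es_type = es_mapping.get(field)
        if es_type and es_type not in COMPATIBLE.get(spark_type, set()):
            bad.append((field, spark_type, es_type))
    return bad

# The failing case from the log: a date-valued column written against an
# integer mapping is reported as a mismatch.
schema = {"edgeEasyPovertyStartDate": "date"}
mapping = {"edgeEasyPovertyStartDate": "integer"}
print(mismatches(schema, mapping))
```

The schema side can be read from `df.dtypes` in Spark, and the mapping side from the index's `_mapping` endpoint.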


