Hello everybody. We have a Kerberized HDP (Hortonworks) cluster, and we can run Spark jobs via spark-submit (CLI) and from Talend Big Data, but not from Eclipse.
We have a Windows client machine with Eclipse installed and the MIT Kerberos client for Windows configured (TGT configuration). The goal is to run a Spark job from Eclipse. The Spark-related portion of the Java code is operational and has been tested via the CLI. Below is the relevant part of the job's code:
    private void setConfigurationProperties() {
        try {
            sConfig.setAppName("abcd-name");
            sConfig.setMaster("yarn-client");
            sConfig.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
            sConfig.set("spark.hadoop.yarn.resourcemanager.address", "rs.abcd.com:8032");
            sConfig.set("spark.hadoop.yarn.resourcemanager.scheduler.address", "rs.abcd.com:8030");
            sConfig.set("spark.hadoop.mapreduce.jobhistory.address", "rs.abcd.com:10020");
            sConfig.set("spark.hadoop.yarn.app.mapreduce.am.staging-dir", "/dir");
            sConfig.set("spark.executor.memory", "2g");
            sConfig.set("spark.executor.cores", "4");
            sConfig.set("spark.executor.instances", "24");
            sConfig.set("spark.yarn.am.cores", "24");
            sConfig.set("spark.yarn.am.memory", "16g");
            sConfig.set("spark.eventLog.enabled", "true");
            sConfig.set("spark.eventLog.dir", "hdfs:///spark-history");
            sConfig.set("spark.shuffle.memoryFraction", "0.4");
            sConfig.set("spark.hadoop.mapreduce.application.framework.path",
                    "/hdp/apps/version/mapreduce/mapreduce.tar.gz#mr-framework");
            sConfig.set("spark.local.dir", "/tmp");
            sConfig.set("spark.hadoop.yarn.resourcemanager.principal", "rm/_HOST@ABCD.COM");
            sConfig.set("spark.hadoop.mapreduce.jobhistory.principal", "jhs/_HOST@ABCD.COM");
            sConfig.set("spark.hadoop.dfs.namenode.kerberos.principal", "nn/_HOST@ABCD.COM");
            sConfig.set("spark.hadoop.fs.defaultFS", "hdfs://hdfs.abcd.com:8020");
            sConfig.set("spark.hadoop.dfs.client.use.datanode.hostname", "true");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
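As an aside, if I understand correctly, the `_HOST` token in the principal settings above is a placeholder that Hadoop expands at runtime into the fully-qualified hostname of the service (via `SecurityUtil.getServerPrincipal`). A toy illustration of that substitution, using the ResourceManager host from the configuration above (class and method names here are made up for the demo):

```java
public class PrincipalDemo {
    // Mimics Hadoop's _HOST expansion: replace the placeholder in a
    // service principal with the concrete fully-qualified hostname.
    static String expandPrincipal(String principal, String fqdn) {
        return principal.replace("_HOST", fqdn.toLowerCase());
    }

    public static void main(String[] args) {
        // prints rm/rs.abcd.com@ABCD.COM
        System.out.println(expandPrincipal("rm/_HOST@ABCD.COM", "rs.abcd.com"));
    }
}
```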
When I run the code, the following error pops up:
17/04/05 23:37:06 INFO Remoting: Starting remoting
17/04/05 23:37:06 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@1.1.1.1:54356]
17/04/05 23:37:06 INFO Utils: Started service 'sparkDriverActorSystem' on port 54356.
17/04/05 23:37:06 INFO SparkEnv: Registering MapOutputTracker
17/04/05 23:37:06 INFO SparkEnv: Registering BlockManagerMaster
17/04/05 23:37:06 INFO DiskBlockManager: Created local directory at C:\tmp\blockmgr-baee2441-1977-4410-b52f-4275ff35d6c1
17/04/05 23:37:06 INFO MemoryStore: MemoryStore started with capacity 2.4 GB
17/04/05 23:37:06 INFO SparkEnv: Registering OutputCommitCoordinator
17/04/05 23:37:07 INFO Utils: Started service 'SparkUI' on port 4040.
17/04/05 23:37:07 INFO SparkUI: Started SparkUI at http://1.1.1.1:4040
17/04/05 23:37:07 INFO RMProxy: Connecting to ResourceManager at rs.abcd.com/1.1.1.1:8032
17/04/05 23:37:07 ERROR SparkContext: Error initializing SparkContext.
org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
17/04/05 23:37:07 INFO SparkUI: Stopped Spark web UI at http://1.1.1.1:4040
Please guide me on how to specify the Kerberos authentication method in the Java code instead of SIMPLE, or how to instruct the client to request Kerberos authentication. And what should the overall process and the right approach be?
Thank you.
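From what I have read so far, I suspect the client side needs to tell the Hadoop libraries to use Kerberos and perform a login before the SparkContext is created, along these lines (only a sketch, not verified against our cluster; the principal and keytab path are made up, and I am not sure this is sufficient on Windows):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosLoginSketch {
    static void loginBeforeSparkContext() throws java.io.IOException {
        // Switch the Hadoop client from the default SIMPLE auth to Kerberos,
        // then log in from a keytab before any RPC to the cluster is made.
        Configuration hadoopConf = new Configuration();
        hadoopConf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(hadoopConf);
        UserGroupInformation.loginUserFromKeytab(
                "myuser@ABCD.COM",           // hypothetical user principal
                "C:/keytabs/myuser.keytab"); // hypothetical keytab path
    }
}
```

Presumably the same authentication setting could also be passed through the SparkConf as `sConfig.set("spark.hadoop.hadoop.security.authentication", "kerberos")`, but I do not know whether that alone is enough, or whether the MIT Kerberos TGT on the Windows machine can be picked up instead of a keytab.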