[spark] Examples of running Spark jobs in various modes

scala 2018. 3. 14. 02:25

Simple examples of running a Spark job are shown below.

# Run the application in local mode on 8 cores
$SPARK_HOME/bin/spark-submit \
  --class org.apache.spark.examples.Demo \
  --master local[8] \
  Demo-0.1-SNAPSHOT-jar-with-dependencies.jar

# Run on a YARN cluster (--deploy-mode can be client for client mode)
export HADOOP_CONF_DIR=XXX
$SPARK_HOME/bin/spark-submit \
  --class org.apache.spark.examples.Demo \
  --master yarn \
  --deploy-mode cluster \
  --executor-memory 20G \
  --num-executors 50 \
  Demo-0.1-SNAPSHOT-jar-with-dependencies.jar
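
A note on annotating flags: a `#` comment cannot follow a line-continuation backslash on the same line, because the backslash then escapes the space instead of the newline and the command ends at that point. One way to keep a note next to each flag is to build the argument list one line at a time. The following is only a dry-run sketch that prints the command rather than running it; `yarn_submit_cmd` is a hypothetical helper name, not part of Spark.

```shell
# Dry-run sketch: prints the spark-submit command instead of executing it.
# yarn_submit_cmd is a hypothetical helper name used only for illustration.
yarn_submit_cmd() {
  set --                                            # start from an empty argument list
  set -- "$@" --class org.apache.spark.examples.Demo
  set -- "$@" --master yarn
  set -- "$@" --deploy-mode cluster                 # "client" selects client mode instead
  set -- "$@" --executor-memory 20G
  set -- "$@" --num-executors 50
  set -- "$@" Demo-0.1-SNAPSHOT-jar-with-dependencies.jar
  echo "spark-submit $*"                            # print the assembled command
}

yarn_submit_cmd
```

To submit for real, the `echo` line could be replaced with `"$SPARK_HOME/bin/spark-submit" "$@"`, assuming `SPARK_HOME` and `HADOOP_CONF_DIR` are set as in the example above.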

# Run on a Mesos cluster in cluster deploy mode with the supervise flag
# (the master is addressed by IP address)
$SPARK_HOME/bin/spark-submit \
  --class org.apache.spark.examples.Demo \
  --master mesos://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  Demo-0.1-SNAPSHOT-jar-with-dependencies.jar

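The --supervise flag applies to cluster deploy mode on a Spark standalone cluster (spark-submit also accepts it for Mesos cluster mode). It means that if the driver exits with a non-zero exit code, that is, terminates abnormally, it is restarted automatically.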
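
The restart-on-failure idea can be illustrated with a small retry loop. This is not how Spark implements supervision; it is only a sketch of the semantics. `run_supervised`, `flaky_driver`, and the restart cap are hypothetical and exist only for this illustration.

```shell
# Illustration only: re-run a command while it exits with a non-zero code.
# run_supervised and its restart cap are hypothetical, not part of Spark.
run_supervised() {
  max_restarts="$1"; shift
  restarts=0
  while true; do
    status=0
    "$@" || status=$?                 # run the command and capture its exit code
    if [ "$status" -eq 0 ]; then
      return 0                        # clean exit: nothing to restart
    fi
    if [ "$restarts" -ge "$max_restarts" ]; then
      return "$status"                # give up and report the last failure
    fi
    restarts=$((restarts + 1))
    echo "exit code $status, restarting (restart $restarts)" >&2
  done
}

# Stand-in for a driver that fails twice and then succeeds.
flaky_runs=0
flaky_driver() {
  flaky_runs=$((flaky_runs + 1))
  [ "$flaky_runs" -ge 3 ]
}

run_supervised 5 flaky_driver
echo "driver finished after $flaky_runs runs"
```

With a real cluster, the cluster manager performs this role for the driver, so no wrapper of this kind is needed around spark-submit.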

Examples (from the official Spark documentation)

https://spark.apache.org/docs/2.1.1/submitting-applications.html

# Run application locally on 8 cores
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master local[8] \
  /path/to/examples.jar \
  100

# Run on a Spark standalone cluster in client deploy mode
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000

# Run on a Spark standalone cluster in cluster deploy mode with supervise
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000

# Run on a YARN cluster (--deploy-mode can be client for client mode)
export HADOOP_CONF_DIR=XXX
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \
  --executor-memory 20G \
  --num-executors 50 \
  /path/to/examples.jar \
  1000

# Run a Python application on a Spark standalone cluster
./bin/spark-submit \
  --master spark://207.184.161.138:7077 \
  examples/src/main/python/pi.py \
  1000

# Run on a Mesos cluster in cluster deploy mode with supervise
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master mesos://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  http://path/to/examples.jar \
  1000
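
The examples above differ mainly in the value passed to --master. As a rough summary, the forms used in this post can be generated by a small function. `master_url` is a hypothetical helper that only formats a string; it does not contact any cluster, and the host and port arguments are whatever your own cluster uses.

```shell
# master_url is a hypothetical helper; it only formats the --master value.
master_url() {
  case "$1" in
    local)      printf 'local[%s]\n' "${2:-*}" ;;        # local[K], or local[*] for all cores
    standalone) printf 'spark://%s:%s\n' "$2" "$3" ;;    # Spark standalone master host and port
    mesos)      printf 'mesos://%s:%s\n' "$2" "$3" ;;    # Mesos master or dispatcher host and port
    yarn)       printf 'yarn\n' ;;                       # cluster location comes from HADOOP_CONF_DIR
    *)          echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

master_url local 8
master_url standalone 207.184.161.138 7077
master_url mesos 207.184.161.138 7077
master_url yarn
```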


Posted by '김용환'
