This is a brief guide to installing hadoop 1.2.1 and hive 1.0.1.



* Installing hadoop in local mode


1) ssh

To run hadoop in local mode, ssh must be able to connect to localhost.

./bin/start-all.sh connects over ssh when it starts the daemons, so passwordless ssh needs to be set up first.



$ ssh-keygen

$ cat ~/.ssh/id_rsa.pub | ssh localhost 'cat >> ~/.ssh/authorized_keys'

$ ssh -l <username> localhost

or

$ ssh <username>@localhost

(~/.ssh/known_hosts is filled in automatically the first time you connect; for key-based login the public key only needs to be in ~/.ssh/authorized_keys, which the cat | ssh command above already handles.)



On macOS, enable System Preferences -> Sharing -> Remote Login to allow ssh access.


2) Environment configuration

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_40.jdk/Contents/Home




3) Download hadoop

curl -O http://apache.tt.co.kr/hadoop/common/hadoop-1.2.1/hadoop-1.2.1-bin.tar.gz

Install (extract the archive, then move it):

tar xzf hadoop-1.2.1-bin.tar.gz
mv <extracted-directory> /usr/local/hadoop-1.2.1


4) Edit the hadoop configuration

$ vi /usr/local/hadoop-1.2.1/conf/mapred-site.xml

<configuration>

    <property>

        <name>mapred.job.tracker</name>

        <value>localhost:9001</value>

    </property>

</configuration>


$ vi /usr/local/hadoop-1.2.1/conf/hdfs-site.xml

<configuration>

    <property>

        <name>dfs.replication</name>

        <value>1</value>

    </property>

</configuration>


$ vi /usr/local/hadoop-1.2.1/conf/core-site.xml

<configuration>

    <property>

        <name>fs.default.name</name>

        <value>hdfs://localhost:9000</value>

   </property>

</configuration>
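These *-site.xml files all share the same flat <configuration>/<property> layout, so it is easy to read a value back programmatically to double-check an edit. A minimal sketch in Python (the helper name get_prop is mine, not a hadoop API):

```python
import xml.etree.ElementTree as ET

def get_prop(xml_text, name):
    # Hadoop config files are flat: <configuration> holds <property>
    # elements, each with a <name> and a <value> child.
    root = ET.fromstring(xml_text)
    for prop in root.findall('property'):
        if prop.findtext('name') == name:
            return prop.findtext('value')
    return None

core_site = """<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>"""

print(get_prop(core_site, 'fs.default.name'))  # hdfs://localhost:9000
```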



5) Add environment variables

Add the following to .bashrc.

(If the logs complain that JAVA_HOME is not set, add it to .profile as well.)


export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_40.jdk/Contents/Home

export PATH=/usr/local/hadoop-1.2.1/bin:$PATH



6) Format the namenode


$ ./bin/hadoop namenode -format

If a permission error occurs, run chmod 755 on the namenode directory named in the error.

(If you do not format the namenode, the http://localhost:50070/dfshealth.jsp page will not open,

and the log files in the logs directory will show ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000.)


e.g. chmod 755 /tmp/hadoop/dfs/name




7) Start


$ ./bin/start-all.sh


If there are no errors and the daemons come up, everything is working.




8) Verify


Open http://localhost:50030/jobtracker.jsp in a browser and check that the jobtracker page works.

Open http://localhost:50070/dfshealth.jsp in a browser and check that the namenode page works.




* Installing hive


1) Install hive 1.0.1

Download http://apache.tt.co.kr/hive/hive-1.0.1/apache-hive-1.0.1-bin.tar.gz.


Extract the archive and copy it to /usr/local/hive-1.0.1.


2) Set permissions


hadoop fs -mkdir /tmp

hadoop fs -mkdir /user/hive/warehouse

hadoop fs -chmod go+w /tmp

hadoop fs -chmod go+w /user/hive/warehouse

hadoop fs -chmod go+w /tmp/hive



3) Set PATH

Add /usr/local/hive-1.0.1/bin to the PATH in .bashrc.


$ vi ~/.bashrc

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_40.jdk/Contents/Home

export PATH=/usr/local/hadoop-1.2.1/bin:/usr/local/hive-1.0.1/bin:$PATH



Check that hive works:


hive> show tables;

OK

Time taken: 0.012 seconds

hive> select 1 + 1;

OK

2

Time taken: 0.342 seconds, Fetched: 1 row(s)





Posted by 김용환

[hive] Viewing function documentation

hadoop 2016. 3. 28. 19:23


To see hive function documentation, use the following commands:


SHOW FUNCTIONS; 

DESCRIBE FUNCTION function_name;

DESCRIBE FUNCTION EXTENDED function_name;

 


Example:


> DESCRIBE FUNCTION  xpath_string;

OK

xpath_string(xml, xpath) - Returns the text contents of the first xml node that matches the xpath expression

Time taken: 0.006 seconds, Fetched: 1 row(s)



> DESCRIBE FUNCTION EXTENDED xpath_string;

OK

xpath_string(xml, xpath) - Returns the text contents of the first xml node that matches the xpath expression

Example:

  > SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c') FROM src LIMIT 1;

  'cc'

  > SELECT xpath_string('<a><b>b1</b><b>b2</b></a>','a/b') FROM src LIMIT 1;

  'b1'

  > SELECT xpath_string('<a><b>b1</b><b>b2</b></a>','a/b[2]') FROM src LIMIT 1;

  'b2'

  > SELECT xpath_string('<a><b>b1</b><b>b2</b></a>','a') FROM src LIMIT 1;

  'b1b2'

Time taken: 0.01 seconds, Fetched: 10 row(s)
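For intuition, the xpath_string lookups above can be approximated with Python's ElementTree (a rough analogue for illustration only, not Hive's implementation — Hive uses a full XPath engine):

```python
import xml.etree.ElementTree as ET

def xpath_string(xml, path):
    # Like Hive's xpath_string: return the concatenated text of the
    # first node matching the path, or '' if nothing matches.
    # Wrap in a dummy root so the path is matched against the document,
    # not relative to the top-level element.
    root = ET.fromstring('<_r>' + xml + '</_r>')
    node = root.find(path)
    return ''.join(node.itertext()) if node is not None else ''

print(xpath_string('<a><b>b</b><c>cc</c></a>', 'a/c'))      # cc
print(xpath_string('<a><b>b1</b><b>b2</b></a>', 'a/b'))     # b1
print(xpath_string('<a><b>b1</b><b>b2</b></a>', 'a/b[2]'))  # b2
print(xpath_string('<a><b>b1</b><b>b2</b></a>', 'a'))       # b1b2
```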



[hive] Hive only supports equi-joins (equal join)

hadoop 2016. 3. 25.


Hive currently supports only equi-joins (equal join); non-equi joins are not supported.

That is, only = is allowed in a join condition (e.g. on a.id = b.id).


https://issues.apache.org/jira/browse/HIVE-3133



Instead, it supports a variety of join types:


JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN, CROSS JOIN
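One intuition for the restriction: Hive's common (shuffle) join routes rows from both tables by hashing the join key, so rows can only meet when their keys are equal. A toy sketch of the idea in Python (illustrative only, not Hive's actual join code; the sample rows are made up):

```python
from collections import defaultdict

def equi_join(left, right, key):
    # Build a hash table on the join key for one side...
    buckets = defaultdict(list)
    for row in left:
        buckets[row[key]].append(row)
    # ...then probe with the other side: rows only meet when their
    # keys hash to the same bucket, which requires '='.
    for row in right:
        for match in buckets[row[key]]:
            yield (match, row)

a = [{'id': 1, 'name': 'a1'}, {'id': 2, 'name': 'a2'}]
b = [{'id': 2, 'name': 'b2'}, {'id': 3, 'name': 'b3'}]
print(list(equi_join(a, b, 'id')))  # only the id=2 pair matches
```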






[hive] Combining data into one

hadoop 2016. 2. 29.


To combine data from several tables in hive HQL, as if querying a single table, use UNION ALL:



SELECT unioned.id, unioned.var1, unioned.var2

FROM (

  SELECT a.id, a.var1, a.var2

  FROM table_A a


  UNION ALL


  SELECT b.id, b.var1, b.var2

  FROM table_B b

) unioned;



Things to watch out for:
1. Each table in the FROM clause must be given a distinct alias.
2. The column names of the sub-queries must all be the same.
3. In particular, a GROUP BY clause is resolved against the original select columns, not against the names given after AS.
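Conceptually, UNION ALL is plain row concatenation with no deduplication, which is why the sub-queries must agree on column names and order. A toy illustration in Python (the table contents are made up):

```python
# Rows from the two sub-selects; both use the same column layout.
table_a = [{'id': 1, 'var1': 'x', 'var2': 'y'}]
table_b = [{'id': 2, 'var1': 'p', 'var2': 'q'},
           {'id': 1, 'var1': 'x', 'var2': 'y'}]  # duplicate of table_a's row

# UNION ALL keeps duplicates; it is simple concatenation.
unioned = table_a + table_b
print([row['id'] for row in unioned])  # [1, 2, 1]
```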



[hive] Getting dates

hadoop 2016. 2. 26. 19:21



Example code for getting dates in hive.


To get the current time, use unix_timestamp(), or combine it with from_unixtime(); use from_unixtime() when you want a human-readable value.


> select unix_timestamp();

1456481927


> select from_unixtime(unix_timestamp());

2016-02-26 19:21:52

 

To get individual parts of the date, such as for a YYYYMMDD format, use functions like year() and month().


Passing unix_timestamp() to them directly causes an error, because these functions expect a date string rather than a bigint:


> select year(unix_timestamp()), month(unix_timestamp()), day(unix_timestamp()), hour(unix_timestamp()), minute(unix_timestamp()), second(unix_timestamp());

 


current_timestamp and current_date both give the current date. The big difference between the two is that current_date has no time part, so the hour, minute, and second functions cannot be used with it.



> select year(current_timestamp), month(current_timestamp), day(current_timestamp), hour(current_timestamp), minute(current_timestamp), second(current_timestamp);

 2016 2 26 19 21 3

 


> select year(current_date), month(current_date), day(current_date), hour(current_date), minute(current_date), second(current_date);

 2016 2 26 NULL NULL NULL
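The same values can be reproduced outside hive for comparison; a small Python sketch (these are analogues, not hive functions — from_unixtime here is my helper name):

```python
import time
from datetime import datetime

def from_unixtime(ts):
    # Analogue of hive's from_unixtime(): epoch seconds -> 'yyyy-MM-dd HH:mm:ss'.
    return datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')

now = int(time.time())       # analogue of unix_timestamp()
print(now)
print(from_unixtime(now))

# year()/month()/day() want a date value, not a bigint, so convert first.
d = datetime.fromtimestamp(now)
print(d.year, d.month, d.day, d.hour, d.minute, d.second)
```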


[repost] Building up basic knowledge of hadoop streaming

hadoop 2016. 2. 17.


An article recommended for anyone new to hadoop streaming:



http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/




mapper.py

#!/usr/bin/env python
import sys

# input comes from STDIN (standard input)
for line in sys.stdin:
    # remove leading and trailing whitespace
    line = line.strip()
    # split the line into words
    words = line.split()
    # increase counters
    for word in words:
        # write the results to STDOUT (standard output);
        # what we output here will be the input for the
        # Reduce step, i.e. the input for reducer.py
        #
        # tab-delimited; the trivial word count is 1
        print '%s\t%s' % (word, 1)




reducer.py


#!/usr/bin/env python
from operator import itemgetter
import sys

current_word = None
current_count = 0
word = None

# input comes from STDIN
for line in sys.stdin:
    # remove leading and trailing whitespace
    line = line.strip()
    # parse the input we got from mapper.py
    word, count = line.split('\t', 1)
    # convert count (currently a string) to int
    try:
        count = int(count)
    except ValueError:
        # count was not a number, so silently
        # ignore/discard this line
        continue
    # this IF-switch only works because Hadoop sorts map output
    # by key (here: word) before it is passed to the reducer
    if current_word == word:
        current_count += count
    else:
        if current_word:
            # write result to STDOUT
            print '%s\t%s' % (current_word, current_count)
        current_count = count
        current_word = word

# do not forget to output the last word if needed!
if current_word == word:
    print '%s\t%s' % (current_word, current_count)



Run


hadoop jar contrib/streaming/hadoop-*streaming*.jar \
  -mapper ./mapper.py \
  -reducer ./reducer.py \
  -file ./mapper.py \
  -file ./reducer.py \
  -input /user/hduser/gutenberg/* \
  -output /user/hduser/gutenberg-output
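Before submitting to the cluster, the map-sort-reduce contract above can be rehearsed in-process; a small sketch (the sample input lines are made up):

```python
def run_mapper(lines):
    # Emit (word, 1) for every word, like mapper.py writes to stdout.
    for line in lines:
        for word in line.strip().split():
            yield (word, 1)

def run_reducer(pairs):
    # Sum counts per word; relies on pairs arriving sorted by key,
    # which is exactly the guarantee Hadoop gives between map and reduce.
    current_word, current_count = None, 0
    for word, count in pairs:
        if word == current_word:
            current_count += count
        else:
            if current_word is not None:
                yield (current_word, current_count)
            current_word, current_count = word, count
    if current_word is not None:
        yield (current_word, current_count)

pairs = sorted(run_mapper(['foo foo bar', 'bar foo']))  # the shuffle/sort step
print(list(run_reducer(pairs)))  # [('bar', 2), ('foo', 3)]
```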


[hadoop] top n sorting

hadoop 2016. 2. 16. 20:56


I ran a hadoop job that produced keys and counts, and now want the top n by count.


Assume a hadoop map-reduce job produced per-url counts, read back as follows:


hadoop fs -text /user/google/count/2016/02/15/*


/search/test  15

/search/abc  10

/search/check  20

...





Simply piping through sort and head gives the result:

hadoop fs -text /user/google/count/2016/02/15/* | sort -n -k2 -r | head -n3


/search/check  20

/search/test  15

/search/abc  10
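The same top-n selection is easy to express in Python as well, e.g. when the counts are already in memory (sample data from above):

```python
lines = ['/search/test\t15', '/search/abc\t10', '/search/check\t20']

# sort -n -k2 -r | head -n3: numeric sort on the 2nd field, descending, top 3.
top3 = sorted(lines, key=lambda l: int(l.split()[1]), reverse=True)[:3]
for line in top3:
    print(line)
```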





[hadoop] Things to watch out for when using sqoop

hadoop 2016. 2. 5.


1)


You must write "AND \$CONDITIONS" at the end of the WHERE clause!!! Learned this the hard way.


sqoop...

--query "SELECT id, name

                 FROM $db_table

                 WHERE id >=1  AND \$CONDITIONS " \

....



Source: https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html

If you want to import the results of a query in parallel, then each map task will need to execute a copy of the query, with results partitioned by bounding conditions inferred by Sqoop. Your query must include the token $CONDITIONS which each Sqoop process will replace with a unique condition expression. You must also select a splitting column with --split-by.

<Note>

If you are issuing the query wrapped with double quotes ("), you will have to use \$CONDITIONS instead of just $CONDITIONS to disallow your shell from treating it as a shell variable. For example, a double quoted query may look like: "SELECT * FROM x WHERE a='foo' AND \$CONDITIONS"





2) 


To run the import in parallel, use --num-mappers.

--num-mappers $num_mappers



[hadoop] hadoop distcp

hadoop 2016. 2. 5. 11:59


This is an example of copying between hdfs clusters using hadoop2's distcp.


(The documentation describes DistCp version 2, but these days running distcp automatically uses it:

/usr/lib/hadoop-mapreduce/hadoop-distcp-2.6.0-cdh5.5.1.jar)


$ hadoop distcp -m 12 hdfs://internal-hadoop1.google.com/user/www/score /user/www/score


-m is the number of maps that copy concurrently, but specifying 12 does not guarantee exactly 12 mappers. In practice, I saw 13 mappers run on our internal hadoop.


To make a specific user the owner of the copied files, use HADOOP_USER_NAME.


$ HADOOP_USER_NAME=google hadoop distcp -m 12 hdfs://internal-hadoop1.google.com/user/www/score  /user/www/score


$ hadoop fs -ls  /user/www/score

drwxr-xr-x   - google   supergroup          0 2016-02-05 11:21 /user/www/score




If the files already exist, they are not overwritten. Either use chown appropriately, or note that -overwrite and -delete require super user (hdfs) permissions.


$ HADOOP_USER_NAME=hdfs hadoop fs -chown deploy  /user/www/score


$ HADOOP_USER_NAME=hdfs hadoop distcp -m 12 -overwrite -delete hdfs://internal-hadoop1.google.com/user/www/score  /user/www/score



Without the permissions, an error like the following occurs:


Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=deploy, access=WRITE, inode="/user/www/score":hdfs:supergroup:drwxr-xr-x



Copying multiple sources is also possible:

hadoop fs -cp dirA dirB dirC copies dirA and dirB into dirC.



Source:

https://hadoop.apache.org/docs/r1.2.1/distcp.html



While running hadoop2, the job blocked for about 10 minutes at map 99%, reduce 33%, and then continued.

These are my notes from investigating why hadoop blocks for about 10 minutes.



16/02/03 17:49:35 INFO mapreduce.Job:  map 99% reduce 33%



The hadoop machine logs look like the following. Something blocks for 10 minutes, and the word 'Spilling' stands out.

There are spill logs in between, followed by 'Finished spill'.


2016-02-03 16:01:46,175 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: Spilling map output
2016-02-03 16:01:46,175 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: bufstart = 103441438; bufend = 199707946; bufvoid = 268435456
2016-02-03 16:01:46,175 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: kvstart = 25860352(103441408); kvend = 63348756(253395024); length = 29620461/16777216
2016-02-03 16:01:46,190 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=276100000/276093708/0 in:320301=276100000/862 [rec/s] out:320294=276093713/862 [rec/s]
... (repeated PipeMapRed progress lines omitted) ...
2016-02-03 16:01:56,331 INFO [SpillThread] org.apache.hadoop.mapred.MapTask: Finished spill 36
... (more PipeMapRed progress lines omitted) ...
2016-02-03 16:02:08,578 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: Spilling map output
2016-02-03 16:02:08,578 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: bufstart = 226551477; bufend = 54382536; bufvoid = 268435449
... (more PipeMapRed progress lines omitted) ...
2016-02-03 16:02:19,043 INFO [SpillThread] org.apache.hadoop.mapred.MapTask: Finished spill 37
... (log continues) ...
[rec/s] out:320300=290191922/906 [rec/s] 2016-02-03 16:02:30,819 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290300000/290294007/0 in:320066=290300000/907 [rec/s] out:320059=290294007/907 [rec/s] 2016-02-03 16:02:31,059 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290400000/290389790/0 in:320176=290400000/907 [rec/s] out:320165=290389790/907 [rec/s] 2016-02-03 16:02:31,316 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290500000/290492190/0 in:320286=290500000/907 [rec/s] out:320278=290492190/907 [rec/s] 2016-02-03 16:02:31,572 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290600000/290594275/0 in:320396=290600000/907 [rec/s] out:320390=290594275/907 [rec/s] 2016-02-03 16:02:31,813 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290700000/290690059/0 in:320154=290700000/908 [rec/s] out:320143=290690059/908 [rec/s] 2016-02-03 16:02:32,069 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290800000/290791856/0 in:320264=290800000/908 [rec/s] out:320255=290791863/908 [rec/s] 2016-02-03 16:02:32,326 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290900000/290894173/0 in:320374=290900000/908 [rec/s] out:320368=290894181/908 [rec/s] 2016-02-03 16:02:32,332 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: Spilling map output 2016-02-03 16:02:32,332 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: bufstart = 81226068; bufend = 177492576; bufvoid = 268435456 2016-02-03 16:02:32,332 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: kvstart = 20306512(81226048); kvend = 57794916(231179664); length = 29620461/16777216 2016-02-03 16:02:32,332 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: (EQUATOR) 204336112 kvi 51084024(204336096) 2016-02-03 16:02:32,539 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291000000/290990287/0 in:320484=291000000/908 [rec/s] out:320473=290990288/908 [rec/s] 2016-02-03 16:02:32,758 INFO [main] org.apache.hadoop.streaming.PipeMapRed: 
R/W/S=291100000/291092483/0 in:320242=291100000/909 [rec/s] out:320233=291092484/909 [rec/s] 2016-02-03 16:02:32,976 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291200000/291194347/0 in:320352=291200000/909 [rec/s] out:320345=291194349/909 [rec/s] 2016-02-03 16:02:33,182 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291300000/291290726/0 in:320462=291300000/909 [rec/s] out:320451=291290728/909 [rec/s] 2016-02-03 16:02:33,399 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291400000/291392485/0 in:320572=291400000/909 [rec/s] out:320563=291392488/909 [rec/s] 2016-02-03 16:02:33,619 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291500000/291492639/0 in:320329=291500000/910 [rec/s] out:320321=291492641/910 [rec/s] 2016-02-03 16:02:33,824 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291600000/291588953/0 in:320439=291600000/910 [rec/s] out:320427=291588954/910 [rec/s] 2016-02-03 16:02:34,042 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291700000/291692874/0 in:320549=291700000/910 [rec/s] out:320541=291692879/910 [rec/s] 2016-02-03 16:02:34,260 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291800000/291794877/0 in:320659=291800000/910 [rec/s] out:320653=291794879/910 [rec/s] 2016-02-03 16:02:34,530 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291900000/291881089/0 in:320769=291900000/910 [rec/s] out:320748=291881093/910 [rec/s] 2016-02-03 16:02:34,869 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292000000/291983351/0 in:320526=292000000/911 [rec/s] out:320508=291983353/911 [rec/s] 2016-02-03 16:02:35,202 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292100000/292084554/0 in:320636=292100000/911 [rec/s] out:320619=292084555/911 [rec/s] 2016-02-03 16:02:35,520 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292200000/292180764/0 in:320746=292200000/911 [rec/s] out:320725=292180766/911 [rec/s] 2016-02-03 16:02:35,865 INFO [main] 
org.apache.hadoop.streaming.PipeMapRed: R/W/S=292300000/292287374/0 in:320504=292300000/912 [rec/s] out:320490=292287375/912 [rec/s] 2016-02-03 16:02:36,186 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292400000/292387879/0 in:320614=292400000/912 [rec/s] out:320600=292387880/912 [rec/s] 2016-02-03 16:02:36,513 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292500000/292483039/0 in:320723=292500000/912 [rec/s] out:320705=292483041/912 [rec/s] 2016-02-03 16:02:43,788 INFO [SpillThread] org.apache.hadoop.mapred.MapTask: Finished spill 38 2016-02-03 16:02:43,788 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: (RESET) equator 204336112 kv 51084024(204336096) kvi 44373148(177492592) 2016-02-03 16:02:43,788 INFO [Thread-13] org.apache.hadoop.streaming.PipeMapRed: Records R/W=292589102/292574290 2016-02-03 16:02:43,810 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292600000/292593753/0 in:318043=292600000/920 [rec/s] out:318036=292593753/920 [rec/s] 2016-02-03 16:02:44,028 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292700000/292695523/0 in:318152=292700000/920 [rec/s] out:318147=292695523/920 [rec/s] 2016-02-03 16:02:44,232 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292800000/292791307/0 in:318260=292800000/920 [rec/s] out:318251=292791307/920 [rec/s] 2016-02-03 16:02:44,451 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292900000/292893391/0 in:318369=292900000/920 [rec/s] out:318362=292893391/920 [rec/s] 2016-02-03 16:02:44,669 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293000000/292995476/0 in:318132=293000000/921 [rec/s] out:318127=292995476/921 [rec/s] 2016-02-03 16:02:44,875 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293100000/293091890/0 in:318241=293100000/921 [rec/s] out:318232=293091890/921 [rec/s] 2016-02-03 16:02:45,092 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293200000/293193030/0 in:318349=293200000/921 [rec/s] 
out:318342=293193030/921 [rec/s] 2016-02-03 16:02:45,311 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293300000/293295115/0 in:318458=293300000/921 [rec/s] out:318452=293295115/921 [rec/s] 2016-02-03 16:02:45,519 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293400000/293391454/0 in:318566=293400000/921 [rec/s] out:318557=293391467/921 [rec/s] 2016-02-03 16:02:45,738 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293500000/293493613/0 in:318329=293500000/922 [rec/s] out:318322=293493613/922 [rec/s] 2016-02-03 16:02:45,956 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293600000/293595383/0 in:318438=293600000/922 [rec/s] out:318433=293595383/922 [rec/s] 2016-02-03 16:02:46,163 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293700000/293691746/0 in:318546=293700000/922 [rec/s] out:318537=293691759/922 [rec/s] 2016-02-03 16:02:46,382 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293800000/293793566/0 in:318655=293800000/922 [rec/s] out:318648=293793566/922 [rec/s] 2016-02-03 16:02:46,587 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293900000/293889778/0 in:318763=293900000/922 [rec/s] out:318752=293889791/922 [rec/s] 2016-02-03 16:02:46,805 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=294000000/293991750/0 in:318526=294000000/923 [rec/s] out:318517=293991750/923 [rec/s] 2016-02-03 16:02:47,022 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=294100000/294093204/0 in:318634=294100000/923 [rec/s] out:318627=294093204/923 [rec/s] 2016-02-03 16:02:47,132 INFO [Thread-14] org.apache.hadoop.streaming.PipeMapRed: MRErrorThread done 2016-02-03 16:02:47,132 INFO [main] org.apache.hadoop.streaming.PipeMapRed: mapRedFinished 2016-02-03 16:02:47,133 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output 2016-02-03 16:02:47,133 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling map output 2016-02-03 16:02:47,133 INFO [main] 
org.apache.hadoop.mapred.MapTask: bufstart = 204336112; bufend = 246541860; bufvoid = 268435456 2016-02-03 16:02:47,133 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 51084024(204336096); kvend = 38097644(152390576); length = 12986381/16777216 2016-02-03 16:02:50,457 INFO [main] org.apache.hadoop.mapred.MapTask: Finished spill 39 2016-02-03 16:02:50,474 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:02:50,479 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,480 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,481 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,482 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,483 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,484 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,485 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,486 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,487 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,487 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,488 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,489 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,490 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,491 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got 
brand-new decompressor [.snappy] 2016-02-03 16:02:50,492 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,493 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,493 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,494 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,495 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,496 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,497 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,498 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,499 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,500 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,501 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,501 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,502 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,503 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,504 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,505 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,506 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,507 INFO 
[main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,508 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,509 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,509 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,510 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,511 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,512 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,513 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,514 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,515 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11576265 bytes 2016-02-03 16:02:55,997 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:02:56,017 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11524779 bytes 2016-02-03 16:03:01,378 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:01,398 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11597690 bytes 2016-02-03 16:03:06,753 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:06,772 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11584220 bytes 2016-02-03 16:03:12,195 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 
16:03:12,214 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11554950 bytes 2016-02-03 16:03:17,596 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:17,615 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11527865 bytes 2016-02-03 16:03:22,982 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:23,001 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11606423 bytes 2016-02-03 16:03:28,343 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:28,366 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11530533 bytes 2016-02-03 16:03:33,597 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:33,616 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11546832 bytes 2016-02-03 16:03:39,020 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:39,043 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11504278 bytes 2016-02-03 16:03:44,336 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:44,354 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11560124 bytes 2016-02-03 16:03:49,646 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:49,664 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11507371 bytes 2016-02-03 16:03:54,891 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:54,910 INFO [main] 
org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11553016 bytes 2016-02-03 16:04:00,273 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:04:00,291 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11584433 bytes 2016-02-03 16:04:05,615 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:04:05,638 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11563511 bytes 2016-02-03 16:04:11,095 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:04:11,113 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11543123 bytes 2016-02-03 16:04:16,364 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:04:16,382 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11561494 bytes 2016-02-03 16:04:21,754 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:04:21,772 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11533300 bytes 2016-02-03 16:04:27,097 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:04:27,118 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11546776 bytes 2016-02-03 16:04:32,408 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:04:32,425 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11524621 bytes 2016-02-03 16:04:37,744 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1450233158783_28583_m_000000_0 is done. 
And is in the process of committing 2016-02-03 16:04:37,818 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1450233158783_28583_m_000000_0' done. 2016-02-03 16:04:37,918 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system... 2016-02-03 16:04:37,919 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped. 2016-02-03 16:04:37,919 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete.
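The PipeMapRed progress lines above have a regular shape (R/W/S counters plus an in/out rate in records per second), so they are easy to extract with a small script when you want to chart a job's throughput. A minimal sketch; the regex is my own assumption about the line format, based only on the log above:

```python
import re

# Matches Hadoop Streaming PipeMapRed progress lines such as:
#   R/W/S=286800000/286792242/0 in:319021=286800000/899 [rec/s] ...
LINE_RE = re.compile(
    r"R/W/S=(?P<read>\d+)/(?P<written>\d+)/(?P<skipped>\d+)"
    r".*?in:(?P<in_rate>\d+)=\d+/(?P<secs>\d+)"
)

def parse_progress(line):
    """Return (records_read, records_written, in_rate_rec_per_sec, elapsed_secs), or None."""
    m = LINE_RE.search(line)
    if not m:
        return None
    return (int(m.group("read")), int(m.group("written")),
            int(m.group("in_rate")), int(m.group("secs")))

sample = ("2016-02-03 16:02:22,658 INFO [main] org.apache.hadoop.streaming.PipeMapRed: "
          "R/W/S=286800000/286792242/0 in:319021=286800000/899 [rec/s] "
          "out:319012=286792242/899 [rec/s]")
print(parse_progress(sample))  # (286800000, 286792242, 319021, 899)
```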





The first 33% of the reducer's progress is the copy phase, the next 33% is the shuffle/sort phase, and the final 33% is the actual reduce work.
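The three equal thirds described above can be sketched as a tiny helper that maps an overall reduce-progress fraction back to its phase (the function name and thresholds are illustrative, not Hadoop API):

```python
def reduce_phase(progress):
    """Map overall reduce progress (0.0-1.0) to its phase.

    The reduce progress bar is split into three equal thirds:
    copy, then shuffle/sort, then the actual reduce calls.
    """
    if progress < 1 / 3:
        return "copy"
    elif progress < 2 / 3:
        return "sort"
    return "reduce"

print(reduce_phase(0.20))  # copy
print(reduce_phase(0.50))  # sort
print(reduce_phase(0.90))  # reduce
```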



When a map task completes, its output is copied to the node where the reduce work will run. Because map and reduce do not necessarily run on the same machine, each finished map task's output is shipped to the reduce side as soon as it becomes available, before the whole map phase has ended. Some map tasks finish early while others take much longer, and this is exactly where the delay comes from: a single long-running map task blocks the job's progress.
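This straggler effect can be illustrated with made-up numbers: copying can begin as soon as the first map finishes, but the sort cannot begin until the last one does, so one slow map delays everything behind it.

```python
# Toy illustration of the straggler effect: durations are made up.
map_task_durations = [12, 14, 13, 95, 11]  # seconds; one straggler at 95s

copy_can_start_at = min(map_task_durations)  # first map output is available early
sort_can_start_at = max(map_task_durations)  # must wait for the slowest map

print(copy_can_start_at, sort_can_start_at)  # 11 95
```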


The shuffle's sort work can only start once every map task has completed; the precondition for reduce is that it cannot sort until it has received all of the data from the maps.
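The merge side of this sort phase (the "Merging 40 sorted segments" passes in the log above) is a k-way merge of already-sorted spill segments. A minimal sketch using Python's `heapq.merge`, which streams the segments instead of loading everything at once; the sample data is invented:

```python
import heapq

# Each spill segment is already sorted by key, as in a map task's spill files.
segments = [
    [("apple", 1), ("cat", 3)],
    [("banana", 2), ("dog", 1)],
    [("ant", 5), ("zebra", 7)],
]

# k-way merge: yields all pairs in global key order.
merged = list(heapq.merge(*segments))
print(merged[0], merged[-1])  # ('ant', 5) ('zebra', 7)
```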



The following figure illustrates this (source: http://ercoppa.github.io/HadoopInternals/AnatomyMapReduceJob.htm).

The map phase and the reduce phase look like this:






The SPILLING steps are as follows. (source: http://ercoppa.github.io/HadoopInternals/AnatomyMapReduceJob.htm)
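The spill cycle can be sketched as follows: map output accumulates in an in-memory buffer, and when it crosses a threshold the buffered records are sorted by key and written out as a new segment while collection continues. The names and the tiny threshold here are illustrative, not Hadoop's actual classes or defaults:

```python
SPILL_THRESHOLD = 4  # illustrative; Hadoop uses a fraction of the sort buffer

buffer, spills = [], []

def collect(key, value):
    """Buffer one map output record; spill (sort + write a segment) when full."""
    buffer.append((key, value))
    if len(buffer) >= SPILL_THRESHOLD:
        spills.append(sorted(buffer))  # each spilled segment is sorted by key
        buffer.clear()

for i, k in enumerate("dcbahgfe"):
    collect(k, i)

print(len(spills), spills[0])  # 2 spills; the first is sorted a,b,c,d
```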







Posted by 김용환
