This post briefly describes how to install hadoop 1.2.1 and hive 1.0.1.



* Installing hadoop in local mode


1) ssh

To run hadoop in local mode, the machine must accept ssh connections.

./bin/start-all.sh connects to the node over ssh, so passwordless ssh must be set up first.



$ ssh-keygen

$ cat ~/.ssh/id_rsa.pub | ssh localhost 'cat >> ~/.ssh/authorized_keys'

$ ssh -l username localhost

or

$ ssh username@localhost

Since this is the same machine, you can also append the public key directly (it belongs in authorized_keys; known_hosts only stores host keys):

$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys



On a Mac, enable System Preferences -> Sharing -> Remote Login so that ssh connections are accepted.


2) Environment setup

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_40.jdk/Contents/Home




3) Download hadoop

$ curl -O http://apache.tt.co.kr/hadoop/common/hadoop-1.2.1/hadoop-1.2.1-bin.tar.gz

Install (extract the archive, then move the extracted directory; use sudo if /usr/local requires it):

$ tar xzf hadoop-1.2.1-bin.tar.gz

$ mv hadoop-1.2.1 /usr/local/hadoop-1.2.1


4) Edit the hadoop configuration

$ vi /usr/local/hadoop-1.2.1/conf/mapred-site.xml

<configuration>

    <property>

        <name>mapred.job.tracker</name>

        <value>localhost:9001</value>

    </property>

</configuration>


$ vi /usr/local/hadoop-1.2.1/conf/hdfs-site.xml

<configuration>

    <property>

        <name>dfs.replication</name>

        <value>1</value>

    </property>

</configuration>


$ vi /usr/local/hadoop-1.2.1/conf/core-site.xml

<configuration>

    <property>

        <name>fs.default.name</name>

        <value>hdfs://localhost:9000</value>

   </property>

</configuration>



5) Add environment variables

Add the following to .bashrc.

(If a log message says JAVA_HOME is not set, add it to .profile as well.)


export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_40.jdk/Contents/Home

export PATH=/usr/local/hadoop-1.2.1/bin:$PATH



6) Format the namenode


$ ./bin/hadoop namenode -format

If a permission-related error occurs, run chmod 755 on the namenode directory named in the error.

(If the namenode is not formatted, the http://localhost:50070/dfshealth.jsp page will not open, and the log files under the logs directory will show messages like ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000.)


e.g.) chmod 755 /tmp/hadoop/dfs/name




7) Run


$ ./bin/start-all.sh


If there are no errors and the daemons come up, everything is working.
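You can also confirm with jps that the expected daemons are up; a rough sketch of the output (the PIDs here are illustrative):

$ jps
11697 NameNode
11825 DataNode
11951 SecondaryNameNode
12031 JobTracker
12158 TaskTracker
12290 Jps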




8) Verify


Open http://localhost:50030/jobtracker.jsp in a browser and check that the jobtracker page works.

Open http://localhost:50070/dfshealth.jsp in a browser and check that the namenode page works.




* Installing hive


1) Install hive 1.0.1

Download http://apache.tt.co.kr/hive/hive-1.0.1/apache-hive-1.0.1-bin.tar.gz.

Extract the archive and copy it to /usr/local/hive-1.0.1.
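For example, roughly (whether you need sudo depends on your environment):

$ curl -O http://apache.tt.co.kr/hive/hive-1.0.1/apache-hive-1.0.1-bin.tar.gz

$ tar xzf apache-hive-1.0.1-bin.tar.gz

$ sudo mv apache-hive-1.0.1-bin /usr/local/hive-1.0.1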


2) Permissions


hadoop fs -mkdir /tmp

hadoop fs -mkdir /tmp/hive

hadoop fs -mkdir /user/hive/warehouse

hadoop fs -chmod go+w /tmp

hadoop fs -chmod go+w /user/hive/warehouse

hadoop fs -chmod go+w /tmp/hive



3) PATH

Add /usr/local/hive-1.0.1/bin to the PATH in .bashrc.


$ vi ~/.bashrc

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_40.jdk/Contents/Home

export PATH=/usr/local/hadoop-1.2.1/bin:/usr/local/hive-1.0.1/bin:$PATH



Check that it works correctly:


hive> show tables;

OK

Time taken: 0.012 seconds

hive> select 1 + 1;

OK

2

Time taken: 0.342 seconds, Fetched: 1 row(s)






[hive] Viewing function documentation

hadoop 2016. 3. 28. 19:23


To inspect hive functions, use the following commands.


SHOW FUNCTIONS; 

DESCRIBE FUNCTION function_name;

DESCRIBE FUNCTION EXTENDED function_name;

 


Example:


> DESCRIBE FUNCTION  xpath_string;

OK

xpath_string(xml, xpath) - Returns the text contents of the first xml node that matches the xpath expression

Time taken: 0.006 seconds, Fetched: 1 row(s)



> DESCRIBE FUNCTION EXTENDED xpath_string;

OK

xpath_string(xml, xpath) - Returns the text contents of the first xml node that matches the xpath expression

Example:

  > SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c') FROM src LIMIT 1;

  'cc'

  > SELECT xpath_string('<a><b>b1</b><b>b2</b></a>','a/b') FROM src LIMIT 1;

  'b1'

  > SELECT xpath_string('<a><b>b1</b><b>b2</b></a>','a/b[2]') FROM src LIMIT 1;

  'b2'

  > SELECT xpath_string('<a><b>b1</b><b>b2</b></a>','a') FROM src LIMIT 1;

  'b1b2'

Time taken: 0.01 seconds, Fetched: 10 row(s)



[hive] hive only supports equi-joins (equal join)

hadoop 2016. 3. 25.


hive currently supports only equi-joins (equal join); non-equi joins are not supported.

That is, only = works in the join condition (e.g., on a.id = b.id).


https://issues.apache.org/jira/browse/HIVE-3133



Instead, it supports a variety of join types:


JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN,FULL OUTER JOIN, CROSS JOIN
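If you need a non-equi condition anyway, one common workaround is to do a CROSS JOIN and move the condition into WHERE; a sketch with hypothetical tables (note this evaluates the full cross product, so it can be very slow):

SELECT a.id, b.id
FROM table_a a
CROSS JOIN table_b b
WHERE a.id < b.id;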






[hive] Combining data into one

hadoop 2016. 2. 29.


In hive HQL, you can combine data from multiple tables into one, as if it were a single table, by using UNION ALL:



SELECT unioned.id, unioned.var1, unioned.var2

FROM (

  SELECT a.id, a.var1, a.var2

  FROM table_A a


  UNION ALL


  SELECT b.id, b.var1, b.var2

  FROM table_B b

) unioned;



Things to watch out for:
1. Each table in the FROM clauses must be given a distinct alias.
2. The column names in every subquery must be identical; rename them with AS if needed (see the sketch below).
3. In particular, if you use GROUP BY, it is evaluated against the original source columns of the SELECT, not the names given after AS.
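For caveats 1 and 2, mismatched column names can be aligned with AS; a sketch assuming a hypothetical table_B whose columns are actually named uid and v1:

SELECT unioned.id, unioned.var1
FROM (
  SELECT a.id, a.var1
  FROM table_A a

  UNION ALL

  SELECT b.uid AS id, b.v1 AS var1
  FROM table_B b
) unioned;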



[hive] Getting dates

hadoop 2016. 2. 26. 19:21



Example code showing how to get dates in hive.

To get today's date, use unix_timestamp(), or combine unix_timestamp() with from_unixtime(). For a human-readable value, make good use of from_unixtime().


> select unix_timestamp();

1456481927


> select from_unixtime(unix_timestamp());

2016-02-26 19:21:52
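from_unixtime() also takes a format pattern, which is convenient for compact forms such as YYYYMMDD (output shown for this post's date):

> select from_unixtime(unix_timestamp(), 'yyyyMMdd');

20160226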

 

To get only specific parts of the date, in a form like YYYYMMDD, use functions such as year() and month().

Passing unix_timestamp() into them directly causes an error:


> select year(unix_timestamp()), month(unix_timestamp()), day(unix_timestamp()), hour(unix_timestamp()), minute(unix_timestamp()), second(unix_timestamp());

 


You can get the current date with current_timestamp or current_date. The big difference between the two is that current_date cannot be used with the hour, minute, and second functions.



> select year(current_timestamp), month(current_timestamp), day(current_timestamp), hour(current_timestamp), minute(current_timestamp), second(current_timestamp);

 2016 2 26 19 21 3

 


> select year(current_date), month(current_date), day(current_date), hour(current_date), minute(current_date), second(current_date);

 2016 2 26 NULL NULL NULL


[repost] hadoop streaming basics

hadoop 2016. 2. 17.


An article recommended for anyone new to hadoop streaming:



http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/




mapper.py

#!/usr/bin/env python

import sys

# input comes from STDIN (standard input)
for line in sys.stdin:
    # remove leading and trailing whitespace
    line = line.strip()
    # split the line into words
    words = line.split()
    # increase counters
    for word in words:
        # write the results to STDOUT (standard output);
        # what we output here will be the input for the
        # Reduce step, i.e. the input for reducer.py
        #
        # tab-delimited; the trivial word count is 1
        print '%s\t%s' % (word, 1)




reducer.py


#!/usr/bin/env python

from operator import itemgetter
import sys

current_word = None
current_count = 0
word = None

# input comes from STDIN
for line in sys.stdin:
    # remove leading and trailing whitespace
    line = line.strip()

    # parse the input we got from mapper.py
    word, count = line.split('\t', 1)

    # convert count (currently a string) to int
    try:
        count = int(count)
    except ValueError:
        # count was not a number, so silently
        # ignore/discard this line
        continue

    # this IF-switch only works because Hadoop sorts map output
    # by key (here: word) before it is passed to the reducer
    if current_word == word:
        current_count += count
    else:
        if current_word:
            # write result to STDOUT
            print '%s\t%s' % (current_word, current_count)
        current_count = count
        current_word = word

# do not forget to output the last word if needed!
if current_word == word:
    print '%s\t%s' % (current_word, current_count)



Run


hadoop jar contrib/streaming/hadoop-*streaming*.jar \
  -mapper ./mapper.py \
  -reducer ./reducer.py \
  -file ./mapper.py \
  -file ./reducer.py \
  -input /user/hduser/gutenberg/* \
  -output /user/hduser/gutenberg-output


[hadoop] top-n sorting

hadoop 2016. 2. 16. 20:56


I ran a hadoop job that produced keys and counts, and I want the top n entries by count.

Suppose a hadoop map-reduce job produced per-url counts like this:


hadoop fs -text /user/google/count/2016/02/15/*


/search/test  15

/search/abc  10

/search/check  20

...





Simply piping through sort and head gives the result (-n for numeric sort, -k2 for the second field, -r for descending order):

hadoop fs -text /user/google/count/2016/02/15/* | sort -n -k2 -r | head -n3


/search/check  20

/search/test  15

/search/abc  10





[hadoop] Things to watch out for when using sqoop

hadoop 2016. 2. 5.


1)


"AND \$CONDITIONS"를 WHERE 끝에 반드시 써야 한다!!! 아 삽질~


sqoop...

--query "SELECT id, name

                 FROM $db_table

                 WHERE id >=1  AND \$CONDITIONS " \

....



Source: https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html

If you want to import the results of a query in parallel, then each map task will need to execute a copy of the query, with results partitioned by bounding conditions inferred by Sqoop. Your query must include the token $CONDITIONS which each Sqoop process will replace with a unique condition expression. You must also select a splitting column with --split-by.

<Note>

If you are issuing the query wrapped with double quotes ("), you will have to use \$CONDITIONS instead of just $CONDITIONS to disallow your shell from treating it as a shell variable. For example, a double quoted query may look like: "SELECT * FROM x WHERE a='foo' AND \$CONDITIONS"





2) 


To run the import in parallel, use --num-mappers. A fuller sketch follows below.

--num-mappers $num_mappers
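Putting 1) and 2) together, a rough end-to-end sketch; the JDBC URL, credentials, table, and target directory below are placeholders, not values from the original job:

$ sqoop import \
    --connect jdbc:mysql://db.example.com/mydb \
    --username dbuser -P \
    --query "SELECT id, name FROM member WHERE id >= 1 AND \$CONDITIONS" \
    --split-by id \
    --num-mappers 4 \
    --target-dir /user/hive/member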



[hadoop] hadoop distcp

hadoop 2016. 2. 5. 11:59


This is an example of copying between hdfs clusters using hadoop2's distcp.

(The documentation describes distcp2, but nowadays running distcp automatically uses distcp2: /usr/lib/hadoop-mapreduce/hadoop-distcp-2.6.0-cdh5.5.1.jar)


$ hadoop distcp -m 12 hdfs://internal-hadoop1.google.com/user/www/score /user/www/score


-m is the number of maps that copy concurrently, but writing 12 does not guarantee exactly 12 mappers; in my case, 13 mappers actually ran on the internal hadoop cluster.


If you want a particular user to own the copied files, use HADOOP_USER_NAME.


$ HADOOP_USER_NAME=google hadoop distcp -m 12 hdfs://internal-hadoop1.google.com/user/www/score  /user/www/score


$ hadoop fs -ls  /user/www/score

drwxr-xr-x   - google   supergroup          0 2016-02-05 11:21 /user/www/score




If files already exist at the destination, they are not overwritten. Either manage ownership carefully with chown, or note that using -overwrite and -delete requires super-user (hdfs) privileges.


$ HADOOP_USER_NAME=hdfs hadoop fs -chown deploy  /user/www/score


$ HADOOP_USER_NAME=hdfs hadoop distcp -m 12 -overwrite -delete hdfs://internal-hadoop1.google.com/user/www/score  /user/www/score



Without the right permissions, an error like the following occurs:


Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=deploy, access=WRITE, inode="/user/www/score":hdfs:supergroup:drwxr-xr-x



Copying multiple sources at once is also possible.

hadoop fs -cp dirA dirB dirC copies both dirA and dirB into dirC.
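distcp accepts multiple sources in the same way; a minimal sketch reusing the host above (the source directory names are made up):

$ hadoop distcp hdfs://internal-hadoop1.google.com/user/www/scoreA \
                hdfs://internal-hadoop1.google.com/user/www/scoreB \
                /user/www/dest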



출처 :

https://hadoop.apache.org/docs/r1.2.1/distcp.html



When running a hadoop2 job, it blocked for about 10 minutes at map 99%, reduce 33%, and then carried on.

Below is what I found while investigating why hadoop blocks for roughly 10 minutes.



16/02/03 17:49:35 INFO mapreduce.Job:  map 99% reduce 33%



The logs on the hadoop machine look like the following. Something blocks for 10 minutes, and the word "spilling" stands out:

spill logs appear at intervals, and each is eventually followed by a "Finished spill" entry.


2016-02-03 16:01:46,175 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: Spilling map output 2016-02-03 16:01:46,175 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: bufstart = 103441438; bufend = 199707946; bufvoid = 268435456 2016-02-03 16:01:46,175 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: kvstart = 25860352(103441408); kvend = 63348756(253395024); length = 29620461/16777216 2016-02-03 16:01:46,175 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: (EQUATOR) 226551477 kvi 56637864(226551456) 2016-02-03 16:01:46,190 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=276100000/276093708/0 in:320301=276100000/862 [rec/s] out:320294=276093713/862 [rec/s] 2016-02-03 16:01:46,392 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=276200000/276190849/0 in:320417=276200000/862 [rec/s] out:320407=276190849/862 [rec/s] 2016-02-03 16:01:46,604 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=276300000/276292909/0 in:320533=276300000/862 [rec/s] out:320525=276292913/862 [rec/s] 2016-02-03 16:01:46,816 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=276400000/276394703/0 in:320278=276400000/863 [rec/s] out:320271=276394703/863 [rec/s] 2016-02-03 16:01:47,040 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=276500000/276483714/0 in:320393=276500000/863 [rec/s] out:320375=276483716/863 [rec/s] 2016-02-03 16:01:47,305 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=276600000/276587566/0 in:320509=276600000/863 [rec/s] out:320495=276587568/863 [rec/s] 2016-02-03 16:01:47,556 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=276700000/276688104/0 in:320625=276700000/863 [rec/s] out:320611=276688106/863 [rec/s] 2016-02-03 16:01:47,813 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=276800000/276781368/0 in:320370=276800000/864 [rec/s] out:320348=276781372/864 [rec/s] 2016-02-03 16:01:48,094 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=276900000/276885473/0 in:320486=276900000/864 [rec/s] out:320469=276885475/864 [rec/s] 2016-02-03 16:01:48,322 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=277000000/276994801/0 in:320601=277000000/864 [rec/s] out:320595=276994807/864 [rec/s] 2016-02-03 16:01:48,526 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=277100000/277091737/0 in:320717=277100000/864 [rec/s] out:320708=277091742/864 [rec/s] 2016-02-03 16:01:48,738 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=277200000/277193750/0 in:320462=277200000/865 [rec/s] out:320455=277193754/865 [rec/s] 2016-02-03 16:01:48,938 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=277300000/277290152/0 in:320578=277300000/865 [rec/s] out:320566=277290152/865 [rec/s] 2016-02-03 16:01:49,150 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=277400000/277392006/0 in:320693=277400000/865 [rec/s] out:320684=277392009/865 [rec/s] 2016-02-03 16:01:49,362 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=277500000/277494280/0 in:320809=277500000/865 [rec/s] out:320802=277494284/865 [rec/s] 2016-02-03 16:01:49,564 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=277600000/277590991/0 in:320924=277600000/865 [rec/s] out:320914=277590995/865 [rec/s] 2016-02-03 16:01:49,774 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=277700000/277692050/0 in:320669=277700000/866 [rec/s] out:320660=277692052/866 [rec/s] 2016-02-03 16:01:56,331 INFO [SpillThread] org.apache.hadoop.mapred.MapTask: Finished spill 36 2016-02-03 16:01:56,331 INFO [Thread-13] 
org.apache.hadoop.mapred.MapTask: (RESET) equator 226551477 kv 56637864(226551456) kvi 49926992(199707968) 2016-02-03 16:01:56,331 INFO [Thread-13] org.apache.hadoop.streaming.PipeMapRed: Records R/W=277774720/277764057 2016-02-03 16:01:56,386 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=277800000/277793841/0 in:318577=277800000/872 [rec/s] out:318570=277793848/872 [rec/s] 2016-02-03 16:01:56,589 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=277900000/277890374/0 in:318692=277900000/872 [rec/s] out:318681=277890374/872 [rec/s] 2016-02-03 16:01:56,804 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=278000000/277992774/0 in:318442=278000000/873 [rec/s] out:318433=277992774/873 [rec/s] 2016-02-03 16:01:57,018 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=278100000/278094543/0 in:318556=278100000/873 [rec/s] out:318550=278094543/873 [rec/s] 2016-02-03 16:01:57,224 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=278200000/278190642/0 in:318671=278200000/873 [rec/s] out:318660=278190642/873 [rec/s] 2016-02-03 16:01:57,436 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=278300000/278293042/0 in:318785=278300000/873 [rec/s] out:318777=278293042/873 [rec/s] 2016-02-03 16:01:57,648 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=278400000/278395280/0 in:318535=278400000/874 [rec/s] out:318530=278395294/874 [rec/s] 2016-02-03 16:01:57,852 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=278500000/278492944/0 in:318649=278500000/874 [rec/s] out:318641=278492957/874 [rec/s] 2016-02-03 16:01:58,061 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=278600000/278593310/0 in:318764=278600000/874 [rec/s] out:318756=278593310/874 [rec/s] 2016-02-03 16:01:58,261 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=278700000/278689724/0 in:318878=278700000/874 [rec/s] out:318866=278689724/874 [rec/s] 2016-02-03 16:01:58,474 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=278800000/278792124/0 in:318993=278800000/874 [rec/s] out:318984=278792124/874 [rec/s] 2016-02-03 16:01:58,712 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=278900000/278893894/0 in:318742=278900000/875 [rec/s] out:318735=278893894/875 [rec/s] 2016-02-03 16:01:58,918 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=279000000/278990470/0 in:318857=279000000/875 [rec/s] out:318846=278990482/875 [rec/s] 2016-02-03 16:01:59,136 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=279100000/279092392/0 in:318971=279100000/875 [rec/s] out:318962=279092392/875 [rec/s] 2016-02-03 16:01:59,354 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=279200000/279194792/0 in:319085=279200000/875 [rec/s] out:319079=279194792/875 [rec/s] 2016-02-03 16:01:59,559 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=279300000/279290635/0 in:319200=279300000/875 [rec/s] out:319189=279290649/875 [rec/s] 2016-02-03 16:01:59,778 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=279400000/279393291/0 in:318949=279400000/876 [rec/s] out:318942=279393291/876 [rec/s] 2016-02-03 16:02:00,019 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=279500000/279494745/0 in:319063=279500000/876 [rec/s] out:319057=279494745/876 [rec/s] 2016-02-03 16:02:00,257 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=279600000/279591288/0 in:319178=279600000/876 [rec/s] out:319168=279591304/876 [rec/s] 2016-02-03 16:02:00,476 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=279700000/279693244/0 
in:319292=279700000/876 [rec/s] out:319284=279693244/876 [rec/s] 2016-02-03 16:02:00,681 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=279800000/279789972/0 in:319042=279800000/877 [rec/s] out:319030=279789972/877 [rec/s] 2016-02-03 16:02:00,900 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=279900000/279891742/0 in:319156=279900000/877 [rec/s] out:319146=279891742/877 [rec/s] 2016-02-03 16:02:01,145 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=280000000/279993626/0 in:319270=280000000/877 [rec/s] out:319262=279993633/877 [rec/s] 2016-02-03 16:02:01,356 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=280100000/280090386/0 in:319384=280100000/877 [rec/s] out:319373=280090399/877 [rec/s] 2016-02-03 16:02:01,573 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=280200000/280192011/0 in:319498=280200000/877 [rec/s] out:319489=280192011/877 [rec/s] 2016-02-03 16:02:01,790 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=280300000/280294095/0 in:319248=280300000/878 [rec/s] out:319241=280294095/878 [rec/s] 2016-02-03 16:02:01,997 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=280400000/280390824/0 in:319362=280400000/878 [rec/s] out:319351=280390824/878 [rec/s] 2016-02-03 16:02:02,216 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=280500000/280492909/0 in:319476=280500000/878 [rec/s] out:319468=280492909/878 [rec/s] 2016-02-03 16:02:02,434 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=280600000/280595309/0 in:319589=280600000/878 [rec/s] out:319584=280595309/878 [rec/s] 2016-02-03 16:02:02,639 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=280700000/280691092/0 in:319340=280700000/879 [rec/s] out:319330=280691092/879 [rec/s] 2016-02-03 16:02:02,861 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=280800000/280793492/0 in:319453=280800000/879 [rec/s] out:319446=280793492/879 [rec/s] 2016-02-03 16:02:03,067 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=280900000/280890221/0 in:319567=280900000/879 [rec/s] out:319556=280890221/879 [rec/s] 2016-02-03 16:02:03,286 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=281000000/280992306/0 in:319681=281000000/879 [rec/s] out:319672=280992306/879 [rec/s] 2016-02-03 16:02:03,504 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=281100000/281094076/0 in:319795=281100000/879 [rec/s] out:319788=281094076/879 [rec/s] 2016-02-03 16:02:03,709 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=281200000/281190489/0 in:319545=281200000/880 [rec/s] out:319534=281190489/880 [rec/s] 2016-02-03 16:02:03,927 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=281300000/281292259/0 in:319659=281300000/880 [rec/s] out:319650=281292259/880 [rec/s] 2016-02-03 16:02:04,147 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=281400000/281395289/0 in:319772=281400000/880 [rec/s] out:319767=281395289/880 [rec/s] 2016-02-03 16:02:04,353 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=281500000/281491386/0 in:319886=281500000/880 [rec/s] out:319876=281491388/880 [rec/s] 2016-02-03 16:02:04,571 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=281600000/281593159/0 in:320000=281600000/880 [rec/s] out:319992=281593176/880 [rec/s] 2016-02-03 16:02:04,777 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=281700000/281689571/0 in:319750=281700000/881 [rec/s] out:319738=281689571/881 [rec/s] 2016-02-03 16:02:04,996 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=281800000/281791971/0 
in:319863=281800000/881 [rec/s] out:319854=281791971/881 [rec/s] 2016-02-03 16:02:05,216 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=281900000/281894686/0 in:319977=281900000/881 [rec/s] out:319971=281894686/881 [rec/s] 2016-02-03 16:02:05,417 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=282000000/281990736/0 in:320090=282000000/881 [rec/s] out:320080=281990747/881 [rec/s] 2016-02-03 16:02:05,629 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=282100000/282092555/0 in:319841=282100000/882 [rec/s] out:319832=282092555/882 [rec/s] 2016-02-03 16:02:05,843 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=282200000/282195408/0 in:319954=282200000/882 [rec/s] out:319949=282195419/882 [rec/s] 2016-02-03 16:02:06,043 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=282300000/282291497/0 in:320068=282300000/882 [rec/s] out:320058=282291509/882 [rec/s] 2016-02-03 16:02:06,256 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=282400000/282393768/0 in:320181=282400000/882 [rec/s] out:320174=282393768/882 [rec/s] 2016-02-03 16:02:06,332 INFO [Thread-13] org.apache.hadoop.streaming.PipeMapRed: Records R/W=282437066/282429765 2016-02-03 16:02:06,456 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=282500000/282489867/0 in:320294=282500000/882 [rec/s] out:320283=282489867/882 [rec/s] 2016-02-03 16:02:06,669 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=282600000/282592267/0 in:320045=282600000/883 [rec/s] out:320036=282592267/883 [rec/s] 2016-02-03 16:02:06,881 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=282700000/282694351/0 in:320158=282700000/883 [rec/s] out:320152=282694351/883 [rec/s] 2016-02-03 16:02:07,083 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=282800000/282790765/0 in:320271=282800000/883 [rec/s] out:320261=282790765/883 [rec/s] 2016-02-03 16:02:07,300 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=282900000/282893165/0 in:320385=282900000/883 [rec/s] out:320377=282893165/883 [rec/s] 2016-02-03 16:02:07,517 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=283000000/282994935/0 in:320498=283000000/883 [rec/s] out:320492=282994935/883 [rec/s] 2016-02-03 16:02:07,723 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=283100000/283091348/0 in:320248=283100000/884 [rec/s] out:320239=283091348/884 [rec/s] 2016-02-03 16:02:07,942 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=283200000/283193748/0 in:320361=283200000/884 [rec/s] out:320354=283193748/884 [rec/s] 2016-02-03 16:02:08,148 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=283300000/283290162/0 in:320475=283300000/884 [rec/s] out:320463=283290162/884 [rec/s] 2016-02-03 16:02:08,367 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=283400000/283392247/0 in:320588=283400000/884 [rec/s] out:320579=283392247/884 [rec/s] 2016-02-03 16:02:08,578 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: Spilling map output 2016-02-03 16:02:08,578 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: bufstart = 226551477; bufend = 54382536; bufvoid = 268435449 2016-02-03 16:02:08,578 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: kvstart = 56637864(226551456); kvend = 27017404(108069616); length = 29620461/16777216 2016-02-03 16:02:08,579 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: (EQUATOR) 81226068 kvi 20306512(81226048) 2016-02-03 16:02:08,584 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=283500000/283494017/0 in:320701=283500000/884 [rec/s] out:320694=283494017/884 
[rec/s] 2016-02-03 16:02:08,791 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=283600000/283590837/0 in:320451=283600000/885 [rec/s] out:320441=283590843/885 [rec/s] 2016-02-03 16:02:09,011 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=283700000/283693145/0 in:320564=283700000/885 [rec/s] out:320557=283693145/885 [rec/s] 2016-02-03 16:02:09,229 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=283800000/283795485/0 in:320677=283800000/885 [rec/s] out:320672=283795491/885 [rec/s] 2016-02-03 16:02:09,436 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=283900000/283891792/0 in:320790=283900000/885 [rec/s] out:320781=283891799/885 [rec/s] 2016-02-03 16:02:09,655 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=284000000/283994044/0 in:320541=284000000/886 [rec/s] out:320535=283994044/886 [rec/s] 2016-02-03 16:02:09,861 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=284100000/284090268/0 in:320654=284100000/886 [rec/s] out:320643=284090275/886 [rec/s] 2016-02-03 16:02:10,080 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=284200000/284192542/0 in:320767=284200000/886 [rec/s] out:320759=284192542/886 [rec/s] 2016-02-03 16:02:10,300 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=284300000/284294942/0 in:320880=284300000/886 [rec/s] out:320874=284294942/886 [rec/s] 2016-02-03 16:02:10,509 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=284400000/284391671/0 in:320993=284400000/886 [rec/s] out:320983=284391671/886 [rec/s] 2016-02-03 16:02:10,721 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=284500000/284493147/0 in:320744=284500000/887 [rec/s] out:320736=284493151/887 [rec/s] 2016-02-03 16:02:10,930 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=284600000/284581734/0 in:320856=284600000/887 [rec/s] out:320836=284581735/887 [rec/s] 2016-02-03 16:02:11,338 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=284700000/284683001/0 in:320969=284700000/887 [rec/s] out:320950=284683002/887 [rec/s] 2016-02-03 16:02:11,768 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=284800000/284786400/0 in:320720=284800000/888 [rec/s] out:320705=284786401/888 [rec/s] 2016-02-03 16:02:12,123 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=284900000/284881064/0 in:320833=284900000/888 [rec/s] out:320812=284881065/888 [rec/s] 2016-02-03 16:02:12,521 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=285000000/284983177/0 in:320945=285000000/888 [rec/s] out:320927=284983178/888 [rec/s] 2016-02-03 16:02:12,944 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=285100000/285084074/0 in:320697=285100000/889 [rec/s] out:320679=285084075/889 [rec/s] 2016-02-03 16:02:19,043 INFO [SpillThread] org.apache.hadoop.mapred.MapTask: Finished spill 37 2016-02-03 16:02:19,043 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: (RESET) equator 81226068 kv 20306512(81226048) kvi 13595640(54382560) 2016-02-03 16:02:19,043 INFO [Thread-13] org.apache.hadoop.streaming.PipeMapRed: Records R/W=285194313/285169173 2016-02-03 16:02:19,062 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=285200000/285190706/0 in:318659=285200000/895 [rec/s] out:318648=285190706/895 [rec/s] 2016-02-03 16:02:19,281 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=285300000/285293106/0 in:318770=285300000/895 [rec/s] out:318763=285293106/895 [rec/s] 2016-02-03 16:02:19,488 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=285400000/285390070/0 in:318882=285400000/895 [rec/s] 
out:318871=285390083/895 [rec/s] 2016-02-03 16:02:19,706 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=285500000/285491604/0 in:318638=285500000/896 [rec/s] out:318629=285491604/896 [rec/s] 2016-02-03 16:02:19,925 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=285600000/285594187/0 in:318750=285600000/896 [rec/s] out:318743=285594200/896 [rec/s] 2016-02-03 16:02:20,139 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=285700000/285690733/0 in:318861=285700000/896 [rec/s] out:318851=285690733/896 [rec/s] 2016-02-03 16:02:20,382 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=285800000/285793133/0 in:318973=285800000/896 [rec/s] out:318965=285793133/896 [rec/s] 2016-02-03 16:02:20,621 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=285900000/285894903/0 in:318729=285900000/897 [rec/s] out:318723=285894903/897 [rec/s] 2016-02-03 16:02:20,829 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=286000000/285991947/0 in:318840=286000000/897 [rec/s] out:318831=285991947/897 [rec/s] 2016-02-03 16:02:21,045 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=286100000/286094086/0 in:318952=286100000/897 [rec/s] out:318945=286094100/897 [rec/s] 2016-02-03 16:02:21,266 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=286200000/286189815/0 in:319063=286200000/897 [rec/s] out:319052=286189815/897 [rec/s] 2016-02-03 16:02:21,523 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=286300000/286292272/0 in:319175=286300000/897 [rec/s] out:319166=286292278/897 [rec/s] 2016-02-03 16:02:21,780 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=286400000/286394615/0 in:318930=286400000/898 [rec/s] out:318924=286394615/898 [rec/s] 2016-02-03 16:02:22,014 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=286500000/286490713/0 in:319042=286500000/898 [rec/s] out:319031=286490713/898 [rec/s] 2016-02-03 16:02:22,234 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=286600000/286593743/0 in:319153=286600000/898 [rec/s] out:319146=286593743/898 [rec/s] 2016-02-03 16:02:22,439 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=286700000/286689784/0 in:319265=286700000/898 [rec/s] out:319253=286689796/898 [rec/s] 2016-02-03 16:02:22,658 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=286800000/286792242/0 in:319021=286800000/899 [rec/s] out:319012=286792242/899 [rec/s] 2016-02-03 16:02:22,876 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=286900000/286894327/0 in:319132=286900000/899 [rec/s] out:319126=286894327/899 [rec/s] 2016-02-03 16:02:23,083 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=287000000/286990425/0 in:319243=287000000/899 [rec/s] out:319232=286990425/899 [rec/s] 2016-02-03 16:02:23,301 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=287100000/287092511/0 in:319354=287100000/899 [rec/s] out:319346=287092524/899 [rec/s] 2016-02-03 16:02:23,510 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=287200000/287189705/0 in:319466=287200000/899 [rec/s] out:319454=287189718/899 [rec/s] 2016-02-03 16:02:23,724 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=287300000/287291698/0 in:319222=287300000/900 [rec/s] out:319213=287291711/900 [rec/s] 2016-02-03 16:02:23,942 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=287400000/287394039/0 in:319333=287400000/900 [rec/s] out:319326=287394039/900 [rec/s] 2016-02-03 16:02:24,144 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=287500000/287490452/0 in:319444=287500000/900 [rec/s] 
out:319433=287490452/900 [rec/s] 2016-02-03 16:02:24,356 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=287600000/287592271/0 in:319555=287600000/900 [rec/s] out:319546=287592285/900 [rec/s] 2016-02-03 16:02:24,578 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=287700000/287695567/0 in:319666=287700000/900 [rec/s] out:319661=287695567/900 [rec/s] 2016-02-03 16:02:24,783 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=287800000/287791666/0 in:319422=287800000/901 [rec/s] out:319413=287791666/901 [rec/s] 2016-02-03 16:02:25,004 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=287900000/287894052/0 in:319533=287900000/901 [rec/s] out:319527=287894066/901 [rec/s] 2016-02-03 16:02:25,226 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=288000000/287990226/0 in:319644=288000000/901 [rec/s] out:319634=287990240/901 [rec/s] 2016-02-03 16:02:25,451 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=288100000/288092564/0 in:319755=288100000/901 [rec/s] out:319747=288092564/901 [rec/s] 2016-02-03 16:02:25,671 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=288200000/288194354/0 in:319512=288200000/902 [rec/s] out:319505=288194371/902 [rec/s] 2016-02-03 16:02:25,887 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=288300000/288291063/0 in:319623=288300000/902 [rec/s] out:319613=288291063/902 [rec/s] 2016-02-03 16:02:26,108 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=288400000/288393778/0 in:319733=288400000/902 [rec/s] out:319727=288393778/902 [rec/s] 2016-02-03 16:02:26,313 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=288500000/288489876/0 in:319844=288500000/902 [rec/s] out:319833=288489876/902 [rec/s] 2016-02-03 16:02:26,532 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=288600000/288592276/0 in:319955=288600000/902 [rec/s] out:319947=288592276/902 [rec/s] 2016-02-03 16:02:26,753 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=288700000/288694676/0 in:319712=288700000/903 [rec/s] out:319706=288694676/903 [rec/s] 2016-02-03 16:02:26,979 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=288800000/288790775/0 in:319822=288800000/903 [rec/s] out:319812=288790775/903 [rec/s] 2016-02-03 16:02:27,303 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=288900000/288893740/0 in:319933=288900000/903 [rec/s] out:319926=288893777/903 [rec/s] 2016-02-03 16:02:27,550 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=289000000/288990595/0 in:320044=289000000/903 [rec/s] out:320033=288990603/903 [rec/s] 2016-02-03 16:02:27,803 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=289100000/289091673/0 in:319800=289100000/904 [rec/s] out:319791=289091673/904 [rec/s] 2016-02-03 16:02:28,061 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=289200000/289194165/0 in:319911=289200000/904 [rec/s] out:319905=289194171/904 [rec/s] 2016-02-03 16:02:28,302 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=289300000/289290487/0 in:320022=289300000/904 [rec/s] out:320011=289290487/904 [rec/s] 2016-02-03 16:02:28,553 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=289400000/289392572/0 in:320132=289400000/904 [rec/s] out:320124=289392572/904 [rec/s] 2016-02-03 16:02:28,809 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=289500000/289494657/0 in:319889=289500000/905 [rec/s] out:319883=289494657/905 [rec/s] 2016-02-03 16:02:29,044 INFO [Thread-13] org.apache.hadoop.streaming.PipeMapRed: Records R/W=289594400/289588236 2016-02-03 16:02:29,051 INFO 
[main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=289600000/289591070/0 in:320000=289600000/905 [rec/s] out:319990=289591070/905 [rec/s] 2016-02-03 16:02:29,308 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=289700000/289693470/0 in:320110=289700000/905 [rec/s] out:320103=289693470/905 [rec/s] 2016-02-03 16:02:29,550 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=289800000/289789569/0 in:320220=289800000/905 [rec/s] out:320209=289789569/905 [rec/s] 2016-02-03 16:02:29,806 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=289900000/289891471/0 in:319977=289900000/906 [rec/s] out:319968=289891478/906 [rec/s] 2016-02-03 16:02:30,063 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290000000/289993739/0 in:320088=290000000/906 [rec/s] out:320081=289993739/906 [rec/s] 2016-02-03 16:02:30,304 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290100000/290089522/0 in:320198=290100000/906 [rec/s] out:320187=290089522/906 [rec/s] 2016-02-03 16:02:30,561 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290200000/290191914/0 in:320309=290200000/906 [rec/s] out:320300=290191922/906 [rec/s] 2016-02-03 16:02:30,819 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290300000/290294007/0 in:320066=290300000/907 [rec/s] out:320059=290294007/907 [rec/s] 2016-02-03 16:02:31,059 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290400000/290389790/0 in:320176=290400000/907 [rec/s] out:320165=290389790/907 [rec/s] 2016-02-03 16:02:31,316 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290500000/290492190/0 in:320286=290500000/907 [rec/s] out:320278=290492190/907 [rec/s] 2016-02-03 16:02:31,572 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290600000/290594275/0 in:320396=290600000/907 [rec/s] out:320390=290594275/907 [rec/s] 2016-02-03 16:02:31,813 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290700000/290690059/0 in:320154=290700000/908 [rec/s] out:320143=290690059/908 [rec/s] 2016-02-03 16:02:32,069 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290800000/290791856/0 in:320264=290800000/908 [rec/s] out:320255=290791863/908 [rec/s] 2016-02-03 16:02:32,326 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=290900000/290894173/0 in:320374=290900000/908 [rec/s] out:320368=290894181/908 [rec/s] 2016-02-03 16:02:32,332 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: Spilling map output 2016-02-03 16:02:32,332 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: bufstart = 81226068; bufend = 177492576; bufvoid = 268435456 2016-02-03 16:02:32,332 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: kvstart = 20306512(81226048); kvend = 57794916(231179664); length = 29620461/16777216 2016-02-03 16:02:32,332 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: (EQUATOR) 204336112 kvi 51084024(204336096) 2016-02-03 16:02:32,539 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291000000/290990287/0 in:320484=291000000/908 [rec/s] out:320473=290990288/908 [rec/s] 2016-02-03 16:02:32,758 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291100000/291092483/0 in:320242=291100000/909 [rec/s] out:320233=291092484/909 [rec/s] 2016-02-03 16:02:32,976 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291200000/291194347/0 in:320352=291200000/909 [rec/s] out:320345=291194349/909 [rec/s] 2016-02-03 16:02:33,182 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291300000/291290726/0 in:320462=291300000/909 [rec/s] out:320451=291290728/909 [rec/s] 2016-02-03 16:02:33,399 INFO 
[main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291400000/291392485/0 in:320572=291400000/909 [rec/s] out:320563=291392488/909 [rec/s] 2016-02-03 16:02:33,619 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291500000/291492639/0 in:320329=291500000/910 [rec/s] out:320321=291492641/910 [rec/s] 2016-02-03 16:02:33,824 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291600000/291588953/0 in:320439=291600000/910 [rec/s] out:320427=291588954/910 [rec/s] 2016-02-03 16:02:34,042 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291700000/291692874/0 in:320549=291700000/910 [rec/s] out:320541=291692879/910 [rec/s] 2016-02-03 16:02:34,260 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291800000/291794877/0 in:320659=291800000/910 [rec/s] out:320653=291794879/910 [rec/s] 2016-02-03 16:02:34,530 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=291900000/291881089/0 in:320769=291900000/910 [rec/s] out:320748=291881093/910 [rec/s] 2016-02-03 16:02:34,869 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292000000/291983351/0 in:320526=292000000/911 [rec/s] out:320508=291983353/911 [rec/s] 2016-02-03 16:02:35,202 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292100000/292084554/0 in:320636=292100000/911 [rec/s] out:320619=292084555/911 [rec/s] 2016-02-03 16:02:35,520 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292200000/292180764/0 in:320746=292200000/911 [rec/s] out:320725=292180766/911 [rec/s] 2016-02-03 16:02:35,865 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292300000/292287374/0 in:320504=292300000/912 [rec/s] out:320490=292287375/912 [rec/s] 2016-02-03 16:02:36,186 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292400000/292387879/0 in:320614=292400000/912 [rec/s] out:320600=292387880/912 [rec/s] 2016-02-03 16:02:36,513 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292500000/292483039/0 in:320723=292500000/912 [rec/s] out:320705=292483041/912 [rec/s] 2016-02-03 16:02:43,788 INFO [SpillThread] org.apache.hadoop.mapred.MapTask: Finished spill 38 2016-02-03 16:02:43,788 INFO [Thread-13] org.apache.hadoop.mapred.MapTask: (RESET) equator 204336112 kv 51084024(204336096) kvi 44373148(177492592) 2016-02-03 16:02:43,788 INFO [Thread-13] org.apache.hadoop.streaming.PipeMapRed: Records R/W=292589102/292574290 2016-02-03 16:02:43,810 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292600000/292593753/0 in:318043=292600000/920 [rec/s] out:318036=292593753/920 [rec/s] 2016-02-03 16:02:44,028 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292700000/292695523/0 in:318152=292700000/920 [rec/s] out:318147=292695523/920 [rec/s] 2016-02-03 16:02:44,232 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292800000/292791307/0 in:318260=292800000/920 [rec/s] out:318251=292791307/920 [rec/s] 2016-02-03 16:02:44,451 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=292900000/292893391/0 in:318369=292900000/920 [rec/s] out:318362=292893391/920 [rec/s] 2016-02-03 16:02:44,669 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293000000/292995476/0 in:318132=293000000/921 [rec/s] out:318127=292995476/921 [rec/s] 2016-02-03 16:02:44,875 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293100000/293091890/0 in:318241=293100000/921 [rec/s] out:318232=293091890/921 [rec/s] 2016-02-03 16:02:45,092 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293200000/293193030/0 in:318349=293200000/921 [rec/s] out:318342=293193030/921 [rec/s] 2016-02-03 
16:02:45,311 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293300000/293295115/0 in:318458=293300000/921 [rec/s] out:318452=293295115/921 [rec/s] 2016-02-03 16:02:45,519 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293400000/293391454/0 in:318566=293400000/921 [rec/s] out:318557=293391467/921 [rec/s] 2016-02-03 16:02:45,738 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293500000/293493613/0 in:318329=293500000/922 [rec/s] out:318322=293493613/922 [rec/s] 2016-02-03 16:02:45,956 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293600000/293595383/0 in:318438=293600000/922 [rec/s] out:318433=293595383/922 [rec/s] 2016-02-03 16:02:46,163 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293700000/293691746/0 in:318546=293700000/922 [rec/s] out:318537=293691759/922 [rec/s] 2016-02-03 16:02:46,382 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293800000/293793566/0 in:318655=293800000/922 [rec/s] out:318648=293793566/922 [rec/s] 2016-02-03 16:02:46,587 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=293900000/293889778/0 in:318763=293900000/922 [rec/s] out:318752=293889791/922 [rec/s] 2016-02-03 16:02:46,805 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=294000000/293991750/0 in:318526=294000000/923 [rec/s] out:318517=293991750/923 [rec/s] 2016-02-03 16:02:47,022 INFO [main] org.apache.hadoop.streaming.PipeMapRed: R/W/S=294100000/294093204/0 in:318634=294100000/923 [rec/s] out:318627=294093204/923 [rec/s] 2016-02-03 16:02:47,132 INFO [Thread-14] org.apache.hadoop.streaming.PipeMapRed: MRErrorThread done 2016-02-03 16:02:47,132 INFO [main] org.apache.hadoop.streaming.PipeMapRed: mapRedFinished 2016-02-03 16:02:47,133 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output 2016-02-03 16:02:47,133 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling map output 2016-02-03 16:02:47,133 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 204336112; bufend = 246541860; bufvoid = 268435456 2016-02-03 16:02:47,133 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 51084024(204336096); kvend = 38097644(152390576); length = 12986381/16777216 2016-02-03 16:02:50,457 INFO [main] org.apache.hadoop.mapred.MapTask: Finished spill 39 2016-02-03 16:02:50,474 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:02:50,479 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,480 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,481 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,482 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,483 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,484 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,485 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,486 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,487 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,487 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,488 INFO [main] 
org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,489 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,490 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,491 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,492 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,493 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,493 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,494 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,495 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,496 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,497 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,498 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,499 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,500 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,501 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,501 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,502 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,503 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,504 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,505 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,506 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,507 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,508 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,509 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,509 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,510 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,511 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,512 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,513 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,514 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.snappy] 2016-02-03 16:02:50,515 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11576265 bytes 2016-02-03 
16:02:55,997 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:02:56,017 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11524779 bytes 2016-02-03 16:03:01,378 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:01,398 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11597690 bytes 2016-02-03 16:03:06,753 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:06,772 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11584220 bytes 2016-02-03 16:03:12,195 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:12,214 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11554950 bytes 2016-02-03 16:03:17,596 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:17,615 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11527865 bytes 2016-02-03 16:03:22,982 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:23,001 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11606423 bytes 2016-02-03 16:03:28,343 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:28,366 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11530533 bytes 2016-02-03 16:03:33,597 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:33,616 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11546832 bytes 2016-02-03 16:03:39,020 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:39,043 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11504278 bytes 2016-02-03 16:03:44,336 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:44,354 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11560124 bytes 2016-02-03 16:03:49,646 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:49,664 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11507371 bytes 2016-02-03 16:03:54,891 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:03:54,910 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11553016 bytes 2016-02-03 16:04:00,273 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:04:00,291 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11584433 bytes 2016-02-03 16:04:05,615 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:04:05,638 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11563511 bytes 2016-02-03 16:04:11,095 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:04:11,113 
INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11543123 bytes 2016-02-03 16:04:16,364 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:04:16,382 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11561494 bytes 2016-02-03 16:04:21,754 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:04:21,772 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11533300 bytes 2016-02-03 16:04:27,097 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:04:27,118 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11546776 bytes 2016-02-03 16:04:32,408 INFO [main] org.apache.hadoop.mapred.Merger: Merging 40 sorted segments 2016-02-03 16:04:32,425 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 40 segments left of total size: 11524621 bytes 2016-02-03 16:04:37,744 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1450233158783_28583_m_000000_0 is done. And is in the process of committing 2016-02-03 16:04:37,818 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1450233158783_28583_m_000000_0' done. 2016-02-03 16:04:37,918 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system... 2016-02-03 16:04:37,919 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped. 2016-02-03 16:04:37,919 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete.





The reducer's first 33% is the copy phase, the next 33% is the shuffle/sort phase, and the last 33% is the actual reduce work.



When a map task finishes, its output is copied to where the reduce work happens. Because map and reduce do not necessarily run on the same machine, finished map output is copied over to the reduce side even before the whole map phase has completed. While some map tasks are already done, a particular map task can take much longer, and that is exactly the cause here: if one map task runs long, the job appears blocked.


The shuffle can only complete once all map tasks have finished, and the reduce phase cannot sort until it has received all of the map output.
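If spilling itself is the bottleneck, the usual knobs are the map-side sort buffer size and the merge factor; a hedged mapred-site.xml sketch for hadoop2 (the values are only illustrative, not a recommendation):

<property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>512</value>
</property>
<property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>64</value>
</property>

A larger sort buffer means fewer spill files, and a larger factor merges more of them per pass.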



To explain with diagrams (source: http://ercoppa.github.io/HadoopInternals/AnatomyMapReduceJob.htm):

[Figure: the map phase and the reduce phase of a MapReduce job]

[Figure: the stages of SPILLING (same source)]
