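<Notes on testing multiple versions>
Hadoop uses /tmp on the installation server as the directory for the DFS file system. On Windows, this follows the drive where cygwin is installed.
For example, if cygwin is installed on d:\, the directory is d:\tmp.
A given version may not recognize a directory created by another version, so if the namenode fails to start, delete the DFS data under /tmp completely and start over.
Run bin/hadoop namenode -format to create a fresh file system, then run bin/hadoop namenode.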
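The reset described above can be sketched as a small shell function. This is only a sketch under assumptions: it assumes hadoop.tmp.dir is left at its default (/tmp/hadoop-<user>, which matches the /tmp/hadoop-nhn path in the log further down), so it removes only that directory rather than all of /tmp. The names reset_dfs and HADOOP_CMD are made up for this example.

```shell
#!/bin/sh
# Sketch: wipe the local Hadoop data directory and reformat the namenode.
# Assumption: hadoop.tmp.dir is at its default, /tmp/hadoop-<user>.
# HADOOP_CMD is a hypothetical override; run from the Hadoop install directory.
HADOOP_CMD="${HADOOP_CMD:-bin/hadoop}"

reset_dfs() {
    data_dir="${1:-/tmp/hadoop-$(whoami)}"
    # Refuse anything that does not look like a Hadoop data directory.
    case "$data_dir" in
        */hadoop-*) ;;
        *) echo "refusing to delete $data_dir" >&2; return 1 ;;
    esac
    rm -rf "$data_dir"
    # Create a fresh, empty file system image.
    "$HADOOP_CMD" namenode -format
}
```

Calling reset_dfs with no argument targets the current user's directory. Stop all Hadoop daemons before running it.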
* Testing version 0.20.2
Works fine.
* Testing version 0.20.203.0
A problem occurs when running bin/hadoop tasktracker.
11/08/22 19:01:37 ERROR mapred.TaskTracker: Can not start task tracker because java.io.IOException: Failed to set permissions of path: /tmp/hadoop-nhn/mapred/local/ttprivate to 0700
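I did not investigate this further and skipped this version.
* Testing version 0.21.0
A new classpath policy seems to have been introduced. A warning is printed, but Hadoop can still be used as before.
<Installation reference sites>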
http://hadoop.apache.org/common/docs/stable/single_node_setup.html
http://cardia.tistory.com/entry/Hadoop-0202-%EC%84%A4%EC%B9%98-%EB%B0%8F-%ED%99%98%EA%B2%BD%EC%84%A4%EC%A0%95
http://v-lad.org/Tutorials/Hadoop/00%20-%20Intro.html
http://developer.yahoo.com/hadoop/tutorial/module3.html
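<Installation steps>
1. Install cygwin and set up OpenSSH. (I use cygwin a lot, so the details are omitted.)
2. Download hadoop from within cygwin (http://www.apache.org/dyn/closer.cgi/hadoop/common/)
3. Download hadoop 0.21.0 or 0.20.2 and install it under "cygwin install directory/home/account name/"
4. Open a cygwin shell
5. Configure ssh in cygwin
(Answer the yes / no prompts along the way carefully)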
$ ssh-host-config
*** Info: Generating /etc/ssh_host_key
*** Info: Generating /etc/ssh_host_rsa_key
*** Info: Generating /etc/ssh_host_dsa_key
*** Info: Generating /etc/ssh_host_ecdsa_key
*** Info: Creating default /etc/ssh_config file
*** Info: Creating default /etc/sshd_config file
*** Info: Privilege separation is set to yes by default since OpenSSH 3.3.
*** Info: However, this requires a non-privileged account called 'sshd'.
*** Info: For more info on privilege separation read /usr/share/doc/openssh/README.privsep.
*** Query: Should privilege separation be used? (yes/no) no
*** Info: Updating /etc/sshd_config file
*** Warning: The following functions require administrator privileges!
*** Query: Do you want to install sshd as a service?
*** Query: (Say "no" if it is already installed as a service) (yes/no) yes
*** Query: Enter the value of CYGWIN for the daemon: [] yes
*** Info: On Windows Server 2003, Windows Vista, and above, the
*** Info: SYSTEM account cannot setuid to other users -- a capability
*** Info: sshd requires. You need to have or to create a privileged
*** Info: account. This script will help you do so.
*** Info: You appear to be running Windows XP 64bit, Windows 2003 Server,
*** Info: or later. On these systems, it's not possible to use the LocalSystem
*** Info: account for services that can change the user id without an
*** Info: explicit password (such as passwordless logins [e.g. public key
*** Info: authentication] via sshd).
*** Info: If you want to enable that functionality, it's required to create
*** Info: a new account with special privileges (unless a similar account
*** Info: already exists). This account is then used to run these special
*** Info: servers.
*** Info: Note that creating a new user requires that the current account
*** Info: have Administrator privileges itself.
*** Info: No privileged account could be found.
*** Info: This script plans to use 'cyg_server'.
*** Info: 'cyg_server' will only be used by registered services.
*** Query: Do you want to use a different name? (yes/no) yes
*** Query: Enter the new user name: ntsec
*** Query: Reenter: ntsec
*** Query: Create new privileged user account 'ntsec'? (yes/no) no
*** ERROR: There was a serious problem creating a privileged user.
*** Query: Do you want to proceed anyway? (yes/no) yes
*** Warning: Expected privileged user 'ntsec' does not exist.
*** Warning: Defaulting to 'SYSTEM'
*** Info: The sshd service has been installed under the LocalSystem
*** Info: account (also known as SYSTEM). To start the service now, call
*** Info: `net start sshd' or `cygrunsrv -S sshd'. Otherwise, it
*** Info: will start automatically after the next reboot.
*** Info: Host configuration finished. Have fun!
Start the daemon, generate a key, and test the connection
$ net start sshd
The CYGWIN sshd service is starting.
The CYGWIN sshd service was started successfully.
$ ssh-keygen
$ cd ~/.ssh
$ cat id_rsa.pub >> authorized_keys
$ ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is 71:6d:7b:51:aa:8a:2f:c1:c2:30:44:d1:b7:e0:f8:0e.
Are you sure you want to continue connecting (yes/no)?
$ ssh localhost
nhn@localhost's password:
$ logout
Connection to localhost closed.
Check that sshd is running as a service (Control Panel - Administrative Tools - Services shows whether the CYGWIN sshd service is running)
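In the transcript above the second ssh localhost still asks for a password. One possible cause, which is an assumption and not something the transcript confirms, is a key created with a passphrase or an authorized_keys file with loose permissions (sshd checks file modes when StrictModes is on, which is its default). A minimal sketch of the key setup that avoids both; the function name setup_local_key is made up for this example.

```shell
#!/bin/sh
# Sketch: create a passphrase-less RSA key and authorize it for local logins.
setup_local_key() {
    ssh_dir="${1:-$HOME/.ssh}"
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    # -N '' sets an empty passphrase so that ssh does not prompt for one.
    [ -f "$ssh_dir/id_rsa" ] || ssh-keygen -q -t rsa -N '' -f "$ssh_dir/id_rsa"
    cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
    # Keep the file private to the owner.
    chmod 600 "$ssh_dir/authorized_keys"
}
```

After setup_local_key, ssh localhost should log in without a password if public key authentication is available. The ssh-host-config output above warns that sshd running under the SYSTEM account may not support it, so this may not be enough on every Windows version.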
6. Set the JDK location
In conf/hadoop-env.sh:
export JAVA_HOME="/cygdrive/c/Progra~1/Java/jdk1.6.0_24"
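A wrong JAVA_HOME usually shows up only when a daemon fails to start, so it can help to check the path first. A minimal sketch; check_java_home is a made-up helper, not part of Hadoop.

```shell
#!/bin/sh
# Sketch: verify that a JAVA_HOME candidate contains a java binary.
check_java_home() {
    java_home="$1"
    # On cygwin the binary is java.exe; accept either name.
    if [ -x "$java_home/bin/java" ] || [ -x "$java_home/bin/java.exe" ]; then
        echo "ok: $java_home"
    else
        echo "no java binary under $java_home/bin" >&2
        return 1
    fi
}
```

For the path used above, the call would be check_java_home "/cygdrive/c/Progra~1/Java/jdk1.6.0_24".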
7. Edit the Hadoop configuration files (under the conf directory)
conf/core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
conf/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
conf/mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
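When switching between versions, the same three files have to be recreated in each install. A sketch that writes them from the shell, using the property names and values shown above; the function names are made up, and the sketch overwrites any existing files of the same name.

```shell
#!/bin/sh
# Sketch: write the three single-node config files into a conf directory.
write_site_file() {
    # $1 = file path, $2 = property name, $3 = property value
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

write_site_configs() {
    conf_dir="${1:-conf}"
    write_site_file "$conf_dir/core-site.xml"   fs.default.name    hdfs://localhost:9000
    write_site_file "$conf_dir/hdfs-site.xml"   dfs.replication    1
    write_site_file "$conf_dir/mapred-site.xml" mapred.job.tracker localhost:9001
}
```

Run write_site_configs conf from the Hadoop install directory.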
8. Start the daemons
Open a separate cygwin window for each daemon and start them one at a time
$ bin/hadoop namenode
$ bin/hadoop secondarynamenode
$ bin/hadoop jobtracker
$ bin/hadoop datanode
$ bin/hadoop tasktracker
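As an alternative to five separate windows, the same five commands can be started in the background from one shell, with each daemon's output going to its own file. This is a sketch, not the method used in these notes; start_daemons, LOG_DIR and START_DELAY are made-up names, and the delay between daemons is a guess.

```shell
#!/bin/sh
# Sketch: start the five daemons in the background, one log file each.
# HADOOP_CMD, LOG_DIR and START_DELAY are hypothetical knobs for this example.
HADOOP_CMD="${HADOOP_CMD:-bin/hadoop}"
LOG_DIR="${LOG_DIR:-logs}"
START_DELAY="${START_DELAY:-2}"

start_daemons() {
    mkdir -p "$LOG_DIR"
    # Same order as the manual steps above.
    for daemon in namenode secondarynamenode jobtracker datanode tasktracker; do
        "$HADOOP_CMD" "$daemon" > "$LOG_DIR/$daemon.out" 2>&1 &
        # Remember the process id so the daemon can be stopped with kill later.
        echo "$!" > "$LOG_DIR/$daemon.pid"
        sleep "$START_DELAY"
    done
}
```

If a daemon fails to start, its error appears in the matching .out file under LOG_DIR instead of on the console.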
9. Run a test
$ bin/hadoop dfs -put conf input
$ bin/hadoop jar hadoop-mapred-examples-0.21.0.jar grep input output 'dfs[a-z]+'
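The grep job refuses to run if the output directory already exists in DFS, which gets in the way when repeating the test. A sketch that clears the old output, runs the job, and prints the result; run_grep_example and EXAMPLES_JAR are made-up names, and the jar name defaults to the 0.21.0 one used above, so it has to be changed for other versions.

```shell
#!/bin/sh
# Sketch: rerun the grep example and show its result.
# HADOOP_CMD and EXAMPLES_JAR are hypothetical overrides for this example.
HADOOP_CMD="${HADOOP_CMD:-bin/hadoop}"
EXAMPLES_JAR="${EXAMPLES_JAR:-hadoop-mapred-examples-0.21.0.jar}"

run_grep_example() {
    # Remove the previous output; ignore the error if it does not exist yet.
    "$HADOOP_CMD" dfs -rmr output 2>/dev/null
    "$HADOOP_CMD" jar "$EXAMPLES_JAR" grep input output 'dfs[a-z]+' || return 1
    # Print the matched strings with their counts.
    "$HADOOP_CMD" dfs -cat 'output/*'
}
```

The input directory must already have been uploaded with the dfs -put command above.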
Then open the admin ports to confirm that everything is working properly.
In version 0.21, the admin pages are at the following addresses.
http://localhost:50070/dfshealth.jsp
http://localhost:50030/jobtracker.jsp
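The two pages can also be checked from the cygwin shell. A minimal sketch, assuming the curl package is installed in cygwin; check_page is a made-up helper.

```shell
#!/bin/sh
# Sketch: report the HTTP status of an admin page; succeeds only on 200.
# Assumption: curl is installed.
check_page() {
    code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1")
    echo "$1 -> $code"
    [ "$code" = "200" ]
}
```

For example, check_page http://localhost:50070/dfshealth.jsp followed by check_page http://localhost:50030/jobtracker.jsp. A status other than 200 suggests that the namenode or the jobtracker is not up.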