Posts from 2016/04

Checking the result of setuid with ps (ruser, euser)

unix and linux 2016. 4. 28. 14:06

Based on the source code at: http://theurbanpenguin.com/wp/index.php/using-a-simple-c-program-to-explain-the-suid-permission/

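$ cat > test.c

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void) {
    uid_t real = getuid();   /* real UID: the user who started the process */
    uid_t euid = geteuid();  /* effective UID: the identity used for permission checks */

    printf("The REAL UID =: %d\n", (int) real);
    printf("The EFFECTIVE UID =: %d\n", (int) euid);

    sleep(100);  /* keep the process alive so it can be inspected with ps */
    return 0;
}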

$ cc test.c

$ ./a.out

The REAL UID =: 1000

The EFFECTIVE UID =: 1000

$ sudo chown root a.out

$ sudo chmod 4755 a.out

Result

$ ./a.out

The REAL UID =: 1000

The EFFECTIVE UID =: 0

(waiting while the process sleeps)

In another terminal, check the process with the ps command.

$ ps -eo pid,euser,ruser,comm | grep a.out
PID EUSER RUSER COMMAND

4481 root deploy a.out

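On Linux, ps can show both the effective user ID (euid) and the real user ID (ruid) of a process.

Normally the euid and ruid of a process are the same. For a program started through setuid, the ps command above shows that the two differ.

Because a.out is owned by root and runs with the setuid bit set, euser (root) and ruser (deploy) have different values.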
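The leading 4 in `chmod 4755` is the setuid bit. As a minimal sketch (in Python rather than the post's C, and using only the standard `os` and `stat` modules), the following shows how that bit appears in a file mode and how a process can read its own real and effective UIDs. The helper name `is_setuid` is made up for this illustration. Linux ignores the setuid bit on interpreted scripts, so for a Python process the two UIDs will normally match.

```python
import os
import stat


def is_setuid(mode: int) -> bool:
    """Return True if the setuid bit (the leading 4 in 4755) is set in mode."""
    return bool(mode & stat.S_ISUID)


# 0o4755 is the mode applied by `chmod 4755 a.out` in the post.
setuid_mode = stat.S_IFREG | 0o4755
plain_mode = stat.S_IFREG | 0o755

# filemode() renders the mode the way `ls -l` does; setuid shows as "s".
print(stat.filemode(setuid_mode), is_setuid(setuid_mode))
print(stat.filemode(plain_mode), is_setuid(plain_mode))

# Real and effective UID of this Python process (Unix only).
print("real uid:", os.getuid(), "effective uid:", os.geteuid())
```

The "s" in the owner-execute position of the first line is the same marker `ls -l a.out` shows after the chmod in the post.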


Posted by '김용환'


[guava] Concepts: symmetric difference/difference, relative (absolute) complement

general java 2016. 4. 28. 13:13

This post shows code for some simple set operations.

https://en.wikipedia.org/wiki/Symmetric_difference

Venn diagram of $A \triangle B$ (image omitted)

https://en.wikipedia.org/wiki/Complement_(set_theory)

The relative complement of A (left circle) in B (right circle): $B \cap A^c = B \setminus A$

The absolute complement of A in U: $A^c = U \setminus A$

Below is a test example comparing Guava's Sets.difference and Sets.symmetricDifference with the JDK's Set.removeAll.

import com.google.common.collect.Sets;
import java.util.HashSet;
import java.util.Set;

Set<String> s1 = Sets.newHashSet("1", "2", "3");
Set<String> s2 = Sets.newHashSet("3", "4");

// Guava: a view of the elements of s1 that are not in s2
System.out.println(Sets.difference(s1, s2)); // [1, 2]

// JDK: removeAll mutates the set, so work on a copy of s1
Set<String> set1 = new HashSet<>(s1);
set1.removeAll(s2);
System.out.println(set1); // [1, 2]

// Guava: the elements that are in exactly one of the two sets
System.out.println(Sets.symmetricDifference(s1, s2)); // [1, 2, 4]
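To tie the Guava calls back to the set-theory definitions quoted above, here is a small sketch using Python's built-in sets (not Guava). It reuses the same two sets and checks that the symmetric difference equals the union of the two relative complements. The variable names are made up for this illustration.

```python
s1 = {"1", "2", "3"}  # plays the role of A
s2 = {"3", "4"}       # plays the role of B

difference = s1 - s2                     # A \ B, like Sets.difference(s1, s2)
relative_complement_in_s2 = s2 - s1      # B \ A, the relative complement of A in B
symmetric_difference = s1 ^ s2           # like Sets.symmetricDifference(s1, s2)

# The symmetric difference is the union of both relative complements.
union_of_complements = difference | relative_complement_in_s2

print(sorted(difference))
print(sorted(relative_complement_in_s2))
print(sorted(symmetric_difference))
print(symmetric_difference == union_of_complements)
```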


[hadoop] The getmerge command

hadoop 2016. 4. 21. 20:27

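I had assumed that hadoop's getmerge command stored its result in the Hadoop file system, but it turns out that it writes to the local file system.

The getmerge command collects the files split into part* and saves them as a single file on the local file system.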

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html#getmerge

getmerge

Usage: hadoop fs -getmerge [-nl] <src> <localdst>

Takes a source directory and a destination file as input and concatenates files in src into the destination local file. Optionally -nl can be set to enable adding a newline character (LF) at the end of each file.

Examples:

hadoop fs -getmerge -nl /src /opt/output.txt
hadoop fs -getmerge -nl /src/file1.txt /src/file2.txt /output.txt

Exit Code:

Returns 0 on success and non-zero on error.

Example:

$ hdfs dfs -ls /tmp/test/20160418

Found 6 items

/tmp/test/_SUCCESS

/tmp/test/done

/tmp/test/part-m-00000

/tmp/test/part-m-00001

/tmp/test/part-m-00002

/tmp/test/part-m-00003

$ hdfs dfs -getmerge hdfs://google-hadoop<machine-name>/tmp/test/20160418 /tmp/1

Confirm that the result was saved locally as a single file.

$ ls /tmp/1

/tmp/1
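As a rough local analogy for what getmerge does (this is not Hadoop code and does not touch HDFS), the sketch below concatenates every file in a directory into one destination file, with an option that mimics -nl. The function name `merge_parts`, the name-sorted ordering, and the sample file contents are assumptions made for this illustration.

```python
import tempfile
from pathlib import Path


def merge_parts(src_dir: Path, local_dst: Path, add_newline: bool = False) -> None:
    """Concatenate every file in src_dir, in name order, into local_dst."""
    with local_dst.open("wb") as out:
        for part in sorted(p for p in src_dir.iterdir() if p.is_file()):
            out.write(part.read_bytes())
            if add_newline:  # loosely mimics the -nl option
                out.write(b"\n")


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "20160418"
    src.mkdir()
    (src / "_SUCCESS").write_bytes(b"")        # empty marker file
    (src / "part-m-00000").write_bytes(b"a\n")
    (src / "part-m-00001").write_bytes(b"b\n")

    dst = Path(tmp) / "merged"
    merge_parts(src, dst)
    merged = dst.read_bytes()

print(merged)
```

The point of the sketch is only that the destination is an ordinary local file, which is what the `ls /tmp/1` check above confirms.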


[hive] The count and distinct issue

hadoop 2016. 4. 20. 08:36

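When COUNT and DISTINCT are used together in Hive, Hive ignores the configured number of reducers (for example, mapred.reduce.tasks = 20) and uses only a single reducer.

This becomes a bottleneck when the data is large.

The workaround is to use a subquery.

Example:

* When COUNT and DISTINCT are used together, the query runs on a single reducer.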

SELECT count(distinct age) FROM member;

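* With a subquery, the reduce work runs with the number of reducers set in mapred.reduce.tasks. DISTINCT is processed by one or more reducers, and COUNT then runs on a relatively small amount of data, so the reducer bottleneck disappears.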

SELECT count(*) FROM (SELECT distinct age FROM member) a;
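The idea behind the subquery form can be modeled in plain Python. This is only a sketch of the reasoning, not Hive's actual execution plan: values are partitioned by hash so that several "reducers" each deduplicate their own share, and a final step adds up the small per-reducer counts. The function names and the sample `ages` list are made up for this illustration.

```python
from collections import defaultdict


def count_distinct_single_reducer(values):
    """One reducer sees every value and deduplicates them alone."""
    return len(set(values))


def count_distinct_two_stage(values, num_reducers=4):
    """Stage 1: partition by hash so each reducer deduplicates its share.
    Stage 2: add up the per-reducer distinct counts."""
    partitions = defaultdict(set)
    for v in values:
        # Equal values always land in the same partition,
        # so no value is counted twice across reducers.
        partitions[hash(v) % num_reducers].add(v)
    return sum(len(p) for p in partitions.values())


ages = [20, 31, 20, 45, 31, 31, 52, 20]
single = count_distinct_single_reducer(ages)
two_stage = count_distinct_two_stage(ages)
print(single, two_stage)
```

Both approaches give the same answer; the difference is that the deduplication work in the second one is spread over several workers.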


[hive] Ordering by count after group by

hadoop 2016. 4. 19. 15:23

This is a tip for using order by after a group by that computes a count in Hive.

When grouping by a field and then listing the groups in descending order of count, give count(*) an alias and use that alias as the field after order by.

select timezone, count(*) as count from request where date=20160401
group by timezone order by count desc limit 30
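For readers who want to see the same "group, count, sort descending, limit" shape outside Hive, here is a small Python sketch using `collections.Counter`. The sample `requests` rows are invented for this illustration and are not from the post's table.

```python
from collections import Counter

# Hypothetical rows standing in for the request table.
requests = [
    {"timezone": "Asia/Seoul"},
    {"timezone": "UTC"},
    {"timezone": "Asia/Seoul"},
    {"timezone": "Asia/Tokyo"},
    {"timezone": "UTC"},
    {"timezone": "Asia/Seoul"},
]

# group by timezone + count(*)
counts = Counter(row["timezone"] for row in requests)

# order by count desc limit 2
top = counts.most_common(2)
print(top)
```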


[hive] ALTER TABLE examples

hadoop 2016. 4. 19. 15:21

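Hive's ALTER TABLE statement has many capabilities.

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable

Beyond simply modifying a column, it can reorder columns, add columns, replace the entire column list, and rename tables and columns.

Use the RENAME TO clause to rename a table.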

hive> CREATE TABLE users (`id` int, `location` string);

OK

hive> desc users;

OK

col_name data_type comment

id int

location string

hive> ALTER TABLE users RENAME TO new_users;

OK

hive> desc new_users;

OK

col_name data_type comment

id int

location string

Time taken: 0.036 seconds, Fetched: 2 row(s)

Table properties can be changed.

hive> ALTER TABLE users SET TBLPROPERTIES ('comment' = 'new user heros');

OK

A column can be added.

hive> ALTER TABLE users ADD COLUMNS (`work` string COMMENT 'comment');

OK

Time taken: 0.108 seconds

hive> desc users;

OK

col_name data_type comment

id int

location string

work string comment

Time taken: 0.104 seconds, Fetched: 3 row(s)

Multiple columns can also be added in a single statement.

hive> ALTER TABLE users ADD COLUMNS (`has_talk` string COMMENT 'checking whether user has talk', `json` string);

OK

Time taken: 0.041 seconds

hive> desc users;

OK

col_name data_type comment

id int

location string

work string comment

has_talk string checking whether user has talk

json string

Time taken: 0.036 seconds, Fetched: 5 row(s)

The CHANGE clause renames a column and moves it to a new position (it can also change the column type).

hive> ALTER TABLE users CHANGE work work_name string AFTER json;

OK

Time taken: 0.047 seconds

hive> desc users;

OK

col_name data_type comment

id int

location string

has_talk string checking whether user has talk

json string

work_name string comment

Time taken: 0.039 seconds, Fetched: 5 row(s)

Using a DROP or DROP COLUMN clause to delete a column produces an error.

hive> ALTER TABLE users DROP COLUMN json;

error

Instead of DROP, use the REPLACE clause to rewrite the table schema.

hive> ALTER TABLE users REPLACE COLUMNS (`id` int, `location` string);

OK

Time taken: 0.071 seconds

hive> desc users;

OK

col_name data_type comment

id int

location string

Time taken: 0.05 seconds, Fetched: 2 row(s)
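Since dropping a column means restating every column that should remain, the statement can be generated from the current column list. The helper below is a hypothetical sketch (the name `replace_columns_sql` and its parameters are made up here); it only builds the SQL text and does not talk to Hive. It also ignores column comments, which would need to be restated in a fuller version.

```python
def replace_columns_sql(table, columns, drop):
    """Build an ALTER TABLE ... REPLACE COLUMNS statement that keeps
    every (name, type) pair in columns whose name is not in drop."""
    kept = [(name, col_type) for name, col_type in columns if name not in drop]
    # Field names are quoted with backticks, as in the post's examples.
    column_list = ", ".join(f"`{name}` {col_type}" for name, col_type in kept)
    return f"ALTER TABLE {table} REPLACE COLUMNS ({column_list});"


# Column list as shown by the last `desc users` before the REPLACE step.
current_columns = [
    ("id", "int"),
    ("location", "string"),
    ("has_talk", "string"),
    ("json", "string"),
    ("work_name", "string"),
]

statement = replace_columns_sql(
    "users", current_columns, drop={"has_talk", "json", "work_name"}
)
print(statement)
```

With these inputs the generated text matches the REPLACE COLUMNS statement used in the post.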


[hive] Things to watch out for with alter table

hadoop 2016. 4. 19. 11:20

Field names and comment strings are quoted with different characters.

Field names are quoted with backticks (`), and comments with single quotes (').

hive> ALTER TABLE users ADD COLUMNS (`json` string, `timezone` string COMMENT `timezone of user`);

mismatched input 'timezone of user' expecting StringLiteral near 'COMMENT' in column specification

hive> ALTER TABLE users ADD COLUMNS (`json` string, `timezone` string COMMENT 'timezone of user');

OK

