Posts in the 'All categories' category (page 56)

4074 posts in 'All categories'

2018.03.14 [sourcetree] Fixing "remote: Invalid username or password. fatal: Authentication failed"
2018.03.14 [spark] Good references for finding anomalous data with Spark MLlib
2018.03.14 [spark] Examples of running Spark jobs in several modes
2018.03.12 [opentsdb] A notable point in the HBase UID schema: the last assigned UID is stored in row 0x00
2018.03.11 [zookeeper] Memory settings
2018.03.08 [Datagrip] Opening a query editor
2018.03.08 [Repost] A good explanation of Java 8 parallel streams
2018.02.23 [python] Running test code with tox
2018.02.22 [python] Things to watch out for when using join in sql_alchemy: understanding sql_alchemy
2018.02.21 [sql_alchemy] Example of joining three tables

[sourcetree] Fixing "remote: Invalid username or password. fatal: Authentication failed"

Tool 2018. 3. 14. 11:39

Sourcetree keeps failing with the following error.

remote: Invalid username or password. fatal: Authentication failed

A previously saved password was probably the cause.

After enabling the setting below, the error went away.

Sourcetree > Preferences > Git > Use System Git


Posted by '김용환'

[spark] Good references for finding anomalous data with Spark MLlib

scala 2018. 3. 14. 02:42

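In addition to K-means, bisecting K-means, and Gaussian mixture, Spark MLlib provides implementations of three more clustering algorithms: PIC (power iteration clustering), LDA (latent Dirichlet allocation), and streaming K-means.

One thing is clear: to fine-tune a clustering analysis, you often have to remove unwanted data objects, usually called outliers or anomalies.

Good material for learning how to find anomalous data with Spark MLlib: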

https://github.com/keiraqz/anomaly-detection

https://mapr.com/ebooks/spark/08-unsupervised-anomaly-detection-apache-spark.html
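As a rough illustration of the outlier idea mentioned above, one common approach is to cluster the data and then treat points that lie unusually far from their nearest cluster center as anomalies. The sketch below is plain Python, not Spark MLlib code, and it is not taken from the linked references. The centroids, the sample points, and the `threshold` value are all made-up assumptions.

```python
import math

def nearest_centroid_distance(point, centroids):
    """Return the Euclidean distance from point to its closest centroid."""
    return min(math.dist(point, c) for c in centroids)

def find_outliers(points, centroids, threshold):
    """Return the points whose nearest-centroid distance exceeds threshold."""
    return [p for p in points
            if nearest_centroid_distance(p, centroids) > threshold]

# Hypothetical cluster centers, e.g. as produced by a K-means run.
centroids = [(0.0, 0.0), (10.0, 10.0)]
points = [(0.5, 0.2), (9.8, 10.1), (5.0, 5.0), (0.1, -0.3)]

outliers = find_outliers(points, centroids, threshold=2.0)
print(outliers)
```

In practice the threshold would be derived from the data (for example from the distribution of distances) rather than fixed by hand.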


[spark] Examples of running Spark jobs in several modes

scala 2018. 3. 14. 02:25

Simple examples of running a Spark job are shown below.

# Run the application locally on 8 cores
$SPARK_HOME/bin/spark-submit \
  --class org.apache.spark.examples.Demo \
  --master local[8] \
  Demo-0.1-SNAPSHOT-jar-with-dependencies.jar

# Run on a YARN cluster
# (--deploy-mode is cluster here; it can also be client)
export HADOOP_CONF_DIR=XXX
$SPARK_HOME/bin/spark-submit \
  --class org.apache.spark.examples.Demo \
  --master yarn \
  --deploy-mode cluster \
  --executor-memory 20G \
  --num-executors 50 \
  Demo-0.1-SNAPSHOT-jar-with-dependencies.jar

# Run on a Mesos cluster in cluster deploy mode with the supervise flag
# (the master is given as an IP address)
$SPARK_HOME/bin/spark-submit \
  --class org.apache.spark.examples.Demo \
  --master mesos://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  Demo-0.1-SNAPSHOT-jar-with-dependencies.jar

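In standalone cluster deploy mode, --supervise means that the driver is restarted automatically when it terminates abnormally, that is, when it exits with a non-zero code.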
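When the same job has to be submitted in several modes, it can help to build the argument list in code instead of editing long shell lines. The following is a minimal sketch: the helper name `build_spark_submit` and its parameters are hypothetical, and it only assembles flags that already appear in the commands above. The resulting list could be passed to `subprocess.run`.

```python
import shlex

def build_spark_submit(app, master, main_class=None, deploy_mode=None,
                       supervise=False, options=None, app_args=()):
    """Build a spark-submit argument list; options go before the application."""
    cmd = ["spark-submit"]
    if main_class:
        cmd += ["--class", main_class]
    cmd += ["--master", master]
    if deploy_mode:
        cmd += ["--deploy-mode", deploy_mode]
    if supervise:
        cmd.append("--supervise")
    # Extra flag/value pairs such as --executor-memory or --num-executors.
    for flag, value in (options or {}).items():
        cmd += [flag, str(value)]
    cmd.append(app)
    cmd += [str(a) for a in app_args]
    return cmd

cmd = build_spark_submit(
    app="Demo-0.1-SNAPSHOT-jar-with-dependencies.jar",
    master="yarn",
    main_class="org.apache.spark.examples.Demo",
    deploy_mode="cluster",
    options={"--executor-memory": "20G", "--num-executors": 50},
)
print(shlex.join(cmd))
```

Keeping the command as a list avoids the line-continuation pitfalls of long shell commands, such as a comment placed after a trailing backslash.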

Examples from the official documentation:

https://spark.apache.org/docs/2.1.1/submitting-applications.html

# Run application locally on 8 cores
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master local[8] \
  /path/to/examples.jar \
  100

# Run on a Spark standalone cluster in client deploy mode
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000

# Run on a Spark standalone cluster in cluster deploy mode with supervise
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000

# Run on a YARN cluster (--deploy-mode can be client for client mode)
export HADOOP_CONF_DIR=XXX
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \
  --executor-memory 20G \
  --num-executors 50 \
  /path/to/examples.jar \
  1000

# Run a Python application on a Spark standalone cluster
./bin/spark-submit \
  --master spark://207.184.161.138:7077 \
  examples/src/main/python/pi.py \
  1000

# Run on a Mesos cluster in cluster deploy mode with supervise
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master mesos://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  http://path/to/examples.jar \
  1000


[opentsdb] A notable point in the HBase UID schema: the last assigned UID is stored in row 0x00

hbase 2018. 3. 12. 17:10

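The HBase schema used by OpenTSDB is documented at the URL below.

What stands out is that the UID table keeps mappings in both directions (forward and reverse), UIDs are assigned incrementally, and the last assigned UID is stored in the row whose key is 0x00.

This makes scanning the table very convenient.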

http://opentsdb.net/docs/build/html/user_guide/backends/hbase.html

UID Table Schema

A separate, smaller table called tsdb-uid stores UID mappings, both forward and reverse. Two columns exist, one named name that maps a UID to a string and another id mapping strings to UIDs. Each row in the column family will have at least one of three columns with mapping values. The standard column qualifiers are:

metrics for mapping metric names to UIDs
tagk for mapping tag names to UIDs
tagv for mapping tag values to UIDs.

The name family may also contain additional meta-data columns if configured.

`id` Column Family

Row Key - This will be the string assigned to the UID. E.g. for a metric we may have a value of sys.cpu.user or for a tag value it may be 42.

Column Qualifiers - One of the standard column types above.

Column Value - An unsigned integer encoded on 3 bytes by default reflecting the UID assigned to the string for the column type. If the UID length has been changed in the source code, the width may vary.

`name` Column Family

Row Key - The unsigned integer UID encoded on 3 bytes by default. If the UID length has been changed in the source code, the width may be different.

Column Qualifiers - One of the standard column types above OR one of metrics_meta, tagk_meta or tagv_meta.

Column Value - For the standard qualifiers above, the string assigned to the UID. For a *_meta column, the value will be a UTF-8 encoded, JSON formatted UIDMeta Object as a string. Do not modify the column value outside of OpenTSDB. The order of the fields is important, affecting CAS calls.

UID Assignment Row

Within the id column family is a row with a single byte key of \x00. This is the UID row that is incremented for the proper column type (metrics, tagk or tagv) when a new UID is assigned. The column values are 8 byte signed integers and reflect the maximum UID assigned for each type. On assignment, OpenTSDB calls HBase's atomic increment command on the proper column to fetch a new UID.
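The assignment scheme described above can be sketched with an in-memory stand-in for the tsdb-uid table. This is only an illustration under stated assumptions: it uses plain dictionaries rather than an HBase client, the class and method names (`UidTable`, `get_or_create`) are hypothetical, and the increment is not atomic as it would be in HBase.

```python
UID_WIDTH = 3        # default UID width in bytes, per the description above
MAXID_ROW = b"\x00"  # row key of the UID assignment row

class UidTable:
    """In-memory stand-in for tsdb-uid with 'id' and 'name' column families."""

    def __init__(self):
        # Each family maps row key -> qualifier -> value.
        self.id = {MAXID_ROW: {}}
        self.name = {}

    def _increment(self, kind):
        """Bump the 8-byte signed counter for kind in the 0x00 row."""
        row = self.id[MAXID_ROW]
        current = int.from_bytes(row.get(kind, bytes(8)), "big", signed=True)
        row[kind] = (current + 1).to_bytes(8, "big", signed=True)
        return current + 1

    def get_or_create(self, kind, name):
        """Return the UID for name, assigning the next one if it is unknown."""
        key = name.encode("utf-8")
        existing = self.id.get(key, {}).get(kind)
        if existing is not None:
            return existing
        uid = self._increment(kind).to_bytes(UID_WIDTH, "big")
        self.id.setdefault(key, {})[kind] = uid    # forward: string -> UID
        self.name.setdefault(uid, {})[kind] = key  # reverse: UID -> string
        return uid

table = UidTable()
first = table.get_or_create("metrics", "sys.cpu.user")
second = table.get_or_create("metrics", "sys.mem.free")
again = table.get_or_create("metrics", "sys.cpu.user")
max_metric_uid = int.from_bytes(table.id[MAXID_ROW]["metrics"], "big", signed=True)
print(first.hex(), second.hex(), max_metric_uid)
```

Reading the 0x00 row is enough to learn the highest UID assigned for each type, without scanning the rest of the table.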


[zookeeper] Memory settings

general java 2018. 3. 11. 00:11

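When ZooKeeper was used for Kafka with its default settings, ZooKeeper ran out of memory and caused all sorts of trouble.

It is a good idea to configure both the memory settings and JMX.

Add a conf/java.env file to set the memory options and to write GC log files. The settings below can be treated as a rough baseline.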

export JVMFLAGS="-Xmx3g -Xms3g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:CompileThreshold=200 -verbosegc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/lib/zookeeper/gc.log -XX:+UseGCLogFileRotation -XX:GCLogFileSize=10m -XX:NumberOfGCLogFiles=10"

Add the following to zkServer.sh to enable JMX monitoring.

-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8989 -Djava.rmi.server.hostname=my.remoteconsole.org
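If the heap size or GC log path differs per host, the java.env line can be generated instead of edited by hand. The sketch below is a hypothetical helper (the name `zookeeper_java_env` and its parameters are assumptions) that reuses a subset of the flags from the JVMFLAGS line above. Note that these GC logging flags are the Java 8 style options.

```python
def zookeeper_java_env(heap="3g", gc_log="/var/lib/zookeeper/gc.log",
                       log_files=10, log_size="10m"):
    """Compose the export line for conf/java.env from a few parameters."""
    flags = [
        f"-Xmx{heap}", f"-Xms{heap}",          # fixed-size heap
        "-XX:+UseG1GC", "-XX:MaxGCPauseMillis=200",
        "-XX:+PrintGCDetails", "-XX:+PrintGCDateStamps",
        f"-Xloggc:{gc_log}",                   # GC log location
        "-XX:+UseGCLogFileRotation",
        f"-XX:GCLogFileSize={log_size}",
        f"-XX:NumberOfGCLogFiles={log_files}",
    ]
    return 'export JVMFLAGS="' + " ".join(flags) + '"'

line = zookeeper_java_env(heap="4g")
print(line)
```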


[Datagrip] Opening a query editor

Tool 2018. 3. 8. 17:50

To use a query editor in JetBrains DataGrip, choose File -> New -> Console.

Alternatively, choose Database -> + -> Console.


[Repost] A good explanation of Java 8 parallel streams

java core 2018. 3. 8. 10:08

https://www.slideshare.net/dgomezg/parallel-streams-en-java-8

Parallel streams in java 8 from David Gómez García


[python] Running test code with tox

python 2018. 2. 23. 16:31

Flask's test code is run with the following tool.

$ tox -e flake8,py27

tox is a standard tool for running Python tests.

https://tox.readthedocs.io/en/latest/

If you use pyenv, install the requirements and then run tox as follows.

pip install -r requirements.txt -i http://proxy.google.com/pypi/simple/ --trusted-host proxy.google.com

~/.pyenv/shims/tox -e flake8,py27


[python] Things to watch out for when using join in sql_alchemy: understanding sql_alchemy

python 2018. 2. 22. 20:52

In the query below, the column in SELECT and the table in FROM do not match. (This can actually happen.)

SELECT DISTINCT kibanaauth_esidx.esidx
FROM kibanaauth_role role

This query cannot perform the join, because kibanaauth_role and kibanaauth_esidx are different tables.

When sql_alchemy's query() is used together with join(), the tables of the models passed to query() are always applied internally to both SELECT and FROM in the generated SQL.

So I first changed the query so that SELECT and FROM refer to the same table:

SELECT DISTINCT kibanaauth_esidx.esidx AS kibanaauth_esidx_esidx
FROM kibanaauth_esidx

Then I wrote the sql_alchemy statement below and tested it. The join works.

# The query starts from LogAuthServiceTag, so FROM begins at its table.
aaa = (
    session.query(LogAuthServiceTag.esidx)
    .distinct()
    .join(LogAuthRoleServiceTag, LogAuthServiceTag.id == LogAuthRoleServiceTag.esidx_id)
    .join(LogAuthRole, LogAuthRole.id == LogAuthRoleServiceTag.role_id)
    .join(LogAuthRoleUser, LogAuthRoleUser.role_id == LogAuthRole.id)
    .join(LogAuthUser, LogAuthUser.id == LogAuthRoleUser.user_id)
    .filter(LogAuthUser.userid == userid)
)

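Even when several models are passed to query(), sql_alchemy combines them internally, so in some situations the FROM clause can come out in an unexpected form.

When using sql_alchemy for complex queries, check the generated SQL while debugging.

To put it another way, the statement

session.query(Post) \
    .join(User, Post.author_id == User.id)

is translated roughly into the SQL below. The argument of query() is carried over into both SELECT and FROM (not always, but usually).

SELECT post.*
FROM post
INNER JOIN user ON post.author_id = user.id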
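The underlying SQL rule can be checked without sql_alchemy. The sketch below uses the standard-library sqlite3 module with the two table names from this post. The column definitions and sample rows are assumptions made up for the demonstration. It shows that a SELECT naming a table that is absent from FROM is rejected, while the corrected query runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE kibanaauth_role  (id INTEGER PRIMARY KEY, role TEXT);
    CREATE TABLE kibanaauth_esidx (id INTEGER PRIMARY KEY, esidx TEXT);
    INSERT INTO kibanaauth_role  VALUES (1, 'admin');
    INSERT INTO kibanaauth_esidx VALUES (1, 'logs-a'), (2, 'logs-a'), (3, 'logs-b');
""")

# SELECT names a table that is missing from FROM: the database rejects it.
try:
    conn.execute("SELECT DISTINCT kibanaauth_esidx.esidx FROM kibanaauth_role role")
    mismatch_error = None
except sqlite3.OperationalError as exc:
    mismatch_error = str(exc)

# SELECT and FROM agree on the table: the query runs.
rows = conn.execute(
    "SELECT DISTINCT kibanaauth_esidx.esidx AS kibanaauth_esidx_esidx "
    "FROM kibanaauth_esidx ORDER BY 1"
).fetchall()

print(mismatch_error)
print(rows)
```

With sql_alchemy the same check is usually done by printing the generated statement, for example with `print(str(query))`, before running it.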


[sql_alchemy] Example of joining three tables

python 2018. 2. 21. 20:10

Suppose you want a query that joins three tables in Python's sql_alchemy to look up the roles of a specific user.

SELECT role.role
FROM (user JOIN roleuser ON user.id = roleuser.user_id)
LEFT JOIN role role ON roleuser.role_id = role.id
WHERE user.userid = 'sma'

In Python's sql_alchemy it is coded as follows.

# select_from(User) makes FROM start at the user table, as in the SQL above.
instance = (
    session.query(Role.role)
    .select_from(User)
    .join(RoleUser, User.id == RoleUser.user_id)
    .outerjoin(Role, RoleUser.role_id == Role.id)
    .filter(User.userid == userid)
)

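To also return the other joined tables, add them with add_entity() as follows.

# add_entity() appends RoleUser and User to the returned result rows.
instance = (
    session.query(Role.role)
    .select_from(User)
    .join(RoleUser, User.id == RoleUser.user_id)
    .outerjoin(Role, RoleUser.role_id == Role.id)
    .filter(User.userid == userid)
    .add_entity(RoleUser)
    .add_entity(User)
)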
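To see what the three-table join returns, the target SQL can be run directly against an in-memory SQLite database. This is a minimal sketch using the standard-library sqlite3 module rather than sql_alchemy. The table layouts, the sample rows, and the helper `roles_for` are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user     (id INTEGER PRIMARY KEY, userid TEXT);
    CREATE TABLE role     (id INTEGER PRIMARY KEY, role TEXT);
    CREATE TABLE roleuser (user_id INTEGER, role_id INTEGER);
    INSERT INTO user VALUES (1, 'sma'), (2, 'other');
    INSERT INTO role VALUES (10, 'admin'), (20, 'viewer');
    INSERT INTO roleuser VALUES (1, 10), (1, 20), (2, 20);
""")

def roles_for(userid):
    """Return the role names of one user via user -> roleuser -> role."""
    rows = conn.execute(
        """
        SELECT role.role
        FROM user
        JOIN roleuser ON user.id = roleuser.user_id
        LEFT JOIN role ON roleuser.role_id = role.id
        WHERE user.userid = ?
        ORDER BY role.role
        """,
        (userid,),
    ).fetchall()
    return [r[0] for r in rows]

print(roles_for("sma"))
```

The user id is passed as a bound parameter instead of being formatted into the SQL string, which is also what sql_alchemy's filter() does.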
