Posts in the 'Elasticsearch' category (page 10)

[elasticsearch] Understanding source filtering

Elasticsearch 2015. 7. 10. 11:26

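Source filtering controls how the stored original document (_source) is returned with each hit.

_source value	Meaning	Example
true	Return the original document.	{ "_source": true, "query" : { "term" : { "name" : "samuel" } } }
false	Do not return the original document.	{ "_source": false, "query" : { "term" : { "name" : "samuel" } } }
obj.*	Return only the part of the document that matches the pattern.	{ "_source": "obj.*", "query" : { "term" : { "name" : "samuel" } } }
exclude/include	Exclude or include the fields that match the given patterns.	{ "_source": { "include": [ "name" ] } }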

https://www.elastic.co/guide/en/elasticsearch/reference/1.6/search-request-source-filtering.html
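
As a rough sketch, the four variants in the table can be generated from one small helper. The helper name search_body is made up for this illustration, and the field and value simply reuse the name/samuel example; the code only builds the request bodies and does not contact a cluster.

```python
import json


def search_body(source, field="name", value="samuel"):
    """Build a term-query search body with the given _source setting.

    search_body is a hypothetical helper for this post, not a client API.
    """
    return {"_source": source, "query": {"term": {field: value}}}


# One body per row of the table.
bodies = [
    search_body(True),                   # return the whole original document
    search_body(False),                  # return no original document
    search_body("obj.*"),                # return only fields matching obj.*
    search_body({"include": ["name"]}),  # return only the listed fields
]

for body in bodies:
    print(json.dumps(body))
```

Each printed line can be used as the body of a search request.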


Posted by '김용환'


[elasticsearch] Using doc and _value in scripts with aggregations

Elasticsearch 2015. 7. 9. 14:02

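A script can access field values in at least two ways: through the doc context and through _value.

(I am still learning Elasticsearch, so there may be more ways than these.)

Typically, an Elasticsearch script accesses a field through the doc context.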

"script_fields" : { "test1" : { "script" : "doc['my_field_name'].value * 2" }, "test2" : { "script" : "doc['my_field_name'].value * factor", "params" : { "factor" : 2.0 } } }

Source: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-script-fields.html

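In an aggregation, a script can use _value instead of the doc context. When the aggregation also names a field, _value stands for each value of that field.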

{ "aggs" : { ... "aggs" : { "daytime_return" : { "sum" : { "field" : "change", "script" : "_value * _value" } } } } }

https://www.elastic.co/guide/en/elasticsearch/reference/1.6/search-aggregations-metrics-sum-aggregation.html
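
To make the difference concrete, here is a plain-Python emulation of the two styles. It is only a sketch of the semantics as I understand them and does not run inside Elasticsearch; the function names and the sample documents are invented for the example.

```python
def script_field(doc, field, factor):
    """Emulate the script field "doc['<field>'].value * factor" for one document."""
    return doc[field] * factor


def sum_with_value_script(docs, field):
    """Emulate a sum aggregation with "field": <field> and "script": "_value * _value".

    The script is applied to every value of the field and the results are summed.
    """
    return sum(d[field] * d[field] for d in docs if field in d)


# Hypothetical documents carrying the "change" field used in the aggregation example.
docs = [{"change": 1.5}, {"change": -2.0}, {"change": 0.5}]

doubled = script_field(docs[0], "change", 2.0)
daytime_return = sum_with_value_script(docs, "change")
print(doubled, daytime_return)
```

The doc style names the field inside the script, while the _value style leaves the field choice to the aggregation and lets the script only transform each value.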


[elasticsearch] Is there a limit on the number of sub-aggregations?

Elasticsearch 2015. 7. 4. 19:54

Elasticsearch aggregations can nest sub-aggregations with no hard limit on the depth (number of levels).

https://www.elastic.co/guide/en/elasticsearch/reference/1.x/search-aggregations.html

Bucketing aggregations can have sub-aggregations (bucketing or metric). The sub-aggregations will be computed for the buckets which their parent aggregation generates. There is no hard limit on the level/depth of nested aggregations (one can nest an aggregation under a "parent" aggregation, which is itself a sub-aggregation of another higher-level aggregation).
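
A small sketch of what such nesting looks like in a request body: the helper below wraps a metric aggregation in one terms aggregation per field. The helper names, the by_ prefix, and the field names are all made up for the illustration.

```python
def nest_terms(fields, leaf_aggs):
    """Wrap leaf_aggs in one terms aggregation per field, outermost field first."""
    aggs = leaf_aggs
    for field in reversed(fields):
        aggs = {"by_" + field: {"terms": {"field": field}, "aggs": aggs}}
    return aggs


def depth(aggs):
    """Count nesting levels, assuming a single aggregation per level."""
    levels = 0
    while aggs:
        levels += 1
        _name, body = next(iter(aggs.items()))
        aggs = body.get("aggs")
    return levels


# Hypothetical fields: three bucketing levels around one sum metric.
leaf = {"total_change": {"sum": {"field": "change"}}}
body = {"aggs": nest_terms(["state", "city", "last_name"], leaf)}
print(depth(body["aggs"]))
```

Even without a hard limit, every extra level multiplies the number of buckets, so deep nesting can still be expensive.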


[elasticsearch] How to use copy_to

Elasticsearch 2015. 7. 4. 18:50

I looked into how copy_to is meant to be used in Elasticsearch (https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-core-types.html#copy-to).

It lets you search across information that is stored in separate, non-overlapping fields.

For example, you can store the last name and the first name separately and still make the full name searchable.

Or you can store a name and an alias separately and still have both feed suggestions.

Example 1)

{
  "name": {"type": "string", "copy_to": ["suggest"]},
  "alias": {"type": "string", "copy_to": ["suggest"]},
  "suggest": {
    "type": "completion",
    "payloads": true,
    "index_analyzer": "simple",
    "search_analyzer": "simple"
  }
}

Example 2)

{
  "people": {
    "properties": {
      "last_name": {
        "type": "string",
        "copy_to": "full_name"
      },
      "first_name": {
        "type": "string",
        "copy_to": "full_name"
      },
      "state": {
        "type": "string"
      },
      "city": {
        "type": "string"
      },
      "full_name": {
        "type": "string"
      }
    }
  }
}
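
To picture the effect of copy_to, the following plain-Python sketch emulates which values become searchable under each field. It is a simplified model under my own assumptions, not how Elasticsearch is implemented; the function name and the sample document are invented.

```python
def indexed_values(doc, properties):
    """Return {field: [values]} as seen by search, applying copy_to from the mapping."""
    indexed = {}
    for field, value in doc.items():
        indexed.setdefault(field, []).append(value)
        targets = properties.get(field, {}).get("copy_to", [])
        if isinstance(targets, str):  # copy_to accepts one name or a list of names
            targets = [targets]
        for target in targets:
            indexed.setdefault(target, []).append(value)
    return indexed


# Trimmed version of the people mapping from example 2.
properties = {
    "last_name": {"type": "string", "copy_to": "full_name"},
    "first_name": {"type": "string", "copy_to": "full_name"},
    "full_name": {"type": "string"},
}
doc = {"first_name": "Samuel", "last_name": "Kim"}  # hypothetical document

result = indexed_values(doc, properties)
print(result["full_name"])
```

The document itself never gains a full_name key; only the searchable view does, which is why a query on full_name can match either part of the name.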


[elasticsearch] Switching indices by using an alias

Elasticsearch 2015. 7. 1. 01:00

Indexing in Elasticsearch can take a long time, so a search that runs while an index is being rebuilt may see missing data. To avoid this, create two indices and alternate the indexing work between them.

This makes the switch possible with zero downtime.

Create an index with an arbitrary name, here poi_suggest_v1.

curl -s -XPUT http://localhost:9200/poi_suggest_v1/ -d '{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0
    }
  },
  "mappings": {
    "suggest": {
      "properties": {
        "title": {
          "index": "not_analyzed",
          "type": "string"
        }
      }
    }
  }
}'

Give poi_suggest_v1 an alias.

curl -XPOST localhost:9200/_aliases -d '
{
  "actions": [
    { "add": {
      "alias": "poi_suggest",
      "index": "poi_suggest_v1"
    }}
  ]
}
'

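In the head plugin UI you can see that the alias points to v1. (v2 has not been created yet.)

Read one document through the alias poi_suggest. The request is served through the alias, and the _index attribute of the response shows the actual index.

The application should always refer to poi_suggest.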

curl -XGET localhost:9200/poi_suggest/suggest/1

{"_index":"poi_suggest_v1","_type":"suggest","_id":"1","found":false}

Create poi_suggest_v2.

curl -s -XPUT http://localhost:9200/poi_suggest_v2/ -d '{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0
    }
  },
  "mappings": {
    "suggest": {
      "properties": {
        "title": {
          "index": "not_analyzed",
          "type": "string"
        }
      }
    }
  }
}'

Fetch the data and index it into poi_suggest_v2, or copy it from v1 to v2.

(This is known as scroll-reindex.)

Once the new index is ready to serve requests, switch the alias to poi_suggest_v2.

curl -XPOST localhost:9200/_aliases -d '
{
  "actions": [
    { "remove": {
      "alias": "poi_suggest",
      "index": "poi_suggest_v1"
    }},
    { "add": {
      "alias": "poi_suggest",
      "index": "poi_suggest_v2"
    }}
  ]
}
'

In the head plugin UI the alias now appears on poi_suggest_v2.

To the application, poi_suggest still looks unchanged, but the actual index is now poi_suggest_v2, as the _index value in the response shows.

curl -XGET localhost:9200/poi_suggest/suggest/1

{"_index":"poi_suggest_v2","_type":"suggest","_id":"1","found":false}
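
For repeated switches, the remove/add body can be generated instead of typed by hand. This is a minimal sketch that only builds the JSON; the function name is invented and sending the request is left out. Putting both actions in one _aliases request matters because they are then applied atomically, so the alias never points at nothing.

```python
import json


def swap_alias_actions(alias, old_index, new_index):
    """Build the _aliases body that moves an alias from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"alias": alias, "index": old_index}},
            {"add": {"alias": alias, "index": new_index}},
        ]
    }


body = swap_alias_actions("poi_suggest", "poi_suggest_v1", "poi_suggest_v2")
print(json.dumps(body, indent=2))
```

The printed JSON is the same body that is posted to localhost:9200/_aliases above. Swapping back to v1 on the next rebuild is just a call with the two index names reversed.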


[elasticsearch] attachment type

Elasticsearch 2015. 6. 30. 23:49

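Elasticsearch supports an attachment type through a plugin (the plugin must be installed).

The attachment type lets you index an ordinary document encoded in base64 and then search it.

Internally it is based on the Apache Tika project. Tika is still under active development, so I expect it to be reasonably usable.

A large number of document formats are supported, including archives, video, and images.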

(http://tika.apache.org/1.5/formats.html#Supported_Document_Formats)

PUT /test/person/_mapping
{
  "person" : {
    "properties" : {
      "file" : {
        "type" : "attachment",
        "fields" : {
          "file" : {"index" : "no"},
          "title" : {"store" : "yes"},
          "date" : {"store" : "yes"},
          "author" : {"analyzer" : "myAnalyzer"},
          "keywords" : {"store" : "yes"},
          "content_type" : {"store" : "yes"},
          "content_length" : {"store" : "yes"},
          "language" : {"store" : "yes"}
        }
      }
    }
  }
}

PUT /test/person/1
{
  "file" : "... base64 encoded attachment ..."
}
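
The base64 value itself can be produced with the standard library. The sketch below only builds the document body; the helper name and the sample bytes are made up, and the field name is passed in so that it can match whatever attachment field the mapping declares (file in the mapping above).

```python
import base64
import json


def attachment_document(raw_bytes, field):
    """Return a document body holding raw_bytes base64-encoded under the given field."""
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return {field: encoded}


# Hypothetical content; in practice this would be the bytes read from a PDF, DOC, etc.
doc = attachment_document(b"hello attachment", "file")
print(json.dumps(doc))
```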

References

https://www.elastic.co/guide/en/elasticsearch/reference/1.4/mapping-attachment-type.html

https://github.com/elastic/elasticsearch-mapper-attachments

http://tika.apache.org/

http://www.scrutmydocs.org/

http://tika.apache.org/1.5/formats.html#Supported_Document_Formats


[elasticsearch] Geo shape extension - spatial4j, jts

Elasticsearch 2015. 6. 29. 20:39

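Elasticsearch's geo shape support can be extended. Adding spatial4j and jts as dependencies enables the geo shape filter.

<dependency>
  <groupId>com.spatial4j</groupId>
  <artifactId>spatial4j</artifactId>
  <!-- set <version> as listed in the Java API documentation linked below -->
</dependency>

<dependency>
  <groupId>com.vividsolutions</groupId>
  <artifactId>jts</artifactId>
  <!-- set <version> as listed in the Java API documentation linked below -->
  <exclusions>
    <exclusion>
      <groupId>xerces</groupId>
      <artifactId>xercesImpl</artifactId>
    </exclusion>
  </exclusions>
</dependency>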

As the links below show, a really wide variety of shapes is supported.

- https://github.com/spatial4j/spatial4j

- http://www.vividsolutions.com/jts/jtshome.htm

https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/query-dsl-filters.html#geo-shape-filter

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-geo-shape-filter.html
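
For orientation, this is roughly what a geo shape filter request looks like in the 1.x request format, written as a Python dict builder. Treat it as a sketch: the helper name, the field name location, and the coordinates are illustrative assumptions, and nothing is sent to a cluster.

```python
import json


def envelope_filter(field, top_left, bottom_right):
    """Build a geo_shape filter matching shapes that intersect a bounding box.

    Coordinates are (longitude, latitude) pairs.
    """
    return {
        "geo_shape": {
            field: {
                "shape": {
                    "type": "envelope",
                    "coordinates": [list(top_left), list(bottom_right)],
                }
            }
        }
    }


# Hypothetical field and box; wrapped in a filtered query as used by 1.x.
body = {
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": envelope_filter("location", (13.0, 53.0), (14.0, 52.0)),
        }
    }
}
print(json.dumps(body))
```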


[elasticsearch] geo point parameters

Elasticsearch 2015. 6. 28. 23:29

When mapping a geo_point field, you can set several parameters, as follows.

PUT /attractions
{
  "mappings": {
    "restaurant": {
      "properties": {
        "name": {
          "type": "string"
        },
        "location": {
          "type": "geo_point",
          "lat_lon": true,
          "geohash_prefix": true,
          "geohash_precision": "1km"
        }
      }
    }
  }
}

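* lat_lon (default: false): stores the latitude and longitude in the .lat and .lon fields. This is said to improve performance.

* geohash (default: false): stores the computed geohash value.

* geohash_precision (default: 12): defines the precision used for the geohash calculation.

For example, a value of [271, 37] would be stored as follows (the numbers and the geohash are placeholders, not valid coordinates or a real hash).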

location = "271, 37"

location.lat = 271

location.lon = 37

location.geohash = "adfbdf1"
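
To get a feel for what geohash precision means, here is a small geohash encoder in plain Python. It follows the standard public geohash algorithm and is my own sketch, not Elasticsearch's code; the coordinates are the well-known example point from the geohash literature.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=12):
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    value = 0
    even = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if even else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            value = value << 1
            rng[1] = mid  # keep the lower half
        even = not even
        bits += 1
        if bits == 5:  # every 5 bits become one base32 character
            chars.append(BASE32[value])
            bits = 0
            value = 0
    return "".join(chars)


short_hash = geohash_encode(57.64911, 10.40744, 3)
longer_hash = geohash_encode(57.64911, 10.40744, 5)
print(short_hash, longer_hash)  # u4p u4pru
```

A shorter hash is always a prefix of the longer one and covers a larger cell, which is why a lower precision means coarser locations and a smaller index.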

References

https://www.elastic.co/guide/en/elasticsearch/guide/current/geohash-mapping.html

https://www.elastic.co/guide/en/elasticsearch/guide/current/geopoints.html


[elasticsearch] Ways to manage relationships between objects (object, nested, child)

Elasticsearch 2015. 6. 27. 23:55

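Elasticsearch offers several ways to manage relationships between objects.

1. type=object: handled implicitly by Elasticsearch and treated as part of the main document. It is fast, but changing a value in the embedded object requires reindexing the main document.

2. type=nested: allows more precise searching and filtering on the parent document, because each nested object is matched as a unit rather than as loose field values.

3. External child documents: a child is a separate document that carries a _parent property binding it to its parent document. A child document must be indexed on the same shard as its parent. Joining with the parent is slower than using nested documents.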
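
The difference in precision between object and nested is easiest to see with a small emulation. The code below is a simplified model in plain Python, not Elasticsearch internals; the function names and the comments document are invented for the example.

```python
def flatten_object(doc, field):
    """type=object: inner objects are flattened into one value list per key."""
    flat = {}
    for inner in doc[field]:
        for key, value in inner.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


def match_object(doc, field, criteria):
    """Match each criterion against the flattened lists, ignoring object boundaries."""
    flat = flatten_object(doc, field)
    return all(value in flat.get(f"{field}.{key}", []) for key, value in criteria.items())


def match_nested(doc, field, criteria):
    """type=nested: all criteria must hold within the same inner object."""
    return any(
        all(inner.get(key) == value for key, value in criteria.items())
        for inner in doc[field]
    )


# Hypothetical document: alice is 28 and bob is 31.
doc = {"comments": [{"name": "alice", "age": 28}, {"name": "bob", "age": 31}]}
criteria = {"name": "alice", "age": 31}

print(match_object(doc, "comments", criteria))  # True: values from different objects combine
print(match_nested(doc, "comments", criteria))  # False: no single comment has both values
```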


[elasticsearch] DFS Query Then Fetch

Elasticsearch 2015. 6. 24. 19:19

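There is not much documentation about DFS in Elasticsearch.

DFS may bring Depth First Search to mind, but in Elasticsearch it seems to stand for Document Frequency Statistics.

DFS Query Then Fetch is related to TF-IDF: it gathers statistics on how often terms occur across all shards so that significant keywords get better scores.

It can therefore be somewhat slower, but it produces better-quality results.

Both search types rank documents in which the keywords occur often near the top. Unlike Query Then Fetch, DFS Query Then Fetch differs in two ways: it first sends a prequery to the shards to collect term and document frequencies, and it then computes the scores from these overall statistics.

* See the references below for details.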

https://en.wikipedia.org/wiki/Tf%E2%80%93idf

https://www.elastic.co/blog/understanding-query-then-fetch-vs-dfs-query-then-fetch
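
The effect of the prequery can be shown with a toy calculation. This sketch compares an IDF computed per shard with one computed from frequencies gathered across all shards. The IDF formula is a simplified stand-in chosen for the example, not Lucene's exact formula, and the shard contents are invented.

```python
import math


def doc_freq(shard, term):
    """Number of documents in the shard that contain the term."""
    return sum(1 for doc in shard if term in doc.split())


def idf(total_docs, df):
    """Simplified inverse document frequency (an assumption for this sketch)."""
    return math.log((total_docs + 1) / (df + 1)) + 1


def local_idfs(shards, term):
    """Query Then Fetch: each shard scores with its own statistics."""
    return [idf(len(shard), doc_freq(shard, term)) for shard in shards]


def global_idf(shards, term):
    """DFS Query Then Fetch: frequencies are summed over all shards first."""
    total = sum(len(shard) for shard in shards)
    df = sum(doc_freq(shard, term) for shard in shards)
    return idf(total, df)


shards = [
    ["red apple", "green apple", "apple pie"],  # "apple" is common on this shard
    ["blue sky", "red car", "apple tree"],      # "apple" is rare on this shard
]

local = local_idfs(shards, "apple")
overall = global_idf(shards, "apple")
print(local, overall)
```

With per-shard statistics the same term gets a different weight depending on where a document happens to live, while the global value is identical for every shard. That consistency is what the extra round trip buys.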
