When storing time information in Cassandra, you can choose among three types:
1. uuid
2. timeuuid
3. timestamp
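timeuuid is derived from uuid. The big advantages of a timeuuid are that it is still a standard UUID and that the dateOf and unixTimestampOf functions can be applied to it in CQL. These functions cannot be used on a timestamp. A timeuuid also sorts by time out of the box, and TimeUUIDType can be specified as the comparator type.
A timeuuid is generated from the MAC address of the node that creates it, so you need to weigh whether to use timestamp or timeuuid.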
| CQL type | Constants | Description |
|---|---|---|
| timestamp | integers, strings | Date plus time, encoded as 8 bytes since epoch |
| uuid | uuids | A UUID in standard UUID format |
| timeuuid | uuids | Type 1 UUID only (CQL 3) |
- uuid
32 hex digits, 0-9 or a-f, which are case-insensitive, separated by dashes, -, after the 8th, 12th, 16th, and 20th digits. For example: 01234567-0123-0123-0123-0123456789ab
- timeuuid
Uses the time in 100 nanosecond intervals since 00:00:00.00 UTC, 15 October 1582 (60 bits), a clock sequence number for prevention of duplicates (14 bits), plus the IEEE 802 MAC address (48 bits) to generate a unique identifier. For example: d2177dd0-eaa2-11de-a572-001b779c76e3
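The three fields described above can be read back out of any version 1 UUID. The following is a minimal Python sketch using the standard uuid module, not Cassandra itself. The helper name unix_millis and the constant UUID_EPOCH_OFFSET are made up for this illustration, and the helper only approximates what unixTimestampOf does in CQL.

```python
import uuid
from datetime import datetime, timezone

# Offset between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01),
# expressed in 100-nanosecond intervals.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def unix_millis(u: uuid.UUID) -> int:
    """Hypothetical helper: milliseconds since the Unix epoch for a version 1 UUID."""
    if u.version != 1:
        raise ValueError("only version 1 UUIDs carry a timestamp")
    return (u.time - UUID_EPOCH_OFFSET) // 10_000


# The example timeuuid from the text above.
example = uuid.UUID("d2177dd0-eaa2-11de-a572-001b779c76e3")

print("version  :", example.version)         # UUID version field
print("time     :", hex(example.time))       # 60-bit timestamp
print("clock seq:", hex(example.clock_seq))  # 14-bit clock sequence
print("node     :", hex(example.node))       # 48-bit node (MAC address)
print("as date  :", datetime.fromtimestamp(unix_millis(example) / 1000, tz=timezone.utc))
```

Because the node field is stored in every identifier, anyone who can read a timeuuid can also read which machine generated it, which is part of the timestamp versus timeuuid trade-off.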
http://stackoverflow.com/questions/17945677/cassandra-uuid-vs-timeuuid-benefits-and-disadvantages
UUID and TIMEUUID are stored the same way in Cassandra, and they only really represent two different sorting implementations. TIMEUUID columns are sorted by their time components first, and then by their raw bytes, whereas UUID columns are sorted by their version first, then if both are version 1 by their time component, and finally by their raw bytes. Curiously, the time component sorting implementations are duplicated between UUIDType and TimeUUIDType in the Cassandra code, except for different formatting.
I think of the UUID vs. TIMEUUID question primarily as documentation: if you choose TIMEUUID you're saying that you're storing things in chronological order, and that these things can occur at the same time, so a simple timestamp isn't enough. Using UUID says that you don't care about order (even if in practice the columns will be ordered by time if you put version 1 UUIDs in them), you just want to make sure that things have unique IDs.
Even if using NOW() to generate UUID values is convenient, it's also very surprising to other people reading your code.
It probably does not matter much in the grand scheme of things, but sorting non-version 1 UUIDs is a bit faster than version 1, so if you have a UUID column and generate the UUIDs yourself, go for another version.
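To make the two orderings from the answer concrete, here is a rough Python sketch of the comparison rules as described there. It is not Cassandra's comparator code. The names make_v1, timeuuid_sort_key and uuid_sort_key are hypothetical, and comparing raw bytes as unsigned values is an assumption of this sketch.

```python
import uuid


def make_v1(timestamp: int, clock_seq: int = 0, node: int = 0x001B779C76E3) -> uuid.UUID:
    """Hypothetical helper: build a version 1 UUID from a 60-bit timestamp."""
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi = (timestamp >> 48) & 0x0FFF
    fields = (time_low, time_mid, time_hi, (clock_seq >> 8) & 0x3F, clock_seq & 0xFF, node)
    # version=1 makes the constructor set the version and variant bits.
    return uuid.UUID(fields=fields, version=1)


def timeuuid_sort_key(u: uuid.UUID):
    # TIMEUUID rule from the answer: time component first, then raw bytes.
    return (u.time, u.bytes)


def uuid_sort_key(u: uuid.UUID):
    # UUID rule from the answer: version first, then the time component
    # (only meaningful when both are version 1), then raw bytes.
    version = u.version or 0
    time_part = u.time if version == 1 else 0
    return (version, time_part, u.bytes)


# The low 32 bits of the timestamp come first in the byte layout, so the
# earlier value here has the *larger* leading bytes.
early = make_v1(0x0_FFFF_FFFF)
late = make_v1(0x1_0000_0000)
random_id = uuid.uuid4()

by_bytes = sorted([early, late], key=lambda u: u.bytes)
by_time = sorted([late, early], key=timeuuid_sort_key)
mixed = sorted([random_id, late, early], key=uuid_sort_key)

print([str(u) for u in by_bytes])
print([str(u) for u in by_time])
print([str(u) for u in mixed])
```

The pair early and late is chosen so that a plain byte comparison puts them in the wrong chronological order, which is why a time-aware comparator is needed for version 1 UUIDs.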
http://wiki.apache.org/cassandra/FAQ#working_with_timeuuid_in_java
http://www.datastax.com/docs/0.8/dml/using_cli
[default@demo] CREATE COLUMN FAMILY blog_entry WITH comparator = TimeUUIDType AND key_validation_class=UTF8Type
AND default_validation_class = UTF8Type;