The log messages produced by a nodetool repair run are explained at the URL below.


http://www.datastax.com/dev/blog/interpreting-repair-logs



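When you actually run nodetool repair, the log looks much like what the DataStax post describes, so the post is a useful study aid. The messages come from the following source files, roughly in this order: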


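1. RepairJob.java: sends the requests to build merkle trees.

2. RepairSession.java: receives the merkle trees from each replica.

3. Differencer.java: checks whether each pair of replicas is consistent or inconsistent.

4. StreamingRepairTask.java: reports the status of the repair stream.

5. RepairSession.java: reports that the sync has completed.

6. StorageService.java: reports the repair session range (I did not find this one in my own log).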
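
The mapping above can be sketched as a small Python helper that labels a log line with its repair stage. This is only an illustration: the `STAGES` table, the stage labels, and the `stage_of` function are names made up here, and the file names are the ones that appear in the log output (for example `Differencer.java`). It assumes the log line layout shown in this post, where the source file is followed by `:<line number> - `.

```python
import re

# Hypothetical lookup table: source file -> repair stage (labels are my own wording).
STAGES = {
    "RepairJob.java": "request merkle trees",
    "RepairSession.java": "merkle tree received / session status",
    "Differencer.java": "compare replicas",
    "StreamingRepairTask.java": "stream repair",
    "StorageService.java": "repair session range",
}

# Matches the "<File>.java:<line> - " part of a log line.
SOURCE_RE = re.compile(r"\s(\w+\.java):\d+ - ")


def stage_of(line):
    """Return the repair stage for a log line, or None if it is not a known repair file."""
    match = SOURCE_RE.search(line)
    if match is None:
        return None
    return STAGES.get(match.group(1))
```

Lines from unrelated classes such as `Memtable.java` simply return `None`, so the helper can be used to filter a full system.log down to repair-related entries.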






Below is the actual log from my cluster. It differs slightly from the DataStax post, but not in any significant way.


$ nodetool repair -- google_plus media

...


INFO  [RepairJobTask:3] 2020-12-31 14:25:30,934 RepairJob.java:163 - [repair #b033fec0-e2be-11e6-bcf3-edb4e9adcd85] requesting merkle trees for media (to [/1.1.1.1, /1.1.1.2, /1.1.1.3])

INFO  [AntiEntropyStage:1] 2020-12-31 14:25:30,939 RepairSession.java:171 - [repair #b033fec0-e2be-11e6-bcf3-edb4e9adcd85] Received merkle tree for media from /1.1.1.1

INFO  [AntiEntropyStage:1] 2020-12-31 14:25:30,944 RepairSession.java:171 - [repair #b033fec0-e2be-11e6-bcf3-edb4e9adcd85] Received merkle tree for media from /1.1.1.2

INFO  [AntiEntropyStage:1] 2020-12-31 14:25:30,949 RepairSession.java:171 - [repair #b033fec0-e2be-11e6-bcf3-edb4e9adcd85] Received merkle tree for media from /1.1.1.3

INFO  [RepairJobTask:2] 2020-12-31 14:25:30,949 Differencer.java:67 - [repair #b033fec0-e2be-11e6-bcf3-edb4e9adcd85] Endpoints /1.1.1.1 and /1.1.1.3 are consistent for media

INFO  [RepairJobTask:1] 2020-12-31 14:25:30,949 Differencer.java:67 - [repair #b033fec0-e2be-11e6-bcf3-edb4e9adcd85] Endpoints /1.1.1.2 and /1.1.1.3 are consistent for media

INFO  [RepairJobTask:3] 2020-12-31 14:25:30,949 Differencer.java:67 - [repair #b033fec0-e2be-11e6-bcf3-edb4e9adcd85] Endpoints /1.1.1.1 and /1.1.1.2 are consistent for media

INFO  [AntiEntropySessions:18] 2020-12-31 14:25:30,949 RepairSession.java:260 - [repair #b037cf50-e2be-11e6-bcf3-edb4e9adcd85] new session: will sync /1.1.1.3, /1.1.1.4, /1.1.1.5 on range (-7269152197870639307,-7265094729233520360] for google_plus.[media]

INFO  [AntiEntropyStage:1] 2020-12-31 14:25:30,949 RepairSession.java:237 - [repair #b033fec0-e2be-11e6-bcf3-edb4e9adcd85] media is fully synced

INFO  [AntiEntropySessions:20] 2020-12-31 14:25:30,949 RepairSession.java:299 - [repair #b033fec0-e2be-11e6-bcf3-edb4e9adcd85] session completed successfully

INFO  [RepairJobTask:2] 2020-12-31 14:25:30,960 RepairJob.java:163 - [repair #b037cf50-e2be-11e6-bcf3-edb4e9adcd85] requesting merkle trees for media (to [/1.1.1.4, /1.1.1.5, /1.1.1.3])

INFO  [AntiEntropyStage:1] 2020-12-31 14:25:30,961 RepairSession.java:171 - [repair #b037cf50-e2be-11e6-bcf3-edb4e9adcd85] Received merkle tree for media from /1.1.1.4

INFO  [AntiEntropyStage:1] 2020-12-31 14:25:30,962 RepairSession.java:171 - [repair #b037cf50-e2be-11e6-bcf3-edb4e9adcd85] Received merkle tree for media from /1.1.1.5

INFO  [AntiEntropyStage:1] 2020-12-31 14:25:30,963 RepairSession.java:171 - [repair #b037cf50-e2be-11e6-bcf3-edb4e9adcd85] Received merkle tree for media from /1.1.1.3

INFO  [RepairJobTask:2] 2020-12-31 14:25:30,963 Differencer.java:67 - [repair #b037cf50-e2be-11e6-bcf3-edb4e9adcd85] Endpoints /1.1.1.4 and /1.1.1.5 are consistent for media

INFO  [RepairJobTask:1] 2020-12-31 14:25:30,963 Differencer.java:67 - [repair #b037cf50-e2be-11e6-bcf3-edb4e9adcd85] Endpoints /1.1.1.4 and /1.1.1.3 are consistent for media

INFO  [RepairJobTask:3] 2020-12-31 14:25:30,963 Differencer.java:67 - [repair #b037cf50-e2be-11e6-bcf3-edb4e9adcd85] Endpoints /1.1.1.5 and /1.1.1.3 are consistent for media

INFO  [AntiEntropySessions:20] 2020-12-31 14:25:30,963 RepairSession.java:260 - [repair #b039f230-e2be-11e6-bcf3-edb4e9adcd85] new session: will sync /1.1.1.3, /1.1.1.1, /1.1.1.2 on range (4489624031702179489,4492484221112590824] for google_plus.[media]

INFO  [AntiEntropyStage:1] 2020-12-31 14:25:30,963 RepairSession.java:237 - [repair #b037cf50-e2be-11e6-bcf3-edb4e9adcd85] media is fully synced

INFO  [AntiEntropySessions:18] 2020-12-31 14:25:30,963 RepairSession.java:299 - [repair #b037cf50-e2be-11e6-bcf3-edb4e9adcd85] session completed successfully





....


INFO  [RepairJobTask:2] 2020-12-31 14:31:28,036 RepairJob.java:163 - [repair #850e32f0-e2bf-11e6-bcf3-edb4e9adcd85] requesting merkle trees for media (to [/1.1.1.4, /1.1.1.7, /1.1.1.3])

INFO  [AntiEntropyStage:1] 2020-12-31 14:31:28,106 RepairSession.java:171 - [repair #850e32f0-e2bf-11e6-bcf3-edb4e9adcd85] Received merkle tree for media from /1.1.1.4

INFO  [AntiEntropyStage:1] 2020-12-31 14:31:28,177 RepairSession.java:171 - [repair #850e32f0-e2bf-11e6-bcf3-edb4e9adcd85] Received merkle tree for media from /1.1.1.7

INFO  [AntiEntropyStage:1] 2020-12-31 14:31:28,221 RepairSession.java:171 - [repair #850e32f0-e2bf-11e6-bcf3-edb4e9adcd85] Received merkle tree for media from /1.1.1.3

INFO  [RepairJobTask:2] 2020-12-31 14:31:28,221 Differencer.java:74 - [repair #850e32f0-e2bf-11e6-bcf3-edb4e9adcd85] Endpoints /1.1.1.4 and /1.1.1.7 have 2 range(s) out of sync for media

INFO  [RepairJobTask:3] 2020-12-31 14:31:28,221 Differencer.java:67 - [repair #850e32f0-e2bf-11e6-bcf3-edb4e9adcd85] Endpoints /1.1.1.7 and /1.1.1.3 are consistent for media

INFO  [RepairJobTask:1] 2020-12-31 14:31:28,221 Differencer.java:74 - [repair #850e32f0-e2bf-11e6-bcf3-edb4e9adcd85] Endpoints /1.1.1.4 and /1.1.1.3 have 2 range(s) out of sync for media

INFO  [RepairJobTask:2] 2020-12-31 14:31:28,221 StreamingRepairTask.java:81 - [repair #850e32f0-e2bf-11e6-bcf3-edb4e9adcd85] Forwarding streaming repair of 2 ranges to /1.1.1.4 (to be streamed with /1.1.1.7)

INFO  [RepairJobTask:1] 2020-12-31 14:31:28,223 StreamingRepairTask.java:68 - [streaming task #850e32f0-e2bf-11e6-bcf3-edb4e9adcd85] Performing streaming repair of 2 ranges with /1.1.1.4

INFO  [RepairJobTask:1] 2020-12-31 14:31:28,224 ColumnFamilyStore.java:912 - Enqueuing flush of media: 364 (0%) on-heap, 80 (0%) off-heap

INFO  [MemtableFlushWriter:14683] 2020-12-31 14:31:28,224 Memtable.java:347 - Writing Memtable-media@1433910433(0.033KiB serialized bytes, 1 ops, 0%/0% of on/off-heap limit)

INFO  [MemtableFlushWriter:14683] 2020-12-31 14:31:28,224 Memtable.java:382 - Completed flushing /var/lib/cassandra/data/google_plus/media-6c2e8460af9311e69f844b983626f83b/google_plus-media-tmp-ka-2204-Data.db (0.000KiB) for commitlog position ReplayPosition(segmentId=1483928411656, position=2911640)

INFO  [MemtableFlushWriter:14683] 2020-12-31 14:31:28,228 Memtable.java:347 - Writing Memtable-media.friend_profileid_index@1788192341(0.030KiB serialized bytes, 1 ops, 0%/0% of on/off-heap limit)

INFO  [MemtableFlushWriter:14683] 2020-12-31 14:31:28,228 Memtable.java:382 - Completed flushing /var/lib/cassandra/data/google_plus/media-6c2e8460af9311e69f844b983626f83b/google_plus-media.friend_profileid_index-tmp-ka-1997-Data.db (0.000KiB) for commitlog position ReplayPosition(segmentId=1483928411656, position=2911640)

INFO  [RepairJobTask:1] 2020-12-31 14:31:28,232 StreamResultFuture.java:86 - [Stream #852b7ef0-e2bf-11e6-bcf3-edb4e9adcd85] Executing streaming plan for Repair

INFO  [StreamConnectionEstablisher:12] 2020-12-31 14:31:28,232 StreamSession.java:220 - [Stream #852b7ef0-e2bf-11e6-bcf3-edb4e9adcd85] Starting streaming to /1.1.1.4

INFO  [StreamConnectionEstablisher:12] 2020-12-31 14:31:28,233 StreamCoordinator.java:209 - [Stream #852b7ef0-e2bf-11e6-bcf3-edb4e9adcd85, ID#0] Beginning stream session with /1.1.1.4

INFO  [STREAM-IN-/1.1.1.4] 2020-12-31 14:31:28,240 StreamResultFuture.java:166 - [Stream #852b7ef0-e2bf-11e6-bcf3-edb4e9adcd85 ID#0] Prepare completed. Receiving 2 files(80817 bytes), sending 1 files(80161 bytes)

INFO  [StreamReceiveTask:543] 2020-12-31 14:31:28,248 SecondaryIndexManager.java:163 - Submitting index build of [media.friend_profileid_index] for data in SSTableReader(path='/var/lib/cassandra/data/google_plus/media-6c2e8460af9311e69f844b983626f83b/google_plus-media-ka-2205-Data.db'), SSTableReader(path='/var/lib/cassandra/data/google_plus/media-6c2e8460af9311e69f844b983626f83b/google_plus-media-ka-2206-Data.db')
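
When the log is long, it can help to pull out only the sessions in which replicas disagreed. The following is a minimal Python sketch, assuming the message format shown above ("Endpoints A and B have N range(s) out of sync for TABLE"). The function name `out_of_sync_by_session` and both regular expressions are my own and are not part of Cassandra.

```python
import re
from collections import defaultdict

# Repair session id, e.g. "[repair #850e32f0-e2bf-11e6-bcf3-edb4e9adcd85]".
REPAIR_ID_RE = re.compile(r"\[repair #([0-9a-f-]+)\]")

# The Differencer message for a pair of replicas that disagree.
OUT_OF_SYNC_RE = re.compile(
    r"Endpoints (/\S+) and (/\S+) have (\d+) range\(s\) out of sync for (\S+)"
)


def out_of_sync_by_session(lines):
    """Map repair session id -> list of (endpoint_a, endpoint_b, ranges, table)."""
    result = defaultdict(list)
    for line in lines:
        id_match = REPAIR_ID_RE.search(line)
        sync_match = OUT_OF_SYNC_RE.search(line)
        if id_match and sync_match:
            a, b, count, table = sync_match.groups()
            result[id_match.group(1)].append((a, b, int(count), table))
    return dict(result)
```

To use it, pass an open file, for example `out_of_sync_by_session(open("/var/log/cassandra/system.log"))`. Sessions whose replicas were all consistent do not appear in the result.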





To check whether anything went wrong during the repair, search the log as follows.


$ grep -r 'Sync failed' /var/log/cassandra/system.log

..


$ grep -i -e "ERROR" -e "WARN" /var/log/cassandra/system.log

..

WARN  [SharedPool-Worker-1] 2020-01-14 13:17:13,272 SliceQueryFilter.java:319 - Read 6 live and 1157 tombstone cells in google_plus.part_of_friend_permission_activities.allow_profile_ids_idx for key: 04727367 (see tombstone_warn_threshold). 200 columns were requested, slices=[04488138:_-04488138:!]

WARN  [SharedPool-Worker-7] 2020-01-14 13:17:13,287 SliceQueryFilter.java:319 - Read 6 live and 1157 tombstone cells in google_plus.part_of_friend_permission_activities.allow_profile_ids_idx for key: 04727367 (see tombstone_warn_threshold). 200 columns were requested, slices=[04488138:_-04488138:!]

java.io.IOException: Error while read(...): Connection timed out

WARN  [SharedPool-Worker-3] 2020-01-17 10:58:03,992 SliceQueryFilter.java:319 - Read 7 live and 1330 tombstone cells in google_plus.part_of_friend_permission_activities.allow_profile_ids_idx for key: 04ee9da7 (see tombstone_warn_threshold). 200 columns were requested, slices=[04f4dfdb:_-04f4dfdb:!]

java.io.IOException: Error while read(...): Connection reset by peer

java.io.IOException: Error while read(...): Connection reset by peer

WARN  [SharedPool-Worker-3] 2020-01-24 12:11:57,223 SliceQueryFilter.java:319 - Read 4 live and 1232 tombstone cells in google_plus.part_of_friend_permission_activities.allow_profile_ids_idx for key: 04e66181 (see tombstone_warn_threshold). 200 columns were requested, slices=[04f4dfdb:_-04f4dfdb:!]

WARN  [SharedPool-Worker-4] 2017-01-24 12:11:57,289 SliceQueryFilter.java:319 - Read 4 live and 1232 tombstone cells in google_plus.part_of_friend_permission_activities.allow_profile_ids_idx for key: 04e66181 (see tombstone_warn_threshold). 21 columns were requested, slices=[04f4dfdb:_-04f4dfdb:!]
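
Most of the WARN lines above are tombstone warnings for the same index, so a per-table summary is easier to read than the raw grep output. Here is a minimal Python sketch, assuming the "Read N live and M tombstone cells in TABLE for key" wording shown above. The function name `tombstone_warnings` and the regular expression are hypothetical and only illustrate the idea.

```python
import re
from collections import Counter

# Matches the tombstone warning emitted by SliceQueryFilter in the log above.
TOMBSTONE_RE = re.compile(
    r"Read (\d+) live and (\d+) tombstone cells in (\S+) for key"
)


def tombstone_warnings(lines):
    """Return (warning count per table, largest tombstone count seen per table)."""
    counts = Counter()
    worst = {}
    for line in lines:
        match = TOMBSTONE_RE.search(line)
        if match is None:
            continue  # not a tombstone warning, e.g. an IOException line
        tombstones = int(match.group(2))
        table = match.group(3)
        counts[table] += 1
        worst[table] = max(worst.get(table, 0), tombstones)
    return counts, worst
```

A table that shows up often with a high tombstone count is a candidate for reviewing its delete pattern or the `tombstone_warn_threshold` setting mentioned in the warning itself.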





Posted by '김용환'