eBay CI/CD architecture



eBay architecture

https://rnd.ebay.co.il/cwsd.php?Z3AuPTQ0Pg/NTQ/R0w-TEAocWx6bChfZmB-bmB2f3VmbXN2PGRhdA.pdf



The slides linked above give a good explanation of the eBay CI architecture.




# Local development environment

Requirements

• Developer should be able to run the services on his development machine

• There should be isolation between development environments – no shared resources

• Developers shall use the same dependencies (DB version, etc.)

• How

• Running dependencies locally using Docker (databases, services, etc.)

• Use docker-compose files with preconfigured settings (e.g. DB credentials, etc.); a usage sketch follows this list

• Custom CLI tool to ease developers’ daily work (automate common tasks)

• Development environment related files are managed in a dedicated git repository
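
As a rough usage sketch of the docker-compose approach above (the dev-env repository, service names, and credentials are hypothetical, not eBay's actual layout):

$ git clone git@github.com:example/dev-env.git          # dedicated repo holding the docker-compose files
$ docker-compose -f dev-env/docker-compose.yml up -d mysql elasticsearch
$ docker-compose -f dev-env/docker-compose.yml ps       # dependencies run isolated on each developer machine

A custom CLI tool, as mentioned above, would simply wrap commands like these.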


# CI pipeline

How

• Detailed notifications over email and Slack (where / who / what / …)

• Verbose output during the build

• Pull request validation + GitHub integration (trigger validation job, update the pull request, etc.)

• Use Jenkins Multibranch Pipeline jobs

- Pipeline as code (job configuration should be part of the codebase)

- Auto detects branches (new and deleted)

- Triggered on push (using GitHub webhook)

- Use Jenkins Shared Pipeline Libraries for code reuse (https://github.com/ebay/Jenkins-Pipeline-Utils)

- Spin-up dependencies using Docker (Jenkins is running on Kubernetes…)


# Key takeaways


• Automate as much as possible

• Keep isolation (developers, environments, build runs, etc.)

• Define test boundaries

• Local Development Environment

• Should be easy and quick to setup, maintain and run

• Should be aligned across all developers

• CI

• Should provide quick feedback on all branches

• Should help protect important branches

• CD

• Should give high confidence

• Should be auditable

Posted by '김용환'

Looking at Docker by itself, you do not need to configure memory at all; the default Docker memory limit is unlimited.


$ cat /sys/fs/cgroup/memory/system.slice/docker.service/memory.stat

hierarchical_memory_limit 9223372036854771712

hierarchical_memsw_limit 9223372036854771712
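
The same thing can be confirmed per container; a value of 0 from docker inspect means no memory limit (the container name here is just an example):

$ docker inspect -f '{{.HostConfig.Memory}}' my-container
0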


However, containers do sometimes get killed for memory-related reasons.


In those cases the issue is usually not Docker's memory limit itself but the application running inside the container.



For example:



Elasticsearch, however, uses mmapfs, so the VM settings must be configured. mmap maps a file into the process's virtual memory address space, consuming a region of that space as large as the mapped file (Lucene's MMapDirectory is used).


Reading Lucene index files through the mapping is fast because it goes straight through the OS page cache with no extra copy into the heap, and this is exactly what affects Docker: the vm.max_map_count kernel setting has to be raised.
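
Concretely, vm.max_map_count has to be raised on the Docker host itself (the setting is not namespaced, so it cannot be changed from inside the container); 262144 is the value the Elasticsearch documentation below recommends:

$ sysctl -w vm.max_map_count=262144                      # apply immediately on the Docker host
$ echo 'vm.max_map_count=262144' >> /etc/sysctl.conf     # keep it across reboots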



References


https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html


http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html


http://jprante.github.io/lessons/2012/07/26/Mmap-with-Lucene.html

Posted by '김용환'

Kubernetes has the concept of a sidecar proxy.

A pod normally runs a single container (image); a sidecar means running an additional container image alongside it in the same pod.



It is typically used when you want to put apache or nginx in front of the usual tomcat app, or when you want to put an app server handling LDAP authentication (written in Go or Python) in front of Kibana.
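
A minimal sketch of the idea (image names and ports are hypothetical): the nginx container below is the sidecar proxy running in the same pod as the tomcat app container.

$ cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  containers:
  - name: app              # main application container
    image: tomcat:9
    ports:
    - containerPort: 8080
  - name: proxy            # sidecar proxy in front of the app
    image: nginx:1.15
    ports:
    - containerPort: 80
EOF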


https://medium.com/@lukas.eichler/securing-pods-with-sidecar-proxies-d84f8d34be3e

Posted by '김용환'



While using Debezium, errors like the ones below occurred.





kafka-connect_1    | [2019-03-20 09:23:35,591] INFO Step 7: rolling back transaction after abort (io.debezium.connector.mysql.SnapshotReader)

kafka-connect_1    | [2019-03-20 09:23:35,603] ERROR Execption while rollback is executed (io.debezium.connector.mysql.SnapshotReader)

kafka-connect_1    | java.sql.SQLNonTransientConnectionException: Can''t call rollback when autocommit=true

kafka-connect_1    | at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:110)

kafka-connect_1    | at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)

kafka-connect_1    | at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:89)

kafka-connect_1    | at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:63)

kafka-connect_1    | at com.mysql.cj.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:1851)

kafka-connect_1    | at io.debezium.connector.mysql.SnapshotReader.execute(SnapshotReader.java:672)

kafka-connect_1    | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

kafka-connect_1    | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

kafka-connect_1    | at java.lang.Thread.run(Thread.java:748)

kafka-connect_1    | [2019-03-20 09:23:35,606] INFO Cluster ID: akSFNcLmRsK91EzRzUhD-A (org.apache.kafka.clients.Metadata)

kafka-connect_1    | [2019-03-20 09:23:35,606] ERROR Failed due to error: Aborting snapshot due to error when last running 'UNLOCK TABLES': Can''t call rollback when autocommit=true (io.debezium.connector.mysql.SnapshotReader)

kafka-connect_1    | org.apache.kafka.connect.errors.ConnectException: Can''t call rollback when autocommit=true Error code: 0; SQLSTATE: 08003.

kafka-connect_1    | at io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230)

kafka-connect_1    | at io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:208)

kafka-connect_1    | at io.debezium.connector.mysql.SnapshotReader.execute(SnapshotReader.java:678)

kafka-connect_1    | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

kafka-connect_1    | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

kafka-connect_1    | at java.lang.Thread.run(Thread.java:748)

kafka-connect_1    | Caused by: java.sql.SQLNonTransientConnectionException: Can''t call rollback when autocommit=true

kafka-connect_1    | at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:110)

kafka-connect_1    | at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)

kafka-connect_1    | at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:89)

kafka-connect_1    | at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:63)

kafka-connect_1    | at com.mysql.cj.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:1851)

kafka-connect_1    | at io.debezium.connector.mysql.SnapshotReader.execute(SnapshotReader.java:592)

kafka-connect_1    | ... 3 more






kafka-connect_1    | [2019-03-20 09:23:37,375] ERROR WorkerSourceTask{id=kc_debezium_connector_shopping_orders-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)

kafka-connect_1    | org.apache.kafka.connect.errors.ConnectException: A slave with the same server_uuid/server_id as this slave has connected to the master; the first event '' at 4, the last event read from './mysql-bin.000003' at 194, the last byte read from './mysql-bin.000003' at 194. Error code: 1236; SQLSTATE: HY000.

kafka-connect_1    | at io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230)

kafka-connect_1    | at io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:197)

kafka-connect_1    | at io.debezium.connector.mysql.BinlogReader$ReaderThreadLifecycleListener.onCommunicationFailure(BinlogReader.java:984)

kafka-connect_1    | at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:950)

kafka-connect_1    | at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:580)

kafka-connect_1    | at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:825)

kafka-connect_1    | at java.lang.Thread.run(Thread.java:748)

kafka-connect_1    | Caused by: com.github.shyiko.mysql.binlog.network.ServerException: A slave with the same server_uuid/server_id as this slave has connected to the master; the first event '' at 4, the last event read from './mysql-bin.000003' at 194, the last byte read from './mysql-bin.000003' at 194.

kafka-connect_1    | at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:914)

kafka-connect_1    | ... 3 more

kafka-connect_1    | [2019-03-20 09:23:37,377] ERROR WorkerSourceTask{id=kc_debezium_connector_shopping_orders-0} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask)





This happens when several io.debezium.connector.mysql.MySqlConnector instances are configured with the same database.server.id.
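
For example, when registering two connectors against the same MySQL server through the Kafka Connect REST API, each one needs its own database.server.id (the names, hosts, and ids below are made up, and other required settings are omitted):

$ curl -X POST -H "Content-Type: application/json" http://localhost:8083/connectors -d '{
  "name": "connector-a",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql", "database.port": "3306",
    "database.user": "debezium", "database.password": "dbz",
    "database.server.id": "184054",
    "database.server.name": "server-a"
  }
}'

The second connector would use the same request with a different "name", a different "database.server.id" (e.g. "184055"), and a different "database.server.name".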





Posted by '김용환'


To get a bash shell inside a Kubernetes pod, you first need to know the pod name.



$ kubectl get pod

NAME                                READY   STATUS    RESTARTS   AGE

jenkins-8498fcb9b5-8k8b8         1/1     Running   0          40m



Similar to docker, use the pod name to open a bash shell.


$ kubectl exec -it  jenkins-8498fcb9b5-8k8b8 -- /bin/bash




Posted by '김용환'

To restart a Kubernetes pod, the usual approach is to apply the deployment file again,


but it can also be done without any configuration file.



First, get the docker image names.


The command below lists each pod's metadata name and the docker image of each of its containers.


$ kubectl get pods --all-namespaces -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}' | sort


ingress-nginx-controller-szb9s: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.19.0,

ingress-nginx-controller-ttq2h: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.19.0,

jenkins-8498fcb9b5-6n2vm: jenkins/jenkins:lts,

kube-apiserver-dkosv3-jenkins-master-1: gcr.io/google-containers/hyperkube-amd64:v1.11.5,

kube-apiserver-dkosv3-jenkins-master-2: gcr.io/google-containers/hyperkube-amd64:v1.11.5,

kube-apiserver-dkosv3-jenkins-master-3: gcr.io/google-containers/hyperkube-amd64:v1.11.5,




What we actually want is the command below; it returns the metadata name and the container names.


$ kubectl get pods -o=custom-columns=NAME:.metadata.name,CONTAINERS:.spec.containers[*].name

NAME                                                           CONTAINERS

jenkins-job-8f24e681-5b83-4f87-b713-69c86deedb22-25gsh-vjh9r   jnlp

jenkins-job-914tx-fthwt                                        jnlp

jenkins-8498fcb9b5-6n2vm                                    jenkins

my-release-mysql-65d89bd9c4-txkvn                              my-release-mysql








Now restart the pod. If the reboot command exists inside the container image, run it like this:


$ kubectl exec jenkins-8498fcb9b5-6n2vm -c jenkins reboot



If you get the following error instead, use the kill command.


rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "exec: \"reboot\": executable file not found in $PATH"




kubectl exec jenkins-8498fcb9b5-6n2vm -c jenkins  -- /bin/sh -c "kill 1"
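
kill 1 terminates the container's PID 1, so the kubelet restarts the container according to the pod's restart policy. Since this Jenkins pod is managed by a Deployment, simply deleting the pod also works; its ReplicaSet recreates it right away:

$ kubectl delete pod jenkins-8498fcb9b5-6n2vm
pod "jenkins-8498fcb9b5-6n2vm" deleted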



Posted by '김용환'


When adopting CDC (change data capture) with Debezium, the following MySQL settings (a test-environment configuration) are helpful for GTID and binlog setup.



server-id         = 111

log_bin           = mysql-bin

expire_logs_days  = 1

gtid-mode         = ON

enforce-gtid-consistency = ON

binlog_format     = row

#binlog_cache_size

#max_binlog_size





The snapshot mode needs the SELECT, RELOAD, and SHOW DATABASES privileges,

and the connector needs at least the REPLICATION SLAVE and REPLICATION CLIENT privileges to connect.



GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'debezium' IDENTIFIED BY 'dbz';
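
After applying the configuration and the grant, the settings can be checked from the MySQL client with the debezium account created above:

$ mysql -u debezium -pdbz -e "SHOW VARIABLES LIKE 'gtid_mode'; SHOW VARIABLES LIKE 'log_bin'; SHOW VARIABLES LIKE 'binlog_format';"
$ mysql -u debezium -pdbz -e "SHOW MASTER STATUS;"      # needs the REPLICATION CLIENT privilege granted above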




References


https://github.com/debezium/docker-images/blob/master/examples/mysql/0.8/mysql.cnf


https://dev.mysql.com/doc/refman/5.5/en/replication-options-binary-log.html


https://debezium.io/docs/connectors/mysql/#topic-names

Posted by '김용환'


To find a pod's IP, use the following commands.


$ kubectl get pods

This gives you the pod name, which you then use to query its IP.


$ kubectl get pod <pod-name> --template={{.status.podIP}}

10.10.10.10
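
If you want the IPs of all pods at a glance, the wide output format also prints an IP column:

$ kubectl get pods -o wide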

Posted by '김용환'

kafka-connect_1    | [2019-03-18 08:23:37,859] ERROR WorkerSourceTask{id=inventory-connector-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)

kafka-connect_1    | org.apache.kafka.connect.errors.ConnectException: Creation of database history topic failed, please create the topic manually

kafka-connect_1    | at io.debezium.relational.history.KafkaDatabaseHistory.initializeStorage(KafkaDatabaseHistory.java:348)

kafka-connect_1    | at io.debezium.connector.mysql.MySqlSchema.intializeHistoryStorage(MySqlSchema.java:266)

kafka-connect_1    | at io.debezium.connector.mysql.MySqlTaskContext.initializeHistoryStorage(MySqlTaskContext.java:196)

kafka-connect_1    | at io.debezium.connector.mysql.MySqlConnectorTask.start(MySqlConnectorTask.java:137)

kafka-connect_1    | at io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:47)

kafka-connect_1    | at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:198)

kafka-connect_1    | at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)

kafka-connect_1    | at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)

kafka-connect_1    | at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

kafka-connect_1    | at java.util.concurrent.FutureTask.run(FutureTask.java:266)

kafka-connect_1    | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

kafka-connect_1    | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

kafka-connect_1    | at java.lang.Thread.run(Thread.java:748)

kafka-connect_1    | Caused by: java.util.concurrent.TimeoutException

kafka-connect_1    | at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:108)

kafka-connect_1    | at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:274)

kafka-connect_1    | at io.debezium.relational.history.KafkaDatabaseHistory.getKafkaBrokerConfig(KafkaDatabaseHistory.java:353)

kafka-connect_1    | at io.debezium.relational.history.KafkaDatabaseHistory.initializeStorage(KafkaDatabaseHistory.java:337)

kafka-connect_1    | ... 12 more



The cause is a port/listener issue between Kafka Connect and the broker.



When using docker compose,

advertising Kafka only on 9092 causes a Docker networking problem. So if this error appears, the listener on that port (9092) is the issue: following the link below, change the broker to advertise kafka:29092 inside the Docker network, and change the kafka-connect component to connect to kafka:29092 as well.




https://rmoff.net/2018/08/02/kafka-listeners-explained/
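
A rough sketch of what the fix looks like in docker-compose (the environment variable names follow the Confluent Kafka images; adjust them to the image you actually run). The broker advertises kafka:29092 to other containers and localhost:9092 to the host, and kafka-connect bootstraps against kafka:29092:

  kafka:
    environment:
      KAFKA_LISTENERS: INTERNAL://0.0.0.0:29092,EXTERNAL://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:29092,EXTERNAL://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL

  kafka-connect:
    environment:
      CONNECT_BOOTSTRAP_SERVERS: kafka:29092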



Posted by '김용환'



In Kubernetes,

if the StatefulSet that uses the volumes has not been terminated (a Deployment can actually have the same effect),

the volumes are not removed even when you run the delete pvc command.



$ kubectl get pod -w

NAME       READY   STATUS    RESTARTS   AGE

consul-0   1/1     Running   0          28d

consul-1   1/1     Running   0          28d

consul-2   1/1     Running   0          28d

consul-3   1/1     Running   0          28d

consul-4   1/1     Running   0          28d



Check the PVs.


$ kubectl get pv

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                      STORAGECLASS   REASON   AGE

pvc-46e015f0-3041-11e9-a0e7-fa163ecffc2b   1Gi        RWO            Delete           Bound    default/datadir-consul-3   standard                26d

pvc-535563b9-3041-11e9-a0e7-fa163ecffc2b   1Gi        RWO            Delete           Bound    default/datadir-consul-4   standard                26d

pvc-b07a2c23-303d-11e9-a0e7-fa163ecffc2b   1Gi        RWO            Delete           Bound    default/datadir-consul-0   standard                26d

pvc-c32c35d9-303d-11e9-a0e7-fa163ecffc2b   1Gi        RWO            Delete           Bound    default/datadir-consul-1   standard                26d

pvc-d420797e-303d-11e9-a0e7-fa163ecffc2b   1Gi        RWO            Delete           Bound    default/datadir-consul-2   standard                26




$ kubectl get pvc

NAME               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE

consul-0   Bound    pvc-b07a2c23-303d-11e9-a0e7-fa163ecffc2b   1Gi        RWO            standard       26d

consul-1   Bound    pvc-c32c35d9-303d-11e9-a0e7-fa163ecffc2b   1Gi        RWO            standard       26d

consul-2   Bound    pvc-d420797e-303d-11e9-a0e7-fa163ecffc2b   1Gi        RWO            standard       26d

consul-3   Bound    pvc-46e015f0-3041-11e9-a0e7-fa163ecffc2b   1Gi        RWO            standard       26d

consul-4   Bound    pvc-535563b9-3041-11e9-a0e7-fa163ecffc2b   1Gi        RWO            standard       26d




$ kubectl delete pvc consul-0 consul-1  consul-2  consul-3 consul-4

persistentvolumeclaim "consul-0" deleted

persistentvolumeclaim "consul-1" deleted

persistentvolumeclaim "consul-2" deleted

persistentvolumeclaim "consul-3" deleted

persistentvolumeclaim "consul-4" deleted




$ kubectl get pvc

NAME               STATUS        VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE

consul-0   Terminating   pvc-b07a2c23-303d-11e9-a0e7-fa163ecffc2b   1Gi        RWO            standard       28d

consul-1   Terminating   pvc-c32c35d9-303d-11e9-a0e7-fa163ecffc2b   1Gi        RWO            standard       28d

consul-2   Terminating   pvc-d420797e-303d-11e9-a0e7-fa163ecffc2b   1Gi        RWO            standard       28d

consul-3   Terminating   pvc-46e015f0-3041-11e9-a0e7-fa163ecffc2b   1Gi        RWO            standard       28d

consul-4   Terminating   pvc-535563b9-3041-11e9-a0e7-fa163ecffc2b   1Gi        RWO            standard       28d




The PVCs changed to the Terminating state,


but the volumes are not deleted; they stay in Terminating.
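
The claims stay in Terminating because the pvc-protection finalizer blocks deletion while pods are still mounting the volumes; it is visible on the claim itself:

$ kubectl get pvc consul-0 -o jsonpath='{.metadata.finalizers}'      # shows kubernetes.io/pvc-protection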






As soon as the StatefulSet is deleted, everything is removed.



$ kubectl delete statefulsets consul

statefulset.apps "consul" deleted




$ kubectl get pod

No resources found.


$ kubectl get pvc

No resources found.


$  kubectl get pv

No resources found.


Posted by '김용환'