Show Posts

This section lets you view all posts made by this member. Note that you can only see posts made in areas you currently have access to.


Messages - netman

Pages: [1] 2 3 ... 13
1
Running a descheduler on your own K8S cluster

v1.0
2019-08-01


1. Introduction

Once a K8S cluster with multiple worker nodes is up, it already gives us a fair amount of fault tolerance: when one of the worker nodes fails, or is taken out of service with the drain command, the pods running on it are automatically moved to other healthy workers. However, once the failed node is repaired and comes back online, or is re-added with the uncordon command, pods that are already running elsewhere do not migrate back to the recovered node; only newly created pods will be scheduled onto it.

Sometimes, because nodes differ in resources or in workload, their load also becomes uneven: some nodes are very busy while others sit almost idle.

To deal with this, besides manual rebalancing by the administrator, we can use a tool called DeScheduler to rebalance the workload automatically.


2. About DeScheduler

The DeScheduler GitHub repository is here:
   https://github.com/kubernetes-incubator/descheduler

If you are interested, have a look at its configuration details first, especially the Policy and Strategies:
  • RemoveDuplicates
  • LowNodeUtilization
  • RemovePodsViolatingInterPodAntiAffinity
  • RemovePodsViolatingNodeAffinity

We will not go into these four basic strategies here; all we need to do is define the strategies we want as a ConfigMap (a single ConfigMap can hold multiple strategies).


3. Deploying DeScheduler

3.1 Create an empty working directory:
Code:
mkdir descheduler-yaml
cd descheduler-yaml

3.2 Create the ClusterRole & ServiceAccount:
Code:
cat > cluster_role.yaml << END
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: descheduler
  namespace: kube-system
rules:
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "watch", "list"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list", "delete"]
- apiGroups: [""]
  resources: ["pods/eviction"]
  verbs: ["create"]
END
kubectl apply -f cluster_role.yaml
kubectl create sa descheduler -n kube-system
kubectl create clusterrolebinding descheduler \
    -n kube-system \
    --clusterrole=descheduler \
    --serviceaccount=kube-system:descheduler

3.3 Create the ConfigMap:
Code:
cat > config_map.yaml << END
apiVersion: v1
kind: ConfigMap
metadata:
  name: descheduler
  namespace: kube-system
data:
  policy.yaml: |- 
    apiVersion: descheduler/v1alpha1
    kind: DeschedulerPolicy
    strategies:
      RemoveDuplicates:
         enabled: true
      LowNodeUtilization:
         enabled: true
         params:
           nodeResourceUtilizationThresholds:
             thresholds:
               cpu: 20
               memory: 20
               pods: 20
             targetThresholds:
               cpu: 50
               memory: 50
               pods: 50
      RemovePodsViolatingInterPodAntiAffinity:
        enabled: true
      RemovePodsViolatingNodeAffinity:
        enabled: true
        params:
          nodeAffinityType:
          - requiredDuringSchedulingIgnoredDuringExecution
END
kubectl apply -f config_map.yaml

# Note: here the RemoveDuplicates strategy is enabled (enabled: true); if you do not want pods to be spread onto newly (re)joined nodes automatically, set it to enabled: false. If your cluster has plenty of spare resources and the strategy never triggers, lower the thresholds of the LowNodeUtilization strategy (cpu, memory and pod count must all be below the thresholds at the same time for the strategy to act).

3.4 Create the CronJob:
Code:
cat > cron_job.yaml << END
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: descheduler
  namespace: kube-system
spec:
  schedule: "*/30 * * * *"
  jobTemplate:
    metadata:
      name: descheduler
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: "true"
    spec:
      template:
        spec:
          serviceAccountName: descheduler
          containers:
          - name: descheduler
            image: komljen/descheduler:v0.6.0
            volumeMounts:
            - mountPath: /policy-dir
              name: policy-volume
            command:
            - /bin/descheduler
            - --v=4
            - --max-pods-to-evict-per-node=10
            - --policy-config-file=/policy-dir/policy.yaml
          restartPolicy: "OnFailure"
          volumes:
          - name: policy-volume
            configMap:
              name: descheduler
END
kubectl apply -f cron_job.yaml

# Note: the CronJob defined here runs every 30 minutes; if you want to see the effect sooner while testing, use a shorter interval.
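One way to do that without editing the yaml (just a sketch; any interval works, and you would patch it back to */30 afterwards) is to patch the schedule in place:
Code:
kubectl -n kube-system patch cronjob descheduler -p '{"spec":{"schedule":"*/5 * * * *"}}'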


4. Verification

4.1 Check the CronJob:
Code:
kubectl get cronjobs -n kube-system
You should see something like the following:
NAME          SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
descheduler   */30 * * * *   False     0         2m              32m


4.2 Check the completed job pods:
Code:
kubectl get pods -n kube-system | grep Completed
You should see something like the following:
descheduler-1564670400-67tqx                   0/1     Completed   0          1H
descheduler-1564670700-2vwhv                   0/1     Completed   0          32m
descheduler-1564671000-g69nc                   0/1     Completed   0          2m


4.3 Check the logs:
Code:
kubectl -n kube-system logs descheduler-1564671000-g69nc
If nothing was triggered, the last line will look something like this:
...
I0505 11:55:08.160964       1 node_affinity.go:72] Evicted 0 pods


4.4 Drain a test node:
Code:
kubectl drain worker03.localdomain --ignore-daemonsets --delete-local-data  --grace-period=0 --force
kubectl get nodes worker03.localdomain
Confirm that the node status looks like this:
NAME                   STATUS                     ROLES    AGE   VERSION
worker03.localdomain   Ready,SchedulingDisabled   <none>   71d   v1.15.0


Wait a moment, then confirm that all running pods are now on the other workers only:
web-6fc4fb46d-k5pzr              1/1     Running   0          88s     10.47.0.23   worker01.localdomain   <none>           <none>
web-6fc4fb46d-l9h4j              1/1     Running   0          52s     10.39.0.24   worker02.localdomain   <none>           <none>
web-6fc4fb46d-mqwqv              1/1     Running   0          85s     10.47.0.26   worker01.localdomain   <none>           <none>
web-6fc4fb46d-phr8r              1/1     Running   0          71s     10.47.0.8    worker01.localdomain   <none>           <none>
web-6fc4fb46d-t5ct7              1/1     Running   0          80s     10.47.0.27   worker01.localdomain   <none>           <none>
web-6fc4fb46d-vq4mk              1/1     Running   0          5m38s   10.39.0.8    worker02.localdomain   <none>           <none>
web-6fc4fb46d-ww2nq              1/1     Running   0          6m8s    10.47.0.10   worker01.localdomain   <none>           <none>
web-6fc4fb46d-wz8vl              1/1     Running   0          58s     10.39.0.22   worker02.localdomain   <none>           <none>
web-6fc4fb46d-xvk48              1/1     Running   0          5m25s   10.39.0.11   worker02.localdomain   <none>           <none>
web-6fc4fb46d-xxr5q              1/1     Running   0          5m56s   10.47.0.16   worker01.localdomain   <none>           <none>
web-6fc4fb46d-zcg6l              1/1     Running   0          2m29s   10.47.0.18   worker01.localdomain   <none>           <none>
web-6fc4fb46d-zh7zv              1/1     Running   0          11m     10.39.0.19   worker02.localdomain   <none>           <none>
web-6fc4fb46d-zldt7              1/1     Running   0          5m31s   10.39.0.4    worker02.localdomain   <none>           <none>
web-6fc4fb46d-zxrxw              1/1     Running   0          31m     10.39.0.7    worker02.localdomain   <none>           <none>


4.5 Bring the node back:
Code:
kubectl uncordon worker03.localdomain
kubectl get nodes
Confirm that all nodes are in Ready state:
NAME                   STATUS   ROLES    AGE   VERSION
master01.localdomain   Ready    master   71d   v1.15.0
master02.localdomain   Ready    master   71d   v1.15.0
master03.localdomain   Ready    master   71d   v1.15.0
worker01.localdomain   Ready    <none>   71d   v1.15.0
worker02.localdomain   Ready    <none>   71d   v1.15.0
worker03.localdomain   Ready    <none>   71d   v1.15.0


Check that the CronJob has run its most recent job:
Code:
kubectl get pods -n kube-system | grep Completed
You should see something like the following:
descheduler-1564671600-spl42                   0/1     Completed   0          1h
descheduler-1564671900-2sn9j                   0/1     Completed   0          30m
descheduler-1564672200-sq5zw                   0/1     Completed   0          77s


Check the logs again:
Code:
kubectl -n kube-system logs descheduler-1564672200-sq5zw
This time you will find that quite a number of pods were evicted:
...
I0801 15:11:00.012104       1 node_affinity.go:72] Evicted 20 pods


Confirm that pods have been re-distributed back onto the recovered node:
web-6fc4fb46d-n687t              1/1     Running   0          87s     10.42.0.17   worker03.localdomain   <none>           <none>
web-6fc4fb46d-nzdrs              1/1     Running   0          91s     10.42.0.16   worker03.localdomain   <none>           <none>
web-6fc4fb46d-qrn6n              1/1     Running   0          2m8s    10.47.0.14   worker01.localdomain   <none>           <none>
web-6fc4fb46d-qxd8v              1/1     Running   0          2m1s    10.39.0.15   worker02.localdomain   <none>           <none>
web-6fc4fb46d-rpw8b              1/1     Running   0          70s     10.42.0.11   worker03.localdomain   <none>           <none>
web-6fc4fb46d-rxxrn              1/1     Running   0          2m3s    10.47.0.19   worker01.localdomain   <none>           <none>
web-6fc4fb46d-svts8              1/1     Running   0          2m6s    10.47.0.15   worker01.localdomain   <none>           <none>
web-6fc4fb46d-v9q9c              1/1     Running   0          2m4s    10.47.0.17   worker01.localdomain   <none>           <none>
web-6fc4fb46d-x5vrs              1/1     Running   0          110s    10.39.0.21   worker02.localdomain   <none>           <none>
web-6fc4fb46d-xfrnh              1/1     Running   0          76s     10.42.0.8    worker03.localdomain   <none>           <none>
web-6fc4fb46d-xmz64              1/1     Running   0          7m11s   10.42.0.4    worker03.localdomain   <none>           <none>
web-6fc4fb46d-z2xhw              1/1     Running   0          7m9s    10.42.0.7    worker03.localdomain   <none>           <none>
web-6fc4fb46d-zkv95              1/1     Running   0          7m12s   10.42.0.2    worker03.localdomain   <none>           <none>
web-6fc4fb46d-zltxl              1/1     Running   0          105s    10.47.0.6    worker01.localdomain   <none>           <none>



5. Conclusion

With DeScheduler we get dynamic, automated placement of our services: the load is spread across all worker nodes, cluster resources are consumed more sensibly, and both fault tolerance and service stability improve. Well worth a try.

2
Building a K8S load balancer yourself

v1.0
2019-07-30


1. Introduction

In many K8S articles on the web, when it comes to exposing a Service, if you do not want to expose it manually or use a NodePort, one of the options is a LoadBalancer. It sounds very cool, but unfortunately it is mostly something only cloud providers can offer. For beginners, paying for cloud services is a luxury, especially if you only want to practice and do not want to spend that kind of money.

However, for those who have managed to build their own K8S cluster, MetalLB can be installed to provide a local load balancer, which is great news!


2. Installation

2.1 Deploy it directly with kubectl:
kubectl apply -f https://raw.githubusercontent.com/google/metallb/v0.7.3/manifests/metallb.yaml

If it succeeds, a new namespace named metallb-system will appear, running the following two kinds of pods:
Code:
controller
speaker
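
To confirm that the controller and speaker pods are actually up before going on (just a quick sanity check), you can list the pods in that namespace:
Code:
kubectl get pods -n metallb-system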

2.2 Create the ConfigMap
Use a text editor to create a yaml file (metallb_configmap.yml) with the following content:
Code:
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: my-ip-space
      protocol: layer2
      addresses:
      - 192.168.100.240-192.168.100.249

Note: change the IP range to a suitable one for your network: the addresses must be reachable and must not conflict with any IPs already in use.

Then apply the ConfigMap:
kubectl apply -f metallb_configmap.yml

That is it! MetalLB is now fully configured.


3. Testing

Below we use a simple nginx service to test whether the LoadBalancer actually works.

3.1 Create the nginx deployment manifest (nginx.yaml) with the following content:
Code:
apiVersion: v1
kind: Namespace
metadata:
  name: metallb-test
  labels:
    app: metallb
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: metallb-test
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-deployment
  namespace: metallb-test
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: nginx
  sessionAffinity: None
  type: LoadBalancer

Here we create a separate namespace, metallb-test, just for this test.

3.2 Deploy it:
kubectl apply -f nginx.yaml

3.3 If the deployment went through without problems, check the result with:
kubectl -n metallb-test get svc -o wide

You should see an EXTERNAL-IP similar to this:
Code:
NAME               TYPE           CLUSTER-IP       EXTERNAL-IP       PORT(S)        AGE     SELECTOR
nginx-deployment   LoadBalancer   10.103.250.239   192.168.100.240   80:16656/TCP   2m51s   app=nginx

This means the MetalLB LoadBalancer works!
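
As a final sanity check (assuming your pool handed out 192.168.100.240 as in the output above; substitute your own EXTERNAL-IP), hitting the service from any machine on that network should return the nginx welcome page:
Code:
curl -I http://192.168.100.240/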

3.4 Clean up:
kubectl delete -f nginx.yaml


4. Conclusion
MetalLB is the poor man's K8S LoadBalancer: configuration is very simple and deployment is not difficult at all. It is a great fit for self-hosted K8S practice.

3
Using Ceph RBD Storage Classes in K8S

Author: netman<netman@study-area.org>
Date: 2019-07-25


1. Introduction

K8S supports a great many kinds of volumes, and it also provides Storage Classes so that administrators can consume predefined storage types more conveniently. See the official documentation for the details:
   https://kubernetes.io/docs/concepts/storage/volumes/
   https://kubernetes.io/docs/concepts/storage/persistent-volumes
   https://kubernetes.io/docs/concepts/storage/storage-classes/

This article is a simple hands-on walkthrough of how to use Ceph RBD as a storage resource in K8S.


2. Prerequisites

Before we start, the following environment is assumed to be in place and working:
Ceph Storage (monitors: 192.168.100.21-23)
K8S Cluster (deployed with kubeadm)
Building these systems is not covered here; please refer to other articles.


3. Steps

3.1 On Ceph, create a dedicated pool and set up the account and its permissions:
Code:
ceph osd pool create kube 32 32    # adjust the PG number to your actual situation
ceph osd pool application enable 'kube' 'rbd'
ceph auth get-or-create client.kube mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=kube'

3.2 Record the following base64-encoded values (the secrets configured in K8S must match them):
Code:
ceph auth get-key client.admin | base64    # this value goes into the k8s ceph-secret-admin secret
ceph auth get-key client.kube | base64     # this value goes into the k8s ceph-secret-kube secret

3.3 On every worker node of the k8s cluster, install ceph-common and copy over the admin keyring from a Ceph monitor:
Code:
yum install -y ceph-common
scp 192.168.100.21:/etc/ceph/ceph.client.admin.keyring /etc/ceph/

3.4 On the k8s admin host (kubectl context already configured and working, git available), create a dedicated working directory:
Code:
mkdir ~/kube-ceph
cd ~/kube-ceph

3.5 Create the secrets yaml:
Code:
cat > kube-ceph-secret.yaml << END
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret-admin
type: "kubernetes.io/rbd"
data:
  key: QVFEYzF0SmNMaVpkRmhBQWlKbUhNbndaR2tCdldFcThXWDhaaXc9PQ==
---
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret-kube
type: "kubernetes.io/rbd"
data:
  key: QVFDSFdUaGROcC9LT2hBQUpkVG5XVUpQUOYrZGtvZ2k3S0Zwc0E9PQ==
END

3.6 Create the Storage Class yaml:
Code:
cat > kube-ceph-sc.yaml << END
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ceph-rbd
#provisioner: kubernetes.io/rbd
provisioner: ceph.com/rbd
parameters:
  monitors: 192.168.100.21:6789,192.168.100.22:6789,192.168.100.23:6789
  adminId: admin
  adminSecretName: ceph-secret-admin
  adminSecretNamespace: default
  pool: kube
  userId: kube
  userSecretName: ceph-secret-kube
  userSecretNamespace: default
  imageFormat: "2"
  imageFeatures: layering
END

3.7 Create the Persistent Volume Claim (PVC) yaml:
Code:
cat > kube-ceph-pvc.yaml << END
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-k8s-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ceph-rbd
  resources:
    requests:
      storage: 1Gi
END

3.8 Because this k8s cluster was deployed with kubeadm, the built-in provisioner does not ship rbd support (provisioner: kubernetes.io/rbd), and you would hit the following error:
persistentvolume-controller     Warning   ProvisioningFailed  Failed to provision volume with StorageClass "ceph-rbd": failed to create rbd image: executable file not found in $PATH, command output:

We therefore need to download and deploy the external provisioner instead (provisioner: ceph.com/rbd; the Storage Class yaml above already uses it):
Code:
git clone  https://github.com/kubernetes-incubator/external-storage
cd external-storage/ceph/rbd/deploy/rbac/
kubectl apply -f ./

3.9 Make sure the rbd-provisioner pod is running:
Code:
kubectl get pods    # make sure a pod like the following is in Running state:
rbd-provisioner-9b8ffbcc-bwzvr   1/1     Running   0          58s

3.10 Apply the Ceph RBD manifests and set the ceph-rbd Storage Class as the default:
Code:
cd ~/kube-ceph
kubectl apply -f ./
kubectl patch storageclass ceph-rbd -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

3.11 Check the results
Code:
kubectl get StorageClass    # confirm that ceph-rbd is the only default
NAME                 PROVISIONER    AGE
ceph-rbd (default)   ceph.com/rbd   31s


Code:
kubectl get pvc    # confirm that the pvc is correctly bound
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ceph-k8s-claim   Bound    pvc-00840751-aedf-11e9-8a7d-525400b83b04   1Gi        RWO            ceph-rbd       75s


Meanwhile, on the Ceph side a new rbd image can be seen, created automatically:
Code:
rbd list -p kube    # you should see something like this:
kubernetes-dynamic-pvc-755378b8-aedf-11e9-984f-16b92f357547

3.12 Use the PVC inside a pod:
Code:
cat > kube-ceph-pod.yaml << END
apiVersion: v1
kind: Pod
metadata:
  name: kube-ceph-pod       
spec:
  containers:
  - name: ceph-busybox
    image: busybox         
    command: ["sleep", "60000"]
    volumeMounts:
    - name: ceph-volume       
      mountPath: /usr/share/ceph-rbd
      readOnly: false
  volumes:
  - name: ceph-volume 
    persistentVolumeClaim:
      claimName: ceph-k8s-claim
END
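
The yaml above only defines the pod; to verify the claim end-to-end you would apply it and check that the RBD volume is mounted inside the container (a sketch, using the file name from above):
Code:
kubectl apply -f kube-ceph-pod.yaml
kubectl get pod kube-ceph-pod                              # wait until it is Running
kubectl exec kube-ceph-pod -- df -h /usr/share/ceph-rbd    # the 1Gi rbd volume should show up here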

4

Building a CEPH cluster with SSD cache

Author: netman<netman@study-area.org>
Date: 2019-05-09


1. Environment
  Network: 192.168.100.0/24
  OS: CentOS 7.6 (1810), installed package group: Server with GUI
  Hosts:
   node1: 192.168.100.21
   node2: 192.168.100.22
   node3: 192.168.100.23
   srv1:  192.168.100.1 (not part of the ceph cluster; only used to drive the deployment)
  Disks:
   node1:
      sda (SSD, used as cache)
      sdb (HDD, used as ceph storage)
   node2:
      sda (SSD, used as cache)
      sdb (HDD, used as ceph storage)
   node3:
      sdd (SSD, used as cache)
      sdc (HDD, used as ceph storage)



2. Prerequisites
2.1 srv1 can ssh to all three nodes and run commands as root without a password
2.2 firewalld and selinux are disabled on every host
2.3 every host can resolve the others' names via /etc/hosts (or DNS)
2.4 every host has been brought up to date with yum update -y
2.5 install the epel repo on every host:
   yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm



3. Installation and build
(all of the following is done on srv1)

# create the ceph-deploy repo
cat << EOM > /etc/yum.repos.d/ceph-deploy.repo
[ceph-noarch]
name=Ceph noarch packages
baseurl=https://download.ceph.com/rpm-mimic/el7/noarch/
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
EOM

# install ceph-deploy
yum install -y ceph-deploy

# create the working directory
mkdir my-cluster
cd my-cluster

# create the initial ceph.conf
cat > ceph.conf <<END
[global]
fsid = c8678875-c948-4d5c-b349-77868857150e
mon initial members = node1,node2,node3
mon host = 192.168.100.21,192.168.100.22,192.168.100.23
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
public network = 192.168.100.0/24
END

# start the installation
ceph-deploy install node1 node2 node3
ceph-deploy new node1 node2 node3
ceph-deploy mon create-initial
ceph-deploy admin node1 node2 node3
ceph-deploy mgr create node1 node2 node3
ceph-deploy mds create node1 node2 node3

# make sure every node can run ceph -s and reports a healthy status
ssh node1 ceph -s
ssh node2 ceph -s
ssh node3 ceph -s

# check node1's disks and confirm the device names
ssh node1 lsblk

# initialize node1's disks
ceph-deploy disk zap node1 /dev/sda
ceph-deploy disk zap node1 /dev/sdb
ceph-deploy osd create --data /dev/sda node1
ceph-deploy osd create --data /dev/sdb node1

# confirm the result on node1
ssh node1 lsblk

# initialize node2's disks:
ssh node2 lsblk
ceph-deploy disk zap node2 /dev/sda
ceph-deploy disk zap node2 /dev/sdb
ceph-deploy osd create --data /dev/sda node2
ceph-deploy osd create --data /dev/sdb node2
ssh node2 lsblk

# initialize node3's disks:
ssh node3 lsblk
ceph-deploy disk zap node3 /dev/sdd
ceph-deploy disk zap node3 /dev/sdc
ceph-deploy osd create --data /dev/sdd node3
ceph-deploy osd create --data /dev/sdc node3
ssh node3 lsblk


4. Configure the ceph cluster
# the following is done on node1 (in fact it can be run on any node in the cluster)
ssh node1

# check the osd status
ceph -s

# calculate the pg num
echo '(3*100)/3/6' | bc
## formula: Total PGs = ((Total_number_of_OSD * 100) / max_replication_count) / pool_count
## notes: there are 6 disks in total (3 x SSD + 3 x HDD), but because the SSDs will be used as cache,
##   I am not sure whether to count 3 OSDs or 6 here; I assume 3 for now.
##   I also assume 6 pools.
##   The result is 16, but for the HDD storage pools I will raise it to 32.
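## (illustration) the same formula as plain shell arithmetic, under my assumptions above (3 OSDs, replica count 3, 6 pools):
osd=3; repl=3; pools=6
echo $(( osd * 100 / repl / pools ))    # prints 16 (raised to 32 for the HDD data pools)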

# check the osd classes (ssd and hdd should be assigned automatically)
ceph osd crush class ls
ceph osd tree

# create the crush rules per class
ceph osd crush rule ls
ceph osd crush rule create-replicated rule-ssd default host ssd
ceph osd crush rule create-replicated rule-hdd default host hdd
ceph osd crush rule ls

# create the ceph pools
## notes: I am not going to create rbd images directly for now; instead I will create cephfs filesystems and share them to all hosts via mount.
##   A cephfs needs two pools, a data pool and a metadata pool; the names can be anything.
##   I will put the metadata pools directly on the SSDs, and give the data pools an SSD cache.
##   The SSD cache uses the ceph tier feature, which has many tunables; I skip all of them here...
##   Also, because I plan to create two ceph fs, the fs flag enable_multiple must be turned on.
##   So the HDD storage ends up with two pools, while the SSDs get four.

# enable the flag
ceph fs flag set enable_multiple true --yes-i-really-mean-it

# create the first cephfs
# create the cache pool on the SSDs
ceph osd pool create fs1cache 16 16 rule-ssd
# create the data pool on the HDDs
ceph osd pool create fs1data 32 32 rule-hdd
# create the metadata pool on the SSDs
ceph osd pool create fs1metadata 16 16 rule-ssd
# attach the SSD cache tier
ceph osd tier add fs1data fs1cache
# set the cache mode
ceph osd tier cache-mode fs1cache writeback
# set the cache hit_set_type
ceph osd pool set fs1cache hit_set_type bloom
# create the cephfs
ceph fs new fs1 fs1metadata fs1data

# create the second cephfs
ceph osd pool create fs2cache 16 16 rule-ssd
ceph osd pool create fs2data 32 32 rule-hdd
ceph osd pool create fs2metadata 16 16 rule-ssd
ceph osd tier add fs2data fs2cache
ceph osd tier cache-mode fs2cache writeback
ceph fs new fs2 fs2metadata fs2data

# confirm the number of pools and their names
ceph -s
ceph osd lspools


5. Mount the CEPH FileSystem

# the following is done on node 1
ssh node1

# create the mount points
mkdir /cephfs1 /cephfs2

# extract the admin secret
awk '/key = /{print $NF}' /etc/ceph/ceph.client.admin.keyring > /etc/ceph/admin.secret

# edit fstab
# note: the design here is that each host does not connect to its own service but spreads its connections across the other hosts.
#   Since two different fs live on the same ceph cluster, the fs name must be given with the mds_namespace mount option.
cat >>/etc/fstab << END
192.168.100.22:6789,192.168.100.23:6789:/ /cephfs1 ceph name=admin,secretfile=/etc/ceph/admin.secret,noatime,_netdev,mds_namespace=fs1 0 0
192.168.100.22:6789,192.168.100.23:6789:/ /cephfs2 ceph name=admin,secretfile=/etc/ceph/admin.secret,noatime,_netdev,mds_namespace=fs2 0 0
END

# mount them (make sure there is no error)
mount -a

# confirm the mounts
# note: with the df command only one of the two entries shows up
mount

# leave node1
exit


# the following is done on node 2
ssh node2
mkdir /cephfs1 /cephfs2
awk '/key = /{print $NF}' /etc/ceph/ceph.client.admin.keyring > /etc/ceph/admin.secret
cat >>/etc/fstab << END
192.168.100.21:6789,192.168.100.23:6789:/ /cephfs1 ceph name=admin,secretfile=/etc/ceph/admin.secret,noatime,_netdev,mds_namespace=fs1 0 0
192.168.100.21:6789,192.168.100.23:6789:/ /cephfs2 ceph name=admin,secretfile=/etc/ceph/admin.secret,noatime,_netdev,mds_namespace=fs2 0 0
END
mount -a
mount
exit


# the following is done on node 3
ssh node3
mkdir /cephfs1 /cephfs2
awk '/key = /{print $NF}' /etc/ceph/ceph.client.admin.keyring > /etc/ceph/admin.secret
cat >>/etc/fstab << END
192.168.100.21:6789,192.168.100.22:6789:/ /cephfs1 ceph name=admin,secretfile=/etc/ceph/admin.secret,noatime,_netdev,mds_namespace=fs1 0 0
192.168.100.21:6789,192.168.100.22:6789:/ /cephfs2 ceph name=admin,secretfile=/etc/ceph/admin.secret,noatime,_netdev,mds_namespace=fs2 0 0
END
mount -a
mount
exit

#---- END ----#


5
Building Kubernetes Cluster on CentOS7

* OS:
Centos 7.6 (1810)

* Nodes:
  master:
    node1: 192.168.1.1
  workers:
    node2: 192.168.1.2
node3: 192.168.1.3

* Pre-configuration:
  - firewalld: disabled
  - selinux: disabled
  - dns or /etc/hosts: configured

* Steps
### Ref: https://www.howtoforge.com/tutorial/centos-kubernetes-docker-cluster/

1. update packages:
yum update -y
reboot

2. enable br_netfilter:
echo br_netfilter > /etc/modules-load.d/br_netfilter.conf
echo "net.bridge.bridge-nf-call-iptables=1" >> /etc/sysctl.conf

3. turn off swap:
swapoff -a
sed -i '/^[^ ]\+ \+swap \+/s/^/#/' /etc/fstab

4. install docker-ce:
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install -y docker-ce

5. install kubernetes:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
### Note: gpgcheck=1 may fail while installing kubeadm; if it works in your environment, keep it set to 1.
yum install -y kubelet kubeadm kubectl

6. reboot OS:
reboot

7. start and enable services:
systemctl start docker && systemctl enable docker
systemctl start kubelet && systemctl enable kubelet

8. fix cgroupfs issue:
kadm_conf=/etc/systemd/system/kubelet.service.d/10-kubeadm.conf
grep -q 'KUBELET_KUBECONFIG_ARGS=.* --cgroup-driver=cgroupfs"' $kadm_conf || sed -i '/KUBELET_KUBECONFIG_ARGS=/s/"$/ --cgroup-driver=cgroupfs"/' $kadm_conf
systemctl daemon-reload
systemctl restart kubelet

### Note: run above steps on all nodes (both master and workers)

9. enable master (Run on node1 only):
kubeadm init --apiserver-advertise-address=192.168.1.1 --pod-network-cidr=10.244.0.0/16
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
### Note: copy the 'kubeadm join 192.168.1.1:6443 --token XXXXXXXXXXX' line from the output and save it to a text file
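### Note: if that join line gets lost, a fresh one can usually be printed again on the master (a convenience command, not part of the original steps):
kubeadm token create --print-join-command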

10. verify the master nodes:
kubectl get nodes
kubectl get pods --all-namespaces

11. enable workers (Run on node2 & node3 only):
### paste the command line which was copied from master:
kubeadm join 192.168.1.1:6443 --token XXXXXXXXXXX...

12. verify nodes (Run on node1):
### you should see something like below:
[root@node1 ~]# kubectl get nodes
NAME                STATUS   ROLES    AGE    VERSION
node1.example.com   Ready    master   12m    v1.13.1
node2.example.com   Ready    <none>   4m9s   v1.13.1
node3.example.com   Ready    <none>   35m    v1.13.1


* Dashboard:
### Ref:
###     https://github.com/kubernetes/dashboard
###     https://github.com/kubernetes/dashboard/wiki/Creating-sample-user

1. create dashboard:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml
kubectl proxy

2. create admin user:
cat > dashboard-adminuser.yaml << EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kube-system
EOF
cat > rolebinding.yaml << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kube-system
EOF
kubectl apply -f dashboard-adminuser.yaml
kubectl apply -f rolebinding.yaml

3. view and copy login token:
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}') | awk '/^token:/{print $2}'

4. access dashboard:
http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
### select 'Token' and paste the login token

--- end ---

6
https://studyarea.kktix.cc/events/2f263587-77f30d-4e6ea6-61e384-0eb2e5-3badd7-aba8c0-03756d-188891-copy-2
2019 January SA@Tainan, 1/20 (Sun): Introduction to GitLab and Git LFS

Topic: Introduction to GitLab and Git LFS

Speaker: HaWay

Abstract:
One of Git's weaknesses is that large files cannot reasonably be added to a repository: once a large file goes in, everyone ends up cloning a copy of it, and the repository grows fatter and fatter over time. GitHub later introduced Git LFS, which lets large files be pushed along with the repository while avoiding pointless copies. This month we introduce the Git LFS tool, look at how it works, and also share some GitLab usage.
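
(For readers who have never touched it, a minimal sketch of typical Git LFS usage; this assumes the git-lfs client is already installed and the file pattern is only an example:)
Code:
git lfs install              # one-time setup of the LFS hooks
git lfs track "*.iso"        # store files matching this pattern as LFS pointers
git add .gitattributes big.iso
git commit -m "add ISO through Git LFS"
git push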

Agenda:
14:00 ~ 14:15 Doors open / registration
14:15 ~ 15:15 [Sponsor session]
15:15 ~ 15:30 Break
15:30 ~ 17:00 GitLab & Git LFS

[Sponsor session]
The speaker's travel expenses for this event are sponsored by Gandi.net, who will also give a talk on domain-name related technology; the exact topic is TBD and will be announced later.

Time: 2019/01/20 (Sun) 14:00~17:00
Venue: Room 65203 (2F computer lab), new CSIE building, Department of Computer Science and Information Engineering, National Cheng Kung University (Chengkung campus) / No. 1 University Road, East District, Tainan

Location / directions:
     From the rear exit of Tainan railway station, walk straight along University Road, turn left onto Changrong Road, and after a short walk you will see our flag!
     http://www.csie.ncku.edu.tw/ncku_csie/intro/traffic
     http://www.csie.ncku.edu.tw/ncku_csie/images/ncku/map.png

Fee: free

Organizer:
Study-Area

Co-organizer:
Department of Computer Science and Information Engineering, National Cheng Kung University

7
Events & Meetups / Sign-up for the 1/20 dinner
« on: 2019-01-05 13:12 »
If you want to join the dinner on 1/20, sign up here...
We plan to eat at the Dou Niu Shi hot-pot place (鬥牛士二鍋) in Focus, Tainan; I am counting heads first so we can book a table.

----
netman +1
鳥哥 +1
haway +3
翔哥 +1
dean +1
清輝 +3

8
Miscellaneous / test
« on: 2017-06-28 10:52 »
please ignore...

9
DevOps board / docker trouble-shooting tips
« on: 2017-05-12 17:16 »
Note down before forgetting:

* docker: behind proxy
create /etc/systemd/system/docker.service.d/http-proxy.conf with following contents:
Code:
[Service]

Environment="ALL_PROXY=socks://127.0.0.1:8080/" "FTP_PROXY=ftp://127.0.0.1:8080/" "HTTPS_PROXY=http://127.0.0.1:8080/" "HTTP_PROXY=http://127.0.0.1:8080/" "NO_PROXY=localhost,127.0.0.0/8,172.16.0.0/16,192.168.0.0/16" "all_proxy=socks://127.0.0.1:8080/" "ftp_proxy=ftp://127.0.0.1:8080/" "http_proxy=http://127.0.0.1:8080/" "https_proxy=http://127.0.0.1:8080/" "no_proxy=localhost,127.0.0.0/8,172.16.0.0/16,192.168.0.0/16"


* Dockerfile: run pip with specified proxy:
Code:
RUN https_proxy=http://127.0.0.1:8080/ pip install -r requirements.txt

* Dockerfile: encounter SSL certificate failed while run pip:
Could not fetch URL https://pypi.python.org/simple/flask/: There was a problem confirming the ssl certificate: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed
- Solution:
add --trusted-host pypi.python.org in the pip command line:
Code:
RUN pip install --trusted-host pypi.python.org -r requirements.txt

* docker swarm: got the following error while re-joining a re-initiated swarm:
Error response from daemon: rpc error: code = 13 desc = connection error: desc = "transport: x509: certificate signed by unknown authority (possibly because of \"x509: ECDSA verification failure\" while trying to verify candidate authority certificate \"swarm-ca\")
- Solution:
    rm or mv the file swarm-root-ca.crt  in /var/lib/docker/swarm/certificates/

10
Ref:
https://www.howtoforge.com/tutorial/sync-documents-with-google-drive-on-ubuntu-linux/
http://askubuntu.com/questions/611801/grive-sync-error-possibly-google-api-shift

Env:
# cat /etc/debian_version
8.5

Problems:
The official 'drive' requires go with version 1.5 or above, while the system provides 1.3.3 only.
An alternative 'grive' provided by default has an API bug and gets 400 error.

Solution:
Compile the grive2

Instruction:
Code:
apt-get install git cmake build-essential libgcrypt11-dev libyajl-dev libboost-all-dev libcurl4-openssl-dev libexpat1-dev libcppunit-dev binutils-dev pkg-config
mkdir ~/grive
cd ~/grive
git clone https://github.com/vitalif/grive2.git
mkdir grive2/build
cd grive2/build
cmake ..
make -j4
sudo make install

Then prepare your directory for sync:
Code:
mkdir ~/mydir
cd ~/mydir
/usr/local/bin/grive -a
Copy & Paste the URL in your browser and get the auth code (40 chars), copy the code and paste back to the console...
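
Once the auth code has been accepted, running grive again in that directory performs the actual sync (a sketch; scheduling it via cron is optional):
Code:
cd ~/mydir
/usr/local/bin/grive        # two-way sync of ~/mydir with Google Drive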



11
Study-Area 2016 Annual Gathering (群英會)

Introduction

Fellow IT enthusiasts, the annual Study-Area gathering is here again! This year's program brings together the hottest topics of the past year, such as DevOps and SDN. Each speaker has condensed real-world industry experience into a fifty-minute talk, and we hope that after the talks the attendees can stand on the shoulders of giants and push on towards their next technical peak.

Schedule
09:00 - 09:05 Opening
09:05 - 09:50 ONOS and hands-on SDN switch integration experience -- 小飛機
10:10 - 11:00 A brief look at DC/OS -- Danial
11:00 - 12:00 Build an epub e-book on the spot -- 雨蒼
12:00 - 13:10 Break (lunch not provided)
13:20 - 14:10 Introducing Git into an SME: lessons learned -- Haway
14:20 - 15:10 A brief look at the Ansible configuration management tool -- Sakana
15:30 - 16:20 Ansible (Roles, Windows support) -- 凍仁翔
16:20 - 16:30 Closing

Date
2016-07-16, Saturday

Venue
Room EC122, Engineering Building 3, National Chiao Tung University, Hsinchu

Fee
Ticket: Free
Parking: NT$30 per entry

Organizers
Study-Area
Department of Computer Science, National Chiao Tung University

Registration: http://studyarea.kktix.cc/events/c6457aff

12
Linux board / A question about the Big5 許功蓋 problem
« on: 2016-06-01 14:19 »
Could someone please review the script below:
Code:
#!/bin/bash
export LANG=zh_TW.Big5

in_file=1.txt

# case 1
lines=$(cat $in_file | awk -F, '{print$2,$3}')
echo "$lines"

# case 2
awk -F, '{print $2,$3}' $in_file | while read line
do
        echo $line
done
I expected the two cases to produce the same output...
but in practice case 2 runs into the classic Big5 許功蓋 problem:
Code:
[kenny@vmtest-linux tmp]$ locale
LANG=zh_TW.Big5
LC_CTYPE="zh_TW.Big5"
LC_NUMERIC="zh_TW.Big5"
LC_TIME="zh_TW.Big5"
LC_COLLATE="zh_TW.Big5"
LC_MONETARY="zh_TW.Big5"
LC_MESSAGES="zh_TW.Big5"
LC_PAPER="zh_TW.Big5"
LC_NAME="zh_TW.Big5"
LC_ADDRESS="zh_TW.Big5"
LC_TELEPHONE="zh_TW.Big5"
LC_MEASUREMENT="zh_TW.Big5"
LC_IDENTIFICATION="zh_TW.Big5"
LC_ALL=
[kenny@vmtest-linux tmp]$ file 1.txt
1.txt: ISO-8859 text
[kenny@vmtest-linux tmp]$ cat 1.txt
x1230,葉小姐,usa@xxx.com.tw,89,0,16/06/01,
x1978,許小姐,ally@xxx.com.tw,90,0,16/06/01,
x8657,陳先生,cbk@xxx.com.tw,3,0,16/06/01,
x1467,鄭成功,cck@xxx.com.tw,3,0,16/06/01,

[kenny@vmtest-linux tmp]$ ./1.sh
葉小姐 usa@xxx.com.tw
許小姐 ally@xxx.com.tw
陳先生 cbk@xxx.com.tw
鄭成功 cck@xxx.com.tw

葉小姐 usa@xxx.com.tw
酗p姐 ally@xxx.com.tw
陳先生 cbk@xxx.com.tw
鄭成?cck@xxx.com.tw

13
Miscellaneous / Happy New Year, everyone!
« on: 2016-02-08 01:22 »
Wishing everyone a prosperous, safe and happy Year of the Monkey!

^_^

14
Ref:
https://www.digitalocean.com/community/tutorials/how-to-secure-nginx-with-let-s-encrypt-on-ubuntu-14-04

Purpose: to add Let's Encrypt SSL Cert to gitlab, with auto-renew.

Steps:

sudo su -
gitlab-ctl stop
git clone https://github.com/letsencrypt/letsencrypt /opt/letsencrypt
cd /opt/letsencrypt/
./letsencrypt-auto certonly --standalone
cp /etc/letsencrypt/archive/gitlab.example.com/fullchain1.pem /etc/pki/ca-trust/source/anchors/
cp /etc/letsencrypt/archive/gitlab.example.com/chain1.pem /etc/pki/ca-trust/source/anchors/
update-ca-trust
mkdir -p /etc/gitlab/ssl
cp /etc/letsencrypt/archive/gitlab.example.com/chain1.pem /etc/gitlab/ssl/ca.crt
cp /etc/letsencrypt/archive/gitlab.example.com/fullchain1.pem /etc/gitlab/ssl/gitlab.example.com.crt
cp /etc/letsencrypt/archive/gitlab.example.com/privkey1.pem /etc/gitlab/ssl/gitlab.example.com.key
chmod 600 /etc/gitlab/ssl/gitlab.example.com.key
vim /etc/gitlab/gitlab.rb
Code:
external_url 'https://gitlab.example.com'
...
nginx['redirect_http_to_https'] = true
nginx['ssl_client_certificate'] = "/etc/gitlab/ssl/ca.crt" # Most root CA's are included by default
nginx['ssl_certificate'] = "/etc/gitlab/ssl/gitlab.example.com.crt"
nginx['ssl_certificate_key'] = "/etc/gitlab/ssl/gitlab.example.com.key"
...
nginx['custom_gitlab_server_config'] = "location ^~ /.well-known {\n allow all;\n}\n"
...
gitlab-ctl start
gitlab-ctl reconfigure # to make sure everything is OK
gitlab-ctl restart
cp /opt/letsencrypt/examples/cli.ini /usr/local/etc/le-renew-webroot.ini
vim /usr/local/etc/le-renew-webroot.ini
Code:
rsa-key-size = 4096
email = root@example.com
domains = gitlab.example.com
webroot-path = /opt/gitlab/embedded/service/gitlab-rails/public
cd /opt/letsencrypt/
./letsencrypt-auto certonly -a webroot --renew-by-default --config /usr/local/etc/le-renew-webroot.ini # to make sure it works fine!
curl -L -o /usr/local/sbin/le-renew-webroot https://gist.githubusercontent.com/thisismitch/e1b603165523df66d5cc/raw/fbffbf358e96110d5566f13677d9bd5f4f65794c/le-renew-webroot
vim /usr/local/sbin/le-renew-webroot
Code:
#!/bin/bash

date

web_service='nginx'
config_file="/usr/local/etc/le-renew-webroot.ini"
...
chmod +x /usr/local/sbin/le-renew-webroot
le-renew-webroot # to make sure the result is as expected
vim /etc/cron.d/le-renew-webroot
Code:
30 2 * * 1 root /usr/local/sbin/le-renew-webroot >> /var/log/le-renewal.log

15
Ref: https://docs.docker.com/registry/insecure/

Prerequisite:
* Docker service installed and running
* Private CA and server key/certs are already on CA server

Steps:

#-- Registry Host --#
mkdir -p /etc/docker/certs
cp /etc/pki/tls/private/dokcerhub.example.com.key /etc/docker/certs
cd /etc/docker/certs
cat /etc/pki/tls/certs/dokcerhub.example.com.crt /etc/pki/CA/cacert.pem > dokcerhub.example.com.crt
docker run -d -p 5000:5000 --restart=always --name registry -v /etc/docker/certs:/certs -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/dokcerhub.example.com.crt -e REGISTRY_HTTP_TLS_KEY=/certs/dokcerhub.example.com.key registry:2
docker ps    # to make sure registry is UP

#-- Docker Host --#
mkdir -p /etc/docker/certs.d/dokcerhub.example.com:5000
scp  dokcerhub.example.com:/etc/pki/CA/cacert.pem /etc/docker/certs.d/dokcerhub.example.com:5000/ca.crt
cp /etc/docker/certs.d/dokcerhub.example.com:5000/ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust
systemctl restart docker
docker pull ubuntu
docker tag ubuntu dokcerhub.example.com:5000/ubuntu
docker push dokcerhub.example.com:5000/ubuntu
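# Optional sanity check from the Docker host: the registry's v2 catalog should now list the pushed repository (assumes the CA trust configured above):
curl https://dokcerhub.example.com:5000/v2/_catalog    # e.g. {"repositories":["ubuntu"]}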

16
Ref:
http://www.server-world.info/en/note?os=CentOS_7&p=openldap&f=1
http://www.server-world.info/en/note?os=CentOS_7&p=openldap&f=3
http://www.server-world.info/en/note?os=CentOS_7&p=openldap&f=4
http://www.server-world.info/en/note?os=CentOS_6&p=samba&f=4
http://www.study-area.org/tips/smbldap/
https://wiki.samba.org/index.php/Required_settings_for_NT4-style_domains

### Configure LDAP Server ###
yum -y install openldap-servers openldap-clients
cp /usr/share/openldap-servers/DB_CONFIG.example /var/lib/ldap/DB_CONFIG
chown ldap. /var/lib/ldap/DB_CONFIG
systemctl start slapd
systemctl enable slapd
slappasswd # copy the result
mkdir /root/tmp
cd /root/tmp
vi chrootpw.ldif
Code:
dn: olcDatabase={0}config,cn=config
changetype: modify
add: olcRootPW
olcRootPW: {SSHA}xxxxxxxxxxxxxxxxxxxxxxxx
ldapadd -Y EXTERNAL -H ldapi:/// -f chrootpw.ldif
for i in /etc/openldap/schema/*.ldif; do ldapadd -Y EXTERNAL -H ldapi:/// -f $i ; done
vi chdomain.ldif
Code:
dn: olcDatabase={1}monitor,cn=config
changetype: modify
replace: olcAccess
olcAccess: {0}to * by dn.base="gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth"
  read by dn.base="cn=Manager,dc=example,dc=com" read by * none

dn: olcDatabase={2}hdb,cn=config
changetype: modify
replace: olcSuffix
olcSuffix: dc=example,dc=com

dn: olcDatabase={2}hdb,cn=config
changetype: modify
replace: olcRootDN
olcRootDN: cn=Manager,dc=example,dc=com

dn: olcDatabase={2}hdb,cn=config
changetype: modify
add: olcRootPW
olcRootPW: {SSHA}xxxxxxxxxxxxxxxxxxxxxxxx

dn: olcDatabase={2}hdb,cn=config
changetype: modify
add: olcAccess
olcAccess: {0}to attrs=userPassword,shadowLastChange by
  dn="cn=Manager,dc=example,dc=com" write by anonymous auth by self write by * none
olcAccess: {1}to dn.base="" by * read
olcAccess: {2}to * by dn="cn=Manager,dc=example,dc=com" write by * read
ldapmodify -Y EXTERNAL -H ldapi:/// -f chdomain.ldif
vi basedomain.ldif
Code:
dn: dc=example,dc=com
objectClass: top
objectClass: dcObject
objectclass: organization
o: Example dot Com
dc: Example

dn: cn=Manager,dc=example,dc=com
objectClass: organizationalRole
cn: Manager
description: Directory Manager

dn: ou=People,dc=example,dc=com
objectClass: organizationalUnit
ou: People

dn: ou=Group,dc=example,dc=com
objectClass: organizationalUnit
ou: Group
ldapadd -x -D cn=Manager,dc=example,dc=com -W -f basedomain.ldif
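# Optional check that the base entries were created (a sketch; the ACLs above allow anonymous read, otherwise bind as the Manager with -D/-W):
ldapsearch -x -b dc=example,dc=com '(objectClass=*)' dn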

#-- skip this part if you don't want TLS --#
# you MUST build your server key and cert first; the 'easy-rsa' package is a good way to do it
# Assuming you've installed openvpn and easy-rsa
cd /etc/openvpn/easy-rsa
cp ldap.example.com.key ldap.example.com.crt ca.crt /etc/openldap/certs/
cd /etc/openldap/certs/
chown ldap. ldap.example.com.* ca.crt
cd /root/tmp
vi mod_ssl.ldif
Code:
dn: cn=config
changetype: modify
add: olcTLSCACertificateFile
olcTLSCACertificateFile: /etc/openldap/certs/ca.crt
-
replace: olcTLSCertificateFile
olcTLSCertificateFile: /etc/openldap/certs/ldap.example.com.crt
-
replace: olcTLSCertificateKeyFile
olcTLSCertificateKeyFile: /etc/openldap/certs/ldap.example.com.key
ldapmodify -Y EXTERNAL -H ldapi:/// -f mod_ssl.ldif
vi /etc/sysconfig/slapd
Code:
SLAPD_URLS="ldapi:/// ldap:/// ldaps:///"
#-- end of TLS configuration --#

systemctl start slapd
systemctl enable slapd

### Configure Client ###
#-- without TLS --#
yum -y install openldap-clients nss-pam-ldapd
authconfig --enableldap --enableldapauth --ldapserver=dlp.server.world --ldapbasedn="dc=example,dc=com" --enablemkhomedir --update
systemctl restart nslcd
systemctl enable nslcd
#-- withTLS --#
yum -y install openldap-clients nss-pam-ldapd
echo "TLS_REQCERT allow" >> /etc/openldap/ldap.conf
echo "tls_reqcert allow" >> /etc/nslcd.conf
scp ldap.example.com:/etc/openldap/certs/cacert.pem /etc/openldap/cacerts
authconfig --enableldap --enableldapauth --enableldaptls --ldapserver=dlp.server.world --ldapbasedn="dc=example,dc=com" --enablemkhomedir --update
systemctl restart nslcd
systemctl enable nslcd


### Configure SAMBA ###
yum -y install samba samba-client
cp /usr/share/doc/samba-4.2.3/LDAP/samba.ldif /etc/openldap/schema/
ldapadd -Y EXTERNAL -H ldapi:/// -f /etc/openldap/schema/samba.ldif
vi samba_indexes.ldif
Code:
dn: olcDatabase={2}hdb,cn=config
changetype: modify
add: olcDbIndex
olcDbIndex: uidNumber eq
olcDbIndex: gidNumber eq
olcDbIndex: loginShell eq
olcDbIndex: uid eq,pres,sub
olcDbIndex: memberUid eq,pres,sub
olcDbIndex: uniqueMember eq,pres
olcDbIndex: sambaSID eq
olcDbIndex: sambaPrimaryGroupSID eq
olcDbIndex: sambaGroupType eq
olcDbIndex: sambaSIDList eq
olcDbIndex: sambaDomainName eq
olcDbIndex: default sub
ldapmodify -Y EXTERNAL -H ldapi:/// -f samba_indexes.ldif
systemctl restart slapd


### Configure openldap-tools ###
rpm -Uvh http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm
yum install -y smbldap-tools
cd /etc/samba
mv smb.conf smb.conf.bak
cp /usr/share/doc/smbldap-tools-*/smb.conf smb.conf
vi /etc/samba/smb.conf
Code:
[global]
workgroup = EXAMPLE
netbios name = ldap
deadtime = 10
log level = 1
log file = /var/log/samba/log.%m
max log size = 5000
debug pid = yes
debug uid = yes
syslog = 0
utmp = yes
security = user
domain logons = yes
os level = 64
logon path =
logon home =
logon drive =
logon script =
passdb backend = ldapsam:"ldap://ldap.example.com/"
ldap ssl = no
ldap admin dn = cn=Manager,dc=example,dc=com
ldap delete dn = no
ldap password sync = yes
ldap suffix = dc=example,dc=com
ldap user suffix = ou=People
ldap group suffix = ou=Group
ldap machine suffix = ou=Computers
ldap idmap suffix = ou=Idmap
add user script = /usr/sbin/smbldap-useradd -m '%u' -t 1
rename user script = /usr/sbin/smbldap-usermod -r '%unew' '%uold'
delete user script = /usr/sbin/smbldap-userdel '%u'
set primary group script = /usr/sbin/smbldap-usermod -g '%g' '%u'
add group script = /usr/sbin/smbldap-groupadd -p '%g'
delete group script = /usr/sbin/smbldap-groupdel '%g'
add user to group script = /usr/sbin/smbldap-groupmod -m '%u' '%g'
delete user from group script = /usr/sbin/smbldap-groupmod -x '%u' '%g'
add machine script = /usr/sbin/smbldap-useradd -w '%u' -t 1
admin users = domainadmin
[NETLOGON]
path = /var/lib/samba/netlogon
browseable = no
share modes = no
[PROFILES]
path = /var/lib/samba/profiles
browseable = no
writeable = yes
create mask = 0611
directory mask = 0700
profile acls = yes
csc policy = disable
map system = yes
map hidden = yes
[homes]
comment = Home Directories
browseable = no
writable = yes
mkdir /var/lib/samba/{netlogon,profiles}
smbpasswd -W    # type the password of the ldap manager twice
systemctl start nmb
systemctl start smb
systemctl enable nmb
systemctl enable smb
smbldap-config
    # Answer all the questions as they come up
    # If you make a mistake, press ctrl-c and re-run the command
smbldap-populate
smbldap-groupadd -a domainadmin
smbldap-useradd -am -g domainadmin domainadmin
smbldap-passwd domainadmin

### To add a Win7 client ###
smbldap-useradd -W win7pc
# Ref: https://wiki.samba.org/index.php/Required_settings_for_NT4-style_domains

### Win7 modification ###
# Edit a text file named 'sambafix.reg'
Code:
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\LanManWorkstation\Parameters]

"DomainCompatibilityMode"=dword:00000001
"DNSNameResolutionRequired"=dword:00000000
# Double click the file to import the registry
# Reboot and join the 'EXAMPLE' domain using domainadmin or root account

17
Linux board / [openvpn] Install OpenVPN on CentOS7
« on: 2016-01-23 21:46 »
Ref:
https://www.digitalocean.com/community/tutorials/how-to-setup-and-configure-an-openvpn-server-on-centos-7
https://www.howtoforge.com/tutorial/how-to-install-openvpn-on-centos-7/
http://www.study-area.org/tips/openvpn.html

on Server:
yum install openvpn easy-rsa -y
cp /usr/share/doc/openvpn-*/sample/sample-config-files/server.conf /etc/openvpn
vi /etc/openvpn/server.conf
Code:
port 1194
proto udp
dev tap
ca ca.crt
cert vpnserver.example.com.crt
key vpnserver.example.com.key  # This file should be kept secret
dh dh2048.pem
server 10.8.0.0 255.255.255.0
ifconfig-pool-persist ipp.txt
push "redirect-gateway def1 bypass-dhcp"
push "dhcp-option DNS 8.8.8.8"
push "dhcp-option DNS 8.8.4.4"
keepalive 10 120
comp-lzo
user nobody
group nobody
persist-key
persist-tun
status openvpn-status.log
verb 3
*note: I use tap rather than tun as the device here

mkdir -p /etc/openvpn/easy-rsa/keys
cp -rf /usr/share/easy-rsa/2.0/* /etc/openvpn/easy-rsa
vi /etc/openvpn/easy-rsa/vars
Code:
export EASY_RSA="`pwd`"
export OPENSSL="openssl"
export PKCS11TOOL="pkcs11-tool"
export GREP="grep"
export KEY_CONFIG=`$EASY_RSA/whichopensslcnf $EASY_RSA`
export KEY_DIR="$EASY_RSA/keys"
echo NOTE: If you run ./clean-all, I will be doing a rm -rf on $KEY_DIR
export PKCS11_MODULE_PATH="dummy"
export PKCS11_PIN="dummy"
export KEY_SIZE=2048
export CA_EXPIRE=3650
export KEY_EXPIRE=3650
export KEY_COUNTRY="TW"
export KEY_PROVINCE="Taiwan"
export KEY_CITY="Tainan"
export KEY_ORG="ExampleDotCom"
export KEY_EMAIL="root@example.com"
export KEY_OU="IT"
export KEY_NAME="EasyRSA"
cp /etc/openvpn/easy-rsa/openssl-1.0.0.cnf /etc/openvpn/easy-rsa/openssl.cnf
cd /etc/openvpn/easy-rsa/
source ./vars
./clean-all
./build-ca
./build-key-server vpnserver.example.com
./build-dh
cd keys/
cp dh2048.pem ca.crt vpnserver.example.com.key vpnserver.example.com.crt /etc/openvpn/
cd ..
./build-key linuxclient
./build-key win7client
systemctl -f enable openvpn@server.service
systemctl start openvpn@server.service
systemctl status openvpn@server.service
systemctl start firewalld
systemctl enable firewalld
firewall-cmd --zone=public --add-port=1194/udp --permanent
firewall-cmd --reload
firewall-cmd --zone=public --list-ports

On Linux Client (Ubuntu)
sudo su -
apt-get install openvpn
cd /etc/openvpn/
scp vpnserver.example.com:/etc/openvpn/easy-rsa/keys/linuxclient.* .
scp vpnserver.example.com:/etc/openvpn/easy-rsa/keys/ca.crt .
vi client.ovpn
Code:
client
dev tap
proto udp
remote 1.2.3.4 1194
resolv-retry infinite
nobind
persist-key
persist-tun
comp-lzo
verb 3
ca /etc/openvpn/ca.crt
cert /etc/openvpn/linuxclient.crt
key /etc/openvpn/linuxclient.key
* Note: 1.2.3.4 is the IP of server
exit
sudo openvpn --config /etc/openvpn/client.ovpn


18
Ref: http://mark.koli.ch/configuring-apache-to-support-ssh-through-an-http-web-proxy-with-proxytunnel

Purpose:
Get ssh connection via HTTP proxy, if corporate firewall doesn't allow SSH.

Steps:

1. Install apache and proxytunnel on the server side (with the RPMforge repo installed).
yum install httpd proxytunnel

2. vi /etc/httpd/conf.d/proxytunnel.conf
Code:
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule proxy_connect_module modules/mod_proxy_connect.so
Listen 443
<VirtualHost *:443>
  RequestReadTimeout header=0,MinRate=500 body=0,MinRate=500
  ServerName proxy.example.com:443
  DocumentRoot /var/www/proxytunnel
  ServerAdmin root@example.com
  RewriteEngine On
  RewriteCond %{REQUEST_METHOD} !^CONNECT [NC]
  RewriteRule ^/(.*)$ - [F,L]
  ProxyRequests On
  ProxyBadHeader Ignore
  ProxyVia Full
  AllowCONNECT 22
  <Proxy *>
    Order deny,allow
    #Allow from all
    Deny from all
  </Proxy>
  <ProxyMatch (proxy\.example\.com)>
    Order allow,deny
    Allow from all
  </ProxyMatch>
  LogLevel warn
  ErrorLog logs/proxy.example.com-proxy_error_log
  CustomLog logs/proxy.example.com-proxy_request_log combined
</VirtualHost>
cp -a /var/www/html /var/www/proxytunnel

3. enable service
systemctl restart httpd
systemctl enable httpd

4. fix SeLinux:
grep ssh /var/log/audit/audit.log | audit2allow -M mypol
semodule -i mypol.pp

5. Client side settings:
vi .ssh/config
Code:
Host proxy.example.com
  Hostname proxy.example.com
  ProxyCommand /usr/bin/proxytunnel -p localproxy:3128 -r proxy.example.com:443 -d %h:%p -H "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Win32)"
  ServerAliveInterval 30
  TCPKeepAlive no
*Note: write the ProxyCommand on a single line; do not use \ to break it across lines.

6. Test
ssh user@proxy.example.com

For server on ubuntu:
a2enmod proxy_http
a2enmod proxy_connect
a2enmod rewrite
and fix the log path

19
Linux board / [ADSL] Configure ADSL on CentOS7
« on: 2016-01-21 22:07 »
Ref:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Networking_Guide/sec-Using_NetworkManager_with_the_GNOME_Graphical_User_Interface.html#sec-Establishing_a_DSL_Connection

Preparation:
yum install -y rp-pppoe

Configuration:
1. run command
    nm-connection-editor
2. Press Add
3. Select 'DSL' from Connection Type, and Create
4. Enter the information in the DSL tab:
    Username: 7654321@ip.hinet.net (for Static IP)  or 7654321@hinet.net (for Dynamic IP)
    Service: Hinet
    Password: XXXXXXX
5. Select the General tab:
    Check "Automatically connect to ......"
6. Select the Ethernet tab:
    Choose the proper device
7. Save
8. Go to NetworkManager and start the DSL connection

20
Problem Description:
Get https refused while pushing to a private registry.

Symptoms:
Code:
docker push 1.2.3.4:5000/test
The push refers to a repository [1.2.3.4:5000/test] (len: 1)
unable to ping registry endpoint https://1.2.3.4:5000/v0/
v2 ping attempt failed with error: Get https://1.2.3.4:5000/v2/: dial tcp 1.2.3.4:5000: connection refused
 v1 ping attempt failed with error: Get https://1.2.3.4:5000/v1/_ping: dial tcp 1.2.3.4:5000: connection refused

Solution:
vi /etc/sysconfig/docker
Code:
OPTIONS='--selinux-enabled --insecure-registry 1.2.3.4:5000'
systemctl restart docker


21
step1: goto download website:
https://about.gitlab.com/downloads/#centos7

step2: preparation:
Code:
sudo yum install curl openssh-server
sudo systemctl enable sshd
sudo systemctl start sshd
sudo yum install postfix
sudo systemctl enable postfix
sudo systemctl start postfix
sudo firewall-cmd --permanent --add-service=http
sudo systemctl reload firewalld

step3: installation:
Code:
curl https://packages.gitlab.com/install/repositories/gitlab/gitlab-ce/script.rpm.sh | sudo bash
sudo yum install gitlab-ce
*note: curl may fail if you are behind a proxy/firewall.
        - set up the proxy environment and use wget to download script.rpm.sh, then run it

step4: configuration:
Code:
sudo gitlab-ctl reconfigure
*note: you may want to change the URL if your servername is localhost.localdomain.
       - Edit /etc/gitlab/gitlab.rb and change the following:
               external_url "http://your.servername.or.ip.address"

step5: login
use browser to connect to the ip address, login with root  (password: 5iveL!fe)

step6: getting start
GitLab Documentation
連猴子都能懂的Git入門指南 (the "Git guide even a monkey can understand")


22
Miscellaneous / Happy New Year, everyone!
« on: 2016-01-01 09:02 »
I have declared 2016 my year of learning. May all of you learn much and learn well this year!
The Taipei venue should be settled this year as well, and I hope everyone will share their knowledge actively... ^_^

23
To celebrate sakana coming back from his studies and even travelling south to share the latest and hottest technology, we have arranged a welcome dinner after the meetup!
Please sign up here and I will keep this post updated:

netman : 4
xiang : 1
sakana + Ines : 2
鳥哥 : 1
小飛機 + 女友 : 2
三子 : 1
Jhe : 1
--------------
Total: 12

24
vSphere 5.5

encountered following error while vMotioning RHEL7.1 vms :

To revert to this snapshot, you must change the host pixel format to match that of the guest.  The host's current settings are: depth 24, bits per pixel 32.  The guest's current settings are: depth 24, bits per pixel 32.
Error encountered while trying to restore the virtual machine state from file "".

SOLUTION:
1. login as a normal user
2. go to Settings and change Display to lower resolution, e.g. 1024x768
3. su to root and run:
        cp /home/user/.config/monitors.xml /var/lib/gdm/.config/


25
some tips:

* load module: modprobe drbd
* make a link first: ln -s /sbin/drbdadm /usr/sbin/drbdadm
* there is a warning about drbd-kmp when using Yast, just keep press Ok or Next
* use a device name of the form /dev/drbd_XXX minor XXX (the _ must be present)
* manually run drbdadm create-md <RES>
* manually run drbdadm up <RES>
* manually run drbdadm -- --overwrite-data-of-peer primary <RES> on the 1st node, and drbdadm secondary <RES> on the 2nd node
* check status by cat /proc/drbd, to make sure connected and Primary/Secondary on 1st node and Secondary/Primary on 2nd node

Some Errors:
* unknown minor
    - run drbdadm up <RES>

* no resources defined!
   - use Yast to create resource

* ds:Diskless/UpToDate in /proc/drbd
  - drbdadm down <RES>; drbdadm up <RES>

* (104) Can not open backing device
  - delete and re-create the partition (may need a reboot)

* staying cs:StandAlone in /proc/drbd on 2nd node
 - drbdadm -- --discard-my-data connect <RES>

26
Tips of creating HA Cluster on OpenSuse 13.2 / 42.1

* Make sure NTP, DNS and firewall are configured properly.
* Avoid using LVM for the ISCSI target device; use a raw partition instead.
* It is recommended to use Disk ID(/dev/disk/by-id/xxxxxx) rather than path (/dev/sdaX).
* Initname (iqn) must be unique to all cluster nodes. It could be regenerated by iscsi-iname command.
* No auth for discovery, but require it for login; Incoming auth should be enough.
* The softdog module must be loaded for SBD (fencing system) before cluster initialization, if no hardware solution is available.
* Run mkfs.ocfs2 with stack and cluster names before using Hawk ocfs2 wizard. (mkfs.ocfs2 --cluster-stack=pcmk --cluster-name=hacluser /dev/disk/by-id/XXXXXX ; mounted.ocfs2 -t)
* Sync configuration files using csync2 -xv before setup a service cluster.
* Don't use 255.255.255.0 mask format in Hawk web server wizard, use 24 instead.
* Set up clone resources for sbd, dlm(base) and ocfs, those must run on all nodes, or put them into a single group & clone.
* Configure constrains after resource creation, set up dependency(colocation) and order. (No need for Leap 42.1)

27
vi /etc/qemu/bridge.conf
Code:
allow virbr0
qemu-system-arm -kernel kernel-qemu -cpu arm1176 -m 256 -M versatilepb -no-reboot -append "root=/dev/sda2" -hda xxxxxxxxxxxxxxx.img -net nic -net bridge,br=virbr0

28
database board / [mssql] remove a db owner login
« on: 2015-07-29 14:09 »
Ref:
http://coresql.com/2013/10/24/cant-drop-user-the-server-principal-owns-one-or-more-endpoints-and-cannot-be-dropped/

Scenario:
1.   Used user-A to create DB, and assigned to db owner.
2.   Need to remove user-A and to use user-B instead.

Symptoms:
1.   Can’t remove user-A and encountering following error 15141:
The server principal owns one or more endpoint(s) and cannot be dropped hence unable to delete a login from SQL Server

Steps:
1.   Open SSMS and add new Login user-B, assign db owner.
2.   Open DB properties then select File Permission, change owner to user-B.
3.   Run query to find all endpoints related to user-A:
Code:
SELECT p.name, e.* FROM sys.endpoints e
inner join sys.server_principals p on e.principal_id = p.principal_id
4.   Run alter to change owner:
Code:
Alter Authorization on endpoint::Mirroring to user-B
5.   Go to Security/Login to remove user-A

30
Linux board / CentOS 7 join AD Domain
« on: 2015-07-27 14:54 »
Ref:
http://www.hexblot.com/blog/centos-7-active-directory-and-samba

steps:
yum install realmd samba samba-common oddjob oddjob-mkhomedir sssd ntpdate ntp
systemctl enable ntpd.service
vi /etc/ntp.conf    # to add server dc1.mydomain.local on top of other servers
ntpdate dc1.mydomain.local
systemctl start ntpd.service
realm join --user=adminuser@mydomain.local mydomain.local
realm list    # to verify, otherwise redo from beginning

vi /etc/samba/smb.conf
Code:
[global]
workgroup = MYDOMAINLOCAL
server string = Samba Server Version %v

# Add the IPs / subnets allowed acces to the server in general.
# The following allows local and 10.0.*.* access
hosts allow = 127. 10.0.

# log files split per-machine:
log file = /var/log/samba/log.%m
# enable the following line to debug:
# log level =3
# maximum size of 50KB per log file, then rotate:
max log size = 50

# Here comes the juicy part!
security = ads
encrypt passwords = yes
passdb backend = tdbsam
realm = MYDOMAIN.LOCAL

# Not interested in printers
load printers = no
cups options = raw

# This stops an annoying message from appearing in logs
printcap name = /dev/null
systemctl enable smb.service
systemctl start smb.service

firewall-cmd --permanent --add-service=samba
firewall-cmd --reload

-----
vi /etc/sssd/sssd.conf:
Code:
...
#use_fully_qualified_names = True
use_fully_qualified_names = False
...
#fallback_homedir = /home/%d/%u
fallback_homedir = /home/%u
...

systemctl restart sssd
systemctl status sssd

id username
su - username
chcon -t samba_share_t /home/username

visudo:
%MYDOMAINLOCAL\\domain\ admins ALL=(ALL)       NOPASSWD: ALL

Tips:
if browsing by IP fails, use the netbios name instead.

Pages: [1] 2 3 ... 13