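**Chapter 1 Introduction**

PaaS: PaaS stands for Platform-as-a-Service, a business model in which the server platform itself is offered as a service. Delivering programs over the network as a service is called SaaS (Software as a Service); in the cloud era, delivering the corresponding server platform or development environment as a service is PaaS (Platform as a Service).

Docker: Docker is an open-source application container engine. It lets developers package an application and its dependencies into a portable container and then publish it to any mainstream Linux machine; it can also be used for virtualization.

ZooKeeper: ZooKeeper is an open-source coordination service for distributed applications. It provides a simple set of primitives on which distributed applications can build synchronization, configuration maintenance, naming services and so on.

MooseFS: MooseFS is a distributed file system.

Kubernetes: Kubernetes is the container cluster management system open-sourced by Google. It is built on top of Docker and gives containerized applications a complete set of functions: resource scheduling, deployment and running, service discovery, scaling out and in.

Etcd: etcd is a highly available, strongly consistent store used for service discovery.

Keepalived: a high-availability solution for web services based on the VRRP protocol; it can be used to avoid single points of failure.

Calico: Calico is a virtual networking tool based on the BGP protocol. Virtual machines, containers or bare-metal machines in a data center (all called workloads here) need only an IP address to be interconnected through Calico.

**Chapter 2 Preparing the environment**

2.1 Image files

Prepare two image packages: CentOS1611 and docker-extra. The 积分 test environment uses the Aliyun yum source, so this step can be skipped there.

2.2 Configure the local yum source

~~~
vi /etc/yum.repos.d/CentOS7_1611.repo

[CentOS7_1611-media]
name=CentOS7_1611-media
baseurl=http://10.255.224.27/CentOS1611/
gpgcheck=0
enabled=1

[1611-entry]
name=1611-entry
baseurl=http://10.255.224.95/docker-extra/
gpgcheck=0
enabled=1
~~~

**Chapter 3 Overall architecture**

3.1 Cluster architecture

3.2 Version information

~~~
Service               Current production version   wlw version
kubernetes            v1.2.0                       v1.5.2
docker                1.9.1                        1.12.6
etcd                  2.2.5                        3.1.0
docker-distribution   docker-registryV1            Docker Registry V2
calico                0.18                         0.18
zookeeper             3.4.6                        3.4.6
~~~

3.3 Kubernetes cluster architecture

**Chapter 4 Configuring the crt communication certificates**

4.1 Configure the apiserver crt certificate

1. Download easyrsa3:

~~~
# easy-rsa.tar.gz has also been placed on the disk
curl -L -O https://storage.googleapis.com/kubernetes-release/easy-rsa/easy-rsa.tar.gz
tar xzf easy-rsa.tar.gz
cd easy-rsa-master/easyrsa3
~~~

2. Initialize the PKI:

./easyrsa init-pki

3. Create the CA:

./easyrsa --batch "--req-cn=${MASTER_IP}@$(date +%s)" build-ca nopass

(To access the Kubernetes cluster through the default service, use --req-cn=*. The IP address is the IP of the master node.)

4. Generate the cert and key used by the service:

./easyrsa --subject-alt-name="IP:${MASTER_IP}" build-server-full kubernetes-master nopass

(To access the Kubernetes cluster through the default service, change ${MASTER_IP} to * and also configure the service IP after the IP entry.)

5. Move the generated files into the production directory:

~~~
mkdir -p /srv/kubernetes
cp pki/ca.crt /srv/kubernetes/
cp pki/issued/kubernetes-master.crt /srv/kubernetes/server.crt
cp pki/private/kubernetes-master.key /srv/kubernetes/server.key
~~~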
6. Configure the ServiceAccount key:

~~~
openssl genrsa -out /srv/kubernetes/serviceaccount.key 2048
# Adjust the permissions of the certificate directory
chmod 777 -R /srv/kubernetes/
~~~

**Chapter 5 Building the Etcd cluster**

5.1 Install Etcd

Run: `rpm -ivh etcd-3.1.0-2.el7.x86_64.rpm`

5.2 Configure Etcd

Taking the minion1 machine as an example, change the configuration as follows:

~~~
vi /etc/etcd/etcd.conf

# [member]
# Each etcd host must define a different NAME
ETCD_NAME=GPRSDX1
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
#ETCD_WAL_DIR=""
#ETCD_SNAPSHOT_COUNT="10000"
#ETCD_HEARTBEAT_INTERVAL="100"
#ETCD_ELECTION_TIMEOUT="1000"
# Peer listen address, i.e. the port used for communication inside the cluster
ETCD_LISTEN_PEER_URLS="http://0.0.0.0:2380"
# Client listen address, i.e. the port clients connect to
ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"
#ETCD_MAX_SNAPSHOTS="5"
#ETCD_MAX_WALS="5"
#ETCD_CORS=""
#
#[cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://10.255.224.17:2380"
# if you use different ETCD_NAME (e.g. test), set ETCD_INITIAL_CLUSTER value for this name, i.e. "test=http://..."
# ETCD_INITIAL_CLUSTER defines the cluster members
ETCD_INITIAL_CLUSTER="GPRSDX1=http://10.255.224.17:2380,GPRSDX2=http://10.255.224.19:2380,gprsdx3=http://10.255.224.91:2380"
# Use "new" when bootstrapping; change the value to "existing" once the cluster has been established
ETCD_INITIAL_CLUSTER_STATE="new"
# Name (token) of the etcd cluster
ETCD_INITIAL_CLUSTER_TOKEN="k8s-etcd-cluster"
# Advertised client URL. This must be the IP of the host itself, not 0.0.0.0,
# otherwise etcd clients cannot find the hosts in the etcd cluster
ETCD_ADVERTISE_CLIENT_URLS="http://10.255.224.17:2379"
#ETCD_DISCOVERY=""
#ETCD_DISCOVERY_SRV=""
#ETCD_DISCOVERY_FALLBACK="proxy"
#ETCD_DISCOVERY_PROXY=""
#
#[proxy]
#ETCD_PROXY="off"
#ETCD_PROXY_FAILURE_WAIT="5000"
#ETCD_PROXY_REFRESH_INTERVAL="30000"
#ETCD_PROXY_DIAL_TIMEOUT="1000"
#ETCD_PROXY_WRITE_TIMEOUT="5000"
#ETCD_PROXY_READ_TIMEOUT="0"
#
#[security]
#ETCD_CERT_FILE=""
#ETCD_KEY_FILE=""
#ETCD_CLIENT_CERT_AUTH="false"
#ETCD_TRUSTED_CA_FILE=""
#ETCD_PEER_CERT_FILE=""
#ETCD_PEER_KEY_FILE=""
#ETCD_PEER_CLIENT_CERT_AUTH="false"
#ETCD_PEER_TRUSTED_CA_FILE=""
#
#[logging]
#ETCD_DEBUG="false"
# examples for -log-package-levels etcdserver=WARNING,security=DEBUG
#ETCD_LOG_PACKAGE_LEVELS=""
~~~

Note: if a machine in the cluster has already run the etcd service, changing the etcd configuration and restarting the service is not enough to join it to the etcd cluster. The etcd data must be deleted first, and the service restarted afterwards.
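The ETCD_INITIAL_CLUSTER value in the configuration above has to list every member as name=peerURL, joined by commas with no spaces, and it must be identical on all hosts. As a minimal sketch (the helper `build_initial_cluster` is hypothetical, not part of etcd, and it assumes plain http peers on port 2380 as in the example), the string could be generated once and pasted into each etcd.conf:

```shell
#!/bin/sh
# Hypothetical helper: build an ETCD_INITIAL_CLUSTER value from "name=ip" pairs.
# Assumes every peer listens on http port 2380, as in the sample etcd.conf.
build_initial_cluster() {
    result=""
    for pair in "$@"; do
        name=${pair%%=*}   # text before the first "="
        ip=${pair#*=}      # text after the first "="
        entry="${name}=http://${ip}:2380"
        if [ -z "$result" ]; then
            result=$entry
        else
            result="${result},${entry}"
        fi
    done
    printf '%s\n' "$result"
}

# Member names and IPs taken from the sample configuration.
build_initial_cluster GPRSDX1=10.255.224.17 GPRSDX2=10.255.224.19 gprsdx3=10.255.224.91
```

Each name passed in is expected to match the ETCD_NAME configured on the corresponding host.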
The etcd data directory is the one set by ETCD_DATA_DIR in the etcd configuration file.

Check the unit file of the etcd service:

~~~
vi /usr/lib/systemd/system/etcd.service

[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
User=etcd
# set GOMAXPROCS to number of processors
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd --name=\"${ETCD_NAME}\" --data-dir=\"${ETCD_DATA_DIR}\" --listen-client-urls=\"${ETCD_LISTEN_CLIENT_URLS}\""
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
~~~

5.3 Start Etcd

systemctl start etcd.service

5.4 Check the Etcd cluster state

Run the following command on any etcd host. If the output looks similar to this, the cluster is installed correctly:

~~~
etcdctl member list
9e6e7b1fddbae15f: name=minion1 peerURLs=http://192.168.11.44:2380 clientURLs=http://192.168.11.44:2379
9ffcd715e2ce89d0: name=minion3 peerURLs=http://192.168.11.46:2380 clientURLs=http://192.168.11.46:2379
b4e5215afede02fd: name=minion2 peerURLs=http://192.168.11.45:2380 clientURLs=http://192.168.11.45:2379
~~~

Or run:

~~~
etcdctl cluster-health
member e5a6e1abedcc726 is healthy: got healthy result from http://10.255.224.91:2379
member 70994e5bc7e29cf9 is healthy: got healthy result from http://10.255.224.19:2379
member a463697b2fd4abb4 is healthy: got healthy result from http://10.255.224.17:2379
cluster is healthy
~~~

5.5 Connecting ETCD to the PaaS platform

Only the changes needed to connect to the etcd cluster are described here. For installing and configuring Calico and Kubernetes, see the ECP installation and deployment manual.

5.5.1 Connecting Calico

On all nodes:

- Add the environment variable ETCD_ENDPOINTS to /etc/profile. Its value is the endpoint of every node in the etcd cluster, separated by commas with no spaces. If an ETCD_ENDPOINTS variable was added earlier for a single-node etcd, comment it out or delete it.

`export ETCD_ENDPOINTS=http://192.168.11.44:2379,http://192.168.11.45:2379,http://192.168.11.46:2379`

- Edit /etc/systemd/calico-node.service. Change the Environment parameter from the single-node ETCD_AUTHORITY to ETCD_ENDPOINTS and change its value accordingly.

`Environment=ETCD_ENDPOINTS=http://192.168.11.44:2379,http://192.168.11.45:2379,http://192.168.11.46:2379`

On the minion nodes:

- Edit the configuration file /etc/cni/net.d/10-calico.conf. Change the single-node key "etcd_authority" to "etcd_endpoints" and change its value accordingly.

~~~
$ cat \
/etc/cni/net.d/10-calico.conf
{
    "name": "calico-k8s-network",
    "type": "calico",
    "etcd_endpoints": "http://192.168.11.44:2379,http://192.168.11.45:2379,http://192.168.11.46:2379",
    "log_level": "info",
    "ipam": {
        "type": "calico-ipam"
    }
}
~~~

Restart the calico service, then run `calicoctl node` and check that the node state is normal.

5.5.2 Connecting Kubernetes

On the master node, edit the API Server configuration file /etc/kubernetes/apiserver:

`KUBE_ETCD_SERVERS="--etcd-servers=http://192.168.11.44:2379,http://192.168.11.45:2379,http://192.168.11.46:2379"`

Restart the Kubernetes services.

**Chapter 6 Building the Zookeeper cluster**

6.1 Install Zookeeper

1. Unpack the downloaded Zookeeper package into the installation directory:

`tar -zxvf zookeeper-3.4.6.tar.gz -C /data01/wlwjf/app/`

6.2 Zookeeper configuration

~~~
vi zoo.cfg

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/data01/wlwjf/app/data/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
# Addresses of the hosts on which zk is installed
server.1=10.255.224.96:2888:3888
server.2=10.255.224.97:2888:3888
server.3=10.255.224.98:2888:3888
~~~

6.3 Check the unit file

~~~
vi /usr/lib/systemd/system/zookeeper.service

[Unit]
Description=Zookeeper service
After=network.target

[Service]
User=wlwjf
Group=wlwjf
SyslogIdentifier=wlwjf
Environment=ZHOME=/data01/wlwjf/app/zookeeper-3.4.6
ExecStart=/usr/bin/java \
  -Dzookeeper.log.dir=${ZHOME}/logs/zookeeper.log \
  -Dzookeeper.root.logger=INFO,CONSOLE \
  -cp ${ZHOME}/zookeeper-3.4.6.jar:${ZHOME}/lib/* \
  -Dlog4j.configuration=file:${ZHOME}/conf/log4j.properties \
  -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.local.only=false \
  org.apache.zookeeper.server.quorum.QuorumPeerMain \
  ${ZHOME}/conf/zoo.cfg

[Install]
WantedBy=multi-user.target
~~~

6.4 Start Zookeeper

systemctl start zookeeper.service

6.5 Check ZK

In the bin directory under the installation directory, run:

~~~
./zkServer.sh status
JMX enabled by default
Using config: /data01/wlwjf/app/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
~~~

**Chapter 7 Installing docker**

Prerequisite: a separate VG named docker-vg must be created.

7.1 Install docker

yum -y install docker

7.2 Edit the configuration files

~~~
# /etc/sysconfig/docker

# Modify these options if you want to change the way the docker daemon runs
OPTIONS='--selinux-enabled --log-driver=journald --signature-verification=false'
if [ -z "${DOCKER_CERT_PATH}" ]; then
    DOCKER_CERT_PATH=/etc/docker
fi

# If you want to add your own registry to be used for docker search and docker
# pull use the ADD_REGISTRY option to list a set of registries, each prepended
# with --add-registry flag. The first registry added will be the first registry
# searched.
# ADD_REGISTRY='--add-registry registry.access.redhat.com:5000'

# If you want to block registries from being used, uncomment the BLOCK_REGISTRY
# option and give it a set of registries, each prepended with --block-registry
# flag. For example adding docker.io will stop users from downloading images
# from docker.io
# BLOCK_REGISTRY='--block-registry'

# If you have a registry secured with https but do not have proper certs
# distributed, you can tell docker to not look for full authorization by
# adding the registry to the INSECURE_REGISTRY line and uncommenting it.
# Address of the local registry. Edit /etc/hosts so that the local registry
# address resolves under this name (registry registry.wlw)
INSECURE_REGISTRY='--insecure-registry registry.wlw.com:5000'

# On an SELinux system, if you remove the --selinux-enabled option, you
# also need to turn on the docker_transition_unconfined boolean.
# setsebool -P docker_transition_unconfined 1

# Location used for temporary files, such as those created by
# docker load and build operations. Default is /var/lib/docker/tmp
# Can be overriden by setting the following environment variable.
# DOCKER_TMPDIR=/var/tmp

# Controls the /etc/cron.daily/docker-logrotate cron job status.
# To disable, uncomment the line below.
# LOGROTATE=false
#

# docker-latest daemon can be used by starting the docker-latest unitfile.
# To use docker-latest client, uncomment below lines
#DOCKERBINARY=/usr/bin/docker-latest
#DOCKERDBINARY=/usr/bin/dockerd-latest
#DOCKER_CONTAINERD_BINARY=/usr/bin/docker-containerd-latest
#DOCKER_CONTAINERD_SHIM_BINARY=/usr/bin/docker-containerd-shim-latest

vi /etc/sysconfig/docker-storage-setup

# Edit this file to override any configuration options specified in
# /usr/lib/docker-storage-setup/docker-storage-setup.
#
# For more details refer to "man docker-storage-setup"
# This VG must be created beforehand
VG=docker-vg
SETUP_LVM_THIN_POOL=yes
~~~

7.3 Start docker

systemctl start docker

7.4 Check the docker status

systemctl status docker

**Chapter 8 Installing the calico network plugin**

calico assigns virtual IP addresses to containers and uses the host's production NIC for cross-host container communication.

8.1 Install on the master node (version v0.18.0)

Install calicoctl:

`wget -O /usr/bin/calicoctl https://github.com/projectcalico/calico-containers/releases/download/v0.18.0/calicoctl`

chmod +x /usr/bin/calicoctl

8.2 Install on the minion nodes

Install calicoctl as on the master node.

~~~
# Install the calico-cni plugins:
wget -N -P /opt/cni/bin https://github.com/projectcalico/calico-cni/releases/download/v1.1.0/calico
chmod +x /opt/cni/bin/calico
wget -N -P /opt/cni/bin https://github.com/projectcalico/calico-cni/releases/download/v1.1.0/calico-ipam
chmod +x /opt/cni/bin/calico-ipam
~~~

8.3 Configure the master node

Configure calico-node:

~~~
# Create the file /etc/systemd/calico-node.service and set the address of the etcd cluster
[Unit]
Description=calicoctl node
After=docker.service
Requires=docker.service

[Service]
User=root
Environment=ETCD_ENDPOINTS=http://<IP of the etcd host's management NIC>:2379
PermissionsStartOnly=true
ExecStart=/usr/bin/calicoctl node --ip=<IP of this host's production NIC> --detach=false
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

# The service is started from the image calico/node:v0.18.0.
# Load the prepared image onto this machine and check it:
docker load -i calico.tar
docker images

# Enable calico-node as a service that starts at boot, and start it.
# Add ETCD_ENDPOINTS=http://master:2379 to the environment variables.
systemctl enable /etc/systemd/calico-node.service
service calico-node restart

# Load the kernel modules:
modprobe ip_set
modprobe ip6_tables
modprobe ip_tables
~~~

8.4 Configure the node machines

1. Configure calico-node as on the master.

2. Configure the cni network declaration:

~~~
$ cat /etc/cni/net.d/10-calico.conf
{
    "name": "calico-k8s-network",
    "type": "calico",
    "etcd_endpoints": "http://<IP of the etcd host's management NIC>:2379",
    "log_level": "info",
    "ipam": {
        "type": "calico-ipam"
    }
}
~~~

8.5 Configure the calico IP pool

Configure the IP pool used by calico on the master node.

Check whether an IP pool already exists:

calicoctl pool show

If a pool exists and its network range differs from the planned one, remove the existing pool and add the correct one:

calicoctl pool remove ${CIDR}
calicoctl pool add 192.168.0.0/16

**Chapter 9 Building the Kubernetes cluster**

9.1 Install k8s 1.5.2 from the yum source

yum -y install kubernetes-1.5.2

9.2 Master configuration

Copy the certificate files to all Master
node hosts:

    cd /srv/kubernetes/
    scp * root@10.255.224.19:/srv/kubernetes/

9.2.1 apiserver configuration

The API Server is the key service process that exposes the HTTP REST interface. It is the only entry point for creating, deleting, updating and querying all resources in k8s, and also the entry process for cluster control.

    cat /etc/kubernetes/apiserver

    # Listen on all network interfaces
    KUBE_API_ADDRESS="--insecure-bind-address=0.0.0.0"
    # Access addresses of the etcd service. If etcd is deployed as a cluster,
    # list the addresses of all etcd servers, separated by commas
    KUBE_ETCD_SERVERS="--etcd-servers=http://10.255.224.17:2379,http://10.255.224.19:2379,http://10.255.224.91:2379"
    # Virtual IP pool for services; it must not overlap with the physical network
    KUBE_SERVICE_ADDRESSES="--service-cluster-ip-range=10.254.0.0/16"
    KUBE_ADMISSION_CONTROL="--admission-control=NamespaceLifecycle,NamespaceExists,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota"
    KUBE_API_ARGS="--client-ca-file=/srv/kubernetes/ca.crt    ## client certificates signed by this CA are used for authentication
      --tls-cert-file=/srv/kubernetes/server.crt    ## path of the file containing the x509 certificate used for https
      --tls-private-key-file=/srv/kubernetes/server.key    ## path of the x509 private key that matches tls-cert-file
      --service_account_key_file=/srv/kubernetes/serviceaccount.key"    ## path of a PEM-encoded x509 RSA public or private key used to verify service account tokens; if unset, the file given by tls-private-key-file is used

The admission controllers listed in KUBE_ADMISSION_CONTROL:

NamespaceLifecycle: a request to create a resource object in a namespace that does not exist is rejected. When a namespace is deleted, the system deletes all objects in it, including pods, services and so on.

LimitRanger: quota management at the pod and container level; it ensures that the limits set for pods and containers are not exceeded.

SecurityContextDeny: rejects pods that define certain operating-system-level security settings (SecurityContext).

ServiceAccount: automates the handling of service accounts.

ResourceQuota: quota management at the namespace level; it observes all requests and ensures that the quota of the namespace is not exceeded.

Unit file:

    [root@GPRSDX1 ~]# cat /usr/lib/systemd/system/kube-apiserver.service
    [Unit]
    Description=Kubernetes API Server
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    After=network.target
    After=etcd.service

    [Service]
    EnvironmentFile=-/etc/kubernetes/config
    EnvironmentFile=-/etc/kubernetes/apiserver
    User=kube
    ExecStart=/usr/bin/kube-apiserver \
        $KUBE_LOGTOSTDERR \
        $KUBE_LOG_LEVEL \
        $KUBE_ETCD_SERVERS \
        $KUBE_API_ADDRESS \
        $KUBE_API_PORT \
        $KUBELET_PORT \
        $KUBE_ALLOW_PRIV \
        $KUBE_SERVICE_ADDRESSES \
        $KUBE_ADMISSION_CONTROL \
        $KUBE_API_ARGS
    Restart=on-failure
    Type=notify
    LimitNOFILE=65536

    [Install]
    WantedBy=multi-user.target
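The --etcd-servers value in /etc/kubernetes/apiserver is a comma-separated list of URLs, and a stray space after a comma is an easy mistake to make. The following is a minimal sketch of a pre-flight check; the function name `check_etcd_servers` is hypothetical, and it only checks the shape of the string (scheme, port, no spaces), not whether the endpoints are reachable:

```shell
#!/bin/sh
# Hypothetical helper: sanity-check a comma-separated etcd endpoint list.
# Prints "ok" and returns 0 when every entry looks like http(s)://host:port.
check_etcd_servers() {
    value=$1
    case $value in
        *" "*) echo "invalid: contains a space"; return 1 ;;
    esac
    old_ifs=$IFS
    IFS=,
    for url in $value; do
        case $url in
            http://*:[0-9]*|https://*:[0-9]*) ;;   # scheme, host and a numeric port
            *) IFS=$old_ifs; echo "invalid entry: $url"; return 1 ;;
        esac
    done
    IFS=$old_ifs
    echo "ok"
}

# Endpoint list taken from the apiserver example above.
check_etcd_servers "http://10.255.224.17:2379,http://10.255.224.19:2379,http://10.255.224.91:2379"
```

The same check can be applied to the ETCD_ENDPOINTS value used by calico, since it has the same comma-separated form.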
9.2.2 scheduler configuration

The Kubernetes Scheduler is the process responsible for resource scheduling (pod scheduling); it picks the most suitable node for each pod to run on. It is comparable to the "dispatch office" of a bus company.

    # Add your own!
    KUBE_SCHEDULER_ARGS="--leader-elect=true"    ## perform leader election, used when several master components are deployed for high availability

Unit file:

    [root@GPRSDX1 ~]# cat /usr/lib/systemd/system/kube-scheduler.service
    [Unit]
    Description=Kubernetes Scheduler Plugin
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes

    [Service]
    EnvironmentFile=-/etc/kubernetes/config
    EnvironmentFile=-/etc/kubernetes/scheduler
    User=kube
    ExecStart=/usr/bin/kube-scheduler \
        $KUBE_LOGTOSTDERR \
        $KUBE_LOG_LEVEL \
        $KUBE_MASTER \
        $KUBE_SCHEDULER_ARGS
    Restart=on-failure
    LimitNOFILE=65536

    [Install]
    WantedBy=multi-user.target

9.2.3 controller-manager configuration

The controller manager is the automated control center for all resource objects in Kubernetes; it can be thought of as the "chief steward" of the resource objects.

Add the following to /etc/kubernetes/controller-manager:

    KUBE_CONTROLLER_MANAGER_ARGS="--root-ca-file=/srv/kubernetes/ca.crt
      --service_account_private_key_file=/srv/kubernetes/serviceaccount.key    ## private key used to sign service account tokens
      --terminated-pod-gc-threshold=12500    ## number of terminated pods kept before they are garbage-collected
      --leader-elect=true"    ## perform leader election

Unit file:

    [Unit]
    Description=Kubernetes Controller Manager
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes

    [Service]
    EnvironmentFile=-/etc/kubernetes/config
    EnvironmentFile=-/etc/kubernetes/controller-manager
    User=kube
    ExecStart=/usr/bin/kube-controller-manager \
        $KUBE_LOGTOSTDERR \
        $KUBE_LOG_LEVEL \
        $KUBE_MASTER \
        $KUBE_CONTROLLER_MANAGER_ARGS
    Restart=on-failure
    LimitNOFILE=65536

    [Install]
    WantedBy=multi-user.target

9.3 Node configuration

9.3.1 kubelet configuration

The kubelet creates, starts and stops the containers that belong to pods, and works closely with the master node to implement the basic functions of cluster management.

    [root@GPRSDX1 ~]# cat /etc/kubernetes/kubelet |grep -v "^#" |grep -v "^$"
    # Address the kubelet binds to
    KUBELET_ADDRESS="--address=0.0.0.0"
    # Host name of this node within the cluster
    KUBELET_HOSTNAME="--hostname-override=GPRSDX1"
    # IP address and port of the api server
    KUBELET_API_SERVER="--api-servers=http://10.255.224.90:8089"
    # Base pause image used to share the network namespace inside a pod
    KUBELET_POD_INFRA_CONTAINER="--pod-infra-container-image=registry.access.redhat.com/rhel7/pod-infrastructure:latest"
    KUBELET_ARGS="--network-plugin-dir=/etc/cni/net.d    ## directory scanned for network plugins
      --network-plugin=cni    ## name of the network plugin to use
      --cluster-dns=10.254.0.3    ## IP address of the DNS service inside the cluster
      --cluster-domain=cluster.local"    ## domain name used by the DNS service inside the cluster

Unit file:

    [root@GPRSDX1 ~]# cat /usr/lib/systemd/system/kubelet.service
    [Unit]
    Description=Kubernetes Kubelet Server
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    After=docker.service
    Requires=docker.service

    [Service]
    WorkingDirectory=/var/lib/kubelet
    EnvironmentFile=-/etc/kubernetes/config
    EnvironmentFile=-/etc/kubernetes/kubelet
    ExecStart=/usr/bin/kubelet \
        $KUBE_LOGTOSTDERR \
        --network-plugin-dir=/etc/cni/net.d \
        --network-plugin=cni \
        $KUBE_LOG_LEVEL \
        $KUBELET_API_SERVER \
        $KUBELET_ADDRESS \
        $KUBELET_PORT \
        $KUBELET_HOSTNAME \
        $KUBE_ALLOW_PRIV \
        $KUBELET_POD_INFRA_CONTAINER \
        $KUBELET_ARGS
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target

9.3.2 kube-proxy configuration

kube-proxy is the key component that implements communication and load balancing for Kubernetes Services.

    cat /etc/kubernetes/proxy |grep -v "^#" |grep -v "^$"
    # Proxy mode
    KUBE_PROXY_ARGS="--proxy-mode=iptables"

Unit file:

    [root@GPRSDX1 ~]# cat /usr/lib/systemd/system/kube-proxy.service
    [Unit]
    Description=Kubernetes Kube-Proxy Server
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    After=network.target

    [Service]
    EnvironmentFile=-/etc/kubernetes/config
    EnvironmentFile=-/etc/kubernetes/proxy
    ExecStart=/usr/bin/kube-proxy \
        $KUBE_LOGTOSTDERR \
        $KUBE_LOG_LEVEL \
        $KUBE_MASTER \
        $KUBE_PROXY_ARGS
    Restart=on-failure
    LimitNOFILE=65536

    [Install]
    WantedBy=multi-user.target

9.4 Common parameters

    cat /etc/kubernetes/config |grep -v "^#" |grep -v "^$"
    # Write logs to standard error instead of to files
    KUBE_LOGTOSTDERR="--logtostderr=true"
    # Log level
    KUBE_LOG_LEVEL="--v=0"
    # Allow pods to run privileged containers
    KUBE_ALLOW_PRIV="--allow-privileged=true"
    # Floating IP and port of the k8s master
    KUBE_MASTER="--master=http://10.255.224.90:8089"

9.5 Start and check the services

On the master node, start the following services:

    systemctl start kube-apiserver.service
    systemctl start kube-controller-manager.service
    systemctl start kube-scheduler.service

Check that the services are running:

    systemctl status kube-apiserver.service
    systemctl status kube-controller-manager.service
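    systemctl status kube-scheduler.service

Or:

    kubectl get componentstatuses

On the node machines, start:

    systemctl start kubelet.service
    systemctl start kube-proxy.service

Check that the services are running:

    systemctl status kube-proxy.service
    systemctl status kubelet.service

9.6 About the services

The master runs three components:

apiserver: the entry point of the Kubernetes system. It wraps the create, delete, update and query operations on the core objects and exposes them as a RESTful interface to external clients and internal components. The REST objects it maintains are persisted to etcd (a distributed, strongly consistent key/value store).

scheduler: responsible for resource scheduling in the cluster; it assigns a machine to each newly created pod. Because this work is split out into its own component, it can easily be replaced by another scheduler.

controller-manager: runs the various controllers. There are currently two kinds:

endpoint-controller: periodically associates services with pods (the association is kept in endpoint objects), so that the mapping from service to pod is always up to date.

replication-controller: periodically associates replicationControllers with pods, so that the number of replicas defined by a replicationController always matches the number of pods actually running.

The slave (called a minion) runs two components:

kubelet: manages the docker containers, for example starting and stopping them and monitoring their state. It periodically fetches the pods assigned to its host (stored in etcd and served through the apiserver) and starts or stops the corresponding containers according to the pod information. It also accepts HTTP requests from the apiserver and reports the running state of the pods.

proxy: provides the proxy for pods. It periodically fetches all services (stored in etcd and served through the apiserver) and creates proxies from the service information. When a client pod accesses other pods, the request is forwarded through the proxy on the local machine.

**Chapter 10 Installing the local image registry**

The current docker version is 1.13.0, in which docker-registry has been replaced by docker-distribution.

    yum install docker-distribution

Edit the docker configuration file /etc/sysconfig/docker (the parts to change were marked in red in the original document):

    # /etc/sysconfig/docker

    # Modify these options if you want to change the way the docker daemon runs
    OPTIONS='--selinux-enabled --log-driver=journald --signature-verification=false'
    if [ -z "${DOCKER_CERT_PATH}" ]; then
        DOCKER_CERT_PATH=/etc/docker
    fi

    # If you want to add your own registry to be used for docker search and docker
    # pull use the ADD_REGISTRY option to list a set of registries, each prepended
    # with --add-registry flag. The first registry added will be the first registry
    # searched.
    #ADD_REGISTRY='--add-registry registry.access.redhat.com'

    # If you want to block registries from being used, uncomment the BLOCK_REGISTRY
    # option and give it a set of registries, each prepended with --block-registry
    # flag.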
    # For example adding docker.io will stop users from downloading images
    # from docker.io
    # BLOCK_REGISTRY='--block-registry'

    # If you have a registry secured with https but do not have proper certs
    # distributed, you can tell docker to not look for full authorization by
    # adding the registry to the INSECURE_REGISTRY line and uncommenting it.
    INSECURE_REGISTRY='--insecure-registry registry.jf.com:5000'

    # On an SELinux system, if you remove the --selinux-enabled option, you
    # also need to turn on the docker_transition_unconfined boolean.
    # setsebool -P docker_transition_unconfined 1

    # Location used for temporary files, such as those created by
    # docker load and build operations. Default is /var/lib/docker/tmp
    # Can be overriden by setting the following environment variable.
    # DOCKER_TMPDIR=/var/tmp

    # Controls the /etc/cron.daily/docker-logrotate cron job status.
    # To disable, uncomment the line below.
    # LOGROTATE=false
    #

    # docker-latest daemon can be used by starting the docker-latest unitfile.
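    # To use docker-latest client, uncomment below lines
    #DOCKERBINARY=/usr/bin/docker-latest
    #DOCKER_CONTAINERD_BINARY=/usr/bin/docker-containerd-latest
    #DOCKER_CONTAINERD_SHIM_BINARY=/usr/bin/docker-containerd-shim-latest

Edit /etc/hosts and add an alias for the image registry.

Restart the docker service and start the docker-distribution service:

    systemctl restart docker
    systemctl start docker-distribution

**Chapter 11 Installing Keepalived**

Keepalived is a high-availability solution for web services based on the VRRP protocol; it can be used to avoid single points of failure. Keepalived is installed on several nodes. The other nodes provide the real service, and to the outside they likewise appear as a single virtual IP. When the master server goes down, a backup server takes over the virtual IP and keeps providing the service, which is how high availability is achieved.

11.1 Environment

1. IP address of the master Keepalived server
2. IP addresses of the backup Keepalived servers (currently one master and two backups)
3. Keepalived virtual IP address (make sure the virtual IP does not conflict with any other host IP)

Software download address: http://www.keepalived.org/software/keepalived-1.1.20.tar.gz

11.2 Installation procedure

11.2.1 Upload Keepalived to the /home/ directory

11.2.2 Unpack the Keepalived software

    [root@localhost home]# tar -zxvf keepalived-1.1.20.tar.gz
    [root@localhost home]# cd keepalived-1.1.20
    [root@localhost keepalived-1.1.20]# ln -s /usr/src/kernels/2.6.9-78.EL-i686/ /usr/src/linux
    [root@localhost keepalived-1.1.20]# ./configure

11.2.3 Compile and install

    [root@localhost keepalived-1.1.20]# make && make install

11.2.4 Copy the configuration files to the system paths

    [root@localhost keepalived-1.1.20]# cp /usr/local/etc/rc.d/init.d/keepalived /etc/rc.d/init.d/
    [root@localhost keepalived-1.1.20]# cp /usr/local/etc/sysconfig/keepalived /etc/sysconfig/
    [root@localhost keepalived-1.1.20]# mkdir /etc/keepalived
    [root@localhost keepalived-1.1.20]# cp /usr/local/etc/keepalived/keepalived.conf /etc/keepalived/
    [root@localhost keepalived-1.1.20]# cp /usr/local/sbin/keepalived /usr/sbin/

11.2.5 Register as a service and enable start at boot

11.2.6 Master keepalived configuration

Edit the configuration file:

    vi /etc/keepalived/keepalived.conf

state: only two states exist, MASTER and BACKUP, and they must be written in upper case. MASTER is the working state, BACKUP the standby state.

interface: the NIC to bind to; fill it in according to the NICs of the machine.

virtual_router_id: the virtual router identifier. The MASTER and BACKUP of the same vrrp_instance must use the same virtual_router_id.

priority: the priority. Within the same vrrp_instance, the MASTER must have a higher priority than the BACKUP.

advert_int 1: the interval, in seconds, between synchronization checks of the MASTER and BACKUP load balancers.

authentication: contains the authentication type and the authentication password. The main types are PASS and AH; PASS is the one normally used.

virtual_ipaddress: the virtual IP address. Several addresses may be given, one per line; no subnet mask is needed.

11.2.7 Backup keepalived configuration

11.2.8 Start the service

11.3 Verification test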
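From another host, connect to the virtual IP over ssh and check that the connection arrives at the IP address of the master Keepalived server.

**Chapter 12 Problems encountered during setup**

1. After an etcd host goes down and etcd is reinstalled on it, etcd fails to start.

On one of the old members that is still running, remove the original member and then add the new one. After installation, update the configuration and start etcd. If etcd was started before the old member had been removed, the data under the data_dir directory must be deleted. In the configuration file, set ETCD_INITIAL_CLUSTER_STATE="existing".

2. Error when starting calico:

    Jun 5 19:58:10 WLWJFX7 systemd: Starting calicoctl node...
    Jun 5 19:58:11 WLWJFX7 calicoctl: Invalid ETCD_AUTHORITY. Address must take the form <address>:<port>. Value
    Jun 5 19:58:11 WLWJFX7 calicoctl: provided is
    Jun 5 19:58:11 WLWJFX7 calicoctl: 'http://10.255.224.96:2379,http://10.255.224.97:2379,http://10.255.224.98:2379'
    Jun 5 19:58:11 WLWJFX7 systemd: calico-node.service: main process exited, code=exited, status=1/FAILURE

Solution: run find . -name '*calico*', delete the files that contain ETCD_AUTHORITY, and restart calico.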
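Before deleting anything for the calico problem, it can help to list exactly which files still mention ETCD_AUTHORITY. The sketch below uses `grep -rl` for that; the function name `find_etcd_authority` and the throwaway demo directory are assumptions for illustration only, and on a real host the directory to search would be wherever the calico unit and environment files live:

```shell
#!/bin/sh
# Hypothetical helper: list files under a directory that reference ETCD_AUTHORITY.
find_etcd_authority() {
    dir=$1
    grep -rl 'ETCD_AUTHORITY' "$dir" 2>/dev/null | sort
}

# Self-contained demonstration in a temporary directory (not real system paths).
demo_dir=$(mktemp -d)
printf 'Environment=ETCD_AUTHORITY=master:2379\n' > "$demo_dir/calico-node.service"
printf 'Environment=ETCD_ENDPOINTS=http://master:2379\n' > "$demo_dir/calico-node.service.new"
find_etcd_authority "$demo_dir"   # only the first file is listed
rm -rf "$demo_dir"
```

Reviewing the list first makes it less likely that a file which only matches *calico* by name, but is otherwise still needed, gets removed.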