I. Symptoms of a Docker container that fails to start:
1. The container status cycles through "Restarting"; check it with:
$ docker ps -a
CONTAINER ID   IMAGE                                       COMMAND                  CREATED      STATUS                                  PORTS   NAMES
21c09be88c11   docker.xxxx.cn:5000/xxx-tes/xxx_tes:1.0.6   "/usr/local/tomcat..."   9 days ago   Restarting (1) Less than a second ago           xxx10
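When a host runs many containers, the restarting ones can be picked out automatically. The following is a minimal sketch, not part of the original procedure: the helper name restarting_containers is hypothetical, and it assumes its input is "name<TAB>status" lines such as those produced by docker ps with a --format template.

```shell
# Print the names of containers whose status starts with "Restarting".
# Reads "name<TAB>status" lines on stdin, for example from:
#   docker ps -a --format '{{.Names}}\t{{.Status}}'
# (restarting_containers is a hypothetical helper name.)
restarting_containers() {
    awk -F '\t' '$2 ~ /^Restarting/ { print $1 }'
}
```

Typical use would be: docker ps -a --format '{{.Names}}\t{{.Status}}' | restarting_containers. For a quick look without a helper, docker ps -a --filter status=restarting lists the same containers.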
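2. The Docker logs show obvious errors (examples appear in section 2.1).
II. Possible causes of the startup failure:
2.1. Insufficient memory
Starting the Docker container needs at least 2 GB of memory, so first run free -mh to check whether enough memory is left.
Check memory directly
$ free -mh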
              total        used        free      shared  buff/cache   available
Mem:            15G         14G        627M        195M        636M        726M
Swap:            0B          0B          0B
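The same check can be scripted so that a deployment stops early when memory is short. This is a sketch under assumptions: the helper names mem_available_mib and enough_memory are hypothetical, the input is /proc/meminfo-style text, and the kernel exposes a MemAvailable line (it is the source of the "available" column shown by free above).

```shell
# Print MemAvailable in MiB from /proc/meminfo-style text on stdin.
# (Both function names are hypothetical helpers for this article.)
mem_available_mib() {
    awk '/^MemAvailable:/ { print int($2 / 1024) }'
}

# Succeed if at least $1 MiB is available; reads meminfo text on stdin.
enough_memory() {
    avail=$(mem_available_mib)
    [ "${avail:-0}" -ge "$1" ]
}
```

For example, enough_memory 2048 < /proc/meminfo || echo "less than 2 GiB available" would flag the machine shown above, which has only 726M available.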
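Analyze the logs
Sometimes memory is exhausted only for an instant and some processes are killed. Afterwards memory looks sufficient, yet the container still restarts repeatedly. In that case, use the Docker logs and the system logs for further analysis:
Analyze the Docker logs
The Docker logs contain the out-of-memory messages, but they are not at the very end, so you have to page through carefully to find them.
$ docker logs [container name/container ID] 2>&1 | less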
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000769990000, 1449590784, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 1449590784 bytes for committing reserved memory.
# An error report file with more information is saved as:
# //hs_err_pid1.log
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000769990000, 1449590784, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 1449590784 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /tmp/hs_err_pid1.log
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000769990000, 1449590784, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 1449590784 bytes for committing reserved memory.
# Can not save log file, dump to screen..
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 1449590784 bytes for committing reserved memory.
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (os_linux.cpp:2756), pid=1, tid=140325689620224
#
# JRE version: (7.0_79-b15) (build )
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode linux-amd64 compressed oops)
# Core dump written. Default location: //core or core.1
#
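Instead of paging through the log by hand, the failed allocation sizes can be pulled out and converted to MiB. The sketch below is an illustration only: failed_alloc_mib is a hypothetical helper name, and it assumes the log lines use the "failed to allocate N bytes" wording shown above.

```shell
# Extract the byte counts of failed JVM allocations from log text on stdin,
# de-duplicate them, and print each one in MiB (rounded down).
# (failed_alloc_mib is a hypothetical helper name.)
failed_alloc_mib() {
    sed -n 's/.*failed to allocate \([0-9][0-9]*\) bytes.*/\1/p' |
        sort -un |
        awk '{ print int($1 / 1048576) }'
}
```

Run as docker logs [container name/container ID] 2>&1 | failed_alloc_mib. For the log above, 1449590784 bytes is about 1382 MiB, roughly twice the 726M that free reported as available, which is why the JVM cannot start.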
Analyze the system logs
The system log contains many records of processes being killed because memory ran out.
$ grep -i 'Out of Memory' /var/log/messages
Apr 7 10:04:02 centos106 kernel: Out of memory: Kill process 1192 (java) score 54 or sacrifice child
Apr 7 10:08:00 centos106 kernel: Out of memory: Kill process 2301 (java) score 54 or sacrifice child
Apr 7 10:09:59 centos106 kernel: Out of memory: Kill process 28145 (java) score 52 or sacrifice child
Apr 7 10:20:40 centos106 kernel: Out of memory: Kill process 2976 (java) score 54 or sacrifice child
Apr 7 10:21:08 centos106 kernel: Out of memory: Kill process 3577 (java) score 47 or sacrifice child
Apr 7 10:21:08 centos106 kernel: Out of memory: Kill process 3631 (java) score 47 or sacrifice child
Apr 7 10:21:08 centos106 kernel: Out of memory: Kill process 3634 (java) score 47 or sacrifice child
Apr 7 10:21:08 centos106 kernel: Out of memory: Kill process 3640 (java) score 47 or sacrifice child
Apr 7 10:21:08 centos106 kernel: Out of memory: Kill process 3654 (java) score 47 or sacrifice child
Apr 7 10:27:27 centos106 kernel: Out of memory: Kill process 6998 (java) score 51 or sacrifice child
Apr 7 10:27:28 centos106 kernel: Out of memory: Kill process 7027 (java) score 52 or sacrifice child
Apr 7 10:28:10 centos106 kernel: Out of memory: Kill process 7571 (java) score 42 or sacrifice child
Apr 7 10:28:10 centos106 kernel: Out of memory: Kill process 7586 (java) score 42 or sacrifice child
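With this many entries it helps to summarize which programs the kernel is killing. A possible sketch follows; oom_kills_by_process is a hypothetical helper name, and it assumes the "Out of memory: Kill process PID (name)" message format shown above, which can differ between kernel versions.

```shell
# Count OOM-killer victims per process name from syslog text on stdin.
# Prints "count name" lines, most frequent first.
# (oom_kills_by_process is a hypothetical helper name.)
oom_kills_by_process() {
    awk '
        tolower($0) ~ /out of memory: kill process/ {
            # The process name is the first parenthesized token on the line.
            if (match($0, /\([^)]*\)/)) {
                name = substr($0, RSTART + 1, RLENGTH - 2)
                count[name]++
            }
        }
        END { for (n in count) print count[n], n }
    ' | sort -rn
}
```

Usage: oom_kills_by_process < /var/log/messages. In the excerpt above every victim is java, which points at the JVM processes as the ones competing for memory.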
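2.2. Port conflict
The port the container listens on is already taken by another process. This usually happens with a newly deployed service, or when a new backend service is deployed on an existing machine, so check whether the port is in use before deploying. If the conflict is found only after go-live, switch to a free port and restart the container.
Check command: $ netstat -nltp | grep [planned port number]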
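A plain grep also matches substrings: searching for 8080 reports a listener on 18080 as well. The sketch below compares the port exactly. It is only an illustration: port_in_use is a hypothetical helper name, and it assumes netstat -nlt style input in which the fourth column is the local address.

```shell
# Succeed if some TCP listener in `netstat -nlt`-style output on stdin
# is bound to exactly the port given as $1.
# (port_in_use is a hypothetical helper name.)
port_in_use() {
    awk -v port="$1" '
        $1 ~ /^tcp/ {
            # The port is the text after the last ":" in the local address.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}
```

Usage: netstat -nlt | port_in_use 8080 && echo "port 8080 is already taken". The exit status makes it easy to call from a deployment script before the container is started.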
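III. Countermeasures
3.1. Countermeasures for insufficient memory:
Countermeasure 1:
3.1.1 After running for a long time, the saltstack minion may hold a large amount of memory and needs to be restarted. The restart command sometimes has no effect, so check the running state afterwards and, if the old process did not stop, restart it again;
Countermeasure 2:
3.1.2 If the ELK log collector or another java process uses too much memory, investigate with top and ps, establish carefully what each process does, and stop it only once you are sure the business is not affected;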
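Countermeasure 3:
Release the memory held by buff/cache:
$ sync #write dirty data from memory to disk
$ echo 3 > /proc/sys/vm/drop_caches #drop the page cache, dentries and inodes (requires root)
Countermeasure 4:
Sometimes the shortage is not caused by a large buff/cache: the memory really is consumed by many necessary processes. That has to be solved at the level of machine resource allocation.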
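To decide which processes are worth investigating with top and ps, it can help to rank them by resident memory first. This is a rough sketch, not a prescribed procedure: top_memory is a hypothetical helper name, and it assumes "pid rss_kib command" input lines, as printed by ps -eo pid=,rss=,comm=.

```shell
# Print the $1 largest processes by resident memory from
# "pid rss_kib command" lines on stdin, as "pid rss_mib command".
# Only the first word of the command name is kept.
# (top_memory is a hypothetical helper name.)
top_memory() {
    sort -k2,2nr | head -n "$1" |
        awk '{ printf "%s %d %s\n", $1, $2 / 1024, $3 }'
}
```

Usage: ps -eo pid=,rss=,comm= | top_memory 5. The list only shows where the memory goes; whether a process may be stopped or restarted still has to be judged case by case.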
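3.2 Countermeasures for port conflicts
Countermeasure 1:
As described in 2.2, this usually happens with a newly deployed service or a new backend service on an existing machine, so check whether the port is already in use before deploying. If the conflict is found only after go-live, switch to a free port and restart the container.
Check command: $ netstat -nltp | grep [planned port number]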