MongoDB MMS Installation

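MMS (MongoDB Management Service, formerly MongoDB Monitoring Service) is MongoDB's official service for monitoring, backing up, and automating the deployment of MongoDB instances. After version 1.6 it was renamed Ops Manager, although the tarball name, process name, and so on are still mongodb-mms.
Architecturally, MMS/Ops Manager consists of two parts: the Web Service and the Agent.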

MMS Web Service Installation

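You can use the hosted service on the official site directly, or install the service locally from the tarball that MongoDB provides. Only the local deployment (On-Premise) is covered below.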

On-Premise

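Downloading the tarball requires filling in some personal information. Afterwards, the MongoDB Greater China team periodically sends emails with news, occasionally including slides worth reading.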

wget https://downloads.mongodb.com/on-prem-mms/tar/mongodb-mms-2.0.7.372-1.x86_64.tar.gz
tar xvf mongodb-mms-2.0.7.372-1.x86_64.tar.gz

For the installation procedure, see the installation documentation.

MMS Agent Installation

Once the Web Service is installed, the Settings -> Agents page shows instructions for installing the Agent. You can choose to install only the Monitoring Agent.

Problems Encountered

MMS fails to start after a node in its backing Replica Set (which includes an Arbiter) goes down

The pre-flight check reports the following error:

An unexpected error occurred during pre-flight checks: Unable to provision, see the following errors:
1) Error injecting constructor, com.mongodb.MongoException$Network: Operation on server 10.13.42.19:27017 failed
at com.xgen.svc.core.AppSettings.<init>(AppSettings.java:125)
at com.xgen.svc.core.AppSettings.class(AppSettings.java:45)
while locating com.xgen.svc.core.AppSettings
for parameter 0 at com.xgen.svc.core.dao.mongo.MongoSvcProvider.<init>(MongoSvcProvider.java:23)
at com.xgen.svc.core.dao.mongo.MongoSvcProvider.class(MongoSvcProvider.java:19)
while locating com.xgen.svc.core.dao.mongo.MongoSvcProvider
while locating com.xgen.svc.core.dao.mongo.MongoSvc
1 error
com.google.inject.ProvisionException: Unable to provision, see the following errors:
1) Error injecting constructor, com.mongodb.MongoException$Network: Operation on server 10.13.42.19:27017 failed
at com.xgen.svc.core.AppSettings.<init>(AppSettings.java:125)
at com.xgen.svc.core.AppSettings.class(AppSettings.java:45)
while locating com.xgen.svc.core.AppSettings
for parameter 0 at com.xgen.svc.core.dao.mongo.MongoSvcProvider.<init>(MongoSvcProvider.java:23)
at com.xgen.svc.core.dao.mongo.MongoSvcProvider.class(MongoSvcProvider.java:19)
while locating com.xgen.svc.core.dao.mongo.MongoSvcProvider
while locating com.xgen.svc.core.dao.mongo.MongoSvc
1 error
at com.google.inject.internal.InjectorImpl$2.get(InjectorImpl.java:1025)
at com.google.inject.internal.InjectorImpl.getInstance(InjectorImpl.java:1051)
at com.mycila.inject.jsr250.Jsr250InjectorImpl.getInstance(Jsr250InjectorImpl.java:123)
at com.xgen.svc.core.PreFlightCheck.performChecks(PreFlightCheck.java:90)
at com.xgen.svc.core.PreFlightCheck.main(PreFlightCheck.java:150)
Caused by: com.mongodb.MongoException$Network: Operation on server 10.13.42.19:27017 failed
at com.mongodb.DBTCPConnector.doOperation(DBTCPConnector.java:215)
at com.mongodb.DBCollectionImpl.writeWithCommandProtocol(DBCollectionImpl.java:566)
at com.mongodb.DBCollectionImpl.updateWithCommandProtocol(DBCollectionImpl.java:561)
at com.mongodb.DBCollectionImpl.updateImpl(DBCollectionImpl.java:288)
at com.mongodb.DBCollection.update(DBCollection.java:250)
at com.mongodb.DBCollection.update(DBCollection.java:232)
at com.mongodb.DBCollection.save(DBCollection.java:1223)
at com.xgen.svc.core.dao.system.AppPropertyDao.set(AppPropertyDao.java:54)
at com.xgen.svc.core.AppSettings.recordInstanceOverrides(AppSettings.java:366)

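The log shows that an update operation timed out. An attempt to reproduce this offline by writing data to the corresponding MongoDB replica set succeeded without error, so the problem could not be reproduced.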

Capture packets on the Primary with mongosniff:

sudo ./mongosniff --source NET eth0 27017 > sniff.result

The capture shows that there is indeed an update request that never received a reply:

mongodb-mms:43119 -->> mongodb-rs-primary:27017 cloudconf.$cmd 21390 bytes id:1 1
query: { update: "config.appState", ordered: true, writeConcern: { w: 2 }, updates: [ { q: { _id:

The parameter worth noting is writeConcern: { w: 2 }, so the write concern used inside mongodb-mms is presumably:

WriteConcern writeConcern = new WriteConcern("majority", 2000, false, false);

Writing data offline again with the same write concern reproduces the timeout.
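
The timeout is consistent with how write acknowledgements are counted: only reachable, data-bearing members can acknowledge a write, and an arbiter votes in elections but stores no data. The sketch below is a simplified Java model of that counting, not the MongoDB server's or the Java driver's actual implementation. The names `WriteConcernModel`, `Member`, `majority`, and `canSatisfy` are hypothetical, and the three-member layout (primary, secondary, arbiter, all voting, with the secondary down) is an assumption about the replica set described above.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of write-concern acknowledgement counting.
// Hypothetical names; this is NOT MongoDB server or driver code.
final class WriteConcernModel {

    /** One replica set member. Arbiters vote but store no data. */
    static final class Member {
        final String name;
        final boolean arbiter;
        final boolean up;

        Member(String name, boolean arbiter, boolean up) {
            this.name = name;
            this.arbiter = arbiter;
            this.up = up;
        }
    }

    /** Majority of voting members: floor(n / 2) + 1. */
    static int majority(int votingMembers) {
        return votingMembers / 2 + 1;
    }

    /** Members able to acknowledge a write: reachable and data-bearing. */
    static long acknowledgers(List<Member> members) {
        return members.stream().filter(m -> m.up && !m.arbiter).count();
    }

    /** True if a numeric write concern w can ever be met by this set. */
    static boolean canSatisfy(List<Member> members, int w) {
        return acknowledgers(members) >= w;
    }

    public static void main(String[] args) {
        // Assumed layout: primary + secondary + arbiter, secondary is down.
        List<Member> rs = Arrays.asList(
            new Member("primary", false, true),
            new Member("secondary", false, false),
            new Member("arbiter", true, true));

        int w = majority(rs.size());
        System.out.println("majority = " + w);
        System.out.println("w=1 satisfiable: " + canSatisfy(rs, 1));
        System.out.println("w=" + w + " satisfiable: " + canSatisfy(rs, w));
    }
}
```

Under these assumptions the model reports a majority of 2, that w=1 can be satisfied, and that w=2 cannot. This matches the observations: a plain write succeeds, while a write with w: 2 waits until the timeout expires.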
