mybatis 的缓存是怎么回事
Java 真的是任何一个中间件,比较常用的那种,都有很多内容值得深挖,比如这个缓存,慢慢有过一些感悟,比如如何提升性能,缓存无疑是一大重要手段,最底层开始 CPU 就有缓存,而且又小又贵,再往上一点内存一般作为硬盘存储在运行时的存储,一般在代码里也会用内存作为一些本地缓存,譬如数据库,像 mysql 这种也是有innodb_buffer_pool来提升查询效率,本质上理解就是用更快的存储作为相对慢存储的缓存,减少查询直接访问较慢的存储,并且这个都是相对的,比起 cpu 的缓存,那内存也是渣,但是与普通机械硬盘相比,那也是两个次元的水平。
闲扯这么多来说说 mybatis 的缓存,mybatis 一般作为一个轻量级的 orm 使用,相对应的就是比较重量级的 hibernate,不过不在这次讨论范围,上一次是主要讲了 mybatis 在解析 sql 过程中,对于两种占位符的不同替换实现策略,这次主要聊下 mybatis 的缓存,前面其实得了解下前置的东西,比如 sqlsession,先当做我们知道 sqlsession 是个什么玩意,可能或多或少的知道 mybatis 是有两级缓存,
一级缓存
第一级的缓存是在 BaseExecutor 中的 PerpetualCache,它是个最基本的缓存实现类,使用了 HashMap 实现缓存功能,代码其实没几十行1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65public class PerpetualCache implements Cache {
private final String id;
private final Map<Object, Object> cache = new HashMap<>();
public PerpetualCache(String id) {
this.id = id;
}
public String getId() {
return id;
}
public int getSize() {
return cache.size();
}
public void putObject(Object key, Object value) {
cache.put(key, value);
}
public Object getObject(Object key) {
return cache.get(key);
}
public Object removeObject(Object key) {
return cache.remove(key);
}
public void clear() {
cache.clear();
}
public boolean equals(Object o) {
if (getId() == null) {
throw new CacheException("Cache instances require an ID.");
}
if (this == o) {
return true;
}
if (!(o instanceof Cache)) {
return false;
}
Cache otherCache = (Cache) o;
return getId().equals(otherCache.getId());
}
public int hashCode() {
if (getId() == null) {
throw new CacheException("Cache instances require an ID.");
}
return getId().hashCode();
}
}
可以看一下BaseExecutor 的构造函数1
2
3
4
5
6
7
8
9protected BaseExecutor(Configuration configuration, Transaction transaction) {
this.transaction = transaction;
this.deferredLoads = new ConcurrentLinkedQueue<>();
this.localCache = new PerpetualCache("LocalCache");
this.localOutputParameterCache = new PerpetualCache("LocalOutputParameterCache");
this.closed = false;
this.configuration = configuration;
this.wrapper = this;
}
就是把 PerpetualCache 作为 localCache,然后怎么使用我看简单看一下,BaseExecutor 的查询首先是调用这个函数1
2
3
4
5
6
public <E> List<E> query(MappedStatement ms, Object parameter, RowBounds rowBounds, ResultHandler resultHandler) throws SQLException {
BoundSql boundSql = ms.getBoundSql(parameter);
CacheKey key = createCacheKey(ms, parameter, rowBounds, boundSql);
return query(ms, parameter, rowBounds, resultHandler, key, boundSql);
}
可以看到首先是调用了 createCacheKey 方法,这个方法呢,先不看怎么写的,如果我们自己要实现这么个缓存,首先这个缓存 key 的设计也是个问题,如果是以表名加主键作为 key,那么分页查询,或者没有主键的时候就不行,来看看 mybatis 是怎么设计的1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
public CacheKey createCacheKey(MappedStatement ms, Object parameterObject, RowBounds rowBounds, BoundSql boundSql) {
if (closed) {
throw new ExecutorException("Executor was closed.");
}
CacheKey cacheKey = new CacheKey();
cacheKey.update(ms.getId());
cacheKey.update(rowBounds.getOffset());
cacheKey.update(rowBounds.getLimit());
cacheKey.update(boundSql.getSql());
List<ParameterMapping> parameterMappings = boundSql.getParameterMappings();
TypeHandlerRegistry typeHandlerRegistry = ms.getConfiguration().getTypeHandlerRegistry();
// mimic DefaultParameterHandler logic
for (ParameterMapping parameterMapping : parameterMappings) {
if (parameterMapping.getMode() != ParameterMode.OUT) {
Object value;
String propertyName = parameterMapping.getProperty();
if (boundSql.hasAdditionalParameter(propertyName)) {
value = boundSql.getAdditionalParameter(propertyName);
} else if (parameterObject == null) {
value = null;
} else if (typeHandlerRegistry.hasTypeHandler(parameterObject.getClass())) {
value = parameterObject;
} else {
MetaObject metaObject = configuration.newMetaObject(parameterObject);
value = metaObject.getValue(propertyName);
}
cacheKey.update(value);
}
}
if (configuration.getEnvironment() != null) {
// issue #176
cacheKey.update(configuration.getEnvironment().getId());
}
return cacheKey;
}
首先需要 id,这个 id 是 mapper 里方法的 id, 然后是偏移量跟返回行数,再就是 sql,然后是参数,基本上是会有影响的都加进去了,在这个 update 里面1
2
3
4
5
6
7
8
9
10
11public void update(Object object) {
int baseHashCode = object == null ? 1 : ArrayUtil.hashCode(object);
count++;
checksum += baseHashCode;
baseHashCode *= count;
hashcode = multiplier * hashcode + baseHashCode;
updateList.add(object);
}
其实是一个 hash 转换,具体不纠结,就是提高特异性,然后回来就是继续调用 query1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
public <E> List<E> query(MappedStatement ms, Object parameter, RowBounds rowBounds, ResultHandler resultHandler, CacheKey key, BoundSql boundSql) throws SQLException {
ErrorContext.instance().resource(ms.getResource()).activity("executing a query").object(ms.getId());
if (closed) {
throw new ExecutorException("Executor was closed.");
}
if (queryStack == 0 && ms.isFlushCacheRequired()) {
clearLocalCache();
}
List<E> list;
try {
queryStack++;
list = resultHandler == null ? (List<E>) localCache.getObject(key) : null;
if (list != null) {
handleLocallyCachedOutputParameters(ms, key, parameter, boundSql);
} else {
list = queryFromDatabase(ms, parameter, rowBounds, resultHandler, key, boundSql);
}
} finally {
queryStack--;
}
if (queryStack == 0) {
for (DeferredLoad deferredLoad : deferredLoads) {
deferredLoad.load();
}
// issue #601
deferredLoads.clear();
if (configuration.getLocalCacheScope() == LocalCacheScope.STATEMENT) {
// issue #482
clearLocalCache();
}
}
return list;
}
可以看到是先从 localCache 里取,取不到再 queryFromDatabase,其实比较简单,这是一级缓存,考虑到 sqlsession 跟 BaseExecutor 的关系,其实是随着 sqlsession 来保证这个缓存不会出现脏数据幻读的情况,当然事务相关的后面可能再单独聊。
二级缓存
其实这个一级二级顺序有点反过来,其实查询的是先走的二级缓存,当然二级的需要配置开启,默认不开,
需要通过1
<setting name="cacheEnabled" value="true"/>
来开启,然后我们的查询就会走到1
2
3
4public class CachingExecutor implements Executor {
private final Executor delegate;
private final TransactionalCacheManager tcm = new TransactionalCacheManager();
这个 Executor 中,这里我放了类里面的元素,发现没有一个 Cache 类,这就是一个特点了,往下看查询过程1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
public <E> List<E> query(MappedStatement ms, Object parameterObject, RowBounds rowBounds, ResultHandler resultHandler) throws SQLException {
BoundSql boundSql = ms.getBoundSql(parameterObject);
CacheKey key = createCacheKey(ms, parameterObject, rowBounds, boundSql);
return query(ms, parameterObject, rowBounds, resultHandler, key, boundSql);
}
public <E> List<E> query(MappedStatement ms, Object parameterObject, RowBounds rowBounds, ResultHandler resultHandler, CacheKey key, BoundSql boundSql)
throws SQLException {
Cache cache = ms.getCache();
if (cache != null) {
flushCacheIfRequired(ms);
if (ms.isUseCache() && resultHandler == null) {
ensureNoOutParams(ms, boundSql);
List<E> list = (List<E>) tcm.getObject(cache, key);
if (list == null) {
list = delegate.query(ms, parameterObject, rowBounds, resultHandler, key, boundSql);
tcm.putObject(cache, key, list); // issue #578 and #116
}
return list;
}
}
return delegate.query(ms, parameterObject, rowBounds, resultHandler, key, boundSql);
}
看到没,其实缓存是从 tcm 这个成员变量里取,而这个是什么呢,事务性缓存(直译下),因为这个其实是用 MappedStatement 里的 Cache 作为key 从 tcm 的 map 取出来的1
2
3public class TransactionalCacheManager {
private final Map<Cache, TransactionalCache> transactionalCaches = new HashMap<>();
MappedStatement是被全局使用的,所以其实二级缓存是跟着 mapper 的 namespace 走的,可以被多个 CachingExecutor 获取到,就会出现线程安全问题,线程安全问题可以用SynchronizedCache来解决,就是加锁,但是对于事务中的脏读,使用了TransactionalCache来解决这个问题,1
2
3
4
5
6
7
8public class TransactionalCache implements Cache {
private static final Log log = LogFactory.getLog(TransactionalCache.class);
private final Cache delegate;
private boolean clearOnCommit;
private final Map<Object, Object> entriesToAddOnCommit;
private final Set<Object> entriesMissedInCache;
在事务还没提交的时候,会把中间状态的数据放在 entriesToAddOnCommit 中,只有在提交后会放进共享缓存中,1
2
3
4
5
6
7public void commit() {
if (clearOnCommit) {
delegate.clear();
}
flushPendingEntries();
reset();
}