Java 真的是任何一个中间件,比较常用的那种,都有很多内容值得深挖,比如这个缓存,慢慢有过一些感悟,比如如何提升性能,缓存无疑是一大重要手段,最底层开始 CPU 就有缓存,而且又小又贵,再往上一点内存一般作为硬盘存储在运行时的存储,一般在代码里也会用内存作为一些本地缓存,譬如数据库,像 mysql 这种也是有innodb_buffer_pool来提升查询效率,本质上理解就是用更快的存储作为相对慢存储的缓存,减少查询直接访问较慢的存储,并且这个都是相对的,比起 cpu 的缓存,那内存也是渣,但是与普通机械硬盘相比,那也是两个次元的水平。
闲扯这么多来说说 mybatis 的缓存,mybatis 一般作为一个轻量级的 orm 使用,相对应的就是比较重量级的 hibernate,不过不在这次讨论范围,上一次是主要讲了 mybatis 在解析 sql 过程中,对于两种占位符的不同替换实现策略,这次主要聊下 mybatis 的缓存,前面其实得了解下前置的东西,比如 sqlsession,先当做我们知道 sqlsession 是个什么玩意,可能或多或少的知道 mybatis 是有两级缓存,
一级缓存 第一级的缓存是在 BaseExecutor 中的 PerpetualCache ,它是个最基本的缓存实现类,使用了 HashMap 实现缓存功能,代码其实没几十行1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 public class PerpetualCache implements Cache { private final String id; private final Map<Object, Object> cache = new HashMap <>(); public PerpetualCache (String id) { this .id = id; } @Override public String getId () { return id; } @Override public int getSize () { return cache.size(); } @Override public void putObject (Object key, Object value) { cache.put(key, value); } @Override public Object getObject (Object key) { return cache.get(key); } @Override public Object removeObject (Object key) { return cache.remove(key); } @Override public void clear () { cache.clear(); } @Override public boolean equals (Object o) { if (getId() == null ) { throw new CacheException ("Cache instances require an ID." ); } if (this == o) { return true ; } if (!(o instanceof Cache)) { return false ; } Cache otherCache = (Cache) o; return getId().equals(otherCache.getId()); } @Override public int hashCode () { if (getId() == null ) { throw new CacheException ("Cache instances require an ID." ); } return getId().hashCode(); } }
可以看一下BaseExecutor 的构造函数1 2 3 4 5 6 7 8 9 protected BaseExecutor (Configuration configuration, Transaction transaction) { this .transaction = transaction; this .deferredLoads = new ConcurrentLinkedQueue <>(); this .localCache = new PerpetualCache ("LocalCache" ); this .localOutputParameterCache = new PerpetualCache ("LocalOutputParameterCache" ); this .closed = false ; this .configuration = configuration; this .wrapper = this ; }
就是把 PerpetualCache 作为 localCache,然后怎么使用我看简单看一下,BaseExecutor 的查询首先是调用这个函数1 2 3 4 5 6 @Override public <E> List<E> query (MappedStatement ms, Object parameter, RowBounds rowBounds, ResultHandler resultHandler) throws SQLException { BoundSql boundSql = ms.getBoundSql(parameter); CacheKey key = createCacheKey(ms, parameter, rowBounds, boundSql); return query(ms, parameter, rowBounds, resultHandler, key, boundSql); }
可以看到首先是调用了 createCacheKey 方法,这个方法呢,先不看怎么写的,如果我们自己要实现这么个缓存,首先这个缓存 key 的设计也是个问题,如果是以表名加主键作为 key,那么分页查询,或者没有主键的时候就不行,来看看 mybatis 是怎么设计的1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 @Override public CacheKey createCacheKey (MappedStatement ms, Object parameterObject, RowBounds rowBounds, BoundSql boundSql) { if (closed) { throw new ExecutorException ("Executor was closed." ); } CacheKey cacheKey = new CacheKey (); cacheKey.update(ms.getId()); cacheKey.update(rowBounds.getOffset()); cacheKey.update(rowBounds.getLimit()); cacheKey.update(boundSql.getSql()); List<ParameterMapping> parameterMappings = boundSql.getParameterMappings(); TypeHandlerRegistry typeHandlerRegistry = ms.getConfiguration().getTypeHandlerRegistry(); for (ParameterMapping parameterMapping : parameterMappings) { if (parameterMapping.getMode() != ParameterMode.OUT) { Object value; String propertyName = parameterMapping.getProperty(); if (boundSql.hasAdditionalParameter(propertyName)) { value = boundSql.getAdditionalParameter(propertyName); } else if (parameterObject == null ) { value = null ; } else if (typeHandlerRegistry.hasTypeHandler(parameterObject.getClass())) { value = parameterObject; } else { MetaObject metaObject = configuration.newMetaObject(parameterObject); value = metaObject.getValue(propertyName); } cacheKey.update(value); } } if (configuration.getEnvironment() != null ) { cacheKey.update(configuration.getEnvironment().getId()); } return cacheKey; }
首先需要 id,这个 id 是 mapper 里方法的 id, 然后是偏移量跟返回行数,再就是 sql,然后是参数,基本上是会有影响的都加进去了,在这个 update 里面1 2 3 4 5 6 7 8 9 10 11 public void update (Object object) { int baseHashCode = object == null ? 1 : ArrayUtil.hashCode(object); count++; checksum += baseHashCode; baseHashCode *= count; hashcode = multiplier * hashcode + baseHashCode; updateList.add(object); }
其实是一个 hash 转换,具体不纠结,就是提高特异性,然后回来就是继续调用 query1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 @Override public <E> List<E> query (MappedStatement ms, Object parameter, RowBounds rowBounds, ResultHandler resultHandler, CacheKey key, BoundSql boundSql) throws SQLException { ErrorContext.instance().resource(ms.getResource()).activity("executing a query" ).object(ms.getId()); if (closed) { throw new ExecutorException ("Executor was closed." ); } if (queryStack == 0 && ms.isFlushCacheRequired()) { clearLocalCache(); } List<E> list; try { queryStack++; list = resultHandler == null ? (List<E>) localCache.getObject(key) : null ; if (list != null ) { handleLocallyCachedOutputParameters(ms, key, parameter, boundSql); } else { list = queryFromDatabase(ms, parameter, rowBounds, resultHandler, key, boundSql); } } finally { queryStack--; } if (queryStack == 0 ) { for (DeferredLoad deferredLoad : deferredLoads) { deferredLoad.load(); } deferredLoads.clear(); if (configuration.getLocalCacheScope() == LocalCacheScope.STATEMENT) { clearLocalCache(); } } return list; }
可以看到是先从 localCache 里取,取不到再 queryFromDatabase,其实比较简单,这是一级缓存,考虑到 sqlsession 跟 BaseExecutor 的关系,其实是随着 sqlsession 来保证这个缓存不会出现脏数据幻读的情况,当然事务相关的后面可能再单独聊。
二级缓存 其实这个一级二级顺序有点反过来,其实查询的是先走的二级缓存,当然二级的需要配置开启,默认不开, 需要通过1 <setting name ="cacheEnabled" value ="true" />
来开启,然后我们的查询就会走到1 2 3 4 public class CachingExecutor implements Executor { private final Executor delegate; private final TransactionalCacheManager tcm = new TransactionalCacheManager ();
这个 Executor 中,这里我放了类里面的元素,发现没有一个 Cache 类,这就是一个特点了,往下看查询过程1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 @Override public <E> List<E> query (MappedStatement ms, Object parameterObject, RowBounds rowBounds, ResultHandler resultHandler) throws SQLException { BoundSql boundSql = ms.getBoundSql(parameterObject); CacheKey key = createCacheKey(ms, parameterObject, rowBounds, boundSql); return query(ms, parameterObject, rowBounds, resultHandler, key, boundSql); } @Override public <E> List<E> query (MappedStatement ms, Object parameterObject, RowBounds rowBounds, ResultHandler resultHandler, CacheKey key, BoundSql boundSql) throws SQLException { Cache cache = ms.getCache(); if (cache != null ) { flushCacheIfRequired(ms); if (ms.isUseCache() && resultHandler == null ) { ensureNoOutParams(ms, boundSql); @SuppressWarnings("unchecked") List<E> list = (List<E>) tcm.getObject(cache, key); if (list == null ) { list = delegate.query(ms, parameterObject, rowBounds, resultHandler, key, boundSql); tcm.putObject(cache, key, list); } return list; } } return delegate.query(ms, parameterObject, rowBounds, resultHandler, key, boundSql); }
看到没,其实缓存是从 tcm 这个成员变量里取,而这个是什么呢,事务性缓存(直译下),因为这个其实是用 MappedStatement 里的 Cache 作为key 从 tcm 的 map 取出来的1 2 3 public class TransactionalCacheManager { private final Map<Cache, TransactionalCache> transactionalCaches = new HashMap <>();
MappedStatement是被全局使用的,所以其实二级缓存是跟着 mapper 的 namespace 走的,可以被多个 CachingExecutor 获取到,就会出现线程安全问题,线程安全问题可以用SynchronizedCache来解决,就是加锁,但是对于事务中的脏读,使用了TransactionalCache来解决这个问题,1 2 3 4 5 6 7 8 public class TransactionalCache implements Cache { private static final Log log = LogFactory.getLog(TransactionalCache.class); private final Cache delegate; private boolean clearOnCommit; private final Map<Object, Object> entriesToAddOnCommit; private final Set<Object> entriesMissedInCache;
在事务还没提交的时候,会把中间状态的数据放在 entriesToAddOnCommit 中,只有在提交后会放进共享缓存中,1 2 3 4 5 6 7 public void commit () { if (clearOnCommit) { delegate.clear(); } flushPendingEntries(); reset(); }