Nicksxs's Blog

What hurts more, the pain of hard work or the pain of regret?

This post is no longer an introduction to a data structure; the main ones have basically all been covered. Instead it fills in some gaps: important, fundamental concepts that are easy to overlook. Many Redis source-code series skip them, and it's tempting to skip them when studying Redis yourself, because it feels like source reading is done once the main data structures and "algorithms" are understood.
Redis is mostly used as a high-performance cache, so what does a cache need to get right? First, access speed: if reads were as slow as the database, there would be no point in having it. Second, as the word "cache" implies, the data is only there to make reads faster, and in most scenarios it has to be updated and expired at some point. Which brings us to the first topic (phew).

Redis expiration strategies

How does Redis expire cached keys? Take a guess. The most naive approach would be to attach a timer to every key that has a TTL and delete the key when it fires. That expires keys most promptly, but the overhead is obviously too high. What Redis actually uses is a combination of lazy deletion and periodic deletion.
The lazy strategy checks for expiry at access time: on a get, Redis first checks whether the key has expired; if so, it deletes the key and returns nothing, otherwise it returns the value as usual.
The main code is:

```C
/* This function is called when we are going to perform some operation
 * in a given key, but such key may be already logically expired even if
 * it still exists in the database. The main way this function is called
 * is via lookupKey*() family of functions.
 *
 * The behavior of the function depends on the replication role of the
 * instance, because slave instances do not expire keys, they wait
 * for DELs from the master for consistency matters. However even
 * slaves will try to have a coherent return value for the function,
 * so that read commands executed in the slave side will be able to
 * behave like if the key is expired even if still present (because the
 * master has yet to propagate the DEL).
 *
 * In masters as a side effect of finding a key which is expired, such
 * key will be evicted from the database. Also this may trigger the
 * propagation of a DEL/UNLINK command in AOF / replication stream.
 *
 * The return value of the function is 0 if the key is still valid,
 * otherwise the function returns 1 if the key is expired. */
int expireIfNeeded(redisDb *db, robj *key) {
    if (!keyIsExpired(db,key)) return 0;

    /* If we are running in the context of a slave, instead of
     * evicting the expired key from the database, we return ASAP:
     * the slave key expiration is controlled by the master that will
     * send us synthesized DEL operations for expired keys.
     *
     * Still we try to return the right information to the caller,
     * that is, 0 if we think the key should be still valid, 1 if
     * we think the key is expired at this time. */
    if (server.masterhost != NULL) return 1;

    /* Delete the key */
    server.stat_expiredkeys++;
    propagateExpire(db,key,server.lazyfree_lazy_expire);
    notifyKeyspaceEvent(NOTIFY_EXPIRED,
        "expired",key,db->id);
    return server.lazyfree_lazy_expire ? dbAsyncDelete(db,key) :
                                         dbSyncDelete(db,key);
}

/* Check if the key is expired. */
int keyIsExpired(redisDb *db, robj *key) {
    mstime_t when = getExpire(db,key);
    mstime_t now;

    if (when < 0) return 0; /* No expire for this key */

    /* Don't expire anything while loading. It will be done later. */
    if (server.loading) return 0;

    /* If we are in the context of a Lua script, we pretend that time is
     * blocked to when the Lua script started. This way a key can expire
     * only the first time it is accessed and not in the middle of the
     * script execution, making propagation to slaves / AOF consistent.
     * See issue #1525 on Github for more information. */
    if (server.lua_caller) {
        now = server.lua_time_start;
    }
    /* If we are in the middle of a command execution, we still want to use
     * a reference time that does not change: in that case we just use the
     * cached time, that we update before each call in the call() function.
     * This way we avoid that commands such as RPOPLPUSH or similar, that
     * may re-open the same key multiple times, can invalidate an already
     * open object in a next call, if the next call will see the key expired,
     * while the first did not. */
    else if (server.fixed_time_expire > 0) {
        now = server.mstime;
    }
    /* For the other cases, we want to use the most fresh time we have. */
    else {
        now = mstime();
    }

    /* The key expired if the current (virtual or real) time is greater
     * than the expire time of the key. */
    return now > when;
}

/* Return the expire time of the specified key, or -1 if no expire
 * is associated with this key (i.e. the key is non volatile) */
long long getExpire(redisDb *db, robj *key) {
    dictEntry *de;

    /* No expire? return ASAP */
    if (dictSize(db->expires) == 0 ||
       (de = dictFind(db->expires,key->ptr)) == NULL) return -1;

    /* The entry was found in the expire dict, this means it should also
     * be present in the main dict (safety check). */
    serverAssertWithInfo(NULL,key,dictFind(db->dict,key->ptr) != NULL);
    return dictGetSignedIntegerVal(de);
}
```

A couple of points to note here. First, when deleting an expired key, the lazyfree_lazy_expire setting determines whether the deletion is synchronous (dbSyncDelete) or asynchronous (dbAsyncDelete). Second, a slave never expires keys itself: the master sends it a DEL when the key expires on the master.
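This async free mechanism (lazyfree) arrived in Redis 4.0 and is off by default; a minimal redis.conf sketch to enable it:

```
# Reclaim memory of expired keys in a background lazyfree thread
# instead of blocking the main event loop (default is "no")
lazyfree-lazy-expire yes
```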
What goes wrong with the lazy strategy alone? Keys that are never accessed again would never expire and would hold on to memory forever. That's why Redis pairs lazy expiration with a periodic expiration strategy. So how does the periodic strategy work?

```C
/* This function handles 'background' operations we are required to do
 * incrementally in Redis databases, such as active key expiring, resizing,
 * rehashing. */
void databasesCron(void) {
    /* Expire keys by random sampling. Not required for slaves
     * as master will synthesize DELs for us. */
    if (server.active_expire_enabled) {
        if (server.masterhost == NULL) {
            activeExpireCycle(ACTIVE_EXPIRE_CYCLE_SLOW);
        } else {
            expireSlaveKeys();
        }
    }

    /* Defrag keys gradually. */
    activeDefragCycle();

    /* Perform hash tables rehashing if needed, but only if there are no
     * other processes saving the DB on disk. Otherwise rehashing is bad
     * as will cause a lot of copy-on-write of memory pages. */
    if (!hasActiveChildProcess()) {
        /* We use global counters so if we stop the computation at a given
         * DB we'll be able to start from the successive in the next
         * cron loop iteration. */
        static unsigned int resize_db = 0;
        static unsigned int rehash_db = 0;
        int dbs_per_call = CRON_DBS_PER_CALL;
        int j;

        /* Don't test more DBs than we have. */
        if (dbs_per_call > server.dbnum) dbs_per_call = server.dbnum;

        /* Resize */
        for (j = 0; j < dbs_per_call; j++) {
            tryResizeHashTables(resize_db % server.dbnum);
            resize_db++;
        }

        /* Rehash */
        if (server.activerehashing) {
            for (j = 0; j < dbs_per_call; j++) {
                int work_done = incrementallyRehash(rehash_db);
                if (work_done) {
                    /* If the function did some work, stop here, we'll do
                     * more at the next cron loop. */
                    break;
                } else {
                    /* If this db didn't need rehash, we'll try the next one. */
                    rehash_db++;
                    rehash_db %= server.dbnum;
                }
            }
        }
    }
}

/* Try to expire a few timed out keys. The algorithm used is adaptive and
 * will use few CPU cycles if there are few expiring keys, otherwise
 * it will get more aggressive to avoid that too much memory is used by
 * keys that can be removed from the keyspace.
 *
 * Every expire cycle tests multiple databases: the next call will start
 * again from the next db, with the exception of exists for time limit: in that
 * case we restart again from the last database we were processing. Anyway
 * no more than CRON_DBS_PER_CALL databases are tested at every iteration.
 *
 * The function can perform more or less work, depending on the "type"
 * argument. It can execute a "fast cycle" or a "slow cycle". The slow
 * cycle is the main way we collect expired cycles: this happens with
 * the "server.hz" frequency (usually 10 hertz).
 *
 * However the slow cycle can exit for timeout, since it used too much time.
 * For this reason the function is also invoked to perform a fast cycle
 * at every event loop cycle, in the beforeSleep() function. The fast cycle
 * will try to perform less work, but will do it much more often.
 *
 * The following are the details of the two expire cycles and their stop
 * conditions:
 *
 * If type is ACTIVE_EXPIRE_CYCLE_FAST the function will try to run a
 * "fast" expire cycle that takes no longer than EXPIRE_FAST_CYCLE_DURATION
 * microseconds, and is not repeated again before the same amount of time.
 * The cycle will also refuse to run at all if the latest slow cycle did not
 * terminate because of a time limit condition.
 *
 * If type is ACTIVE_EXPIRE_CYCLE_SLOW, that normal expire cycle is
 * executed, where the time limit is a percentage of the REDIS_HZ period
 * as specified by the ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC define. In the
 * fast cycle, the check of every database is interrupted once the number
 * of already expired keys in the database is estimated to be lower than
 * a given percentage, in order to avoid doing too much work to gain too
 * little memory.
 *
 * The configured expire "effort" will modify the baseline parameters in
 * order to do more work in both the fast and slow expire cycles.
 */

#define ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP 20    /* Keys for each DB loop. */
#define ACTIVE_EXPIRE_CYCLE_FAST_DURATION 1000  /* Microseconds. */
#define ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC 25   /* Max % of CPU to use. */
#define ACTIVE_EXPIRE_CYCLE_ACCEPTABLE_STALE 10 /* % of stale keys after which
                                                   we do extra efforts. */

void activeExpireCycle(int type) {
    /* Adjust the running parameters according to the configured expire
     * effort. The default effort is 1, and the maximum configurable effort
     * is 10. */
    unsigned long
    effort = server.active_expire_effort-1, /* Rescale from 0 to 9. */
    config_keys_per_loop = ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP +
                           ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP/4*effort,
    config_cycle_fast_duration = ACTIVE_EXPIRE_CYCLE_FAST_DURATION +
                                 ACTIVE_EXPIRE_CYCLE_FAST_DURATION/4*effort,
    config_cycle_slow_time_perc = ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC +
                                  2*effort,
    config_cycle_acceptable_stale = ACTIVE_EXPIRE_CYCLE_ACCEPTABLE_STALE-
                                    effort;

    /* This function has some global state in order to continue the work
     * incrementally across calls. */
    static unsigned int current_db = 0; /* Last DB tested. */
    static int timelimit_exit = 0;      /* Time limit hit in previous call? */
    static long long last_fast_cycle = 0; /* When last fast cycle ran. */

    int j, iteration = 0;
    int dbs_per_call = CRON_DBS_PER_CALL;
    long long start = ustime(), timelimit, elapsed;

    /* When clients are paused the dataset should be static not just from the
     * POV of clients not being able to write, but also from the POV of
     * expires and evictions of keys not being performed. */
    if (clientsArePaused()) return;

    if (type == ACTIVE_EXPIRE_CYCLE_FAST) {
        /* Don't start a fast cycle if the previous cycle did not exit
         * for time limit, unless the percentage of estimated stale keys is
         * too high. Also never repeat a fast cycle for the same period
         * as the fast cycle total duration itself. */
        if (!timelimit_exit &&
            server.stat_expired_stale_perc < config_cycle_acceptable_stale)
            return;

        if (start < last_fast_cycle + (long long)config_cycle_fast_duration*2)
            return;

        last_fast_cycle = start;
    }

    /* We usually should test CRON_DBS_PER_CALL per iteration, with
     * two exceptions:
     *
     * 1) Don't test more DBs than we have.
     * 2) If last time we hit the time limit, we want to scan all DBs
     * in this iteration, as there is work to do in some DB and we don't want
     * expired keys to use memory for too much time. */
    if (dbs_per_call > server.dbnum || timelimit_exit)
        dbs_per_call = server.dbnum;

    /* We can use at max 'config_cycle_slow_time_perc' percentage of CPU
     * time per iteration. Since this function gets called with a frequency of
     * server.hz times per second, the following is the max amount of
     * microseconds we can spend in this function. */
    timelimit = config_cycle_slow_time_perc*1000000/server.hz/100;
    timelimit_exit = 0;
    if (timelimit <= 0) timelimit = 1;

    if (type == ACTIVE_EXPIRE_CYCLE_FAST)
        timelimit = config_cycle_fast_duration; /* in microseconds. */

    /* Accumulate some global stats as we expire keys, to have some idea
     * about the number of keys that are already logically expired, but still
     * existing inside the database. */
    long total_sampled = 0;
    long total_expired = 0;

    for (j = 0; j < dbs_per_call && timelimit_exit == 0; j++) {
        /* Expired and checked in a single loop. */
        unsigned long expired, sampled;

        redisDb *db = server.db+(current_db % server.dbnum);

        /* Increment the DB now so we are sure if we run out of time
         * in the current DB we'll restart from the next. This allows to
         * distribute the time evenly across DBs. */
        current_db++;

        /* Continue to expire if at the end of the cycle more than 25%
         * of the keys were expired. */
        do {
            unsigned long num, slots;
            long long now, ttl_sum;
            int ttl_samples;
            iteration++;

            /* If there is nothing to expire try next DB ASAP. */
            if ((num = dictSize(db->expires)) == 0) {
                db->avg_ttl = 0;
                break;
            }
            slots = dictSlots(db->expires);
            now = mstime();

            /* When there are less than 1% filled slots, sampling the key
             * space is expensive, so stop here waiting for better times...
             * The dictionary will be resized asap. */
            if (num && slots > DICT_HT_INITIAL_SIZE &&
                (num*100/slots < 1)) break;

            /* The main collection cycle. Sample random keys among keys
             * with an expire set, checking for expired ones. */
            expired = 0;
            sampled = 0;
            ttl_sum = 0;
            ttl_samples = 0;

            if (num > config_keys_per_loop)
                num = config_keys_per_loop;

            /* Here we access the low level representation of the hash table
             * for speed concerns: this makes this code coupled with dict.c,
             * but it hardly changed in ten years.
             *
             * Note that certain places of the hash table may be empty,
             * so we want also a stop condition about the number of
             * buckets that we scanned. However scanning for free buckets
             * is very fast: we are in the cache line scanning a sequential
             * array of NULL pointers, so we can scan a lot more buckets
             * than keys in the same time. */
            long max_buckets = num*20;
            long checked_buckets = 0;

            while (sampled < num && checked_buckets < max_buckets) {
                for (int table = 0; table < 2; table++) {
                    if (table == 1 && !dictIsRehashing(db->expires)) break;

                    unsigned long idx = db->expires_cursor;
                    idx &= db->expires->ht[table].sizemask;
                    dictEntry *de = db->expires->ht[table].table[idx];
                    long long ttl;

                    /* Scan the current bucket of the current table. */
                    checked_buckets++;
                    while(de) {
                        /* Get the next entry now since this entry may get
                         * deleted. */
                        dictEntry *e = de;
                        de = de->next;

                        ttl = dictGetSignedIntegerVal(e)-now;
                        if (activeExpireCycleTryExpire(db,e,now)) expired++;
                        if (ttl > 0) {
                            /* We want the average TTL of keys yet
                             * not expired. */
                            ttl_sum += ttl;
                            ttl_samples++;
                        }
                        sampled++;
                    }
                }
                db->expires_cursor++;
            }
            total_expired += expired;
            total_sampled += sampled;

            /* Update the average TTL stats for this database. */
            if (ttl_samples) {
                long long avg_ttl = ttl_sum/ttl_samples;

                /* Do a simple running average with a few samples.
                 * We just use the current estimate with a weight of 2%
                 * and the previous estimate with a weight of 98%. */
                if (db->avg_ttl == 0) db->avg_ttl = avg_ttl;
                db->avg_ttl = (db->avg_ttl/50)*49 + (avg_ttl/50);
            }

            /* We can't block forever here even if there are many keys to
             * expire. So after a given amount of milliseconds return to the
             * caller waiting for the other active expire cycle. */
            if ((iteration & 0xf) == 0) { /* check once every 16 iterations. */
                elapsed = ustime()-start;
                if (elapsed > timelimit) {
                    timelimit_exit = 1;
                    server.stat_expired_time_cap_reached_count++;
                    break;
                }
            }
            /* We don't repeat the cycle for the current database if there are
             * an acceptable amount of stale keys (logically expired but yet
             * not reclaimed). */
        } while ((expired*100/sampled) > config_cycle_acceptable_stale);
    }

    elapsed = ustime()-start;
    server.stat_expire_cycle_time_used += elapsed;
    latencyAddSampleIfNeeded("expire-cycle",elapsed/1000);

    /* Update our estimate of keys existing but yet to be expired.
     * Running average with this sample accounting for 5%. */
    double current_perc;
    if (total_sampled) {
        current_perc = (double)total_expired/total_sampled;
    } else
        current_perc = 0;
    server.stat_expired_stale_perc = (current_perc*0.05)+
                                     (server.stat_expired_stale_perc*0.95);
}
```

Periodic expiration comes in two flavors, fast and slow, invoked from beforeSleep() and databasesCron() respectively. The fast version has two constraints: its runtime is capped by ACTIVE_EXPIRE_CYCLE_FAST_DURATION, and it refuses to run again within twice ACTIVE_EXPIRE_CYCLE_FAST_DURATION of the previous fast cycle. These knobs also scale with the configured server.active_expire_effort, which defaults to 1 and maxes out at 10:

```C
config_cycle_fast_duration = ACTIVE_EXPIRE_CYCLE_FAST_DURATION +
                             ACTIVE_EXPIRE_CYCLE_FAST_DURATION/4*effort
```

The cycle then walks a number of DBs and samples a batch of keys that carry an expire time (these are kept in the expires dict); the batch size is controlled by:

```C
config_keys_per_loop = ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP +
                       ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP/4*effort
```

while the slow cycle's time budget is:

```C
config_cycle_slow_time_perc = ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC +
                              2*effort
timelimit = config_cycle_slow_time_perc*1000000/server.hz/100;
```

There is one extra exit condition here: if sampling shows that the current database is already at or below the acceptable percentage of expired keys, the cycle stops re-processing this db and moves on to the next one.
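To make these knobs concrete, here is a small worked example, assuming the defaults: active_expire_effort = 1 (so effort rescales to 0) and server.hz = 10:

```C
/* effort = 1 - 1 = 0, so the baselines are unchanged:
 *
 *   config_keys_per_loop          = 20 + 20/4*0     = 20 sampled keys per DB loop
 *   config_cycle_fast_duration    = 1000 + 1000/4*0 = 1000 us per fast cycle
 *   config_cycle_slow_time_perc   = 25 + 2*0        = 25% of CPU
 *   config_cycle_acceptable_stale = 10 - 0          = 10% stale keys tolerated
 *
 * and the slow cycle's budget per run is:
 *
 *   timelimit = 25 * 1000000 / 10 / 100 = 25000 us = 25 ms
 */
```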

Before Java 8 streams, sorting objects typically meant having the class implement the Comparable interface, or writing your own Comparator.

Something like this, for example.

My class is Entity:

```java
public class Entity {

    private Long id;

    private Long sortValue;

    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }

    public Long getSortValue() {
        return sortValue;
    }

    public void setSortValue(Long sortValue) {
        this.sortValue = sortValue;
    }
}
```

The Comparator:

```java
public class MyComparator implements Comparator {
    @Override
    public int compare(Object o1, Object o2) {
        Entity e1 = (Entity) o1;
        Entity e2 = (Entity) o2;
        if (e1.getSortValue() < e2.getSortValue()) {
            return -1;
        } else if (e1.getSortValue().equals(e2.getSortValue())) {
            return 0;
        } else {
            return 1;
        }
    }
}
```

And the sorting code:

```java
private static MyComparator myComparator = new MyComparator();

public static void main(String[] args) {
    List<Entity> list = new ArrayList<Entity>();
    Entity e1 = new Entity();
    e1.setId(1L);
    e1.setSortValue(1L);
    list.add(e1);
    Entity e2 = new Entity();
    e2.setId(2L);
    e2.setSortValue(null);
    list.add(e2);
    Collections.sort(list, myComparator);
}
```

Note that e2's sort value here is null. For the Comparator to run without blowing up (the unboxing in the < comparison throws a NullPointerException), you would have to add null checks and the like. Two things bothered me: I didn't want to write this MyComparator at all, and there was no clean way to filter the null sort values out of the list beforehand. So what can solve this kind of problem? It has to be said that Java is really strong in this area.
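In Java 8 this collapses into a one-liner built from Comparator.comparing, nullsFirst and naturalOrder. A minimal sketch (reusing the Entity class from above):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class NullsFirstDemo {
    public static void main(String[] args) {
        List<Entity> list = new ArrayList<>();
        // ... add entities exactly as above, including the null sortValue ...

        // null sort values are simply ordered first: no NPE,
        // and no hand-written MyComparator needed
        list.sort(Comparator.comparing(Entity::getSortValue,
                Comparator.nullsFirst(Comparator.naturalOrder())));
    }
}
```

Swap in nullsLast if the nulls should sink to the end instead.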

Let's look at how nullsFirst is implemented:

```java
final static class NullComparator<T> implements Comparator<T>, Serializable {
    private static final long serialVersionUID = -7569533591570686392L;
    private final boolean nullFirst;
    // if null, non-null Ts are considered equal
    private final Comparator<T> real;

    @SuppressWarnings("unchecked")
    NullComparator(boolean nullFirst, Comparator<? super T> real) {
        this.nullFirst = nullFirst;
        this.real = (Comparator<T>) real;
    }

    @Override
    public int compare(T a, T b) {
        if (a == null) {
            return (b == null) ? 0 : (nullFirst ? -1 : 1);
        } else if (b == null) {
            return nullFirst ? 1 : -1;
        } else {
            return (real == null) ? 0 : real.compare(a, b);
        }
    }
}
```

The core is the compare method above: it does for us exactly the null handling we would otherwise have written by hand. Pretty convenient, right? Just a small note for the record.

Practical echo tips

Lately, while doing the Docker series, I often need to get inside containers. As the previous post showed, these images generally use a Linux distribution like Ubuntu or Alpine as the base. If the image was built without swapping in a local mirror, then, thanks to certain special network conditions, installing even a simple editor like vim inside the container is painfully slow ("two thousand years later", as the anime gag goes). And editing the mirror config inside the container also requires an editor, so you land in a chicken-and-egg deadlock. A Linux guru surely has ten thousand ways out of this; all a mere mortal like me could come up with was the crude one: first cp sources.list to a backup, then echo "xxx" > sources.list. There's one snag: sources.list usually has more than one line, and a plain echo can't express that, but echo does support "\n" escapes if you pass -e. I looked up the explanation and examples with tldr, which you can install via npm install -g tldr; man or --help works just as well.
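The whole trick, as a sketch (the Aliyun mirror lines are only an example; match them to your image's distro and release):

```bash
# keep a backup of the original list first
cp /etc/apt/sources.list /etc/apt/sources.list.bak
# -e makes echo interpret \n, so a single command can write a multi-line file
echo -e "deb http://mirrors.aliyun.com/ubuntu/ focal main restricted universe\ndeb http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe" > /etc/apt/sources.list
```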

Checking an image's base distro

Another thing you need at this point, to install vim and the like, is to know which base image you're on. The uname command would actually show the host's kernel, not the container's distro; use cat /etc/issue instead.
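You don't even need a shell inside the container for this; something like the following works from the host (the container id is just a placeholder):

```bash
docker exec -it 4792455ca2ed cat /etc/issue   # prints e.g. "Ubuntu 20.04 LTS \n \l"
```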


Just jotting this down here.

Finding mirror sources

The most popular mirror in China at the moment is probably Aliyun's, but the Tsinghua, USTC, and Zhejiang University mirrors are also worth recommending; allow me to shamelessly plug my alma mater's mirror here, though it isn't fully fleshed out yet, so stay tuned.

Running the first Dockerfile

The previous post stopped at the build stage; now let's get that Dockerfile running:

```bash
docker run -d -p 80 --name static_web nicksxs/static_web \
    nginx -g "daemon off;"
```

Here -d runs the container detached; -p 80 exposes the container's port 80 to the host; the container is named static_web and uses the static_web image we built last time; and the trailing nginx -g "daemon off;" keeps nginx running in the foreground.

You can see it returns a container id, but nothing else seems to happen and we aren't attached to it. To find out how to reach the static page we wrote in the Dockerfile, let's look at the Docker processes.

It turns out Docker assigned us a random host port, 32768. Open that port in the server's firewall and see whether things match our expectations.
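For reference, two ways to look this mapping up (container name as in the run command above):

```bash
docker ps                   # the PORTS column shows e.g. 0.0.0.0:32768->80/tcp
docker port static_web 80   # prints just the host address mapped to container port 80
```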

Hmm, not quite right. Most likely Ubuntu's nginx package uses a different default working directory. Let's go into the container and look, and get familiar with the command again: docker exec -it 4792455ca2ed /bin/bash (remember to substitute your own container id). Once inside, hunt for the nginx configuration, usually under directories like /etc/nginx or /usr/local/etc, and there we find where our root directory actually is.
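One way to hunt for the document root without paging through files, sketched under the assumption of Ubuntu's stock nginx layout:

```bash
grep -R "root" /etc/nginx/sites-enabled/   # shows the root directive, e.g. /var/www/html
```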

So copy the earlier content over there and try again.

Mission accomplished, give me five ✌️

A second Dockerfile

Next, let's try something more dynamic; having written PHP before, PHP it is.
Create another directory called dynamic_web, with a src directory inside it holding an index.php whose content is:

```php
<?php
echo "Hello World!";
```

Then create a Dockerfile in the dynamic_web directory:

```Dockerfile
FROM trafex/alpine-nginx-php7:latest
COPY src/ /var/www/html
EXPOSE 80
```

Although the Dockerfile is only three lines, it deserves some explanation: the base image is not an official Docker one, for two reasons. First, the official PHP images are almost all php+apache; second, there's Alpine. Here's an excerpt of an introduction (translated from Chinese):

Alpine is a security-oriented, lightweight Linux distribution. Unlike typical distributions, it uses musl libc and busybox to shrink the system's size and runtime footprint, while offering considerably more functionality than busybox alone, which has earned it growing favor in the open-source community. Despite staying slim, Alpine ships its own package manager, apk; packages can be browsed at https://pkgs.alpinelinux.org/packages, or queried and installed directly with the apk command.
Alpine is maintained by a non-commercial organization and supports a wide range of scenarios; it is tuned for experienced/heavy Linux users and focuses on security, performance, and resource efficiency, making it an excellent production-ready base system/environment.

The Alpine Docker image inherits these strengths of the distribution: compared with other Docker images it is tiny, only about 5 MB (versus nearly 200 MB for Ubuntu-family images), yet has a very friendly package-management story. The official image comes from the docker-alpine project.

Docker officially now recommends Alpine over Ubuntu as the base image environment, which brings several benefits: faster image downloads, better image security, easier switching between hosts, and less disk usage.

On the one hand, pulling Docker images without a local registry mirror is a slog; on the other, Alpine saves disk space, which is why the majority of Docker images now use it as their base.
Now let's build it.

One more detail: this image also EXPOSEs port 80, and as before the host would map a random port to it; to save ourselves the trouble we'll pin the host port directly.
docker run -d -p 80:80 dynamic_web. But the browser can't reach it. What's going on?
We didn't read the notes for the trafex/alpine-nginx-php7:latest image: its internal service listens on port 8080, so that's the container port we should map. Start it with docker run -d -p 80:8080 dynamic_web instead.

Limiting a container's CPU usage

Now we get to play with something more interesting. Install vim and gcc in the container, then write this bit of C:

```C
#include <stdio.h>
int main(void)
{
    int i = 0;
    for(;;) i++;
    return 0;
}
```

The simplest possible infinite loop. Compile and run it inside the container:

```bash
$ gcc 1.c
$ ./a.out
```

Then check the system's CPU usage.
Inside the container, the CPU is already pegged at 100%.
Looking from outside the container, one core of the host is fully saturated as well, since it's a dual-core machine and the code is single-threaded.
So what do we do next?
Since vim and gcc are already installed in this Ubuntu container, and mindful of domestic network conditions, let's commit the container first:

```bash
docker commit -a "nick" -m "my ubuntu" f63c5607df06 my_ubuntu:v1
```

Then run it again:

```bash
docker run -it --cpus=0.1 my_ubuntu:v1 bash
```


Our code and the compiled binary are still there, which is exactly the point. Now run it once more.

The result looks like this; a bit magical, isn't it? The key is the --cpus=0.1 flag passed to run. It is built on the cgroup technology mentioned in my previous post, which isolates CPU, memory, and other resources between processes.
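You can verify what --cpus=0.1 actually sets from inside the throttled container; a sketch assuming cgroup v1:

```bash
cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us    # expect 10000
cat /sys/fs/cgroup/cpu/cpu.cfs_period_us   # expect 100000, i.e. 10000/100000 = 0.1 CPU
```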

Starting with the first Dockerfile

Earlier, to reuse the container where I had installed vim and gcc, I committed it locally with docker commit, which is somewhat like git's commit. But that's not a great way to work, since it requires manual intervention; the recommended approach is to build images from a Dockerfile:

```Dockerfile
FROM ubuntu:latest
MAINTAINER Nicksxs "nicksxs@hotmail.com"
RUN sed -i s@/archive.ubuntu.com/@/mirrors.aliyun.com/@g /etc/apt/sources.list
RUN apt-get clean
RUN apt-get update && apt install -y nginx
RUN echo 'Hi, i am in container' \
    > /usr/share/nginx/html/index.html
EXPOSE 80
```

First, what this is doing. FROM ubuntu:latest bases the image on the latest Ubuntu image; the second line records the maintainer's info; lines three and four (you'll understand if you're behind the wall) swap Ubuntu's sources for Aliyun's, otherwise the wait would be endless; then nginx is installed, a line of greeting is written into nginx's default index html, and port 80 is exposed.
Then we build our own image from this Dockerfile with sudo docker build -t="nicksxs/static_web" .; the process looks like this:


As the screenshot shows, my Dockerfile has 7 lines and the build executes exactly 7 steps, each producing a layer id that looks much like a container id. This is the important bit: Docker builds images in layers, and each line of the Dockerfile adds another layer on top. Also note the trailing . in the command: it tells Docker to look for a Dockerfile in the current directory. Once the build finishes, let's look at our local images.
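For example (repository name as in the build command):

```bash
docker images nicksxs/static_web
```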

There's our own image!
One question, though: what if the build errors out partway through? Let's try it by changing nginx to some arbitrarily wrong name, nginxx (who knows, maybe we'll get lucky and it actually exists), and building again.

The nginxx package can't be found. Does that make the image completely unusable? Not at all. As mentioned above, Docker builds layer by layer, and the first 4 steps all succeeded, so we can start a container from the last successful layer:
sudo docker run -t -i bd26f991b6c8 /bin/bash
The answer is yes, it works; it just doesn't have nginx installed.

One more thing you may have noticed: the first few steps each print Using cache. Docker caches layers during builds, which further shows that images are built from layers: with the same base and the same steps, layers can be reused. That's Docker's build cache; you can disable it by passing --no-cache to the build.
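For instance:

```bash
sudo docker build --no-cache -t="nicksxs/static_web" .
```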
