Start date: 22.6.16
OS: Ubuntu 20.04
Link: Lab Lock
Table of Contents - Lab Lock
- Foreword
- Lab content
  - Memory allocator
    - access code
  - Buffer cache
    - access code
    - result
- Summary
To summarize: in the first exercise, kalloctest, I used spinlocks and an array of linked lists; in the second, bcachetest, I used spinlocks, sleep-locks, hash buckets (hash lists), and LRU (implemented with timestamps).

References
- 《算法精解》
- "LRU算法四种实现方式介绍" (an introduction to four ways of implementing the LRU algorithm)
Memory allocator

The task is to reduce contention in the memory allocator. Originally there is a single freelist and a single lock that all eight CPUs compete for; we redesign this into eight freelists and eight locks, one pair per CPU.

One question comes up along the way: what should a CPU do when its own freelist has no free pages left? The lab handout suggests stealing one from another CPU's freelist. My implementation relies on a for loop over the array of lists (kmems[NCPU]) to find a free page; if every list is empty, we can only return 0.
I also ran into a C-language question along the way: how are the two names of a struct set?
After consulting references and testing it myself, I learned that the name before the `{` is the type name, used for defining variables, and the name after the `}` is a variable name, used directly.
// Physical memory allocator, for user processes,
// kernel stacks, page-table pages,
// and pipe buffers. Allocates whole 4096-byte pages.
#include "types.h"
#include "param.h"
#include "memlayout.h"
#include "spinlock.h"
#include "riscv.h"
#include "defs.h"
void freerange(void *pa_start, void *pa_end);
extern char end[]; // first address after kernel.
// defined by kernel.ld.
struct run {
struct run *next;
};
// we need the type name (Keme) so we can declare the array of structs (kmems[NCPU])
struct Keme{
struct spinlock lock;
struct run *freelist;
};
struct Keme kmems[NCPU];
void
kinit()
{
// set eight "kmem" for eight cpus
for(int i = 0; i < NCPU; i++){
initlock(&kmems[i].lock, "kmem");
// initially give all free pages to kmems[0]'s freelist
if (i == 0)
freerange(end, (void*)PHYSTOP);
}
}
void
freerange(void *pa_start, void *pa_end)
{
char *p;
p = (char*)PGROUNDUP((uint64)pa_start);
for(; p + PGSIZE <= (char*)pa_end; p += PGSIZE)
kfree(p);
}
// Free the page of physical memory pointed at by v,
// which normally should have been returned by a
// call to kalloc(). (The exception is when
// initializing the allocator; see kinit above.)
void
kfree(void *pa)
{
struct run *r;
if(((uint64)pa % PGSIZE) != 0 || (char*)pa < end || (uint64)pa >= PHYSTOP)
panic("kfree");
// Fill with junk to catch dangling refs.
memset(pa, 1, PGSIZE);
r = (struct run*)pa;
// get current cpu_id
push_off();
int cpu_id = cpuid();
pop_off();
// free a page from freelist of cpu_id
acquire(&kmems[cpu_id].lock);
r->next = kmems[cpu_id].freelist;
kmems[cpu_id].freelist = r;
release(&kmems[cpu_id].lock);
}
// Allocate one 4096-byte page of physical memory.
// Returns a pointer that the kernel can use.
// Returns 0 if the memory cannot be allocated.
void *
kalloc(void)
{
struct run *r;
// get current cpu_id
push_off();
int cpu_id = cpuid();
pop_off();
acquire(&kmems[cpu_id].lock); // acquire() also disables interrupts
r = kmems[cpu_id].freelist;
// if the freelist of the current cpu_id has a free page, use it
if(r){
kmems[cpu_id].freelist = r->next;
release(&kmems[cpu_id].lock);
memset((char*)r, 5, PGSIZE); // fill with junk
return (void*)r;
}
// otherwise, steal a free page from another CPU's freelist
else{
release(&kmems[cpu_id].lock);
for(int i = 0; i < NCPU; i++){
// skip our own cpu_id (we already know its freelist is empty)
if (i != cpu_id){
acquire(&kmems[i].lock);
// the last list has no free pages either, so return 0
if(i == NCPU - 1 && kmems[i].freelist == 0){
release(&kmems[i].lock);
return (void*)0;
}
// it is OK to allocate a free page by stealing it from another CPU
if(kmems[i].freelist){
struct run *to_alloc = kmems[i].freelist;
kmems[i].freelist = to_alloc->next;
release(&kmems[i].lock);
memset((char*)to_alloc, 5, PGSIZE); // fill with junk
return (void*)to_alloc;
}
// no free page in this list; release its lock before moving on
release(&kmems[i].lock);
}
}
}
// Returns 0 if the memory cannot be allocated.
return (void*)0;
}
Buffer cache

The task is to implement concurrency for the buffer cache while preserving an invariant: each buffer is used by only one CPU or process at a time. At the same time, other processes and CPUs must not all have to wait on a single cache lock (which hints at using a sleep-lock).

The main idea is to divide the 30 bufs (NBUF == 30) into 13 hash buckets, each with its own spinlock named bcache.bucket. To allocate a buf, first lock bcache, then lock the bucket that buf belongs to; when allocation finishes, unlock the bucket first and then bcache, which satisfies the invariant.

- To make the division work out, I gave the first two buckets 4 bufs each and the remaining eleven buckets 2 bufs each. This is also why the helper function get_max_buf exists.
- A question arises here: why not give every buffer its own lock? In fact each buffer already has a sleep-lock, which is taken only when the buf's data contents are used. Then why can't we add a spinlock on top of that? I haven't worked out exactly why; I suspect it relates to sections 8.1 ~ 8.3 of the book, which I haven't fully understood.
- What I do know is that accessing a buf's struct fields requires no locking/unlocking of sleeplock: buffer; only using the buf's data contents does, presumably for the same chapter-related reasons. Accessing a bucket, however, definitely needs both locks (sleeplock: cache and spinlock: bcache.bucket).
- As for why a prime number of buckets reduces collisions when accessing buckets: I suspect this involves hashtable theory, but I didn't dig deeper.
To implement LRU, I used variables such as ref_is_zero_nums and ref_is_zero[NBUK][MAXBUF] to locate the bucket number (lru_bucket) and buffer number (lru_buf) of the candidate with the oldest timestamp (lru_time_stamp). Three cases arise:

- ref_is_zero_nums == 0: no buf has refcnt == 0, so panic directly.
- ref_is_zero_nums == 1: exactly one buf has refcnt == 0, so hand it out directly.
- Otherwise, i.e. ref_is_zero_nums >= 2: at least two bufs have refcnt == 0, and among them we find the oldest (smallest) timestamp (lru_time_stamp) and record its bucket number (lru_bucket) and buffer number (lru_buf).
brelse(), bpin(), and bunpin() all work without sleeplock: cache:

- brelse() doesn't need it because bcache.head is no longer used.
- bpin() and bunpin() don't need it because accessing a buf's struct fields requires no lock at all.
- Taking sleeplock: cache in them would cause panic: sched locks.
First, some definitions:

I ran into something quite bizarre: with the 2021 version, usertests passes when run inside qemu but fails under make grade. Stranger still, it passes on the 2020 version but not on 2021. The fix, following the 2020 version, is to change FSSIZE to 10000; without the change you get panic: balloc: out of blocks.
/* param.h */
...
#define FSSIZE 10000 // size of file system in blocks
#define MAXPATH 128 // maximum file path name
#define NBUK 13 // a fixed number of buckets in a hashtable of cache
#define MAXBUF 4 // buckets[0] or buckets[1] has 4 buffers, other buckets[i] has 2 buffers(i: 2 ~ 12)
/* buf.h */
struct buf {
...
uchar data[BSIZE];
uint time_stamp; // the time stamp of current block
};
Then the implementation:
// Buffer cache.
//
// The buffer cache is a linked list of buf structures holding
// cached copies of disk block contents. Caching disk blocks
// in memory reduces the number of disk reads and also provides
// a synchronization point for disk blocks used by multiple processes.
//
// Interface:
// * To get a buffer for a particular disk block, call bread.
// * After changing buffer data, call bwrite to write it to disk.
// * When done with the buffer, call brelse.
// * Do not use the buffer after calling brelse.
// * Only one process at a time can use a buffer,
// so do not keep them longer than necessary.
#include "types.h"
#include "param.h"
#include "spinlock.h"
#include "sleeplock.h"
#include "riscv.h"
#include "defs.h"
#include "fs.h"
#include "buf.h"
// each bucket has a lock, whose name is set to 'bcache.bucket'
struct Bucket{
struct spinlock lock;
struct buf buf[MAXBUF];
};
struct Bcache{
// use a sleep-lock so that processes and CPUs
// don't all have to sit spinning on bcache.lock
struct sleeplock lock;
struct Bucket buckets[NBUK];
} bcache; // we use bcache, but no longer its bcache.head
// buckets[0] or buckets[1] has 4 buffers
// and other buckets[i] has 2 buffers(i: 2 ~ 12)
// get the max buf of current bucket
int
get_max_buf(int i){
int max_buf = MAXBUF - 2;
if (i == 0 || i == 1)
max_buf = MAXBUF;
return max_buf;
}
void
binit(void)
{
struct buf *b;
initsleeplock(&bcache.lock, "bcache");
int max_buf;
for(int i = 0; i < NBUK; i++){
// each bucket has a lock, whose name is set to 'bcache.bucket'
initlock(&bcache.buckets[i].lock, "bcache.bucket");
max_buf = get_max_buf(i);
for (int j = 0; j < max_buf; j++){
b = &bcache.buckets[i].buf[j];
initsleeplock(&b->lock, "buffer");
acquire(&tickslock);
b->time_stamp = ticks; // each buf starts with the current time_stamp
release(&tickslock);
}
}
}
// Look through buffer cache for block on device dev.
// If not found, allocate a buffer.
// In either case, return locked buffer.
static struct buf*
bget(uint dev, uint blockno)
{
struct buf *b;
int max_buf;
acquiresleep(&bcache.lock);
// Is the block already cached?
for(int i = 0; i < NBUK; i++){
acquire(&bcache.buckets[i].lock);
max_buf = get_max_buf(i);
for (int j = 0; j < max_buf; j++){
b = &bcache.buckets[i].buf[j];
if(b->dev == dev && b->blockno == blockno){
b->refcnt++;
acquire(&tickslock);
b->time_stamp = ticks; // update time_stamp because the block is being used
release(&tickslock);
release(&bcache.buckets[i].lock);
releasesleep(&bcache.lock);
acquiresleep(&b->lock);
return b;
}
}
release(&bcache.buckets[i].lock);
}
// Not cached.
// Choose the least recently used (LRU) unused buffer based on time_stamp.
// init ref_is_zero[NBUK][MAXBUF]
int ref_is_zero[NBUK][MAXBUF];
for(int i = 0; i < NBUK; i++){
max_buf = get_max_buf(i);
for(int j = 0; j < max_buf; j++)
ref_is_zero[i][j] = 0;
}
// find all bufs whose refcnt equals 0,
// setting lru_bucket and lru_buf each time one is found
int ref_is_zero_nums = 0;
int lru_bucket = 0;
int lru_buf = 0;
for(int i = 0; i < NBUK; i++){
acquire(&bcache.buckets[i].lock);
max_buf = get_max_buf(i);
for (int j = 0; j < max_buf; j++){
b = &bcache.buckets[i].buf[j];
if (b->refcnt == 0){
ref_is_zero[i][j] = 1;
ref_is_zero_nums++;
lru_bucket = i;
lru_buf = j;
}
}
release(&bcache.buckets[i].lock);
}
// 0 => panic; 1 => alloc that buffer; otherwise => find the oldest time_stamp, then alloc
if (ref_is_zero_nums == 0)
panic("bget: no buffers");
if (ref_is_zero_nums == 1){
// nothing to do: lru_bucket and lru_buf already point at the only candidate
}
else{
uint lru_time_stamp;
acquire(&bcache.buckets[lru_bucket].lock);
b = &bcache.buckets[lru_bucket].buf[lru_buf];
lru_time_stamp = b->time_stamp;
release(&bcache.buckets[lru_bucket].lock);
for(int i = 0; i < NBUK; i++){
acquire(&bcache.buckets[i].lock);
max_buf = get_max_buf(i);
for(int j = 0; j < max_buf; j++){
if (ref_is_zero[i][j] == 1){
b = &bcache.buckets[i].buf[j];
if (b->time_stamp < lru_time_stamp){ // smaller time_stamp = older = the LRU candidate
lru_time_stamp = b->time_stamp;
lru_bucket = i;
lru_buf = j;
}
}
}
release(&bcache.buckets[i].lock);
}
}
// alloc a buffer
acquire(&bcache.buckets[lru_bucket].lock);
b = &bcache.buckets[lru_bucket].buf[lru_buf];
b->dev = dev;
b->blockno = blockno;
b->valid = 0;
b->refcnt = 1;
acquire(&tickslock);
b->time_stamp = ticks; // update time_stamp because the block is being used
release(&tickslock);
release(&bcache.buckets[lru_bucket].lock);
releasesleep(&bcache.lock);
acquiresleep(&b->lock);
return b;
}
// Return a locked buf with the contents of the indicated block.
struct buf*
bread(uint dev, uint blockno)
{
struct buf *b;
b = bget(dev, blockno);
if(!b->valid) {
virtio_disk_rw(b, 0);
b->valid = 1;
}
return b;
}
// Write b's contents to disk. Must be locked.
void
bwrite(struct buf *b)
{
if(!holdingsleep(&b->lock))
panic("bwrite");
virtio_disk_rw(b, 1);
}
// Release a locked buffer.
// Move to the head of the most-recently-used list.
void
brelse(struct buf *b)
{
if(!holdingsleep(&b->lock))
panic("brelse");
releasesleep(&b->lock);
// why don't we acquire the 'cache' lock here?
// the original version needed it because it used 'bcache.head';
// now we only change b's refcnt and time_stamp, so no lock is needed
b->refcnt--;
acquire(&tickslock);
b->time_stamp = ticks; // update time_stamp because the block is being used
release(&tickslock);
}
// we don't use bcache.lock, because we can directly access the buf struct
void
bpin(struct buf *b) {
// acquiresleep(&bcache.lock);
b->refcnt++;
// releasesleep(&bcache.lock);
}
void
bunpin(struct buf *b) {
// acquiresleep(&bcache.lock);
b->refcnt--;
// releasesleep(&bcache.lock);
}
result
I also changed timeout=300 to timeout=500 in the grading script.
== Test running kalloctest ==
$ make qemu-gdb
(107.5s)
== Test kalloctest: test1 ==
kalloctest: test1: OK
== Test kalloctest: test2 ==
kalloctest: test2: OK
== Test kalloctest: sbrkmuch ==
$ make qemu-gdb
kalloctest: sbrkmuch: OK (15.7s)
== Test running bcachetest ==
$ make qemu-gdb
(22.0s)
== Test bcachetest: test0 ==
bcachetest: test0: OK
== Test bcachetest: test1 ==
bcachetest: test1: OK
== Test usertests ==
$ make qemu-gdb
usertests: OK (289.6s)
== Test time ==
time: OK
Score: 70/70
Summary
- Completion date: 22.6.18
- In this lab I held to my belief of "never look at reference answers no matter what," and for the first time felt the thrill of finishing entirely on my own; I cheered in the library. It's the feeling of growing as knowledge gets applied and linked together.
- At first the concept of a hashtable wasn't clear to me, so my struct definition came out wrong, and nothing after that could be right. Reading 《算法精解》 cleared it up.
- Writing summary comments really matters: I finished kalloctest on the 16th, and by the 18th, when writing this blog, I could no longer remember the details clearly.
- Lately I've been listening to Sunshine Girl by moumoon.