Start date: 22.07.15
OS: Ubuntu 20.04
Link: Lab: mmap
Contents
- Lab mmap
- Preface
- Pitfalls
- Reference materials
- Lab content
- mmap
- access code
- result
- mmap
- Summary
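Preface
This part contains no implementation details, so it is safe to read.
The lab implements simplified versions of mmap and munmap. It requires some understanding of the functions used when translating between virtual and physical addresses, and it uses the same lazy alloc approach as the cow and lazy allocation labs.
Once it is implemented, a program can access a file's contents directly through memory.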
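Pitfalls
- English reading problems
  - "as if" is equivalent to "like" and means "just as though". I misread it as "if", i.e. that exit is called only after munmap has been called, so for a long time I had no idea how to modify exit.
    Modify exit to unmap the process's mapped regions as if munmap had been called.
    That is: modify exit so that it removes the process's mappings, just as a call to munmap would.
  - mmaptest.c contains a one-line summary of a test. It is a negative sentence, so by English convention "read/write" should be read as "read and write". I first read it the Chinese way, as "read or write", and had to debug for a while before this test passed.
    check that mmap doesn't allow read/write mapping of a file opened read-only.
    That is: check that mmap does not allow a read-and-write mapping of a file that was opened read-only.
- p->sz
  In stock xv6, p->sz is not only the size of the process's virtual address space but also the size of the physical memory backing it.
- While programming you may hit errors about struct file and about definitions such as PROT_READ. Adding a struct file declaration and #include "fcntl.h" at the top of the file that reports the error fixes them.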
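The read-only pitfall above comes down to a small permission check. The following is a minimal sketch of one way to phrase it, not the code used in this post: the helper name mmap_request_ok is hypothetical, and the constant values are assumed to match xv6's kernel/fcntl.h. It rejects only the combination that could modify the file, namely a writable MAP_SHARED mapping of a file that was not opened for writing.

```c
#include <stdbool.h>

/* Assumed to mirror xv6's kernel/fcntl.h; treat the values as assumptions. */
#define PROT_READ   0x1
#define PROT_WRITE  0x2
#define MAP_SHARED  0x01
#define MAP_PRIVATE 0x02

/* Hypothetical helper: may a file opened with the given access mode
 * be mapped with these prot and flags? */
static bool
mmap_request_ok(bool readable, bool writable, int prot, int flags)
{
  /* A readable mapping needs a file that was opened for reading. */
  if((prot & PROT_READ) && !readable)
    return false;
  /* Only a shared writable mapping can change the file itself. */
  if((prot & PROT_WRITE) && (flags & MAP_SHARED) && !writable)
    return false;
  /* MAP_PRIVATE writes never reach the file, so they stay allowed. */
  return true;
}
```

With this phrasing, a read-only file can still be mapped PROT_READ | PROT_WRITE as long as the mapping is MAP_PRIVATE.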
Reference materials
- book-riscv-rev2
- riscv-privileged
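Lab content
Warning: this part contains implementation details.
mmap
Requirements: implement mmap and munmap, allocate physical memory lazily (lazy alloc), and write data back to the file when a mapping is removed.
Approach: following the hints step by step is enough, but quite a few places are left for you to work out yourself.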
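- Where should mmap place the VMA? I use the process's p->sz as the virtual address of the mapping. That address may already be taken (for example when two files are mapped), so I check for that and move further up until a free address is found. Once this succeeds, p->sz must be increased by length.
- munmap has to update p->sz and the VMA fields length, addr, f and so on (see the comments for details) so that the region being unmapped is dropped. munmap must also write data back to the file. When writing back, it has to determine whether this is the first write-back, which decides whether the write needs an offset (oldsz is used for this check). In addition, when the VMA being unmapped was never actually backed by physical memory, uvmunmap() is not called and nothing is written back, because the valid bit (PTE_V) of the corresponding pte was never set by lazy alloc.
- For lazy alloc I wrote a helper function, mmap_lazyalloc, which allocates only one page per call so that a file can be read starting from somewhere in the middle. Of course, the address has to be checked against the VMAs before allocating. Finally, pay attention to the file offset passed to readi(): it is va minus the VMA's start address.
- exit is similar to munmap, but some of the VMA bookkeeping can be skipped because the whole structure is released in the end. Note that the write-back here uses the VMA's length, whereas munmap uses the length passed in as an argument.
- In fork I did not let parent and child share the VMAs' pages (not that cool = =) and simply copied instead. The memory covered by the VMAs is not copied, because the parent's mappings may not have been lazily allocated yet; only the VMA records are. One thing to note: p->sz has already been changed, so the original p->sz from before any mapping has to be computed by subtracting the length of every VMA from the current p->sz.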
access code
First, mmap and munmap have to be set up as syscalls. This works as in lab syscall, so I will not repeat it here.
Next come the definitions.
/* proc.h */
struct vma_t {
// int fd;
struct file *f;
int length;
int prot;
int flags;
int offset;
int oldsz;
uint64 addr;
};
// Per-process state
struct proc {
struct spinlock lock;
...
char name[16]; // Process name (debugging)
struct vma_t vmas[16]; // VMAs help the kernel decide how to handle page faults
};
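The code below repeats the same "which VMA contains this address?" loop in several places. As a minimal sketch of that lookup, under stated assumptions: struct vma_range is a trimmed-down stand-in for struct vma_t, and vma_lookup and NVMA are hypothetical names that do not appear in the post's code. As in the post, an addr of 0 marks a free slot.

```c
#include <stdint.h>

#define NVMA 16  /* matches the vmas[16] array in struct proc */

/* Stand-in for struct vma_t with only the fields the lookup needs. */
struct vma_range {
  uint64_t addr;  /* start of the mapping; 0 marks a free slot */
  int length;     /* size of the mapping in bytes */
};

/* Return the index of the VMA that contains va, or -1 if there is none. */
static int
vma_lookup(const struct vma_range *vmas, uint64_t va)
{
  for(int i = 0; i < NVMA; i++){
    if(vmas[i].addr == 0)
      continue;  /* free slot, nothing mapped here */
    if(vmas[i].addr <= va && va < vmas[i].addr + (uint64_t)vmas[i].length)
      return i;
  }
  return -1;
}
```

A helper like this could serve munmap, the page-fault handler, and the address search in mmap alike.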
sys_mmap
/* */
uint64
sys_mmap(void)
{
struct file *f;
int length , prot, flags, offset;
uint64 addr;
// get args, 0:addr, 1:length, 2:prot, 3:flags, 5:offset, 4:fd=>*f
if(argaddr(0, &addr) < 0
|| argint(1, &length) < 0 || argint(2, &prot) < 0 || argint(3, &flags) < 0 || argint(5, &offset) < 0
|| argfd(4, 0, &f) < 0 )
return -1;
// mmap doesn't allow read/write mapping of a file opened read-only with MAP_SHARED
// in a negated English sentence ("doesn't", "not", ...), "or" and "/" are read as "and"
if(f->readable && !f->writable && (prot & PROT_READ) && (prot & PROT_WRITE) && (flags & MAP_SHARED))
return -1;
struct proc *p = myproc();
int oldsz = p->sz;
// pick an unused va, starting the search at p->sz
if(addr == 0){
uint64 va = p->sz;
int is_same = 0;
while(1){
// avoid handing out an address twice: with 'lazy alloc' a va can belong to a vma before it is mapped
int i ;
for(i = 0; i < 16; i++){
if(p->vmas[i].addr <= va && va < (p->vmas[i].addr + p->vmas[i].length))
is_same = 1;
}
if(!is_same){
addr = va;
p->sz = va + length;
break;
}
// reset is_same
is_same = 0;
if(va >= MAXVA)
return -1;
va += PGSIZE;
}
}
for(int i = 0; i < 16; i++){
// possible case: when all slots are full, evict and rebuild a vmas[i]
// (this does not happen in the tests)
// fill in a free vmas[i]
if(p->vmas[i].addr == 0){
p->vmas[i].f = f;
p->vmas[i].length = length;
p->vmas[i].prot = prot;
p->vmas[i].flags = flags;
p->vmas[i].offset = offset;
p->vmas[i].addr = addr;
p->vmas[i].oldsz = oldsz;
// mmap should increase the file's reference count
filedup(f);
return addr;
}
}
// failed, ret -1
return -1;
}
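The address search in sys_mmap tests only the candidate start address against each VMA. As a sketch of a slightly stricter variation (not the code used above), the helper below checks the whole range [va, va + length) for overlap and jumps past a clashing mapping instead of stepping one page at a time. The names pick_mmap_addr, ranges_overlap, struct vma_range and NVMA are hypothetical, and limit stands in for MAXVA.

```c
#include <stdint.h>

#define PGSIZE 4096
#define NVMA 16
#define PGROUNDUP(sz) (((sz) + PGSIZE - 1) & ~((uint64_t)PGSIZE - 1))

/* Stand-in for struct vma_t; addr == 0 marks a free slot. */
struct vma_range {
  uint64_t addr;
  uint64_t length;
};

/* Do the half-open ranges [a, a+alen) and [b, b+blen) overlap? */
static int
ranges_overlap(uint64_t a, uint64_t alen, uint64_t b, uint64_t blen)
{
  return a < b + blen && b < a + alen;
}

/* Lowest page-aligned address >= start where [va, va+length) overlaps no
 * existing VMA and ends at or below limit. Returns 0 if there is none. */
static uint64_t
pick_mmap_addr(const struct vma_range *vmas, uint64_t start,
               uint64_t length, uint64_t limit)
{
  uint64_t va = PGROUNDUP(start);
  while(va + length <= limit){
    int clash = 0;
    for(int i = 0; i < NVMA; i++){
      if(vmas[i].addr != 0 &&
         ranges_overlap(va, length, vmas[i].addr, vmas[i].length)){
        /* Continue the search right after the clashing mapping. */
        va = PGROUNDUP(vmas[i].addr + vmas[i].length);
        clash = 1;
        break;
      }
    }
    if(!clash)
      return va;
  }
  return 0;
}
```

Rounding the start up also covers the case where p->sz is not page-aligned, which the page-granular mapping code would otherwise have to handle separately.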
sys_munmap
uint64
sys_munmap(void)
{
uint64 addr;
int length;
// get args, 0:addr, 1:length
if(argaddr(0, &addr) < 0 || argint(1, &length) < 0)
return -1;
struct proc *p = myproc();
int has_a_vma = 0;
int i;
for(i = 0 ; i < 16; i++){
if(p->vmas[i].addr <= addr && addr < (p->vmas[i].addr + p->vmas[i].length)){
has_a_vma = 1;
break;
}
}
// no vma found, so return -1
if(has_a_vma == 0){
return -1;
}
pte_t *pte = walk(p->pagetable, addr, 0);
int offset = p->vmas[i].f->off; // addr is the beginning of the vma, so use file->off
if(p->vmas[i].oldsz != addr) // addr isn't the beginning of the vma, so use p->vmas[i].offset
offset = p->vmas[i].offset;
// If an unmapped page has been modified and the file is mapped MAP_SHARED,
// write the page back to the file.
if((*pte & PTE_V) && p->vmas[i].flags & MAP_SHARED){
begin_op();
ilock(p->vmas[i].f->ip);
writei(p->vmas[i].f->ip, 1, addr, offset, length);
iunlock(p->vmas[i].f->ip);
end_op();
}
// If munmap removes all pages of a previous mmap,
// it should decrement the reference count of the corresponding struct file.
// the tail of the mapping is kept via 'p->vmas[i].addr += length' and 'p->vmas[i].length -= length'
// sys_mmap() checks 'addr < (p->vmas[i].addr + p->vmas[i].length)',
// so a later mmap does not land in [length munmap]; it is placed after [p->vmas[i].length]
// figure:
// ' p->vmas[i].addr
// [process data][ p->vmas[i].length ]
// [ p->sz ]
// ==>
// ' p->vmas[i].addr
// [process data][ length munmap ][p->vmas[i].length]
// [ p->sz ]
if(length < p->vmas[i].length){
p->sz -= length;
p->vmas[i].addr += length;
p->vmas[i].length -= length;
}
else if(length == p->vmas[i].length){
p->sz -= length;
fileclose(p->vmas[i].f); // drops the reference under the file table lock and releases the file at 0
p->vmas[i].f = 0;
p->vmas[i].addr = 0;
p->vmas[i].prot = 0;
p->vmas[i].flags = 0;
p->vmas[i].length = 0;
p->vmas[i].oldsz = 0;
}
else
return -1;
if((*pte & PTE_V) == 0) // page was never faulted in: no write-back or uvmunmap(), only the vma data changes
return 0;
// uvmunmap() after writing
// find the VMA for the address range and unmap the specified pages
// note: free physical memory => 'do_free = 1'
uvmunmap(p->pagetable, addr, length/PGSIZE, 1);
// success, ret 0
return 0;
}
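The bookkeeping in sys_munmap handles unmapping from the start of a mapping or the whole mapping. As a sketch under stated assumptions, the helper below isolates that range arithmetic and also accepts trimming from the end, while refusing to punch a hole in the middle. vma_trim and struct vma_range are hypothetical names, and the return value 1 is an invented convention telling the caller that the mapping is gone and the file reference can be dropped.

```c
#include <stdint.h>

/* Stand-in for struct vma_t; addr == 0 marks a free slot. */
struct vma_range {
  uint64_t addr;
  uint64_t length;
};

/* Remove [addr, addr+length) from the mapping *v.
 * Returns 1 if the mapping is now empty, 0 if part of it remains,
 * and -1 if the request is out of range or would leave a hole. */
static int
vma_trim(struct vma_range *v, uint64_t addr, uint64_t length)
{
  uint64_t end = v->addr + v->length;

  if(length == 0 || addr < v->addr || addr + length > end)
    return -1;

  if(addr == v->addr){
    /* Trim from the front; this also covers unmapping everything. */
    v->addr += length;
    v->length -= length;
    if(v->length == 0){
      v->addr = 0;  /* mark the slot free */
      return 1;
    }
    return 0;
  }
  if(addr + length == end){
    /* Trim from the back: the start address stays where it is. */
    v->length -= length;
    return 0;
  }
  return -1;  /* the range lies strictly inside the mapping */
}
```

Keeping this arithmetic separate from the write-back and uvmunmap() calls makes each part easier to check on its own.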
lazy alloc is handled as a page fault (scause 13, a load page fault).
/* trap.c */
int
mmap_lazyalloc(pagetable_t pagetable, uint64 va)
{
struct proc *p = myproc();
struct file *f;
int prot;
int has_a_vam = 0;
int perm = 0;
char *mem;
// find the vma with vma.addr <= va < vma.addr + vma.length
int i;
for(i = 0; i < 16; i++){
if(p->vmas[i].addr <= va && va < (p->vmas[i].addr + p->vmas[i].length)){
has_a_vam = 1;
f = p->vmas[i].f;
prot = p->vmas[i].prot;
break;
}
}
// no vma found, return -1
if(has_a_vam == 0){
return -1;
}
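// a fault can hit anywhere inside a page, so work on the page start from here on;
// this keeps mappages() and readi() below to exactly one page
va = PGROUNDDOWN(va);
// PTE_U controls whether instructions in user mode are allowed to access the page;
// if PTE_U is not set, the PTE can be used only in supervisor mode.
perm |= PTE_U;
// set PTE_R, PTE_W, PTE_X as requested by prot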
if(prot & PROT_READ){
perm |= PTE_R;
}
if(prot & PROT_WRITE){
perm |= PTE_W;
}
if(prot & PROT_EXEC){
perm |= PTE_X;
}
// earlier big bug: I tried to allocate the whole mapping at once, but kalloc() returns exactly one page (4096 bytes)
if((mem = kalloc()) == 0){
return -1;
}
// makefile() in mmaptest creates the file to be mapped, containing
// 1.5 pages of 'A' and half a page of zeros,
// so the new page must be zeroed before file data is read into it
memset(mem, 0, PGSIZE);
// note: mem is the address of the newly allocated physical page
if(mappages(pagetable, va, PGSIZE, (uint64)mem, perm) == -1){
kfree(mem);
return -1;
}
// PTE_D is not tracked, because munmap() always writes straight back to the file
// length is the number of bytes to map; it might not be the same as the file's length.
// read data from the file into the page now mapped at va
ilock(f->ip);
if(readi(f->ip, 1, va, va - p->vmas[i].addr, PGSIZE) < 0){ // readi offset by 'va - p->vmas[i].addr'
iunlock(f->ip);
return -1;
}
iunlock(f->ip);
p->vmas[i].offset += PGSIZE;
// success, ret 0
return 0;
}
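Two small calculations sit at the heart of mmap_lazyalloc: turning prot bits into PTE permission bits, and turning a faulting address into a file offset. The sketch below isolates both under stated assumptions. The helper names are hypothetical, the constants are assumed to match xv6's kernel/fcntl.h and kernel/riscv.h, and vma_offset means the offset argument originally passed to mmap (always 0 in mmaptest), which differs from how the offset field is used in the code above.

```c
#include <stdint.h>

#define PGSIZE 4096
#define PGROUNDDOWN(a) ((a) & ~((uint64_t)PGSIZE - 1))

/* Assumed to mirror xv6's kernel/fcntl.h and kernel/riscv.h. */
#define PROT_READ  0x1
#define PROT_WRITE 0x2
#define PROT_EXEC  0x4
#define PTE_R (1L << 1)
#define PTE_W (1L << 2)
#define PTE_X (1L << 3)
#define PTE_U (1L << 4)

/* Translate mmap prot bits into PTE permission bits for a user page. */
static int
prot_to_perm(int prot)
{
  int perm = PTE_U;  /* mapped pages must be reachable from user mode */
  if(prot & PROT_READ)
    perm |= PTE_R;
  if(prot & PROT_WRITE)
    perm |= PTE_W;
  if(prot & PROT_EXEC)
    perm |= PTE_X;
  return perm;
}

/* File offset of the page that contains fault_va, for a mapping that
 * starts at vma_addr and maps the file from vma_offset onward. */
static uint64_t
fault_file_offset(uint64_t vma_addr, uint64_t vma_offset, uint64_t fault_va)
{
  return vma_offset + (PGROUNDDOWN(fault_va) - vma_addr);
}
```

Rounding the faulting address down first means the read always starts at a page boundary, wherever inside the page the access happened.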
void
usertrap(void)
{
int which_dev = 0;
if((r_sstatus() & SSTATUS_SPP) != 0)
panic("usertrap: not from user mode");
// send interrupts and exceptions to kerneltrap(),
// since we're now in the kernel.
w_stvec((uint64)kernelvec);
struct proc *p = myproc();
// save user program counter.
p->trapframe->epc = r_sepc();
if(r_scause() == 8){
// system call
if(p->killed)
exit(-1);
// sepc points to the ecall instruction,
// but we want to return to the next instruction.
p->trapframe->epc += 4;
// an interrupt will change sstatus &c registers,
// so don't enable until done with those registers.
intr_on();
syscall();
// Fill in the page table lazily, in response to page faults.
} else if(r_scause() == 13){
uint64 fault_va = r_stval();
int is_alloc = mmap_lazyalloc(p->pagetable, fault_va);
if(fault_va > p->sz || is_alloc == -1){
p->killed = 1;
}
} else if((which_dev = devintr()) != 0){
// ok
} else {
printf("usertrap(): unexpected scause %p pid=%d\n", r_scause(), p->pid);
printf(" sepc=%p stval=%p\n", r_sepc(), r_stval());
p->killed = 1;
}
if(p->killed)
exit(-1);
// give up the CPU if this is a timer interrupt.
if(which_dev == 2)
yield();
usertrapret();
}
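usertrap above reacts only to scause 13. In the RISC-V privileged specification, 13 is a load page fault and 15 is a store/AMO page fault, so a first access that writes to an untouched page arrives as 15. The sketch below is an addition that is not part of the code in this post: it shows how both causes could be checked against a mapping's prot bits. mmap_fault_allowed is a hypothetical helper, and the PROT_* values are assumed to match xv6's kernel/fcntl.h.

```c
#include <stdint.h>

/* RISC-V scause exception codes for data page faults. */
enum {
  SCAUSE_LOAD_PAGE_FAULT  = 13,
  SCAUSE_STORE_PAGE_FAULT = 15,
};

/* Assumed to mirror xv6's kernel/fcntl.h. */
#define PROT_READ  0x1
#define PROT_WRITE 0x2

/* Return 1 if the fault is a data page fault that the mapping's prot
 * bits permit, and 0 otherwise. */
static int
mmap_fault_allowed(uint64_t scause, int prot)
{
  if(scause == SCAUSE_LOAD_PAGE_FAULT)
    return (prot & PROT_READ) != 0;
  if(scause == SCAUSE_STORE_PAGE_FAULT)
    return (prot & PROT_WRITE) != 0;
  return 0;  /* not a data page fault */
}
```

In a handler, a check like this would run after the VMA lookup and before the page is allocated.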
exit
void
exit(int status)
{
struct proc *p = myproc();
if(p == initproc)
panic("init exiting");
// Close all open files.
for(int fd = 0; fd < NOFILE; fd++){
if(p->ofile[fd]){
struct file *f = p->ofile[fd];
fileclose(f);
p->ofile[fd] = 0;
}
}
// 'as if' == 'like'
for(int i = 0; i < 16; i++){
if(p->vmas[i].addr){
int offset = p->vmas[i].f->off; // addr is the beginning of the vma, so use file->off
if(p->vmas[i].oldsz != p->vmas[i].addr) // addr isn't the beginning of the vma, so use p->vmas[i].offset
offset = p->vmas[i].offset;
if(p->vmas[i].flags & MAP_SHARED){
begin_op();
ilock(p->vmas[i].f->ip);
writei(p->vmas[i].f->ip, 1, p->vmas[i].addr, offset, p->vmas[i].length);
iunlock(p->vmas[i].f->ip);
end_op();
}
p->sz -= p->vmas[i].length;
fileclose(p->vmas[i].f); // drops the reference under the file table lock and releases the file at 0
pte_t *pte = walk(p->pagetable, p->vmas[i].addr, 0);
if((*pte & PTE_V) == 0) // page was never faulted in: nothing to uvmunmap()
continue;
uvmunmap(p->pagetable, p->vmas[i].addr, p->vmas[i].length/PGSIZE, 1);
}
}
begin_op();
iput(p->cwd);
end_op();
p->cwd = 0;
acquire(&wait_lock);
// Give any children to init.
reparent(p);
// Parent might be sleeping in wait().
wakeup(p->parent);
acquire(&p->lock);
p->xstate = status;
p->state = ZOMBIE;
release(&wait_lock);
// Jump into the scheduler, never to return.
sched();
panic("zombie exit");
}
fork
int
fork(void)
{
int i, pid;
struct proc *np;
struct proc *p = myproc();
// Allocate process.
if((np = allocproc()) == 0){
return -1;
}
// virtual map vs. real map
// Copy user memory from parent to child.
int length = 0;
for(int i = 0; i < 16; i++){
if(p->vmas[i].length){
length += p->vmas[i].length;
}
}
// copy only the original process memory: the current p->sz minus all mapped lengths
if(uvmcopy(p->pagetable, np->pagetable, p->sz-length) < 0){
freeproc(np);
release(&np->lock);
return -1;
}
for(int i = 0; i < 16; i++){
if(p->vmas[i].addr){
np->vmas[i].f = p->vmas[i].f;
np->vmas[i].length = p->vmas[i].length;
np->vmas[i].prot = p->vmas[i].prot;
np->vmas[i].flags = p->vmas[i].flags;
np->vmas[i].offset = 0; // reset offset, because the parent may have read before fork()
np->vmas[i].addr = p->vmas[i].addr;
np->vmas[i].oldsz = p->vmas[i].oldsz;
filedup(p->vmas[i].f);
}
}
np->sz = p->sz;
// copy saved user registers.
*(np->trapframe) = *(p->trapframe);
// Cause fork to return 0 in the child.
np->trapframe->a0 = 0;
// increment reference counts on open file descriptors.
for(i = 0; i < NOFILE; i++)
if(p->ofile[i])
np->ofile[i] = filedup(p->ofile[i]);
np->cwd = idup(p->cwd);
safestrcpy(np->name, p->name, sizeof(p->name));
pid = np->pid;
release(&np->lock);
acquire(&wait_lock);
np->parent = p;
release(&wait_lock);
acquire(&np->lock);
np->state = RUNNABLE;
release(&np->lock);
return pid;
}
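The size passed to uvmcopy in fork is the current p->sz minus the length of every live mapping. A minimal sketch of that calculation, with hypothetical names (size_without_mappings, struct vma_range, NVMA) and assuming, as in this post's layout, that every mapping was added on top of the original p->sz:

```c
#include <assert.h>
#include <stdint.h>

#define NVMA 16

/* Stand-in for struct vma_t; addr == 0 marks a free slot. */
struct vma_range {
  uint64_t addr;
  uint64_t length;
};

/* Process size before any mmap: the current sz minus all mapped lengths. */
static uint64_t
size_without_mappings(uint64_t sz, const struct vma_range *vmas)
{
  uint64_t mapped = 0;
  for(int i = 0; i < NVMA; i++){
    if(vmas[i].addr != 0)
      mapped += vmas[i].length;
  }
  /* Every mapping was added on top of sz, so this cannot exceed it. */
  assert(mapped <= sz);
  return sz - mapped;
}
```

The assumption matters: if a mapping were placed without growing p->sz, the subtraction would undercount the memory that has to be copied.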
result
make[1]: Leaving directory '/home/duile/xv6-labs-2021'
== Test running mmaptest ==
$ make qemu-gdb
(3.9s)
== Test mmaptest: mmap f ==
mmaptest: mmap f: OK
== Test mmaptest: mmap private ==
mmaptest: mmap private: OK
== Test mmaptest: mmap read-only ==
mmaptest: mmap read-only: OK
== Test mmaptest: mmap read/write ==
mmaptest: mmap read/write: OK
== Test mmaptest: mmap dirty ==
mmaptest: mmap dirty: OK
== Test mmaptest: not-mapped unmap ==
mmaptest: not-mapped unmap: OK
== Test mmaptest: two files ==
mmaptest: two files: OK
== Test mmaptest: fork_test ==
mmaptest: fork_test: OK
== Test usertests ==
$ make qemu-gdb
usertests: OK (141.4s)
== Test time ==
time: OK
Score: 140/140
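Summary
- Completion date: 22.07.19
- Took 30 h: 1 h reading material, 2 h reviewing, and most of the rest debugging.
- Deciding where to place the VMA took a long time. I only settled on p->sz at the end, because at first I did not understand what the requirements on the VMA's location were, and I was also unclear about the layout of a process itself.
- When I first wrote munmap, I did not consider updating or deleting the VMA records at all, so the number of VMAs kept growing. With enough tests it could have exceeded 16.
- In lazy alloc I initially allocated memory for the whole length at once (possibly more than PGSIZE), without considering that kalloc() hands out exactly PGSIZE, or that a read might start in the middle. The error only showed up in fork_test, where kalloc() unexpectedly could not run. I could not reproduce it for a long time and found the cause more or less by accident.
- Since I was doing MIT 6.S081 Fall 2021, all 10 labs are now finished, which completes the labs of my second course! About 60% of the work was done on my own. Clearly, the last 3 labs taught me a lot, while the first 7 taught me less, because only about half of that work was done independently. In future courses I must strive to finish the work independently.
- I never really learned gdb for debugging, while I used printf very happily. When writing a printf, stating clearly what it is meant to print improves debugging efficiency. In future courses I should start using gdb as early as possible for finer-grained debugging; breakpoints are something printf cannot provide.
- Recently I have been listening to these tracks from "The Girl Who Leapt Through Time (Toki o kakeru shôjo)":
- スケッチ(ロング・バージョン)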
- 変わらないもの(ストリングス・バージョン)