2024年1月18日发(作者:amaze ui 上传文件)
中文3150字
本科毕业设计外文文献翻译
学生姓名:XXX
学 院:信息工程学院
系 别:计算机系
专 业:软件工程
班 级:软件06
指导教师:XXX 副教授
二〇一〇年六月
XXX工业大学本科毕业设计外文文献翻译
Process Management
The process is one of the fundamental abstractions in Unix operating systems. A process
is a program (object code stored on some media) in execution. Processes are, however, more
than just the executing program code (often called the text section in Unix). They also include
a set of resources such as open files and pending signals, internal kernel data, processor state,
an address space, one or more threads of execution, and a data section containing global
variables. Processes, in effect, are the living result of running program code.
Threads of execution, often shortened to threads, are the objects of activity within the
process. Each thread includes a unique program counter, process stack, and set of processor
registers. The kernel schedules individual threads, not processes. In traditional Unix systems,
each process consists of one thread. In modern systems, however, multithreaded
programs (those that consist of more than one thread) are common. Linux has a unique
implementation of threads: It does not differentiate between threads and processes. To Linux,
a thread is just a special kind of process.
On modern operating systems, processes provide two virtualizations: a virtualized
processor and virtual memory. The virtual processor gives the process the illusion that it alone
monopolizes the system, despite possibly sharing the processor among dozens of other
processes. Virtual memory lets the process allocate and manage
memory as if it alone owned all the memory in the system. Interestingly, threads
share the virtual memory abstraction, while each receives its own virtualized processor.
A program itself is not a process; a process is an active program and related resources.
Indeed, two or more processes can exist that are executing the same program. In fact, two or
more processes can exist that share various resources, such as open files or an address space.
A process begins its life when, not surprisingly, it is created. In Linux, this occurs by
means of the fork() system call, which creates a new process by duplicating an existing one.
The process that calls fork() is the parent, whereas the new process is the child. The parent
resumes execution and the child starts execution at the same place, where the call returns. The
fork() system call returns from the kernel twice: once in the parent process and again in the
newborn child.
Often, immediately after a fork, it is desirable to execute a new, different program. The
exec*() family of function calls is used to create a new address space and load a new program
into it. In modern Linux kernels, fork() is actually implemented via the clone() system call,
which is discussed in a following section.
Finally, a program exits via the exit() system call. This function terminates the process
and frees all its resources. A parent process can inquire about the status of a terminated child
via the wait4() system call, which enables a process to wait for the termination of a specific
process. When a process exits, it is placed into a special zombie state that is used to represent
terminated processes until the parent calls wait() or waitpid().
Another name for a process is a task. The Linux kernel internally refers to processes as
tasks. Here, task generally refers to a process from the kernel's point of view.
1 Process Descriptor and the Task Structure
The kernel stores the list of processes in a circular doubly linked list called the task list.
Each element in the task list is a process descriptor of the type struct task_struct, which is
defined in <linux/sched.h>. The process descriptor contains all the information about a
specific process.
The task_struct is a relatively large data structure, at around 1.7 kilobytes on a 32-bit
machine. This size, however, is quite small considering that the structure contains all the
information that the kernel has and needs about a process. The process descriptor contains the
data that describes the executing program: open files, the process's address space, pending
signals, the process's state, and much more.
2 Allocating the Process Descriptor
The task_struct structure is allocated via the slab allocator to provide object reuse and
cache coloring. Prior to the 2.6 kernel series, struct task_struct was stored at the end of the
kernel stack of each process. This allowed architectures with few registers, such as x86, to
calculate the location of the process descriptor via the stack pointer without using an extra
register to store the location. With the process descriptor now dynamically created via the slab
allocator, a new structure, struct thread_info, was created that again lives at the bottom of the
stack (for stacks that grow down) or at the top of the stack (for stacks that grow up). The new
structure also makes it rather easy to calculate offsets of its values for use in assembly code.
The thread_info structure is defined on x86 in <asm/thread_info.h> as

    struct thread_info {
        struct task_struct   *task;
        struct exec_domain   *exec_domain;
        unsigned long        flags;
        unsigned long        status;
        __u32                cpu;
        __s32                preempt_count;
        mm_segment_t         addr_limit;
        struct restart_block restart_block;
        unsigned long        previous_esp;
        __u8                 supervisor_stack[0];
    };

Each task's thread_info structure is allocated at the end of its stack. The task element of the
structure is a pointer to the task's actual task_struct.
3 Storing the Process Descriptor
The system identifies processes by a unique process identification value or PID. The PID
is a numerical value that is represented by the opaque type pid_t, which is typically an int.
Because of backward compatibility with earlier Unix and Linux versions, however, the
default maximum value is only 32,768, although the value can optionally be increased to the
full range afforded by the type. The kernel stores this value as pid inside each process descriptor.
This maximum value is important because it is essentially the maximum number of
processes that may exist concurrently on the system. Although 32,768 might be sufficient for
a desktop system, large servers may require many more processes. Moreover, the lower the
value, the sooner the values will wrap around, destroying the useful notion that higher values
indicate later-run processes than lower values. If the system is willing to break compatibility with old
applications, the administrator may increase the maximum value via
/proc/sys/kernel/pid_max.
Inside the kernel, tasks are typically referenced directly by a pointer to their task_struct
structure. In fact, most kernel code that deals with processes works directly with struct
task_struct. Consequently, it is very useful to be able to quickly look up the process descriptor
of the currently executing task, which is done via the current macro. This macro must be
separately implemented by each architecture. Some architectures save a pointer to the
task_struct structure of the currently running process in a register, allowing for efficient
access. Other architectures, such as x86 (which has few registers to waste), make use of the
fact that struct thread_info is stored on the kernel stack to calculate the location of thread_info
and subsequently the task_struct.
On x86, current is calculated by masking out the 13 least significant bits of the stack
pointer to obtain the thread_info structure. This is done by the current_thread_info() function.
The assembly is shown here:
    movl $-8192, %eax
    andl %esp, %eax
This assumes that the stack size is 8KB. When 4KB stacks are enabled, 4096 is used in lieu of
8192.
Finally, current dereferences the task member of thread_info to return the task_struct:

    current_thread_info()->task;

Contrast this approach with that taken by PowerPC
(IBM's modern RISC-based microprocessor), which stores the current task_struct in a register.
Thus, current on PPC merely returns the value stored in the register r2. PPC can take this
approach because, unlike x86, it has plenty of registers. Because accessing the process
descriptor is a common and important job, the PPC kernel developers deem using a register
worthy for the task.
4 Process State
The state field of the process descriptor describes the current condition of the process.
Each process on the system is in exactly one of five different states. This value is represented
by one of five flags:
(1) TASK_RUNNING. The process is runnable; it is either currently running or on a runqueue
waiting to run. This is the only possible state for a process executing in user-space; it can also
apply to a process in kernel-space that is actively running.
(2) TASK_INTERRUPTIBLE. The process is sleeping (that is, it is blocked), waiting for
some condition to exist. When this condition exists, the kernel sets the process's state to
TASK_RUNNING. The process also awakes prematurely and becomes runnable if it
receives a signal.
(3) TASK_UNINTERRUPTIBLE. This state is identical to TASK_INTERRUPTIBLE except
that the process does not wake up and become runnable if it receives a signal. This is used in
situations where the process must wait without interruption or when the event is expected to
occur quite quickly. Because the task does not respond to signals in this state, it is used less
often than TASK_INTERRUPTIBLE.
(4) TASK_ZOMBIE. The task has terminated, but its parent has not yet issued a wait4()
system call. The task's process descriptor must remain in case the parent wants to access it.
Once the parent calls wait4(), the process descriptor is deallocated.
(5) TASK_STOPPED. Process execution has stopped; the task is not running nor is it eligible
to run. This occurs if the task receives the SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU
signal or if it receives any signal while it is being debugged.
5 Manipulating the Current Process State
Kernel code often needs to change a process's state. The preferred mechanism is using

    set_task_state(task, state);

This function sets the given task to the given state. If applicable, it also provides a memory
barrier to force ordering on other processors (this is only needed on SMP systems).
Otherwise, it is equivalent to

    task->state = state;

The method set_current_state(state) is synonymous to set_task_state(current, state).
6 Process Context
One of the most important parts of a process is the executing program code. This code is
read in from an executable file and executed within the program's address space. Normal
program execution occurs in user-space. When a program executes a system call or triggers an
exception, it enters kernel-space. At this point, the kernel is said to be "executing on behalf of
the process" and is in process context. When in process context, the current macro is valid.
Upon exiting the kernel, the process resumes execution in user-space, unless a higher-priority
process has become runnable in the interim, in which case the scheduler is invoked to select
the higher priority process.
System calls and exception handlers are well-defined interfaces into the kernel. A
process can begin executing in kernel-space only through one of these interfaces; all access
to the kernel is through these interfaces.
进程管理
进程是Unix操作系统最基本的抽象之一。一个进程就是处于执行期间的程序(目标代码放在某种存储介质上)。但进程并不仅仅局限于一段可执行程序代码(Unix称其为代码段(text section))。通常进程还要包含其他资源,像打开的文件、挂起的信号、内核内部数据、处理器状态、地址空间及一个或多个执行线程,当然还包括用来存放全局变量的数据段等。实际上,进程就是正在执行的程序代码的活标本。
执行线程,简称线程(thread),是在进程中活动的对象。每个线程都拥有一个独立的程序计数器、进程栈和一组处理器寄存器。内核调度的对象是线程,而不是进程。在传统的Unix系统中,一个进程只包含一个线程,但现在的系统中,包含多个线程的多线程程序司空见惯。Linux系统的线程实现非常特别:它对线程和进程并不特别区分。对Linux而言,线程只不过是一种特殊的进程罢了。
在现代操作系统中,进程提供两种虚拟机制:虚拟处理器和虚拟内存。虽然实际上可能是许多进程正在分享一个处理器,但虚拟处理器给进程一种假象,让这些进程觉得自己在独享处理器。而虚拟内存让进程在获取和使用内存时觉得自己拥有整个系统的所有内存资源。有趣的是,注意在线程之间(这里是指包含在同一个进程中的线程)可以共享虚拟内存,但每个线程拥有各自的虚拟处理器。
程序本身并不是进程:进程是处于执行期间的程序以及它所包含的资源的总称。实际上完全可以存在两个或者多个不同的进程执行的是同一个程序。并且两个或两个以上并存的进程还可以共享许多诸如打开的文件、地址空间之类的资源。
无疑,进程在它被创建的时刻开始存活。在Linux系统中,这通常是调用fork()系统调用的结果,该系统调用通过复制一个现有进程来创建一个全新的进程。调用fork()的进程被称为父进程,新产生的进程被称为子进程。在调用结束时,在返回点这个相同位置上,父进程恢复执行,子进程开始执行。fork()系统调用从内核返回两次:一次回到父进程,另一次回到新诞生的子进程。
通常,创建新的进程都是为了立即执行新的、不同的程序,而接着调用exec()这族函数就可以创建新的地址空间,并把新的程序载入。在现代Linux内核中,fork()实际上是由clone()系统调用实现的,后者将在后面讨论。
最终,程序通过exit()系统调用退出。这个函数会终结进程并将其占有的资源释放掉。父进程可以通过wait()系统调用查询子进程是否终结,这其实使得进程拥有了
等待指定进程执行完毕的能力。进程退出执行后被置为僵死状态,直到它的父进程调用wait()或waitpid()为止。
进程的另一个名字是任务(task)。Linux内核通常把进程也叫做任务。在这里所说的任务是指从内核观点看到的进程。
1 进程描述符及任务结构
内核把进程存放在叫做任务队列(task list)的双向循环链表中。链表的每一项都是类型为task_struct、称为进程描述符的结构,该结构定义在<linux/sched.h>文件中。进程描述符中包含一个具体进程的所有信息。
task_struct相对较大,在32位机器上,它大约有1.7k字节。但如果考虑到该结构内包含了内核管理一个进程所需要的所有信息,那么它的大小也相当小了。进程描述符中包含的数据能完整的描述一个正在执行的程序:它打开的文件,进程的地址空间,挂起的信号,进程的状态,还有其他更多的信息。
2 分配进程描述符
Linux通过slab分配器分配task_struct结构,这样能达到对象复用和缓存着色的目的。在2.6以前的内核中,各个进程的task_struct存放在它们的内核栈的尾端。这样做的目的是为了让那些像x86这样寄存器较少的硬件体系结构只要通过栈指针就能算出它的位置,从而避免使用额外的寄存器专门记录。由于现在用slab分配器动态生成task_struct,所以只需在栈底(对于向下增长的栈来说)或栈顶(对于向上增长的栈来说)创建一个新的结构struct thread_info。这个新的结构能使在汇编代码中计算其偏移变得相当容易。
在x86上,struct thread_info在文件<asm/thread_info.h>中定义如下:

    struct thread_info {
        struct task_struct   *task;
        struct exec_domain   *exec_domain;
        unsigned long        flags;
        unsigned long        status;
        __u32                cpu;
        __s32                preempt_count;
        mm_segment_t         addr_limit;
        struct restart_block restart_block;
        unsigned long        previous_esp;
        __u8                 supervisor_stack[0];
    };

每个任务的thread_info 结构在它的内核栈的尾端分配。结构中task域中存放的是指向该任务实际task_struct的指针。
3 进程描述符的存放
内核通过一个唯一的进程标识值或PID来标识每个进程。PID是一个数,表示为pid_t隐含类型,实际上就是一个int类型。为了与老版本的Unix和Linux兼容,PID的最大值默认设置为32768,尽管这个值也可以增加到类型所允许的范围。内核把每个进程的PID存放在它们各自的进程描述符中。
这个值很重要,因为它实际上就是系统中允许同时存在的进程的最大数目。尽管32768对一般的桌面系统足够用了,但是大型服务器可能需要更多进程。这个值越小,转一圈就越快,本来数值大的进程比数值小的进程迟运行,但这样一来就破坏了这一原则。如果确实需要的话,可以不考虑与老式系统的兼容,由系统管理员通过修改/proc/sys/kernel/pid_max来提高上限。
在内核中,访问任务通常需要获得指向其task_struct的指针。实际上,内核中大部分处理进程的代码都是直接通过task_struct进行的。因此,通过current宏查找到当前正在运行进程的进程描述符的速度就显得尤为重要。硬件体系结构不同,该宏的实现也就不同,它必须针对专门的硬件体系结构作处理。有的硬件体系结构可以拿出一个专门寄存器来存放指向当前进程task_struct的指针,用于加快访问速度。而有些像x86这样的体系结构(其寄存器并不富余),就只能在内核栈的尾端创建thread_info结构,通过计算偏移间接地查找task_struct结构。
在x86体系上,current把栈指针的后13个有效位屏蔽掉,用来计算出thread_info的偏移。该操作通过current_thread_info()函数完成。汇编代码如下:

    movl $-8192, %eax
    andl %esp, %eax

这里假定栈的大小为8KB。当4KB的栈启用时,就用4096,而不是8192。
最后,current再从thread_info的task域中提取并返回task_struct的地址:

    current_thread_info()->task;

对比一下这部分在PowerPC上的实现(IBM基于RISC的现代微处理器),我们可以发现当前task_struct的地址是保存在一个寄存器中的。也就是说,在PPC上,
current宏只需要把r2寄存器中的值返回就行了。与x86不一样,PPC有足够多的寄存器,所以它的实现有这样的余地。而访问进程描述符是一个重要的频繁的操作,所以PPC的内核开发者会觉得完全有必要为此使用一个专门的寄存器。
4 进程状态
进程描述符中的state域描述了进程的当前状态。系统的每个进程都必然处于五种进程状态的一种。该域的值也必为下列五种状态标志之一:
(1) TASK_RUNNING(运行):进程是可执行的,它或者正在执行,或者在运行队列中等待执行。这是进程在用户空间中执行唯一可能的状态,也可以应用到内核空间中正在执行的进程。
(2) TASK_INTERRUPTIBLE(可中断):进程正在睡眠(也就是说它被阻塞),等待某些条件的达成。一旦这些条件达成,内核就会把进程状态设置为运行。处于此状态的进程也会因为接收到信号而提前被唤醒并投入运行。
(3) TASK_UNINTERRUPTIBLE(不可中断):除了不会因为接收到信号而被唤醒从而投入运行外,这个状态与可中断状态相同。这个状态通常在进程必须在等待时不受干扰或所等待的事件很快就会发生时出现。由于处于此状态的任务对信号不做响应,所以较之可中断状态,使用得较少。
(4) TASK_ZOMBIE(僵死):该进程已经结束了,但是其父进程还没有调用wait4()系统调用。为了父进程能够获知它的消息,子进程的进程描述符仍然被保留着。一旦父进程调用了wait4(),进程描述符就会被释放掉。
(5) TASK_STOPPED(停止):进程停止执行,进程没有投入运行也不能投入运行。通常这种状态发生在接收到SIGSTOP、SIGTSTP、SIGTTIN、SIGTTOU等信号的时候。此外,在调试期间接收到任何信号,都会使进程进入这种状态。
5 设置当前进程状态
内核经常需要调整某个进程的状态。这时最好使用set_task_state(task, state);函数。该函数将指定的进程设置为给定的状态。必要的时候,它会设置内存屏障来强制其他处理器作重新排序(一般只有在SMP系统中有此必要),否则,它等价于:task->state = state;方法set_current_state(state)和set_task_state(current, state)含义是等同的。
6 进程上下文
可执行程序代码是进程的重要组成部分。这些代码从可执行文件载入到进程的地址空间执行。一般程序在用户空间执行。当一个程序调用执行了系统调用或者触发了某个异常,它就陷入了内核空间。此时,我们称内核“代表进程执行”并处于进程上下文中。在此上下文中current宏是有效的。除非在此间隙有更高优先级的进程需要执行并由调度器做出了相应的调整,否则在内核退出的时候,程序恢复在用户空间继续执行。
系统调用和异常处理程序是对内核明确定义的接口。进程只有通过这些接口才能陷入内核执行——对内核的所有的访问都必须通过这些接口。