Spec
Spec describes all the information about a complete container.
type Spec struct {
// Version of the Open Container Initiative Runtime Specification with which the bundle complies.
Version string `json:"ociVersion"`
// Process configures the container process.
Process *Process `json:"process,omitempty"`
// Root configures the container's root filesystem.
Root *Root `json:"root,omitempty"`
// Hostname configures the container's hostname.
Hostname string `json:"hostname,omitempty"`
// Domainname configures the container's domainname.
Domainname string `json:"domainname,omitempty"`
// Mounts configures additional mounts (on top of Root).
Mounts []Mount `json:"mounts,omitempty"`
// Hooks configures callbacks for container lifecycle events.
Hooks *Hooks `json:"hooks,omitempty" platform:"linux,solaris,zos"`
// Annotations contains arbitrary metadata for the container.
Annotations map[string]string `json:"annotations,omitempty"`
// Linux is platform-specific configuration for Linux based containers.
Linux *Linux `json:"linux,omitempty" platform:"linux"`
// Solaris is platform-specific configuration for Solaris based containers.
Solaris *Solaris `json:"solaris,omitempty" platform:"solaris"`
// Windows is platform-specific configuration for Windows based containers.
Windows *Windows `json:"windows,omitempty" platform:"windows"`
// VM specifies configuration for virtual-machine-based containers.
VM *VM `json:"vm,omitempty" platform:"vm"`
// ZOS is platform-specific configuration for z/OS based containers.
ZOS *ZOS `json:"zos,omitempty" platform:"zos"`
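}
Here we mainly focus on:
- Process: information about the process that runs inside the container
- Root: information about the container's root filesystem
- Mounts: paths that will be mounted into the container
- Hooks: callbacks triggered on the container's lifecycle transitions
- Linux: Linux-specific configuration; the other platforms are out of scope here.
Filling in these fields according to the specification above is enough to manage a container.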
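To make "filling in the fields" concrete, here is a minimal sketch that builds a small Spec and prints it as JSON, which is the shape a bundle's config.json takes. The struct definitions are trimmed-down local copies of the types shown in this article (so the example compiles without external dependencies), and every value, including the rootfs path and the version string, is an illustrative assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed-down local copies of the runtime-spec types, kept only so that
// this sketch compiles on its own. The real types have many more fields.
type Root struct {
	Path     string `json:"path"`
	Readonly bool   `json:"readonly,omitempty"`
}

type Process struct {
	Args []string `json:"args,omitempty"`
	Env  []string `json:"env,omitempty"`
	Cwd  string   `json:"cwd"`
}

type Mount struct {
	Destination string   `json:"destination"`
	Type        string   `json:"type,omitempty"`
	Source      string   `json:"source,omitempty"`
	Options     []string `json:"options,omitempty"`
}

type Spec struct {
	Version string   `json:"ociVersion"`
	Process *Process `json:"process,omitempty"`
	Root    *Root    `json:"root,omitempty"`
	Mounts  []Mount  `json:"mounts,omitempty"`
}

// minimalSpec fills in only the fields discussed above.
// All values are illustrative assumptions, not recommended defaults.
func minimalSpec() Spec {
	return Spec{
		Version: "1.1.0",
		Process: &Process{
			Args: []string{"/bin/sh"},
			Env:  []string{"PATH=/usr/bin:/bin"},
			Cwd:  "/",
		},
		// Hypothetical location of an unpacked root filesystem.
		Root: &Root{Path: "/var/lib/demo/rootfs", Readonly: true},
		Mounts: []Mount{
			{Destination: "/proc", Type: "proc", Source: "proc"},
		},
	}
}

func main() {
	out, err := json.MarshalIndent(minimalSpec(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Fields tagged with omitempty disappear from the output when left unset, which is why a real config.json only contains the sections that are actually configured.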
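Root and Mounts
Together, these two make up the container's complete filesystem.
Root configures the container's root directory; the only other setting it carries is whether the root filesystem is read-only for the container.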
// Root contains information about the container's root filesystem on the host.
type Root struct {
// Path is the absolute path to the container's root filesystem.
Path string `json:"path"`
// Readonly makes the root filesystem for the container readonly before the process is executed.
Readonly bool `json:"readonly,omitempty"`
}
Mount, in turn, mounts other paths under the container's root directory.
// Mount specifies a mount for a container.
type Mount struct {
// Destination is the absolute path where the mount will be placed in the container.
Destination string `json:"destination"`
// Type specifies the mount kind.
Type string `json:"type,omitempty" platform:"linux,solaris,zos"`
// Source specifies the source path of the mount.
Source string `json:"source,omitempty"`
// Options are fstab style mount options.
Options []string `json:"options,omitempty"`
// UID/GID mappings used for changing file owners w/o calling chown, fs should support it.
// Every mount point could have its own mapping.
UIDMappings []LinuxIDMapping `json:"uidMappings,omitempty" platform:"linux"`
GIDMappings []LinuxIDMapping `json:"gidMappings,omitempty" platform:"linux"`
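}
- Type: the type of filesystem to mount. Common values:
  - bind: bind mount, binds a host directory or file into the container
  - tmpfs: in-memory filesystem
  - proc: the /proc filesystem
  - sysfs: the /sys filesystem
  - devpts: pseudo-terminal device filesystem
  - ext4, xfs, btrfs and other conventional filesystem types
- Options: mount options that control mount behavior:
  - ro: read-only mount
  - rw: read-write mount
  - nosuid: ignore the setuid bit
  - nodev: do not interpret device files
  - noexec: do not allow executing files
  - bind: bind mount flag
  - rbind: recursive bind mount
  - shared, private, slave: mount propagation modes
- UIDMappings and GIDMappings: map user IDs and group IDs inside the container to IDs on the host. This avoids a large number of chown operations at container start, but it requires a filesystem that supports ID-mapped mounts.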
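The option strings listed above eventually have to become arguments to mount(2). The sketch below shows one way to split them into mount flags, propagation flags, and leftover filesystem-specific data. The constants are the Linux flag values copied locally so the code builds on any platform; the function name parseOptions and the subset of options it handles are assumptions for illustration, and a real runtime handles many more cases.

```go
package main

import "fmt"

// Linux mount(2) flag values, copied as local constants so the sketch
// does not depend on a platform-specific package.
const (
	msRdonly  = 0x1
	msNosuid  = 0x2
	msNodev   = 0x4
	msNoexec  = 0x8
	msBind    = 0x1000
	msRec     = 0x4000
	msPrivate = 1 << 18
	msSlave   = 1 << 19
	msShared  = 1 << 20
)

// mountFlags covers only the options discussed in the list above.
var mountFlags = map[string]uintptr{
	"ro":     msRdonly,
	"nosuid": msNosuid,
	"nodev":  msNodev,
	"noexec": msNoexec,
	"bind":   msBind,
	"rbind":  msBind | msRec,
}

// propagationFlags are kept separate because the kernel expects them in
// their own mount call, not mixed with the flags above.
var propagationFlags = map[string]uintptr{
	"shared":  msShared,
	"private": msPrivate,
	"slave":   msSlave,
}

// parseOptions splits fstab-style options into mount flags, propagation
// flags, and the remaining strings that are passed through as mount data.
func parseOptions(options []string) (flags, propagation uintptr, data []string) {
	for _, o := range options {
		switch {
		case o == "rw":
			flags &^= msRdonly // rw clears a previously set ro
		case mountFlags[o] != 0:
			flags |= mountFlags[o]
		case propagationFlags[o] != 0:
			propagation |= propagationFlags[o]
		default:
			data = append(data, o) // e.g. "size=64m" for tmpfs
		}
	}
	return flags, propagation, data
}

func main() {
	flags, prop, data := parseOptions([]string{"nosuid", "nodev", "ro", "size=64m"})
	fmt.Printf("flags=%#x propagation=%#x data=%q\n", flags, prop, data)
}
```

Unknown options are passed through as data instead of being rejected, since options such as size= are only meaningful to the specific filesystem being mounted.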
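Hooks: lifecycle management
A container passes through the following lifecycle hook points:
- CreateRuntime: the container has been created, but the hook still runs in the host (runtime) namespace. It is used to set up host-level resources such as network configuration and devices.
- CreateContainer: the container has been created and the hook runs in the container namespace. It performs initialization inside the container, such as mounting filesystems and setting up the container environment. It runs before pivot_root, which can be thought of as "before the root directory is switched", similar to chroot.
- StartContainer: runs after the start operation is called but before the container process starts. It runs in the container namespace and is the place for last-minute work before startup. pivot_root has already completed, so from here on the hook sees the container's filesystem view.
- Poststart: runs after the container process has started, in the host namespace. It is used for post-start cleanup and monitoring.
- Poststop: runs after the container process has exited, in the host namespace. It is used for resource cleanup after exit.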
type Hooks struct {
// Prestart is Deprecated. Prestart is a list of hooks to be run before the container process is executed.
// It is called in the Runtime Namespace
//
// Deprecated: use [Hooks.CreateRuntime], [Hooks.CreateContainer], and
// [Hooks.StartContainer] instead, which allow more granular hook control
// during the create and start phase.
Prestart []Hook `json:"prestart,omitempty"`
// CreateRuntime is a list of hooks to be run after the container has been created but before pivot_root or any equivalent operation has been called
// It is called in the Runtime Namespace
CreateRuntime []Hook `json:"createRuntime,omitempty"`
// CreateContainer is a list of hooks to be run after the container has been created but before pivot_root or any equivalent operation has been called
// It is called in the Container Namespace
CreateContainer []Hook `json:"createContainer,omitempty"`
// StartContainer is a list of hooks to be run after the start operation is called but before the container process is started
// It is called in the Container Namespace
StartContainer []Hook `json:"startContainer,omitempty"`
// Poststart is a list of hooks to be run after the container process is started.
// It is called in the Runtime Namespace
Poststart []Hook `json:"poststart,omitempty"`
// Poststop is a list of hooks to be run after the container process exits.
// It is called in the Runtime Namespace
Poststop []Hook `json:"poststop,omitempty"`
}
Each lifecycle stage can be given a list of callback commands. The Hook struct is defined as follows:
type Hook struct {
Path string `json:"path"`
Args []string `json:"args,omitempty"`
Env []string `json:"env,omitempty"`
Timeout *int `json:"timeout,omitempty"`
}
As the definition shows, a hook simply runs an executable.
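A rough sketch of what "running an executable" could look like for one hook list follows. It assumes three things taken from the runtime specification and not from this article: Args, when present, already includes argv[0]; Timeout is measured in seconds; and the container state is handed to the hook as JSON on stdin. The function names hookArgv, runHook and runHooks and the hook path in main are hypothetical, and this is not runc's actual implementation.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"os/exec"
	"time"
)

// Local copy of the Hook struct shown above.
type Hook struct {
	Path    string   `json:"path"`
	Args    []string `json:"args,omitempty"`
	Env     []string `json:"env,omitempty"`
	Timeout *int     `json:"timeout,omitempty"`
}

// hookArgv returns the argument vector for a hook. Args is assumed to
// contain argv[0] already; if it is empty, Path is used as argv[0].
func hookArgv(h Hook) []string {
	if len(h.Args) > 0 {
		return h.Args
	}
	return []string{h.Path}
}

// runHook executes a single hook and feeds it the container state on stdin.
func runHook(h Hook, state []byte) error {
	ctx := context.Background()
	if h.Timeout != nil {
		var cancel context.CancelFunc
		ctx, cancel = context.WithTimeout(ctx, time.Duration(*h.Timeout)*time.Second)
		defer cancel()
	}
	cmd := exec.CommandContext(ctx, h.Path)
	cmd.Args = hookArgv(h)
	cmd.Env = h.Env // note: a nil Env makes Go inherit the parent's environment
	cmd.Stdin = bytes.NewReader(state)
	return cmd.Run()
}

// runHooks runs the hooks of one lifecycle stage in order and stops at
// the first failure.
func runHooks(hooks []Hook, state []byte) error {
	for i, h := range hooks {
		if err := runHook(h, state); err != nil {
			return fmt.Errorf("hook %d (%s): %w", i, h.Path, err)
		}
	}
	return nil
}

func main() {
	// Hypothetical hook binary and arguments, only used to show the argv layout.
	h := Hook{Path: "/usr/local/bin/setup-net", Args: []string{"setup-net", "--bridge", "br0"}}
	fmt.Println(hookArgv(h))
}
```

Running the hooks in order and aborting on the first error keeps the behavior easy to reason about; whether a failure should tear down the container depends on the stage and is left out of this sketch.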
Process
This one is more complex: it describes everything a process needs in order to run inside the container.
type Process struct {
// Terminal creates an interactive terminal for the container.
Terminal bool `json:"terminal,omitempty"`
// ConsoleSize specifies the size of the console.
ConsoleSize *Box `json:"consoleSize,omitempty"`
// User specifies user information for the process.
User User `json:"user"`
// Args specifies the binary and arguments for the application to execute.
Args []string `json:"args,omitempty"`
// CommandLine specifies the full command line for the application to execute on Windows.
CommandLine string `json:"commandLine,omitempty" platform:"windows"`
// Env populates the process environment for the process.
Env []string `json:"env,omitempty"`
// Cwd is the current working directory for the process and must be
// relative to the container's root.
Cwd string `json:"cwd"`
// Capabilities are Linux capabilities that are kept for the process.
Capabilities *LinuxCapabilities `json:"capabilities,omitempty" platform:"linux"`
// Rlimits specifies rlimit options to apply to the process.
Rlimits []POSIXRlimit `json:"rlimits,omitempty" platform:"linux,solaris,zos"`
// NoNewPrivileges controls whether additional privileges could be gained by processes in the container.
NoNewPrivileges bool `json:"noNewPrivileges,omitempty" platform:"linux,zos"`
// ApparmorProfile specifies the apparmor profile for the container.
ApparmorProfile string `json:"apparmorProfile,omitempty" platform:"linux"`
// Specify an oom_score_adj for the container.
OOMScoreAdj *int `json:"oomScoreAdj,omitempty" platform:"linux"`
// Scheduler specifies the scheduling attributes for a process
Scheduler *Scheduler `json:"scheduler,omitempty" platform:"linux"`
// SelinuxLabel specifies the selinux context that the container process is run as.
SelinuxLabel string `json:"selinuxLabel,omitempty" platform:"linux"`
// IOPriority contains the I/O priority settings for the cgroup.
IOPriority *LinuxIOPriority `json:"ioPriority,omitempty" platform:"linux"`
// ExecCPUAffinity specifies CPU affinity for exec processes.
ExecCPUAffinity *CPUAffinity `json:"execCPUAffinity,omitempty" platform:"linux"`
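}
It covers the command, its arguments, environment variables, and a number of Linux-specific process settings. We will not go through them in detail here; the important ones will be explained separately when they come up in the container run flow.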
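As a small illustration of the three fields that matter most at this stage (Args, Env, Cwd), the sketch below checks a Process for a few basic rules before it would be handed to a runtime. The rules are simplified assumptions for POSIX platforms: at least one argument, an absolute Cwd, and KEY=VALUE environment entries. The function validateProcess is hypothetical and does not reflect runc's own validation code.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Trimmed-down local copy of Process with only the fields checked here.
type Process struct {
	Args []string `json:"args,omitempty"`
	Env  []string `json:"env,omitempty"`
	Cwd  string   `json:"cwd"`
}

// validateProcess applies a few simplified sanity checks.
func validateProcess(p Process) error {
	if len(p.Args) == 0 {
		return errors.New("args must contain at least the binary to execute")
	}
	if !strings.HasPrefix(p.Cwd, "/") {
		return fmt.Errorf("cwd %q must be an absolute path inside the container", p.Cwd)
	}
	for _, kv := range p.Env {
		key, _, ok := strings.Cut(kv, "=")
		if !ok || key == "" {
			return fmt.Errorf("env entry %q is not in KEY=VALUE form", kv)
		}
	}
	return nil
}

func main() {
	p := Process{
		Args: []string{"/bin/sh", "-c", "echo hello"},
		Env:  []string{"PATH=/usr/bin:/bin", "TERM=xterm"},
		Cwd:  "/",
	}
	if err := validateProcess(p); err != nil {
		fmt.Println("invalid process:", err)
		return
	}
	fmt.Println("process looks valid")
}
```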
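Linux platform configuration
runc is built entirely on the container technologies that Linux provides: both namespace isolation and cgroup resource limits rely on support from the Linux kernel.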
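To make the namespace side concrete, the sketch below takes a namespaces list from the Linux section and separates namespaces that must be newly created (expressed as clone flags) from those that should be joined through an existing path. The clone flag values are the Linux constants copied locally so the code builds anywhere. LinuxNamespace is simplified to use a plain string for Type, and splitNamespaces and the netns path in main are hypothetical names for illustration only.

```go
package main

import "fmt"

// Simplified local copy: the real type uses a dedicated string type for Type.
type LinuxNamespace struct {
	Type string `json:"type"`
	Path string `json:"path,omitempty"`
}

// Linux clone(2) namespace flags, copied as local constants.
const (
	cloneNewTime   = 0x00000080
	cloneNewNS     = 0x00020000
	cloneNewCgroup = 0x02000000
	cloneNewUTS    = 0x04000000
	cloneNewIPC    = 0x08000000
	cloneNewUser   = 0x10000000
	cloneNewPID    = 0x20000000
	cloneNewNet    = 0x40000000
)

// nsFlags maps the namespace type names used in config.json to clone flags.
var nsFlags = map[string]uintptr{
	"mount":   cloneNewNS,
	"cgroup":  cloneNewCgroup,
	"uts":     cloneNewUTS,
	"ipc":     cloneNewIPC,
	"user":    cloneNewUser,
	"pid":     cloneNewPID,
	"network": cloneNewNet,
	"time":    cloneNewTime,
}

// splitNamespaces returns the clone flags for namespaces without a Path
// (to be created) and the entries with a Path (to be joined via setns).
func splitNamespaces(nss []LinuxNamespace) (create uintptr, join []LinuxNamespace, err error) {
	for _, ns := range nss {
		flag, ok := nsFlags[ns.Type]
		if !ok {
			return 0, nil, fmt.Errorf("unknown namespace type %q", ns.Type)
		}
		if ns.Path != "" {
			join = append(join, ns)
			continue
		}
		create |= flag
	}
	return create, join, nil
}

func main() {
	create, join, err := splitNamespaces([]LinuxNamespace{
		{Type: "pid"},
		{Type: "mount"},
		{Type: "network", Path: "/var/run/netns/demo"}, // hypothetical existing netns
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("create flags=%#x, join=%v\n", create, join)
}
```

A namespace type that is absent from the list is neither created nor joined, so the container process simply stays in the runtime's namespace of that type.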
type Linux struct {
// UIDMapping specifies user mappings for supporting user namespaces.
UIDMappings []LinuxIDMapping `json:"uidMappings,omitempty"`
// GIDMapping specifies group mappings for supporting user namespaces.
GIDMappings []LinuxIDMapping `json:"gidMappings,omitempty"`
// Sysctl are a set of key value pairs that are set for the container on start
Sysctl map[string]string `json:"sysctl,omitempty"`
// Resources contain cgroup information for handling resource constraints
// for the container
Resources *LinuxResources `json:"resources,omitempty"`
// CgroupsPath specifies the path to cgroups that are created and/or joined by the container.
// The path is expected to be relative to the cgroups mountpoint.
// If resources are specified, the cgroups at CgroupsPath will be updated based on resources.
CgroupsPath string `json:"cgroupsPath,omitempty"`
// Namespaces contains the namespaces that are created and/or joined by the container
Namespaces []LinuxNamespace `json:"namespaces,omitempty"`
// Devices are a list of device nodes that are created for the container
Devices []LinuxDevice `json:"devices,omitempty"`
// NetDevices are key-value pairs, keyed by network device name on the host, moved to the container's network namespace.
NetDevices map[string]LinuxNetDevice `json:"netDevices,omitempty"`
// Seccomp specifies the seccomp security settings for the container.
Seccomp *LinuxSeccomp `json:"seccomp,omitempty"`
// RootfsPropagation is the rootfs mount propagation mode for the container.
RootfsPropagation string `json:"rootfsPropagation,omitempty"`
// MaskedPaths masks over the provided paths inside the container.
MaskedPaths []string `json:"maskedPaths,omitempty"`
// ReadonlyPaths sets the provided paths as RO inside the container.
ReadonlyPaths []string `json:"readonlyPaths,omitempty"`
// MountLabel specifies the selinux context for the mounts in the container.
MountLabel string `json:"mountLabel,omitempty"`
// IntelRdt contains Intel Resource Director Technology (RDT) information for
// handling resource constraints and monitoring metrics (e.g., L3 cache, memory bandwidth) for the container
IntelRdt *LinuxIntelRdt `json:"intelRdt,omitempty"`
// Personality contains configuration for the Linux personality syscall
Personality *LinuxPersonality `json:"personality,omitempty"`
// TimeOffsets specifies the offset for supporting time namespaces.
TimeOffsets map[string]LinuxTimeOffset `json:"timeOffsets,omitempty"`
}
The next step is to look at container run.