[shimV2] implements task, shimdiag service and wires up pod/container controllers#2685
[shimV2] implements task, shimdiag service and wires up pod/container controllers#2685rawahars wants to merge 2 commits intomicrosoft:mainfrom
Conversation
… controllers Implements the full TaskService and ShimDiag service methods for the LCOW containerd shim (containerd-shim-lcow-v2), replacing the previous `ErrNotImplemented` stubs with working lifecycle management. Signed-off-by: Harsh Rawat <harshrawat@microsoft.com>
helsaawy
left a comment
There was a problem hiding this comment.
nit about comments, but lgtm overall
| // Set the sandbox ID in the logger context for all subsequent logs in this request. | ||
| ctx, _ = log.WithContext(ctx, logrus.WithField(logfields.SandboxID, s.sandboxID)) | ||
|
|
||
| r, e := s.diagExecInHostInternal(ctx, request) |
There was a problem hiding this comment.
not: this is unnecessary/duplicative since any log using ctx will have span and trace IDs, and that span would have the attributes set above.
There was a problem hiding this comment.
You're right about that. While there was a way to correlate the sandbox ID from the other fields, it seemed more straightforward to just add it for now. I anyways have #2683 to re-calibrate the logging in the new shim. If it works for you, we can take a deeper look there if we want different log behaviour.
| // Originally this method was intended to be used in a single pod setup and therefore, | ||
| // we do not specify a TaskID in the request. Since this shim can support multiple pods, | ||
| // we will return all tasks running in the UVM, regardless of which pod they belong to. |
There was a problem hiding this comment.
future PR work, but we should add a sandbox id field to proto request to select which tasks we want listed
and maybe a DiagSandboxes rpc as well?
There was a problem hiding this comment.
Created an Github Issue for tracking the same- #2695
Summary
This PR replaces all
ErrNotImplementedstubs in the LCOW containerd shim (containerd-shim-lcow-v2) with fullTaskServiceandShimDiagimplementations. These updates successfully delegate tasks to the pod, container, and process controllers introduced in the previous commit.File Changes & Implementations
service_task_internal.go: Implements every task lifecycle RPC (Create,Start,Delete,Kill,Exec,Wait,State,Pids,Stats,Update,CloseIO,ResizePty,Shutdown).Create: Distinguishes sandbox vs. workload containers via OCI annotations and manages pod/container controller mappings.Delete: Handles both individual container and full pod teardown, including network cleanup.Kill: Supports fan-out to all workload containers when theAllflag is set on a sandbox.Shutdown: Operates as a no-op since teardown is deferred toSandboxService.ShutdownSandbox.service_shimdiag_internal.go:DiagTasks(lists all containers and their exec processes across every pod).DiagShare(shares a host path into the guest VM via the plan9 controller).DiagStacks(collects goroutine stacks from both the host shim and the guest VM).service.go:podControllersandcontainerPodMappingmaps to track active pods and their containers.sync.Mutextosync.RWMutex.ensureVMRunning()precondition guard.send()for task event publishing.main.go: Adds a minimum Windows build version check (≥ 26100 / V25H1Server) at startup and updates the build tag towindows && lcow.plugin/plugin.go: Updates ETW provider ID and uses thelogfields.SandboxIDconstant for structured logging.//go:build windows && lcowbuild constraint.Signed-off-by: Harsh Rawat harshrawat@microsoft.com