Page 1

Inspektor Gadget and traceloop
Tracing containers syscalls using BPF
FOSDEM | 1 Feb 2020
https://tinyurl.com/fosdem-gadget

Page 2

Hi, I’m Alban

Alban Crequy
CTO, Kinvolk

GitHub: alban
Twitter: albcr
Email: [email protected]

Page 3

Kinvolk

Driving Kubernetes Forward

Engineering products + support services for Kubernetes, containers, process management and Linux user-space + kernel

Blog: kinvolk.io/blog
GitHub: kinvolk
Twitter: kinvolkio
Email: [email protected]

Page 4

Kubernetes, strace, BPF

Page 5

Traceloop
Tracing system calls in cgroups using BPF and overwritable ring buffers
https://github.com/kinvolk/traceloop

Inspektor Gadget
Collection of gadgets for developers of Kubernetes applications
https://github.com/kinvolk/inspektor-gadget

Kubernetes Slack: #inspektor-gadget

Page 6

BPF in a nutshell

Page 7

Debugging with “strace” on Kubernetes

- Strace is slow
  - Cannot be used for all pods on prod
- We need to know what’s going to crash
  - And start strace just before
  - Problem with unreproducible crashes
- Idea: “flight recorder”
  - Capture syscalls with BPF instead of strace
  - Send the events to a per-pod ring buffer
  - Only read the ring buffer when the pod crashed
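The “flight recorder” idea can be illustrated with a small userspace sketch of an overwritable ring buffer. This is not traceloop's code: traceloop uses perf ring buffers filled from BPF, while the struct names, the capacity and the functions below are assumptions made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_CAPACITY 4 /* tiny on purpose, so that overwriting is easy to see */

/* One recorded event; a real tracer would also store syscall arguments. */
struct event {
    uint64_t seq;        /* sequence number, starting at 0 */
    int      syscall_nr; /* syscall number */
};

/* Overwritable ring: once full, each new event replaces the oldest one. */
struct ring {
    struct event slots[RING_CAPACITY];
    uint64_t     written; /* total number of events ever recorded */
};

/* Record an event. It never blocks and never fails: old data is overwritten. */
void ring_record(struct ring *r, int syscall_nr)
{
    struct event *e = &r->slots[r->written % RING_CAPACITY];
    e->seq = r->written;
    e->syscall_nr = syscall_nr;
    r->written++;
}

/* Copy the surviving events, oldest first, into out, which must have room
 * for RING_CAPACITY entries. Returns the number of events copied. */
size_t ring_dump(const struct ring *r, struct event *out)
{
    size_t n = r->written < RING_CAPACITY ? (size_t)r->written : RING_CAPACITY;
    uint64_t first = r->written - n;
    for (size_t i = 0; i < n; i++)
        out[i] = r->slots[(first + i) % RING_CAPACITY];
    return n;
}
```

Recording is cheap and unconditional, and ring_dump is only called after a crash, at which point it returns the last few syscalls that led up to it.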

Page 8

Comparing strace and traceloop

                  strace                 traceloop
Capture method    ptrace                 BPF on tracepoints
Granularity       process                cgroup
Speed             slow                   fast
Reliability       Synchronous            Asynchronous
                  Cannot lose events     Can lose events
                                         Can fail to read buffers (EFAULT)

Page 9

Debugging with “strace” on Kubernetes

[Diagram]
- kernel:
  - BPF program (tracepoint sys_enter)
  - HashMap “cgrpTailcall” (Key: cgroup_id, Value: BPF program)
  - Pod 1: BPF program (tail call) -> perf ring buffer
  - Pod 2: BPF program (tail call) -> perf ring buffer
- userspace:
  - Daemon Set: only read the ring buffer when the pod crashes
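The dispatch shown in the diagram, one entry program that looks up the current cgroup and hands over to a per-pod program, can be mimicked in plain userspace C. In the sketch below function pointers stand in for BPF tail calls and a small array stands in for the “cgrpTailcall” hash map; every name is hypothetical and nothing here is taken from the traceloop sources.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PODS 8

/* Per-pod handler, standing in for a tail-called BPF program. */
typedef void (*pod_handler)(int syscall_nr, void *pod_state);

/* One entry of the hypothetical cgroup_id -> program table. */
struct tailcall_entry {
    uint64_t    cgroup_id;
    pod_handler handler;
    void       *pod_state;
    int         in_use;
};

struct tailcall_map {
    struct tailcall_entry entries[MAX_PODS];
};

/* Register a handler for a cgroup. Returns 0 on success, -1 if the table is full. */
int tailcall_register(struct tailcall_map *m, uint64_t cgroup_id,
                      pod_handler h, void *state)
{
    for (size_t i = 0; i < MAX_PODS; i++) {
        if (!m->entries[i].in_use) {
            m->entries[i] = (struct tailcall_entry){ cgroup_id, h, state, 1 };
            return 0;
        }
    }
    return -1;
}

/* Stand-in for the sys_enter program: returns 1 if the event was handed to
 * a per-pod handler, 0 if the cgroup is not traced. */
int on_sys_enter(const struct tailcall_map *m, uint64_t cgroup_id, int syscall_nr)
{
    for (size_t i = 0; i < MAX_PODS; i++) {
        const struct tailcall_entry *e = &m->entries[i];
        if (e->in_use && e->cgroup_id == cgroup_id) {
            e->handler(syscall_nr, e->pod_state);
            return 1;
        }
    }
    return 0;
}

/* Example handler: count the syscalls seen for one pod. */
void count_syscalls(int syscall_nr, void *pod_state)
{
    (void)syscall_nr;
    (*(unsigned *)pod_state)++;
}
```

Syscalls from cgroups that have no entry fall through without any per-pod work, which keeps system processes on the node out of the trace.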

Page 10

DEMO

traceloop

Page 11

Adapting BPF tracing tools to Kubernetes

Page 12

What do we need for Kubernetes?

❏ Granularity of tracing: your pod
  ❏ Pids are not useful when we don’t know which container they belong to
  ❏ We don’t want to trace all the system processes on a node
❏ Aggregation
  ❏ Using Kubernetes labels
❏ kubectl-like user experience
  ❏ Developers should not need to SSH
  ❏ Developers should not need to deploy a pod + kubectl-exec for each tracing

Page 13

Tracing tools for Kubernetes

Linux tracing tool -> Kubernetes tracing tool

- bpftrace (https://github.com/iovisor/bpftrace) -> https://github.com/iovisor/kubectl-trace
- BPF Compiler Collection (BCC) (https://github.com/iovisor/bcc) -> Inspektor Gadget (https://github.com/kinvolk/inspektor-gadget)
- traceloop (https://github.com/kinvolk/traceloop) -> Inspektor Gadget (https://github.com/kinvolk/inspektor-gadget)

Page 14

K8s integration

[Diagram]
- My laptop: “$ kubectl gadget ...” runs kubectl-gadget (exec client plugin)
- Kubernetes cluster:
  - Kubernetes Control Plane (API Server, scheduler, ...)
  - worker node: “gadget” pod (exec traceloop & bcc), kernel (Install BPF program)
- Arrows from the laptop to the cluster:
  - Deploy gadget pods: Create DaemonSet
  - kubectl-exec

Page 15

DEMO

Inspektor Gadget + traceloop

Page 16

Stopgaps in traceloop

Page 17

Inspektor Gadget + traceloop

- Works on:
  - Kinvolk’s Flatcar Container Linux + Lokomotive
  - Minikube (Linux 4.14)
  - GKE (Linux 4.14)
- Without:
  - Linux >= 4.18 (for bpf_get_current_cgroup_id)
  - cgroup-v2
  - runc without using OCI hooks

Page 18

No cgroup-v2

- bpf_get_current_cgroup_id not available
- Detect new namespaces:
  struct task_struct -> struct nsproxy -> struct uts_namespace -> inode
- Find out struct offsets at startup to support several kernel versions without recompiling the BPF program
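A rough userspace sketch of the “offsets found at startup” technique follows. The same reader walks the task -> nsproxy -> uts namespace -> inode chain for two mock layouts, using only byte offsets supplied at run time. The mock structs and all names are invented for illustration and do not match real kernel layouts; in an actual BPF program each read would go through a helper such as bpf_probe_read rather than memcpy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offsets discovered at startup; hypothetical names, one per hop. */
struct ns_offsets {
    size_t task_nsproxy;   /* offset of the nsproxy pointer in the task struct */
    size_t nsproxy_uts_ns; /* offset of the uts namespace pointer in nsproxy */
    size_t uts_ns_inum;    /* offset of the inode number in the uts namespace */
};

/* Read a pointer stored at base + offset without assuming a struct layout. */
const void *read_ptr(const void *base, size_t offset)
{
    const void *p;
    memcpy(&p, (const char *)base + offset, sizeof(p));
    return p;
}

/* Follow task -> nsproxy -> uts namespace -> inode number using runtime offsets. */
uint32_t read_uts_inum(const void *task, const struct ns_offsets *off)
{
    const void *nsproxy = read_ptr(task, off->task_nsproxy);
    const void *uts_ns = read_ptr(nsproxy, off->nsproxy_uts_ns);
    uint32_t inum;
    memcpy(&inum, (const char *)uts_ns + off->uts_ns_inum, sizeof(inum));
    return inum;
}

/* Two mock "kernel versions" whose layouts differ, to exercise the reader. */
struct uts_a     { uint32_t inum; };
struct nsproxy_a { int count; struct uts_a *uts_ns; };
struct task_a    { long state; struct nsproxy_a *nsproxy; };

struct uts_b     { char name[64]; uint32_t inum; };
struct nsproxy_b { int count; void *other; struct uts_b *uts_ns; };
struct task_b    { long state; char comm[16]; struct nsproxy_b *nsproxy; };
```

For the mock layouts the offsets can be filled in with offsetof, for example offsetof(struct task_a, nsproxy); against a live kernel they would have to be discovered some other way at startup, which is the part this sketch leaves out.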

Page 19

No OCI hooks

- Cannot add a new “tailcall” module in the PreStart OCI hook
- Cannot directly use the Kubernetes API
  - That would be too late to get the early syscalls

Page 20

No OCI hooks

- Add a pool of “tailcall” modules for future containers
- When detecting a new container from BPF, plug the prog map array from BPF
- Reconcile with containers from the Kubernetes API
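One way to picture the pool approach is the userspace sketch below: slots are allocated ahead of time, claimed as soon as a new container is noticed, and later reconciled against an authoritative list of live containers. The slot states, the use of a namespace inode number as identity and the function names are all assumptions for illustration, not the actual traceloop data structures.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 4

enum slot_state { SLOT_FREE = 0, SLOT_CLAIMED, SLOT_CONFIRMED };

/* One pre-allocated slot, standing in for a ready-to-use "tailcall" module. */
struct slot {
    enum slot_state state;
    uint32_t        uts_inum; /* identity of the container seen by the tracer */
};

struct slot_pool {
    struct slot slots[POOL_SIZE];
};

/* Claim a slot when a new container is detected. Returns the slot index,
 * the existing index if the container is already known, or -1 if exhausted. */
int pool_claim(struct slot_pool *p, uint32_t uts_inum)
{
    int free_idx = -1;
    for (int i = 0; i < POOL_SIZE; i++) {
        if (p->slots[i].state != SLOT_FREE && p->slots[i].uts_inum == uts_inum)
            return i;
        if (p->slots[i].state == SLOT_FREE && free_idx < 0)
            free_idx = i;
    }
    if (free_idx >= 0) {
        p->slots[free_idx].state = SLOT_CLAIMED;
        p->slots[free_idx].uts_inum = uts_inum;
    }
    return free_idx;
}

/* Reconcile with the list of live containers: confirm slots that are still
 * present and release the others. Returns the number of slots released. */
int pool_reconcile(struct slot_pool *p, const uint32_t *live, size_t n_live)
{
    int released = 0;
    for (int i = 0; i < POOL_SIZE; i++) {
        struct slot *s = &p->slots[i];
        if (s->state == SLOT_FREE)
            continue;
        int found = 0;
        for (size_t j = 0; j < n_live; j++) {
            if (live[j] == s->uts_inum) {
                found = 1;
                break;
            }
        }
        if (found) {
            s->state = SLOT_CONFIRMED;
        } else {
            s->state = SLOT_FREE;
            s->uts_inum = 0;
            released++;
        }
    }
    return released;
}
```

Claiming is immediate so that early syscalls are not missed, while reconciliation can run later and more slowly, once the Kubernetes API knows about the container.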

Page 21

Other gadgets

Page 22

Use cases

- Debugging your app
  - ✅ traceloop
  - ✅ opensnoop, execsnoop
  - ❌ WIP: tcptop
- Help writing Kubernetes network policies
  - ❌ TODO (tcpconnect)
- Help writing Kubernetes PSP
  - ❌ WIP: capabilities

Page 23

DEMO

Inspektor Gadget + execsnoop, opensnoop

Page 24

Gadget Tracer Manager

Page 25

Selecting containers

$ kubectl gadget execsnoop \
    --label k8s-app=myapp1,tier=bar \
    --namespace default \
    --podname myapp1-l9ttj \
    --node ip-10-0-12-31 \
    --containerindex 0
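How a selector such as k8s-app=myapp1,tier=bar could be checked against a container's labels is sketched below. This is only an illustration of comma-separated key=value matching, assuming that every term must match; it is not Inspektor Gadget's implementation, and the struct and function names are made up.

```c
#include <stddef.h>
#include <string.h>

/* A single Kubernetes-style label attached to a container. */
struct label {
    const char *key;
    const char *value;
};

/* Return 1 if labels contains key=value, with key and value given as
 * (pointer, length) slices of the selector string. */
int has_label(const struct label *labels, size_t n,
              const char *key, size_t klen, const char *val, size_t vlen)
{
    for (size_t i = 0; i < n; i++) {
        if (strlen(labels[i].key) == klen &&
            strncmp(labels[i].key, key, klen) == 0 &&
            strlen(labels[i].value) == vlen &&
            strncmp(labels[i].value, val, vlen) == 0)
            return 1;
    }
    return 0;
}

/* Return 1 if every "key=value" term of a comma-separated selector is
 * present in labels. An empty selector matches everything; a term
 * without '=' matches nothing. */
int selector_matches(const char *selector, const struct label *labels, size_t n)
{
    const char *p = selector;
    while (*p != '\0') {
        const char *end = strchr(p, ',');
        size_t term_len = end ? (size_t)(end - p) : strlen(p);
        const char *eq = memchr(p, '=', term_len);
        if (eq == NULL)
            return 0;
        size_t klen = (size_t)(eq - p);
        size_t vlen = term_len - klen - 1;
        if (!has_label(labels, n, p, klen, eq + 1, vlen))
            return 0;
        p += term_len;
        if (*p == ',')
            p++;
    }
    return 1;
}
```

A container carrying the labels k8s-app=myapp1, tier=bar and env=dev would match the selector from the command above, while one with tier=foo would not.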

Page 26

Pods & tracers come and go

[Diagram]
- tracer 1
- tracer 2
- Pod “myapp1-l9ttj”
- Pod “myapp1-1bis9j”
- Pod “myapp2-7fd9zx”

Page 27

Keeping track of containers & tracers

[Diagram]
- OCI Hook PreStart -> Add container
- OCI Hook PostStop -> Remove container
- Inspektor Gadget -> Add tracer, Remove tracer
- Gadget Tracer Manager (gRPC API) -> Update BPF maps
- BPF Map /sys/fs/bpf/gadget/cgroupidset-1a16cf for tracer “1a16cf” (set of matching containers)
- kubectl exec -> bcc-wrapper.sh -> BCC’s execsnoop
- BPF program, kprobe “syscall__execve”

pseudo BPF code:

    u64 cgroupid = bpf_get_current_cgroup_id();
    if (cgroupset.lookup(&cgroupid) == NULL)
        return 0;
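The filter in the pseudo BPF code amounts to a membership test on a set of cgroup ids. A userspace approximation, with a small open-addressing set standing in for the pinned BPF map, might look like the sketch below; the set layout, the hash constant and the function names are illustrative assumptions and not part of BCC or Inspektor Gadget.

```c
#include <stddef.h>
#include <stdint.h>

#define SET_SLOTS 16 /* power of two, so that probing can wrap with a mask */

/* Open-addressing set of cgroup ids; 0 marks an empty slot.
 * Removal is left out to keep the sketch short. */
struct cgroup_set {
    uint64_t ids[SET_SLOTS];
};

/* Multiplicative hash; the top 4 bits select one of the 16 slots. */
size_t slot_for(uint64_t id)
{
    return (size_t)((id * 0x9E3779B97F4A7C15ull) >> 60);
}

/* Add a cgroup id. Returns 0 on success, -1 if the set is full or id is 0. */
int cgroup_set_add(struct cgroup_set *s, uint64_t id)
{
    if (id == 0)
        return -1;
    size_t start = slot_for(id);
    for (size_t i = 0; i < SET_SLOTS; i++) {
        size_t k = (start + i) & (SET_SLOTS - 1);
        if (s->ids[k] == id)
            return 0;
        if (s->ids[k] == 0) {
            s->ids[k] = id;
            return 0;
        }
    }
    return -1;
}

/* Return 1 if id is in the set, 0 otherwise. */
int cgroup_set_contains(const struct cgroup_set *s, uint64_t id)
{
    if (id == 0)
        return 0;
    size_t start = slot_for(id);
    for (size_t i = 0; i < SET_SLOTS; i++) {
        size_t k = (start + i) & (SET_SLOTS - 1);
        if (s->ids[k] == id)
            return 1;
        if (s->ids[k] == 0)
            return 0;
    }
    return 0;
}

/* Mirror of the filter above: 1 means report the event, 0 means drop it. */
int should_trace(const struct cgroup_set *s, uint64_t current_cgroup_id)
{
    return cgroup_set_contains(s, current_cgroup_id);
}
```

In this picture, adding or removing a container only updates the set, so a running tracer picks up the change without being restarted.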

Page 28

Contribute

Page 29

How to contribute

- Join the Kubernetes Slack #inspektor-gadget
- GitHub issues with label “good first issue”

Page 30

Thank you!

Alban Crequy
GitHub: alban
Twitter: albcr
Email: [email protected]

Kinvolk
Blog: kinvolk.io/blog
GitHub: kinvolk
Twitter: kinvolkio
Email: [email protected]

Kubernetes Slack: #inspektor-gadget
Slides: https://tinyurl.com/fosdem-gadget