Containers Security

Daniel J Walsh

Distinguished Engineer

Twitter: @rhatdan



Rules of this talk

Audience must say all text that is written in red.

Container Security

As explained by the three pigs

Pig == Application

Chapter 1

Where should the pigs live?

When should I use containers versus virtual machines?

Standalone Homes
(Separate Physical Machines)

Duplex Home
(Virtual Machines)

Apartment Building

(Services Same Machine)

(setenforce 0)

Pigs in Apartment Buildings

Best combination of resource sharing,
ease of maintenance & security.

Chapter 2 - What kind of
apartment building?

What platform should host your containers?


Running containers on do it yourself platform.


Running containers on community platform.


Running containers on RHEL.

Chapter 3

How do I separate/secure pig apartments?

How do you ensure container separation?

Containers do not Contain

Do you care?

Should you care?

Treat Container Services just like
regular services

Drop privileges as quickly as possible

Run your services as non-root whenever possible

Treat root within a container the same as root outside of the container

"Docker is about running random crap from the internet as root on your host"

Only run containers from trusted parties

Why don't containers contain?

Not everything in Linux is namespaced

Containers are not comprehensive like virtual machines (KVM)

Kernel file systems: /sys, /sys/fs, /proc/sys

Cgroups, SELinux, /dev/mem, kernel modules

Overview of Security within OCI containers

Protecting Host Kernel
from processes within containers

But Dan

Our engineers say their
applications need to run as root

Protecting Kernel file systems

Protect Kernel file systems: /sys, /sys/fs, /proc/sys

Read Only Mount Points

Mask Out Kernel file systems
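One way to check the read-only part of this from inside a container is to look at the mount table. The sketch below parses text in the /proc/self/mountinfo format and reports any mount under /sys or /proc/sys whose per-mount options still include rw. The function name, the PROTECTED tuple, and the sample lines are illustrative assumptions, not output from a real runtime, and the sketch does not detect masked paths.

```python
# Prefixes taken from the slide; /sys/fs is covered by the /sys prefix.
PROTECTED = ("/sys", "/proc/sys")


def writable_kernel_mounts(mountinfo_text):
    """Return protected mount points whose per-mount options include 'rw'."""
    writable = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # not a complete mountinfo line
        # mountinfo: field 5 is the mount point, field 6 the mount options
        mount_point = fields[4]
        options = fields[5].split(",")
        protected = any(
            mount_point == prefix or mount_point.startswith(prefix + "/")
            for prefix in PROTECTED
        )
        if protected and "rw" in options:
            writable.append(mount_point)
    return writable


# Hypothetical sample, written by hand for illustration only.
SAMPLE = """\
22 28 0:21 / /sys ro,nosuid,nodev,noexec,relatime - sysfs sysfs ro
23 28 0:22 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw
24 23 0:22 /sys /proc/sys ro,nosuid,nodev,noexec,relatime - proc proc rw
25 22 0:23 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime - cgroup2 cgroup rw
"""

if __name__ == "__main__":
    print(writable_kernel_mounts(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/self/mountinfo instead of SAMPLE.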

Limiting the power of root

Stop root from remounting file systems as read/write


man capabilities

     For  the  purpose  of  performing  permission  checks, traditional UNIX
     implementations distinguish two  categories  of  processes:  privileged
     processes  (whose  effective  user ID is 0, referred to as superuser or
     root), and unprivileged processes (whose  effective  UID  is  nonzero).
     Privileged processes bypass all kernel permission checks, while 
     unprivileged processes are subject to full permission checking based on
     the process's credentials (usually: effective UID, effective GID, and 
     supplementary group list).

     Starting with kernel 2.2, Linux divides  the  privileges  traditionally
     associated  with  superuser into distinct units, known as capabilities,
     which can be independently enabled and disabled.   Capabilities  are  a
     per-thread attribute.

Capabilities Removed

CAP_SETPCAP          Modify process capabilities

CAP_SYS_MODULE       Insert/Remove kernel modules
CAP_SYS_RAWIO        Modify Kernel Memory
CAP_SYS_PACCT        Configure process accounting
CAP_SYS_NICE         Modify Priority of processes
CAP_SYS_RESOURCE     Override Resource Limits
CAP_SYS_TIME         Modify the system clock
CAP_SYS_TTY_CONFIG   Configure tty devices
CAP_AUDIT_WRITE      Write the audit log
CAP_AUDIT_CONTROL    Configure Audit Subsystem
CAP_MAC_OVERRIDE     Ignore Kernel MAC Policy
CAP_MAC_ADMIN        Configure MAC Configuration
CAP_SYSLOG           Modify Kernel printk behavior

Capabilities Removed

CAP_NET_ADMIN        Configure the network
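To see whether a process still holds any of the capabilities listed on these slides, one option is to decode the hexadecimal CapEff mask from /proc/<pid>/status. The sketch below does that for a mask passed in as a string. The bit numbers follow linux/capability.h; the function name and the example mask are made up for illustration.

```python
# Capability names from the slides, mapped to their bit numbers
# in linux/capability.h.
REMOVED_CAPS = {
    "CAP_SETPCAP": 8,
    "CAP_NET_ADMIN": 12,
    "CAP_SYS_MODULE": 16,
    "CAP_SYS_RAWIO": 17,
    "CAP_SYS_PACCT": 20,
    "CAP_SYS_NICE": 23,
    "CAP_SYS_RESOURCE": 24,
    "CAP_SYS_TIME": 25,
    "CAP_SYS_TTY_CONFIG": 26,
    "CAP_AUDIT_WRITE": 29,
    "CAP_AUDIT_CONTROL": 30,
    "CAP_MAC_OVERRIDE": 32,
    "CAP_MAC_ADMIN": 33,
    "CAP_SYSLOG": 34,
}


def held_removed_caps(cap_eff_hex):
    """Return the listed capabilities that are set in a CapEff hex mask."""
    mask = int(cap_eff_hex, 16)
    return sorted(
        name for name, bit in REMOVED_CAPS.items() if (mask >> bit) & 1
    )


if __name__ == "__main__":
    # Hypothetical mask with bits 0, 12 and 25 set.
    print(held_removed_caps("2001001"))
```

An empty result means none of the listed capabilities are in the effective set for that mask.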


less /usr/include/linux/capability.h 
/* Allow configuration of the secure attention key */
/* Allow administration of the random device */
/* Allow examination and configuration of disk quotas */
/* Allow setting the domainname */
/* Allow setting the hostname */
/* Allow calling bdflush() */
/* Allow mount() and umount(), setting up new smb connection */
/* Allow some autofs root ioctls */
/* Allow nfsservctl */
/* Allow VM86_REQUEST_IRQ */
/* Allow to read/write pci config on alpha */
/* Allow irix_prctl on mips (setstacksize) */
/* Allow flushing all cache on m68k (sys_cacheflush) */
/* Allow removing semaphores */
/* Used instead of CAP_CHOWN to "chown" IPC message queues, semaphores
   and shared memory */
/* Allow locking/unlocking of shared memory segment */
/* Allow turning swap on/off */
/* Allow forged pids on socket credentials passing */
/* Allow setting readahead and flushing buffers on block devices */


/* Allow setting geometry in floppy driver */
/* Allow turning DMA on/off in xd driver */
/* Allow administration of md devices (mostly the above, but some
   extra ioctls) */
/* Allow tuning the ide driver */
/* Allow access to the nvram device */
/* Allow administration of apm_bios, serial and bttv (TV) device */
/* Allow manufacturer commands in isdn CAPI support driver */
/* Allow reading non-standardized portions of pci configuration space */
/* Allow DDI debug ioctl on sbpcd driver */
/* Allow setting up serial ports */
/* Allow sending raw qic-117 commands */
/* Allow enabling/disabling tagged queuing on SCSI controllers and sending
   arbitrary SCSI commands */
/* Allow setting encryption key on loopback filesystem */
/* Allow setting zone reclaim policy */

Limiting the container's view of the operating system


PID Name Space

Network Name Space

Controlling interaction with Device nodes


Device Cgroup

Controls which device nodes can be created within namespace

Device nodes allow processes to configure kernel


Images mounted with nodev
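The device cgroup works as an allow list of device nodes. As a rough model, the sketch below evaluates an access request against rules written in the cgroup v1 devices.allow syntax ("type major:minor access"). The rule list and function name are assumptions chosen for illustration; real runtimes ship their own lists, and the kernel does the actual enforcement.

```python
# Hypothetical allow list: /dev/null (1:3), /dev/zero (1:5), /dev/urandom (1:9).
RULES = [
    "c 1:3 rwm",
    "c 1:5 rwm",
    "c 1:9 rwm",
]


def device_allowed(rules, dev_type, major, minor, access):
    """Return True if every requested access letter is granted by one rule.

    dev_type is 'c' or 'b'; access is a string drawn from 'r', 'w', 'm'
    (read, write, mknod).
    """
    for rule in rules:
        rule_type, numbers, rule_access = rule.split()
        rule_major, rule_minor = numbers.split(":")
        if rule_type not in ("a", dev_type):
            continue  # 'a' matches every device type
        if rule_major not in ("*", str(major)):
            continue
        if rule_minor not in ("*", str(minor)):
            continue
        if set(access) <= set(rule_access):
            return True
    return False  # default deny


if __name__ == "__main__":
    print(device_allowed(RULES, "c", 1, 3, "rw"))  # /dev/null
    print(device_allowed(RULES, "c", 1, 1, "r"))   # /dev/mem
```

In this model /dev/mem (character device 1:1) is refused simply because no rule lists it.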

Protecting the
host file system


Everyone Please Stand Up

SELinux is a LABELING system

Every Process has a LABEL

Every File, Directory, System object has a LABEL

Policy rules control access between labeled processes and labeled objects

The Kernel enforces the rules

Grab your
Text Book

Type Enforcement

Protects the host system from container processes

Container processes can only read/execute /usr files

Container processes only write to container files.

process type: container_t
file type: container_file_t
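Type enforcement can be pictured as a lookup in a table of allow rules with everything else denied. The toy model below encodes the two statements from this slide. It is only a mental model, not how SELinux policy is written or evaluated; usr_t and shadow_t stand in for the types on /usr files and /etc/shadow, and the permission names are simplified.

```python
# Toy allow table: (process type, file type) -> permitted operations.
ALLOW = {
    ("container_t", "usr_t"): {"read", "execute"},
    ("container_t", "container_file_t"): {"read", "write", "execute"},
}


def allowed(process_type, file_type, permission):
    """Default deny: access is granted only if an allow rule lists it."""
    return permission in ALLOW.get((process_type, file_type), set())


if __name__ == "__main__":
    print(allowed("container_t", "usr_t", "execute"))
    print(allowed("container_t", "usr_t", "write"))
    print(allowed("container_t", "shadow_t", "read"))
```

The important property is the default: a type pair with no rule gets no access, even for a process running as root.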

MCS Enforcement

Multi Category Security

Based on Multi Level Security (MLS)

MCS Enforcement

Protects containers from each other.

Container Processes can only read/write their own files.

Container runtimes pick a unique random MCS label.

Assign the MCS label to all content

Launch the container processes with the same label
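The three steps above can be sketched as follows. This assumes the common layout of an s0 sensitivity with two categories drawn from c0 to c1023, and the system_u user and role fields in the labels are also assumptions. The function names are hypothetical and this is not the code any particular runtime uses.

```python
import random


def pick_mcs(used, rng=random):
    """Pick a random two-category MCS label that is not already in use."""
    while True:
        c1, c2 = sorted(rng.sample(range(1024), 2))
        label = f"s0:c{c1},c{c2}"
        if label not in used:
            used.add(label)
            return label


def container_labels(mcs):
    """Build the process and file labels that share one MCS label."""
    process_label = f"system_u:system_r:container_t:{mcs}"
    file_label = f"system_u:object_r:container_file_t:{mcs}"
    return process_label, file_label


if __name__ == "__main__":
    in_use = set()
    for _ in range(2):
        print(container_labels(pick_mcs(in_use)))
```

Because each container gets a different category pair, the same container_t type can be reused for every container while their files stay separated.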

Limiting the syscall attack surface on the kernel


Shrink the attack surface on the kernel

Eliminate syscalls
kexec_load, open_by_handle_at, init_module, finit_module, delete_module, iopl, ioperm, swapon, swapoff, sysfs, sysctl, adjtimex, clock_adjtime, lookup_dcookie, perf_event_open, fanotify_init, kcmp

Block 32-bit syscalls

Block old, weird network protocols
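As an illustration of eliminating syscalls, the sketch below builds a deny-list profile in the JSON shape used by the OCI runtime spec for seccomp (defaultAction plus a syscalls list of names and an action). The syscall names are copied from the slide as written and may differ from the names a real profile uses. Production default profiles are typically allow lists rather than deny lists, so treat this as a simplified assumption.

```python
import json

# Syscall names as listed on the slide.
BLOCKED = [
    "kexec_load", "open_by_handle_at", "init_module", "finit_module",
    "delete_module", "iopl", "ioperm", "swapon", "swapoff", "sysfs",
    "sysctl", "adjtimex", "clock_adjtime", "lookup_dcookie",
    "perf_event_open", "fanotify_init", "kcmp",
]


def build_profile(blocked):
    """Return a deny-list seccomp profile: allow by default, fail the rest."""
    return {
        "defaultAction": "SCMP_ACT_ALLOW",
        "syscalls": [
            {"names": list(blocked), "action": "SCMP_ACT_ERRNO"},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_profile(BLOCKED), indent=2))
```

The printed JSON could be saved to a file and handed to a runtime that accepts custom seccomp profiles.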

USING DAC to control root within containers

User Name Space

Map non-root user to root within container

Protect the host from containers

Can be used to protect one container from another

Biggest problem: lack of file system support

Podman has full support for User Namespace
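The mapping idea can be shown with a little arithmetic. Each entry in /proc/<pid>/uid_map has the form "inside-ID outside-ID length"; the sketch below translates a UID seen inside the container to the UID the host sees. The example mapping (host user 1000 plus a subordinate range starting at 100000) is a hypothetical configuration, and the function name is made up.

```python
# Hypothetical mapping: (inside start, outside start, length)
UID_MAP = [
    (0, 1000, 1),        # container root -> unprivileged host user 1000
    (1, 100000, 65536),  # container UIDs 1..65536 -> host 100000..165535
]


def container_to_host(uid, uid_map):
    """Translate a container UID to a host UID, or None if unmapped."""
    for inside, outside, length in uid_map:
        if inside <= uid < inside + length:
            return outside + (uid - inside)
    return None


if __name__ == "__main__":
    for uid in (0, 1, 1000, 70000):
        print(uid, "->", container_to_host(uid, UID_MAP))
```

With this mapping, root inside the container is only host UID 1000 outside it, which is the protection the slides describe.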


New Container Runtimes


Dedicated small daemon for running containers under Kubernetes


Dedicated tool for building container images


Replacement CLI for Docker
Run/develop containers as non-root

Don't let this be you.


Chapter 4

How do you furnish the pig's apartment?

How do I secure content inside container?

LINUX 1999

Where did you go to get software?

Go to or
and google it?

I found it on, download and install.

Hey I hear there is a big Security vulnerability in Zlib.

How many copies of the Zlib vulnerability do you have?

I have no clue!!!

Red Hat to the rescue

Red Hat Enterprise Linux solved this problem

Certified software and hardware platforms

People have no idea of quality of software in container images

Or they are building them themselves?

Let's Talk About DEV/OPS

Containers move the responsibility for security updates from the Operator to the Developer.

Do you trust developers to
fix security issues in their images?

What happens when the next Shellshock hits?

RHEL Certified Images

Introducing Atomic Scan

Don't let this be you.


Introducing Simple Signing

Who maintains your container environment?

Community Standards?

Don't let this be you.