At Netflix we have 15 BPF programs running on cloud servers by default; Facebook has 40. These programs are not processes or kernel modules, and don't appear in traditional observability tools. They are a new type of software, and make a fundamental change to a 50-year-old kernel model by introducing a new interface for applications to make kernel requests, alongside syscalls.
BPF originally stood for Berkeley Packet Filter, but has been extended in Linux to become a generic kernel execution engine, capable of running a new type of software: user-defined, kernel-mode applications. This is what BPF is really about, and I described it for the first time in my [Ubuntu Masters] keynote.
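The key idea is that the kernel only attaches a user-supplied program after a verifier has proven it safe. The toy below is a deliberately simplified sketch of that verify-then-run model, written in Python for illustration; the instruction set, `verify`, and `run` are my inventions and bear no resemblance to real BPF bytecode or the in-kernel verifier.

```python
# Toy model of BPF's verify-then-run flow. Real BPF bytecode, the
# in-kernel verifier, and the JIT are far more sophisticated; this only
# illustrates the safety model: reject unsafe programs before running.

SAFE_OPS = {"load", "add", "ret"}  # whitelisted instructions

def verify(program):
    """Reject programs that use unknown ops or don't terminate with ret."""
    if not program or program[-1][0] != "ret":
        raise ValueError("program must end in ret")
    for op, _ in program:
        if op not in SAFE_OPS:
            raise ValueError(f"unsafe op: {op}")

def run(program, value):
    """Execute a verified program against one 'kernel event' value."""
    acc = 0
    for op, arg in program:
        if op == "load":
            acc = value
        elif op == "add":
            acc += arg
        elif op == "ret":
            return acc

prog = [("load", None), ("add", 10), ("ret", None)]
verify(prog)          # the "kernel" only attaches verified programs
print(run(prog, 32))  # -> 42
```

Rejection before execution, rather than sandboxing at runtime, is what lets real BPF programs run at full speed inside the kernel.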
The video is on [youtube]:
And the slides are on [slideshare]:
My [BPF Performance Tools] book was just released as an eBook, and covers just one use case of BPF: observability. I'm also speaking about this topic at [re:Invent] this week (see you there).
BPF is the biggest operating systems change I've seen in my career, and it's thrilling to be a part of it. Thanks to Canonical for inviting me to speak about it at the first Ubuntu Masters event.
[Ubuntu Masters]: https://ubuntu.com/blog/the-masters-speak-forward-thinking-ubuntu-users-gather-to-share-their-experiences
[BPF Performance Tools]: http://www.brendangregg.com/bpf-performance-tools-book.html
At Tuesday Night Live with James Hamilton at the 2016 AWS re:Invent conference, I introduced the first Amazon Web Services custom silicon. The ASIC I showed formed the foundational core of our second generation custom network interface controllers and, even back in 2016, there was at least one of these ASICs going into every new server in the AWS fleet. This work has continued for many years now, and this part and subsequent generations form the hardware basis of the AWS Nitro System. The Nitro System is used to deliver these features for AWS Elastic Compute Cloud (EC2) instance types:

- High speed networking with hardware offload
- High speed EBS storage with hardware offload
- NVMe local storage
- Remote Direct Memory Access (RDMA) for MPI and Libfabric
- Hardware protection/firmware verification for bare metal instances
- All business logic needed to control EC2 instances
We continue to consume millions of the Nitro ASICs every year so, even though it's only used by AWS, it's actually a fairly high volume server component. This and follow-on technology has been supporting much of the innovation going on in EC2, but I haven't had a chance to get into much detail on how Nitro actually works.
At re:Invent 2018, Anthony Liguori, one of the lead engineers on the AWS Nitro System project, gave what was, at least for me, one of the best talks at re:Invent outside of the keynotes. It's worth watching the video (URL below), but I'll cover some of what Anthony went through in his talk here.

The Nitro System has powered all EC2 instance types launched over the last couple of years. There are three major components:

- Nitro Cards for I/O acceleration
- Nitro Security Chip
- Nitro Hypervisor
Different EC2 server instance types include different Nitro System features, and some server types have many Nitro System cards that implement the five main features of the AWS Nitro System:

- Nitro Card for VPC
- Nitro Card for EBS
- Nitro Card for Instance Storage
- Nitro Card Controller
- Nitro Security Chip

These features formed the backbone of Anthony Liguori's 2018 re:Invent talk, and he went through some of the characteristics of each.
Nitro Card for VPC
The Nitro Card for VPC is essentially a PCIe attached Network Interface Card (NIC), often called a network adapter or, in some parts of the industry, a network controller. This is the card that implements the hardware interface between EC2 servers and the network connection or connections implemented on that server type. And, like all NICs, interfacing with it requires a specific device driver to support communicating with the network adapter. In the case of AWS NICs, the Elastic Network Adapter (ENA) driver provides that support, and it is now included in all major operating systems and distributions.
The Nitro Card for VPC supports network packet encapsulation/decapsulation, implements EC2 security groups, enforces limits, and is responsible for routing. Implementing these features in dedicated hardware rather than in the hypervisor allows customers to fully use the underlying server hardware without impacting network performance or other users, and without leaving some server cores unavailable to customers to handle networking tasks. It also allows secure networking support without requiring server resources to be reserved for AWS use; the largest instance types get access to all server cores.
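The encapsulation/decapsulation step can be pictured with standard VXLAN framing. To be clear, AWS's actual VPC encapsulation format is proprietary; VXLAN is used below only as the generic pattern: the card wraps the guest's frame in an outer header carrying a virtual network identifier (VNI), and strips it again on the way back in.

```python
import struct

# VXLAN-style encapsulation sketch (RFC 7348 header layout). The real
# Nitro Card for VPC uses AWS's own format; this shows the generic
# pattern: an 8-byte outer header carrying a 24-bit virtual network ID.

def encapsulate(vni: int, inner_frame: bytes) -> bytes:
    # Flags byte 0x08 marks a valid VNI; the VNI occupies the top
    # 24 bits of the second 32-bit word.
    header = struct.pack("!II", 0x08 << 24, vni << 8)
    return header + inner_frame

def decapsulate(packet: bytes):
    flags, vni_field = struct.unpack("!II", packet[:8])
    assert flags >> 24 == 0x08, "not a VXLAN packet"
    return vni_field >> 8, packet[8:]

frame = b"\x02\x00\x00\x00\x00\x01" * 2 + b"guest payload"
packet = encapsulate(7001, frame)
vni, recovered = decapsulate(packet)
print(vni, recovered == frame)  # -> 7001 True
```

Doing this wrapping in the card, at line rate, is what keeps the guest's cores out of the data path.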
It wasn't covered in the talk, but the Nitro Card for VPC also supports Remote Direct Memory Access (RDMA) networking. The Elastic Fabric Adapter (EFA) supports both the OpenFabrics Alliance Libfabric API and the popular Message Passing Interface (MPI). Both APIs provide network access with operating system bypass when used with EFA. MPI is in common use in high performance computing applications and, to a lesser extent, in latency sensitive, data intensive applications and some distributed databases.
Nitro Card for EBS
The Nitro Card for EBS supports storage acceleration for
EBS. All instance local storage is
implemented as NVMe
devices and the Nitro Card for EBS supports transparent encryption, limits to
protect the performance characteristics of the system for other users, drive
monitoring to monitor SSD wear, and it also supports bare metal instance types.
Remote storage is again
implemented as NVMe devices but this time as NVMe
over Fabrics supporting access to EBS volumes again with encryption and
again without impacting other EC2 users and with security even in a bare metal
The Nitro card for EBS was first
launched in the EC2 C4 instance family.
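"Transparent encryption" means the guest reads and writes plaintext while the device stores only ciphertext. Real Nitro cards do this in hardware with real ciphers; the toy below stands in a SHA-256 counter-mode keystream for the cipher (Python's standard library has no AES) purely to show the transparency property, and the class and key are invented for illustration.

```python
import hashlib

# Toy "transparent encryption" block device. NOT real cryptography: a
# SHA-256 counter-mode keystream stands in for the hardware cipher a
# Nitro card would use. The point is only that the guest sees
# plaintext while the backing store holds ciphertext.

class EncryptedVolume:
    def __init__(self, key: bytes):
        self.key = key
        self.store = {}  # block number -> ciphertext

    def _keystream(self, block_no: int, n: int) -> bytes:
        out = b""
        counter = 0
        while len(out) < n:
            out += hashlib.sha256(
                self.key
                + block_no.to_bytes(8, "big")
                + counter.to_bytes(8, "big")
            ).digest()
            counter += 1
        return out[:n]

    def write(self, block_no: int, plaintext: bytes):
        ks = self._keystream(block_no, len(plaintext))
        self.store[block_no] = bytes(p ^ k for p, k in zip(plaintext, ks))

    def read(self, block_no: int) -> bytes:
        ct = self.store[block_no]
        ks = self._keystream(block_no, len(ct))
        return bytes(c ^ k for c, k in zip(ct, ks))

vol = EncryptedVolume(key=b"per-volume-key")
vol.write(0, b"guest data")
print(vol.read(0))  # the guest sees plaintext back
```

Because the card holds the key and does the XOR-equivalent work, neither the guest OS nor the hypervisor ever needs to touch key material.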
Nitro Card for Instance Storage
The Nitro Card for Instance Storage also implements NVMe (NVM Express) for local EC2 instance storage.
Nitro Card Controller
The Nitro Card Controller coordinates all other Nitro cards,
the server hypervisor, and the Nitro Security Chip. It implements the hardware
root of trust using the Nitro Security Chip and supports instance monitoring
functions. It also implements the NVMe controller functionality for one or more
Nitro Cards for EBS.
Nitro Security Chip
The Nitro Security Chip traps all I/O to non-volatile storage, including the BIOS, all I/O device firmware, and any other controller firmware on the server. This is a simple approach to security where the general purpose processor is simply unable to change any firmware or device configuration. Rather than accept the error prone and complex task of ensuring access is approved and correct, no access is allowed: EC2 servers can't update their own firmware. This is great from a security perspective, but the obvious question is how the firmware gets updated. It's updated by AWS, and only by AWS, through the Nitro System.
The Nitro Security Chip also implements the hardware root of trust. This system replaces tens of millions of lines of code that form the Unified Extensible Firmware Interface (UEFI) and supports secure boot. It starts the server up untrusted, then measures every firmware system on the server to ensure that none has been modified or changed in any unauthorized way. Each checksum (device measure) is checked against the verified correct checksum stored in the Nitro Security Chip.
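The measurement step described above can be sketched in a few lines: hash every firmware image and compare each digest against a known-good value held by the security chip. The firmware names and contents below are invented for illustration; only the measure-and-compare pattern is the point.

```python
import hashlib

# Sketch of firmware measurement: hash each image and compare against
# the known-good digests the security chip holds. The names and blob
# contents here are invented purely for illustration.

def measure(blob: bytes) -> str:
    return hashlib.sha256(blob).hexdigest()

firmware = {
    "bios": b"bios image v7",
    "nic":  b"nic firmware v3",
}
# Digests recorded at install time by the trusted party.
known_good = {name: measure(blob) for name, blob in firmware.items()}

def verify_boot(images: dict, trusted: dict) -> bool:
    """Refuse to trust the server if any measurement mismatches."""
    return all(measure(blob) == trusted[name] for name, blob in images.items())

print(verify_boot(firmware, known_good))  # -> True

firmware["nic"] = b"tampered image"       # an unauthorized change
print(verify_boot(firmware, known_good))  # -> False
```

Starting untrusted and earning trust by measurement, rather than assuming the firmware is intact, is what makes the root of trust a hardware property instead of a software promise.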
The Nitro System supports key network, server, security, firmware patching, and monitoring functions, freeing up the entire underlying server for customer use. This allows EC2 instances to have access to all cores – none need to be reserved for storage or network I/O. It gives more resources to our largest instance types for customer use – we don't need to reserve resources for housekeeping, monitoring, security, network I/O, or storage. The Nitro System also makes possible the use of a very simple, lightweight hypervisor that is just about always quiescent, and it allows us to securely support bare metal instance types.
More detail on the AWS Nitro System from Anthony Liguori, one of the lead engineers behind the software systems that make up the AWS Nitro System:
If you are bored and disgusted by politics and don’t bother to
vote, you are in effect voting for the entrenched Establishments
of the two major parties, who please rest assured are not dumb,
and who are keenly aware that it is in their interests to keep you
disgusted and bored and cynical and to give you every possible
psychological reason to stay at home doing one-hitters and
watching MTV on primary day. By all means stay home if you want,
but don’t bullshit yourself that you’re not voting. In reality,
there is no such thing as not voting: you either vote by voting,
or you vote by staying home and tacitly doubling the value of some
Diehard's vote.

Please check your registration status and register to vote…
it takes two minutes. [Voter registration deadlines] are fast
approaching in many US states — there are deadlines tomorrow in
Arizona, Arkansas, Florida, Georgia, Indiana, Kentucky, Louisiana,
Michigan, Mississippi, New Mexico, Ohio, Pennsylvania, Tennessee,
Kottke wrote that yesterday, so those registration deadlines are today. I don’t care who you want to vote for, I implore you to register and vote. And if you think you are registered, double-check. It really does just take a minute.
AWS just released a new podcast on how next generation security technology, backed by automated reasoning, is providing you higher levels of assurance for key components of your AWS architecture. Byron Cook, Director of the AWS Automated Reasoning Group, discusses how automated reasoning is embedded within AWS services and code and the tools customers can take advantage of today to achieve provable security.
As the AWS cloud continues to grow, offering more services and features for you to architect your environment, AWS is working to ensure that the security of the cloud meets the pace of growth and your needs. To address the evolving threat landscape, AWS has made it easier to operate workloads securely in the cloud with a host of services and features that strengthen your security posture. Using automated reasoning, a branch of artificial intelligence, AWS is implementing next-generation security technology to help secure its platform. Automated reasoning at AWS helps customers verify and ensure continuous security of key components in the cloud, providing provable security—the highest assurance of cloud security.
Automated reasoning is powered by mathematical logic and consists of designing and implementing mechanized mathematical proofs that key components in the cloud are operating in alignment with customers’ intended security measures. With automated reasoning, AWS enables customers to detect entire classes of misconfigurations that could potentially expose vulnerable data. This relieves the customers’ burden of having to manually verify increasingly granular configurations for a complex organization, providing new levels of assurance that security verification scales with enterprise growth.
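In miniature, the difference between testing and automated reasoning is checking some inputs versus proving a property over all of them. The sketch below is not an AWS tool or API; the "policy" and request space are invented, and the exhaustive check over a tiny finite space is the brute-force analogue of what a real prover does symbolically over unbounded spaces.

```python
from itertools import product

# Toy "provable security" check: rather than testing a few requests,
# enumerate the entire (tiny) request space and prove a property for
# ALL of them. The policy below is invented for illustration only.

PRINCIPALS = ["owner", "team", "anonymous"]
ACTIONS = ["read", "write"]
RESOURCES = ["public-report", "secret-keys"]

def policy_allows(principal, action, resource):
    if principal == "owner":
        return True
    if principal == "team" and action == "read":
        return True
    if resource == "public-report" and action == "read":
        return True
    return False

def prove(prop):
    """Check a property against every request in the space."""
    return all(prop(p, a, r)
               for p, a, r in product(PRINCIPALS, ACTIONS, RESOURCES))

# Property 1: anonymous principals can never write anything. Holds.
no_anonymous_writes = lambda p, a, r: not (
    p == "anonymous" and a == "write" and policy_allows(p, a, r))
print(prove(no_anonymous_writes))  # -> True

# Property 2: only the owner can touch secret-keys. Fails: the policy
# accidentally lets "team" read them - exactly the kind of
# misconfiguration exhaustive checking surfaces and testing can miss.
only_owner_secrets = lambda p, a, r: not (
    r == "secret-keys" and p != "owner" and policy_allows(p, a, r))
print(prove(only_owner_secrets))  # -> False
```

A counterexample found this way pinpoints a concrete misconfigured request, which is why a failed proof is as useful as a successful one.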
We hope you enjoy the podcast!