If one thing sets SwiftUI apart from other declarative UI frameworks like React, it is its use of a static type system. The typing of views makes the diffing of views fast and unambiguous. How exactly does that work? And what more is involved in the diffing that SwiftUI has to perform on views?
Recently I have started SwiftTUI, which is a SwiftUI clone for text-based terminal applications. For implementing this, I needed to find out how the diffing in SwiftUI works, and here I present my findings.
Declarative UI frameworks and diffing go hand in hand. With declarative UI programming, instead of directly modifying the UI elements displayed on screen, you modify some state. The UI framework will have to compute how this affects the views.
It cannot just replace the views in their entirety after state changed. Many views have child views that also have state. We need to know if a child view was removed or inserted, to see if we need to let go of state for this view, or create new state.
We also need to know which views are added or removed to update the container they are in. Stacks will want to layout their children again if their content changed, and a list might only act on an insertion or removal if the view in question is in the currently visible area. If it knows what changed, it can also animate those changes.
The problem with diffing views is that it can not only be inefficient, the changes can even be ambiguous.
Take a list of two labels:
→ List
→ Label("One")
→ Label("Two")
And say that it changed to the following list:
→ List
→ Label("Two")
→ Label("Three")
What has changed here? Maybe the label displaying "One" was removed from the beginning of the list, and the label displaying "Three" was added at the end. Or maybe the first label had its text changed from "One" into "Two", and the second label from "Two" into "Three". Both could be true.
A UI framework can decide to just favor one type of changes over the other. Or it can require UI elements to have identifiers so that it can keep them apart. React gives you the choice.
If you have a list of UI elements with identifiers, you need a shortest-edit-script algorithm like the Myers algorithm, that can find the smallest set of inserts and removals to get from one sequence to another.
For most changes, SwiftUI does not need such an algorithm. It also does not need identifiers for views to prevent ambiguity. This relies solely on the static typing of views.
This is the official description of a View
in SwiftUI:
A type that represents part of your app’s user interface and provides modifiers that you use to configure views.
There is something a bit misleading with this description. The description makes sense if we think of views like Text
, Color
or of shapes. But views like ForEach
combine multiple other views. Those views are still SwiftUI views, but they represent multiple parts of an app's user interface.
A particularly confusing view in this sense is EmptyView
. EmptyView
is a view that does not represent any user interface element. If you put only an EmptyView
in a List
, the list will not have any cells.
Other interesting views are those created with view builders. View builders can have if
-statements, so views built with them can represent a dynamic number of user interface elements, like ForEach
.
In code, View
is defined by a protocol, for which the only requirement is that is has a body which is also a view.
protocol View {
associatedtype Body: View
var body: Body { get }
}
This cannot go on forever, there must be some views which are not defined in terms of other views. I will call those views primitive views. Normal views that are built out of other views are composed views. Primitive views have a body type Never
and would crash if you were to call their body. They also have logic that is hidden from SwiftUI's public interface.
As clients of the SwiftUI framework, we can only create composed views. Composed views can have state, and can be seen as a function of this state to another view. Composed views use primitive views in their body. Primitive views are the building blocks of any type of view.
Diffing commonly starts with a composed view, when its body needs to be re-evaluated in response to some state change. SwiftUI then has an old and a new body, made of some combination of primitive and composed views. It needs to diff them, and respond to the changes. The type of the old and new body is always the same.
Diffing commonly also ends with composed views, which is a SwiftUI performance optimization. If the body of an invalidated view contains another composed view of which the properties haven't changed, SwiftUI does not need to evaluate the body of this view further, and is done.
SwiftUI thus mainly compares the remaining, primitive views to find out what changed, making them the most important in the diffing algorithm. Composed views trigger diffing and provide new bodies that need to be diffed. They can also prevent further diffing. But the primitive views need to be compared, to find out what the changes were.
The fact that a view represents a number of user interface elements, does not necessarily mean that all those elements are there when the view is on screen. Lazy views, such as List
, only fetch new user interface elements from underlying views if they are or will be inside the scrolling area. We can call the user interface elements that a view can generate displayables. A SwiftUI view can have zero or more dispayables.
When it comes to diffing, we can put the primitive views that SwiftUI provides into four categories:
ForEach
, EmptyView
, and the views used by ViewBuilder
, such as TupleView
and _ConditionalView
.HStack
, VStack
, List
, LazyVStack
..border
, .padding
, .frame
generate, which are of type ModifiedContent
.You can find the types of the structural views that view builders create in the documentation. (As a recap: Multiple statements are combined into a TupleView
. if
-statements without else
create an optional views, which are views themselves. if
-statements with else
become _ConditionalView
s.)
Container views take the displayables of the view they wrap and put them on screen. I will call a displayable that is rendered a graphic. HStack
and VStack
always make graphics for all the displayables in a view and lay them out. Other containers, such as List
are lazy and do not immediately turn all displayables into graphics. Container views present themselves as views with a single displayable again to views higher up in the hierarchy.
Modifiers applied to views create ModifiedContent
views. Modifiers apply an effect to all the displayables of another view individually. A .border
modifier used on a TupleView
will put a border graphic on top of all the displayables of the TupleView
. This means modifiers can have displayables that are put on screen multiple times, so a single displayable is turned into multiple graphics.
Every view hierarchy contains primitive views, and every primitive view will be the combination of types of views of the categories listed above. Take this view, the body of which only has primitive views.
struct MyView: View {
@State var showSecret = false
var body: some View {
VStack {
Group {
Button("Show secret") { showSecret.toggle() }
if showSecret {
Text("Secret")
} else {
Color.red
}
}
.padding()
}
}
}
SwiftUI will need to keep track of the view hierarchy of an app at runtime, in a structure that I will call the view graph. The view graph stores references to the View
that they were built for, and it can store state for the views.
For an app that uses MyView
, the part of the view graph associated with MyView
will look like this:
I have colored the nodes in this view graph by the category of their view.
View graphs for primitive views look very similar in structure to initialized View
structures themselves. But they are not exactly the same. Composed views have a child view graph node for their body. Views underneath lazy container views might not have been evaluated completely.
Very interesting in this sense are the overlay and background modifiers. A view that is used as an overlay or background, has seperate state for every graphic it is used upon. The overlay and background modifier view graph nodes will thus have as many extra child nodes as they have graphics.
View graphs are not static, and a major part in the diffing that SwiftUI does will be to find out how to update the view graph. In the example of MyView
, if showSecret
is changed, the Text
underneath the _ConditionalView
node in the view graph is replaced by a Color
node.
The structure of views graphs can only change in a limited number of ways:
_ConditionalView
changes from showing its TrueContent
to its FalseContent
or the other way around. This is commonly the result of a boolean condition in a composed view changing such that the body of the view uses a different branch of an if
statement.Optional<View>
changes from showing nothing to showing something, or the other way around. This is commonly the result of a boolean condition in a composed view changing such that the body of the view uses the content inside an if-statement or not.ForEach
view changes, so that it needs to update its children. The type of the children cannot change.AnyView
changes. The old content is removed from screen, and the new content is added.In all four cases, entire view subgraphs are added or removed from underneath the view graph nodes. The view graph node relationships cannot be changed in any other way. You can never add or remove a view modifier. You can also never move a view to a different parent.
Not all the views in a view graph play a role in layout, and the ones that do can play different roles. We can visualize the displayables of MyView
in the following layout tree:
Because the view graph is the representation of all the views at runtime, SwiftUI will need to be able to build a layout tree from the view graph. During diffing, it will need to compute how the displayables in the layout tree change.
Notice how the view graph and layout tree are related, but different. Structural views do not have any displayable of their own, and are absent in the layout tree. Modifiers are parents of a structural view in the view graph, but parents of every displayable of the unary views in the layout tree.
There are multiple ways in which the graphics can take shape. SwiftUI can create a UIView
or a CALayer
to display them. But although some views, such as HStack
and VStack
participate in layout, they don't get their own UIView
or CALayer
.
SwiftUI needs to perform diffing to find out how to change the view graph and how to update the layout tree. Primitive views play the most important role, and there are only a small number of structural views underneath which a view graph can change. Let us take a closer look at how SwiftUI can use static types to do this.
We know that a SwiftUI view can generate a list of displayables, so we can create our own View
protocol where we make this explicit. For the diffing, we will also give it an update
function, so that the view can help us decide which view graph and layout tree changes to make.
protocol View {
associatedtype Body: View
var body: Body { get }
var displayables: [Displayable] { get }
func update(using view: Self, offset: Int)
}
The computed variable displayables
in the View
protocol gives us all the displayables in the view as an array. Similarly, SwiftUI will have a way to get all displayable items out of a View
, but this is not public. Container views, that are present in the layout tree, will get the displayables
out of the view they wrap. The containers themselves will have a displayables
of length 1.
The update
function in our view protocol is used for diffing. It compares the current view to a new view of the same type. It should be able to tell us which view changed so that we can update the view graph, and how the displayables
changed so that we can update the container views.
We use the offset
parameter for the index that the first displayable has in the displayables
list of the current container. We need this, because the view on which update
is called, might not be the first view inside a container.
For composed views, we will provide a default implementation of displayables
and update
that forwards the calls to the body. SwiftUI avoids calling the body when it isn't necessary, but this is only an optimization and doesn't affect the diffing.
extension View {
var displayables: [Displayable] { body.displayables }
func update(using view: Self, offset: Int) {
body.update(using: view.body, offset: offset)
}
}
To make primitive views work, with a body of type Never
, we also need to make Never
itself conform to View
.
extension Never: View {
var body: some View {
fatalError()
}
}
We know that a Displayable
is part of the layout tree, and that it can generate a graphic. But the internals of Displayable
aren't relevant either for the SwiftUI diffing algorithm right now.
protocol Displayable {}
Using our View
protocol, we will now re-define a number of primitive views from SwiftUI. We would then like to diff two views of the same type, and see if we can find out which views are added or removed from their view graph, and which displayables were added to or removed from their container in the layout tree.
We would also like to see how the types of the views are helping us when comparing them, and why using them is better than comparing the displayables
of type [Displayable]
directly.
Alongside making some primitive views, we will also make a view builder so that we can use the same nice syntax SwiftUI provides.
@resultBuilder
struct ViewBuilder {
// This is the only required function
static func buildBlock<Content: View>(_ content: Content) -> Content {
content
}
}
Let us first create just one single unary view, a Text
. This view will produce a single displayable, I will say that this is a view with a size of 1.
struct Text: View {
let text: String
init(_ text: String) {
self.text = text
}
var body: Never { fatalError() }
}
extension Text {
var displayables: [Displayable] {
[TextDisplayable(text: text)]
}
}
struct TextDisplayable: Displayable {
let text: String
}
Finding out what changed in the update
function is also simple. The string value used inside the text either changed, or it didn't.
extension Text {
func update(using view: Text, offset: Int) {
if self.text != view.text {
print("Changed displayable at \(offset)")
}
}
}
The easiest View
with a size bigger then 1 is probably the TupleView
. Like SwiftUI, we will also create one if you put two views into a ViewBuilder
. For now, our TupleView
will only support tuples of length 2.
struct TupleView<C0: View, C1: View>: View {
let content: (C0, C1)
var body: Never { fatalError() }
}
extension ViewBuilder {
static func buildBlock<C0: View, C1: View>(_ c0: C0, _ c1: C1) -> TupleView<C0, C1> {
TupleView(content: (c0, c1))
}
}
A tuple view with a tuple of length 2 takes takes two other views, and combines their displayables.
extension TupleView {
var displayables: [Displayable] {
content.0.displayables + content.1.displayables
}
}
Tuple views are fixed, there is nothing inside this view that can change. Only its children can change.
extension TupleView {
func update(using view: TupleView<C0, C1>, offset: Int) {
content.0.update(using: view.content.0, offset: offset)
content.1.update(using: view.content.1, offset: offset + view.content.0.displayables.count)
}
}
Do you know what the size of a tuple view of length 2 is? Trick question, because it is not necessarily 2. Rather, it is the size of the views in the tuple combined. TupleView<Text, Text>
has size 2, but TupleView<TupleView<Text, Text>, Text>
has size 3.
For a more interesting view, let us create _ConditionalView
, and the view builder functions that allow us to make them by writing if
-statements.
struct _ConditionalView<TrueContent: View, FalseContent: View>: View {
enum ConditionalContent {
case first(TrueContent)
case second(FalseContent)
}
let content: ConditionalContent
var body: Never { fatalError() }
}
extension ViewBuilder {
static func buildEither<TrueContent: View, FalseContent: View>(first: TrueContent) -> _ConditionalView<TrueContent, FalseContent> {
_ConditionalView(content: .first(first))
}
static func buildEither<TrueContent: View, FalseContent: View>(second: FalseContent) -> _ConditionalView<TrueContent, FalseContent> {
_ConditionalView(content: .second(second))
}
}
extension _ConditionalView {
var displayables: [Displayable] {
switch content {
case .first(let content):
return content.displayables
case .second(let content):
return content.displayables
}
}
}
The conditional view is more interesting, because it is the first view we look at, where content can be added to or removed from screen. If the content didn't change from TrueContent
to FalseContent
or the other way around, only the child of _ConditionalView
can have changed, and we need to update it similarly to how we did it for TupleView
. But if the content did change, we remove the old items from screen, and add the new items.
extension _ConditionalView {
func update(using view: _ConditionalView<TrueContent, FalseContent>, offset: Int) {
switch (content, view.content) {
case (.first(let oldValue), .first(let newValue)):
// Content didn't change, but the child may have changed
oldValue.update(using: newValue, offset: offset)
case (.second(let oldValue), .second(let newValue)):
// Content didn't change, but the child may have changed
oldValue.update(using: newValue, offset: offset)
case (.first(let oldValue), .second(let newValue)):
// Content was replaced!
for i in 0 ..< oldValue.displayables.count {
print("Removed displayable at \(offset + i)")
}
for i in 0 ..< newValue.displayables.count {
print("Inserted displayable at \(offset + i)")
}
case (.second(let oldValue), .first(let newValue)):
// Content was replaced!
for i in 0 ..< oldValue.displayables.count {
print("Removed displayable at \(offset + i)")
}
for i in 0 ..< newValue.displayables.count {
print("Inserted displayable at \(offset + i)")
}
}
}
}
We can write the structural views Group
, Optional where Wrapped: View
, and EmptyView
in similar ways. We will look into ForEach
later, but let's take what we already have out for a spin!
struct TestList: View {
var value: Bool
@ViewBuilder
var body: some View {
Text("One")
if (value) {
Text("Two")
} else {
Text("Three")
}
}
}
let view1 = TestList(value: true)
print(view1.body.displayables)
[TextDisplayable(text: "One"), TextDisplayable(text: "Two")]
let view2 = TestList(value: false)
view1.body.update(using: view2.body, offset: 0)
Removed displayable at 1
Inserted displayable at 1
The code we wrote found out that a view was replaced, and could tell how the displayables should change in the container we are currently inside.
Let us change the experiment slightly.
struct OtherTestList: View {
var value: Bool
@ViewBuilder
var body: some View {
Text("One")
Text(value ? "Two" : "Three")
}
}
let view1 = OtherTestList(value: true)
print(view1.body.displayables)
[TextDisplayable(text: "One"), TextDisplayable(text: "Two")]
let view2 = OtherTestList(value: false)
view1.body.update(using: view2.body, offset: 0)
Changed displayable at 1
Remember the example in the introduction where it was ambiguous what the changes in a list of labels was? This is the same example! But the algorithm we created knows exactly whether a label's text has changed, or whether it was replaced with another label. It was able to do that, by using static typing, and by making the view types themselves help determine what has changed.
We now know how to get a list of displayables from a view. We also know what changed if we are given a view of the same type. A container view, like a List
, would read the displayables of the view it wraps, and register itself for change notifications of those displayables. You can see how it could use a UITableView
with that.
In the View
protocol in the example above, views compare other instances of the same type to themselves. Like that, we could find out which views are added, removed or changed. We could also find what changed in the current container view, and thus what the changes to the layout tree are.
But the update
function did not have any references to the view graph nodes, nor to the current container, so apart from printing what needs to change, we can't actually apply the changes. To make the View
protocol internal functions closer to ones that could actually work, we would have to add those references.
We can add a reference to the current view graph node as a parameter to the update
function. That way, when we see that a view is inserted or removed, we can also create or remove a view graph node.
The reference to the current container can be added to the view graph node. When we see that a view was inserted or removed, we can then inform the current container of the change with the offsets of the associated displayables.
View graph nodes of unary views would also have to hold on to the displayable. That way, it can change it later. If the text inside a Text
changes, we need to update the displayable as well. If a modifier changes, we need to get all the displayables under its view graph node and update them.
We didn't discuss ForEach
yet, because unfortunately, ForEach
is different. We were able to use the types of views to see how the bodies of views change, but there is no way for us to tell how the data inside a ForEach
changed. That means, we cannot prevent the ambiguity in the same way we did it for other structural views.
But it also makes sense: this is why ForEach
requires the data it uses to be identifiable. And it means we need a shortest-edit-script algorithm. It seems that SwiftUI is not using it, but Swift even has a such a diffing algorithm built in.
Laziness might not be directly associated with SwiftUI's diffing algorithm, but we have explored the view graph and containers already, and it would be a shame not to discuss a very interesting property of lazy views. Also, it will give us a hint on how to make the offset
parameter more efficient.
One of the lazy container views in SwiftUI is List
. That is a nice view, because it internally uses a UITableView
, which we know how to work with. You can even check that, by putting breakpoints on the insertRows
and removeRows
methods on UITableView.
Although a list is lazy, because it uses a table view, it does still need to know how many rows it will have to display, and what the offset is of elements that are added or removed. That doesn't seem to work well together with laziness. Can we know the size of a composed view without evaluating it?
Turns out, in many cases we can, just by knowing the type of the body of a composed view. For example, the size of a TupleView<(Text, Text, Button)>
is always 3, no matter the particular instance. So a composed view where the body is a TupleView<(Text, Text, Button)>
also always has size 3.
But you can't always know the size of a view just by its type. A _ConditionalView<EmptyView, Text>
can have sizes 0 or 1. If a composed view has a body of this type, and it is used in a lazy stack or a list, we still have to evaluate it to determine its size.
We can add a static computed size
variable to our View
protocol for this. A composed view will forward the call to the Body
type. Primitive types can either return a size if that size is static, or nil
if the size is dynamic. If it is nil
, we cannot determine the size just by looking at the type.
You can confirm that this is how it works in SwiftUI. Let's say that you have a view like this:
struct ContentView: some View {
var body: some View {
List {
ForEach(0 ..< 1000, id: \.self) { i in
ListItems(i)
}
}
}
}
If ListItems
has a statically sized body type, such as Text
or TupleView<(Text, Text, Button)>
, the body of your view is only called for views that are about to be displayed on screen. But if you put a composed view with a dynamically sized body in it, for example because you wrote an if
-statement in the body, all the bodies are called immediately.
We have explored how the static view types in SwiftUI can be used to perform efficient and unambiguous diffing. If you want to know even more about how this part of SwiftUI could be implemented, check the SwiftTUI source code.
Using what we explored, we can explain some of the behavior of SwiftUI. We see why it is not possible to move a view to another place, by moving the view to a different parent. We also see why we cannot add or remove modifiers to views.
SwiftUI seems magical because it is closed-source and because its internals are not described. But by investigating concrete examples of views, we can find out many of its principles. And by using features of the Swift language, we can re-create many parts of it. I hope that like me, it lets you appreciate the framework even more, understand it better, and make even better use of it.