For the longest time, Ironclad has featured three types of operation that allow
a userland process to execute other programs. These are the exec, fork, and
spawn syscalls, which userland builds abstractions on. As the complexity
of applications Ironclad is capable of running has grown, these options
(especially spawn) have become extremely cumbersome to use and justify.
In this article, we will cover what changes we are making, why, and where to go from here.
POSIX, the overarching “archetype” of operating system Ironclad fits in, provides two basic options, in the form of library calls:
fork(): The standard UNIX-like fork(), which can be called by a process to fork the current address space and caller thread, and thus make an identical copy, creating two processes from one by “mitosis”.
exec(): Another UNIX classic. It replaces the caller process with the passed program, trashing all state.
Ironclad implements these two in the kernel in a purely standards-compliant way, integrating them with our Mandatory Access Control (MAC) checks, which restrict what the user can execute and whether the process can fork, but fundamentally, they are the same thing.
fork fundamentally relies on the concept of virtual memory. At the time of fork,
two processes need to share the same addresses, as they are identical copies,
while pointing to different, isolated physical regions. This limits the systems
Ironclad can work on, as we fundamentally require a Memory Management Unit
(MMU) to handle this virtual <-> physical memory translation, and MMUs are not
that common in small embedded chips.
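To make the virtual memory requirement concrete, here is a minimal sketch for a typical POSIX host with an MMU. It is not Ironclad-specific code, and the function name fork_keeps_memory_isolated is made up for illustration. The child writes to the same virtual address the parent uses, and the parent still sees its own value, since both copies are backed by different physical memory.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns 1 if a write done by the child is invisible
// to the parent, 0 if the parent saw it, and -1 on error.
int fork_keeps_memory_isolated(void) {
    volatile int value = 1;

    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        // Child: same virtual address as in the parent, but a private copy.
        value = 2;
        _exit(value == 2 ? 0 : 1);
    }

    // Parent: wait for the child, then check our own copy is untouched.
    int status;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        return -1;
    }
    return value == 1;
}
```

Without an MMU, both processes would need the variable at the same address in the same physical memory, which is exactly what cannot work.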
In standard UNIX systems, when you want to launch a process, for example, from a
shell, the standard way to do so is a fork + exec, in a construction as follows:
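pid_t pid = fork();
if (pid == 0) {
    // Child: replace the freshly forked copy with the new program.
    execvp("other_program", argv);
    _exit(127); // Only reached if execvp fails.
}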
This is brutally inefficient, as fork copies all the data, especially the
memory address space, which is expensive to copy, only to then trash it with
exec, as loading a new program on the caller process will trash the previously
forked state.
POSIX systems have historically added Copy on Write (COW) support for fork,
which basically makes it so memory is only copied when needed, instead of all
at once. Ironclad cannot do this, as we are a hard real-time capable system,
and copy on write relies on latency-introducing mechanisms, like page faults,
in order to determine what to copy and when. This means fork + exec is
fundamentally slow and inefficient in Ironclad.
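As an illustration of why COW and page faults go together, the sketch below can be tried on a host that implements fork with COW, such as Linux. It is an assumption-laden demo, not Ironclad code: the helper name cow_faults_after_fork is hypothetical, and the count it reports depends on the host kernel (some environments do not track minor faults and will report zero).

```c
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks, has the child write once to each of `pages`
// inherited pages, and returns the minor page faults the child took while
// doing so, or -1 on error.
long cow_faults_after_fork(size_t pages) {
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0 || pages == 0) {
        return -1;
    }

    size_t len = pages * (size_t)page_size;
    char *buf = malloc(len);
    int fds[2];
    if (buf == NULL) {
        return -1;
    }
    if (pipe(fds) != 0) {
        free(buf);
        return -1;
    }
    memset(buf, 1, len); // Touch every page so the parent really owns it.

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        free(buf);
        return -1;
    }
    if (pid == 0) {
        // Child: on a COW host, the first write to each inherited page
        // traps into the kernel so the page can be copied.
        volatile char *p = buf;
        struct rusage before, after;
        getrusage(RUSAGE_SELF, &before);
        for (size_t i = 0; i < pages; i++) {
            p[i * (size_t)page_size] = 2;
        }
        getrusage(RUSAGE_SELF, &after);
        long faults = after.ru_minflt - before.ru_minflt;
        ssize_t n = write(fds[1], &faults, sizeof faults);
        _exit(n == (ssize_t)sizeof faults ? 0 : 1);
    }

    // Parent: collect the child's measurement through the pipe.
    long faults = -1;
    close(fds[1]);
    ssize_t n = read(fds[0], &faults, sizeof faults);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    free(buf);
    return n == (ssize_t)sizeof faults ? faults : -1;
}
```

Each of those faults is a trip into the kernel at a moment the program did not choose, which is the kind of unpredictable latency a hard real-time system wants to avoid.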
For cases like these, Ironclad provides a spawn primitive, which basically
condenses both steps into a single syscall, so the kernel can avoid unnecessary
copying. With it, the previous fork + exec example becomes:
int pid = spawn("other_program", argv);
This solution is far from a panacea though. The problem is that software does
not only fork to then exec. A really common pattern in POSIX systems is:
int child = fork();
if (child == 0) {
    // Replace std streams.
    dup2(new_tty_fd, 0);
    dup2(new_tty_fd, 1);
    dup2(new_tty_fd, 2);
    ioctl(new_tty_fd, TIOCSCTTY, NULL);
    execvp(start_path, args);
    perror("Could not start");
    _exit(1); // Do not fall through into the parent's code.
}
Here, the environment is set up between fork and exec. These preparations are
rooted in how POSIX I/O and process management work, so it is not a thing we
can just tell people not to do. Not only that, but we aim for Ironclad to be
an easy system to port existing POSIX software to, as proven by Gloire,
our biggest distribution, which has hundreds of packages ported using constructs
like these with minimal patching. If spawn cannot address these issues, it
cannot be used as an alternative.
One way spawn functions have evolved to fit these use cases is posix_spawn, a
POSIX-provided alternative with extremely complex options, trying to bound
the arbitrary execution between fork and exec. Trying to bound infinite
behaviour is hard though, and this has led to an extremely complex and ugly
interface that tries to accommodate everything, which in turn has led to poor
adoption.
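To give a feel for that interface, here is a rough sketch of the stream replacement from the earlier pattern expressed with posix_spawn file actions. The helper name run_with_stdio is hypothetical, and the code assumes a host libc that ships spawn.h. Note that the ioctl(TIOCSCTTY) step has no counterpart among the standard file actions, so it is simply missing here.

```c
#include <assert.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

// Hypothetical helper: runs `path` with `fd` as stdin, stdout and stderr,
// waits for it, and returns its exit status, or -1 on error.
int run_with_stdio(const char *path, char *const argv[], int fd) {
    assert(path != NULL && argv != NULL && fd >= 0);

    posix_spawn_file_actions_t actions;
    if (posix_spawn_file_actions_init(&actions) != 0) {
        return -1;
    }

    // Each dup2() done between fork and exec becomes a queued "file action".
    int err = posix_spawn_file_actions_adddup2(&actions, fd, STDIN_FILENO);
    if (err == 0) {
        err = posix_spawn_file_actions_adddup2(&actions, fd, STDOUT_FILENO);
    }
    if (err == 0) {
        err = posix_spawn_file_actions_adddup2(&actions, fd, STDERR_FILENO);
    }

    pid_t pid = -1;
    if (err == 0) {
        err = posix_spawn(&pid, path, &actions, NULL, argv, environ);
    }
    posix_spawn_file_actions_destroy(&actions);
    if (err != 0) {
        return -1;
    }

    int status;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Three dup2 calls already need an init, three adddup2 calls and a destroy, and anything the standard did not foresee cannot be expressed at all.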
One cannot just put such a complicated interface in the kernel: the more complex an interface, the more likely it is to cause bugs and design mistakes. You would need a subset, which will inherently cause adaptability issues in this case.
A really good answer to all these issues is to just not engage. It was a bad
decision to allow this pattern to root itself in POSIX software. The fork +
exec approach has caused tons of issues historically, and it will keep
causing them.
This has caused a lot of research operating systems, some similar to Ironclad,
like Zircon, to just not implement fork. exec is not as problematic; fork
is the root of all issues.
Do not get me wrong, I would love to do that, but Ironclad’s scope is different from Zircon’s. Zircon is an attempt to build a full OS solution from the ground up, so they can afford to break compatibility. With Ironclad, we want to port as much POSIX software as possible, both because we cannot afford to implement a whole userland on top of our kernel, and because Ironclad is meant to be a pragmatic POSIX system. Thus, unless we want to patch fundamental behaviour in all programs, we need to compromise on our approach.
vfork
The solution that we are pivoting to right now is an old POSIX friend, vfork.
vfork is fork, but instead of copying memory state, it locks the parent
process out of execution until the child calls exec or exits. No two copies of
the address space run at once, which solves a ton of the performance and
portability issues.
vfork also allows the user to execute arbitrary code between fork and
exec, so no extremely complicated spawn interface has to be supported
in the kernel.
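As a sketch of what this looks like from userland, the following runs a program through vfork and waits for it. The helper name run_with_vfork is hypothetical, and the code assumes a host libc that still declares vfork (the _DEFAULT_SOURCE define is a glibc detail). The child here only calls execvp and _exit, the conservative subset that traditional vfork implementations allow.

```c
#define _DEFAULT_SOURCE // glibc: make sure vfork() is declared.
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `file` (looked up in PATH) with vfork + exec,
// waits for it, and returns its exit status, or -1 on error.
int run_with_vfork(const char *file, char *const argv[]) {
    assert(file != NULL && argv != NULL);

    // The parent is suspended here until the child execs or exits,
    // so no copy of the address space is needed.
    pid_t pid = vfork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        execvp(file, argv);
        _exit(127); // Only reached if execvp fails.
    }

    int status;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The shape is the same as the fork + exec construction, which is why software using that pattern needs so little adaptation.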
We are going to replace spawn, which has run out of usefulness due to the
usability issues mentioned before, with vfork in the form of a flag to the
fork syscall, which will be exposed as part of the posix_spawn and vfork
functions in the libc. This will be a transparent change for POSIX software,
but it will do a lot to work around the issues.
In the future, as we work on patching and pivoting software to not rely on the
notion of fork, a process that will no doubt take a long time, we
will remove support for fork altogether from the kernel, moving fork to a
deprecated userland-only implementation.
For now, these are transparent changes for POSIX software, since these abstractions are used by the libc to provide the standard C functions. If you use the syscalls directly, you may need to adapt.
If you maintain POSIX applications, whether for Ironclad or any other system,
and you use fork, you should consider adapting your application to use
posix_spawn. As we change scheduling interfaces, the libc implementations
will use them transparently, so you will always be ready for the next step. Do
not adapt by using vfork explicitly: while it will work just fine for
Ironclad, POSIX dropped it as part of POSIX.1-2008.
For any feedback please join an Ironclad community, or as a second choice, contact me at my email. I would love to hear back!