Asynchronous Programming in Rust

Concurrency is about dealing with a lot of things at the same time, Concurrency is about working smarter
Parallelism is about doing a lot of things at the same time, Parallelism is a way of throwing more resources atthe problem
threads can be a means to perform tasks in parallel, but they can also be a means to achieve concurrency
asynchronous programming: the way a programming language or library abstracts over concurrent operations, and how we as users of a language or library use that abstraction to execute tasks concurrently
the operating system is an abstraction over the hardware
a system call uses a public API that the operating system provides so that programs we write in userland can communicate with the OS
memory management unit: translate the virtual address we use in our programs to a physical address
most modern CPUs are built with an OS in mind, so they provide the hooks and infrastructure that the OS latches onto upon bootup
cooperative: tasks that yield voluntarily when it can't progress further before another operation has finished
non-cooperative: tasks that don't necessarily yield voluntarily, the scheduler can stop the task and take control over the CPU even through the task would have been able to do work and progress
stackful:
- each task has its own call stack(similar to the stack used by the os for its threads)
- stackful tasks can suspend execution at any point in the program as the whole stack is preserved
stackless:
- there is not a separate stack for each task, they all run sharing the same call stack
- a task can't be suspended in the middle of a stack frame
- they need to store/restore less imformation when switching between tasks
os threads:
- simple to understand, easy to use, switching between tasks is reasonably fast, get parallelism for free
- come with a rather large stack, context switching can be costly, might not switch back to you thread as fast as you'd wish, tightly coupled to an os abstraction
fibers/green thread:
- simple to use(look like os threads), context switching is fast, less memory issue, full control over how tasks are scheduled
- stacks need a way to grow when run out of space, still need to save the CPU state on contextswitch, complicated to support many platforms and/or CPU architectures
- FFI is very complicated
callback based approaches: the whole idea is to save a pointer to a set of instructions we want to run later together with whatever state is needed(closure in Rust)
- easy to implement, no context switching, low memory overhead(in most cases)
- memory usage grows linearly, hard to reason about, affect almost all aspects of the program, owership can be hard to reason about, sharing state between tasks is difficult, debugging can be difficult
coroutines: promises and futures, each task is represented as a state machine
coroutines and async/await:
- write code the same way you normally would, no context switching, can be implemented in a memory-efficient way, easy to implement for various platforms
- pre-emption can be hard, need compiler support to leverage all advantages, debugging can be difficult
is all I/O blocking? The answer is Maybe: os is smart and will cache a lot of information in memory, a syscall requesting that information would simply return immediately
a bitmask is a way to treat each bit as a switch, or a flag, to indicate that an option is either enabled or disabled
epoll can notify events in a level-triggered or edge-triggered mode
- level-triggered: a read event has accurred as long as there is data in the buffer associated with the file handle. can encounter a case where we get multiple notifications on the same event
- edge-triggered: a read event has occurred when the buffer has changed from having no data to having data, if you don't drain the buffer properly, you will never receive a notification on that file handle again
CPUs implement an instruction set, the instruction set defines what instructions the CPU can execute and the infrastructure it should provide to programmers
registers are fixed sized slots the CPU need to temporarily store and operate on data
the stack is nothing more than a piece of contiguous memory
- the stack supports simple push/pop instructions on a contiguous part of memory, thst's what makes it fast to use
- the heap memory is allocated by a memory allocator on demand and can be scattered around in different locations
async in Rust using a poll-based approach inwhich an asynchronous task will have three phases
1. poll phase
  - resulting in the task progressing until a point where it can no longer make progress
  - executor: the part of the runtime that polls a future
2. wait phase
  - reactor: an event source, registers that a future is waiting for an event to happen and makes sure that it will wake the future when that event is ready
3. wake phase
  - the event happens and the future is woken up
  - it is now up to the executor that polled the future in step 1 to schecule the future to be polled again and make further progress until it completes or reaches a new pointer where it can not make futher progress and the cycle repeats
leaf futures: runtimes creates leaf futures, which represent a resource such as a socket
non-leaf futures: the kind of futures we as users of a runtime write ourselves using the async keyword to create a task that can be run on the executor
a fully working async system in Rust can be divided into three parts:
- reactor: responsible for notifying about I/O events
- executor: scheduler
- future: a task taht can stop and resume at specific points
step1
- executor holds a list of futures
- try to run the future by polling it, and hands it a waker
- when receiving Pending/Ready, the executor knows it can start polling a different future
step2
- reactor stores a copy of waker that the executor passed to the future when it polled it
- reactor tracks events on that I/O source
step3
- when the reactor gets a notification that an event has happend on one of the tracked sources, it locates the waker associated with that source and calls Waker::wake on it
- this will in turn inform the executor that the future is ready to make progress so it can poll it once more
executors and reactors never need to communicate with one another directly since the waker is the same across all executors
rust only provides what's necessary to model asynchronous operations in the language:
- a common interface that represents an operation, which will be completed in the future through the Future trait
- an ergonomic way of creating tasks that can be suspended and resumed through async and await
- a defined interface to wake up a suspended task through the Waker type
a stackless coroutine is a way of representing a task that can be interrupted and resumed
async/await: if we tagged the start of each function and each point we wanted to yield control back to the caller with a few keywords and had our state machine generated for us
all runtimes in Rust need to do at least two things: schedule and drive objects implementing Future trait to completion
- reactor/driver: the part that thracks events we're waiting on and makes sure to wait on notifications from the OS in an efficient manner
- executor: the part that schedules tasks and polls them to completion
an executor simply decides who gets time on the CPU to progress and when they get it, it must also call Future::poll and advance the state machines to its next state
a loosely coupled reactor and executor
the space coroutines need will grow linearly with the number of different variables that need to be stored/restored, Rust optimizes this by implementing coroutines as enums, where each state only holds the data it needs to restore in the next state, this way, the size of a coroutine is not dependent on the total number of variables, it's only dependent on the size of the largest state that needs to be saved/restored. it's an advantage that stackless coroutines have over stackful coroutines when it comes to memory efficiency
moving a self-referential struct

Rain all day, Code all night

Asynchronous Programming in Rust