I am asked to describe the steps involved in a context switch (1) between two different processes and (2) between two different threads in the same process.
Is this too general or what would you add to explain the process clearer?
It's much easier to explain those in reverse order because a process-switch always involves a thread-switch.
A typical thread context switch on a single-core CPU happens like this:
All context switches are initiated by an 'interrupt'. This could be an actual hardware interrupt that runs a driver, (eg. from a network card, keyboard, memory-management or timer hardware), or a software call, (system call), that performs a hardware-interrupt-like call sequence to enter the OS. In the case of a driver interrupt, the OS provides an entry point that the driver can call instead of performing the 'normal' direct interrupt-return & so allows a driver to exit via the OS scheduler if it needs the OS to set a thread ready, (eg. it has signaled a semaphore).
Non-trivial systems will have to initiate a hardware-protection-level change to enter a kernel-state so that the kernel code/data etc. can be accessed.
Core state for the interrupted thread has to be saved. On a simple embedded system, this might just be pushing all registers onto the thread stack and saving the stack pointer in its Thread Control Block (TCB).
Many systems switch to an OS-dedicated stack at this stage so that the bulk of OS-internal stack requirements are not inflicted on the stack of every thread.
It may be necessary to mark the thread stack position where the change to interrupt-state occurred to allow for nested interrupts.
The driver/system call runs and may change the set of ready threads by adding/removing TCB's from internal queues for the different thread priorities, eg. network card driver may have set an event or signaled a semaphore that another thread was waiting on, so that thread will be added to the ready set, or a running thread may have called sleep() and so elected to remove itself from the ready set.
The OS scheduler algorithm is run to decide which thread to run next, typically the highest-priority ready thread that is at the front of the queue for that priority. If the next-to-run thread belongs to a different process to the previously-run thread, some extra stuff is needed here, (see later).
The saved stack pointer from the TCB for that thread is retrieved and loaded into the hardware stack pointer.
The core state for the selected thread is restored. On my simple system, the registers would be popped from the stack of the selected thread. More complex systems will have to handle a return to user-level protection.
An interrupt-return is performed, so transferring execution to the selected thread.
In the case of a multicore CPU, things are more complex. The scheduler may decide that a thread that is currently running on another core may need to be stopped and replaced by a thread that has just become ready. It can do this by using its interprocessor driver to hardware-interrupt the core running the thread that has to be stopped. The complexities of this operation, on top of all the other stuff, is a good reason to avoid writing OS kernels :)
A typical process context switch happens like this:
Process context switches are initiated by a thread-context switch, so all of the above, 1-9, is going to need to happen.
At step 5 above, the scheduler decides to run a thread belonging to a different process from the one that owned the previously-running thread.
The memory-management hardware has to be loaded with the address-space for the new process, ie whatever selectors/segments/flags/whatever that allow the thread/s of the new process to access its memory.
The context of any FPU hardware needs to be saved/restored from the PCB.
There may be other process-dedicated hardware that needs to be saved/restored.
On any real system, the mechanisms are architecture-dependent and the above is a rough and incomplete guide to the implications of either context switch. There are other overheads generated by a process-switch that are not strictly part of the switch - there may be extra cache-flushes and page-faults after a process-switch since some of its memory may have been paged out in favour of pages belonging to the process owning the thread that was running before.