All Systems Go 2017: Updating Embedded Systems
At the All Systems Go conference in Berlin, Michael Obrich reported at 2017-10-22 about "Updating Embedded Systems - Putting it all together", based on the experiences of the Pengutronix integration team.
While some years ago embedded systems have almost never been updated, the recent IoT security incidents started to make the device manufacturers think about how to bring new software to their embedded devices. Many Linux systems implement an "A/B" scheme, consisting of two system software slots. One is currently running and an update is written to the currently-unused slot. This mechanism can be implemented in an "atomar" way, so even a failure in the update process cannot brick the system.
Besides writing the actual update (being more or less generic), there is another important task: the functionality of the updated system needs to be checked against possible errors. In case something went wrong, interactive devices have the possibility to show a related message on the screen, giving the user the option to switch back to the old software. The same concept works for non-interactive systems which are controlled by a rollout server. Completely unattended systems are a more difficult case: there is the possibility to use the watchdog to supervise the startup procedure; after the kernel has booted, systemd can be used as the watchdog handler. While in normal desktop usecases systemd falls back to a shell in case of an error, non-interactive embedded usecases need to take additional effort to supervise its applications correctly:
- The system can try to boot a 2nd time in case of an error, counting the boot attempts in the bootloader (i.e. barebox). When the reboot counter reaches a limit, the system can do an automatic fallback to the old version.
- In case of transient errors, this mechanism can run in a loop.
- For more safety critical systems, the device can be brought into a safe state and shutdown in case of an error.
Usually the exact setup is highly project specific.
The infrastructure for writing updates and taking care of the software slots is often implemented with RAUC in our projects; RAUC is a sophisticated tool that can check cryptographical signatures or update the boot slots while transferring minimal data by using the new casync mechanism.
Wir wollen zum Bundesweiten Digitaltag am 18.6.2021 das Thema "Smarte Städte" ein bisschen von der technischen Seite beleuchten, aber keine Angst: es bleibt für alle verständlich.
Distributionen wie Raspbian lassen die passgenaue Zusammenstellung eines Betriebssystems kinderleicht aussehen. Image herunterladen, Pakete installieren, noch ein paar Änderungen - fertig. Alles wie auf dem Laptop oder Server. Warum ein Betriebssystem aus einer klassischen Distribution im Produkt-Kontext zur Katastrophe führen kann, beleuchtet der Vortrag "Raspbian vs. Build-Systeme: Das richtige Werkzeug für solide Produkte".
"FOSDEM is a free event for software developers to meet, share ideas and collaborate. Every year, thousands of developers of free and open source software from all over the world gather at the event in Brussels. In 2021, they will gather online." -- FOSDEM
Now that, due to the COVID-19 pandemic, everyone has gotten used to digitalisation and online conferences - it has never been easier to organise a conference and bring together all experts and interested parties for a few hours of intensive exchange of ideas on a certain topic.
In this blog post I would like to address the challenges of performing unattended and verified updates of embedded Linux systems in the field using open source software and workflows. While updating is not a end in itself, a second part of my considerations goes even further and also works out the necessities and possible workflows for keeping the software stack of a project up to date and thus either preventing security issues or at least enabling a short reaction time in case of severe CVE'S discovered.