INITIALIZATION Knowledgebase: ki-dev-large Base Query: what is an overview for best practices in web api design?> Model: gemini-1.5-pro-preview-0409 Use Curl?: None ================================================== **Elapsed Time: 0.00 seconds** ================================================== ROUTING Query type: summary ================================================== **Elapsed Time: 2.19 seconds** ================================================== PRIMER Primer: You are Simon, a highly intelligent personal assistant in a system called KIOS. You are a chatbot that can read knowledgebases through the "CONTEXT" that is included in the user's chat message. Your role is to act as an expert at summarization and analysis. In your responses to enterprise users, prioritize clarity, trustworthiness, and appropriate formality. Be honest by admitting when a topic falls outside your scope of knowledge, and suggest alternative avenues for obtaining information when necessary. Make effective use of chat history to avoid redundancy and enhance response relevance, continuously adapting to integrate all necessary details in your interactions. Use as much tokens as possible to provide a detailed response. ================================================== **Elapsed Time: 0.49 seconds** ================================================== FINAL QUERY Final Query: CONTEXT: ########## File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 1 Context: ```markdown # The little book about OS development **Authors:** Erik Helin, Adam Renberg **Date:** 2015-01-19 **Commit:** fe8e3e27dab3c399303354d2lea38f6d4ee2928212 ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 2 Context: I'm unable to assist with that. 
#################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 3 Context: ```markdown # Contents 1. [Introduction](#introduction) 7 - [About the Book](#about-the-book) 7 - [The Reader](#the-reader) 8 - [Credits, Thanks and Acknowledgements](#credits-thanks-and-acknowledgements) 8 - [Contributors](#contributors) 8 - [Changes and Corrections](#changes-and-corrections) 8 - [Issues and where to get help](#issues-and-where-to-get-help) 9 - [License](#license) 9 2. [First Steps](#first-steps) 11 - [Tools](#tools) 11 - [Quick Setup](#quick-setup) 11 - [Programming Languages](#programming-languages) 11 - [Host Operating System](#host-operating-system) 12 - [Build System](#build-system) 12 - [Virtual Machine](#virtual-machine) 12 - [Booting](#booting) 12 - [BIOS](#bios) 12 - [The Bootloader](#the-bootloader) 13 - [The Operating System](#the-operating-system) 13 - [Hello Cafebabe](#hello-cafebabe) 13 - [Compiling the Operating System](#compiling-the-operating-system) 13 - [Linking the Kernel](#linking-the-kernel) 14 - [Obtaining GRUB](#obtaining-grub) 15 - [Building an ISO Image](#building-an-iso-image) 15 - [Running Bochs](#running-bochs) 16 3. [Further Reading](#further-reading) 17 ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 4 Context: ```markdown # 3 Getting to C 19 - **Setting Up a Stack** .......................................... 19 - **Calling C Code From Assembly** .................................. 20 - Packing Structs ............................................... 20 - **Compiling C Code** ............................................... 21 - **Build Tools** ................................................... 21 - **Further Reading** ................................................ 22 # 4 Output 23 - **Interacting with the Hardware** .................................. 
23 - **The Framebuffer** ................................................ 23 - Writing Text .................................................... 23 - Moving the Cursor ............................................... 25 - The Driver ..................................................... 26 - **The Serial Ports** ............................................... 26 - Configuring the Serial Port ..................................... 27 - Configuring the Line ............................................ 27 - Configuring the Buffers ......................................... 29 - Configuring the Modem .......................................... 29 - Writing Data to the Serial Port ................................ 30 - Configuring Bochs .............................................. 31 - The Driver ..................................................... 31 - **Further Reading** ............................................... 31 # 5 Segmentation 33 - **Accessing Memory** ............................................... 33 - **The Global Descriptor Table (GDT)** ................................ 35 - **Loading the GDT** ............................................... 36 - **Further Reading** ............................................... 37 # 6 Interrupts and Input 39 - **Interrupts Handlers** ............................................ 39 - **Creating an Entry in the IDT** .................................. 39 - **Handling an Interrupt** .......................................... 40 - **Creating a Generic Interrupt Handler** .......................... 41 - **Loading the IDT** ............................................... 
43 ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 5 Context: ```markdown # Table of Contents ## 7 The Road to User Mode 47 - **Loading an External Program** 47 - GRUB Modules 47 - **Executing a Program** 48 - A Very Simple Program 48 - Compiling 48 - Finding the Program in Memory 49 - Jumping to the Code 49 - **The Beginning of User Mode** 49 ## 8 A Short Introduction to Virtual Memory 51 - **Virtual Memory Through Segmentation?** 51 - **Further Reading** 51 ## 9 Paging 53 - **Why Paging?** 53 - **Paging in x86** 53 - **Identity Paging** 55 - **Enabling Paging** 55 - **A Few Details** 55 - **Paging and the Kernel** 55 - **Reasons to Not Identity Map the Kernel** 56 - **The Virtual Address for the Kernel** 56 - **Placing the Kernel at 0xc0000000** 56 - **Higher-half Linker Script** 57 - **Entering the Higher Half** 57 - **Running in the Higher Half** 58 - **Virtual Memory Through Paging** 58 - **Further Reading** 59 ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 6 Context: ```markdown # 10 Page Frame Allocation Managing Available Memory - How Much Memory is There? .......................................... 61 - Managing Available Memory ........................................... 63 - How Can We Access a Page Frame? ..................................... 63 - A Kernel Heap ....................................................... 63 - Further Reading .................................................... 63 # 11 User Mode Segments for User Mode ................................................ 65 - Setting Up For User Mode ............................................. 65 - Entering User Mode .................................................. 66 - Using C for User Mode Programs ...................................... 
67 - A C Library ....................................................... 68 - Further Reading .................................................... 68 # 12 File Systems Why a File System? ................................................... 69 - A Simple Read-Only File System ...................................... 69 - Inodes and Writable File Systems ..................................... 70 - A Virtual File System ............................................... 70 - Further Reading .................................................... 70 # 13 System Calls Designing System Calls ................................................ 71 - Implementing System Calls ............................................ 71 - Further Reading .................................................... 72 # 14 Multitasking Creating New Processes ................................................ 73 - Cooperative Scheduling with Yielding ................................ 73 - Preemptive Scheduling with Interrupts ................................ 74 - Programmable Interval Timer ....................................... 74 - Separate Kernel Stacks for Processes ............................... 74 - Difficulties with Preemptive Scheduling .............................. 75 - Further Reading .................................................... 75 # References ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 7 Context: ```markdown # Chapter 1 ## Introduction This text is a practical guide to writing your own x86 operating system. It is designed to give enough help with the technical details while at the same time not reveal too much with samples and code excerpts. We’ve tried to collect parts of the vast (and often excellent) expanse of material and tutorials available, on the web and otherwise, and add our own insights into the problems we encountered and struggled with. 
This book is not about the theory behind operating systems, or how any specific operating system (OS) works. For OS theory, we recommend the book *Modern Operating Systems* by Andrew Tanenbaum [1]. Lists and details on current operating systems are available on the Internet. The starting chapters are quite detailed and explicit, to quickly get you into coding. Later chapters give more of an outline of what is needed, as more and more of the implementation and design comes up to the reader, who should now be more familiar with the world of kernel development. At the end of some chapters there are links for further reading, which might be interesting and give a deeper understanding of the topics covered. In chapters 2 and 3 we set up our development environment and boot up our OS kernel in a virtual machine, eventually starting to write code in C. We continue in chapter 4 with writing to the screen and the serial port, and then we dive into segmentation in chapter 5 and interrupts and input in chapter 6. After this we have a quite functional but bare-bones OS kernel. In chapter 7 we start the road to user mode applications, with virtual memory through paging (chapters 8 and 9), memory allocation (chapter 10), and finally running a user application in chapter 11. In the last three chapters we discuss the more advanced topics of file systems (chapter 12), system calls (chapter 13), and multitasking (chapter 14). ## About the Book The OS kernel and this book were produced as part of an advanced individual course at the Royal Institute of Technology [2], Stockholm. The authors had previously taken courses in OS theory, but had only limited practical experience with OS kernel development. In order to get more insight and a deeper understanding of how the theory from the previous OS courses works in practice, the authors decided to create a new course, which focused on the development of a small OS. 
Another goal of the course was writing a thorough tutorial on how to develop a small OS basically from scratch, and this short book is the result. The x86 architecture is, and has been for a long time, one of the most common hardware architectures. It was not a difficult choice to use the x86 architecture as the target of the OS, with its large community, extensive ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 8 Context: ```markdown # Reference Material The OS was developed in about six weeks of full-time work. The implementation was done in many small steps, and after each step the OS was tested manually. By developing in this incremental and iterative way, it was often easier to find any bugs that were introduced, since only a small part of the code had changed since the last known good state of the code. We encourage the reader to work in a similar way. During the six weeks of development, almost every single line of code was written by the authors together (this way of working is also called pair-programming). It is our belief that we managed to avoid a lot of bugs due to this style of development, but this is hard to prove scientifically. ## The Reader The reader of this book should be comfortable with UNIX/Linux, systems programming, the C language, and computer systems in general (such as hexadecimal notation [3]). This book could be a way to get started learning those things, but it will be more difficult, and developing an operating system is already challenging on its own. Search engines and other tutorials are often helpful if you get stuck. ## Credits, Thanks and Acknowledgements We'd like to thank the OSDev community [4] for their great wiki and helpful members, and James Molloy for his eminent kernel development tutorial [5]. We'd also like to thank our supervisor Torbjörn Granlund for his insightful questions and interesting discussions.
Most of the CSS formatting of the book is based on the work by Scott Chacon for the book Pro Git, [http://progit.org](http://progit.org). ## Contributors We are very grateful for the patches that people send us. The following users have all contributed to this book: - alexschneider - Arvidanborisov - nirs - kedarambaswade - vamanaea - ansjob ## Changes and Corrections This book is hosted on GitHub - if you have any suggestions, comments or corrections, just fork the book, write your changes, and send us a pull request. We'll happily incorporate anything that makes this book better. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 9 Context: ```markdown # Issues and where to get help If you run into problems while reading the book, please check the issues on GitHub for help: [GitHub Issues](https://github.com/littleosbook/littleosbook/issues). # License All content is under the Creative Commons Attribution Non Commercial Share Alike 3.0 license: [CC BY-NC-SA 3.0](http://creativecommons.org/licenses/by-nc-sa/3.0/). The code samples are in the public domain - use them however you want. References to this book are always received with warmth. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 10 Context: I'm unable to assist with this request. #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 11 Context: ```markdown # Chapter 2 ## First Steps Developing an operating system (OS) is no easy task, and the question “How do I even begin to solve this problem?” is likely to come up several times during the course of the project for different problems. This chapter will help you set up your development environment and boot a very small (and primitive) operating system.
## Tools ### Quick Setup We (the authors) have used Ubuntu[^6] as the operating system for doing OS development, running it both physically and virtually (using the virtual machine VirtualBox[^7]). A quick way to get everything up and running is to use the same setup as we did, since we know that these tools work with the samples provided in this book. Once Ubuntu is installed, either physical or virtual, the following packages should be installed using `apt-get`: ```bash sudo apt-get install build-essential nasm genisoimage bochs bochs-sdl ``` ## Programming Languages The operating system will be developed using the C programming language[^8][^9], using GCC[^10]. We use C because developing an OS requires a very precise control of the generated code and direct access to memory. Other languages that provide the same features can also be used, but this book will only cover C. The code will make use of one type attribute that is specific for GCC: ```c __attribute__((packed)) ``` This attribute allows us to ensure that the compiler uses a memory layout for a `struct` exactly as we define it in the code. This is explained in more detail in the next chapter. Due to this attribute, the example code might be hard to compile using a C compiler other than GCC. For writing assembly code, we have chosen NASM[^11] as the assembler, since we prefer NASM's syntax over GNU Assembler. Bash[^12] will be used as the scripting language throughout the book. [^6]: Ubuntu documentation link [^7]: VirtualBox documentation link [^8]: C programming documentation link [^9]: C language overview link [^10]: GCC documentation link [^11]: NASM documentation link [^12]: Bash documentation link ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 12 Context: ```markdown # Host Operating System All the code examples assume that the code is being compiled on a UNIX-like operating system. 
All code examples have been successfully compiled using Ubuntu [6] versions 11.04 and 11.10. # Build System Make [13] has been used when constructing the Makefile examples. # Virtual Machine When developing an OS, it is very convenient to be able to run your code in a virtual machine instead of on a physical computer, since starting your OS in a virtual machine is much faster than getting your OS onto a physical medium and then running it on a physical machine. Bochs [14] is an emulator for the x86 (IA-32) platform which is well suited for OS development due to its debugging features. Other popular choices are QEMU [15] and VirtualBox [7]. This book uses Bochs. By using a virtual machine, we cannot ensure that our OS works on real physical hardware. The environment simulated by the virtual machine is designed to be very similar to their physical counterparts, and the OS can be tested on one by just copying the executable to a CD and finding a suitable machine. # Booting Booting an operating system consists of transferring control along a chain of small programs, each one more "powerful" than the previous one, where the operating system is the last "program". See the following figure for an example of the boot process: ``` BIOS → GRUB1 → GRUB2 → OS ``` **Figure 2.1**: An example of the boot process. Each box is a program. # BIOS When the PC is turned on, the computer will start a small program that adheres to the Basic Input Output System (BIOS) [16] standard. This program is usually stored on a read-only memory chip on the motherboard of the PC. The original role of the BIOS program was to export some library functions for printing to the screen, reading keyboard input, etc. Modern operating systems do not use the BIOS' functions; they use... 
``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 13 Context: ```markdown # Bootloader The BIOS program will transfer control of the PC to a program called a **bootloader**. The bootloader's task is to transfer control to us, the operating system developers, and our code. However, due to some restrictions¹ of the hardware and because of backward compatibility, the bootloader is often split into two parts: the first part of the bootloader will transfer control to the second part, which finally gives control of the PC to the operating system. Writing a bootloader involves writing a lot of low-level code that interacts with the BIOS. Therefore, an existing bootloader will be used: the GNU GRand Unified Bootloader (GRUB)[17]. Using GRUB, the operating system can be built as an ordinary ELF[18] executable, which will be loaded by GRUB into the correct memory location. The compilation of the kernel requires that the code is laid out in memory in a specific way (how to compile the kernel will be discussed later in this chapter). ## The Operating System GRUB will transfer control to the operating system by jumping to a position in memory. Before the jump, GRUB will look for a magic number to ensure that it is actually jumping to an OS and not some random code. This magic number is part of the multiboot specification[19] that GRUB adheres to. Once GRUB has made the jump, the OS has full control of the computer. ## Hello Cafebabe This section will describe how to implement the smallest possible OS that can be used together with GRUB. The only thing the OS will do is write `0xCAFEBABE` to the `eax` register (most people would probably not even call this an OS). ## Compiling the Operating System This part of the OS has to be written in assembly code, since C requires a stack, which isn’t available (the chapter “Getting to C” describes how to set one up).
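As a host-side aside (ordinary C built for the development machine, not kernel code), the header that GRUB inspects can be sanity-checked before any assembly is written. The multiboot specification requires the magic number, the flags word and a checksum to sum to zero modulo 2^32. The sketch below assumes that rule; the function names `multiboot_checksum` and `multiboot_header_valid` are hypothetical and not part of GRUB or the book.

```c
#include <stdint.h>

/* Checksum that makes magic + flags + checksum wrap around to 0 in 32 bits.
 * Hypothetical helper, shown only to illustrate the arithmetic. */
uint32_t multiboot_checksum(uint32_t magic, uint32_t flags)
{
    return (uint32_t)(0u - (magic + flags));
}

/* Returns 1 if the three header words sum to 0 modulo 2^32, otherwise 0. */
int multiboot_header_valid(uint32_t magic, uint32_t flags, uint32_t checksum)
{
    return (uint32_t)(magic + flags + checksum) == 0u;
}
```

With the magic number `0x1BADB002` and flags `0`, `multiboot_checksum` yields `0xE4524FFE`, which is the 32-bit two's complement of the magic number.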
Save the following code in a file called `loader.s`: ```assembly global loader ; the entry symbol for ELF MAGIC_NUMBER equ 0x1BADB002 ; define the magic number constant FLAGS equ 0x0 ; multiboot flags CHECKSUM equ -MAGIC_NUMBER ; calculate the checksum ; (magic number + checksum + flags should equal 0) section .text ; start of the text (code) section align 4 ; the code must be 4 byte aligned dd MAGIC_NUMBER ; write the magic number to the machine code dd FLAGS ; the flags; ``` > ¹ The bootloader must fit into the master boot record (MBR) boot sector of a hard drive, which is only 512 bytes large. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 14 Context: ```markdown ```assembly dd CHECKSUM ; and the checksum loader: ; the loader label (defined as entry point in linker script) mov eax, 0xCAFEBABE ; place the number 0xCAFEBABE in the register eax .loop: jmp .loop ; loop forever ``` The only thing this OS will do is write the very specific number 0xCAFEBABE to the eax register. It is very unlikely that the number 0xCAFEBABE would be in the eax register if the OS did not put it there. The file `loader.s` can be compiled into a 32 bits ELF [18] object file with the following command: ```bash nasm -f elf32 loader.s ``` ## Linking the Kernel The code must now be linked to produce an executable file, which requires some extra thought compared to when linking most programs. We want GRUB to load the kernel at a memory address larger than or equal to `0x00100000` (1 megabyte (MB)), because addresses lower than 1 MB are used by GRUB itself, BIOS and memory-mapped I/O. Therefore, the following linker script is needed (written for GNU LD [20]): ```ld ENTRY(loader) /* the name of the entry label */ SECTIONS { . 
= 0x00100000; /* the code should be loaded at 1 MB */ .text ALIGN(0x1000) : /* align at 4 KB */ { *(.text) /* all text sections from all files */ } .rodata ALIGN(0x1000) : /* align at 4 KB */ { *(.rodata) /* all read-only data sections from all files */ } .data ALIGN(0x1000) : /* align at 4 KB */ { *(.data) /* all data sections from all files */ } .bss ALIGN(0x1000) : /* align at 4 KB */ { *(COMMON) /* all COMMON sections from all files */ *(.bss) /* all bss sections from all files */ } } ``` Save the linker script into a file called `link.ld`. The executable can now be linked with the following command: ```bash ld -T link.ld -m elf_i386 loader.o -o kernel.elf ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 15 Context: ```markdown The following command will be executed to create the final executable: ``` ld -T link.ld -m elf_i386 loader.o -o kernel.elf ``` The final executable will be called `kernel.elf`. ## Obtaining GRUB The GRUB version we will use is GRUB Legacy, since the OS ISO image can then be generated on systems using both GRUB Legacy and GRUB 2. More specifically, the GRUB Legacy `stage2_eltorito` bootloader will be used. This file can be built from GRUB 0.97 by downloading the source from [ftp://alpha.gnu.org/gnu/grub/grub-0.97.tar.gz](ftp://alpha.gnu.org/gnu/grub/grub-0.97.tar.gz). However, the `configure` script doesn't work well with Ubuntu [21], so the binary file can be downloaded from [http://littlebook.github.com/files/stage2_eltorito](http://littlebook.github.com/files/stage2_eltorito). Copy the file `stage2_eltorito` to the folder that already contains `loader.s` and `link.ld`. ## Building an ISO Image The executable must be placed on media that can be loaded by a virtual or physical machine.
In this book we will use ISO [22] image files as the media, but one can also use floppy images, depending on what the virtual or physical machine supports. We will create the kernel ISO image with the program `genisoimage`. A folder must first be created that contains the files that will be on the ISO image. The following commands create the folder and copy the files to their correct places: ```bash mkdir -p iso/boot/grub # create the folder structure cp stage2_eltorito iso/boot/grub/ # copy the bootloader cp kernel.elf iso/boot/ # copy the kernel ``` A configuration file `menu.lst` for GRUB must be created. This file tells GRUB where the kernel is located and configures some options: ```plaintext default=0 timeout=0 title OS kernel /boot/kernel.elf ``` Place the file `menu.lst` in the folder `iso/boot/grub/`. The contents of the `iso` folder should now look like the following structure: ``` iso └── boot ├── grub │ ├── menu.lst │ └── stage2_eltorito └── kernel.elf ``` The ISO image can then be generated with the following command: ```bash genisoimage -R \ -b boot/grub/stage2_eltorito \ -no-emul-boot \ -boot-load-size 4 \ -boot-info-table \ -input-charset utf8 \ -quiet \ -o os.iso \ iso ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 16 Context: ```markdown For more information about the flags used in the command, see the manual for `genisoimage`. The ISO image `os.iso` now contains the kernel executable, the GRUB bootloader, and the configuration file. ## Running Bochs Now we can run the OS in the Bochs emulator using the `os.iso` ISO image.
Bochs needs a configuration file to start and an example of a simple configuration file is given below: ``` megs: 32 display_library: sdl romimage: file=/usr/share/bochs/BIOS-bochs-latest vgaromimage: file=/usr/share/bochs/VGABIOS-lgpl-latest ata0-master: type=cdrom, path=./os.iso, status=inserted boot: cdrom log: bochlog.txt clock: sync=realtime, time=local cpu: count=1, ips=1000000 ``` You might need to change the path to `romimage` and `vgaromimage` depending on how you installed Bochs. More information about the Bochs config file can be found at Bochs's website [23]. If you saved the configuration in a file named `bochsrc.txt`, then you can run Bochs with the following command: ``` bochs -f bochsrc.txt -q ``` The flag `-f` tells Bochs to use the given configuration file and the flag `-q` tells Bochs to skip the interactive start menu. You should now see Bochs starting and displaying a console with some information from GRUB on it. After quitting Bochs, display the log produced by Bochs: ``` cat bochlog.txt ``` You should now see the contents of the registers of the CPU simulated by Bochs somewhere in the output. If you find `RAX=00000000CAFEBABE` or `EAX=CAFEBABE` (depending on if you are running Bochs with or without 64 bit support) in the output then your OS has successfully booted!
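Scanning the log by eye works, but the check can also be scripted. The following is a minimal sketch of a per-line test in C; it assumes only that the marker value appears as hexadecimal text somewhere in the line, compares case-insensitively because the exact formatting of the Bochs log is not specified here, and uses the hypothetical name `line_has_boot_marker`.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if `line` contains the text CAFEBABE in any letter case, else 0.
 * Hypothetical helper; the log layout around the value is not assumed. */
int line_has_boot_marker(const char *line)
{
    static const char marker[] = "CAFEBABE";
    const size_t n = sizeof marker - 1;

    for (; *line != '\0'; line++) {
        size_t i = 0;
        /* Compare the marker against the text starting at this position. */
        while (i < n && line[i] != '\0' &&
               toupper((unsigned char)line[i]) == marker[i]) {
            i++;
        }
        if (i == n) {
            return 1;
        }
    }
    return 0;
}
```

A small driver could read the log file line by line (for example with `fgets`) and report success as soon as this function returns 1.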
``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 17 Context: ```markdown # Further Reading - Gustavo Duarte has written an in-depth article about what actually happens when an x86 computer boots up: [How Computers Boot Up](http://duartes.org/gustavo/blog/post/how-computers-boot-up) - Gustavo continues to describe what the kernel does in the very early stages at: [Kernel Boot Process](http://duartes.org/gustavo/blog/post/kernel-boot-process) - The OSDev wiki also contains a nice article about booting an x86 computer: [Boot Sequence](http://wiki.osdev.org/Boot_Sequence) ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 18 Context: I'm unable to view or interpret images. Please provide the text you want to be formatted in Markdown, and I'll help you correct it. #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 19 Context: ```markdown # Chapter 3 ## Getting to C This chapter will show you how to use C instead of assembly code as the programming language for the OS. Assembly is very good for interacting with the CPU and enables maximum control over every aspect of the code. However, at least for the authors, C is a much more convenient language to use. Therefore, we would like to use C as much as possible and use assembly code only where it makes sense. ### Setting Up a Stack One prerequisite for using C is a stack, since all non-trivial C programs use a stack. Setting up a stack is not harder than to make the `esp` register point to the end of an area of free memory (remember that the stack grows towards lower addresses on the x86) that is correctly aligned (alignment on 4 bytes is recommended from a performance perspective). 
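The address arithmetic described above can be sketched in C. This is only an illustration, assuming a reserved region that starts at `base` and is `size` bytes long; the helper name `initial_stack_pointer` is hypothetical, and in the kernel itself the equivalent step is done in assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the initial stack pointer for a stack occupying [base, base + size):
 * the address one past the end of the region, rounded down to a 4-byte
 * boundary. The stack then grows downwards from this address. */
uintptr_t initial_stack_pointer(uintptr_t base, size_t size)
{
    uintptr_t top = base + size;
    return top & ~(uintptr_t)3;
}
```

For a 4096-byte region starting at `0x1000` this gives `0x2000`; a region that ends on an unaligned address is rounded down, so a few bytes at the top may go unused.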
We could point `esp` to a random area in memory since, so far, the only thing in the memory is GRUB, BIOS, the OS kernel, and some memory-mapped I/O. This is not a good idea – we don’t know how much memory is available or if the area `esp` would point to is used by something else. A better idea is to reserve a piece of uninitialized memory in the .bss section in the ELF file of the kernel. It is better to use the .bss section instead of the data section to reduce the size of the OS executable. Since GRUB understands ELF, GRUB will allocate any memory reserved in the .bss section when loading the OS. The NASM pseudo-instruction `resb` [24] can be used to declare uninitialized data: ``` KERNEL_STACK_SIZE equ 4096 ; size of stack in bytes section .bss align 4 ; align at 4 bytes kernel_stack: ; label points to beginning of memory resb KERNEL_STACK_SIZE ; reserve stack for the kernel ``` There is no need to worry about the use of uninitialized memory for the stack, since it is not possible to read a stack location that has not been written (without manual pointer fiddling). A (correct) program can not pop an element from the stack without having pushed an element onto the stack first. Therefore, the memory locations of the stack will always be written to before they are being read. The stack pointer is then set up by pointing `esp` to the end of the `kernel_stack` memory: ``` mov esp, kernel_stack + KERNEL_STACK_SIZE ; point esp to the start of the stack (end of memory area) ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 20 Context: ```markdown # Calling C Code From Assembly The next step is to call a C function from assembly code. There are many different conventions for how to call C code from assembly code [25]. This book uses the `cdecl` calling convention, since that is the one used by GCC. 
The `cdecl` calling convention states that arguments to a function should be passed via the stack (on x86). The arguments of the function should be pushed on the stack in a right-to-left order, that is, you push the rightmost argument first. The return value of the function is placed in the `eax` register. The following code shows an example: ```c /* The C function */ int sum_of_three(int arg1, int arg2, int arg3) { return arg1 + arg2 + arg3; } ``` ```assembly ; The assembly code extern sum_of_three ; the function sum_of_three is defined elsewhere push dword 3 ; arg3 push dword 2 ; arg2 push dword 1 ; arg1 call sum_of_three ; call the function, the result will be in eax ``` ## Packing Structs In the rest of this book, you will often come across “configuration bytes” that are a collection of bits in a very specific order. Below follows an example with 32 bits: ``` Bit: | 31 24 | 23 8 | 7 0 | Content: | index | address | config | ``` Instead of using an unsigned integer, `unsigned int`, for handling such configurations, it is much more convenient to use “packed structures”: ```c struct example { unsigned char config; /* bit 0 - 7 */ unsigned short address; /* bit 8 - 23 */ unsigned char index; /* bit 24 - 31 */ }; ``` When using the struct in the previous example there is no guarantee that the size of the struct will be exactly 32 bits - the compiler can add some padding between elements for various reasons, for example to speed up element access or due to requirements set by the hardware and/or compiler. When using a struct to represent configuration bytes, it is very important that the compiler does not add any padding because the struct will eventually be treated as a 32 bit unsigned integer by the hardware.
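The effect of padding can be observed with `sizeof` and `offsetof`. The following sketch assumes GCC or a compatible compiler (for `__attribute__((packed))`) and a 2-byte `unsigned short`; the struct and function names are made up for illustration.

```c
#include <stddef.h>

/* Layout from the text: config (bits 0-7), address (8-23), index (24-31).
 * The compiler is free to insert padding between these members. */
struct example_unpacked {
    unsigned char  config;
    unsigned short address;
    unsigned char  index;
};

/* Same members, but GCC-compatible compilers are told not to add padding. */
struct example_packed {
    unsigned char  config;
    unsigned short address;
    unsigned char  index;
} __attribute__((packed));

/* Size in bytes of each variant; the unpacked size is implementation-defined. */
size_t unpacked_size(void) { return sizeof(struct example_unpacked); }
size_t packed_size(void)   { return sizeof(struct example_packed); }

/* Byte offset of `address` within each variant. */
size_t unpacked_address_offset(void)
{
    return offsetof(struct example_unpacked, address);
}

size_t packed_address_offset(void)
{
    return offsetof(struct example_packed, address);
}
```

On a typical x86 GCC build the unpacked variant is reported as 6 bytes with `address` at offset 2, while the packed variant is 4 bytes with `address` at offset 1; the unpacked numbers are implementation-defined and may differ elsewhere.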
The attribute `packed` can be used to force GCC to not add any padding: ```c struct example { unsigned char config; /* bit 0 - 7 */ unsigned short address; /* bit 8 - 23 */ unsigned char index; /* bit 24 - 31 */ } __attribute__((packed)); ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 21 Context: ```markdown Note that `__attribute__((packed))` is not part of the C standard - it might not work with all C compilers. ## Compiling C Code When compiling the C code for the OS, a lot of flags to GCC need to be used. This is because the C code should not assume the presence of a standard library, since there is no standard library available for our OS. For more information about the flags, see the GCC manual. The flags used for compiling the C code are: ``` -m32 -nostdlib -nostdinc -fno-builtin -fno-stack-protector -nostartfiles -nodefaultlibs ``` As always when writing C programs we recommend turning on all warnings and treating warnings as errors: ``` -Wall -Wextra -Werror ``` You can now create a function `kmain` in a file called `kmain.c` that you call from `loader.s`. At this point, `kmain` probably won't need any arguments (but in later chapters it will). ## Build Tools Now is also probably a good time to set up some build tools to make it easier to compile and test-run the OS. We recommend using `make`, but there are plenty of other build systems available. 
A simple Makefile for the OS could look like the following example: ```makefile OBJS = loader.o kmain.o CC = gcc CFLAGS = -m32 -nostdlib -nostdinc -fno-builtin -fno-stack-protector \ -nostartfiles -nodefaultlibs -Wall -Wextra -Werror -c LDFLAGS = -T link.ld -m elf_i386 AS = nasm ASFLAGS = -f elf all: kernel.elf kernel.elf: $(OBJS) ld $(LDFLAGS) $(OBJS) -o kernel.elf os.iso: kernel.elf cp kernel.elf iso/boot/kernel.elf \ && genisoimage -R \ -b boot/grub/stage2_eltorito \ -no-emul-boot \ ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 22 Context: ```markdown ```bash -boot-load-size 4 \ -A os \ -input-charset utf8 \ -quiet \ -boot-info-table \ -o os.iso \ iso run: os.iso bochs -f bochsrc.txt -q %.o: %.c $(CC) $(CFLAGS) $< -o $@ %.o: %.s $(AS) $(ASFLAGS) $< -o $@ clean: rm -rf *.o kernel.elf os.iso ``` The contents of your working directory should now look like the following figure: ``` . ├── bochsrc.txt ├── iso │ ├── boot │ │ ├── grub │ │ │ ├── menu.lst │ │ │ └── stage2_eltorito ├── kmain.c ├── loader.s └── Makefile ``` You should be able to start the OS with the simple command `make run`, which will compile the kernel and boot it up in Bochs (as defined in the Makefile above). ## Further Reading - Kernighan & Ritchie's book, *The C Programming Language, Second Edition*, is great for learning about all the aspects of C. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 23 Context: ```markdown # Chapter 4 ## Output This chapter will present how to display text on the console as well as writing data to the serial port. Furthermore, we will create our first driver, that is, code that acts as a layer between the kernel and the hardware, providing a higher abstraction than communicating directly with the hardware. 
The first part of this chapter is about creating a driver for the framebuffer [26] to be able to display text on the console. The second part shows how to create a driver for the serial port. Bochs can store output from the serial port in a file, effectively creating a logging mechanism for the operating system. ## Interacting with the Hardware There are usually two different ways to interact with the hardware: memory-mapped I/O and I/O ports. If the hardware uses memory-mapped I/O, then you can write to a specific memory address and the hardware will be updated with the new data. One example of this is the framebuffer, which will be discussed in more detail later. For example, if you write the 16-bit value `0x0F41` to address `0x000B8000` (on little-endian x86 the byte `0x41` lands at `0x000B8000` and the byte `0x0F` at `0x000B8001`), you will see the letter 'A' in white color on a black background (see the section on the framebuffer for more details). If the hardware uses I/O ports, then the assembly code instructions `out` and `in` must be used to communicate with the hardware. The instruction `out` takes two parameters: the address of the I/O port and the data to send. The instruction `in` takes a single parameter, the address of the I/O port, and returns data from the hardware. One can think of I/O ports as communicating with hardware the same way you communicate with a server using sockets. The cursor (the blinking rectangle) of the framebuffer is one example of hardware controlled via I/O ports on a PC. ## The Framebuffer The framebuffer is a hardware device that is capable of displaying a buffer of memory on the screen [26]. The framebuffer has 80 columns and 25 rows, and the row and column indices start at 0 (so rows are labelled 0-24). ## Writing Text Writing text to the console via the framebuffer is done with memory-mapped I/O. The starting address of the memory-mapped I/O for the framebuffer is `0x000B8000` [27]. 
The memory is divided into 16-bit cells, ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 24 Context: ```markdown where the 16 bits determine the character, the foreground color and the background color. Read as a little-endian 16-bit value, the lowest eight bits are the ASCII [28] value of the character, bits 15 - 12 the background and bits 11 - 8 the foreground, as can be seen in the following figure: ``` Bit: | 15 14 13 12 | 11 10 9 8 | 7 6 5 4 3 2 1 0 | Content: | BG | FG | ASCII | ``` In memory, the character byte therefore comes first (at the lower address) and the color byte second. The available colors are shown in the following table: | Color | Value | Color | Value | Color | Value | Color | Value | |-------------|-------|-------------|-------|-------------|-------|---------------|-------| | Black | 0 | Red | 4 | Dark grey | 8 | Light red | 12 | | Blue | 1 | Magenta | 5 | Light blue | 9 | Light magenta | 13 | | Green | 2 | Brown | 6 | Light green | 10 | Light brown | 14 | | Cyan | 3 | Light grey | 7 | Light cyan | 11 | White | 15 | The first cell corresponds to row zero, column zero on the console. Using an ASCII table, one can see that A corresponds to 65 or 0x41. Therefore, to write the character A with a green foreground (2) and dark grey background (8) at place (0,0), the following assembly code instruction is used: ``` mov word [0x000B8000], 0x8241 ``` Each cell is 16 bits, that is, two bytes wide. The second cell then corresponds to row zero, column one and its address is therefore: ``` 0x000B8000 + 2 = 0x000B8002 ``` Writing to the framebuffer can also be done in C by treating the address 0x000B8000 as a char pointer, `char *fb = (char *) 0x000B8000`. Then, writing A at place (0,0) with green foreground and dark grey background becomes: ```c fb[0] = 'A'; fb[1] = 0x82; ``` The following code shows how this can be wrapped into a function: ```c /* fb_write_cell: * Writes a character with the given foreground and background to position i * in the framebuffer. 
* * @param i The location in the framebuffer * @param c The character * @param fg The foreground color * @param bg The background color */ void fb_write_cell(unsigned int i, char c, unsigned char fg, unsigned char bg) { fb[i] = c; /* color byte: background in the high nibble, foreground in the low nibble */ fb[i + 1] = ((bg & 0x0F) << 4) | (fg & 0x0F); } ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 25 Context: ```markdown The function can then be used as follows: ```c #define FB_GREEN 2 #define FB_DARK_GREY 8 fb_write_cell(0, 'A', FB_GREEN, FB_DARK_GREY); ``` ## Moving the Cursor Moving the cursor of the framebuffer is done via two different I/O ports. The cursor's position is determined with a 16-bit integer: 0 means row zero, column zero; 1 means row zero, column one; 80 means row one, column zero and so on. Since the position is 16 bits large, and the `out` assembly code instruction argument is 8 bits, the position must be sent in two turns, first 8 bits then the next 8 bits. The framebuffer has two I/O ports, one for accepting the data, and one for describing the data being received. Port `0x3D4` [29] is the port that describes the data and port `0x3D5` [29] is for the data itself. To set the cursor at row one, column zero (position 80 = `0x0050`), one would use the following assembly code instructions: ```assembly out 0x3D4, 14 ; 14 tells the framebuffer to expect the highest 8 bits of the position out 0x3D5, 0x00 ; sending the highest 8 bits of 0x0050 out 0x3D4, 15 ; 15 tells the framebuffer to expect the lowest 8 bits of the position out 0x3D5, 0x50 ; sending the lowest 8 bits of 0x0050 ``` The `out` assembly code instruction can't be executed directly in C. 
Therefore, it is a good idea to wrap `out` in a function in assembly code which can be accessed from C via the `cdecl` calling standard [25]: ```assembly global outb ; make the label outb visible outside this file ; outb - send a byte to an I/O port ; stack: [esp + 8] the data byte ; [esp + 4] the I/O port ; [esp] return address outb: mov al, [esp + 8] ; move the data to be sent into the al register mov dx, [esp + 4] ; move the address of the I/O port into the dx register out dx, al ; send the data to the I/O port ret ; return to the calling function ``` By storing this function in a file called `io.s` and also creating a header `io.h`, the `out` assembly code instruction can be conveniently accessed from C: ```c #ifndef INCLUDE_IO_H #define INCLUDE_IO_H /** * outb: * Sends the given data to the given I/O port. Defined in io.s * * @param port The I/O port to send the data to * @param data The data to send to the I/O port */ ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 26 Context: ```markdown */ void outb(unsigned short port, unsigned char data); #endif /* INCLUDE_IO_H */ /* Moving the cursor can now be wrapped in a C function: */ #include "io.h" /* The I/O ports */ #define FB_COMMAND_PORT 0x3D4 #define FB_DATA_PORT 0x3D5 /* The I/O port commands */ #define FB_HIGH_BYTE_COMMAND 14 #define FB_LOW_BYTE_COMMAND 15 /** fb_move_cursor: * Moves the cursor of the framebuffer to the given position * * @param pos The new position of the cursor */ void fb_move_cursor(unsigned short pos) { outb(FB_COMMAND_PORT, FB_HIGH_BYTE_COMMAND); outb(FB_DATA_PORT, (pos >> 8) & 0x00FF); outb(FB_COMMAND_PORT, FB_LOW_BYTE_COMMAND); outb(FB_DATA_PORT, pos & 0x00FF); } # The Driver The driver should provide an interface that the rest of the code in the OS will use for interacting with the framebuffer. 
There is no right or wrong in what functionality the interface should provide, but a suggestion is to have a `write` function with the following declaration: ```c int write(char *buf, unsigned int len); ``` The `write` function writes the contents of the buffer of length `len` to the screen. The `write` function should automatically advance the cursor after a character has been written and scroll the screen if necessary. # The Serial Ports The serial port [30] is an interface for communicating between hardware devices and, although it is available on almost all motherboards, it is seldom exposed to the user in the form of a DE-9 connector nowadays. The serial port is easy to use, and, more importantly, it can be used as a logging utility in Bochs. If a computer has support for a serial port, then it usually has support for multiple serial ports, but we will only make use of one of the ports. This is because we will only use the serial ports for logging. Furthermore, we will only use the serial ports for output, not input. The serial ports are completely controlled via I/O ports. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 27 Context: ```markdown # Configuring the Serial Port The first data that needs to be sent to the serial port is configuration data. In order for two hardware devices to be able to talk to each other, they must agree upon a couple of things. These things include: - The speed used for sending data (bit or baud rate) - If any error checking should be used for the data (parity bit, stop bits) - The number of bits that represent a unit of data (data bits) # Configuring the Line Configuring the line means to configure how data is being sent over the line. The serial port has an I/O port, the **line command port**, that is used for configuration. First, the speed for sending data will be set. 
The serial port has an internal clock that runs at 115200 Hz. Setting the speed means sending a divisor to the serial port; for example, sending 2 results in a speed of 115200 / 2 = 57600 Hz. The divisor is a 16-bit number, but we can only send 8 bits at a time. We must therefore send an instruction telling the serial port to first expect the highest 8 bits, then the lowest 8 bits. This is done by sending `0x80` to the line command port. An example is shown below: ```c #include "io.h" /* io.h is implemented in the section "Moving the cursor" */ /* The I/O ports */ /* All the I/O ports are calculated relative to the data port. This is because * all serial ports (COM1, COM2, COM3, COM4) have their ports in the same * order, but they start at different values. */ #define SERIAL_COM1_BASE 0x3F8 /* COM1 base port */ #define SERIAL_DATA_PORT(base) (base) #define SERIAL_FIFO_COMMAND_PORT(base) (base + 2) #define SERIAL_LINE_COMMAND_PORT(base) (base + 3) #define SERIAL_MODEM_COMMAND_PORT(base) (base + 4) #define SERIAL_LINE_STATUS_PORT(base) (base + 5) /* The I/O port commands */ /* SERIAL_LINE_ENABLE_DLAB: */ /* Tells the serial port to expect first the highest 8 bits on the data port, * then the lowest 8 bits will follow */ #define SERIAL_LINE_ENABLE_DLAB 0x80 /** serial_configure_baud_rate: */ /* Sets the speed of the data being sent. The default speed of a serial * port is 115200 bits/s. The argument is a divisor of that number, hence * the resulting speed becomes (115200 / divisor) bits/s. 
*/ ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 28 Context: ```markdown /** * @param con The COM port to configure * @param divisor The divisor */ void serial_configure_baud_rate(unsigned short con, unsigned short divisor) { outb(SERIAL_LINE_COMMAND_PORT(con), SERIAL_LINE_ENABLE_DLAB); outb(SERIAL_DATA_PORT(con), (divisor >> 8) & 0xFF); outb(SERIAL_DATA_PORT(con), divisor & 0xFF); } The way that data should be sent must be configured. This is also done via the line command port by sending a byte. The layout of the 8 bits looks like the following: Bit: | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | Content: | d | b | pty | s | dl | A description for each name can be found in the table below (and in [31]): | Name | Description | |------|-------------| | d | Enables (d = 1) or disables (d = 0) DLAB | | b | If break control is enabled (b = 1) or disabled (b = 0) | | pty | The number of parity bits to use | | s | The number of stop bits to use (s = 0 equals 1, s = 1 equals 1.5 or 2) | | dl | Describes the length of the data | We will use the mostly standard value `0x03` [31], meaning a length of 8 bits, no parity bit, one stop bit and break control disabled. This is sent to the line command port, as seen in the following example: /** serial_configure_line: * Configures the line of the given serial port. The port is set to have a * data length of 8 bits, no parity bits, one stop bit and break control * disabled. 
* * @param con The serial port to configure */ void serial_configure_line(unsigned short con) { /* Bit: | 7 | 6 | 5 4 3 | 2 | 1 0 | * Content: | d | b | pty | s | dl | * Value: | 0 | 0 | 0 0 0 | 0 | 1 1 | = 0x03 */ outb(SERIAL_LINE_COMMAND_PORT(con), 0x03); } ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 29 Context: ```markdown The article on OSDev [31] has a more in-depth explanation of the values. ## Configuring the Buffers When data is transmitted via the serial port it is placed in buffers, both when receiving and sending data. This way, if you send data to the serial port faster than it can send it over the wire, it will be buffered. However, if you send too much data too fast the buffer will be full and data will be lost. In other words, the buffers are FIFO queues. The FIFO queue configuration byte looks like the following figure: | Bit: | 7 6 | 5 | 4 | 3 | 2 | 1 | 0 | |----------|-----|----|---|-----|-----|-----|---| | Content: | lvl | bs | r | dma | clt | clr | e | A description for each name can be found in the table below: | Name | Description | |-------|------------------------------------------------------| | lvl | How many bytes should be stored in the FIFO buffers | | bs | If the buffers should be 16 or 64 bytes large | | r | Reserved for future use | | dma | How the serial port data should be accessed | | clt | Clear the transmission FIFO buffer | | clr | Clear the receiver FIFO buffer | | e | If the FIFO buffer should be enabled or not | We use the value `0xC7 = 11000111` that: - Enables FIFO - Clears both receiver and transmission FIFO queues - Uses 14 bytes as size of queue The WikiBook on serial programming [32] explains the values in more depth. 
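The FIFO configuration value can also be assembled from named fields instead of being written as a magic number. The following is a minimal sketch, assuming the usual 16550 layout in which `lvl` occupies bits 7 - 6, `clt` bit 2, `clr` bit 1 and `e` bit 0; the macro names and `serial_fifo_config` are hypothetical helpers, not part of the book's code.

```c
#include <assert.h>

/* Bit positions of the FIFO configuration byte (assumed 16550 layout). */
#define SERIAL_FIFO_ENABLE    0x01u  /* e:   enable the FIFO buffers        */
#define SERIAL_FIFO_CLEAR_RX  0x02u  /* clr: clear the receiver FIFO        */
#define SERIAL_FIFO_CLEAR_TX  0x04u  /* clt: clear the transmission FIFO    */
#define SERIAL_FIFO_LVL_SHIFT 6      /* lvl: two bits, stored in bits 7 - 6 */

/* Builds a FIFO configuration byte that enables and clears both queues,
 * using the given two-bit lvl value (3 selects the 14-byte setting). */
unsigned char serial_fifo_config(unsigned int lvl)
{
    assert(lvl <= 3);
    return (unsigned char)((lvl << SERIAL_FIFO_LVL_SHIFT)
                           | SERIAL_FIFO_CLEAR_TX
                           | SERIAL_FIFO_CLEAR_RX
                           | SERIAL_FIFO_ENABLE);
}
```

With `lvl` set to 3 the helper yields `0xC7`, the value used above; the result would be sent to the FIFO command port with `outb`.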
## Configuring the Modem The modem control register is used for very simple hardware flow control via the Request To Send (RTS) and Data Terminal Ready (DTR) pins. When configuring the serial port we want RTS and DTR to be 1, which means that we are ready to send data. The modem configuration byte is shown in the following figure: | Bit: | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |----------|---|---|----|----|-----|-----|-----|-----| | Content: | r | r | af | lb | ao2 | ao1 | rts | dtr | A description for each name can be found in the table below: ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 30 Context: ```markdown | Name | Description | |------|-----------------------------------------| | r | Reserved | | af | Autoflow control enabled | | lb | Loopback mode (used for debugging serial ports) | | ao2 | Auxiliary output 2, used for receiving interrupts | | ao1 | Auxiliary output 1 | | rts | Request To Send | | dtr | Data Terminal Ready | We don’t need to enable interrupts, because we won’t handle any received data. Therefore we use the configuration value `0x03 = 00000011` (RTS = 1 and DTR = 1). ## Writing Data to the Serial Port Writing data to the serial port is done via the data I/O port. However, before writing, the transmit FIFO queue has to be empty (all previous writes must have finished). The transmit FIFO queue is empty if bit 5 of the line status I/O port is equal to one. Reading the contents of an I/O port is done via the `in` assembly code instruction. 
There is no way to use the in assembly code instruction from C, therefore it has to be wrapped (the same way as the out assembly code instruction): ```assembly global inb ; inb - returns a byte from the given I/O port ; stack: [esp + 4] The address of the I/O port ; [esp ] The return address inb: mov dx, [esp + 4] ; move the address of the I/O port to the dx register in al, dx ; read a byte from the I/O port and store it in the al register ret ; return the read byte ``` /* in file io.h */ /** inb: * Read a byte from an I/O port. * * @param port The address of the I/O port * @return The read byte */ unsigned char inb(unsigned short port); Checking if the transmit FIFO is empty can then be done from C: ```c #include "io.h" ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 31 Context: ```markdown /** serial_is_transmit_fifo_empty: * Checks whether the transmit FIFO queue is empty or not for the given COM port. * * @param com The COM port * @return 0 if the transmit FIFO queue is not empty * 1 if the transmit FIFO queue is empty */ int serial_is_transmit_fifo_empty(unsigned int com) { /* 0x20 = 0010 0000 */ return inb(SERIAL_LINE_STATUS_PORT(com)) & 0x20; } Writing to a serial port means spinning as long as the transmit FIFO queue isn’t empty, and then writing the data to the data I/O port. ## Configuring Bochs To save the output from the first serial port the Bochs configuration file `bochsrc.txt` must be updated. The `com1` configuration instructs Bochs how to handle the first serial port: ``` com1: enabled=1, mode=file, dev=com1.out ``` The output from serial port one will now be stored in the file `com1.out`. ## The Driver We recommend that you implement a `write` function for the serial port similar to the `write` function in the driver for the framebuffer. 
To avoid name clashes with the `write` function for the framebuffer it is a good idea to name the functions `fb_write` and `serial_write` to distinguish them. We further recommend that you try to write a `printf`-like function, see section 7.3 in [1]. The `printf` function could take an additional argument to decide to which device to write the output (framebuffer or serial). A final recommendation is that you create some way of distinguishing the severity of the log messages, for example by prepending the messages with `DEBUG`, `INFO` or `ERROR`. ## Further Reading - The book “Serial programming” (available on WikiBooks) has a great section on programming the serial port, [http://en.wikibooks.org/wiki/Serial_Programming/8250_UART_Programming#UART_Registers](http://en.wikibooks.org/wiki/Serial_Programming/8250_UART_Programming#UART_Registers) - The OSDev wiki has a page with a lot of information about the serial ports, [http://wiki.osdev.org/Serial_ports](http://wiki.osdev.org/Serial_ports) ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 32 Context: I'm sorry, but I can't assist with that. #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 33 Context: ```markdown # Chapter 5 ## Segmentation Segmentation in x86 means accessing the memory through segments. Segments are portions of the address space, possibly overlapping, specified by a base address and a limit. To address a byte in segmented memory, you use a 48-bit logical address: 16 bits that specify the segment and 32 bits that specify the offset within that segment. The offset is added to the base address of the segment, and the resulting linear address is checked against the segment's limit - see the figure below. 
If everything works out fine (including access-rights checks ignored for now), the result is a linear address. When paging is disabled, then the linear address space is mapped 1:1 onto the physical address space, and the physical memory can be accessed. (See the chapter "Paging" for how to enable paging.) To enable segmentation, you need to set up a table that describes each segment - a segment descriptor table. In x86, there are two types of descriptor tables: the Global Descriptor Table (GDT) and Local Descriptor Tables (LDT). An LDT is set up and managed by user-space processes, and all processes have their own LDT. LDTs can be used if a more complex segmentation model is desired - we won't use it. The GDT is shared by everyone - it’s global. As we discuss in the sections on virtual memory and paging, segmentation is rarely used more than in a minimal setup, similar to what we do below. ## Accessing Memory Most of the time when accessing memory there is no need to explicitly specify the segment to use. The processor has six 16-bit segment registers: `cs`, `ss`, `ds`, `es`, `gs` and `fs`. The `cs` register is the code segment register and specifies the segment to use when fetching instructions. The `ss` register is used whenever accessing the stack (through the stack pointer `esp`), and `ds` is used for other data accesses. The OS is free to use the registers `es`, `gs` and `fs` however it wants. 
Below is an example showing implicit use of the segment registers: ```assembly func: mov eax, [esp+4] mov ebx, [eax] add ebx, 8 mov [eax], ebx ret ``` The above example can be compared with the following one that makes explicit use of the segment registers: ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 34 Context: ```markdown Figure 5.1: Translation of logical addresses to linear addresses. The 16-bit segment selector picks a segment descriptor, which supplies the base address and limit of the segment; the 32-bit offset (effective address) is added to the base address to form the 32-bit linear address. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 35 Context: ```markdown func: mov eax, [ss:esp+4] mov ebx, [ds:eax] add ebx, 8 mov [ds:eax], ebx ret You don't need to use `ss` for storing the stack segment selector, or `ds` for the data segment selector. You could store the stack segment selector in `ds` and vice versa. However, in order to use the implicit style shown above, you must store the segment selectors in their intended registers. Segment descriptors and their fields are described in Figure 3-8 in the Intel manual [33]. 
# The Global Descriptor Table (GDT) A GDT/LDT is an array of 8-byte segment descriptors. The first descriptor in the GDT is always a null descriptor and can never be used to access memory. At least two segment descriptors (plus the null descriptor) are needed for the GDT, because the descriptor contains more information than just the base and limit fields. The two most relevant fields for us are the Type field and the Descriptor Privilege Level (DPL) field. Table 3-1 in chapter 3 of the Intel manual [33] specifies the values for the Type field. The table shows that the Type field can't be both writable and executable at the same time. Therefore, two segments are needed: one segment for executing code to put in cs (Type = Execute-only or Execute-Read) and one segment for reading and writing data (Type = Read/Write) to put in the other segment registers. The DPL specifies the privilege levels required to use the segment. x86 allows for four privilege levels (PL), 0 to 3, where PL0 is the most privileged. In most operating systems (e.g., Linux and Windows), only PL0 and PL3 are used. However, some operating systems, such as MINIX, make use of all levels. The kernel should be able to do anything; therefore, it uses segments with DPL set to 0 (also called kernel mode). The current privilege level (CPL) is determined by the segment selector in cs. The segments needed are described in the table below. | Index | Offset | Name | Address range | Type | DPL | |-------|--------|----------------------------|--------------------------------|------|-----| | 0 | 0x00 | null descriptor | | | | | 1 | 0x08 | kernel code segment | 0x00000000 - 0xFFFFFFFF | RX | PL0 | | 2 | 0x10 | kernel data segment | 0x00000000 - 0xFFFFFFFF | RW | PL0 | Note that the segments overlap - they both encompass the entire linear address space. In our minimal setup we'll only use segmentation to get privilege levels. See the Intel manual [33], chapter 3, for details on the other descriptor fields. 
``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 36 Context: ```markdown # Loading the GDT Loading the GDT into the processor is done with the `lgdt` assembly code instruction, which takes the address of a struct that specifies the size and start of the GDT (a 16-bit size followed by a 32-bit address). It is easiest to encode this information using a "packed struct" as shown in the following example: ```c struct gdt { unsigned short size; /* size of the table in bytes, minus one */ unsigned int address; /* linear address of the first descriptor */ } __attribute__((packed)); ``` If the content of the `eax` register is the address to such a struct, then the GDT can be loaded with the assembly code shown below: ```assembly lgdt [eax] ``` It might be easier if you make this instruction available from C, the same way as was done with the assembly code instructions `in` and `out`. After the GDT has been loaded, the segment registers need to be loaded with their corresponding segment selectors. The content of a segment selector is described in the figure and table below: | Bit: | 15 - 3 | 2 | 1 - 0 | |------------|----------------|----|-----| | Content: | offset (index) | ti | rpl | ### Segment Selector Structure | Name | Description | |--------|--------------------------------------------------------------| | rpl | Requested Privilege Level - we want to execute in PL0 for now. | | ti | Table Indicator. 0 means that this specifies a GDT segment, 1 means an LDT Segment. | | offset (index) | Offset within descriptor table. | The offset of the segment selector is added to the start of the GDT to get the address of the segment descriptor: `0x08` for the first descriptor and `0x10` for the second, since each descriptor is 8 bytes. The Requested Privilege Level (RPL) should be 0 since the kernel of the OS should execute in privilege level 0. 
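The selector layout described above (index in bits 15 - 3, `ti` in bit 2, `rpl` in bits 1 - 0) can be expressed as a small helper. This is a minimal sketch; the name `make_selector` is hypothetical and not taken from the book.

```c
#include <assert.h>
#include <stdint.h>

/* Builds a segment selector from a descriptor index, table indicator and RPL. */
uint16_t make_selector(unsigned int index, unsigned int ti, unsigned int rpl)
{
    assert(index < 8192);  /* the index field is 13 bits wide */
    assert(ti <= 1);       /* 0 = GDT, 1 = LDT                */
    assert(rpl <= 3);      /* privilege levels 0 to 3         */
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}
```

Because the index is shifted left by three bits, a selector with `ti` and `rpl` set to 0 equals the byte offset of the descriptor in the GDT, which is why the kernel code and data selectors come out as `0x08` and `0x10`.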
Loading the segment selector registers is easy for the data registers - just copy the correct offsets to the registers (via a general-purpose register such as `ax`, since x86 has no instruction that moves an immediate value directly into a segment register): ```assembly mov ax, 0x10 mov ds, ax mov ss, ax mov es, ax ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 37 Context: ```markdown To load `cs` we have to do a “far jump”: ```assembly ; code here uses the previous cs jmp 0x08:flush_cs ; specify cs when jumping to flush_cs flush_cs: ; now we've changed cs to 0x08 ``` A far jump is a jump where we explicitly specify the full 48-bit logical address: the segment selector to use and the absolute address to jump to. It will first set `cs` to `0x08` and then jump to `flush_cs` using its absolute address. ## Further Reading - Chapter 3 of the Intel manual [33] is filled with low-level and technical details about segmentation. - The OSDev wiki has a page about segmentation: [OSDev Segmentation](http://wiki.osdev.org/Segmentation) - The Wikipedia page on x86 segmentation might be worth looking into: [Wikipedia X86 Memory Segmentation](http://en.wikipedia.org/wiki/X86_memory_segmentation) ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 38 Context: I'm unable to view the image you provided. Please share the Markdown text you would like me to correct, and I'll be happy to assist! #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 39 Context: ```markdown # Chapter 6 ## Interrupts and Input Now that the OS can produce output, it would be nice if it also could get some input. (The operating system must be able to handle interrupts in order to read information from the keyboard.) 
An interrupt occurs when a hardware device, such as the keyboard, the serial port, or the timer, signals the CPU that the state of the device has changed. The CPU itself can also send interrupts due to program errors; for example, when a program references memory it doesn’t have access to, or when a program divides a number by zero. Finally, there are also *software interrupts*, which are interrupts that are caused by the int assembly code instruction, and they are often used for system calls. ## Interrupts Handlers Interrupts are handled via the **Interrupt Descriptor Table** (IDT). The IDT describes a handler for each interrupt. The interrupts are numbered (0 - 255) and the handler for interrupt i is defined at the ith position in the table. There are three different kinds of handlers for interrupts: - Task handler - Interrupt handler - Trap handler The task handlers use functionality specific to the Intel version of x86, so they won’t be covered here (see the Intel manual [33], chapter 6, for more info). The only difference between an interrupt handler and a trap handler is that the interrupt handler disables interrupts, which means you cannot get an interrupt while at the same time handling an interrupt. In this book, we will use trap handlers and disable interrupts manually when we need to. ## Creating an Entry in the IDT An entry in the IDT for an interrupt handler consists of 64 bits. 
The highest 32 bits are shown in the figure below: | Bit: | 31 - 16 | 15 | 14 - 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 - 0 | |----------|-------------|----|---------|----|----|----|----|----|----|----|----|----------| | Content: | offset high | P | DPL | 0 | D | 1 | 1 | 0 | 0 | 0 | 0 | reserved | ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 40 Context: ```markdown The lowest 32 bits are presented in the following figure: | Bit: | 31 - 16 | 15 - 0 | |----------|------------------|------------| | Content: | segment selector | offset low | A description for each name can be found in the table below: | Name | Description | |--------------|--------------------------------------------------------------------------------| | offset high | The 16 highest bits of the 32 bit address in the segment. | | offset low | The 16 lowest bits of the 32 bit address in the segment. | | P | If the handler is present in memory or not (1 = present, 0 = not present). | | DPL | Descriptor Privilege Level, the privilege level the handler can be called from (0, 1, 2, 3). | | D | Size of gate, (1 = 32 bits, 0 = 16 bits). | | segment selector | The offset in the GDT. | | r | Reserved. | The offset is a pointer to code (preferably an assembly code label). For example, to create an entry for a handler whose code starts at `0xDEADBEEF` and that runs in privilege level 0 (therefore using the same code segment selector as the kernel, `0x08`) the following two 32-bit values would be used, highest 32 bits first: ``` 0xDEAD8E00 0x0008BEEF ``` If the IDT is represented as an `unsigned int idt[512]` then to register the above example as a handler for interrupt 0 (divide-by-zero), the following code would be used. The lowest 32 bits are stored first, since they sit at the lower memory address: ```c idt[0] = 0x0008BEEF; idt[1] = 0xDEAD8E00; ``` As written in the chapter “Getting to C”, we recommend that you use packed structures instead of bytes (or unsigned integers) to make the code more readable.
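As an illustration of the packed-structure approach, a minimal sketch follows. The names `struct idt_entry` and `make_idt_entry` are assumptions made for this example, and the field order assumes the entry is stored in memory with its lowest 16 bits (offset low) first, as on x86.

```c
#include <stdint.h>

/* One 64-bit IDT entry, laid out as it sits in memory (lowest bytes first). */
struct idt_entry {
    uint16_t offset_low;   /* bits 0..15 of the handler address */
    uint16_t selector;     /* code segment selector, e.g. 0x08 */
    uint8_t  reserved;     /* bits 0..7 of the highest 32 bits, left as zero */
    uint8_t  flags;        /* P, DPL, 0, D, 1, 1, 0 (bits 15..8 of the figure) */
    uint16_t offset_high;  /* bits 16..31 of the handler address */
} __attribute__((packed));

/* Hypothetical helper: fills an entry for a present 32-bit gate. */
static struct idt_entry make_idt_entry(uint32_t handler, uint16_t selector,
                                       uint8_t dpl)
{
    struct idt_entry e;
    e.offset_low  = (uint16_t)(handler & 0xFFFF);
    e.selector    = selector;
    e.reserved    = 0;
    /* 0x80 sets P, dpl goes into bits 6..5, 0x0E sets D and the 1 1 0 pattern */
    e.flags       = (uint8_t)(0x80 | ((dpl & 0x3) << 5) | 0x0E);
    e.offset_high = (uint16_t)(handler >> 16);
    return e;
}
```

For a handler at `0xDEADBEEF` running in privilege level 0 with selector `0x08`, this yields the fields `0xBEEF`, `0x0008`, `0x00`, `0x8E` and `0xDEAD`, matching the bit layout in the figures.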
## Handling an Interrupt When an interrupt occurs the CPU will push some information about the interrupt onto the stack, then look up the appropriate interrupt handler in the IDT and jump to it. The stack at the time of the interrupt will look like the following: ``` [esp + 12] eflags [esp + 8] cs [esp + 4] eip [esp] error code? ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 41 Context: ```markdown The reason for the question mark behind error code is that not all interrupts create an error code. The specific CPU interrupts that put an error code on the stack are 8, 10, 11, 12, 13, 14, and 17. The error code can be used by the interrupt handler to get more information on what has happened. Also, note that the interrupt number is not pushed onto the stack. We can only determine what interrupt has occurred by knowing what code is executing - if the handler registered for interrupt 17 is executing, then interrupt 17 has occurred. Once the interrupt handler is done, it uses the `iret` instruction to return. The instruction `iret` expects the stack to be the same as at the time of the interrupt (see the figure above). Therefore, any values pushed onto the stack by the interrupt handler must be popped. Before returning, `iret` restores `eflags` by popping the value from the stack and then finally jumps to `cs:eip` as specified by the values on the stack. The interrupt handler has to be written in assembly code, since all registers that the interrupt handlers use must be preserved by pushing them onto the stack. This is because the code that was interrupted doesn’t know about the interrupt and will therefore expect that its registers stay the same. Writing all the logic of the interrupt handler in assembly code will be tiresome. 
Creating a handler in assembly code that saves the registers, calls a C function, restores the registers and finally executes `iret` is a good idea! The C handler should get the state of the registers, the state of the stack and the number of the interrupt as arguments. The following definitions can for example be used: ```c struct cpu_state { unsigned int eax; unsigned int ebx; unsigned int ecx; // ... unsigned int esp; } __attribute__((packed)); struct stack_state { unsigned int error_code; unsigned int eip; unsigned int cs; unsigned int flags; } __attribute__((packed)); void interrupt_handler(struct cpu_state cpu, struct stack_state stack, unsigned int interrupt); ``` ## Creating a Generic Interrupt Handler Since the CPU does not push the interrupt number on the stack it is a little tricky to write a generic interrupt handler. This section will use macros to show how it can be done. Writing one version for each interrupt is tedious - it is better to use the macro functionality of NASM. And since not all interrupts produce an error code the value 0 will be added as the "error code" for interrupts without an error code. 
The following code shows an example of how this can be done: ```asm %macro no_error_code_interrupt_handler 1 global interrupt_handler_%1 interrupt_handler_%1: push dword 0 ; push 0 as error code ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 42 Context: ```markdown ```assembly push dword %1 ; push the interrupt number jmp common_interrupt_handler ; jump to the common handler %endmacro ``` ```assembly %macro error_code_interrupt_handler 1 global interrupt_handler_%1 interrupt_handler_%1: ; handler for interrupt %1 push dword %1 ; push the interrupt number jmp common_interrupt_handler ; jump to the common handler %endmacro ``` ```assembly common_interrupt_handler: ; the common parts of the generic interrupt handler ; save the registers push eax push ebx . . push ebp ; call the C function call interrupt_handler ; restore the registers pop ebp . . pop ebx pop eax ; restore the esp add esp, 8 ; return to the code that got interrupted iret ``` ```assembly no_error_code_interrupt_handler 0 ; create handler for interrupt 0 no_error_code_interrupt_handler 1 ; create handler for interrupt 1 . . error_code_interrupt_handler 8 ; create handler for interrupt 8 ``` The `common_interrupt_handler` does the following: - Push the registers on the stack. - Call the C function `interrupt_handler`. - Pop the registers from the stack. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 43 Context: ```markdown - Add 8 to `esp` (because of the error code and the interrupt number pushed earlier). - Execute `iret` to return to the interrupted code. Since the macros declare global labels the addresses of the interrupt handlers can be accessed from C or assembly code when creating the IDT.
## Loading the IDT The IDT is loaded with the `lidt` assembly code instruction which, like `lgdt`, takes the address of a struct that specifies the size and start of the table. It is easiest to wrap this instruction and use it from C: ```asm global load_idt ; load_idt - Loads the interrupt descriptor table (IDT). ; stack: [esp + 4] the address of the struct describing the IDT ; [esp ] the return address load_idt: mov eax, [esp + 4] ; load the address of the struct into register eax lidt [eax] ; load the IDT ret ; return to the calling function ``` ## Programmable Interrupt Controller (PIC) To start using hardware interrupts you must first configure the Programmable Interrupt Controller (PIC). The PIC makes it possible to map signals from the hardware to interrupts. The reasons for configuring the PIC are: - Remap the interrupts. The PIC uses interrupts 0 - 15 for hardware interrupts by default, which conflicts with the CPU interrupts. Therefore the PIC interrupts must be remapped to another interval. - Select which interrupts to receive. You probably don't want to receive interrupts from all devices since you don't have code that handles these interrupts anyway. - Set up the correct mode for the PIC. In the beginning there was only one PIC (PIC 1) and eight interrupts. As more hardware was added, 8 interrupts were too few. The solution chosen was to chain another PIC (PIC 2) on the first PIC (see interrupt 2 on PIC 1).
The hardware interrupts are shown in the table below: | PIC 1 | Hardware | PIC 2 | Hardware | |-------|----------------|-------|--------------| | 0 | Timer | 8 | Real Time Clock | | 1 | Keyboard | 9 | General I/O | | 2 | PIC 2 | 10 | General I/O | | 3 | COM 2 | 11 | General I/O | | 4 | COM 1 | 12 | General I/O | ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 44 Context: ```markdown | PIC 1 | Hardware | PIC 2 | Hardware | |-------|----------------|-------|--------------| | 5 | LPT 2 | 13 | Coprocessor | | 6 | Floppy disk | 14 | IDE Bus | | 7 | LPT 1 | 15 | IDE Bus | A great tutorial for configuring the PIC can be found at the SigOPS website [35]. We won’t repeat that information here. Every interrupt from the PIC has to be acknowledged - that is, a message must be sent to the PIC confirming that the interrupt has been handled. If this isn't done, the PIC won't generate any more interrupts. Acknowledging a PIC interrupt is done by sending the byte `0x20` to the PIC that raised the interrupt. Implementing a `pic_acknowledge` function can be done as follows: ```c #include "io.h" #define PIC1_PORT_A 0x20 #define PIC2_PORT_A 0xA0 /* The PIC interrupts have been remapped */ #define PIC1_START_INTERRUPT 0x20 #define PIC2_START_INTERRUPT 0x28 #define PIC2_END_INTERRUPT (PIC2_START_INTERRUPT + 7) #define PIC_ACK 0x20 /** * pic_acknowledge: * Acknowledges an interrupt from either PIC 1 or PIC 2. * * @param interrupt The number of the interrupt */ void pic_acknowledge(unsigned int interrupt) { if (interrupt < PIC1_START_INTERRUPT || interrupt > PIC2_END_INTERRUPT) { return; } if (interrupt < PIC2_START_INTERRUPT) { outb(PIC1_PORT_A, PIC_ACK); } else { outb(PIC2_PORT_A, PIC_ACK); } } ``` ### Reading Input from the Keyboard The keyboard does not generate ASCII characters; it generates scan codes. A scan code represents a button - both presses and releases.
The scan code representing the just pressed button can be read from the keyboard's data I/O port which has address `0x60`. How this can be done is shown in the following example: ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 45 Context: ```markdown #include "io.h" #define KBD_DATA_PORT 0x60 /** read_scan_code: * Reads a scan code from the keyboard * @return The scan code (NOT an ASCII character!) */ unsigned char read_scan_code(void) { return inb(KBD_DATA_PORT); } The next step is to write a function that translates a scan code to the corresponding ASCII character. If you want to map the scan codes to ASCII characters as is done on an American keyboard, then Andries Brouwer has a great tutorial [36]. Remember, since the keyboard interrupt is raised by the PIC, you must call `pic_acknowledge` at the end of the keyboard interrupt handler. Also, the keyboard will not send you any more interrupts until you read the scan code from the keyboard. ## Further Reading - The OSDev wiki has a great page on interrupts: [OSDev Interrupts](http://wiki.osdev.org/Interrupts) - Chapter 6 of Intel Manual 3A [33] describes everything there is to know about interrupts. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 46 Context: I'm sorry, but I cannot assist with that as I can't view images. If you can provide the text directly, I would be happy to help you format it in Markdown. #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 47 Context: ```markdown # Chapter 7 ## The Road to User Mode Now that the kernel boots, prints to screen and reads from keyboard - what do we do? Usually, a kernel is not supposed to do the application logic itself, but leave that for applications. 
The kernel creates the proper abstractions (for memory, files, devices) to make application development easier, performs tasks on behalf of applications (system calls) and schedules processes. User mode, in contrast with kernel mode, is the environment in which the user's programs execute. This environment is less privileged than the kernel, and will prevent (badly written) user programs from messing with other programs or the kernel. Badly written kernels are free to mess up what they want. There's quite a way to go until the OS created in this book can execute programs in user mode, but this chapter will show how to easily execute a small program in kernel mode. ## Loading an External Program Where do we get the external program from? Somehow we need to load the code we want to execute into memory. More feature-complete operating systems usually have drivers and file systems that enable them to load the software from a CD-ROM drive, a hard disk or other persistent media. Instead of creating all these drivers and file systems, we will use a feature in GRUB called modules to load the program. ### GRUB Modules GRUB can load arbitrary files into memory from the ISO image, and these files are usually referred to as modules. To make GRUB load a module, edit the file `iso/boot/grub/menu.lst` and add the following line at the end of the file: ``` module /modules/program ``` Now create the folder `iso/modules`: ```bash mkdir -p iso/modules ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 48 Context: ```markdown The application `program` will be created later in this chapter. The code that calls `kmain` must be updated to pass information to `kmain` about where it can find the modules. We also want to tell GRUB that it should align all the modules on page boundaries when loading them (see the chapter "Paging" for details about page alignment). 
To instruct GRUB how to load our modules, the "multiboot header" - the first bytes of the kernel - must be updated as follows: ```assembly ; in file 'loader.s' MAGIC_NUMBER equ 0x1BADB002 ; define the magic number constant ALIGN_MODULES equ 0x00000001 ; tell GRUB to align modules ; calculate the checksum (magic number + options + checksum should equal 0) CHECKSUM equ -(MAGIC_NUMBER + ALIGN_MODULES) section .text ; start of the text (code) section align 4 ; the code must be 4 byte aligned dd MAGIC_NUMBER ; write the magic number dd ALIGN_MODULES ; write the align modules instruction dd CHECKSUM ; write the checksum ``` GRUB will also store a pointer to a struct in the register `ebx` that, among other things, describes at which addresses the modules are loaded. Therefore, you probably want to push `ebx` on the stack before calling `kmain` to make it an argument for `kmain`. ## Executing a Program ### A Very Simple Program A program written at this stage can only perform a few actions. Therefore, a very short program that writes a value to a register suffices as a test program. Halting Bochs after a while and then checking that the register contains the correct number by looking in the Bochs log will verify that the program has run. This is an example of such a short program: ```assembly ; set eax to some distinguishable number, to read from the log afterwards mov eax, 0xDEADBEEF ; enter infinite loop, nothing more to do ; $ means "beginning of line", i.e. the same instruction jmp $ ``` ## Compiling Since our kernel cannot parse advanced executable formats we need to compile the code into a flat binary. NASM can do this with the flag `-f bin`: ```bash nasm -f bin program.s -o program ``` This is all we need. You must now move the file `program` to the folder `iso/modules`.
``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 49 Context: ```markdown # Finding the Program in Memory Before jumping to the program we must find where it resides in memory. Assuming that the contents of `ebx` is passed as an argument to `kmain`, we can do this entirely from C. The pointer in `ebx` points to a `multiboot` structure [19]. Download the `multiboot.h` file from [http://www.gnu.org/software/grub/manual/multiboot/html_node/multiboot.html](http://www.gnu.org/software/grub/manual/multiboot/html_node/multiboot.html), which describes the structure. The pointer passed to `kmain` in the `ebx` register can be cast to a `multiboot_info_t` pointer. The field `mods_addr` holds the address of the first module structure (`multiboot_module_t`), whose `mod_start` field is the address at which the module was loaded. The following code shows an example: ```c int kmain(/* additional arguments */ unsigned int ebx) { multiboot_info_t *mbinfo = (multiboot_info_t *) ebx; multiboot_module_t *module = (multiboot_module_t *) mbinfo->mods_addr; unsigned int address_of_module = module->mod_start; } ``` However, before just blindly following the pointer, you should check that the module got loaded correctly by GRUB. This can be done by checking the `flags` field of the `multiboot_info_t` structure. You should also check the `mods_count` field to make sure it is exactly 1. For more details about the multiboot structure, see the multiboot documentation [19]. # Jumping to the Code The only thing left to do is to jump to the code loaded by GRUB. Since it is easier to parse the multiboot structure in C than assembly code, calling the code from C is more convenient (it can of course be done with `jmp` or `call` in assembly code as well). The C code could look like this: ```c typedef void (*call_module_t)(void); /* ...
*/ call_module_t start_program = (call_module_t) address_of_module; start_program(); /* we'll never get here, unless the module code returns */ ``` If we start the kernel, wait until it has run and entered the infinite loop in the program, and then halt Bochs, we should see `0xDEADBEEF` in the register `eax` via the Bochs log. We have successfully started a program in our OS! # The Beginning of User Mode The program we’ve written now runs at the same privilege level as the kernel - we’ve just entered it in a somewhat peculiar way. To enable applications to execute at a different privilege level we’ll need, besides segmentation, to do paging and page frame allocation. It’s quite a lot of work and technical details to go through, but in a few chapters you’ll have working user mode programs. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 50 Context: I'm sorry, but I can't process images. Please provide the text directly for assistance. #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 51 Context: ```markdown # Chapter 8 ## A Short Introduction to Virtual Memory Virtual memory is an abstraction of physical memory. The purpose of virtual memory is generally to simplify application development and to let processes address more memory than what is actually physically present in the machine. We also don’t want applications messing with the kernel or other applications’ memory due to security. In the x86 architecture, virtual memory can be accomplished in two ways: **segmentation** and **paging**. Paging is by far the most common and versatile technique, and we’ll implement it in the next chapter. Some use of segmentation is still necessary to allow for code to execute under different privilege levels. Managing memory is a big part of what an operating system does.
Paging and page frame allocation deals with that. Segmentation and paging is described in [33](#), chapters 3 and 4. ## Virtual Memory Through Segmentation You could skip paging entirely and just use segmentation for virtual memory. Each user mode process would get its own segment, with base address and limit properly set up. This way no process can see the memory of another process. A problem with this is that the physical memory for a process needs to be contiguous (or at least it is very convenient if it is). Either we need to know in advance how much memory the program will require (unlikely), or we can move the memory segments to places where they can grow when the limit is reached (expensive; causes fragmentation - can result in “out of memory” even though enough memory is available). Paging solves both these problems. It is interesting to note that in x86_64 (the 64-bit version of the x86 architecture), segmentation is almost completely removed. ## Further Reading - LWN.net has an article on virtual memory: [http://lwn.net/Articles/253361](http://lwn.net/Articles/253361) - Gustavo Duarte has also written an article about virtual memory: [http://duarte.org/gustavo/blog/post/memory-translation-and-segmentation](http://duarte.org/gustavo/blog/post/memory-translation-and-segmentation) ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 52 Context: I'm sorry, but I cannot see or interpret images. If you have text you'd like formatted or corrected, please provide that directly, and I can help! #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 53 Context: ```markdown # Chapter 9 ## Paging Segmentation translates a logical address into a linear address. 
Paging translates these linear addresses onto the physical address space and determines access rights and how the memory should be cached. ### Why Paging? Paging is the most common technique used in x86 to enable virtual memory. Virtual memory through paging means that each process will get the impression that the available memory range is `0x00000000 - 0xFFFFFFFF`, even though the actual size of the memory might be much less. It also means that when a process addresses a byte of memory it will use a virtual (linear) address instead of a physical one. The code in the user process won't notice any difference (except for execution delays). The linear address gets translated to a physical address by the MMU and the page table. If the virtual address isn’t mapped to a physical address, the CPU will raise a page fault interrupt. Paging is optional, and some operating systems do not make use of it. But if we want to mark certain areas of memory accessible only to code running at a certain privilege level (to be able to have processes running at different privilege levels), paging is the neatest way to do it. ### Paging in x86 Paging in x86 (chapter 4 in the Intel manual [33]) consists of a page directory (PDT) that can contain references to 1024 page tables (PT), each of which can point to 1024 sections of physical memory called page frames (PF). Each page frame is 4096 bytes large. In a virtual (linear) address, the highest 10 bits specify the offset of a page directory entry (PDE) in the current PDT, and the next 10 bits specify the offset of a page table entry (PTE) within the page table pointed to by that PDE. The lowest 12 bits in the address are the offset within the page frame to be addressed. All page directories, page tables, and page frames need to be aligned on 4096 byte addresses. This makes it possible to address a PDT, PT, or PF with just the highest 20 bits of a 32 bit address, since the lowest 12 need to be zero.
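The way a linear address is split can be sketched in C. The function names below are hypothetical and only illustrate the bit arithmetic described above; they are not part of the book's code.

```c
#include <stdint.h>

/* Highest 10 bits: index of the page directory entry (0..1023). */
static uint32_t pde_index(uint32_t linear)
{
    return linear >> 22;
}

/* Next 10 bits: index of the page table entry (0..1023). */
static uint32_t pte_index(uint32_t linear)
{
    return (linear >> 12) & 0x3FF;
}

/* Lowest 12 bits: offset within the 4096 byte page frame. */
static uint32_t page_offset(uint32_t linear)
{
    return linear & 0xFFF;
}
```

For example, the linear address `0xC0100ABC` selects page directory entry 768, page table entry 256 and offset `0xABC` within the page frame.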
The PDE and PTE structures are very similar to each other: 32 bits (4 bytes), where the highest 20 bits point to a PT or PF, and the lowest 12 bits control access rights and other configurations. 4 bytes times 1024 equals 4096 bytes, so a page directory and page table both fit in a page frame themselves. The translation of linear addresses to physical addresses is described in the figure below. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 54 Context: ```markdown While pages are normally 4096 bytes, it is also possible to use 4 MB pages. A PDE then points directly to a 4 MB page frame, which needs to be aligned on a 4 MB address boundary. The address translation is almost the same as in the figure, with just the page table step removed. It is possible to mix 4 MB and 4 KB pages. [Figure: a 32-bit linear address split into Directory (bits 31 - 22), Table (bits 21 - 12) and Offset (bits 11 - 0); CR3 locates the page directory, the PDE (PS=0) locates the page table, and the PTE locates the 4 KB page frame.] Figure 9.1: Translating virtual addresses (linear addresses) to physical addresses. The 20 bits pointing to the current PDT are stored in the register CR3. The lower 12 bits of CR3 are used for configuration. For more details on the paging structures, see Chapter 4 in the Intel manual [33]. The most interesting bits are U/S, which determine what privilege levels can access this page (PL0 or PL3), and R/W, which makes the memory in the page read-write or read-only. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 55 Context: ```markdown # Identity Paging The simplest kind of paging is when we map each virtual address onto the same physical address, called identity paging. This can be done at compile time by creating a page directory where each entry points to its corresponding 4 MB frame.
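A page directory of this kind could also be filled in at run-time from C. The sketch below is an illustration under stated assumptions: the function and macro names are hypothetical, the flag values are the present (bit 0), read/write (bit 1) and page size (bit 7) bits of a PDE, and a real page directory would additionally have to be aligned on a 4096 byte address.

```c
#include <stdint.h>

#define PDE_PRESENT 0x001u  /* bit 0: the entry is present */
#define PDE_RW      0x002u  /* bit 1: the page is writable */
#define PDE_4MB     0x080u  /* bit 7 (PS): the entry maps a 4 MB page */

/* Hypothetical helper: entry i maps the i-th 4 MB of virtual memory
 * onto the i-th 4 MB of physical memory (identity paging). */
static void fill_identity_directory(uint32_t pd[1024])
{
    for (uint32_t i = 0; i < 1024; i++) {
        pd[i] = (i << 22) | PDE_4MB | PDE_RW | PDE_PRESENT;
    }
}
```

Each entry only needs the frame number shifted into the highest 10 bits, because a 4 MB frame is aligned on a 4 MB boundary and its lower 22 address bits are zero.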
In NASM, such a page directory can be created at compile time using macros and commands (`%rep`, `times` and `dd`). It can of course also be done at run-time by using ordinary assembly code instructions. ## Enabling Paging Paging is enabled by first writing the address of a page directory to `cr3` and then setting bit 31 (the PG "paging-enable" bit) of `cr0` to 1. To use 4 MB pages, set the PSE bit (Page Size Extensions, bit 4) of `cr4`. The following assembly code shows an example: ```assembly ; eax has the address of the page directory mov cr3, eax mov ebx, cr4 ; read current cr4 or ebx, 0x00000010 ; set PSE mov cr4, ebx ; update cr4 mov ebx, cr0 ; read current cr0 or ebx, 0x80000000 ; set PG mov cr0, ebx ; update cr0 ; now paging is enabled ``` ## A Few Details It is important to note that all addresses within the page directory, page tables, and in `cr3` need to be physical addresses to the structures, never virtual. This will be more relevant in later sections where we dynamically update the paging structures (see the chapter "User Mode"). An instruction that is useful when updating a PDT or PT is `invlpg`. It invalidates the Translation Lookaside Buffer (TLB) entry for a virtual address. The TLB is a cache for translated addresses, mapping virtual addresses to their corresponding physical addresses. Invalidating an entry is only required when changing a PDE or PTE that was previously mapped to something else. If the PDE or PTE had previously been marked as not present (bit 0 was set to 0), executing `invlpg` is unnecessary. Changing the value of `cr3` will cause all entries in the TLB to be invalidated. An example of invalidating a TLB entry is shown below: ```assembly ; invalidate any TLB references to virtual address 0 invlpg [0] ``` ## Paging and the Kernel This section will describe how paging affects the OS kernel.
We encourage you to run your OS using identity paging before trying to implement a more advanced paging setup, since it can be hard to debug a malfunctioning page table that is set up via assembly code. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 56 Context: ```markdown # Reasons to Not Identity Map the Kernel If the kernel is placed at the beginning of the virtual address space - that is, the virtual address space (0x00000000, `size of kernel`) maps to the location of the kernel in memory - there will be issues when linking the user mode process code. Normally, during linking, the linker assumes that the code will be loaded into the memory position 0x00000000. Therefore, when resolving absolute references, 0x00000000 will be the base address for calculating the exact position. But if the kernel is mapped onto the virtual address space (0x00000000, `size of kernel`), the user mode process cannot be loaded at virtual address 0x00000000; it must be placed somewhere else. Therefore, the assumption from the linker that the user mode process is loaded into memory at position 0x00000000 is wrong. This can be corrected by using a linker script which tells the linker to assume a different starting address, but that is a very cumbersome solution for the users of the operating system. This also assumes that we want the kernel to be part of the user mode process' address space. As we will see later, this is a nice feature, since during system calls we don’t have to change any paging structures to get access to the kernel's code and data. The kernel pages will of course require privilege level 0 for access, to prevent a user process from reading or writing kernel memory. # The Virtual Address for the Kernel Preferably, the kernel should be placed at a very high virtual memory address, for example 0xC0000000 (3 GB).
The user mode process is not likely to be 3 GB large, which is now the only way that it can conflict with the kernel. When the kernel uses virtual addresses at 3 GB and above it is called a higher-half kernel. 0xC0000000 is just an example; the kernel can be placed at any address higher than 0 to get the same benefits. Choosing the correct address depends on how much virtual memory should be available for the kernel (it is easiest if all memory above the kernel virtual address should belong to the kernel) and how much virtual memory should be available for the process. If the user mode process is larger than 3 GB, some pages will need to be swapped out by the kernel. Swapping pages is not part of this book. # Placing the Kernel at 0xC0000000 To start with, it is better to place the kernel at 0xC0100000 than 0xC0000000, since this makes it possible to map (0x00000000, 0x00100000) to (0xC0000000, 0xC0100000). This way, the entire range (0x00000000, `size of kernel`) of memory is mapped to the range (0xC0000000, 0xC0000000 + `size of kernel`). Placing the kernel at 0xC0100000 isn’t hard, but it does require some thought. This is once again a linking problem. When the linker resolves all absolute references in the kernel, it will assume that our kernel is loaded at physical memory location 0x00100000, not 0x00000000, since relocation is used in the linker script (see the section “Linking the kernel”). However, we want the jumps to be resolved using 0xC0000000 as base address, since otherwise a kernel jump will jump straight into the user mode process code (remember that the user mode process is loaded at virtual memory 0x00000000). However, we can't simply tell the linker to assume that the kernel starts (is loaded) at 0xC0000000, since we want it to be loaded at the physical address 0x00100000. The reason for having the kernel loaded at 1 MB is that it can't be loaded at 0x00000000, since there is BIOS and GRUB code loaded below 1 MB.
Furthermore, we cannot assume that we can load the kernel at 0xC0100000, since the machine might not have 3 GB of physical memory. This can be solved by using both relocation (`. = 0xC0100000`) and the `AT` instruction in the linker script. Relocation specifies that non-relative memory references should use the relocation address as base in address calculations. `AT` specifies where the kernel should be loaded into memory. Relocation is done at link time. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 57 Context: ```markdown # Higher-half Linker Script We can modify the first linker script to implement this: ```ld ENTRY(loader) /* the name of the entry symbol */ . = 0xC0100000; /* the code should be relocated to 3 GB + 1 MB */ /* align at 4 KB and load at 1 MB */ .text ALIGN(0x1000) : AT(ADDR(.text) - 0xC0000000) { *(.text) /* all text sections from all files */ } /* align at 4 KB and load at 1 MB + . */ .rodata ALIGN(0x1000) : AT(ADDR(.rodata) - 0xC0000000) { *(.rodata*) /* all read-only data sections from all files */ } /* align at 4 KB and load at 1 MB + . */ .data ALIGN(0x1000) : AT(ADDR(.data) - 0xC0000000) { *(.data) /* all data sections from all files */ } /* align at 4 KB and load at 1 MB + . */ .bss ALIGN(0x1000) : AT(ADDR(.bss) - 0xC0000000) { *(COMMON) /* all COMMON sections from all files */ *(.bss) /* all bss sections from all files */ } ``` # Entering the Higher Half When GRUB jumps to the kernel code, there is no paging table. Therefore, all references to `0xC0100000 + X` won’t be mapped to the correct physical address, and will therefore cause a general protection exception (GPE) at the very best, otherwise (if the computer has more than 3 GB of memory) the computer will just crash. Therefore, assembly code that doesn’t use relative jumps or relative memory addressing must be used to do the following: - Set up a page table.
- Add identity mapping for the first 4 MB of the virtual address space. - Add an entry for `0xC0100000` that maps to `0x00100000`. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 58 Context: ```markdown If we skip the identity mapping for the first 4 MB, the CPU would generate a page fault immediately after paging was enabled when trying to fetch the next instruction from memory. After the table has been created, a jump can be done to a label to make `eip` point to a virtual address in the higher half: ```assembly ; assembly code executing at around 0x00100000 ; enable paging for both actual location of kernel ; and its higher-half virtual location lea ebx, [higher_half] ; load the address of the label in ebx jmp ebx ; jump to the label higher_half: ; code here executes in the higher half kernel ; eip is larger than 0xC0000000 ; can continue kernel initialisation, calling C code, etc. ``` The register `eip` will now point to a memory location somewhere right after `0xC0100000`; all the code can now execute as if it were located at `0xC0100000`, the higher half. The entry mapping the first 4 MB of virtual memory to the first 4 MB of physical memory can now be removed from the page table and its corresponding entry in the TLB invalidated with `invlpg [0]`. ## Running the Higher Half There are a few more details we must deal with when using a higher-half kernel. We must be careful when using memory-mapped I/O that uses specific memory locations. For example, the frame buffer is located at `0x000B8000`, but since there is no entry in the page table for the address `0x000B8000` any longer, the address `0xC00B8000` must be used, since the virtual address `0xC00B8000` maps to the physical address `0x000B8000`. Any explicit references to addresses within the multiboot structure need to be changed to reflect the new virtual addresses as well.
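As a rough illustration of this address translation, the sketch below converts between physical addresses and their higher-half virtual addresses. It assumes the kernel mapping places physical address X at virtual address `0xC0000000 + X`; the names `KERNEL_VIRTUAL_BASE`, `phys_to_virt` and `virt_to_phys` are hypothetical helpers and do not come from the book.

```c
#include <stdint.h>

/* Assumption: virtual address 0xC0000000 + X maps to physical address X. */
#define KERNEL_VIRTUAL_BASE 0xC0000000u

/* Hypothetical helper: physical address -> higher-half virtual address. */
static inline uint32_t phys_to_virt(uint32_t paddr)
{
    return paddr + KERNEL_VIRTUAL_BASE;
}

/* Hypothetical helper: higher-half virtual address -> physical address. */
static inline uint32_t virt_to_phys(uint32_t vaddr)
{
    return vaddr - KERNEL_VIRTUAL_BASE;
}
```

With helpers like these, the frame buffer at physical `0x000B8000` would be reached through `phys_to_virt(0x000B8000)`, and pointers found in the multiboot structure could be translated the same way before being dereferenced.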
Mapping 4 MB pages for the kernel is simple, but wastes memory (unless you have a really big kernel). Creating a higher-half kernel mapped in as 4 KB pages saves memory but is harder to set up. Memory for the page directory and one page table can be reserved in the `.data` section, but one needs to configure the mappings from virtual to physical addresses at run-time. The size of the kernel can be determined by exporting labels from the linker script [37], which we'll need to do later anyway when writing the page frame allocator (see the chapter "Page Frame Allocation"). ## Virtual Memory Through Paging Paging enables two things that are good for virtual memory. First, it allows for fine-grained access control to memory. You can mark pages as read-only, read-write, only for PL0 etc. Second, it creates the illusion of contiguous memory. User mode processes, and the kernel, can access memory as if it were contiguous, and the contiguous memory can be extended without the need to move data around in memory. We can also allow the user mode programs access to all memory below 3 GB, but unless they actually use it, we don’t have to assign page frames to the pages. This allows processes to have code located near `0x00000000` and the stack at just below `0xC0000000`, and still not require more than two actual pages. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 59 Context: ```markdown ## Further Reading - Chapter 4 (and to some extent chapter 3) of the Intel manual [33] are your definitive sources for the details about paging.
- Wikipedia has an article on paging: [https://en.wikipedia.org/wiki/Paging](https://en.wikipedia.org/wiki/Paging) - The OSDev wiki has a page on paging: [http://wiki.osdev.org/Paging](http://wiki.osdev.org/Paging) and a tutorial for making a higher-half kernel: [http://wiki.osdev.org/Higher_Half_bare_bones](http://wiki.osdev.org/Higher_Half_bare_bones) - Gustavo Duarte's article on how a kernel manages memory is well worth a read: [http://duartes.org/gustavo/blog/post/anatomy-of-a-program-in-memory](http://duartes.org/gustavo/blog/post/anatomy-of-a-program-in-memory) - Details on the linker command language can be found at Steve Chamberlain's website [37]. - More details on the ELF format can be found in this presentation: [http://flint.cs.yale.edu/cs422/doc/ELF_Format.pdf](http://flint.cs.yale.edu/cs422/doc/ELF_Format.pdf) ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 60 Context: I'm unable to view images. Please provide the text you would like to have checked and corrected in Markdown format. #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 61 Context: ```markdown # Chapter 10 ## Page Frame Allocation When using virtual memory, how does the OS know which parts of memory are free to use? That is the role of the page frame allocator. ## Managing Available Memory ### How Much Memory is There? First, we need to know how much memory is available on the computer the OS is running on. The easiest way to do this is to read it from the multiboot structure [19] passed to us by GRUB. GRUB collects the information we need about the memory - what is reserved, I/O mapped, read-only etc. We must also make sure that we don’t mark the part of memory used by the kernel as free (since GRUB doesn’t mark this memory as reserved).
One way to know how much memory the kernel uses is to export labels at the beginning and the end of the kernel binary from the linker script: ```ld ENTRY(loader) /* the name of the entry symbol */ . = 0xC0100000; /* the code should be relocated to 3 GB + 1 MB */ /* these labels get exported to the code files */ kernel_virtual_start = .; kernel_physical_start = . - 0xC0000000; /* align at 4 KB and load at 1 MB */ .text ALIGN(0x1000) : AT(ADDR(.text) - 0xC0000000) { *(.text) /* all text sections from all files */ } /* align at 4 KB and load at 1 MB + . */ .rodata ALIGN(0x1000) : AT(ADDR(.rodata) - 0xC0000000) { *(.rodata*) /* all read-only data sections from all files */ } ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 62 Context: ```markdown /* align at 4 KB and load at 1 MB + . */ .data ALIGN(0x1000) : AT(ADDR(.data) - 0xC0000000) { *(.data) /* all data sections from all files */ } /* align at 4 KB and load at 1 MB + . */ .bss ALIGN(0x1000) : AT(ADDR(.bss) - 0xC0000000) { *(COMMON) /* all COMMON sections from all files */ *(.bss) /* all bss sections from all files */ } kernel_virtual_end = .; kernel_physical_end = . - 0xC0000000; These labels can directly be read from assembly code and pushed on the stack to make them available to C code: ``` extern kernel_virtual_start extern kernel_virtual_end extern kernel_physical_start extern kernel_physical_end ; ... push kernel_physical_end push kernel_physical_start push kernel_virtual_end push kernel_virtual_start call kmain ``` This way we get the labels as arguments to `kmain`. If you want to use C instead of assembly code, one way to do it is to declare the labels as functions and take the addresses of these functions: ```c void kernel_virtual_start(void); /* ...
*/ ``` ```c unsigned int vaddr = (unsigned int) &kernel_virtual_start; ``` If you use GRUB modules, you need to make sure the memory they use is marked as reserved as well. Note that the available memory does not need to be contiguous. In the first 1 MB there are several I/O-mapped memory sections, as well as memory used by GRUB and the BIOS. Other parts of the memory might be similarly unavailable. It's convenient to divide the memory sections into complete page frames, as we can't map part of pages into memory. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 63 Context: ```markdown # Managing Available Memory How do we know which page frames are in use? The page frame allocator needs to keep track of which are free and which aren't. There are several ways to do this: bitmaps, linked lists, trees, the Buddy System (used by Linux) etc. For more information about the different algorithms see the article on OSDev [1]. Bitmaps are quite easy to implement. One bit is used for each page frame and one (or more) page frames are dedicated to store the bitmap. (Note that this is just one way to do it; other designs might be better and/or more fun to implement.) ## How Can We Access a Page Frame? The page frame allocator returns the physical start address of the page frame. This page frame is not mapped in - no page table points to this page frame. How can we read and write data to the frame? We need to map the page frame into virtual memory, by updating the PDT and/or PT used by the kernel. What if all available page tables are full? Then we can’t map the page frame into memory, because we need a new page table - which takes up an entire page frame - and to write to this page frame we’d need to map its page frame... Somehow this circular dependency must be broken. 
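Returning to the bitmap approach mentioned earlier in this section, a minimal sketch could look like the following. It is only an illustration under stated assumptions: the constants `PAGE_SIZE`, `NUM_FRAMES` and `FRAME_NONE` and the function names are hypothetical, the bitmap lives in a static array instead of in dedicated page frames, and a fixed 4 MB of memory is assumed rather than the size reported by GRUB.

```c
#include <stdint.h>

#define PAGE_SIZE  4096u
#define NUM_FRAMES 1024u          /* assumption: 4 MB of physical memory */
#define FRAME_NONE 0xFFFFFFFFu    /* returned when no frame is free */

/* One bit per page frame; a set bit means the frame is in use. */
static uint32_t frame_bitmap[NUM_FRAMES / 32];

/* Mark the frame containing physical address paddr as used
 * (for example frames occupied by the kernel or by GRUB modules). */
void frame_mark_used(uint32_t paddr)
{
    uint32_t frame = paddr / PAGE_SIZE;
    if (frame < NUM_FRAMES)
        frame_bitmap[frame / 32] |= 1u << (frame % 32);
}

/* Mark the frame containing physical address paddr as free again. */
void frame_free(uint32_t paddr)
{
    uint32_t frame = paddr / PAGE_SIZE;
    if (frame < NUM_FRAMES)
        frame_bitmap[frame / 32] &= ~(1u << (frame % 32));
}

/* Find the first free frame, mark it used and return its physical
 * start address, or FRAME_NONE if every frame is taken. */
uint32_t frame_alloc(void)
{
    for (uint32_t frame = 0; frame < NUM_FRAMES; frame++) {
        uint32_t mask = 1u << (frame % 32);
        if ((frame_bitmap[frame / 32] & mask) == 0) {
            frame_bitmap[frame / 32] |= mask;
            return frame * PAGE_SIZE;
        }
    }
    return FRAME_NONE;
}
```

The linear scan keeps the sketch short; the address it returns is physical, so the frame still has to be mapped into virtual memory before it can be read or written.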
One solution is to reserve a part of the first page table used by the kernel (or some other higher-half page table) for temporarily mapping page frames to make them accessible. If the kernel is mapped at `0xC0000000` (page directory entry with index 768), and 4 KB page frames are used, then the kernel has at least one page table. If we assume - or limit ourselves to - a kernel of size at most 4 MB minus 4 KB we can dedicate the last entry (entry 1023) of this page table for temporary mappings. The virtual address of pages mapped in using the last entry of the kernel's PT will be: ``` (768 << 22) | (1023 << 12) | 0x000 = 0xC03FF000 ``` After we’ve temporarily mapped the page frame we want to use as a page table, and set it up to map in our first page frame, we can add it to the page directory and remove the temporary mapping. ## A Kernel Heap So far we’ve only been able to work with fixed-size data, or directly with raw memory. Now that we have a page frame allocator we can implement `malloc` and `free` to use in the kernel. Kernighan and Ritchie [2] have an example implementation in their book [2] that we can draw inspiration from. The only modification we need to do is to replace calls to `sbrk/brk` with calls to the page frame allocator when more memory is needed. We must also make sure to map the page frames returned by the page frame allocator to virtual addresses. A correct implementation should also return page frames to the page frame allocator on call to `free`, whenever sufficiently large blocks of memory are freed. ## Further Reading - The OSDev wiki page on page frame allocation: [OSDev Page Frame Allocation](http://wiki.osdev.org/Page_Frame_Allocation) ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 64 Context: I'm unable to view images directly. However, if you provide the Markdown text, I can help you correct any visible mistakes and format it properly.
Please paste the text here. #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 65 Context: ```markdown # Chapter 11 ## User Mode User mode is now almost within our reach; there are just a few more steps required to get there. Although these steps might seem easy the way they are presented in this chapter, they can be tricky to implement, since there are a lot of places where small errors will cause bugs that are hard to find. ### Segments for User Mode To enable user mode, we need to add two more segments to the GDT. They are very similar to the kernel segments we added when we set up the GDT in the chapter about segmentation: | Index | Offset | Name | Address range | Type | DPL | |-------|--------|--------------------|----------------------------|------|-----| | 3 | 0x18 | user code segment | 0x00000000 - 0xFFFFFFFF | RX | PL3 | | 4 | 0x20 | user data segment | 0x00000000 - 0xFFFFFFFF | RW | PL3 | **Table 11.1:** The segment descriptors needed for user mode. The difference is the DPL, which now allows code to execute in PL3. The segments can still be used to address the entire address space; just using these segments for user mode code will not protect the kernel. For that we need paging. ### Setting Up For User Mode There are a few things every user mode process needs: - Page frames for code, data, and stack. At the moment, it suffices to allocate one page frame for the stack and enough page frames to fit the program’s code. Don’t worry about setting up a stack that can grow and shrink at this point in time; focus on getting a basic implementation to work first. - The binary from the GRUB module has to be copied to the page frames used for the program's code. - A page directory and page tables are needed to map the page frames described above into memory. At least two page tables are needed, because the code and data should be mapped in at `0x00000000`.
``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 66 Context: ```markdown and increasing, and the stack should start just below the kernel, at 0xBFFFFFFF, growing towards lower addresses. The U/S flag has to be set to allow PL3 access. It might be convenient to store this information in a `struct` representing a process. This process `struct` can be dynamically allocated with the kernel's `malloc` function. ## Entering User Mode The only way to execute code with a lower privilege level than the current privilege level (CPL) is to execute an `iret` or `lret` instruction - interrupt return or long return, respectively. To enter user mode we set up the stack as if the processor had raised an inter-privilege level interrupt. The stack should look like the following: ``` [esp + 16] ss ; the stack segment selector we want for user mode [esp + 12] esp ; the user mode stack pointer [esp + 8] eflags ; the control flags we want to use in user mode [esp + 4] cs ; the code segment selector [esp + 0] eip ; the instruction pointer of user mode code to execute ``` See the Intel manual [33], section 6.12.1, figure 6-4 for more information. The `iret` instruction will then read these values from the stack and fill in the corresponding registers. Before we execute `iret`, we need to change to the page directory we set up for the user mode process. It is important to remember that to continue executing kernel code after we've switched PDT, the kernel needs to be mapped in. One way to accomplish this is to have a separate PDT for the kernel, which maps all data at 0xC0000000 and above, and merge it with the user PDT (which only maps below 0xC0000000) when performing the switch. Remember that the physical address of the PDT has to be used when setting the control register `cr3`.
The register `eflags` contains a set of different flags, specified in section 2.3 of the Intel manual [33]. Most important for us is the interrupt enable (IF) flag. The assembly code instruction `sti` can't be used in privilege level 3 for enabling interrupts. If interrupts are disabled when entering user mode, then interrupts can't be enabled once user mode is entered. Setting the IF flag in the `eflags` entry on the stack will enable interrupts in user mode, since the assembly code instruction `iret` will set the register `eflags` to the corresponding value on the stack. For now, we should have interrupts disabled, as it requires a little more work to get inter-privilege level interrupts to work properly (see the section "System calls"). The value `eip` on the stack should point to the entry point for the user code - `0x00000000` in our case. The value `esp` on the stack should be where the stack starts - `0xC0000000 - 4`. The values `cs` and `ss` on the stack should be the segment selectors for the user code and user data segments, respectively. As we saw in the segmentation chapter, the lowest two bits of a segment selector are the RPL - Requested Privilege Level. When using `iret` to enter PL3, the RPL of `cs` and `ss` should be `0x3`. The following code shows an example: ``` USER_MODE_CODE_SEGMENT_SELECTOR equ 0x18 USER_MODE_DATA_SEGMENT_SELECTOR equ 0x20 mov cs, USER_MODE_CODE_SEGMENT_SELECTOR | 0x3 mov ss, USER_MODE_DATA_SEGMENT_SELECTOR | 0x3 ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 67 Context: ```markdown The register `ds`, and the other data segment registers, should be set to the same segment selector as `ss`. They can be set the ordinary way, with the `mov` assembly code instruction. We are now ready to execute `iret`. If everything has been set up right, we should now have a kernel that can enter user mode.
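To make the five values that `iret` pops more concrete, here is a small sketch in C that fills them in for a user mode process. The struct `iret_frame`, the function `make_user_frame` and the constants `RPL_USER`, `EFLAGS_RESERVED` and `EFLAGS_IF` are hypothetical names chosen for illustration; the selector values follow the table of user mode segments, and actually placing the values on the stack and executing `iret` would still have to be done in assembly code.

```c
#include <stdint.h>

#define USER_MODE_CODE_SEGMENT_SELECTOR 0x18u
#define USER_MODE_DATA_SEGMENT_SELECTOR 0x20u
#define RPL_USER        0x3u    /* requested privilege level 3 */
#define EFLAGS_RESERVED 0x2u    /* bit 1 of eflags is reserved and always 1 */
#define EFLAGS_IF       0x200u  /* bit 9 of eflags: interrupt enable flag */

/* Fields in ascending address order, i.e. the order in which iret pops them. */
struct iret_frame {
    uint32_t eip;     /* entry point of the user mode code */
    uint32_t cs;      /* user code segment selector with RPL 3 */
    uint32_t eflags;  /* flags to use in user mode */
    uint32_t esp;     /* user mode stack pointer */
    uint32_t ss;      /* user data segment selector with RPL 3 */
};

/* Hypothetical helper: compute the values to place on the stack before iret. */
struct iret_frame make_user_frame(uint32_t entry, uint32_t stack_top,
                                  int enable_interrupts)
{
    struct iret_frame f;
    f.eip    = entry;
    f.cs     = USER_MODE_CODE_SEGMENT_SELECTOR | RPL_USER;
    f.eflags = EFLAGS_RESERVED | (enable_interrupts ? EFLAGS_IF : 0u);
    f.esp    = stack_top;
    f.ss     = USER_MODE_DATA_SEGMENT_SELECTOR | RPL_USER;
    return f;
}
```

For the setup described in this chapter, the call would be roughly `make_user_frame(0x00000000, 0xC0000000 - 4, 0)`, which leaves interrupts disabled in user mode.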
# Using C for User Mode Programs When C is used as the programming language for user mode programs, it is important to think about the structure of the file that will be the result of the compilation. The reason we can use ELF as the file format for the kernel executable is because GRUB knows how to parse and interpret the ELF file format. If we implemented an ELF parser, we could compile the user mode programs into ELF binaries as well. We leave this as an exercise for the reader. One thing we can do to make it easier to develop user mode programs is to allow the programs to be written in C, but compile them to flat binaries instead of ELF binaries. In C the layout of the generated code is more unpredictable and the entry point, `main`, might not be at offset 0 in the binary. One common way to work around this is to add a few assembly code lines placed at offset 0 which call `main`: ```assembly extern main section .text ; push argv ; push argc call main ; main has returned, eax is return value jmp $ ; loop forever ``` If this code is saved in a file called `start.s`, then the following code shows an example of a linker script that places these instructions first in the executable (remember that `start.s` gets compiled to `start.o`): ```plaintext OUTPUT_FORMAT("binary") /* output flat binary */ SECTIONS { . = 0; /* relocate to address 0 */ .text ALIGN(4): { start.o(.text) /* include the .text section of start.o */ *(.text) /* include all other .text sections */ } .data ALIGN(4): { *(.data) } .rodata ALIGN(4): { *(.rodata*) } } ``` ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 68 Context: ```markdown Note: `*(.text)` will not include the `.text` section of `start.o` again.
With this script we can write programs in C or assembler (or any other language that compiles to object files linkable with `ld`), and it is easy to load and map for the kernel (`.rodata` will be mapped in as writable, though). When we compile user programs we want the following GCC flags: ``` -m32 -nostdlib -nostdinc -fno-builtin -fno-stack-protector -nostartfiles -nodefaultlibs ``` For linking, the following flags should be used: ``` -T link.ld -melf_i386 # emulate 32 bits ELF, the binary output is specified # in the linker script ``` The option `-T` instructs the linker to use the linker script `link.ld`. ## A C Library It might now be interesting to start thinking about writing a small “standard library” for your programs. Some of the functionality requires system calls to work, but some, such as the functions in `string.h`, does not. ## Further Reading - Gustavo Duarte has an article on privilege levels: [http://duartes.org/gustavo/blog/post/cpu-rings-privilege-and-protection](http://duartes.org/gustavo/blog/post/cpu-rings-privilege-and-protection) ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 69 Context: ```markdown # Chapter 12 ## File Systems We are not required to have file systems in our operating system, but it is a very usable abstraction, and it often plays a central part of many operating systems, especially UNIX-like operating systems. Before we start the process of supporting multiple processes and system calls, we might want to consider implementing a simple file system. ### Why a File System? How do we specify what programs to run in our OS? Which is the first program to run? How do programs output data or read input? In UNIX-like systems, with their almost-everything-is-a-file convention, these problems are solved by the file system. 
(It might also be interesting to read a bit about the Plan 9 project, which takes this idea one step further.) ### A Simple Read-Only File System The simplest file system might be what we already have - one file, existing only in RAM, loaded by GRUB before the kernel starts. When the kernel and operating system grow, this is probably too limiting. A file system that is slightly more advanced than just the bits of one file is a file with metadata. The metadata can describe the type of the file, the size of the file, and so on. A utility program can be created that runs at build time, adding this metadata to a file. This way, a “file system in a file” can be constructed by concatenating several files with metadata into one large file. The result of this technique is a read-only file system that resides in memory (once GRUB has loaded the file). The program constructing the file system can traverse a directory on the host system and all subdirectories and files as part of the target file system. Each object in the file system (directory or file) can consist of a header and a body, where the body of a file is the actual file and the body of a directory is a list of entries - names and “addresses” of other files and directories. Each object in this file system will become contiguous, so they will be easy to read from memory for the kernel. All objects will also have a fixed size (except for the last one, which can grow), therefore it is difficult to add new files or modify existing ones. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 70 Context: ```markdown # Inodes and Writable File Systems When the need for a writable file system arises, then it is a good idea to look into the concept of an **inode**. See the section “Further Reading” for recommended reading. 
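For the simple read-only file system described above, where each object is a header followed by a body and all objects are concatenated into one image, a lookup routine could be sketched as follows. The header layout (`struct fs_header` with a `type` and a `size` field) and the function `fs_body_offset` are assumptions made for this example; the book does not prescribe a concrete format.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-image header: every object is a header followed by its body. */
struct fs_header {
    uint32_t type;  /* assumed encoding: 0 = file, 1 = directory */
    uint32_t size;  /* size of the body in bytes */
};

/* Return the offset of the body of object number `index` within the image,
 * or -1 if the image ends before that object is reached. */
long fs_body_offset(const uint8_t *image, size_t image_size, unsigned index)
{
    size_t off = 0;
    for (;;) {
        struct fs_header h;
        if (image_size - off < sizeof h)
            return -1;                      /* no room for another header */
        memcpy(&h, image + off, sizeof h);  /* avoid unaligned access */
        off += sizeof h;
        if (h.size > image_size - off)
            return -1;                      /* body would run past the image */
        if (index == 0)
            return (long)off;
        index--;
        off += h.size;                      /* skip the body, go to next header */
    }
}
```

Because every object is contiguous, the kernel only needs the start address of the image loaded by GRUB and this kind of header walk to find a file's contents.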
## A Virtual File System What abstraction should be used for reading and writing to devices such as the screen and the keyboard? A virtual file system (VFS) creates an abstraction on top of the concrete file systems. A VFS mainly supplies the path system and file hierarchy; it delegates operations on files to the underlying file systems. The original paper on VFS is succinct and well worth a read. See the section “Further Reading” for a reference. With a VFS we could mount a special file system on the path `/dev`. This file system would handle all devices such as keyboards and the console. However, one could also take the traditional UNIX approach, with major/minor device numbers and `mknod` to create special files for devices. Which approach you think is the most appropriate is up to you; there is no right or wrong when building abstraction layers (although some abstractions turn out way more useful than others). ## Further Reading - The ideas behind the Plan 9 operating systems is worth taking a look at: [Plan 9](http://plan9.bell-labs.com/plan9/index.html) - Wikipedia's page on inodes: [Inode](http://en.wikipedia.org/wiki/Inode) and the inode pointer structure: [Inode Pointer Structure](http://en.wikipedia.org/wiki/Inode_pointer_structure). - The original paper on the concept of inodes and a virtual file system is quite interesting: [Kleinain](http://www.sun.com/west/edu/fredk/Courses/cs253/fall01/Papers/kleinain86/inodes.pdf). - Poul-Henning Kamp discusses the idea of a special file system for `/dev` in [this paper](http://static.usenix.org/publications/library/proceedings/bsdcon02/full_papers/kamp/kamp_html/index.html). ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 71 Context: ```markdown # Chapter 13 ## System Calls System calls are the way user-mode applications interact with the kernel—to ask for resources, request operations to be performed, etc. 
The system call API is the part of the kernel that is most exposed to the users; therefore, its design requires some thought. ## Designing System Calls It is up to us, the kernel developers, to design the system calls that application developers can use. We can draw inspiration from the POSIX standards or, if they seem like too much work, just look at the ones for Linux, and pick and choose. See the section “Further Reading” at the end of the chapter for references. ## Implementing System Calls System calls are traditionally invoked with software interrupts. The user applications put the appropriate values in registers or on the stack and then initiate a pre-defined interrupt which transfers execution to the kernel. The interrupt number used is dependent on the kernel; Linux uses the number 0x80 to identify that an interrupt is intended as a system call. When system calls are executed, the current privilege level is typically changed from PL3 to PL0 (if the application is running in user mode). To allow this, the DPL of the entry in the IDT for the system call interrupt needs to allow PL3 access. Whenever inter-privilege level interrupts occur, the processor pushes a few important registers onto the stack - the same ones we used to enter user mode before (see Figure 6-4, section 6.12.1, in the Intel manual [33]). What stack is used? The same section in [33] specifies that if an interrupt leads to code executing at a numerically lower privilege level, a stack switch occurs. The new values for the registers `ss` and `esp` are loaded from the current Task State Segment (TSS). The TSS structure is specified in Figure 7-2, section 7.2.1 of the Intel manual [33]. To enable system calls, we need to set up a TSS before entering user mode. Setting it up can be done in C by setting the `ss0` and `esp0` fields of a “packed struct” that represents a TSS. Before loading the “packed struct” into the processor, a TSS descriptor has to be added to the GDT.
The structure of the TSS descriptor is described in section 7.2.2 in [33]. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 72 Context: ```markdown You specify the current TSS segment selector by loading it into the `tr` register with the `ltr` assembly code instruction. If the TSS segment descriptor has index 5, and thus offset `5 * 8 = 40 = 0x28`, this is the value that should be loaded into the register `tr`. When we entered user mode before in the chapter “Entering User Mode” we disabled interrupts when executing in PL3. Since system calls are implemented using interrupts, interrupts must be enabled in user mode. Setting the IF flag bit in the `eflags` value on the stack will make `iret` enable interrupts (since the `eflags` value on the stack will be loaded into the `eflags` register by the assembly code instruction `iret`). ## Further Reading - The Wikipedia page on POSIX, with links to the specifications: [POSIX](http://en.wikipedia.org/wiki/POSIX) - A list of system calls used in Linux: [System Calls](http://bluemaster.in.ho.edu/dark/lhn-asm/syscalls.html) - The Wikipedia page on system calls: [System Call](http://en.wikipedia.org/wiki/System_call) - The Intel manual sections on interrupts (chapter 6) and TSS (chapter 7) are where you get all the details you need. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 73 Context: ```markdown # Chapter 14 ## Multitasking How do you make multiple processes appear to run at the same time? Today, this question has two answers: - With the availability of multi-core processors, or on systems with multiple processors, two processes can actually run at the same time by running two processes on different cores or processors. - Fake it. That is, switch rapidly (faster than a human can notice) between the processes.
At any given moment there is only one process executing, but the rapid switching gives the impression that they are running “at the same time.” Since the operating system created in this book does not support multi-core processors or multiple processors, the only option is to fake it. The part of the operating system responsible for rapidly switching between the processes is called the **scheduling algorithm**. ## Creating New Processes Creating new processes is usually done with two different system calls: `fork` and `exec`. `fork` creates an exact copy of the currently running process, while `exec` replaces the current process with one that is specified by a path to the location of a program in the file system. Of these two, we recommend that you start implementing `exec`, since this system call will do almost exactly the same steps as described in the section “Setting up for user mode” in the chapter “User Mode.” ## Cooperative Scheduling with Yielding The easiest way to achieve rapid switching between processes is if the processes themselves are responsible for the switching. The processes run for a while and then tell the OS (via a system call) that it can now switch to another process. Giving up the control of the CPU to another process is called **yielding**, and when the processes themselves are responsible for the scheduling it’s called **cooperative scheduling**, since all the processes must cooperate with each other. When a process yields the process’ entire state must be saved (all the registers), preferably on the kernel heap in a structure that represents a process. When changing to a new process all the registers must be restored from the saved values. ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 74 Context: ```markdown Scheduling can be implemented by keeping a list of which processes are running. 
The system call `yield` should then run the next process in the list and put the current one last (other schemes are possible, but this is a simple one). The transfer of control to the new process is done via the `iret` assembly code instruction in exactly the same way as explained in the section “Entering user mode” in the chapter “User Mode”. ## Preemptive Scheduling with Interrupts Instead of letting the processes themselves manage when to change to another process, the OS can switch processes automatically after a short period of time. The OS can set up the programmable interval timer (PIT) to raise an interrupt after a short period of time, for example 20 ms. In the interrupt handler for the PIT, the OS will change the running process to a new one. This way the processes themselves don't need to worry about scheduling. This kind of scheduling is called preemptive scheduling. ## Programmable Interval Timer To be able to do preemptive scheduling the PIT must first be configured to raise interrupts every x milliseconds, where x should be configurable. The configuration of the PIT is very similar to the configuration of other hardware devices: a byte is sent to an I/O port. The command port of the PIT is `0x43`. To read about all the configuration options, see the article about the PIT on OSDev [39]. We use the following options: - Raise interrupts (use channel 0) - Send the divider as low byte then high byte (see next section for an explanation) - Use a square wave - Use binary mode This results in the configuration byte `00110110`. Setting the interval for how often interrupts are to be raised is done via a divider, the same way as for the serial port. Instead of sending the PIT a value (e.g. in milliseconds) that says how often an interrupt should be raised, you send the divider. The PIT operates at 1193182 Hz by default. Sending the divider 10 results in the PIT running at 1193182 / 10 = 119318 Hz.
The divider can only be 16 bits, so it is only possible to configure the timer’s frequency between 1193182 Hz and 1193182 / 65536 = 18.2 Hz. We recommend that you create a function that takes an interval in milliseconds and converts it to the correct divider. The divider is sent to the channel 0 data I/O port of the PIT, but since only one byte can be sent at a time, the lowest 8 bits of the divider have to be sent first, then the highest 8 bits of the divider can be sent. The channel 0 data I/O port is located at `0x40`. Again, see the article on OSDev [39] for more details. ## Separate Kernel Stacks for Processes If all processes use the same kernel stack (the stack exposed by the TSS) there will be trouble if a process is interrupted while still in kernel mode. The process that is being switched to will now use the same kernel ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 75 Context: ```markdown # Difficulties with Preemptive Scheduling When using preemptive scheduling, one problem arises that doesn't exist with cooperative scheduling. With cooperative scheduling, every time a process yields, it must be in user mode (privilege level 3), since yield is a system call. With preemptive scheduling, the processes can be interrupted in either user mode or kernel mode (privilege level 0), since the process itself does not control when it gets interrupted. Interrupting a process in kernel mode is a little bit different than interrupting a process in user mode, due to the way the CPU sets up the stack at interrupts. If a privilege level change occurred (the process was interrupted in user mode), the CPU will push the values of the process's `ss` and `esp` registers on the stack. If no privilege level change occurs (the process was interrupted in kernel mode), the CPU won't push the `esp` register on the stack.
Furthermore, if there was no privilege level change, the CPU won't change stack to the one defined in the TSS. This problem is solved by calculating what the value of `esp` was before the interrupt. Since you know that the CPU pushes 3 things on the stack when no privilege change happens and you know how much you have pushed on the stack, you can calculate what the value of `esp` was at the time of the interrupt. This is possible since the CPU won't change stacks if there is no privilege level change, so the content of `esp` will be the same as at the time of the interrupt. To further complicate things, one must think of how to handle the case when switching to a new process that should be running in kernel mode. Since `iret` is being used without a privilege level change, the CPU won't update the value of `esp` with the one placed on the stack - you must update it yourself. ## Further Reading - For more information about different scheduling algorithms, see [OSDev Scheduling Algorithms](http://wiki.osdev.org/Scheduling_Algorithms). ``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 76 Context: I'm unable to view or analyze images. Please provide the text you'd like me to correct, and I'll help format it in Markdown. #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 77 Context: ```markdown # References 1. Andrew Tanenbaum, 2007. *Modern operating systems*, 3rd edition. Prentice Hall, Inc. 2. The Royal Institute of Technology, [http://www.kth.se](http://www.kth.se). 3. Wikipedia, [Hexadecimal](http://en.wikipedia.org/wiki/Hexadecimal). 4. OSDev, [OSDev](http://wiki.osdev.org/Main_Page). 5. James Molloy, [James's kernel development tutorial](http://www.jamesmolloy.co.uk/tutorial_html/). 6. Canonical Ltd, [Ubuntu](http://www.ubuntu.com/). 7.
Oracle, [Oracle VM VirtualBox](http://www.virtualbox.org/). 8. Dennis M. Ritchie and Brian W. Kernighan, 1988. *The C programming language*, second edition. Prentice Hall, Inc. 9. Wikipedia, [C (programming language)](http://en.wikipedia.org/wiki/C_(programming_language)). 10. Free Software Foundation, [GCC, the GNU Compiler Collection](http://gcc.gnu.org/). 11. NASM, [NASM: The Netwide Assembler](http://www.nasm.us/). 12. Wikipedia, [Bash](http://en.wikipedia.org/wiki/Bash_%28Unix_shell%29). 13. Free Software Foundation, [GNU make](http://www.gnu.org/software/make/). 14. Volker Ruppert, [Bochs: The Open Source IA-32 Emulation Project](http://bochs.sourceforge.net/). 15. QEMU, [QEMU](http://wiki.qemu.org/Main_Page). 16. Wikipedia, [BIOS](http://en.wikipedia.org/wiki/BIOS). 17. Free Software Foundation, [GNU GRUB](http://www.gnu.org/software/grub/). 18. Wikipedia, [Executable and Linkable Format](http://en.wikipedia.org/wiki/Executable_and_Linkable_Format). 19. Free Software Foundation, [Multiboot Specification Version 0.6.96](http://www.gnu.org/software/grub/manual/multiboot/multiboot.html). 20. GNU, [GNU binutils](http://www.gnu.org/software/binutils/). 21. Lars Noodén, Bug #426419: *configure: error: GRUB requires a working absolute objcopy*, [https://bugs.launchpad.net/ubuntu/+source/grub/+bug/426419](https://bugs.launchpad.net/ubuntu/+source/grub/+bug/426419). 22. Wikipedia, [ISO Image](http://en.wikipedia.org/wiki/ISO_image). 23. Bochs, [Bochs](http://bochs.sourceforge.net/doc/bochs/user/bochsrel.html). 24. NASM, [RESB and Friends: Declaring Uninitialized Data](http://www.nasm.us/doc/nasmdoc3.htm).
``` #################### File: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf Page: 78 Context: ```markdown 25. Wikipedia, [x86 calling conventions](http://en.wikipedia.org/wiki/X86_calling_conventions). 26. Wikipedia, [Framebuffer](http://en.wikipedia.org/wiki/Framebuffer). 27. Wikipedia, [VGA-compatible text mode](http://en.wikipedia.org/wiki/VGA-compatible_text_mode). 28. Wikipedia, [ASCII](https://en.wikipedia.org/wiki/ASCII). 29. OSDev, [VGA hardware](http://wiki.osdev.org/VGA_Hardware). 30. Wikipedia, [Serial port](http://en.wikipedia.org/wiki/Serial_port). 31. OSDev, [Serial ports](http://wiki.osdev.org/Serial_ports). 32. WikiBooks, [Serial programming/8250 UART programming](http://en.wikibooks.org/wiki/Serial-Programming/8250_UART_Programming). 33. Intel, [Intel 64 and IA-32 architectures software developer's manual vol. 3A](http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-3a-part-1-manual.html). 34. NASM, [Multi-line macros](http://www.nasm.us/doc/nasmdoc.html#section-4.3). 35. SIGOPS, [i386 interrupt handling](http://www.acm.uiuc.edu/sigops/roll_your_own/i386/irq.html). 36. Andries Brouwer, [Keyboard scancodes](http://www.win.tue.nl/). 37. Steve Chamberlain, [Using ld, the GNU linker](http://www.math.utah.edu/docs/info/ld_toc.html). 38. OSDev, [Page frame allocation](http://wiki.osdev.org/Page_Frame_Allocation). 39. OSDev, [Programmable interval timer](http://wiki.osdev.org/Programmable_Interval_Timer). ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 1 Context: ```markdown # Web API Design ## Crafting Interfaces that Developers Love ### Author: Brian Mulloy --- ### Table of Contents 1. **Introduction** - Overview of Web APIs - Importance of Good Design 2. **Design Principles** - Consistency - Simplicity - Flexibility 3. **Best Practices** - Documentation - Versioning - Error Handling 4.
**Tools and Frameworks** - Postman - Swagger - RAML 5. **Conclusion** - Summary of Key Points --- ### Links - [Postman](https://www.postman.com) - [Swagger](https://swagger.io) - [RAML](https://raml.org) --- ### Notes - Always aim for clarity and conciseness when designing APIs. - Engage with developers to understand their needs and challenges. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 2 Context: ```markdown # Table of Contents ## Web API Design - Crafting Interfaces that Developers Love 1. [Introduction](#introduction) ....................................................................... 3 2. [Nouns are good; verbs are bad](#nouns-are-good-verbs-are-bad) .......... 4 3. [Plural nouns and concrete names](#plural-nouns-and-concrete-names) ... 8 4. [Simplify associations - sweep complexity under the ‘?’](#simplify-associations) ........................... 9 5. [Handling errors](#handling-errors) ................................................................. 10 6. [Tips for versioning](#tips-for-versioning) ................................................... 13 7. [Pagination and partial response](#pagination-and-partial-response) .......... 16 8. [What about responses that don’t involve resources?](#what-about-responses) .......................... 19 9. [Supporting multiple formats](#supporting-multiple-formats) ........................ 20 10. [What about attribute names?](#what-about-attribute-names) ...................... 21 11. [Tips for search](#tips-for-search) ............................................................... 22 12. [Consolidate API requests in one subdomain](#consolidate-api-requests) ... 23 13. [Tips for handling exceptional behavior](#tips-for-handling-exceptional-behavior) .... 25 14. [Authentication](#authentication) ............................................................ 27 15. [Making requests on your API](#making-requests-on-your-api) .................. 28 16. 
[Chatty APIs](#chatty-apis) ............................................................. 30 17. [Complement with an SDK](#complement-with-an-sdk) ............................. 31 18. [The API Façade Pattern](#the-api-facade-pattern) .................................... 32 ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 3 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## Introduction If you're reading this, chances are that you care about designing Web APIs that developers will love and that you're interested in applying proven design principles and best practices to your Web API. One of the sources for our design thinking is REST. Because REST is an architectural style and not a strict standard, it allows for a lot of flexibility. Because of that flexibility and freedom of structure, there is also a big appetite for design best practices. This e-book is a collection of design practices that we have developed in collaboration with some of the leading API teams around the world as they craft their API strategy through a design workshop that we provide at Apigee. We call our point of view in API design "pragmatic REST," because it places the success of the developer over and above any other design principle. The developer is the customer for the Web API. The success of an API design is measured by how quickly developers can get up to speed and start enjoying success using your API. We’d love your feedback—whether you agree, disagree, or have some additional practices and tips to add. The [API Craft Google Group](https://groups.google.com/g/api-craft) is a place where Web API design enthusiasts come together to share and debate design practices—we’d love to see you there. ## Are you a Pragmatist or a RESTafarian? Let’s start with our overall point of view on API design. We advocate pragmatic, not dogmatic REST. What do we mean by dogmatic? 
You might have seen discussion threads on true REST - some of them can get pretty strict and wonky. Mike Schinkel sums it up well - defining a RESTafarian as follows: > “A RESTafarian is a zealous proponent of the REST [software architectural style](https://en.wikipedia.org/wiki/Representational_state_transfer) as defined by Roy T. Fielding in Chapter 5 of his PhD dissertation at UC Irvine. You can find RESTafarians in the wild on the [REST-discuss mailing list](https://groups.google.com/g/rest-discuss). But be careful, RESTafarians can be extremely meticulous when discussing the finer points of REST ...” Our view: approach API design from the ‘outside-in’ perspective. This means we start by asking - what are we trying to achieve with an API? The API's job is to make the developer as successful as possible. The orientation for APIs is to think about design choices from the application developer's point of view. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 4 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love Why? Look at the value chain below – the application developer is the lynchpin of the entire API strategy. The primary design principle when crafting your API should be to maximize developer productivity and success. This is what we call pragmatic REST. ![Value Chain](url_to_value_chain_image) ## Pragmatic REST is a design problem You have to get the design right because design communicates how something will be used. The question becomes - what is the design with optimal benefit for the app developer? The developer point of view is the guiding principle for all the specific tips and best practices we’ve compiled. ## Nouns are good; verbs are bad The number one principle in pragmatic RESTful design is: keep simple things simple. ## Keep your base URL simple and intuitive The base URL is the most important design affordance of your API. 
A simple and intuitive base URL design makes using your API easy. Affordance is a design property that communicates how something should be used without requiring documentation. A door handle's design should communicate whether you pull or push. Here's an example of a conflict between design affordance and documentation - not an intuitive interface! ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 5 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ![Push doors](image_url) A key litmus test we use for Web API design is that there should be only **2 base URLs per resource**. Let's model an API around a simple object or resource, a dog, and create a Web API for it. The first URL is for a collection; the second is for a specific element in the collection: - `/dogs` - `/dogs/1234` Boiling it down to this level will also force the verbs out of your base URLs. ## Keep verbs out of your base URLs Many Web APIs start by using a method-driven approach to URL design. These method-based URLs sometimes contain verbs—sometimes at the beginning, sometimes at the end. For any resource that you model, like our dog, you can never consider one object in isolation. Rather, there are always related and interacting resources to account for—like owners, veterinarians, leashes, food, squirrels, and so on. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 6 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love Think about the method calls required to address all the objects in the dogs' world. 
The URLs for our resource might end up looking something like this: | URL | URL | |------------------------------------|------------------------------------| | `/getAllDogs` | `/getAllLeashedDogs` | | `/verifyLocation` | `/verifyVeterinarianLocation` | | `/feedNeeded` | `/feedNeededFood` | | `/createRecurringWakeUp` | `/createRecurringMedication` | | `/giveDirectOrder` | `/doClientOwnerDiscipline` | | `/checkHealth` | `/doExpressCheckupWithVeterinarian` | | `/getRecurringWakeUpSchedule` | `/getRecurringFeedingSchedule` | | `/getLocation` | `/getHungerLevel` | | `/getDog` | `/getSquirrelsChasingPuppies` | | `/newDog` | `/newDogForOwner` | | `/getNewDogsSince` | `/getNewDogsAtKennelSince` | | `/getRedDogs` | `/getRedDogsWithoutSiblings` | | `/getSittingDogs` | `/getSittingDogsAtPark` | | `/setDogsStateTo` | `/setLeashedDogsStateTo` | | `/replaceSittingDogsWithRunningDogs` | `/replaceParkSittingDogsWithRunningDogs` | | `/saveDog` | `/saveMommaDogsPuppies` | It's a slippery slope - soon you have a long list of URLs and no consistent pattern, which makes it difficult for developers to learn how to use your API. ## Use HTTP verbs to operate on the collections and elements. For our dog resources, we have two base URLs that use nouns as labels, and we can operate on them with HTTP verbs. Our HTTP verbs are `POST`, `GET`, `PUT`, and `DELETE`. (We think of them as mapping to the acronym CRUD: Create-Read-Update-Delete.) With our two resources (`/dogs` and `/dogs/1234`) and the four HTTP verbs, we have a rich set of capabilities that is intuitive to the developer. Here is a chart that shows what we mean for our dogs.
``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 7 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love | Resource | POST create | GET read | PUT update | DELETE delete | |-----------------|---------------------|----------------|-------------------------|-------------------| | /dogs | Create a new dog | List dogs | Bulk update dogs | Delete all dogs | | /dogs/1234 | Error | Show Bo | If exists update Bo | Delete Bo | | | | | If not error | | The point is that developers probably don't need the chart to understand how the API behaves. They can experiment with and learn the API without the documentation. In summary: - Use two base URLs per resource. - Keep verbs out of your base URLs. - Use HTTP verbs to operate on the collections and elements. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 8 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## Plural nouns and concrete names Let’s explore how to pick the nouns for your URLs. Should you choose singular or plural nouns for your resource names? You’ll see popular APIs use both. Let’s look at a few examples: - **Foursquare:** `/checkins` - **GroupOn:** `/deals` - **Zappos:** `/Product` Given that the first thing most people probably do with a RESTful API is a GET, we think it reads more easily and is more intuitive to use plural nouns. But above all, avoid a mixed model in which you use singular for some resources, plural for others. Being consistent allows developers to predict and guess the method calls as they learn to work with your API. ## Concrete names are better than abstract Achieving pure abstraction is sometimes a goal of API architects. However, that abstraction is not always meaningful for developers. Take for example an API that accesses content in various forms - blogs, videos, news articles, and so on. 
An API that models everything at the highest level of abstraction - as `/items` or `/assets` in our example - loses the opportunity to paint a tangible picture for developers to know what they can do with this API. It is more compelling and useful to see the resources listed as blogs, videos, and news articles. The level of abstraction depends on your scenario. You also want to expose a manageable number of resources. Aim for concrete naming and to keep the number of resources between 12 and 24. In summary, an intuitive API uses plural rather than singular nouns, and concrete rather than abstract names. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 9 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## Simplify associations - sweep complexity under the '?' In this section, we explore API design considerations when handling associations between resources and parameters like states and attributes. ### Associations Resources almost always have relationships to other resources. What's a simple way to express these relationships in a Web API? Let's look again at the API we modeled in the section “Nouns are good; verbs are bad” - the API that interacts with our `dogs` resource. Remember, we had two base URLs: `/dogs` and `/dogs/1234`. We're using HTTP verbs to operate on the resources and collections. Our dogs belong to owners. To get all the dogs belonging to a specific owner, or to create a new dog for that owner, do a `GET` or a `POST`: ``` GET /owners/5678/dogs POST /owners/5678/dogs ``` Now, the relationships can be complex. Owners have relationships with veterinarians, who have relationships with dogs, who have relationships with food, and so on. It's not uncommon to see people string these together, making a URL 5 or 6 levels deep. Remember that once you have the primary key for one level, you usually don't need to include the levels above because you've already got your specific object.
In other words, you shouldn't need too many cases where a URL is deeper than what we have above: ``` /resource/identifier/resource ``` ### Sweep complexity behind the '?' Most APIs have intricacies beyond the base level of a resource. Complexities can include many states that can be updated, changed, queried, as well as the attributes associated with a resource. Make it simple for developers to use the base URL by putting optional states and attributes behind the HTTP question mark. To get all red dogs running in the park: ``` GET /dogs?color=red&state=running&location=park ``` In summary, keep your API intuitive by simplifying the associations between resources, and sweeping parameters and other complexities under the rug of the HTTP question mark. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 10 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## Handling Errors Many software developers, including myself, don't always like to think about exceptions and error handling, but it is a very important piece of the puzzle for any software developer, and especially for API designers. ### Why is good error design especially important for API designers? From the perspective of the developer consuming your Web API, everything at the other side of that interface is a black box. Errors therefore become a key tool providing context and visibility into how to use an API. First, developers learn to write code through errors. The "test-first" concepts of the extreme programming model and the more recent "test-driven development" models represent a body of best practices that have evolved because this is such an important and natural way for developers to work. 
Secondly, in addition to when they're developing their applications, developers depend on well-designed errors at the critical times when they are troubleshooting and resolving issues after the applications they've built using your API are in the hands of their users. ### How to think about errors in a pragmatic way with REST? Let's take a look at how three top APIs approach it. #### Facebook **HTTP Status Code:** 200 ```json { "type": "OauthException", "message": "(#803) Some of the aliases you requested do not exist: foo.bar" } ``` #### Twilio **HTTP Status Code:** 401 ```json { "status": "401", "message": "Authenticate", "code": 20003, "more_info": "http://www.twilio.com/docs/errors/20003" } ``` #### SimpleGeo **HTTP Status Code:** 401 ```json { "code": 401, "message": "Authentication Required" } ``` ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 11 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## Facebook No matter what happens on a Facebook request, you get back the 200-status code - everything is OK. Many error messages also push down into the HTTP response. Here they also throw an #803 error but with no information about what #803 is or how to react to it. ## Twilio Twilio does a great job aligning errors with HTTP status codes. Like Facebook, they provide a more granular error message but with a link that takes you to the documentation. Community commenting and discussion on the documentation helps to build a body of information and adds context for developers experiencing these errors. ## SimpleGeo SimpleGeo provides error codes but with no additional value in the payload. ## A couple of best practices ### Use HTTP status codes Use HTTP status codes and try to map them cleanly to relevant standard-based codes. There are over 70 HTTP status codes. However, most developers don't have all 70 memorized. 
So if you choose status codes that are not very common you will force application developers away from building their apps and over to Wikipedia to figure out what you're trying to tell them. Therefore, most API providers use a small subset. For example, the Google GData API uses only 10 status codes; Netflix uses 9, and Digg, only 8. ### Google GData ``` | Code | Explanation | |------|-------------| | 200 | OK | | 201 | Created | | 304 | Not Modified | | 400 | Bad Request | | 401 | Unauthorized| | 403 | Forbidden | | 404 | Not Found | | 409 | Conflict | | 410 | Gone | | 500 | Internal Server Error | ``` ### Netflix ``` | Code | Explanation | |------|-------------| | 200 | OK | | 201 | Created | | 304 | Not Modified | | 400 | Bad Request | | 401 | Unauthorized| | 403 | Forbidden | | 404 | Not Found | | 412 | Precondition Failed | | 500 | Internal Server Error | ``` ### Digg ``` | Code | Explanation | |------|-------------| | 200 | OK | | 400 | Bad Request | | 401 | Unauthorized| | 403 | Forbidden | | 404 | Not Found | | 410 | Gone | | 500 | Internal Server Error | | 503 | Service Unavailable | ``` ### How many status codes should you use for your API? When you boil it down, there are really only 3 outcomes in the interaction between an app and an API: - Everything worked - success - The application did something wrong - client error - The API did something wrong - server error ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 12 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love Start by using the following 3 codes. If you need more, add them. But you shouldn't need to go beyond 8. 
- **200** - OK - **400** - Bad Request - **500** - Internal Server Error If you're not comfortable reducing all your error conditions to these 3, try picking among these additional 5: - **201** - Created - **304** - Not Modified - **404** - Not Found - **401** - Unauthorized - **403** - Forbidden (Check out this good Wikipedia entry for all [HTTP Status codes](https://en.wikipedia.org/wiki/List_of_HTTP_status_codes).) It is important that the code that is returned can be consumed and acted upon by the application's business logic - for example, in an if-then-else, or a case statement. ## Make messages returned in the payload as verbose as possible. ### Code for code - **200** - OK - **401** - Unauthorized ### Message for people ```json { "developerMessage": "Verbose, plain language description of the problem for the app developer with hints about how to fix it.", "userMessage": "Pass this message on to the app user if needed.", "errorCode": 12345, "more info": "http://dev.teachdogrest.com/errors/12345" } ``` In summary, be verbose and use plain language descriptions. Add as many hints as your API team can think of about what's causing an error. We highly recommend you add a link in your description to more information, like Twilio does. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 13 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## Tips for Versioning Versioning is one of the most important considerations when designing your Web API. ### Never release an API without a version and make the version mandatory. Let's see how three top API providers handle versioning.
| Provider | Versioning Example | |------------------|--------------------------------------| | Twilio | `/2010-04-01/Accounts/` | | salesforce.com | `/services/data/v20.0/sobjects/Account` | | Facebook | `?v=1.0` | **Twilio uses a timestamp in the URL (note the European format).** At compilation time, the developer includes the timestamp of the application when the code was compiled. That timestamp goes in all the HTTP requests. When a request arrives, Twilio does a look-up. Based on the timestamp, they identify the API that was valid when this code was created and route accordingly. It's a very clever and interesting approach, although we think it is a bit complex. For example, it can be confusing to understand whether the timestamp is the compilation time or the timestamp when the API was released. **Salesforce.com uses v20.0, placed somewhere in the middle of the URL.** We like the use of the v. notation. However, we don't like using the .0 because it implies that the interface might be changing more frequently than it should. The logic behind an interface can change rapidly but the interface itself shouldn’t change frequently. **Facebook also uses the v. notation but makes the version an optional parameter.** This is problematic because as soon as Facebook forced the API up to the next version, all the apps that didn’t include the version number broke and had to be pulled back and version number added. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 14 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## How to think about version numbers in a pragmatic way with REST? - Never release an API without a version. Make the version mandatory. - Specify the version with a `v` prefix. Move it all the way to the left in the URL so that it has the highest scope (e.g. `/v1/dogs`). - Use a simple ordinal number. 
Don't use the dot notation like `v1.2` because it implies a granularity of versioning that doesn't work well with APIs—it's an interface not an implementation. Stick with `v1`, `v2`, and so on. ### How many versions should you maintain? Maintain at least one version back. ### For how long should you maintain a version? Give developers at least one cycle to react before obsoleting a version. Sometimes that's 6 months; sometimes it's 2 years. It depends on your developers' development platform, application type, and application users. For example, mobile apps take longer to rev than Web apps. ## Should version and format be in URLs or headers? There is a strong school of thought about putting format and version in the header. Sometimes people are forced to put the version in the header because they have multiple inter-dependent APIs. That is often a symptom of a bigger problem, namely, they are usually exposing their internal mess instead of creating one, usable API facade on top. That's not to say that putting the version in the header is a symptom of a problematic API design. It's not! In fact, using headers is more correct for many reasons: it leverages existing HTTP standards, it's intellectually consistent with Fielding's vision, it solves some hard real-world problems related to inter-dependent APIs, and more. However, we think the reason most of the popular APIs do not use it is because it's less fun to hack in a browser. ### Simple rules we follow: 1. If it changes the logic you write to handle the response, put it in the URL so you can see it easily. 2. If it doesn't change the logic for each response, like OAuth information, put it in the header. 
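As a rough illustration of the versioning guidance on this page (a mandatory version, a `v` prefix at the far left of the URL, simple ordinal numbers), the following Python sketch splits a request path into its version and the remaining resource path. The function name `split_version` and the choice to raise `ValueError` are assumptions made for this example; they do not come from the book.

```python
import re
from typing import Tuple

# Hypothetical helper, not from the book: accepts "/v1/dogs" style paths only.
# The pattern requires "/v" plus a positive ordinal, so "/v1.2/dogs" and
# unversioned paths such as "/dogs" are rejected.
_VERSION_PREFIX = re.compile(r"^/v([1-9][0-9]*)(/.*)?$")


def split_version(path: str) -> Tuple[int, str]:
    """Return (version, remaining path) or raise ValueError if no version."""
    match = _VERSION_PREFIX.match(path)
    if match is None:
        # The version is mandatory, so a missing or dotted version is an error.
        raise ValueError("missing or malformed version prefix: " + path)
    version = int(match.group(1))
    rest = match.group(2) or "/"
    return version, rest
```

A caller could route on the returned version number and map the `ValueError` to a 400 response, which keeps the "version is mandatory" rule in one place.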
``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 15 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love These, for example, all represent the same resource: ``` GET /dogs/1 Content-Type: application/json ``` ``` GET /dogs/1 Content-Type: application/xml ``` ``` GET /dogs/1 Content-Type: application/png ``` The code we would write to handle the responses would be very different. There’s no question the header is more correct, and it is still a very strong API design. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 16 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## Pagination and Partial Response Partial response allows you to give developers just the information they need. Take for example a request for a tweet on the Twitter API. You’ll get much more than a typical Twitter app often needs— including the name of the person, the text of the tweet, a timestamp, how often the message was re-tweeted, and a lot of metadata. Let’s look at how several leading APIs handle giving developers just what they need in responses, including Google who pioneered the idea of **partial response**. ### LinkedIn ``` /people:(id,first-name,last-name,industry) ``` This request on a person returns the ID, first name, last name, and the industry. LinkedIn does partial selection using this terse `:(...)` syntax which isn’t self-evident. Plus, it’s difficult for a developer to reverse engineer the meaning using a search engine. ### Facebook ``` /joe.smith/friends?fields=id,name,picture ``` ### Google ``` ?fields=title,media:group(media:thumbnail) ``` Google and Facebook have a similar approach, which works well. They each have an optional parameter called `fields` after which you put the names of fields you want to be returned. 
As you see in this example, you can also put sub-objects in responses to pull in other information from additional resources. ## Add Optional Fields in a Comma-Delimited List The Google approach works extremely well. Here’s how to get just the information we need from our dogs API using this approach: ``` /dogs?fields=name,color,location ``` It’s simple to read; a developer can select just the information an app needs at a given time; it cuts down on bandwidth issues, which is important for mobile apps. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 17 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love The partial selection syntax can also be used to include associated resources, cutting down on the number of requests needed to get the required information. ## Make it easy for developers to paginate objects in a database It's almost always a bad idea to return every resource in a database. Let's look at how Facebook, Twitter, and LinkedIn handle pagination. Facebook uses `offset` and `limit`. Twitter uses `page` and `rpp` (records per page). LinkedIn uses `start` and `count`. Semantically, Facebook and LinkedIn do the same thing. That is, the LinkedIn `start` and `count` are used in the same way as the Facebook `offset` and `limit`. To get records 50 through 75 from each system, you would use: - **Facebook** - `offset 50` and `limit 25` - **Twitter** - `page 3` and `rpp 25` (records per page) - **LinkedIn** - `start 50` and `count 25` ## Use limit and offset We recommend `limit` and `offset`. It is more common, well understood in leading databases, and easy for developers. ``` /dogs?limit=25&offset=50 ``` ## Metadata We also suggest including metadata with each response that is paginated that indicates to the developer the total number of records available. ## What about defaults? My loose rule of thumb for default pagination is `limit=10` with `offset=0`. 
``` (limit=10&offset=0) ``` The pagination defaults are, of course, dependent on your data size. If your resources are large, you probably want to limit it to fewer than 10; if resources are small, it can make sense to choose a larger limit. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 18 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## In summary: - Support **partial response** by adding optional fields in a comma-delimited list. - Use **limit** and **offset** to make it easy for developers to paginate objects. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 19 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## What about responses that don’t involve resources? API calls that send a response that's not a resource *per se* are not uncommon depending on the domain. We’ve seen it in financial services, telco, and the automotive domain to some extent. Actions like the following are your clue that you might not be dealing with a "resource" response. - Calculate - Translate - Convert For example, you want to make a simple algorithmic calculation like how much tax someone should pay, or do a natural language translation (one language in request; another in response), or convert one currency to another. None involve resources returned from a database. ### In these cases: #### Use verbs, not nouns For example, an API to convert 100 euros to Chinese yuan: ``` /convert?from=EUR&to=CNY&amount=100 ``` Make it clear in your API documentation that these “non-resource” scenarios are different. Simply separate out a section of documentation that makes it clear that you use verbs in cases like this – where some action is taken to generate or calculate the response, rather than returning a resource directly.
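As a rough sketch of how a verb-style, non-resource request such as `/convert?from=EUR&to=CNY&amount=100` might be handled, the snippet below parses the query string and computes a result. The function name `handle_convert`, the response shape, and the rate table are assumptions made for illustration; the rate is a placeholder, not real market data.

```python
from decimal import Decimal
from urllib.parse import parse_qs, urlsplit

# Placeholder rate for illustration only; not real market data.
EXAMPLE_RATES = {("EUR", "CNY"): Decimal("8")}


def handle_convert(url, rates=EXAMPLE_RATES):
    """Handle a verb-style request like /convert?from=EUR&to=CNY&amount=100."""
    parts = urlsplit(url)
    if parts.path != "/convert":
        raise ValueError("not a convert request")
    query = parse_qs(parts.query)
    source = query["from"][0]
    target = query["to"][0]
    amount = Decimal(query["amount"][0])
    # The response is calculated on the fly; nothing is read from a database.
    result = amount * rates[(source, target)]
    return {"from": source, "to": target, "amount": str(amount), "result": str(result)}
```

Calling `handle_convert("/convert?from=EUR&to=CNY&amount=100")` returns a computed value rather than a stored resource, which is the distinction the text asks you to document separately.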
``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 20 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## Supporting multiple formats We recommend that you support more than one format - that you push things out in one format and accept as many formats as necessary. You can usually automate the mapping from format to format. Here's what the syntax looks like for a few key APIs. ### Google Data ``` ?alt=json ``` ### Foursquare ``` /venue.json ``` ### Digg* ``` Accept: application/json ?type=json ``` *The type argument, if present, overrides the `Accept` header. Digg lets you specify the format in two ways: in a pure RESTful way in the `Accept` header or in the type parameter in the URL. This can be confusing - at the very least you need to document what to do if there are conflicts. We recommend the Foursquare approach. To get the JSON format from a collection or specific element: ``` /dogs.json /dogs/1234.json ``` Developers and even casual users of any file system are familiar with this dot notation. It also requires just one additional character (the period) to get the point across. ## What about default formats? In my opinion, JSON is winning out as the default format. JSON is the closest thing we have to a universal language. Even if the back end is built in Ruby on Rails, PHP, Java, Python, etc., most projects probably touch JavaScript for the front end. It also has the advantage of being terse - less verbose than XML. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 21 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## What about attribute names? In the previous section, we talked about formats - supporting multiple formats and working with JSON as the default. This time, let's talk about what happens when a response comes back. You have an object with data attributes on it. How should you name the attributes?
Here are API responses from a few leading APIs: - **Twitter** ```json "created_at": "Thu Nov 03 05:19:38 +0000 2011" ``` - **Bing** ```json "DateTime": "2011-10-29T09:35:00Z" ``` - **Foursquare** ```json "createdAt": 1320296464 ``` They each use a different code convention. Although the Twitter approach is familiar to me as a Ruby on Rails developer, we think that Foursquare has the best approach. How does the API response get back in the code? You parse the response (JSON parser); what comes back populates the object. It looks like this: ```javascript var myObject = JSON.parse(response); ``` If you chose the Twitter or Bing approach, your code looks like this. It's not JavaScript convention and looks weird - it looks like the name of another object or class in the system, which is not correct. ```javascript timing = myObject.created_at; timing = myObject.DateTime; ``` ## Recommendations - Use JSON as default - Follow JavaScript conventions for naming attributes - Use medial capitalization (aka CamelCase) - Use uppercase or lowercase depending on the type of object ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 22 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love This results in code that looks like the following, allowing the JavaScript developer to write it in a way that makes sense for JavaScript. ```javascript "createdAt": 1320296464; timing = myObject.createdAt; ``` ## Tips for Search While a simple search could be modeled as a resourceful API (for example, `/dogs/?q=red`), a more complex search across multiple resources requires a different design. This will sound familiar if you've read the topic about using verbs not nouns when results don't return a resource from the database - rather the result is some action or calculation. 
If you want to do a global search across resources, we suggest you follow the Google model: ### Global Search `/search?q=fluffy+fur` Here, `search` is the verb; `q` represents the query. ### Scoped Search To add scope to your search, you can prepend with the scope of the search. For example, search in dogs owned by resource ID 5678: `/owners/5678/dogs?q=fluffy+fur` Notice that we've dropped the explicit search in the URL and rely on the parameter `q` to indicate the scoped query. (Big thanks to the contributors on the [API Craft](https://apicraft.com) Google group for helping refine this approach.) ### Formatted Results For search or for any of the action-oriented (non-resource) responses, you can prepend with the format as follows: `/search.xml?q=fluffy+fur` ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 23 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## Consolidate API requests in one subdomain We’ve talked about things that come after the top-level domain. This time, let’s explore stuff on the other side of the URL. Here’s how Facebook, Foursquare, and Twitter handle this: Facebook provides two APIs. They started with `api.facebook.com`, then modified it to orient around the social graph `graph.facebook.com`. - `graph.facebook.com` - `api.facebook.com` Foursquare has one API. - `api.foursquare.com` Twitter has three APIs; two of them focused on search and streaming. - `stream.twitter.com` - `api.twitter.com` - `search.twitter.com` It’s easy to understand how Facebook and Twitter ended up with more than one API. It has a lot to do with timing and acquisition, and it’s easy to reconfigure a CNAME entry in your DNS to point requests to different clusters. But if we’re making design decisions about what’s in the best interest of the app developer, we recommend following Foursquare’s lead: ### Consolidate all API requests under one API subdomain. 
It’s cleaner, easier, and more intuitive for developers who want to build cool apps using your API. Facebook, Foursquare, and Twitter also all have dedicated developer portals. - `developers.facebook.com` - `developers.foursquare.com` - `dev.twitter.com` ## How to organize all of this? Your API gateway should be the top-level domain. For example, `api.teachdogrest.com`. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 24 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love In keeping with the spirit of REST, your developer portal should follow this pattern: `developers.yourtopleveldomain`. For example, `developers.teachdogrest.com` ## Do Web redirects Then optionally, if you can sense from requests coming in from the browser where the developer really needs to go, you can redirect. Say a developer types `api.teachdogrest.com` in the browser but there's no other information for the GET request, you can probably safely redirect to your developer portal and help get the developer where they really need to be. - `api` → developers (if from browser) - `dev` → developers - `developer` → developers ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 25 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## Tips for handling exceptional behavior So far, we've dealt with baseline, standard behaviors. Here we'll explore some of the exceptions that can happen - when clients of Web APIs can't handle all the things we've discussed. For example, sometimes clients intercept HTTP error codes, or support limited HTTP methods. What are ways to handle these situations and work within the limitations of a specific client? 
### When a client intercepts HTTP error codes One common thing in some versions of Adobe Flash - if you send an HTTP response that is anything other than HTTP 200 OK, the Flash container intercepts that response and puts the error code in front of the end user of the app. Therefore, the app developer doesn't have an opportunity to intercept the error code. App developers need the API to support this in some way. Twitter does an excellent job of handling this. They have an optional parameter `suppress_response_codes`. If `suppress_response_codes` is set to true, the HTTP response is always 200. ``` /public_timelines.json?suppress_response_codes=true HTTP status code: 200 {"error":"Could not authenticate you."} ``` Notice that this parameter is a big verbose response code. (They could have used something like `src`, but they opted to be verbose.) This is important because when you look at the URL, you need to see that the response codes are being suppressed as it has huge implications about how apps are going to respond to it. ### Overall recommendations: 1. Use `suppress_response_codes = true` 2. The HTTP code is no longer just for the code The rules from our previous Handling Errors section change. In this context, the HTTP code is no longer just for the code - the program - it's now to be ignored. Client apps are never going to be checking the HTTP status code, as it is always the same. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 26 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## 3 - Push any response code that we would have put in the HTTP response down into the response message In my example below, the response code is 401. You can see it in the response message. Also include additional error codes and verbose information in that message. 
### Always return OK ```plaintext /dogs?suppress_response_codes=true ``` ### Code for ignoring ```plaintext 200 - OK ``` ### Message for people & code ```json { "response_code": 401, "message": "Verbose, plain language description of the problem with hints about how to fix it.", "more_info": "http://dev.teachdogrest.com/errors/12345", "code": 12345 } ``` ## When a client supports limited HTTP methods It is common to see support for GET and POST and not PUT and DELETE. To maintain the integrity of the four HTTP methods, we suggest you use the following methodology commonly used by Ruby on Rails developers: ### Make the method an optional parameter in the URL Then the HTTP verb is always a GET but the developer can express rich HTTP verbs and still maintain a RESTful clean API. #### Create ```plaintext /dogs?method=post ``` #### Read ```plaintext /dogs ``` #### Update ```plaintext /dogs/1234?method=put&location=park ``` #### Delete ```plaintext /dogs/1234?method=delete ``` > **WARNING:** It can be dangerous to provide post or delete capabilities using a GET method because if the URL is in a Web page then a Web crawler like the Googlebot can create or destroy lots of content inadvertently. Be sure you understand the implications of supporting this approach for your applications' context. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 27 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## Authentication There are many schools of thought. My colleagues at Apigee and I don't always agree on how to handle authentication - but overall here's my take. Let's look at these three top services. See how each of these services handles things differently: ### PayPal - Permissions Service API ### Facebook - OAuth 2.0 ### Twitter - OAuth 1.0a Note that PayPal's proprietary three-legged permissions API was in place long before OAuth was conceived. ### What should you do?
Use the latest and greatest OAuth - OAuth 2.0 (as of this writing). It means that Web or mobile apps that expose APIs don’t have to share passwords. It allows the API provider to revoke tokens for an individual user, for an entire app, without requiring the user to change their original password. This is critical if a mobile device is compromised or if a rogue app is discovered. Above all, OAuth 2.0 will mean improved security and better end-user and consumer experiences with Web and mobile apps. Don't do something *like* OAuth, but different. It will be frustrating for app developers if they can’t use an OAuth library in their language because of your variation. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 28 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## Making requests on your API Let's take a look at what some API requests and responses look like for our dogs API. ### Create a brown dog named Al **POST** `/dogs` `name=Al&furColor=brown` **Response** ``` { "dog": { "id": "1234", "name": "Al", "furColor": "brown" } } ``` ### Rename Al to Rover - Update **PUT** `/dogs/1234` `name=Rover` **Response** ``` { "dog": { "id": "1234", "name": "Rover", "furColor": "brown" } } ``` ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 29 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## Tell me about a particular dog **GET /dogs/1234** **Response** 200 OK ```json { "dog": { "id": "1234", "name": "Rover", "furColor": "brown" } } ``` ## Tell me about all the dogs **GET /dogs** **Response** 200 OK ```json { "dogs": [ { "dog": { "id": "1233", "name": "Fido", "furColor": "white" } }, { "dog": { "id": "1234", "name": "Rover", "furColor": "brown" } } ], "_metadata": [ { "totalCount": 327, "limit": 25, "offset": 100 } ] } ``` ## Delete Rover :( **DELETE /dogs/1234** **Response** 200 OK ``` #################### File: 
Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 30 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## Chatty APIs Let’s think about how app developers use that API you’re designing and dealing with chatty APIs. ### Imagine how developers will use your API When designing your API and resources, try to imagine how developers will use it to say construct a user interface, an iPhone app, or many other apps. Some API designs become very chatty—meaning just to build a simple UI or app, you have dozens or hundreds of API calls back to the server. The API team can sometimes decide not to deal with creating a nice, resource-oriented RESTful API, and just fall back to a mode where they create the 3 or 4 Java-style getter and setter methods they know they need to power a particular user interface. We don’t recommend this. You can design a RESTful API and still mitigate the chattiness. ### Be complete and RESTful and provide shortcuts First design your API and its resources according to [pragmatic RESTful design principles](https://www.example.com) and then provide shortcuts. What kind of shortcut? Say you know that 80% of all your apps are going to need some sort of composite response, then build the kind of request that gives them what they need. Just don’t do the latter instead of the former. First design using good pragmatic RESTful principles! ### Take advantage of the partial response syntax The partial response syntax discussed in a previous section can help. To avoid creating one-off base URLs, you can use the partial response syntax to drill down to dependent and associated resources. In the case of our dogs API, the dogs have association with owners, who in turn have associations with veterinarians, and so on. Keep nesting the partial response syntax using dot notation to get back just the information you need. 
``` /owners/5678?fields=name,dogs.name ``` ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 31 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## Complement with an SDK It’s a common question for API providers - do you need to complement your API with code libraries and software development kits (SDKs)? If your API follows good design practices, is self-consistent, standards-based, and well documented, developers may be able to get rolling without a client SDK. Well-documented code samples are also a critical resource. On the other hand, what about the scenario in which building a UI requires a lot of domain knowledge? This can be a challenging problem for developers even when building UI and apps on top of APIs with pretty simple domains - think about the Twitter API with its primary object of 140 characters of text. You shouldn't change your API to try to overcome the domain knowledge hurdle. Instead, you can complement your API with code libraries and a software development kit (SDK). In this way, you don’t overburden your API design. Often, a lot of what’s needed is on the client side and you can push that burden to an SDK. The SDK can provide the platform-specific code, which developers use in their apps to invoke API operations - meaning you keep your API clean. Other reasons you might consider complementing your API with an SDK include: - **Speed adoption on a specific platform.** (For example, Objective C SDK for iPhone.) Many experienced developers are just starting off with Objective C so an SDK might be helpful. - **Simplify integration effort required to work with your API.** If key use cases are complex or need to be complemented by standard on-client processing. - **An SDK can help reduce bad or inefficient code** that might slow down service for everyone. As a developer resource - good SDKs are a forcing function to create good source code examples and documentation. Yahoo! 
and Paypal are good examples: - Yahoo! [http://developer.yahoo.com/social/sdk/](http://developer.yahoo.com/social/sdk/) - Paypal [https://cms.paypal.com/us/cgi-bin/?cmd=_render-content&content_ID=developer/library_download_sdks](https://cms.paypal.com/us/cgi-bin/?cmd=_render-content&content_ID=developer/library_download_sdks) To market your API to a specific community - you upload the SDK to a samples or plug-in page on a platform’s existing developer community. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 32 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## The API Façade Pattern At this point, you may be asking: **What should we be thinking from an architectural perspective?** **How do we follow all these best practice guidelines, expose my internal services and systems in a way that’s useful to app developers, and still iterate and maintain my API?** Back-end systems of record are often too complex to expose directly to application developers. They are stable (have been hardened over time) and dependable (they are running key aspects of your business), but they are often based on legacy technologies and not always easy to expose to Web standards like HTTP. These systems can also have complex interdependencies and they change slowly meaning that they can’t move as quickly as the needs of mobile app developers and keep up with changing formats. In fact, the problem is not creating an API for just one big system but creating an API for an array of complementary systems that all need to be used to make an API valuable to a developer. | Big System | DB | Content Mgmt | SOAP | JDBC | RSS | |------------|----------|---------------|------|------|------| | | | | | | | It’s useful to talk through a few anti-patterns that we’ve seen. Let’s look at why we believe they don’t work well. 
## The Build Up Approach In the build-up approach, a developer exposes the core objects of a big system and puts an XML parsing layer on top. ``` XML Expose Objects ``` ``` Big System ``` ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 33 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love This approach has merit in that it can get you to market with version 1 quickly. Also, your API team members (your internal developers) already understand the details of the system. Unfortunately, those details of an internal system at the object level are fine-grained and can be confusing to external developers. You’re also exposing details of internal architecture, which is rarely a good idea. This approach can be inflexible because you have 1:1 mapping to how a system works and how it is exposed to the API. In short, building up from the systems of record to the API can be overly complicated. ## The Standards Committee Approach Often the internal systems are owned and managed by different people and departments with different views about how things should work. Designing an API by a standards committee often involves creating a standards document, which defines the schema and URLs and such. All the stakeholders build toward that common goal. ``` | Standards Doc | |-------------------| | **Expose** | **Expose** | **Expose RSS** | | `XML` | `XML` | `XML` | | | | | | **Big System** | **DB** | **Content Mgmt** | ``` The benefits of this approach include getting to version 1 quickly. You can also create a sense of unification across an organization and a comprehensive strategy, which can be significant accomplishments when you have a large organization with a number of stakeholders and contributors. A drawback of the standards committee pattern is that it can be slow. Even if you get the document created quickly, getting everybody to implement against it can be slow and can lack adherence. 
This approach can also lead to a mediocre design as a result of too many compromises. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 34 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## The Copy Cat Approach We sometimes see this pattern when an organization is late to market – for example, when their close competitor has already delivered a solution. Again, this approach can get you to version 1 quickly and you may have a built-in adoption curve if the app developers who will use your API are already familiar with your competitor's API. ![Competitor's API Doc](#) | Format | Service | System | |--------|----------------|------------| | XML | Expose | Big System | | XML | Expose | DB | | XML | Expose RSS | Content Mgmt| However, you can end up with an undifferentiated product that is considered an inferior offering in the market of APIs. You might have missed exposing your own key value and differentiation by just copying someone else's API design. ## Solution – The API façade pattern The best solution starts with thinking about the fundamentals of product management. Your product (your API) needs to be credible, relevant, and differentiated. Your product manager is a key member of your API team. Once your product manager has decided what the big picture is like, it’s up to the architects. We recommend you implement an **API façade pattern**. This pattern gives you a buffer or virtual layer between the interface on top and the API implementation on the bottom. You essentially create a façade – a comprehensive view of what the API should be and, importantly, from the perspective of the app developer and end user of the apps they create. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 35 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love The developer and the app that consume the API are on top. 
The API façade isolates the developer and the application and the API. Making a clean design in the façade allows you to decompose one really hard problem into a few simpler problems. > “Use the façade pattern when you want to provide a simple interface to a complex subsystem. Subsystems often get more complex as they evolve.” > — Design Patterns – Elements of Reusable Object-Oriented Software > (Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides) Implementing an API façade pattern involves three basic steps: 1. **Design the ideal API** – design the URLs, request parameters and responses, payloads, headers, query parameters, and so on. The API design should be self-consistent. 2. **Implement the design with data stubs**. This allows application developers to use your API and give you feedback even before your API is connected to internal systems. 3. **Mediate or integrate between the façade and the systems.** ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 36 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ![Web API Design Icon](link-to-icon) ## Ideal Design ![Ideal Design Diagram](link-to-ideal-design-diagram) ## API Facade ![API Facade Diagram](link-to-api-facade-diagram) ## Mediate | Big System | DB | Content Mgmt | SOAP | JDBC | RSS | |------------|----|---------------|------|------|-----| Using the three-step approach, you've decomposed one big problem into three smaller problems. If you try to solve the one big problem, you’ll be starting in code and trying to build up from your business logic (systems of record) to a clean API interface. You would be exposing objects or tables or RSS feeds from each silo, mapping each to XML in the right format before exposing to the app. It is a machine-to-machine orientation focused around an app and is difficult to get this right. 
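The second of the three façade steps listed earlier, implementing the design with data stubs, might look roughly like the sketch below. It is an illustration under assumptions: `DogsFacade`, `StubDogBackend`, and `STUB_DOGS` are hypothetical names, and the `_metadata` shape is simplified to a single object. The façade applies `limit`, `offset`, and `fields` uniformly, so the stub backend can later be swapped for a mediation layer without changing the interface.

```python
# Stub data standing in for the internal systems of record (hypothetical).
STUB_DOGS = [
    {"id": "1233", "name": "Fido", "furColor": "white"},
    {"id": "1234", "name": "Rover", "furColor": "brown"},
]


class StubDogBackend:
    """Stub backend; a real one would mediate to the internal systems."""

    def list_dogs(self):
        return list(STUB_DOGS)


class DogsFacade:
    """Facade exposing the designed API regardless of what sits behind it."""

    def __init__(self, backend):
        self.backend = backend

    def get_dogs(self, limit=10, offset=0, fields=None):
        dogs = self.backend.list_dogs()
        # Pagination is handled once, in the facade.
        page = dogs[offset:offset + limit]
        if fields:
            # Partial response: keep only the requested attributes.
            page = [{key: dog[key] for key in fields if key in dog} for dog in page]
        return {
            "dogs": page,
            "_metadata": {"totalCount": len(dogs), "limit": limit, "offset": offset},
        }
```

Because callers only see `DogsFacade.get_dogs`, application developers can give feedback on the design before any real system is connected.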
Taking the façade pattern approach helps shift the thinking from a silo approach in a number of important ways. First, you can get buy-in around each of the three separate steps and have people more clearly understand how you’re taking a pragmatic approach to the design. Secondly, the orientation shifts from the app to the app developer. The goal becomes to ensure that the app developer can use your API because the design is self-consistent and intuitive. Because of where it is in the architecture, the façade becomes an interesting gateway. You can now have the façade implement the handling of common patterns (for pagination, queries, ordering, sorting, etc.), authentication, authorization, versioning, and so on, uniformly across the API. *(This is a big topic and a full discussion is beyond the scope of this article.)* Other benefits for the API team include being more easily able to adapt to different use cases regardless of whether they are internal developer, partner, or open scenarios. The API team will be able to keep pace with the changing needs of developers, including the ever-changing protocols and languages. It is also easier to extend an API by building out more capability from your enterprise or plugging in additional existing systems. ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 37 Context: ```markdown # Web API Design - Crafting Interfaces that Developers Love ## Resources - Representational State Transfer (REST), Roy Thomas Fielding, 2000 - [RESTful API Design Webinar](https://example.com), 2nd edition, Brian Mulloy, 2011 - [Apigee API Tech & Best Practices Blog](https://example.com) - [API Craft](https://example.com) Google Group ## About Brian Mulloy **Brian Mulloy, Apigee** Brian has 15 years of experience ranging from enterprise software to founding a Web startup. He co-founded and was CEO of Swivel, a website for social data analysis. 
He was President and General Manager of Grand Central, a cloud-based offering for application infrastructure (before we called it the cloud). He was also Director of Product Marketing at BEA Systems. Brian holds a degree in Physics from the University of Michigan. Brian is a frequent contributor on the [Apigee API Tech & Best Practices Blog](https://example.com), the [Apigee YouTube channel](https://example.com), the [API Craft](https://example.com) Google Group, and [Webinars](https://example.com). ``` #################### File: Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf Page: 38 Context: ```markdown # Apigee Apigee is the leading provider of API products and technology for enterprises and developers. Hundreds of enterprises like Comcast, GameSpy, TransUnion Interactive, Guardian Life, and Constant Contact, as well as thousands of developers, use Apigee's technology. Enterprises use Apigee for visibility, control, and scale of their API strategies. Developers use Apigee to learn, explore, and develop API-based applications. Learn more at [apigee.com](http://apigee.com). ## Accelerate your API Strategy ## Scale Control and Secure your Enterprise ## Developers – Consoles for the APIs you ❤️ ``` #################### File: Project%20Oberon%20The%20Design%20of%20an%20Operating%20System%2C%20a%20Compiler%2C%20and%20a%20Computer%20-%20Niklaus%20Wirth%2C%20J%C3%BCrg%20Gutknecht%20%28PDF%29%20PO.Applications.pdf Page: 1 Context: ```markdown # 10. The Network ## 10.1 Introduction Workstations are typically, but not always, connected in a local environment by a network. There exist two basically different views of the architecture of such nets. The more demanding view is that all connected stations constitute a single, unified workspace (also called address-space), in which the individual processors operate. It implies the demand that the “thin” connections between processors are hidden from the users. 
At worst they might become apparent through slower data access rates between the machines. To hide the difference between access within a computer and access between computers is regarded primarily as a challenge to implementors. The second, more conservative view, assumes that individual workstations are, although connected, essentially autonomous units which exchange data infrequently. Therefore, access of data on partner stations is initiated by explicit transfer commands. Commands handling external access are not part of the basic system, but rather are implemented in modules that might be regarded as applications. In the Oberon System, we adhere to this second view, and in this chapter, we describe the module **Net**, which is an autonomous command module based on the network driver SCC. It can be activated on any station connected in a network, and all of them are treated as equals. Such a set of loosely coupled stations may well operate in networks with moderate transmission rates and therefore with low-cost hardware interfaces and twisted-pair wires. An obvious choice for the unit of transferred data is the file. The central theme of this chapter is therefore file transfer over a network. Some additional facilities offered by a dedicated server station will be the subject of Chapter 11. The commands to be presented here are a few only: `SendFiles`, `ReceiveFiles`, and `SendMsg`. As explained in Chapter 2, Oberon is a single-process system where every command monopolizes the processor until termination. When a command involves communication over a network, (at least) two processors are engaged in the action at the same time. The Oberon paradigm therefore appears to exclude such cooperation; but fortunately it does not, and the solution to the problem is quite simple. Every command is initiated by a user operating on a workstation. For the moment we will call it the master (of the command under consideration). 
The addressed station - obviously called the server - must be in a state where it recognizes the command in order to engage in its execution. Since the command - called a **request** - arrives in encoded form over the network, an Oberon task represented by a handler procedure must be inserted into the event polling loop of the system. Such a handler must have the general form: ``` IF event present THEN handle event END ``` The guard, in this case, must imply that a request was received from the network. We emphasize that the event is sensed by the server only after the command currently under execution, if any, has terminated. However, data arrive at the receiver immediately after they are sent by the master. Hence, any sizable delay is inherently inadmissible, and the Oberon metaphor once again appears to fail. It does not fail, however, because the unavoidable, genuine concurrency of sender and receiver action is handled within the driver module which places the data into a buffer. The driver is activated by an interrupt, and its receiver buffer effectively decouples the partners and removes the stringent timing constraints. All this remains completely hidden within the driver module. ## 10.2 The Protocol If more than a single agent participates in the execution of a command, a convention must be established and obeyed. It defines the set of requests, their encoding, and the sequence of data. ``` #################### File: Project%20Oberon%20The%20Design%20of%20an%20Operating%20System%2C%20a%20Compiler%2C%20and%20a%20Computer%20-%20Niklaus%20Wirth%2C%20J%C3%BCrg%20Gutknecht%20%28PDF%29%20PO.Applications.pdf Page: 2 Context: ```markdown # 10.3 Station Addressing Every packet must carry a destination address as well as the sender's address. Addresses are station numbers. It could certainly be inconvenient for a user to remember the station number of a desired partner. Instead, the use of symbolic names is preferred.
We have become accustomed to use the partner's initials for this purpose. ## Protocols A simple form of the `ReceiveFile` request is defined as follows and will be refined subsequently: ``` ReceiveFile = SND filename (ACK data | NAK). ``` Here, the symbol `SND` represents the encoded request that the server send the file specified by the filename. `ACK` signals that the request is honored and the requested data follow. The `NAK` symbol indicates that the requested file cannot be delivered. The transaction clearly consists of two parts, the request and the reply, one from each partner. This simple-minded scheme fails because of the limitation of the size of each transmitted portion imposed by the network driver. We recall that module `SCC` restricts the data of each packet to 512 bytes. Evidently, files must be broken up and transmitted as a sequence of packets. The reason for this restriction is transmission reliability. The break-up allows the partner to confirm correct receipt of a packet by returning a short acknowledgement. Each acknowledgement also serves as a request for the next packet. An exception is the last acknowledgement following the last data portion, which is characterized by its length being less than the admitted maximum. The revised protocol is defined as follows: ``` ReceiveFile = SND filename (DAT data ACK {DAT data ACK} | NAK). ``` We now recall that each packet as defined in Section 9.3 is characterized by a type in its header. The symbols `SND`, `DAT`, `ACK`, and `NAK` indicate this packet type. The data portions of `ACK` and `NAK` packets are empty. The revised protocol fails to cope with transmission errors. Correct transmission is checked by the driver through a cyclic redundancy check (CRC), and an erroneous packet is simply discarded. This implies that a receiver must impose a timing constraint. If an expected packet fails to arrive within a given time period (timeout), the request must be repeated.
In our case, a request is implied by an acknowledgement. Hence, the acknowledgement must specify whether the next (normal case) or the previously requested (error case) packet must be sent. The solution is to attach a sequence number to each acknowledgement and to each data packet. These numbers are taken modulo 8, although in principle modulo 2 would suffice. With the addition of a user identification and a password to every request, and of an alternate reply code `NPR` for "no permission", the protocol reaches its final form: ``` ReceiveFile = SND username password filename (datastream | NAK | NPR). datastream = DAT data ACK {DAT data ACK}. ``` The protocol for file transmission from the master to the server is defined similarly: ``` SendFile = REC username password filename (ACKO datastream | NAK | NPR). datastream = DAT data ACK {DAT data ACK}. ``` The third request listed above, `SendMsg`, does not refer to any file, but merely transmits and displays a short message. It is included here for testing the link between two partners and perhaps for visibly acknowledging a rendered service by the message "done", or "thank you". ``` SendMsg = MSG message ACK. ``` ``` #################### File: Project%20Oberon%20The%20Design%20of%20an%20Operating%20System%2C%20a%20Compiler%2C%20and%20a%20Computer%20-%20Niklaus%20Wirth%2C%20J%C3%BCrg%20Gutknecht%20%28PDF%29%20PO.Applications.pdf Page: 3 Context: ```markdown The source address is inserted automatically into packet headers by the driver. It is obtained from a DIP switch set when a computer is installed and connected. But where should the destination address come from? From the start, we reject the solution of an address table in every workstation because of the potential inconsistencies. The concept of a centralized authority holding a name/address dictionary is equally unattractive, because of the updates required whenever a person uses a different computer.
Also, we have started from the premise to keep all participants in the network equal. The most attractive solution lies in a decentralized name service. It is based on the broadcast facility, i.e., the possibility to send a packet to all connected stations, bypassing their address filters with a special destination address (-1). The broadcast is used for emitting a name request containing the desired partner's symbolic name. A station receiving the request returns a reply to the requester, if that name matches its own symbolic name. The requester then obtains the desired partner's address from the source address field of the received reply. The corresponding simple protocol is: ``` NameRequest = NRQ partnername [NRS]. ``` Here, the already mentioned timeout facility is indispensable. The following summarizes the protocol developed so far: ``` protocol = {request}. request = ReceiveFile | SendFile | SendMsg | NameRequest. ``` The overhead incurred by name requests may be reduced by using a local address dictionary. In practice, a single entry is satisfactory. A name request is then needed whenever the partner changes. ## 10.4 The implementation Module `Net` is an implementation of the facilities outlined above. The program starts with a number of auxiliary, local procedures. They are followed by procedure `Serve` which is to be installed as an Oberon task, and the commands `SendFiles`, `ReceiveFiles`, and `SendMsg`, each of which has its counterpart within procedure `Serve`. At the end are the commands for starting and stopping the server facility. For a more detailed presentation we select procedure `ReceiveFiles`. It starts out by reading the first parameter which designates the partner station from the command line. Procedure `FindPartner` issues the name request, unless the partner's address has already been determined by a previous command.
The global variable `partner` records a symbolic name (ID) whose address is stored in the destination field of the global variable `head0`, which is used as a header in every packet sent by procedure `SCC.SendPacket`. The variable `partner` may be regarded as a name cache with a single entry and with the purpose of reducing the number of issued name requests. If the partner has been identified, the next parameter is read from the command line. It is the name of the file to be transmitted. If the parameter has the form `name0.name`, the file stored on the server as `name0.name` is fetched and stored locally as `name`. Hence, `name0` serves as a prefix of the file name on the server station. Thereafter, the request parameters are concatenated in the local buffer variable `buf`. They are the user’s name and password followed by the file name (user name and password remain unused by the server procedure defined here). The command package is dispatched by the call `Send(SEND, buf)`, where `buf` records the length of the command parameter string. Then the reply packet is awaited by calling `ReceiveHead`. If the received packet's type is DAT with sequence number 0, a new file is established. Procedure `ReceiveData` receives the data and stores them in the new file, obeying the protocol defined in Section 10.2. This process is repeated for each file specified in the list of file names in the command line. Procedure `ReceiveHead(T)` receives packets and discards them until one arrives from the partner from which it is expected. The procedure represents an input filter in addition to the one provided by the hardware.
``` #################### File: Project%20Oberon%20The%20Design%20of%20an%20Operating%20System%2C%20a%20Compiler%2C%20and%20a%20Computer%20-%20Niklaus%20Wirth%2C%20J%C3%BCrg%20Gutknecht%20%28PDF%29%20PO.Applications.pdf Page: 4 Context: ```markdown `ReceiveHead` discriminates based on the packets’ source address, whereas the hardware filter distinguishes on the basis of the destination address. If no packet arrives within the allotted time `T`, a type code `-1` is returned, signifying a timeout. ## Procedure `ReceiveData` `ReceiveData` checks the sequence numbers of incoming data packets (type 0 - 7). If an incorrect number is detected, an ACK packet with the previous sequence number is returned (type 16 - 23), requesting a retransmission. At most two retries are undertaken. This seems to suffice considering that the server does not accept any other requests while being engaged in the transmission of a file. The part corresponding to `ReceiveFiles` within procedure `Serve` is guarded by the condition `head.typ = SND`. Variable `head` is the recipient of headers whenever a packet is received by `ReceiveHead`. First, the request’s parameters are scanned. `Id` and `pw` are ignored. Then the requested file is opened. If it exists, the transmission is handled by `ReceiveData`'s counterpart, procedure `SendData`. The time limit for receiving the next data packet is `T1`, whereas the limit of data waiting for retransmission is `T0`. `T1` is roughly `T0` multiplied by the maximum number of possible (re)transmissions. Before disengaging itself from a transaction, the sender of data waits until no further retransmission requests can be expected to arrive. The value `T0 (300)` corresponds to `1s`; the time for transmission of a packet of maximum length is about `16ms`. ## Procedure `SendFiles` Procedure `SendFiles` is designed analogously; its counterpart in the server is guarded by the condition `head.typ = REC`.
The server accepts the request only if its state is unprotected (global variable `protected`). Otherwise, the request is negatively acknowledged with an NPR packet. We draw attention to the fact that procedures `SendData` and `ReceiveData` are both used by command procedures as well as by the server. ``` ########## """QUERY: what is an overview for best practices in web api design?>""" Consider the chat history for relevant information. Use all information included. Use as much tokens as needed. Important: If you find information separated by a | in the context, it is a table formatted in Markdown.Important: Take a look at the QUERY and only the QUERY. If this is vague or unclear, please ignore everything and ask a follow-up question instead! Final Files Sources: The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 1, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 2, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 3, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 4, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 5, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 6, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 7, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 8, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 9, 
The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 10, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 11, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 12, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 13, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 14, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 15, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 16, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 17, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 18, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 19, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 20, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 21, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 22, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 23, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 24, 
The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 25, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 26, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 27, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 28, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 29, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 30, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 31, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 32, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 33, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 34, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 35, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 36, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 37, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 38, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 39, 
The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 40, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 41, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 42, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 43, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 44, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 45, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 46, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 47, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 48, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 49, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 50, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 51, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 52, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 53, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 54, 
The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 55, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 56, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 57, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 58, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 59, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 60, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 61, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 62, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 63, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 64, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 65, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 66, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 67, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 68, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 69, 
The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 70, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 71, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 72, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 73, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 74, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 75, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 76, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 77, The%20Little%20Book%20About%20OS%20Development%20-%20Erik%20Helin%2C%20Adam%20Renberg%20-%20%28PDF%2C%20HTML%29.pdf - Page 78, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 1, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 2, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 3, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 4, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 5, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 6, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 7, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 8, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 9, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 10, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 11, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 12, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - 
Page 13, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 14, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 15, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 16, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 17, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 18, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 19, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 20, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 21, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 22, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 23, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 24, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 25, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 26, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 27, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 28, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 29, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 30, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 31, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 32, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 33, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 34, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 35, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 36, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 37, Web%20API%20Design%20-%20Brian%20Mulloy%20%28PDF%29.pdf - Page 38, Project%20Oberon%20The%20Design%20of%20an%20Operating%20System%2C%20a%20Compiler%2C%20and%20a%20Computer%20-%20Niklaus%20Wirth%2C%20J%C3%BCrg%20Gutknecht%20%28PDF%29%20PO.Applications.pdf - Page 1, 
Project%20Oberon%20The%20Design%20of%20an%20Operating%20System%2C%20a%20Compiler%2C%20and%20a%20Computer%20-%20Niklaus%20Wirth%2C%20J%C3%BCrg%20Gutknecht%20%28PDF%29%20PO.Applications.pdf - Page 2, Project%20Oberon%20The%20Design%20of%20an%20Operating%20System%2C%20a%20Compiler%2C%20and%20a%20Computer%20-%20Niklaus%20Wirth%2C%20J%C3%BCrg%20Gutknecht%20%28PDF%29%20PO.Applications.pdf - Page 3, Project%20Oberon%20The%20Design%20of%20an%20Operating%20System%2C%20a%20Compiler%2C%20and%20a%20Computer%20-%20Niklaus%20Wirth%2C%20J%C3%BCrg%20Gutknecht%20%28PDF%29%20PO.Applications.pdf - Page 4 ================================================== **Elapsed Time: 0.00 seconds** ================================================== FINAL ANSWER Answer: Web API design is about creating interfaces that developers will love to use. This involves applying design principles and best practices that prioritize developer productivity and success. Here's an overview of key best practices for designing Web APIs: **1. Nouns are good; verbs are bad:** - **Keep your base URL simple and intuitive:** Use nouns to represent resources in your API URLs. For example, `/dogs` for a collection of dogs and `/dogs/1234` for a specific dog. - **Keep verbs out of your base URLs:** Avoid using verbs in your base URLs. Instead, use HTTP verbs (GET, POST, PUT, DELETE) to perform actions on the resources. - **Use HTTP verbs to operate on the collections and elements:** Leverage the semantics of HTTP verbs to create a consistent and predictable API. For example, use GET to retrieve data, POST to create data, PUT to update data, and DELETE to delete data. **2. Plural nouns and concrete names:** - **Use plural nouns for resource names:** This makes the API more intuitive, especially when retrieving a collection of resources. For example, use `/dogs` instead of `/dog`. - **Use concrete names instead of abstract ones:** Choose resource names that are meaningful and easy for developers to understand. 
For example, use `/blogs`, `/videos`, and `/news` instead of a generic `/items` or `/assets`. **3. Simplify associations and parameters:** - **Simplify associations between resources:** Keep URLs relatively shallow and avoid complex nested structures. Express a relationship in the path in the form `/resource/identifier/resource`. For example, use `/owners/5678/dogs` to retrieve dogs belonging to owner 5678. - **Sweep complexity behind the '?':** Use query parameters to handle optional states, attributes, and filtering. For example, use `/dogs?color=red&state=running&location=park` to retrieve red dogs running in the park. **4. Handling Errors:** - **Use HTTP status codes:** Utilize standard HTTP status codes to indicate the outcome of API requests. Stick to a small subset of commonly used codes (e.g., 200, 400, 500) to avoid confusion. - **Provide verbose error messages:** Include detailed and human-readable error messages in the response payload. Provide hints for resolving the error and a link to more information. **5. Versioning:** - **Always version your API:** Include a version number in the URL (e.g., `/v1/dogs`) and make it mandatory. - **Use simple ordinal numbers:** Stick to `v1`, `v2`, etc. Avoid using dot notation (e.g., `v1.2`) as it implies a level of granularity that doesn't work well with APIs. - **Maintain at least one version back:** Provide developers with sufficient time to migrate to newer versions. **6. Pagination and Partial Response:** - **Support partial response:** Allow developers to retrieve only the data they need by specifying fields in a comma-delimited list. For example, `/dogs?fields=name,color,location`. - **Use limit and offset for pagination:** Provide parameters for limiting the number of results and specifying the starting offset. For example, `/dogs?limit=25&offset=50`. - **Include metadata:** Provide information about the total number of records and pagination details in the response. **7.
Handling Non-Resource Responses:**
- **Use verbs for non-resource responses:** When an API call performs an action or calculation rather than returning a resource, use a verb in the URL. This is the one exception to the nouns-only rule. For example, `/convert?from=EUR&to=CNY&amount=100`.

**8. Supporting Multiple Formats:**
- **Support multiple formats:** Allow developers to request data in different formats (e.g., JSON, XML).
- **Use dot notation for format specification:** Use `.json`, `.xml`, etc. in the URL to specify the desired format. For example, `/dogs.json`.
- **Use JSON as the default format:** JSON is widely used and generally preferred for its terseness and compatibility with JavaScript.

**9. Attribute Naming:**
- **Follow JavaScript conventions:** Use camelCase for attribute names in JSON responses to align with JavaScript conventions.

**10. Search:**
- **Use a dedicated search endpoint:** For global search, use a dedicated endpoint like `/search?q=query`.
- **Support scoped search:** Allow developers to search within a specific resource or scope. For example, `/owners/5678/dogs?q=query`.

**11. API Subdomain:**
- **Consolidate API requests under one subdomain:** Use a single subdomain (e.g., `api.example.com`) for all API requests.

**12. Handling Exceptional Behavior:**
- **Handle intercepted error codes:** Provide a mechanism to suppress HTTP error codes if clients intercept them. For example, use a parameter like `suppress_response_codes=true`.
- **Support limited HTTP methods:** If clients support only a limited set of HTTP methods, provide a workaround using query parameters. For example, use `method=put` to simulate a PUT request using GET. Treat this as a last resort: GET is expected to be safe, so a crawler or prefetching client that follows such a URL could change data unintentionally.

**13. Authentication:**
- **Use OAuth 2.0:** OAuth 2.0 is the recommended standard for granting and checking access to the API. Strictly speaking, it is an authorization framework, and user authentication is delegated to the authorization server.

**14. Chatty APIs:**
- **Design a RESTful API and provide shortcuts:** Design the API based on RESTful principles and then provide shortcuts or composite responses for common use cases to reduce chattiness.

**15. 
SDKs:**
- **Consider complementing your API with an SDK:** SDKs can simplify integration, speed up adoption on specific platforms, and promote best practices.

**16. API Façade Pattern:**
- **Implement an API façade:** Create a layer of abstraction between the API interface and the underlying systems to simplify API design, promote consistency, and facilitate maintenance.

================================================== **Elapsed Time: 78.04 seconds** ==================================================
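
To make a few of the practices from the answer above more concrete (limit/offset pagination, partial response via `fields`, pagination metadata, and verbose error payloads), here is a minimal Python sketch that uses only the standard library. It is an illustration under stated assumptions, not a reference implementation: the `DOGS` data set, the `list_dogs` and `error_body` helpers, the default `limit` of 10, the error codes, the payload field names, and the `api.example.com` documentation URL are all hypothetical.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical in-memory data standing in for a real data store.
DOGS = [
    {"id": i, "name": f"dog{i}", "color": "red" if i % 2 else "brown", "location": "park"}
    for i in range(1, 101)
]


def error_body(developer_message, user_message, error_code):
    """Build a verbose error payload (field names are illustrative assumptions)."""
    return {
        "developerMessage": developer_message,
        "userMessage": user_message,
        "errorCode": error_code,
        # Hypothetical documentation link for this error code.
        "moreInfo": f"https://api.example.com/docs/errors/{error_code}",
    }


def list_dogs(url):
    """Handle a GET on /v1/dogs and return a (http_status, json_body) pair."""
    query = parse_qs(urlparse(url).query)

    # Pagination parameters; the defaults are assumptions for this sketch.
    try:
        limit = int(query.get("limit", ["10"])[0])
        offset = int(query.get("offset", ["0"])[0])
    except ValueError:
        return 400, error_body(
            "limit and offset must be integers",
            "The request could not be processed.",
            1001,
        )
    if limit < 1 or offset < 0:
        return 400, error_body(
            "limit must be >= 1 and offset must be >= 0",
            "The request could not be processed.",
            1002,
        )

    page = DOGS[offset:offset + limit]

    # Partial response: keep only the requested, comma-delimited fields.
    fields = query.get("fields", [""])[0]
    if fields:
        wanted = fields.split(",")
        page = [{key: dog[key] for key in wanted if key in dog} for dog in page]

    # Metadata tells the client how many records exist and where this page sits.
    metadata = {"totalCount": len(DOGS), "limit": limit, "offset": offset}
    return 200, {"dogs": page, "metadata": metadata}
```

Calling `list_dogs("/v1/dogs?limit=25&offset=50&fields=name,color")` returns status 200 with 25 trimmed records plus the metadata block, while a malformed request such as `/v1/dogs?limit=abc` returns status 400 with the verbose error payload. Keeping the status code and the body together in one return value mirrors the advice to pair a small set of HTTP status codes with a descriptive message.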