So the question is: How does a computer go from binary code representing the letter "g" to the correct combination of pixel illuminations?
Here is what I have managed to figure out so far. I understand how the CPU takes the input generated by the keyboard, stores it in RAM, and then retrieves it to operate on using its instruction set, and I understand in detail how it performs those operations. The CPU then transmits the output of an operation, which in this example is a sequence of instructions that retrieves the "g" from its memory address and sends it to the monitor output.
Now my question is: does the CPU convert the letter "g" to a bitmap directly, does it use a GPU (either built-in or separate), or does the monitor itself handle the conversion?
Also, is it possible to write your own code that interprets the binary and formats it for display?
In most systems the CPU doesn't talk to the monitor directly; it sends commands to a graphics card, which in turn generates an electric signal that the monitor translates into a picture on the screen. There are many steps in this process, and the processing model is system-dependent.
From the software perspective, communication with the graphics card goes through a graphics card driver, which translates your program's and the operating system's requests into something the hardware on the card can understand.
There are different kinds of drivers; the simplest to explain is a text-mode driver. In text mode the screen is composed of a number of cells, each of which can hold exactly one of a set of predefined characters. The driver includes a predefined bitmap font that describes what each character looks like by specifying which pixels are on and which are off. When a program requests a character to be printed on the screen, the driver looks it up in the font and tells the card to change the electric signal it is sending to the monitor so that the pixels on the screen reflect what's in the font.
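To make this concrete, here is a minimal sketch of what text mode can look like from the software side. It assumes a legacy x86 PC whose VGA text buffer is mapped at address 0xB8000 and code running with direct access to that memory (a kernel or bare-metal environment; a normal user-space program cannot do this). Each cell is just a character code plus a color attribute, and the card draws the actual pixels from its font:

```c
#include <stdint.h>

/* Minimal sketch, assuming a legacy x86 VGA text-mode buffer at 0xB8000
 * (kernel or bare-metal code; not something a regular program can do). */
#define VGA_BUFFER ((volatile uint16_t *)0xB8000)
#define VGA_COLS   80

/* Each cell is 16 bits: low byte = character code, high byte = color
 * attribute. The card looks the character up in its font and drives
 * the monitor's pixels itself. */
static void put_char_at(char c, uint8_t attr, int row, int col)
{
    VGA_BUFFER[row * VGA_COLS + col] = (uint16_t)((attr << 8) | (uint8_t)c);
}

void demo(void)
{
    /* A light-grey-on-black 'g' in the top-left corner of the screen. */
    put_char_at('g', 0x07, 0, 0);
}
```

Notice that the program never deals with individual pixels here; it only stores a character code, and the font lookup happens below it.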
Text mode has limited use, though. You get only one choice of font, a limited choice of colors, and you can't draw graphics like lines or circles: you're limited to characters. For high-quality graphics output a different driver is used.

Graphics cards typically include a memory buffer that holds the contents of the screen in a well-defined format, like "n bits per pixel, m pixels per row, ...". To draw something on the screen you just write to this memory buffer. To make that possible, the driver maps the buffer into the computer's memory so that the operating system and programs can use it as if it were part of RAM. Programs can then write the pixels they want to show directly, and to put the letter "g" on the screen it's up to the application programmer to output pixels in a pattern that resembles that letter. Of course there are many libraries to help programmers do this; otherwise the current state of the graphical user interface would be even sorrier than it is.
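As an illustration of the framebuffer idea, here is a sketch in C that maps a Linux framebuffer device (/dev/fb0) into memory and plots a hand-drawn 8x8 bitmap of "g" pixel by pixel. The device path, the assumption of 32 bits per pixel, and the glyph bitmap itself are all assumptions made for the example, not something a real driver or font guarantees:

```c
#include <fcntl.h>
#include <linux/fb.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

/* A hand-drawn 8x8 bitmap of 'g': one byte per row, each set bit is a
 * lit pixel. Purely illustrative, not taken from any real font. */
static const uint8_t glyph_g[8] = {
    0x00, 0x3C, 0x42, 0x42, 0x3E, 0x02, 0x42, 0x3C
};

int main(void)
{
    int fd = open("/dev/fb0", O_RDWR);
    if (fd < 0) { perror("open /dev/fb0"); return 1; }

    struct fb_var_screeninfo vinfo;
    struct fb_fix_screeninfo finfo;
    ioctl(fd, FBIOGET_VSCREENINFO, &vinfo);   /* resolution, bits per pixel */
    ioctl(fd, FBIOGET_FSCREENINFO, &finfo);   /* bytes per scanline, size   */

    /* Map the card's screen buffer into our address space. */
    uint8_t *fb = mmap(NULL, finfo.smem_len, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
    if (fb == MAP_FAILED) { perror("mmap"); return 1; }

    /* Assuming 32 bits per pixel: write a white pixel for every set bit,
     * placing the glyph at screen position (100, 100). */
    for (int row = 0; row < 8; row++)
        for (int col = 0; col < 8; col++)
            if (glyph_g[row] & (0x80 >> col)) {
                size_t off = (size_t)(100 + row) * finfo.line_length
                           + (size_t)(100 + col) * (vinfo.bits_per_pixel / 8);
                *(uint32_t *)(fb + off) = 0x00FFFFFF;   /* white */
            }

    munmap(fb, finfo.smem_len);
    close(fd);
    return 0;
}
```

In other words, at this level "the letter g" is nothing more than a particular pattern of pixel values that some piece of software decided to write into the buffer; everything above it (font files, text rendering libraries, GUI toolkits) exists to produce that pattern for you.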
Of course, this is a simplification of what actually goes on in a computer, and there are systems that don't work exactly like this: for example, some CPUs have an integrated graphics card, and some output devices are not based on drawing pixels but on plotting lines. But I hope this clears up the confusion a little.