project_GLaMRE log 00

Welcome to the world where bits, bytes, matter, magic, and chaos all blend seamlessly. Meet Project GLaMRE: the graphics engine I'm writing in C++ with OpenGL, just for funsies. It stands for Graphics Library and Media Rendering Engine. My current mission? To create an engine that's not only fast and performant but also offers me nuanced, granular control over every aspect of the rendered image. The ultimate goal? To create visually striking graphics that I think look neat, and hopefully you will think they look neat, too.

I'm making this blog to document progress as a personal reference, and as a way to promote and demonstrate the project. This blog might be a mess for a bit; I haven't 'blogged' in a while so I plan on just throwing some digital spaghetti at the wall and seeing what sticks.  If you're not interested in reading all of my technical blabbering, I hope to still make it entertaining to scroll through by frequently sprinkling in videos and screenshots of the project along the way.
 
This first post is going to be an origin story of sorts. I'm going to do my best to catalog the ideas that led to the project in its current state.
 
I didn't set out to develop a graphics engine in C++. I originally just started reading the book "Computer Graphics: Principles and Practice" (https://cgpp.net/) around July/August of 2023. Graphics programming has always been a big interest/passion of mine, so I wanted to learn about it from the ground up, and this book seemed like a good place to start. Currently, I'm on chapter 6, "Introduction to Fixed-Function 3D Graphics and Hierarchical Modeling," and so far it has been an excellent resource for learning about computer graphics from the basics. However, much to my dismay, the first few chapters focus on using the C# programming language, .NET Framework, and Windows Presentation Foundation (WPF) to create applications that display basic 2D and 3D graphics. For those who don't know, WPF and the .NET Framework are proprietary frameworks used to create programs/apps that only work on Windows and can only be developed in a Windows environment (gross!).

I went into this book with the understanding that it would teach OpenGL, so I assumed that I could complete all the projects in a Linux development environment. Unfortunately, that was not the case. I understand why the authors chose to write the book this way, since fundamental graphics concepts are a lot easier to grasp with WPF, especially for beginners. So, I begrudgingly booted up the old Windows virtual machine and persevered through the starting projects. Despite being forced to use Windows and MS Visual Studio, the projects were still a lot of fun to work on. I also program in C# in my professional life, so I was able to apply knowledge gained from work to these projects and vice versa, which was a pretty big motivating factor through this phase of the project.

As the book began delving into the basics of 3D graphics, it introduced an example featuring the famous Renaissance artist and printmaker, Albrecht Dürer. He devised a method to produce realistic 3D renderings/drawings way before the concept of a computer was ever conceived. I won't fully describe how it works here, but if you're interested, here's a link. Basically, the Dürer rendering engine is a primitive yet practical method for calculating perspective projection/ray casting.

Below is a picture of what the Dürer Engine looked like. It features a pointer on a string that's attached to the wall. One person holds the pointer, aiming it at different spots on the object, while another person, positioned beside the wooden frame/door, holds a pencil perfectly still where the string passes through the frame. They then close the door on the frame and mark where the string passed through. This process eventually creates dots that, when connected, form what appears to be a three-dimensional image on a two-dimensional plane. I like to think of the wooden frame as your eyeball, and the string as a beam of light passing through it.



One of the first examples of computer generated 3D graphics in the book involves running a program that uses a coded version of the Dürer algorithm. This program generates a wireframe cube in 3D, using two-dimensional line segments, with small circles drawn at the vertices (the 3D points that make up the cube) to make them more visible. This was all accomplished using built-in WPF features that allow you to draw basic shapes onto a canvas within a window. I will probably go into more detail on how exactly this coded algorithm works in a later post. But basically, it takes the 3D X,Y,Z coordinate data for the cube that is stored in a file and does mathy stuff to convert it into 2D X,Y coordinates that correspond to each cube vertex. When rendered on the screen, these appear to make up a 3D image.
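To make the "mathy stuff" a little more concrete, here is a minimal C++ sketch of that kind of perspective projection. It is not the book's code or my original C# version. It assumes the eye (the point where Dürer's string is anchored) sits at the origin looking down the +z axis, and that the frame is the plane z = planeZ. The names Vec3, Vec2, projectPoint, projectAll, and makeCube are all made up for this illustration.

```cpp
#include <cassert>
#include <vector>

struct Vec3 { double x, y, z; };
struct Vec2 { double x, y; };

// Project one 3D point onto the plane z = planeZ.
// Assumption: the eye is at the origin looking down +z, so by similar
// triangles the string crosses the frame at (x * planeZ / z, y * planeZ / z).
Vec2 projectPoint(const Vec3& p, double planeZ) {
    assert(p.z > 0.0 && "point must be in front of the eye");
    return { p.x * planeZ / p.z, p.y * planeZ / p.z };
}

// Project a whole list of vertices, e.g. the 8 corners of a cube.
std::vector<Vec2> projectAll(const std::vector<Vec3>& points, double planeZ) {
    std::vector<Vec2> out;
    out.reserve(points.size());
    for (const Vec3& p : points) {
        out.push_back(projectPoint(p, planeZ));
    }
    return out;
}

// The 8 corners of a unit cube centered on the z axis and pushed
// zOffset units away from the eye (hypothetical helper for testing).
std::vector<Vec3> makeCube(double zOffset) {
    std::vector<Vec3> corners;
    for (int i = 0; i < 8; ++i) {
        corners.push_back({ (i & 1) ? 0.5 : -0.5,
                            (i & 2) ? 0.5 : -0.5,
                            ((i & 4) ? 0.5 : -0.5) + zOffset });
    }
    return corners;
}
```

Calling projectAll(makeCube(3.0), 1.0) gives 8 2D points. Drawing line segments between the ones that share a cube edge produces the wireframe picture, with the far face coming out smaller than the near face.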
 
 
Wireframe cube test project created by the CGP&P authors, code and project files can be found here.

Then! I had an idea that was reminiscent of the college courses I took on computer graphics and linear algebra. I wanted to see if I could modify the code for this project so that it renders a rotated version of the cube. I knew that I needed to multiply each vertex by a specific 4x4 matrix to achieve a rotation transformation, but I forgot what a rotation matrix looks like. (In the computer graphics class I took, we used C++ and OpenGL, but it was with libraries and frameworks that abstracted much of this kind of math.) Thus, while I was conceptually familiar with matrix transformations and 3D graphics, I did not have the various transformations memorized.

After some refresher research, I managed to create a method that utilized two-dimensional arrays or lists instead of a dedicated matrix class as one might expect in a project like this. (Some situations required converting back and forth between the two, but it's mainly for deserialization purposes.) 
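For reference, here is a rough C++ sketch of the same idea: a 4x4 rotation matrix stored as a plain two-dimensional array, multiplied against each vertex. The original was C#, so this is a reconstruction of the technique and not my actual code. Mat4, Vec4, identity, rotationY, and transform are names I made up here, and the rotation follows the usual right-handed convention about the Y axis.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// A 4x4 matrix as a plain 2D array, indexed [row][column].
using Mat4 = std::array<std::array<double, 4>, 4>;
// A point in homogeneous coordinates: (x, y, z, 1).
using Vec4 = std::array<double, 4>;

Mat4 identity() {
    Mat4 m{};  // value-initialized, so every entry starts at zero
    for (std::size_t i = 0; i < 4; ++i) {
        m[i][i] = 1.0;
    }
    return m;
}

// Rotation about the Y axis by the given angle in radians.
Mat4 rotationY(double radians) {
    assert(std::isfinite(radians));
    Mat4 m = identity();
    const double c = std::cos(radians);
    const double s = std::sin(radians);
    m[0][0] = c;   m[0][2] = s;
    m[2][0] = -s;  m[2][2] = c;
    return m;
}

// Multiply a matrix by a column vector: out = m * v.
Vec4 transform(const Mat4& m, const Vec4& v) {
    Vec4 out{};
    for (std::size_t row = 0; row < 4; ++row) {
        for (std::size_t col = 0; col < 4; ++col) {
            out[row] += m[row][col] * v[col];
        }
    }
    return out;
}
```

Rotating every cube vertex is then just a loop that calls transform(rotationY(angle), vertex) before projecting. The fourth row and column look useless for a pure rotation, but they are what lets translation fit into the same matrix form later on.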
 
It was around this point that I began investing significant time in brushing up on my linear algebra and calculus, recognizing that an intuitive understanding of these concepts would be crucial for a project of this nature. I highly recommend the 'Essence of' series for Linear Algebra and Calculus by 3blue1brown on YouTube.

I started studying math and working on the rotating cube simultaneously. Getting the cube to rotate turned out to be a lot easier than I had anticipated; it just sort of worked. I can't recall the specifics, but I likely started with a single rotation from its original position. Then I think I added a game/timed loop, so I had the wireframe cube spinning at approximately 60 FPS. This was before I created a GitHub repository for the project, so unfortunately I don't have a backup of what that looked like.
 
I felt the next logical step was to add faces/sides to the cube, so that it appears to be a solid object rather than a wireframe model. I had the idea of just using the built in DrawPolygon() method that's available in the .NET Framework to draw a filled-in 2D polygon connecting the vertices of each face of the cube, every frame. So like, a 2D polygon that fills in the space where each face/side of the cube would be. That seemed to work pretty well, almost. I soon realized a limitation: this method had no way of discerning which sides of the cube were facing the camera. As a result, it ended up drawing all the cube faces, including those facing away from the viewer. This limitation was rooted in the design of Dürer's rendering technique, where the concept of hidden faces wasn't a concern, as the artist could easily distinguish which points to project for rendering the image. They could tell which parts of the object didn't need to be drawn, because the back of it, for example, wouldn't be seen when looking at it through the wooden frame.

Computers can't really distinguish which faces they should or shouldn't draw without a bit of extra help from good ol' mathematics. After a bit of research, I realized that linear algebra really comes in handy for this type of problem. Turns out, you can use the dot product between the vector the side of the cube is facing (its normal) and the vector the camera is facing. Vectors here just mean the direction they're facing, and the dot product is a single number that tells you how closely the 2 directions line up. (The cross product is the related operation that gives you the direction a face is pointing in the first place, from two of its edges.) If the dot product of the 2 vectors is negative, it means they point in roughly opposite directions, so that side of the cube is facing towards you, and the program knows to draw the polygon for that side. If the dot product is positive, it skips drawing that side. In that moment, I felt like I understood linear algebra better than ever before, way more than in any classroom. It's unbelievable how much easier it is to grasp this stuff when you're actually using it in a coding project.
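Here is a small C++ sketch of that facing test, usually called back-face culling. It is a reconstruction of the technique, not my original C# code. The sign test itself uses the dot product (which gives a single number), while the cross product is only used to get the face's normal from two of its edges. It assumes each face lists its vertices counter-clockwise when seen from outside, and that the camera looks along one fixed direction. With a true perspective camera you would normally use the vector from the camera to the face instead. All of the names (Vec3, faceNormal, isFrontFacing) are made up for the example.

```cpp
#include <cassert>

struct Vec3 { double x, y, z; };

Vec3 sub(const Vec3& a, const Vec3& b) {
    return { a.x - b.x, a.y - b.y, a.z - b.z };
}

// Cross product: a vector perpendicular to both inputs.
Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// Dot product: a single number, negative when the two
// vectors point in roughly opposite directions.
double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Outward normal of a face from three of its vertices.
// Assumption: vertices are listed counter-clockwise seen from outside.
Vec3 faceNormal(const Vec3& a, const Vec3& b, const Vec3& c) {
    const Vec3 n = cross(sub(b, a), sub(c, a));
    assert(dot(n, n) > 0.0 && "degenerate face: vertices are collinear");
    return n;
}

// A face is drawn only if its normal points against the view direction.
bool isFrontFacing(const Vec3& normal, const Vec3& viewDirection) {
    return dot(normal, viewDirection) < 0.0;
}
```

For a cube this is all you need, since a convex shape never has one front-facing side hiding another. More complicated models need something extra, like sorting faces by depth or a depth buffer.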

It wasn't long before I had a solid cube spinning continuously in an animation loop. I also attempted to add a smooth color-changing effect. That didn't really work out like I wanted, but it was still sorta neat.





So, it sort of felt like magic when the cube was rotating and displaying only the sides it should; the ones that face the camera. From there, ideas started popping off in my head. I wanted to store model data in a file so that I could easily import and export them, so I created my own JSON format for this purpose. If only I'd done a bit of research while designing my JSON format, I probably would've stumbled upon the .OBJ format and realized I was hardcore reinventing the wheel with a lot of unnecessary deserialization overhead (yuck!). Though I'm getting a little ahead of myself; I didn't discover the OBJ format until a bit later after I started working on the C++ engine. 

Cube model data in my custom JSON model format.


Anyway, back to the rotating cube and stuff. So yeah, I kept adding more features, which included some additional transformations like translation (moving a model from point A to B) and scaling (making a model bigger or smaller). Everything just sort of fell into place. It began to feel like I was creating an actual graphics engine of sorts, as primitive and ridiculous as it was. The book also had a good number of other projects in the early chapters, most of which involved some form of animation, so instead of using the built in WPF methods for animating/creating polygons, I used my own shape creation and transformation methods. Below are some of the projects I completed using my primitive engine. (I did use the built in .NET sliders and buttons for all of the GUI components.)


Program where you click to draw a polygon, then you can subdivide the edges to smooth them out. You can also generate the "Dual" of the polygon, which creates a new polygon based on the edge center points of the previous polygon.



Shadow Trajectory Project: changing the location of the shadow with the sliders changes the perceived trajectory of the ball when it's moving/hovering above the table. As you change the location of the shadow, the ball's path appears to change. It's a good demonstration of how shadows can influence our interpretation of motion and depth in a seemingly three-dimensional space.



"Motion Blindness" optical illusion: staring at the middle of the screen causes some or all of the dots on the outer edge of the screen to "disappear". There are sliders to adjust the spin speed, distance between the crosses, and the dot color/diameter. The sliders can be adjusted to optimize the illusion effect. 



'Spatial Frequency' optical illusion: the size and spacing of the black and white bars can be adjusted, and when the bars are fine enough and you look at them from a far enough distance, they blend into a solid gray shade. You're supposed to adjust the separate gray area below to match the perceived shade of gray made by the bars. It's an interesting way to see how our eyes combine detailed patterns into a single tone from afar.



Just to be 100% clear, my methods were horribly inefficient. Most of the calculations and matrix math that are typically processed by the GPU via shader code were being handled by C# on the CPU, which is kind of gross. Despite that, I was pleasantly surprised by the initial smoothness of the rotating cube and other animations in these projects, enough that it motivated me to keep going and push this engine to its limit. Also, it's worth mentioning that WPF has built-in 3D rendering capabilities, which most likely DO utilize the GPU. So I was absolutely reinventing a much less efficient wheel at this point, and I knew this 110%, but the sheer silliness of it was part of what made it so much fun to work on.

In the CGP&P book, where I left off before deciding to focus more on adding features to the engine, I had just started on a series of projects that are intended to be completed using the WPF 3D rendering functionality. My plan instead was to continue using this wacky engine I had developed to complete these projects. One of the early 3D projects (after the handful of 2D projects shown previously) involved modeling a robotic camel with articulated joints, capable of moving and rotating independently. This was considerably larger in scope compared to previous projects. The camel depicted in the book was relatively geometrically complex, featuring a variety of shapes and curved surfaces with smooth shading and "realistic" lighting. Here's a picture of the camel from the book:

Rendering of a robot camel from the book. The locations of the articulated joints are highlighted.

Having the ability to create, scale, and rotate various primitive shapes like spheres, cylinders, and cones was going to be crucial for modeling and animating the different parts of the camel. I suppose I must've been looking for a challenge that day, because I ambitiously started with a sphere, which in hindsight might not have been the wisest decision, as it is the most complex of all the primitives/shapes I just mentioned. The practice I would've gotten from implementing the more basic shapes first would've probably made the sphere a lot easier to tackle.

To calculate the points of a sphere, I knew I could use sine functions, but I would've needed to research/google how exactly to accomplish that. So instead, as a practice exercise I decided to go for a more hands-on/creative approach for approximating a sphere. Since I had already created methods for creating 2D circles in a 3D space for the shadow trajectory project, my strategy was to make a stack of these circles, each with progressively smaller radii, at specific ascending heights, and then connect the points of vertically adjacent circles to form the faces of the sphere.


Partial WIP sphere while I was still trying to figure out the height distance between the inner circles. You can see the stack of circles that are red, and the partial faces of the sphere that are purple.

I started by generating the points for a quarter of a circle (think of a quarter circle on an X,Y plane/grid). Then, for each point in this quarter circle, I used the point's Y value (its distance from the grid's X axis) as the height of one ring of the sphere, and its X value (its distance from the grid's Y axis) as the radius of that ring. This essentially created a stack of circles with smaller and smaller radii, and I connected the points of adjacent rings to form half of the sphere. While this method of sphere generation sounds somewhat straightforward, it still required a lot of debugging and trial and error to perfect. Eventually, this approach resulted in a decent looking half-sphere, which I then duplicated and flipped upside down to make a very recognizable approximation of a full sphere. This success absolutely fueled my motivation to continue down this path of insanity.
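The stacked-ring idea can be sketched in C++ roughly like this. It is a cleaned-up reconstruction and not my original C# code, and it does lean on sine and cosine to walk the quarter circle (the very thing I was avoiding googling at the time). The names Ring, Quad, makeRing, makeHemisphere, and connectRings are made up, and the Y axis is assumed to point up.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };
struct Quad { Vec3 a, b, c, d; };  // one four-sided face
using Ring = std::vector<Vec3>;

// A circle of `segments` points with the given radius, lying flat at y = height.
Ring makeRing(double radius, double height, int segments) {
    const double twoPi = 2.0 * std::acos(-1.0);
    Ring ring;
    ring.reserve(static_cast<std::size_t>(segments));
    for (int i = 0; i < segments; ++i) {
        const double angle = twoPi * i / segments;
        ring.push_back({ radius * std::cos(angle), height, radius * std::sin(angle) });
    }
    return ring;
}

// Walk a quarter circle from the equator up to the pole. Each point on it
// gives one ring: its X value is the ring radius, its Y value the ring height.
std::vector<Ring> makeHemisphere(double radius, int stacks, int segments) {
    assert(stacks >= 1 && segments >= 3);
    const double halfPi = std::acos(-1.0) / 2.0;
    std::vector<Ring> rings;
    for (int s = 0; s <= stacks; ++s) {
        const double t = halfPi * s / stacks;
        rings.push_back(makeRing(radius * std::cos(t), radius * std::sin(t), segments));
    }
    return rings;
}

// Connect two vertically adjacent rings with a band of quads.
std::vector<Quad> connectRings(const Ring& lower, const Ring& upper) {
    assert(lower.size() == upper.size());
    std::vector<Quad> quads;
    const std::size_t n = lower.size();
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t next = (i + 1) % n;  // wrap around to close the band
        quads.push_back({ lower[i], lower[next], upper[next], upper[i] });
    }
    return quads;
}
```

The top ring has a radius of (almost) zero, so the last band of quads collapses into triangles around the pole. The bottom half is the same set of rings with the heights negated.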


Sphere without lighting.

After getting the sphere to work, I had confidence in my ability to implement other primitives as needed, so I decided to move on to adding other non-shape-related features to keep things fresh. One of the initial tasks in the camel project was to configure a light source, and the author of the book had already described the algorithm for calculating simple Gouraud shading, so I was going to try and implement that next. This shading technique worked similarly to how I determined which sides of the cube are facing the viewer. However, this time, the calculations consider the orientation of the surface, the viewing angle, and the direction of a light source. These factors were used to determine how bright or dark various regions of the model should be colored. Essentially, it involved adjusting each surface's RGB value resulting in the true color, black, or shades in between. While this might not have been the most accurate or correct implementation of the Gouraud shading algorithm, I was pleased with its aesthetic outcome. At this moment I felt on top of the world; I had anticipated that implementing the lighting would be more challenging, but I was quite proud of the results. I now had some basic lighting and primitives for my engine.
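To give a rough idea of what darkening a face based on the light looks like in code, here is a minimal C++ sketch of simple diffuse (Lambert) shading applied once per face. Note the assumptions: this is flat shading and not true Gouraud shading, which computes the lighting at each vertex and blends it across the face. It also ignores the viewing angle entirely. It is not the book's algorithm or my original C# code, and Vec3, Color, normalize, and shadeFace are names invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
struct Color { double r, g, b; };  // each channel in the range 0..1

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Scale a vector to length 1 so the dot product becomes a pure cosine.
Vec3 normalize(const Vec3& v) {
    const double length = std::sqrt(dot(v, v));
    assert(length > 0.0 && "cannot normalize a zero-length vector");
    return { v.x / length, v.y / length, v.z / length };
}

// Darken a face's base color by how directly it faces the light.
// `toLight` points from the surface toward the light source.
// Intensity is 1 when the face points straight at the light and
// 0 when it is edge-on or facing away.
Color shadeFace(const Color& base, const Vec3& normal, const Vec3& toLight) {
    const double intensity =
        std::max(0.0, dot(normalize(normal), normalize(toLight)));
    return { base.r * intensity, base.g * intensity, base.b * intensity };
}
```

This gives the "true color, black, or shades in between" behavior: a face pointed straight at the light keeps its full color, and one pointed away comes out black. Adding a small constant ambient term is a common way to keep the dark side from going completely black.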


Sphere with lighting.

Then, something hit me that I hadn't quite thought through yet: completing the camel project would technically require a textured surface. I could have probably completed 95% of the camel project if I were to skip the textures and move on, but the perfectionist in me didn't want to leave things unfinished. It's not very apparent in the previous camel picture, but the ground the camel is standing on uses an image-based sand texture. Implementing textures was a challenge I hadn't figured out yet. I figured there was some way to overlay an image on a 2D polygon, and after a bit of research, it did seem possible. However, after I added the sphere and lighting, reality hit me hard AGAIN. The application was essentially JUST rendering a rotating sphere to demonstrate reflected light. For the most part, it was smooth, but there were occasional hiccups suggesting that things might not be as stable as I'd hoped. Furthermore, increasing the resolution, segments, or points on the sphere led to even more noticeable performance hits, and I felt that the textures would've potentially pushed it over the edge in terms of stability/performance.

The process of crafting this 'engine' was extremely fun and an amazing learning experience. However, I was starting to get too attached to it and I wanted to continue developing it, but I knew if I continued down the current path, I would eventually hit a performance brick wall that I wouldn’t be able to get over. So, I bit the bullet, retired the WPF engine (for now!!), and decided to try porting the current engine to C++ and OpenGL. At this point, I hadn’t yet dreamt up the name 'Project GLaMRE,' but that was soon approaching. For the past few months, the C++ engine is what I've been focusing on, and I think it's coming along rather nicely. Stay tuned for more GLaMRE!
