Background
A while ago I saw this video by SethBling about 3D rigidbody physics in Minecraft using the Separating Axis Theorem for collision detection, and the idea really resonated with me. After watching it and reading up on the concept, I used what I'd learned to add proper 2D polygon colliders to a Terraria clone I was working on at the time, a project I'd started in order to learn C# outside the context of the Unity Game Engine. I've since used this method in several projects, including an implementation of Asteroids for college. But all of the projects I've done with SAT so far have been 2D, and being the massive dweeb that I am, I've always wanted to try it in 3D. What better time to start a massive new project than a month before college starts back up again?
Getting literally anything to work
Before getting into how I've implemented things, I should probably explain what the SAT actually is. In all honesty it's pretty simple as long as you understand a few core concepts, namely the dot product of vectors. The dot product can be thought of as an operation that rotates 2 vectors together such that one of them aligns with the x axis, and returns the x component of the other vector scaled by the first vector's length. For our purposes today, we can ignore that final scaling because we can always assume at least one of our vectors is normalized and has a length of 1. The effect is that the dot product casts a "shadow" of one vector onto the other, and tells us where that shadow lands along it. The other main concept to understand is that for any 2 convex shapes that are not overlapping, there exists at least one axis onto which they can be projected such that their shadows don't overlap. That one is pretty obvious if you think about it, but the important part is that we can very easily find that axis. In fact, finding it is pretty much the entirety of the Separating Axis Theorem: it turns out that in 2D, you only need to check the axes perpendicular to each side of both objects, and you're guaranteed to find the separating axis if it exists.
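To make that concrete, here's a minimal 2D sketch of the projection test, using a bare-bones Vec2 type of my own rather than anything from the project:

```cpp
#include <cfloat>
#include <vector>

struct Vec2 { float x, y; };

// With a normalized axis, the dot product gives the signed position of
// a point's "shadow" along that axis.
float Dot(Vec2 a, Vec2 b) { return a.x * b.x + a.y * b.y; }

// Project every vertex onto the axis and keep the extents of the shadow.
void Project(const std::vector<Vec2>& verts, Vec2 axis, float& min, float& max) {
    min = FLT_MAX;
    max = -FLT_MAX;
    for (Vec2 v : verts) {
        float p = Dot(v, axis);
        if (p < min) min = p;
        if (p > max) max = p;
    }
}

// Two convex shapes overlap on this axis if their shadow intervals overlap.
// If any tested axis shows a gap, the shapes are separated.
bool OverlapsOnAxis(const std::vector<Vec2>& a, const std::vector<Vec2>& b, Vec2 axis) {
    float minA, maxA, minB, maxB;
    Project(a, axis, minA, maxA);
    Project(b, axis, minB, maxB);
    return maxA >= minB && maxB >= minA;
}
```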
Now that we have our foundation, it's pretty easy to get started implementing the system! One important consideration is that the SAT only works for convex shapes, so any complex shapes with concavities will need multiple colliders. The simplest way of dealing with this is to have each object store a std::vector<T> of colliders, but aesthetically that option leaves a lot to be desired. Instead, I opted to have a base Collider class, and have MeshCollider and CompoundCollider derive from it. That way, when an object only needs a single collider, it only needs to store that one collider, but more complex objects can have a compound of several in the same slot. This might not be the best approach, but I like it because of the aforementioned aesthetics, and it gives me the flexibility to add specialized collider types down the line, like a sphere. I don't know if that will be worth the effort, but it's nice to have the option. The main type of collider I'll be talking about going forward is the mesh collider, which has a vector of vertices and a vector of normals.
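As a rough sketch (my own illustrative declarations, assuming raylib's Vector3 for the math types), the hierarchy looks something like this:

```cpp
#include <memory>
#include <vector>
#include "raylib.h" // for Vector3

// Base class, so an object only ever needs to store one Collider.
class Collider {
public:
    virtual ~Collider() = default;
};

// A single convex shape: its vertices, plus the face normals to test as axes.
class MeshCollider : public Collider {
public:
    std::vector<Vector3> vertices;
    std::vector<Vector3> normals;
};

// Several convex pieces occupying one collider slot, for concave objects.
class CompoundCollider : public Collider {
public:
    std::vector<std::unique_ptr<Collider>> parts;
};
```

Now we just need to make a function that takes in 2 colliders, splits up any compound colliders, and then checks for an overlap.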
Now, for code I wrote in a super sleep-deprived state in a Discord call while my equally sleep-deprived friends played games and built Lego, I'd say this function turned out pretty well. It has a few problems I'll go over later, but for now it worked well enough that I was tricked into thinking it was done, based on the console-log testing I was doing. The function takes in a transformation matrix for each collider to put them into world space, and the loop breaks as soon as a separating axis is found to avoid redundant work. This may change later, as it might be necessary to complete the loop to find the exact location in 3D space where the collision occurred. I also made the function return an std::optional<T> of a hit object to pass info about the collision back out for resolution later, and created a simple struct of 2 floats representing the range of values a collider covers when projected onto a given normal.
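Here's roughly the shape that check ended up taking, as a sketch with my own names (Hit, Projection, the helpers) around the MeshCollider from before; only the face-normal axes from this stage are shown:

```cpp
#include <cfloat>
#include <optional>
#include <vector>
#include "raymath.h" // Vector3Transform, Vector3DotProduct, Vector3Subtract

struct Hit { /* collision info to pass out for resolution later */ };

// The range of values a collider covers when projected onto an axis.
struct Projection { float min, max; };

// Rotate a direction by the matrix's rotation part (cancelling the
// translation), since normals shouldn't be translated with the vertices.
Vector3 TransformDirection(Vector3 dir, Matrix m) {
    return Vector3Subtract(Vector3Transform(dir, m),
                           Vector3Transform(Vector3{ 0, 0, 0 }, m));
}

Projection Project(const std::vector<Vector3>& verts, Matrix transform, Vector3 axis) {
    Projection p{ FLT_MAX, -FLT_MAX };
    for (Vector3 v : verts) {
        float d = Vector3DotProduct(Vector3Transform(v, transform), axis);
        if (d < p.min) p.min = d;
        if (d > p.max) p.max = d;
    }
    return p;
}

std::optional<Hit> CheckCollision(const MeshCollider& a, Matrix ta,
                                  const MeshCollider& b, Matrix tb) {
    // Candidate axes: both colliders' face normals, rotated into world space.
    std::vector<Vector3> axes;
    for (Vector3 n : a.normals) axes.push_back(TransformDirection(n, ta));
    for (Vector3 n : b.normals) axes.push_back(TransformDirection(n, tb));

    for (Vector3 axis : axes) {
        Projection pa = Project(a.vertices, ta, axis);
        Projection pb = Project(b.vertices, tb, axis);
        // A gap on any axis proves separation, so bail out early.
        if (pa.max < pb.min || pb.max < pa.min)
            return std::nullopt;
    }
    return Hit{}; // no separating axis found
}
```

With this done, I can finally start making some objects to attach these colliders to and draw to the screen!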
Getting something to render
The first step of getting something drawn to the screen is creating a class to contain all the data related to rendering and simulating a given object. This class doesn't need a whole lot in it at the moment, just some transformation matrices, a mesh, material, and shader for rendering, and a collider. In terms of methods, all it's got is some setters and getters for the data, and functions for logic updates and rendering (Update() and Draw() respectively).
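Something like this, give or take (a sketch; PhysicsObject and the member layout are my own names around raylib's types):

```cpp
#include "raylib.h"

class PhysicsObject {
public:
    void Update(float dt) { /* logic/physics step goes here */ }
    void Draw() { DrawMesh(mesh, material, transform); } // raylib draw call

    // ...setters and getters for the data below...

private:
    Matrix    transform; // position/rotation/scale in world space
    Mesh      mesh;
    Material  material;
    Shader    shader;    // assigned into material.shader before drawing
    Collider* collider;
};
```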
I also took the time to set up a World class. There's not much in there at the moment, but it lets me bundle up all my global data and internal logic in a much more organized place. From there, all I needed to do was loop through all the objects in the world and call their update and draw methods, while also calling the collision check method against every other object. It's not optimized or efficient; at the moment I'm looping through the objects 3 separate times (plus one nested loop) for simplicity.
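In sketch form, with the same hypothetical PhysicsObject as before, the per-frame logic is just three passes:

```cpp
#include <vector>
#include "raylib.h" // GetFrameTime

class World {
public:
    void Tick() {
        // Pass 1: logic updates.
        for (auto& obj : objects) obj.Update(GetFrameTime());

        // Pass 2: naive O(n^2) pairwise collision checks.
        for (size_t i = 0; i < objects.size(); ++i)
            for (size_t j = i + 1; j < objects.size(); ++j)
                CheckPair(objects[i], objects[j]);

        // Pass 3: rendering.
        for (auto& obj : objects) obj.Draw();
    }

private:
    void CheckPair(PhysicsObject& a, PhysicsObject& b); // wraps the SAT check
    std::vector<PhysicsObject> objects;
};
```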
At this point it should have been time to move on to implementing collision resolution, but you may remember I mentioned that I wrote a good chunk of the code under what might be called "sub-optimal conditions". That's right, my sleep-deprived code I thought was working is now coming back to bite me in the behind. I might not have noticed a lot of these issues if I hadn't swapped one of the cubes I was testing with for another shape as a sanity check, because it just so happened that most of my mistakes involved accidentally discarding the results of transformations, in several fun flavours. Chief among those was calling raylib's vector and matrix transformation functions and forgetting to actually assign the returned result to anything. I did this in a few different places, alongside some spots where I tried assigning to the by-value loop variable in range-based for loops over vectors of points.
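In miniature, the two flavours of bug looked something like this (illustrative code, not the actual lines from the project):

```cpp
#include <vector>
#include "raymath.h"

void TransformPoints(std::vector<Vector3>& points, Vector3& position, Matrix transform) {
    // Flavour 1: raymath functions return by value, so forgetting to
    // assign the result silently throws the transformation away.
    Vector3Transform(position, transform);            // wrong: result discarded
    position = Vector3Transform(position, transform); // right

    // Flavour 2: a by-value range-for variable is a copy, so assigning
    // to it never touches the vector's actual elements.
    for (Vector3 v : points) v = Vector3Transform(v, transform);  // wrong: mutates a copy
    for (Vector3& v : points) v = Vector3Transform(v, transform); // right: reference
}
```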
There was quite a lot of fixing to be done, and in the process I also discovered that the web builds using Emscripten were completely borked, which led to even more fixing. Unfortunately for me, that wasn't even the end of it.
You might have noticed that all the way back in my explanation of the Separating Axis Theorem, I specifically talked about how you can find the axes that need to be tested in 2D, but didn't say anything about how that might be different in 3D. That's because I had completely forgotten that it was different. I don't know that I can blame this mistake on a lack of sleep, but I will use the excuse that it had been at least a year and a half since I had seen the video explaining the concept in 3D. The thing is, using only the normals of the objects works in most cases where 2 cubes are being tested, even if they are rotated. The problem happens when the cubes are rotated in odd but common ways relative to one another, and the overlap detection returns a false positive. To get rid of this issue (and presumably similar issues with more complex shapes), you need to take the cross product of every edge on one shape with every edge on the other shape, and check the resulting direction vectors on top of the face normals. Since the cross product returns a new vector perpendicular to both inputs, and the false positives happen when 2 edges are oriented weirdly relative to one another, it makes sense that checking these axes would eliminate the issue. With the problem identified, it wasn't that big a deal to actually solve it. The main change was that colliders need to store their edges, which I handled by storing a vector of index pairs that refer into the vector of vertices.
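Sketched out (reusing the earlier illustrative MeshCollider, now assumed to also carry an edges member of vertex-index pairs, plus the TransformDirection helper from before), gathering the full 3D axis set looks something like:

```cpp
#include <utility>
#include <vector>
#include "raymath.h"

// World-space direction of an edge stored as a pair of indices into the
// collider's vertex list.
Vector3 EdgeDirection(const MeshCollider& c, std::pair<int, int> e, Matrix m) {
    Vector3 p0 = Vector3Transform(c.vertices[e.first], m);
    Vector3 p1 = Vector3Transform(c.vertices[e.second], m);
    return Vector3Subtract(p1, p0);
}

std::vector<Vector3> GatherAxes(const MeshCollider& a, Matrix ta,
                                const MeshCollider& b, Matrix tb) {
    std::vector<Vector3> axes;

    // The face normals, as before.
    for (Vector3 n : a.normals) axes.push_back(TransformDirection(n, ta));
    for (Vector3 n : b.normals) axes.push_back(TransformDirection(n, tb));

    // Plus the cross product of every edge pair across the two shapes.
    for (const auto& ea : a.edges) {
        Vector3 da = EdgeDirection(a, ea, ta);
        for (const auto& eb : b.edges) {
            Vector3 db = EdgeDirection(b, eb, tb);
            Vector3 axis = Vector3CrossProduct(da, db);
            // Near-parallel edges give a near-zero cross product; skip those.
            if (Vector3Length(axis) > 1e-6f)
                axes.push_back(Vector3Normalize(axis));
        }
    }
    return axes;
}
```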
A fun side effect of solving this issue was that I was able to render colliders using raylib's DrawLine3D() function, which is a lot nicer for visualizing collisions than my previous solution of changing the colour of the whole object. And with that lengthy detour out of the way, it's finally time to move on to using the information from the SAT to resolve the collisions!
Everything thus far was written after the fact, because I hadn't decided to document the making of this project in this much detail yet, but from here on each segment will be written as I go through them.
Course correcting
At this point I figured it would be useful to have some ability to inspect and modify objects at runtime. Obviously I don't need any sort of reflection system for a small personal project like this, but being able to see and modify the transform data of a physics object without recompiling the project would be very nice. So, I decided that before moving on to collision resolution, I would implement raycasting to select objects in the scene, and bring in Dear ImGui to display relevant information. I had never used ImGui before, but I had heard about it in passing and thought it a great fit for my purposes. After some struggling with outdated documentation and git hashes, I was able to get it up and running with the help of a nice compatibility library called rlImGui (the basic hookup is sketched below). Now that I had ImGui ready to use, it was time for raycasting, and this is where the title of this section comes into play. You see, up to this point I had kind of been going off vibes and what little knowledge I already had about 3D physics in games. I had some idea of how it might work, and thought I could work it out as long as I had my Separating Axis Theorem implementation in place to support it all. I found out just how wrong I was the moment I began to look into how raycasting works, something I had no idea about whatsoever.
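For reference, the basic rlImGui hookup looks roughly like this (based on rlImGui's own examples; the window size and title are placeholders):

```cpp
#include "raylib.h"
#include "imgui.h"
#include "rlImGui.h"

int main() {
    InitWindow(1280, 720, "ImGui test");
    rlImGuiSetup(true); // true = use the dark theme

    while (!WindowShouldClose()) {
        BeginDrawing();
        ClearBackground(DARKGRAY);

        rlImGuiBegin();          // start the ImGui frame
        ImGui::ShowDemoWindow(); // any ImGui widgets go between Begin/End
        rlImGuiEnd();            // render ImGui on top of the raylib frame

        EndDrawing();
    }

    rlImGuiShutdown();
    CloseWindow();
    return 0;
}
```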
After some googling and digging through forum posts, I was able to find this document going over raycasting, and this document on 3D physics. In the second one (a document from Valve), the author explains very early on that they use the Half-Edge data structure, AKA the Doubly Connected Edge List, and it became very apparent why as soon as I started to read the first document. The Half-Edge data structure makes it much easier to represent and iterate over adjacencies like "all the edges that make up a face" or "all connected faces". I had already started to move in that direction when I realized previously that I needed to keep track of edges and which vertices make them up, but I didn't quite think about how else I might need to care about the relations between elements of the polyhedron. So, I decided to read up on the data structure and implement my own version. At first I made something very 'by the books', with loosely connected Edge, Face, and Vertex objects related only by pointers. This wasn't very convenient for my purposes, and I didn't like how nothing was guaranteed to be next to anything else in memory, so I instead changed the structure to live inside a 'container' class as a series of vectors. Then, each object can store a single pointer for each vector it needs to access, plus the index of each object it's related to. As an example, the all-important Edge class looks something like this (a sketch of the idea, with illustrative names):
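```cpp
#include <cstddef>
#include <vector>

class Vertex;
class Face;

class Edge {
public:
    // One pointer per container vector this edge needs to reach...
    std::vector<Vertex>* vertices;
    std::vector<Face>*   faces;
    std::vector<Edge>*   edges;

    // ...and indices identifying the elements it's related to.
    size_t origin; // the vertex this half-edge starts from
    size_t face;   // the face this half-edge borders
    size_t twin;   // the oppositely-directed partner half-edge
    size_t next;   // the next half-edge around the face
};
```

Storing indices into the container's vectors instead of raw pointers keeps the elements packed together in memory while still letting each edge find its neighbours.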
I'll save you the exhaustive detail, but trying to integrate this system was a massive headache that took several days of work before I was even able to make sure I'd structured it all correctly. I spent many hours getting segfaults and seemingly inexplicable errors that ultimately boiled down to either not understanding how the std::map<T> structure stores its members, or looking in the completely wrong part of the code. Perhaps the funniest example of the latter was towards the end of the process, when I spent nearly 8 hours debugging in search of one problem and solving about a dozen others along the way, only to realize in the end that I was simply providing the vertices in the wrong order, so none of the half-edges were being paired with their opposites, since the would-be twins were pointing in the same direction. After finally solving that, the raycasting itself was almost child's play by comparison, and only took me a day or 2 of work to get fully functional.
From there, it was a simple matter to do a raycast from the mouse, store a pointer to whatever object (if any) was hit, and display its position in an ImGui window. That info gets displayed as a series of input fields, and the temporary Vector3 ImGui uses can be fed back to the object at the end of the tick to change its position at runtime!
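Sketched out, that flow looks something like this; World::Raycast, GetPosition, and SetPosition are hypothetical stand-ins for whatever the real methods are called:

```cpp
// Inside the frame loop, after input handling. 'camera' is the scene
// camera and 'world' owns the objects.
Ray ray = GetMouseRay(GetMousePosition(), camera);
PhysicsObject* selected = world.Raycast(ray); // nullptr if nothing was hit

if (selected) {
    Vector3 pos = selected->GetPosition();
    float edit[3] = { pos.x, pos.y, pos.z };

    ImGui::Begin("Inspector");
    if (ImGui::InputFloat3("Position", edit)) {
        // Feed the edited values back at the end of the tick.
        selected->SetPosition(Vector3{ edit[0], edit[1], edit[2] });
    }
    ImGui::End();
}
```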