Merrak's Isometric Adventures -- Artificial Intelligence!

merrak

  • Posts: 2092
Adventures Pursuing the Schrödinbug; or, How to Debug Your Code. Earlier there was a thread about how to debug code, for which I really can't think of a good, succinct answer. There's quite a bit of an art to debugging code. Those of you who are intermediate and expert programmers probably have your own habits, strategies, and tools--but if you're a beginner then there's a grim reality you'll experience soon enough: You'll likely spend much more time fixing your code than writing it "fresh".

The difficulty in writing a program scales up with its size--I'd say exponentially. So you really don't want to dive into a large project without having a few tricks up your sleeve.

For that reason I thought I'd share what I do to manage the herculean task of debugging something the size of Idosra's Vallas Engine. The last time I counted, I had around 35,000 lines of code in 1 MB of .hx Haxe files. I've added and deleted a lot since then, but the grand total hasn't changed much. That puts it at around the same storage size as the Stencyl engine, although I tend to be wordier when it comes to comments. So there's a lot of room for something to break.

If When Things go Wrong... The first principle I adhere to is controlled failure--anticipate where things might go wrong and fail on purpose to prevent a larger catastrophe. A good example of this kind of engineering is the case of the Boeing 737 fuselage. The B737 fuselage is lined with metal strips forming a 10in by 10in grid. Should a break in the fuselage occur in flight, the tear will be stopped by the metal strip, opening a small hole which lets the cabin depressurize. This controlled failure prevents the worse catastrophe of an explosive decompression.

In my case, I have the debug console, which has been featured at the start of the last few videos I've uploaded. I've posted a few screenshots of the "Guru Meditation" errors, such as the one below.


So what's going on is really pretty simple. Whenever I anticipate that a certain erroneous condition might occur, I add a bit of code that sets an error code and an explanation string with more information. My rendering routine checks the state of the error codes. If there are no error codes stored, it displays the screen. If there is an error code stored, it kicks out to the debug console and flashes the red "Guru Meditation" (which, BTW, is a play off of the Amiga system's "BSOD" type error).

Most of the time these error codes prevent the much more ambiguous "Error 1009" / "Segmentation Fault" errors that are a pain to diagnose. The first two letters tell me what file to find the broken code in, and the last two digits give me something to search for so I can pull that code up in a text editor. SR is Sector Renderer, so an SR code is an error in the code that draws a room.
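To make the idea concrete, here's a minimal sketch of how an error register like this could look in Haxe. It is not the actual Vallas Engine code: the names GuruError, raise and renderFrame, the "SR01" code and its message are all made up for illustration, and keeping only the first error is my own assumption.

```haxe
// Hypothetical sketch: none of these names come from the real engine.
class GuruError {
    public static var code:String = null;   // e.g. "SR01"; null means no error stored
    public static var detail:String = "";   // explanation string with more information

    // Assumption: keep the first error so a follow-on failure can't overwrite the root cause.
    public static function raise(newCode:String, newDetail:String):Void {
        if (code == null) {
            code = newCode;
            detail = newDetail;
        }
    }

    public static function hasError():Bool {
        return code != null;
    }
}

class Main {
    // Stand-in for the rendering routine: it checks the error state before drawing.
    static function renderFrame():String {
        if (GuruError.hasError()) {
            return "Guru Meditation " + GuruError.code + ": " + GuruError.detail;
        }
        return "scene";
    }

    static function main() {
        trace(renderFrame());                             // no error stored, so the scene is shown
        GuruError.raise("SR01", "sector has no floor tiles");
        trace(renderFrame());                             // now the debug console text is shown
    }
}
```

The point of routing everything through one raise function is that the two-letter prefix and the number only have to be written once, right next to the check that can fail, which is what makes the code searchable later.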

The error codes are a means of controlled failure that often keep the running program alive long enough to help me figure out the root cause of a problem. As many of us have experienced, often finding the bug takes much longer than fixing it. Anything that helps you pinpoint errors faster is going to speed up your debugging efforts.

So now... The Schrödinbug. What is a "Schrödinbug"? It's a cute little name for a type of bug--one of many (See Wikipedia entry). It's a bug that appears to lie dormant until the programmer notices the code shouldn't work. In actuality, it's code that never was correct, but the conditions for which the bug activates were not met.

My Schrödinbug came to light as I was adding code allowing the player to walk out of one map and into another. I modified the 2x2 "testquads" map, that I used for AI testing, to have a map exit. In this case--it exits to another copy of itself, but the actual map used doesn't matter. Exiting forces a scene change, which is what I need to test.

When I added my code to the rest of the player movement code, I noticed something was amiss. But--it's been working all summer. Surely there can't be an issue with it now?

Entering the "testquads" map... all looks okay.


Walk to the room to the south (lower left)...


Where's Marika??? Obviously something's wrong here. Now I do have sound implemented and I knew she was still "alive" because I could hear her footsteps. It's worth explaining real quick how actors in Idosra work. There's a small class hierarchy:

Character Data -> VActor -> Actor

A "character" is an entity that can exist on any map. Characters remain in memory no matter what scene/map is shown. Every character has a "VActor" instance, and every VActor has an "Actor" instance. An "Actor" is what you'd be used to from Stencyl--it's just a Stencyl actor. A "VActor" contains some additional data relevant to 3D physics and 2.5D rendering.

It's possible for a character to not have a VActor, which would be the case if they're on a different map than the player is. When the map changes, I loop through all the characters and create VActors and Actor instances for the characters that are now on stage. Of course, since the player is always on stage, the player should always have a VActor and Actor instance.
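As a rough sketch of that map-change loop, assuming made-up stand-in classes (StubActor, VActor and CharacterData here are simplified placeholders, not the engine's real definitions, and the map names are just sample data):

```haxe
// Hypothetical stand-ins; the real Stencyl Actor and engine classes are not shown here.
class StubActor {
    public function new() {}
}

class VActor {
    public var actor:StubActor;   // every VActor has an Actor
    public function new() {
        actor = new StubActor();
    }
}

class CharacterData {
    public var name:String;
    public var mapName:String;       // map the character is currently on
    public var vactor:VActor = null; // null while the character is off stage

    public function new(name:String, mapName:String) {
        this.name = name;
        this.mapName = mapName;
    }
}

class Main {
    // On a map change, give on-stage characters a VActor and drop it for everyone else.
    static function onMapChange(characters:Array<CharacterData>, newMap:String):Void {
        for (c in characters) {
            c.vactor = (c.mapName == newMap) ? new VActor() : null;
        }
    }

    static function main() {
        var marika = new CharacterData("Marika", "testquads");
        var golem = new CharacterData("Golem", "Foyer");
        onMapChange([marika, golem], "testquads");
        trace(marika.vactor != null); // true: she is on the current map
        trace(golem.vactor != null);  // false: he exists only as character data
    }
}
```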

The fact that I could "hear" Marika suggested her VActor instance was still alive, since that's where movement code is housed. But what about her Actor? Is it gone? Did it somehow get pushed to the very bottom layer, under the floors? Did it get misaligned with the camera, so it's always off screen? There are so many possibilities here.

I managed to walk her back to the door, to see if she'd reappear in the room she started in:


Nope! At this point I already suspected what might be the problem, but "LD" is the last error I ever want to see. "LD" is "LayerDrawStack", which is the code that solves the "Sorting Stumper" problem I wrote about in April.

"Elementary," said he.* So here's where we begin the detective work. I gave a technical overview of how rendering works in the "Sorting Stumper" link above, but the short version is this: Walls are drawn on layers and the layers are arranged in the right order from far to near. I compute the minimum number of layers needed to draw the walls correctly, and then compute which layer each wall should be drawn on. For actors, I basically draw them on the same layer as the floor they're standing on. There's a lot more to that if you want to read the details, but that's the gist of it.

When any actor moves through the room, I have to check that they're drawn on the right layer. When this layer changes, I pull them from the old layer and push them onto the new layer.

Error LD03 occurs when the code tries to pull the actor from a layer, but they're not on one. That would show up as either a crash or "Error 1009" without the error code, so LD03 is a lot more useful. I now know why Marika disappeared. She was pulled off of a layer, but never pushed onto a new one. And that error must have occurred when she walked through the door.
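Here's a small sketch of what a guarded pull could look like. The Layer class, the pull/push functions and the lastError variable are hypothetical names for illustration, and actors are plain strings to keep it short; the real LayerDrawStack code is more involved.

```haxe
// Hypothetical sketch of a guarded "pull"; not the actual LayerDrawStack code.
class Layer {
    public var actors:Array<String> = [];
    public function new() {}
}

class Main {
    static var lastError:String = null;

    // Remove an actor from a layer. If the actor isn't there, record LD03
    // instead of carrying on with a null reference.
    static function pull(layer:Layer, actor:String):Bool {
        if (!layer.actors.remove(actor)) {
            lastError = "LD03: tried to pull '" + actor + "' from a layer it is not on";
            return false;
        }
        return true;
    }

    static function push(layer:Layer, actor:String):Void {
        layer.actors.push(actor);
    }

    static function main() {
        var oldLayer = new Layer();
        var newLayer = new Layer();
        push(oldLayer, "Marika");
        pull(oldLayer, "Marika");   // succeeds: she was on the old layer
        // The matching push onto newLayer is skipped here to imitate the bug.
        pull(newLayer, "Marika");   // fails and records LD03
        trace(lastError);
    }
}
```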

So the "controlled failure" really did do its job. Imagine getting an "Error 1009" and trying to locate that. I think a lot of us have experienced that before.

My next strategy is to go into the suspect section of code (the movement code that handles sector changes) and stick a bunch of trace/print statements. Basically, print out the value of every variable and look for something that is wrong.

So what did it turn out to be? It seemed weird at first that adding a new door somehow "broke" another door, but once I got to the bottom of it, it made sense. It has to do with the "Growing Squares" algorithm that I use to partition the segments of the floors and walls into rectangles.

By adding a new door, it just so happened that the layer in the new room that Marika would enter at was changed to Layer 1. In the starting room, the layer Marika leaves the room from is Layer 1.

Now these are "different" Layer 1s, but the code wasn't checking for that. To check if the layer an actor is on needs to be updated, it checks if the layer index has changed. When Marika went from Layer 1 in the old room to Layer 1 in the new room, the code didn't see the two layers as two different layers. Hence, when Marika was plucked from the old room, she was never inserted into the new room. When I walked her back, she couldn't be pulled from the room again and so the error flagged.
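The difference between the two checks can be sketched like this. LayerRef and both function names are hypothetical, and this is only one plausible way to write the fix, not necessarily the change that went into the engine:

```haxe
// Hypothetical sketch: identify a layer by (room, index) rather than index alone.
class LayerRef {
    public var room:Int;    // which room the layer belongs to
    public var index:Int;   // layer index within that room

    public function new(room:Int, index:Int) {
        this.room = room;
        this.index = index;
    }
}

class Main {
    // Buggy check: two layers with the same index in different rooms look identical.
    static function changedByIndex(a:LayerRef, b:LayerRef):Bool {
        return a.index != b.index;
    }

    // Safer check: a layer change is a change in either the room or the index.
    static function changed(a:LayerRef, b:LayerRef):Bool {
        return a.room != b.room || a.index != b.index;
    }

    static function main() {
        var from = new LayerRef(0, 1);   // Layer 1 in the starting room
        var to = new LayerRef(1, 1);     // Layer 1 in the new room
        trace(changedByIndex(from, to)); // false: the move goes unnoticed
        trace(changed(from, to));        // true: the actor gets pushed onto the new layer
    }
}
```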

I think it's really a story emphasizing how important beta testers are. What are the odds that the layers would align themselves just so precisely as to trigger that bug? I think it's also important to have these kinds of error codes for your released games. Had this game been released and one of my players told me they got an "Error LD03", I'd have a hope of fixing it. If they just told me "the game crashed", then who knows.

In closing, it took me about four hours to diagnose this bug. The time it took to fix it? 15 seconds.

* Fun Fact: Sherlock Holmes never said "Elementary, my dear Watson". Also, my last name is "Watson", and I never hear the end of that quote.  :o

« Last Edit: July 28, 2018, 10:48:10 pm by merrak »

mdotedot

  • Posts: 1449
The art of debugging!
Quote
My next strategy is to go into the suspect section of code (the movement code that handles sector changes) and stick a bunch of trace/print statements. Basically, print out the value of every variable and look for something that is wrong.
This is pretty much how I do it.
And use the search function in the log viewer to get a view of occurrences. When on HTML5 I use the RightMouse->Inspect:Console

The problem with starting 'fresh', like you said, is that new bugs can emerge in 'fresh' code and you get into a loop.

You are 'lucky' when you have a reproducible situation. The hardest bugs are the ones that only happen when special conditions are met.

In other programming languages you can use debuggers that have breakpoints and inspectors to help. I'm not an expert on using them, but I'm sure that once you learn how to use them you can debug a lot faster (!)
What I did use was the HTML5 Console. There you can inspect variables and call functions.

Hanging out in the Chat:  http://www.stencyl.com/chat/

Proud member of the League of Idiotic Stencylers! Doing things in Stencyl that probably shouldn't be done.

merrak

  • Posts: 2092
I use gdb as a debugger, but I haven't learned how to use breakpoints. I can get a backtrace from a core dump, though. If I put some intentionally bad code in the error handler then I can also get a backtrace of the events that led up to an error code being thrown. That's pretty much the only way I'm able to get the drawing related ones, like "LD03".
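A different route to the same information, which is not the core dump trick described above, is to record the call stack at the moment the error code is set using Haxe's standard haxe.CallStack class. This is only a sketch with hypothetical function names (raise, pullFromLayer, moveActor), and how much detail the stack contains may depend on the target and build settings.

```haxe
import haxe.CallStack;

// Hypothetical sketch: store a stack trace alongside the error code.
class Main {
    static var errorCode:String = null;
    static var errorTrace:String = "";

    static function raise(code:String):Void {
        errorCode = code;
        // Capture where we are right now, so the chain of calls that led here is kept.
        errorTrace = CallStack.toString(CallStack.callStack());
    }

    // Made-up call chain standing in for the movement and drawing code.
    static function pullFromLayer():Void {
        raise("LD03");
    }

    static function moveActor():Void {
        pullFromLayer();
    }

    static function main() {
        moveActor();
        trace(errorCode);
        trace(errorTrace);
    }
}
```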

I never thought about the HTML5 debugger, but that console sounds really useful. I made an HTML5 version of the game a while ago and it seemed to work. I didn't try any of the editors, though.

Maps Monday. Speaking of editors, I made a map exit editor. Now it's easy to make doors into new maps!


I have fond memories of hand-configuring each exit in each of the 113 rooms of the game jam version of "Temple of Idosra" via editing the .xml files. These exits are a lot easier to work with. I can give them easy to remember names and offsets are automatically computed.

Because I now have map exits, I can finally walk to the top of the stairs in the "Water Cellar"--the map from my May 9 walkthrough video clip.

Here's what's at the top...


This map is "Foyer" and is one of the more complex maps I've made so far. It has about 25 sectors, multiple staircases, and twisty corridors. The back is dark and the front is brightly lit, which makes for a lot of rooms with deep contrast.


It's starting to look more like an actual game  :D

Combat is in a terrible state right now, so I need to address that. I can't kill any of the golems without taking a lot of damage, in part because striking distance isn't well defined. There's no visual feedback on whether or not anyone's hits are connecting. The golems also react very quickly because there's no "wind up" time for their attacks. I haven't thought about these things in a long time, so I need to turn my attention back to them.

merrak

  • Posts: 2092
Gameboy Jam 6. Against my better judgement I've taken the plunge into GBJAM 6. I'm not raising my hopes of finishing too high since the Vallas Engine is still missing several key features (like save-game!). Still, I've been working on this thing for a while, and it'd be nice to do something with it.

Day 1: Creating all new assets. The 160x144 resolution will be tricky to work with. I can have tiny rooms and normal size sprites, or tiny sprites and larger rooms. I've decided to go with the latter, since I'm not confident in-sector scrolling works well. Even at that, I can have rooms 8x8 tiles at 16x24 tile resolution. Tiles are half the size of those in the screenshots you've seen so far. But I have to make all new assets anyway in order to maintain compliance with the rules, so I'm not making unnecessary trouble for myself.

I forgot what a pain asset creation is. Here's my Day 1 screenshot:


In order to create the tiles, I draw the different faces and use Bash scripts and ImageMagick to combine them and create the different lighting versions. Most of the errors are a result of those scripts not assembling things correctly.

If nothing else, I'll get some good housekeeping out of this. I cut about 400 out of the 600 actors in the project file because they are obsolete.

One thing I'm very glad I had the foresight to do is program the rendering parts of the engine to work in tile units, not pixel units. I'm still going to have to go in and change velocity and acceleration values, since those are in pixel units... but that's not too bad. All physics data is specified in only a couple of places.

Edit. That's more like it...


The tiles are tiny! (Although the screen is actually twice as big as you're looking at here, since I scaled the image down 50%). I can't use the editor at the 160x144 resolution, so I'll have to change that when I'm done making the maps.

« Last Edit: August 18, 2018, 11:18:57 pm by merrak »

mdotedot

  • Posts: 1449
Quote
Bash scripts and ImageMagick

Ah .. Bash scripts .. I have used ImageMagick in the past. They are very powerful. You can create spritesheets and do all kinds of tricks with images.
Nice to read that there is another fan!

Quote
work in tile units, not pixel units.

That is interesting. I probably should have done my 3D extension the same way. Well, it's a bit too late for that now, but I have to remember this for a future project!!

Have fun with the jam!  I hope there will be something playable at the end :D



merrak

  • Posts: 2092
Quote
Nice to read that there is another fan!

ImageMagick is great. I use it for a lot of things. I recently discovered how to do batch processing with Audacity as well.

Another useful Haxe trick--get/set variables. Here's an example from one of my classes, "VActorType" (3D version of Stencyl's ActorType). The variable 'radius' defines the size of the circular base of the collision cylinder. Its size is saved in an xml config file, stored in tile units:

```haxe
public var radius:Float;       // Measurement in tile units
```

To convert to grid, I have the variable gridRadius. It is a read-only variable.

```haxe
public var gridRadius(get, never):Float;
```

Specifying 'get' necessitates a getter function.

```haxe
// Converts the radius from tile units to grid units (pixels) for the current map
function get_gridRadius():Float { return radius * MapNode.currentMap.tileSize; }
```
Now I can use 'gridRadius' like any other variable, and it automatically multiplies the radius by whatever the tile size is to convert to grid coordinates (pixels) for that particular map and scale.
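Put together, a self-contained version of the pattern might look like the sketch below. The MapNode class here is a bare stand-in I made up so the example compiles on its own (the real one does far more), and the type of tileSize and the sample values are assumptions.

```haxe
// Stand-in for the engine's map class; only what the getter needs.
class MapNode {
    public static var currentMap:MapNode;
    public var tileSize:Float;   // assumed: size of one tile in pixels

    public function new(tileSize:Float) {
        this.tileSize = tileSize;
    }
}

class VActorType {
    public var radius:Float;                  // measurement in tile units
    public var gridRadius(get, never):Float;  // read-only, computed on every access

    public function new(radius:Float) {
        this.radius = radius;
    }

    function get_gridRadius():Float {
        return radius * MapNode.currentMap.tileSize;
    }
}

class Main {
    static function main() {
        var type = new VActorType(0.5);

        MapNode.currentMap = new MapNode(32);
        trace(type.gridRadius);   // 0.5 tiles on a 32 pixel map

        MapNode.currentMap = new MapNode(16);
        trace(type.gridRadius);   // same type, half the pixels on a 16 pixel map

        // type.gridRadius = 3;   // would not compile: the setter is 'never'
    }
}
```

Because the value is computed on access rather than cached, switching maps or scales needs no extra bookkeeping.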

merrak

  • Posts: 2092
Day 2. I now have a correctly rendered scene  8) I've tentatively titled the game "Tower of Vallas", a throwback to the original "Thief of Vallas" name that got dropped.


One thing that really helps speed up drawing tiles is lighting--I don't have to shade anything because the renderer does that for me. In some ways I like this new tileset better than the one I've been working with for Idosra. Idosra only had one kind of window, whereas here I have several different kinds. I don't have that many kinds of floors or pillars, though, but that's mostly due to the lack of room for any detail. Sadly, I don't have enough detail for the cat face walls.

This room is 8x8x4 tiles, but I think 10x10x3 would make more sense. Larger rooms make more sense than taller rooms--and I can always use multiple sectors for deep pits.

Next up: redoing all my UI assets and title screen art, new fonts, and other odds and ends.