Debugging Crashes

Debugging crashes is fun! Actually no. With a little massaging, you can use the information on the crash screen to get a better idea of what’s going wrong, though.

Symptoms #

A crash usually looks like this:

System ERROR
REBOOT    :[EXIT]
INITIALIZE:[EXE]
 TLB ERROR
 TARGET=D223420F
 PC    =081007C0

The lower three lines are the interesting ones: they give the fault type, the address of the memory access that caused the fault, and the PC value where the fault occurred (the meaning of the PC depends on the exception type). In this case, it’s a TLB fault when trying to access memory at 0xD223420F. It’s usually a safe assumption (no matter the fault type) that the crash was caused by an invalid memory access.
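If you copy the crash screen down as text, pulling the three fields out by hand gets tedious. The following is a minimal sketch in Python, not part of any official tooling: the function name `parse_crash_screen` is made up for illustration, and it assumes the screen always follows the layout shown above (fault type on the line directly before `TARGET`).

```python
import re

def parse_crash_screen(text):
    """Extract the fault type, TARGET and PC from crash-screen text.

    Assumes the layout shown above: the fault type sits on the line
    immediately before the TARGET= line (an assumption, not a spec).
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    fields = {}
    for i, line in enumerate(lines):
        m = re.fullmatch(r"(TARGET|PC)\s*=\s*([0-9A-Fa-f]+)", line)
        if m:
            # Store the address as an integer so it can be compared later.
            fields[m.group(1)] = int(m.group(2), 16)
            if m.group(1) == "TARGET" and i > 0:
                fields["fault"] = lines[i - 1]
    return fields

screen = """System ERROR
REBOOT    :[EXIT]
INITIALIZE:[EXE]
 TLB ERROR
 TARGET=D223420F
 PC    =081007C0
"""

info = parse_crash_screen(screen)
print(info["fault"], f"0x{info['TARGET']:08X}", f"0x{info['PC']:08X}")
```

Keeping the addresses as integers rather than strings makes it easy to compare them against section ranges later on.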

Examining #

By tweaking the linker options to emit an ELF file rather than the flat binary that is the default (you can do this just by commenting out the first line of the prizm.x linker script), we can get an idea of what memory regions are in use:

$ sh3eb-elf-objdump -hr SDLTest.elf

SDLTest.elf:     file format elf32-sh

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         0001b234  00300000  00300000  00000080  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata       00003300  0031b234  0031b234  0001b2b4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .data         00000088  08100004  0031e534  0001e604  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  3 .bss          0000225c  0810008c  0031e5bc  0001e68c  2**2
                  ALLOC
  4 .comment      00000011  00000000  00000000  0001e68c  2**0
                  CONTENTS, READONLY
  5 .debug_info   000033df  00000000  00000000  0001e69d  2**0
                  CONTENTS, READONLY, DEBUGGING
  6 .debug_abbrev 00001b1f  00000000  00000000  00021a7c  2**0
                  CONTENTS, READONLY, DEBUGGING
  7 .debug_loc    00001bed  00000000  00000000  0002359b  2**0
                  CONTENTS, READONLY, DEBUGGING
  8 .debug_aranges 000002b0  00000000  00000000  00025188  2**0
                  CONTENTS, READONLY, DEBUGGING
  9 .debug_line   00000bbf  00000000  00000000  00025438  2**0
                  CONTENTS, READONLY, DEBUGGING
 10 .debug_str    0000040e  00000000  00000000  00025ff7  2**0
                  CONTENTS, READONLY, DEBUGGING
 11 .debug_frame  000003fc  00000000  00000000  00026408  2**2
                  CONTENTS, READONLY, DEBUGGING
 12 .debug_ranges 000001e8  00000000  00000000  00026804  2**0
                  CONTENTS, READONLY, DEBUGGING

The .debug_* sections can be safely ignored for now, since they only provide machine-readable mappings of addresses to names. Of particular interest are the .text, .data and .bss sections, which contain the code, initialized writable data, and uninitialized writable data respectively.

Referring to the error message, we attempted to access memory at 0xD223420F, which is far outside any expected range. The PC was 0x081007C0, which is suspicious: that address lies in .bss (0x0810008C through 0x081022E7), where code should never be executing. This is usually a symptom of a smashed stack, and it makes debugging very difficult, since the fault comes from executing bogus code and we have no way to retrieve a stack trace to see what went wrong earlier.
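The range check described above is easy to get wrong by eye, so here is a small sketch that does it mechanically. The section names, VMAs and sizes are copied by hand from the objdump listing above; the helper name `find_section` is hypothetical, and the script does not read the ELF file itself.

```python
# (name, VMA, size) copied by hand from the objdump -h listing above.
SECTIONS = [
    (".text",   0x00300000, 0x0001B234),
    (".rodata", 0x0031B234, 0x00003300),
    (".data",   0x08100004, 0x00000088),
    (".bss",    0x0810008C, 0x0000225C),
]

def find_section(addr):
    """Return the name of the section containing addr, or None."""
    for name, vma, size in SECTIONS:
        # A section covers the half-open range [vma, vma + size).
        if vma <= addr < vma + size:
            return name
    return None

for label, addr in (("TARGET", 0xD223420F), ("PC", 0x081007C0)):
    where = find_section(addr) or "outside all listed sections"
    print(f"{label} 0x{addr:08X}: {where}")
```

For the crash shown here, this reports the PC as falling in .bss and the TARGET as outside every listed section, matching the reasoning above. Note that only the four loadable sections are checked; the stack and any heap are not part of this table.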

It appears that this particular crash was caused by a NULL pointer dereference, which is unusual (you wouldn’t expect that to make the system begin executing in .bss). Further additions to this page will probably be forthcoming as more experiments are performed.

Exception Names #

The third line on the System Error dialog tells you what exception the OS handled. Please see more details here.