Debugging

Tonight I couldn’t sleep. What do I do when I cannot sleep? I debug!

As yesterday libvorbis was enough of an headache for me (ended up that the parameters are being reset by libvorbis itself because the third header has an error in parsing… now to find where the error is, that’s a good question), I decided to go with something easier, like Gentoo/FreeBSD/SPARC64 kernel debugging. No I’m not kidding, debugging the problem in the kernel is resulting easier and funnier than debugging an userland library… to decode audio files… that’s parsing a damn header!

Anyway, thanks to Javier (I won’t mistype his nick this time), I got into FreeBSD’s kernel debugging by building a kernel with -g and DDB support. Then, I easily got the trace of the kernel panic:

Tracing pid 1258 tid 100054 td 0xfffff80007aa7c80
panic() at panic+0xcc
trap() at trap+0x38c
-- fast data access mmu miss tar=0xc0b70000 %o7=0xc01ae8ec --
malloc_type_zone_allocated() at malloc_type_zone_allocated+0x14
malloc() at malloc+0x7c
hashinit() at hashinit+0x4c
nullfs_init() at nullfs_init+0x1c
vfs_modevent() at vfs_modevent+0x244
module_register_init() at module_register_init+0x58
linker_load_module() at linker_load_module+0x844
kldload() at kldload+0xf4
syscall() at syscall+0x334
-- syscall (304, FreeBSD ELF64, kldload) %o7=0x100a1c --
userland() at 0x40421288
user trace: trap %o7=0x100a1c
pc 0x40421288, sp 0x7fdffffde31
pc 0x10080c, sp 0x7fdffffdef1
pc 0x4020ab74, sp 0x7fdffffdfb1





this was tracing a kldload of nullfs, but any kldload produces errors, although they seem to be different from module to module.



db> x/ia malloc_type_zone_allocated,16
malloc_type_zone_allocated:     save            %sp, -0xc0, %sp
malloc_type_zone_allocated+0x4: call            critical_enter
malloc_type_zone_allocated+0x8: nop
malloc_type_zone_allocated+0xc: lduw            [%g7 + 0x3c], %g1
malloc_type_zone_allocated+0x10:        sllx            %g1, 6, %g3
------
malloc_type_zone_allocated+0x14:        ldx             [%i0 + 0x40], %g2
------
malloc_type_zone_allocated+0x18:        brz,pt          %i1, malloc_type_zone_allocated+0x38
malloc_type_zone_allocated+0x1c:        add             %g3, %g2, %g4
malloc_type_zone_allocated+0x20:        ldx             [%g3 + %g2], %g1
malloc_type_zone_allocated+0x24:        add             %g1, %i1, %g1
malloc_type_zone_allocated+0x28:        stx             %g1, [%g3 + %g2]
malloc_type_zone_allocated+0x2c:        ldx             [%g4 + 0x10], %g1
malloc_type_zone_allocated+0x30:        add             %g1, 0x1, %g1
malloc_type_zone_allocated+0x34:        stx             %g1, [%g4 + 0x10]
malloc_type_zone_allocated+0x38:        subcc           %i2, -0x1, %g0
malloc_type_zone_allocated+0x3c:        be,pn           malloc_type_zone_allocated+0x58
malloc_type_zone_allocated+0x40:        or              %g0, 0x1, %g2
malloc_type_zone_allocated+0x44:        sll             %g2, %i2, %g2
malloc_type_zone_allocated+0x48:        sra             %g2, 0x0, %g2
malloc_type_zone_allocated+0x4c:        ldx             [%g4 + 0x20], %g1
malloc_type_zone_allocated+0x50:        or              %g1, %g2, %g1
malloc_type_zone_allocated+0x54:        stx             %g1, [%g4 + 0x20]
malloc_type_zone_allocated+0x58:




This is the disassembly of the function that died, I’ve artificially separated the point where the crash happens from the rest of the code.
First of all, the thing that scared me was that even if I know nothing of SPARC assembler, and even my Intel assembler is pretty much limited to 8086 instructions (although I still remember most of them clearly, as I wrote an 8086 emulator when I was in high school), I was able to correlate more or less that code with



        struct malloc_type_internal *mtip;
        struct malloc_type_stats *mtsp;

        critical_enter();
        mtip = mtp->ks_handle;
        mtsp = &mtip->mti_stats[curcpu];
        if (size > 0) {
                mtsp->mts_memalloced += size;
                mtsp->mts_numallocs++;
        }
        if (zindx != -1)
                mtsp->mts_size |= 1 << zindx;
        critical_exit();




that is the code of the function in C source.



Now my problem is to find what causes the “fast mmu miss”, or in general the panic. The registers are funny:



db> show reg
g0          0xffffffffffffffff
g1          0xc04ce800  log_cdevsw+0x48
g2          0xffffffffffffffff
g3             0x870ad
g4          0xfffff8000040fff8
g5              0x1dfd  fpu_fault_size+0x1c49
g6          0xcb591980
g7          0xc054d7b0  pcpu0+0x1a90
i0                0x12  pcpup+0xb
i1          0xc04816e0
i2          0xcb590ab8
i3                 0xa  pcpup+0x3
i4          0xcb590b70
i5                 0x1
i6          0xcb590221
i7          0xc01d7bac  kdb_enter+0x34
tnpc        0xc01d7bb8  kdb_enter+0x40
tpc         0xc01d7bb4  kdb_enter+0x3c
tstate      0x441d001601
kdb_enter+0x3c: ta              %xcc, 1




as i0 is set to a quite low value (0x12) and the debugger tells me it’s referred to pcpup (“pcpu pointer”) address plus a value… the problem is that pcpup is .. uh.. loaded in g7:



#define PCPU_REG %g7

register struct pcpu *pcpup __asm__(__XSTRING(PCPU_REG));




I wonder if it’s a miscompile or simply ddb going crazy; still if I understood the ldx operator well enough, it’s trying to load data from g2 (that’s –1) into the address 0x42 … it does not feel too much right.



Anyway, will try to debug this further when I can find someone in Gentoo/SPARC team who can help me understanding SPARC assembler.
Share this:

				Click to share on Threads (Opens in new window)
				Threads
			

				Click to share on Facebook (Opens in new window)
				Facebook
			

				Click to share on Mastodon (Opens in new window)
				Mastodon
			

				Click to share on LinkedIn (Opens in new window)
				LinkedIn
			

				Click to share on Pocket (Opens in new window)
				Pocket
			

				Click to share on Reddit (Opens in new window)
				Reddit



			Assembler
FreeBSD
Gentoo
Gentoo FreeBSD
Klothos
SPARC


			
			
				
									
				
					
					
						Flameeyes					

					2831 posts



					


	
	
		
			
				
					
						
							
							
								
									
		
			Leave a ReplyCancel reply
			
				
									
									
					
					
							
		

		
		

		This site uses Akismet to reduce spam. Learn how your comment data is processed.
								
							
						
					
				

				
							
		
	



	
		
			Related Posts
			
				

	
		
		

			16.02.25
			Whirlpool AP330W: It’s A Trap
			
						
				

												
																	
										
									
									
								
									
										
											Flameeyes										
									

																	
							
											
			
					
	



	
		
		

			02.02.25
			Hardware Review: Ozlo Sleepbuds
			
						
				

												
																	
										
									
									
								
									
										
											Flameeyes										
									

																	
							
											
			
					
	



	
		
		

			19.01.25
			From Furniture Websites To Sweet Home 3D
			
						
				

												
																	
										
									
									
								
									
										
											Flameeyes										
									

																	
							
											
			
					
	



	
		
		

			05.01.25
			So, Could We Get Open Source Washing Machines?
			
						
				

												
																	
										
									
									
								
									
										
											Flameeyes

Popular tags

The Latest
View All

Whirlpool AP330W: It’s A Trap

Hardware Review: Ozlo Sleepbuds

From Furniture Websites To Sweet Home 3D

So, Could We Get Open Source Washing Machines?

Debugging

Leave a ReplyCancel reply