A Paimei tutorial - hands on pyDBG - part 1

A brief reversing tutorial

I recently ran into a problem: I needed to collect strategically important information about the heap while an application was executing. On Windows there are certain restrictions: tool suites like Valgrind with Chronicle-recorder didn't seem to exist at first. Sysinternals' VMMap by Russinovich and Cogswell is much too basic in its functionality. But there's IDA, there's Pedram's Paimei... and there's Grenier's Byakugan in the Metasploit project. And much more ;). In the following I'm only referring to Paimei, because I still have to sort some things out with WinDBG. It's an ugly monster that I still need to tame.

For whom this tutorial may be of interest:

  • people developing Windows exploits and searching for vulnerabilities
  • anyone who spent more than 5 minutes using IDA, Python, or pyDBG
  • Reverse engineering folks
  • deeply security-interested individuals, the common addicted seeker - this is beginner friendly and contains links to adequate documentation

Sources for the following tutorial are:

  • the original docs of Paimei
  • the Woodmann RCE forums' community efforts
  • The Mac Hacker's Handbook, where Paimei is mentioned briefly
  • especially Chuck, whose screenshots I stole. And Ricardo Narvaja, whose tutorials are a good starting point
  • Gray Hat Python by Justin Seitz - because Paimei builds on pyDBG and pyIDA.

Conventions in my Python code: I use the semicolon to mark significance in code. It's entirely optional. Furthermore I tend to skip the shebang, because Python 2.4 is old and normally shouldn't be used any longer. Call your 2.4 version directly via the command line, or configure your IDE. Well... there are some installation guides here and there. Read them all. On Windows you need Python 2.4. You can try 2.5 and recompile, but I don't know all the side effects for pyDBG. As far as I found out there are none.

First steps after installing: pyDBG

    from pydbg import *

    dbg = pydbg()              # simply instantiating
    # copy calc.exe there or somewhere else
    dbg.load(r"C:\copies\calc.exe");

If you want to refer to the documentation: here. It lists and explains all the features of pyDBG. Enumerating processes works with enumerate_processes(), which gives you (PID, name) pairs. Pass the PID listed for the process you want to dbg.attach() before calling dbg.run(). The breakpoint example in the next section passes the PID as a plain integer, so there is no need to convert it with hex(); hex() is only handy for printing.
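Picking the right PID out of that list can be sketched without a debugger at all. The snippet below is a standalone illustration in current Python, not part of pyDBG: find_pid and the sample list are made-up names and data, and the only assumption is the (PID, name) shape of the entries.

```python
def find_pid(processes, image_name):
    """Return the PID whose image name matches (case-insensitive), or None."""
    wanted = image_name.lower()
    for pid, name in processes:
        if name.lower() == wanted:
            return pid
    return None


# Hypothetical sample in the (PID, name) shape described above.
sample = [(4, "System"), (1234, "calc.exe"), (2210, "CRACKME.EXE")]
print(find_pid(sample, "crackme.exe"))
```

Comparing lower-cased names avoids missing a process just because Windows reports its image name in a different case than you typed.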

Setting breakpoints with pyDBG

Breakpoints are essential, because setting them and working with them is basic functionality common to every debugger. In Gray Hat Python Justin shows how to extend breakpoint handlers, and much more. I'll skip that level of detail for now (if you're deeply interested, you now know the way to go). Alternatively there are numerous pyDBG tutorials waiting to be found. In the following this small Windows executable is used.

    from pydbg import *
    from pydbg.defines import *

    def handler_breakpoint(dbg):
        # ignore the first, Windows-driven breakpoint
        if dbg.first_breakpoint:
            print "1st breakpoint - next step"
            return DBG_CONTINUE
        print "2nd breakpoint - check!"
        return DBG_CONTINUE

    # now we're creating our own instance
    dbg = pydbg()
    # and we register a breakpoint handler function
    dbg.set_callback(EXCEPTION_BREAKPOINT, handler_breakpoint)

    proc = dbg.enumerate_processes()
    PID = 0
    # we take the PID from the list entry
    # of the process we seek to attach to
    for x in proc:
        if x[1] == 'CRACKME.EXE':
            PID = x[0]

    # if the PID is still 0 something went wrong
    if PID != 0:
        dbg.attach(PID)
        recv = dbg.func_resolve("user32", "MessageBoxA")
        dbg.bp_set(recv)
        dbg.run();
    else:
        print 'Whoops - did you start the process?'
        raw_input()



I guess the last part needs some more comments: user32.dll is a core Windows subsystem DLL, and I now take the liberty to quote a figure from Microsoft® Windows® Internals, Fourth Edition by Russinovich and Solomon:

You simply know "MessageBoxA" because it's defined in the Windows Application Programming Interface. If you take a look at the linked Wikipedia section, the overview describes "User Interface" as follows: [...] to create and manage screen windows and most basic controls, such as [...] functionality associated with the GUI part of Windows. This functional unit resides in [...] user32.dll on 32-bit Windows. Check! We needed some background information, but that's it: the recv variable will contain the address of MessageBoxA, the point where the MessageBox of the Crackme is handled, because func_resolve() returns it. dbg.run() lets the attached process continue afterwards.

Taking control - get more tricky

pyDBG is kewl, because it works with Python not just in Python. You can take advantage of anything Python already has, too. Let's get hands on the EIP and jump into the context of the MessageBox event now.

    import sys
    from pydbg import *

    # start the program:
    dbg = pydbg()
    dbg.load(r"C:\CRACKME.EXE")
    dbg.debug_event_iteration();   # starts here

    # entry-point = break-point
    dbg.bp_set(0x401000)
    while not dbg.context.Eip == 0x401000:
        dbg.debug_event_iteration();
    print 'Got the EIP'

I simply recommend using IPython (free) or Wing's interactive console (has completion, dude!!). Or just your terminal of choice with the interpreter running in it. Don't try Cygwin's environment! And you cannot start this by double-clicking, because we return control of the debugged process to the interactive console.
The while loop will run until the EIP is 0x401000. That's the point we asked for, and there we drop out of the loop, knowing exactly where we are. hex(dbg.context.Eip) will return the EIP. dbg.read() and dbg.hex_dump() should look very familiar. Works sweet as.

The previous section showed how to resolve a function through the Windows API with pyDBG into a variable and use it as a breakpoint. Now we also use it as a point to jump to, and dump the context of that particular event function. [%] marks that I use an active console. This is not running scripted, and I skipped pasting the outputs:

    [%] recv = dbg.func_resolve("user32", "MessageBoxA");
    ...
    [%] dbg.bp_set(recv)
    ...
    [%] while not dbg.context.Eip == recv:
            dbg.debug_event_iteration();
    ...
    [%]

We drop off. The process is halted, frozen at the point of the MessageBox, ready for us to deconstruct the assembly at this point. For the output we want the hex addresses and the assembly. I just recommend not to put out the assembly... but anyhow:

  • hex(recv) prints out the address of the instruction.
  • dbg.disasm(recv) prints out the disassembled instruction at that address.

Iterating with a for loop here creates the comfort output:

    for x in range(20):
        hex(recv+x), dbg.disasm(recv+x)

Note that x starts at 0. And yes... we can put this in a script. No problem. dbg.run() will continue the process, as you'd expect. The only problem that remains is the entry point. We defined it by hand. That's unnecessary. Nevertheless one should know it's possible. To detect it automatically you need another, optional Python module that has to be installed.
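One caveat about the listing loop above: it advances one byte at a time, and x86 instructions vary in length, so many of the 20 offsets can land in the middle of an instruction and show misleading disassembly. A minimal sketch of walking instruction by instruction instead, in standalone current Python: walk_instructions, the decode callback and the FAKE_CODE table are all hypothetical stand-ins for a real disassembler, not pyDBG calls.

```python
def walk_instructions(start, count, decode):
    """Collect `count` (address, text) pairs, advancing by each instruction's size.

    `decode` is a hypothetical callback: address -> (size_in_bytes, text).
    """
    listing = []
    address = start
    for _ in range(count):
        size, text = decode(address)
        listing.append((address, text))
        address += size            # jump to the next instruction boundary
    return listing


# Made-up decoder table standing in for a real disassembler.
FAKE_CODE = {
    0x1000: (2, "mov edi, edi"),
    0x1002: (1, "push ebp"),
    0x1003: (2, "mov ebp, esp"),
}


def fake_decode(address):
    return FAKE_CODE[address]


for address, text in walk_instructions(0x1000, 3, fake_decode):
    print("%s  %s" % (hex(address), text))
```

The byte-wise loop is still fine for a quick look in the console; this variant only matters once you want a listing you can trust line by line.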

Entry-point detection with PEFile

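    import pefile

    pe = pefile.PE('CRACKME.EXE')
    # AddressOfEntryPoint is relative to the image base, so add ImageBase
    ep = pe.OPTIONAL_HEADER.ImageBase + pe.OPTIONAL_HEADER.AddressOfEntryPoint

This is another option for setting the entry-point breakpoint: automatic detection. The first examples just set it manually.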
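To see where those two header fields live, here is a rough sketch that reads them with nothing but the standard library. It is standalone current Python and only handles 32-bit PE32 images; entry_point_va and fake_pe32 are made-up helper names, and the header built by fake_pe32 is a bare-bones fake for demonstration, not a loadable executable.

```python
import struct


def entry_point_va(data):
    """Return ImageBase + AddressOfEntryPoint for a PE32 image held in bytes."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)   # offset of PE header
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature missing")
    opt = e_lfanew + 4 + 20        # skip signature and 20-byte COFF header
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic != 0x10B:
        raise ValueError("only PE32 is handled in this sketch")
    (rva,) = struct.unpack_from("<I", data, opt + 16)         # AddressOfEntryPoint
    (image_base,) = struct.unpack_from("<I", data, opt + 28)  # ImageBase
    return image_base + rva


def fake_pe32(image_base, rva):
    """Build just enough of a header for the parser above (demo data only)."""
    buf = bytearray(0x40 + 4 + 20 + 32)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x40)
    buf[0x40:0x44] = b"PE\0\0"
    opt = 0x40 + 4 + 20
    struct.pack_into("<H", buf, opt, 0x10B)
    struct.pack_into("<I", buf, opt + 16, rva)
    struct.pack_into("<I", buf, opt + 28, image_base)
    return bytes(buf)


print(hex(entry_point_va(fake_pe32(0x400000, 0x1000))))
```

With an image base of 0x400000 and an entry-point RVA of 0x1000 the sum is 0x401000, the address that was typed in by hand in the earlier examples.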

Okidoki - I can manually set breakpoints and detect them... Now hooking!

Letting pyDBG work with detected entry points is a piece of cake. However, before I go on with this tutorial I should mention that I do not officially support illegal file-cracking. The Crackme has been designed to allow a feeling of success while gaining deep knowledge about the specific debugging internals. Believe it! Okay, now let's log some APIs and dive into "hooking".

    from pydbg import *
    from pydbg.defines import *
    import utils            # belongs to pyDBG

    # for the API exit
    def MessageBox(dbg, args, ret):
        print 'Pause'
        print hex(args[0]), hex(args[1]), hex(args[2]), hex(args[3]), ret

    # for the API start
    def MessageBox2(dbg, args):
        print 'Enter'
        print hex(args[0]), hex(args[1]), hex(args[2]), hex(args[3])

    dbg = pydbg()
    dbg.load(r'C:\CRACKME.exe')
    dbg.debug_event_iteration()      # real start point
    dbg.bp_set(0x401000)
    while not dbg.context.Eip == 0x401000:
        dbg.debug_event_iteration()
    print 'Got the EIP'
    recv = dbg.func_resolve("user32", "MessageBoxA")

    # here's some new stuff actually
    # it will get explained in the following paragraph

    # 1. define a hook container with the help of pydbg's utils
    hooks = utils.hook_container()
    # 2. adding a hook point
    hooks.add(dbg, recv, 4, MessageBox2, MessageBox)
    # dbg = pydbg instance, recv = hook install-address,
    # 4 = how many arguments the target function takes
    #     and the function names follow afterwards
    dbg.run()
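A toy model may help when reading the script above. The class below is a standalone sketch in current Python and is not Paimei's implementation: ToyHookContainer, its call() method and fake_message_box are invented for illustration, and the callbacks take no dbg argument. It only mimics the order of events: the entry hook sees the arguments, the function runs, the exit hook sees the arguments plus the return value.

```python
class ToyHookContainer(object):
    """Toy stand-in for a hook container; not Paimei's implementation."""

    def __init__(self):
        self.hooks = {}

    def add(self, address, num_args, entry_hook=None, exit_hook=None):
        self.hooks[address] = (num_args, entry_hook, exit_hook)

    def call(self, address, stack_args, target):
        """Simulate one hooked call of `target` located at `address`."""
        num_args, entry_hook, exit_hook = self.hooks[address]
        args = list(stack_args[:num_args])
        if entry_hook is not None:
            entry_hook(args)          # fires before the function body runs
        ret = target(*args)
        if exit_hook is not None:
            exit_hook(args, ret)      # fires after the function returned
        return ret


log = []
hooks = ToyHookContainer()
hooks.add(0x1000, 4,
          lambda args: log.append(("Enter", args)),
          lambda args, ret: log.append(("Pause", args, ret)))


def fake_message_box(hwnd, text, caption, flags):
    return 1   # pretend the user clicked OK


hooks.call(0x1000, [0, 0x402169, 0x402160, 0x30], fake_message_box)
print(log)
```

Passing None for either hook skips that side, which mirrors the entry-only and exit-only variants of hooks.add() discussed in the text.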

Hooking means: we observe the process and change its flow to be able to monitor the data it accesses. The code here only demonstrates soft-hooking, because we're attached to the target process of the Crackme. Hard-hooking would mean jumping directly into the target's assembly. The functions here are non-intensive and aren't called frequently. Therefore soft-hooking is the technique of choice. We imported utils to be able to define a hook container and added one hook point. To just log the API exit: hooks.add(dbg, recv, 4, None, MessageBox). That works because the function prototype is: add(pydbg, address, num_arguments, func_entry_hook, func_exit_hook). As you might guess, to just log the API start: hooks.add(dbg, recv, 4, MessageBox2, None). The next step is to run this script, still with Python 2.4 if you didn't recompile pydasm yet, and to try to register in the Crackme in order to trigger the API calls. Enter something and look at your terminal output:

You will see "Enter" for every entry hook and "Pause" for every exit hook. MessageBox2 prints out the API arguments from the stack. If the API exited successfully we see a 1, printed out by the MessageBox function as the value "ret". MessageBox2 could do more:

    def MessageBox2(dbg, args):
        print 'Enter'
        esp = dbg.context.Esp
        a = dbg.read(esp, 100)       # stores the next 100 bytes in a
        print dbg.hex_dump(a, esp)   # formats them
        print hex(args[0]), hex(args[1]), hex(args[2]), hex(args[3])

This puts out more of the context we're examining. In this Stack-dump we see our input-values reflected in the context again.
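How the dump relates to args[0] to args[3] can be sketched offline. On entry to a 32-bit stdcall function such as MessageBoxA, the dword at ESP is the return address and the arguments follow in 4-byte slots. The snippet is standalone current Python; split_stdcall_frame is a made-up helper and every value in raw is invented sample data, not output from the Crackme.

```python
import struct


def split_stdcall_frame(stack_bytes, num_args):
    """Split raw bytes read at ESP on function entry into (return_address, args)."""
    values = struct.unpack_from("<%dI" % (num_args + 1), stack_bytes)
    return values[0], list(values[1:])


# Made-up 32-bit stack contents: return address, then four arguments.
raw = struct.pack("<5I", 0x0040137D, 0x000A0F2C, 0x00402169, 0x00402160, 0x30)
raw += b"\x00" * 80   # pad to the 100 bytes the hook reads

ret_addr, args = split_stdcall_frame(raw, 4)
print(hex(ret_addr), [hex(a) for a in args])
```

Two of the four MessageBoxA arguments are pointers to the text and the caption, so following them with another read is what turns the raw numbers into readable strings.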

Visualizing this all

Those who know me know: I like security-data visualization and have developed some kind of addiction to rendering network flows, binaries, and now execution flows. Paimei in particular has a graphing function that can help us gain some more understanding. Paimei comes with a bunch of IDAPython scripts. One of these is called pida_dump.py. Edit -> Plugins -> IDA-Python -> start it and store the *.pida file somewhere. Add it as a module to Paimei. When you installed Paimei you also installed uDraw. Start uDrawGraph.exe -server from the command line, or append -server to the shortcut's target after the closing quote. In any case it needs to listen. Connect the Paimei console to the uDraw server, which could be localhost:2542.

A right click reveals this functionality now. BinNavi or other tools work with IDA, too. Their results may be sweeter, but they're not free. And one particular thing they cannot do is work with OllyDbg. However imDBG has its own graphing function.

Try not to underestimate this. Most people assume they can cope without it while facing growing design complexity, and overlook more and more. A purely textual output allows high precision; a graphical one allows orientation. And of course this is scriptable. And we didn't even touch the heap. Thing is: I got hungry, and I'll complete this tutorial as soon as I can.

So long for now,
wishi
