What is RE-Google

RE-Google is a plugin for the Interactive DisAssembler (IDA) Pro that queries Google Code for information about the functions contained in a disassembled binary. The top results are then displayed as comments on each function and can be opened by just clicking on them.

The top results will often tell you what the function is actually doing or what you will find inside it.

Download

The current version (v.0.1) is half-way stable. Click here to download RE-Google

The source code is available from our subversion repository:
svn co https://svn.carnivore.it/tools/REGoogle/ REGoogle

How to use it

There is just one script (REgoogle.py), which you have to execute from within IDA. The standard configuration will enumerate all functions and set the results.

RE-Google can be configured in the top section of the script. Options include, for example, querying only for the current function or defining blacklists of entries that should not be included in the results.
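To give an idea of what such a configuration section could look like, here is a minimal sketch. The names SEARCH_ALL_FUNCTIONS and AFTER_QUERY_WAIT are the ones mentioned in the FAQ on this page; their values, the unit of the wait time, the name RESULT_BLACKLIST and the helper filter_results are assumptions made for illustration and are not taken from REgoogle.py.

```python
# Sketch of a configuration section. Values are assumptions, check the
# "Configuration" part at the top of REgoogle.py for the real ones.

SEARCH_ALL_FUNCTIONS = False  # name from the FAQ; False = current function only
AFTER_QUERY_WAIT = 10         # name from the FAQ; unit (seconds) is assumed

# Hypothetical option name: substrings of results that should be dropped.
RESULT_BLACKLIST = ["/test/", ".html"]


def filter_results(results, blacklist=RESULT_BLACKLIST):
    """Return only those results that contain no blacklisted substring."""
    return [r for r in results if not any(b in r for b in blacklist)]
```

With these assumed values, filter_results(["src/a.c", "doc/index.html"]) keeps only "src/a.c".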

Requirements

The whole story

Some people say "Reverse Engineering is an art". Well, this might be true if you consider stuff like mathematics an art. It is more an application of standard methods that evolve constantly. Actually, everybody can learn these methods and start to RE executables. With this plugin, even your granny can start reversing :)

Reverse engineering is like solving a jigsaw puzzle. In order to see the whole picture you need to find the corner pieces, then the frame, and then work your way forward from there. The corner pieces for reversing are strings, constants and function names. The function names that people normally start with are the ones imported from shared libraries (e.g. DLLs). Strings contain human-readable hints about the functionality. Specific constants add more clues to solve the puzzle and can sometimes even be used to identify certain (types of) algorithms. The imported functions tell you which actions the code performs.

The major problem is that a lot of experience is needed to recognize strings and constants or to know what a combination of imported functions may result in. But why don't we use the combined knowledge of many people to get this expertise? Google lets us search for exactly this.

Google code search is very valuable when trying to find algorithms or code excerpts that contain this information. Often the few results you see on one page can already tell you what the function might be doing.

This plugin enumerates all functions and extracts strings, constants (also called immediate values), and the names of imported functions. If there is sufficient data, a Google Code search is performed and the result is added to the IDA database as a function comment. Reviewing these comments sometimes makes the analysis of the function unnecessary and saves time.
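The step from extracted data to a search query can be sketched roughly as follows. This is not the code from REgoogle.py: the function name build_query, the threshold MIN_TERMS, the minimum string length and the set BORING_CONSTANTS are all assumptions made for illustration, and the sketch deliberately leaves out the IDA-specific extraction so it runs as plain Python.

```python
# Sketch only: how strings, constants and imports of one function might be
# combined into a single query. All names and thresholds are hypothetical.

MIN_TERMS = 3  # assumed meaning of "sufficient data"

# Immediate values that show up everywhere and carry no information (assumed).
BORING_CONSTANTS = {0x0, 0x1, 0xFF, 0xFFFFFFFF}


def build_query(strings, constants, imports, min_terms=MIN_TERMS):
    """Return a query string, or None if the function offers too little data."""
    terms = []
    # Quote strings so they are searched as exact phrases; skip very short ones.
    terms += ['"%s"' % s for s in strings if len(s) >= 4]
    # Search immediate values in hexadecimal notation.
    terms += ["0x%X" % c for c in constants if c not in BORING_CONSTANTS]
    # Names of imported functions are used as they are.
    terms += list(imports)
    if len(terms) < min_terms:
        return None
    return " ".join(terms)


# Three of the initial hash values of SHA-512 as example constants.
print(build_query([], [0x6A09E667F3BCC908, 0xBB67AE8584CAA73B,
                       0x3C6EF372FE94F82B], []))
# prints: 0x6A09E667F3BCC908 0xBB67AE8584CAA73B 0x3C6EF372FE94F82B
```

A function with only one or two usable terms returns None here, which corresponds to skipping the search for lack of data.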

Example A:

It seems to be very likely that the considered function is SHA-512 based on the results shown above. And it is :)

Example B:
UPX0:0040D7A5 sub_40D7A5 ; src/iexplorer/greta/regexpr2.cpp
UPX0:0040EA6D sub_40EA6D ; src/iexplorer/greta/regexpr2.cpp
UPX0:004102B7 sub_4102B7 ; src/iexplorer/greta/regexpr2.cpp
...
UPX0:0041E163 sub_41E163 ; src/iexplorer/greta/regexpr2.cpp
UPX0:0042183F sub_42183F ; src/iexplorer/greta/regexpr2.cpp
UPX0:0040EE2F sub_40EE2F ; trunk/shareaza/RegExp/regexpr2.cpp
UPX0:0041E945 sub_41E945 ; trunk/shareaza/RegExp/regexpr2.cpp

These functions seem to be part of a library related to regular expression parsing. This saves some time because they no longer have to be investigated by hand.
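Spotting such clusters gets easier when the commented functions are grouped by the path in their comment. The following is a small sketch that works on a text listing in the format shown in Example B; the function name group_by_source is made up for this illustration and is not part of RE-Google.

```python
from collections import defaultdict


def group_by_source(listing_lines):
    """Map each comment (source path) to the functions that carry it.

    Expects lines of the form 'SEGMENT:ADDRESS NAME ; COMMENT'.
    Lines without a comment or without a function name are ignored.
    """
    groups = defaultdict(list)
    for line in listing_lines:
        if ";" not in line:
            continue
        location, comment = line.split(";", 1)
        fields = location.split()
        if len(fields) < 2:
            continue
        groups[comment.strip()].append(fields[1])
    return dict(groups)


listing = [
    "UPX0:0040D7A5 sub_40D7A5 ; src/iexplorer/greta/regexpr2.cpp",
    "UPX0:004102B7 sub_4102B7 ; src/iexplorer/greta/regexpr2.cpp",
    "...",
    "UPX0:0040EE2F sub_40EE2F ; trunk/shareaza/RegExp/regexpr2.cpp",
]
for path, functions in group_by_source(listing).items():
    print(path, functions)
```

Many functions pointing to the same file name, as with regexpr2.cpp here, are a strong hint that a whole library was linked in.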

Example C:
Some functions like the following only get a single result:
; openssl-0.9.8e/crypto/x509v3/v3_alt.c

Wow, perfect hit. And the result is pointing right to the source code. This will help when investigating related functions.

Example D:
Enough examples... Try it out yourself :)

Frequently Asked Questions

I get the error message "Too many Google queries in too little time."

Well, you are putting a lot of pressure on the Google services by sending too many queries too quickly. Raise the AFTER_QUERY_WAIT setting and get yourself two or three cups of coffee while waiting for the script to finish.
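What a wait between queries amounts to can be sketched like this. Only the name AFTER_QUERY_WAIT comes from the script; its value and unit (seconds), the function run_throttled and the search callback are assumptions for illustration, not the actual implementation.

```python
import time

AFTER_QUERY_WAIT = 10  # assumed to be in seconds; raise it if Google complains


def run_throttled(queries, search, wait=AFTER_QUERY_WAIT, sleep=time.sleep):
    """Run search(query) for each query, pausing 'wait' between two queries.

    'search' is a hypothetical callback that performs one query.
    'sleep' can be replaced to try the function out without really waiting.
    """
    results = []
    for index, query in enumerate(queries):
        if index > 0:
            sleep(wait)  # no pause needed before the very first query
        results.append(search(query))
    return results
```

With n queries the script spends at least (n - 1) times the wait just pausing, so a larger value on a big binary really does mean coffee time.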

You should also consider querying only for functions you are really interested in by going to that function and using the configuration setting SEARCH_ALL_FUNCTIONS = False.

Where do I find the configuration options?

It is all in the one single file (ugh, bad coding - I know). Look at the top part saying Configuration.

Credits

Thanks to Thomas Barabosch, Paul Mueller and Oliver Schmitt for contributing many good ideas. Special thanks to the Giraffe chapter of the Honeynet Project for the breeding ground.

License

RE-Google is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

Even though this is free software, I enjoy feedback... :)

Contact: felix.leder@gmx.net

©2009 - Felix Leder