Everyday Ghidra: Symbols — Prescription Lenses for Reverse Engineers — Part 1

When reverse engineering a closed-source binary with Ghidra or another software reverse engineering framework, a key objective is to recover information that clarifies the disassembled code: function names, prototypes, data types, constants, and enums. These elements, expressed as human-readable identifiers, simplify both programming and reverse engineering by giving a more intuitive view of the program’s state, much like reading a high-level language instead of assembly. Leveraging these symbols within Ghidra can significantly aid in understanding the program’s behavior.

A symbol in computer programming is a primitive data type whose instances have a human-readable form.

Symbols make code easier to understand. They can transform code without meaning or context…

Into something we can work with…

Symbols breathe life into reverse engineering and bring hope to the reverse engineer.

Symbol Information Sources

There are several ways to recover name and type information from closed-source binaries. Let’s start with named exports.

Exports

When a binary wants to provide functionality for other programs, it typically makes that functionality available via a reference in the export table. If a binary exports a function by name, that function name will be available in the export table.

Export Name Table

The export name table contains the actual string data that was pointed to by the export name pointer table. The strings in this table are public names that other images can use to import the symbols. (MSDN: Exports)

If we view the exports for a Windows binary, we can see all the functionality provided with useful names.

PS C:\> dumpbin /EXPORTS C:\Windows\System32\localspl.dll

Microsoft (R) COFF/PE Dumper Version 14.37.32825.0
Copyright (C) Microsoft Corporation.  All rights reserved.

Dump of file C:\Windows\System32\localspl.dll

File Type: DLL

  Section contains the following exports for LocalSpl.dll

    00000000 characteristics
     E64EF86 time date stamp
        0.00 version
         400 ordinal base
         124 number of functions
         123 number of names

    ordinal hint RVA      name

        419    0 000A34B0 ClosePrintProcessor
        420    1 000A3530 ControlPrintProcessor
        421    2 000332C0 DllMain
        422    3 000A35B0 EnumPrintProcessorDatatypesW
        423    4 000A3680 GetPrintProcessorCapabilities
        424    5 00029830 InitializePrintMonitor2
        425    6 0000B1B0 InitializePrintProvidor
        401    7 00073160 LclIsSessionZero
        402    8 00073180 LclPromptUIPerSessionUser
        426    9 000A8510 LocalAddForm
        427    A 000A8550 LocalDeleteForm
        428    B 000A8590 LocalEnumForms
        429    C 00076BB0 LocalReadPrinter
        430    D 000A85E0 LocalSetForm
        431    E 000A3730 OpenPrintProcessor
        432    F 000A37C0 PrintDocumentOnPrintProcessor
        433   10 00077EE0 SplAbortPrinter
        403   11 0005C490 SplAddCSRPrinter
        434   12 000A8620 SplAddForm
        435   13 00079FB0 SplAddJob
        436   14 000A1130 SplAddMonitor
        437   15 000A1520 SplAddPort
        438   16 000A16D0 SplAddPortEx
        439   17 000A3850 SplAddPrintProcessor
        440   18 0005E870 SplAddPrinter
        <several lines omitted>

Ghidra can take advantage of this readily available information and apply the function name as a symbol throughout the analyzed binary.
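
To make the idea concrete, here is a minimal Python sketch that turns a few rows of the dumpbin listing above into an RVA-to-name map, which is roughly the kind of lookup a tool needs in order to label functions. The names `SAMPLE`, `EXPORT_ROW`, and `parse_exports` are illustrative, and the regular expression assumes the simple four-column row layout shown above; forwarded exports and ordinal-only exports are not handled.

```python
import re

# A few rows copied from the dumpbin /EXPORTS listing above.
SAMPLE = """\
    ordinal hint RVA      name

        419    0 000A34B0 ClosePrintProcessor
        420    1 000A3530 ControlPrintProcessor
        421    2 000332C0 DllMain
        401    7 00073160 LclIsSessionZero
"""

# Assumed row layout: ordinal (decimal), hint (hex), RVA (8 hex digits), name.
EXPORT_ROW = re.compile(r"^\s*(\d+)\s+([0-9A-Fa-f]+)\s+([0-9A-Fa-f]{8})\s+(\S+)")


def parse_exports(text):
    """Return {rva: name} for every named export row found in the text."""
    names = {}
    for line in text.splitlines():
        match = EXPORT_ROW.match(line)
        if match:
            _ordinal, _hint, rva, name = match.groups()
            names[int(rva, 16)] = name
    return names


exports = parse_exports(SAMPLE)
for rva, name in sorted(exports.items()):
    print(f"{rva:#010x} {name}")
```

With a map like this, any address that resolves to one of these RVAs can be shown by name instead of as an anonymous function.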

You can view the binary exports in the Symbol Tree Window:

If an export is also called from within the binary, the call is labeled with the export’s name in Ghidra’s rendered pseudo-code.

Imports

On the opposite side of exports, we have imports. If the analyzed binary imports functionality from other libraries, it typically references those external functions by the names the other binary exports. These names are easily recovered from the binary’s import table.

The binary localspl.dll has several imports:

PS C:\> dumpbin /IMPORTS C:\Windows\System32\localspl.dll | more
    WS2_32.dll
              00000001 Characteristics
      0000000180134708 Address of HMODULE
      0000000180140380 Import Address Table
      000000018012AD90 Import Name Table
      000000018012BE98 Bound Import Name Table
      0000000000000000 Unload Import Name Table
                     0 time date stamp

                000000018003206B       Ordinal     4
                0000000180030EE2     2 FreeAddrInfoW
                0000000180030EF4       Ordinal   115
                0000000180030F18       Ordinal   116
                0000000180030F3C       Ordinal   111
                00000001800320FB       Ordinal     3
                00000001800320E9    4E WSASend
                00000001800320D7       Ordinal    22
                0000000180032035       Ordinal    21
                0000000180030E57     7 GetAddrInfoW
                00000001800320C5    20 WSACloseEvent
                00000001800320B3    25 WSACreateEvent
                00000001800320A1    58 WSASocketW
                000000018003207D       Ordinal    23
                0000000180032047    31 WSAGetOverlappedResult
                000000018003208F    4D WSAResetEvent
                0000000180032059       Ordinal     7

These imports come from several dynamically linked libraries (DLLs):

PS C:\> dumpbin /IMPORTS C:\Windows\System32\localspl.dll | findstr dll
Dump of file C:\Windows\System32\localspl.dll
    msvcrt.dll
    ntdll.dll
    RPCRT4.dll
    api-ms-win-core-threadpool-l1-2-0.dll
    api-ms-win-core-memory-l1-1-0.dll
    KERNELBASE.dll
    KERNEL32.dll
    api-ms-win-eventing-provider-l1-1-0.dll
    OLEAUT32.dll
    SspiCli.dll
    CRYPTSP.dll
    GDI32.dll
    USER32.dll
    ACTIVEDS.dll
    browcli.dll
    NTDSAPI.dll
    sfc_os.dll
    WINTRUST.dll
    WTSAPI32.dll
    SETUPAPI.dll
    CFGMGR32.dll
    drvstore.dll
    ext-ms-win32-subsystem-query-l1-1-0.dll
    Cabinet.dll
    <several lines omitted> 

Imports can be viewed in Ghidra’s Symbol Tree Window:

Not all imports are named (some are referenced by ordinal instead), but wherever a name is present, calls to that external function are labeled in the analyzed binary, which improves the reverse engineering results.
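
The split between named and ordinal-only imports can be sketched in a few lines of Python. The snippet below parses a fragment of the dumpbin /IMPORTS output shown earlier and records, per DLL, whether each entry carries a name or only an ordinal. The helper `parse_imports` and the three regular expressions are illustrative assumptions about the layout of this particular listing, not a general dumpbin parser.

```python
import re

# Fragment copied from the dumpbin /IMPORTS listing above.
SAMPLE = """\
    WS2_32.dll
              00000001 Characteristics
      0000000180134708 Address of HMODULE
                     0 time date stamp

                000000018003206B       Ordinal     4
                0000000180030EE2     2 FreeAddrInfoW
                00000001800320E9    4E WSASend
                0000000180032059       Ordinal     7
"""

# Assumed line shapes: a lone DLL name, "<addr> Ordinal <n>", "<addr> <hint> <name>".
DLL_LINE = re.compile(r"^\s*(\S+\.dll)\s*$", re.IGNORECASE)
BY_ORDINAL = re.compile(r"^\s*[0-9A-Fa-f]{8,16}\s+Ordinal\s+(\d+)\s*$")
BY_NAME = re.compile(r"^\s*[0-9A-Fa-f]{8,16}\s+([0-9A-Fa-f]+)\s+(\S+)\s*$")


def parse_imports(text):
    """Return a list of (dll, name_or_None, ordinal_or_None) tuples."""
    imports = []
    dll = None
    for line in text.splitlines():
        match = DLL_LINE.match(line)
        if match:
            dll = match.group(1)
            continue
        match = BY_ORDINAL.match(line)
        if match:
            imports.append((dll, None, int(match.group(1))))
            continue
        match = BY_NAME.match(line)
        if match:
            imports.append((dll, match.group(2), None))
    return imports


imports = parse_imports(SAMPLE)
named = [name for _dll, name, _ordinal in imports if name]
print(f"{len(named)} of {len(imports)} imports are named: {named}")
```

Entries that come back with only an ordinal need the exporting DLL’s own export table before a name can be attached to them.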

Type Information Provides New Lenses

Simply having names isn’t sufficient for reverse engineering large binaries. In addition to function names, acquiring data type information is also crucial.

Public Symbols

On Windows, type information can be obtained from readily available public symbols (available for most Microsoft OS binaries).

Here is a dump of public symbols from user32.dll:

As the dump shows, public symbols provide function names and type information for function parameters. Once a function prototype and its parameter data types are defined, Ghidra’s decompiler knows the type of each parameter and propagates it through the subsequent decompilation.

Here is the CreateFileW function prototype rendered using public symbol type information:

CreateFileW function prototype from public symbols

And the resulting decompilation respects the defined return type (HANDLE) and propagates it throughout.

Public Headers

Other times you can scrape type information from public header files, which your software reverse engineering tool can use to provide better decompilation and navigation.

From the Windows SDK we know the full CreateFileW function prototype:
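
For reference, the following Python sketch writes the documented prototype down as data and renders it as a C declaration. The parameter types and names follow the Windows SDK documentation for CreateFileW; the list `CREATEFILEW_PARAMS` and the helper `render_prototype` are illustrative names and are not part of Ghidra or the SDK.

```python
# Parameters of CreateFileW as documented in the Windows SDK: (type, name).
CREATEFILEW_PARAMS = [
    ("LPCWSTR", "lpFileName"),
    ("DWORD", "dwDesiredAccess"),
    ("DWORD", "dwShareMode"),
    ("LPSECURITY_ATTRIBUTES", "lpSecurityAttributes"),
    ("DWORD", "dwCreationDisposition"),
    ("DWORD", "dwFlagsAndAttributes"),
    ("HANDLE", "hTemplateFile"),
]


def render_prototype(return_type, name, params):
    """Render a C-style declaration from a return type, name, and parameter list."""
    args = ", ".join(f"{ptype} {pname}" for ptype, pname in params)
    return f"{return_type} {name}({args});"


PROTOTYPE = render_prototype("HANDLE", "CreateFileW", CREATEFILEW_PARAMS)
print(PROTOTYPE)
```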

If we can define a function prototype and types for each parameter, Ghidra can leverage that information and propagate it throughout the decompilation. Here is the function signature for CreateFileW leveraging extra type information:

CreateFileW function prototype fully defined

Notice the full definition of the dwCreationDisposition parameter. Its allowed values are easy to obtain from MSDN.

Then define that ENUM type in Ghidra:

With this extra type information the decompilation becomes even clearer.
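
The effect of an enum on readability can be illustrated outside Ghidra with a small Python sketch. The values below are the documented dwCreationDisposition constants for CreateFileW; the class name `CreationDisposition` and the helper `describe` are illustrative choices and do not correspond to any Ghidra type or API.

```python
from enum import IntEnum


class CreationDisposition(IntEnum):
    """Documented dwCreationDisposition values for CreateFileW."""
    CREATE_NEW = 1
    CREATE_ALWAYS = 2
    OPEN_EXISTING = 3
    OPEN_ALWAYS = 4
    TRUNCATE_EXISTING = 5


def describe(raw):
    """Return the enum member name for a raw constant, or its hex form if unknown."""
    try:
        return CreationDisposition(raw).name
    except ValueError:
        return hex(raw)


# A bare 3 in a call says little; OPEN_EXISTING states the intent.
print(describe(3))
```

The same substitution is what makes the decompiled call easier to read: the argument stops being a magic number and becomes a named choice.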

Check out the enhanced CreateFileW pseudo-code:

Debug Binaries and Private Symbols

If you are reversing a debug version of a binary, Ghidra can generally pull out the information and use it. Typically, if you have a debug version, there is no need to reverse as you most likely have the source.

As for private symbols, most Windows binaries don’t include private symbols. But that is not always the case…

Ghidra loading combase.pdb

The symbol file for combase.dll is massive and includes much more information than a typical PDB from Microsoft.

Perhaps, COM is so difficult that they want to provide reverse engineers and those trying to debug their software a glimmer of hope? 🙃


This is part one of a look at how symbols enhance reverse engineering and how Ghidra can take advantage of them. Stay tuned for part 2, where we walk through how to leverage Ghidra’s symbol acquisition automation.


Going Deeper

Everyday Ghidra

If you’re looking to get a foothold in reverse engineering using Ghidra, consider my training “Everyday Ghidra”.

Check out CLEARSECLABS for details on the latest course offerings.

Everyday Ghidra: Practical Windows Reverse Engineering.

This course provides a comprehensive guide to using Ghidra, covering fundamental operations to advanced techniques, with hands-on exercises on real-world Windows applications. It’s designed for those with foundational Windows and security knowledge, aiming to equip them with practical “everyday” reverse engineering skills using Ghidra.

This post is licensed under CC BY 4.0 by the author.