DLL Internals

A Dynamic Link Library (DLL) is Microsoft's implementation of the shared library concept on the Windows platform. It uses the Portable Executable (PE) file format, the same format used by Windows EXE files. These files generally have the extension .dll or .ocx. A DLL can export both functions and data for use by other modules. Each process receives its own copy of the DLL's data but shares the same copy of its code.

Static Library vs. Dynamic Library

With a static library, the linker merges the functions into the executable file. The implementation code lives in a .lib file. Linking against a static library increases the size of the executable. This method avoids loading any code dynamically at run time, which may even improve performance in some cases. The disadvantage is that the code is duplicated in every executable that refers to the library.

A dynamic library holds a set of functions and data. During linking, the linker adds a stub (just a pointer into the DLL) to the EXE file for each DLL function it calls. This lets you separate commonly used functions and place them in a DLL. The DLL can be referenced from multiple applications while only one copy exists, both in memory and on disk. The disadvantage of DLLs is that they do not solve the versioning problem.

Types of Dynamic Linking

Load-time dynamic linking: The module makes explicit calls to exported DLL functions as if they were located locally. This requires you to link against a .lib file (the import library) that is generated when the DLL is built. The import library supplies the information needed to load the DLL and locate the exported functions when the application is loaded. When an application is launched, the loader loads all the DLLs it depends on; if the loader fails to locate a DLL, it displays an error message such as “unable to locate DLL” and stops launching the application.

Run-time dynamic linking: To use this method you need to know the prototype of the function. You load a specific DLL with the LoadLibrary() function, which returns a handle. You pass that handle to GetProcAddress() to get the address of a function in memory, and then call the function through that address. The FreeLibrary() function is used to unload the DLL from memory.
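The three calls used in run-time dynamic linking can be put together in a short C sketch. This is only an illustration under some assumptions: the helper name call_exported() is made up for this example, the exported function is assumed to have the TestFunc(int, int, int) prototype used later in this article, and the arguments 1, 2, 3 are arbitrary. On non-Windows systems the sketch compiles to a stub that simply reports failure.

```c
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>

/* Signature we ASSUME the exported function has; it must match the DLL. */
typedef void (*TestFuncPtr)(int, int, int);

/* Loads dll_path, calls func_name(1, 2, 3), then releases the DLL.
   Returns 0 on success, a negative value on failure.
   call_exported is a hypothetical helper name, not a Windows API. */
int call_exported(const char *dll_path, const char *func_name)
{
    assert(dll_path != NULL && func_name != NULL);

    HMODULE dll = LoadLibraryA(dll_path);      /* map the DLL into this process */
    if (dll == NULL) {
        fprintf(stderr, "LoadLibrary failed: %lu\n", GetLastError());
        return -1;
    }

    /* Look up the export by name; an ordinal could be used instead. */
    TestFuncPtr fn = (TestFuncPtr)GetProcAddress(dll, func_name);
    if (fn == NULL) {
        fprintf(stderr, "GetProcAddress failed: %lu\n", GetLastError());
        FreeLibrary(dll);
        return -2;
    }

    fn(1, 2, 3);                               /* call through the pointer */
    FreeLibrary(dll);                          /* drop our reference to the DLL */
    return 0;
}
#else
/* Not Windows: these APIs do not exist here, so the sketch only reports failure. */
int call_exported(const char *dll_path, const char *func_name)
{
    assert(dll_path != NULL && func_name != NULL);
    return -1;
}
#endif
```

A call such as call_exported("test.dll", "TestFunc") would then load the DLL, run the function once, and unload it again. Unlike load-time linking, a missing DLL here is an ordinary error the program can handle instead of a failed launch.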

Advantages of DLL

  1. Multiple processes can load the same DLL, at the same or at different base addresses, and still share the same physical copy of the DLL in memory. This saves memory and reduces swapping.
  2. When a function within a DLL changes because of a bug fix or other improvement, applications do not need to be recompiled or relinked, provided that the function's parameters, calling convention, and return values do not change.
  3. Programs written in different languages can call the functions of a DLL, as long as they use the same calling convention.

Function exporting from DLL

You can write a function in a DLL and export it. The function prototype looks like the following:

__declspec(dllexport) void TestFunc(int i, int j, int k);

This function is added to the export table of the DLL. Every binary, whatever its extension (.EXE, .DLL, .SCR, .OCX, etc.), can have both an import table and an export table, since all of them use the same PE file format on Windows. Sections such as export and import are created only as needed: if a DLL exports any function, the linker creates an export section and adds the function to it. A DLL that exports no functions is of little use to other modules. To check which functions a DLL exports, you can use dumpbin.exe /EXPORTS (this utility ships with Visual Studio):

Relative Virtual Address (RVA): This is an address expressed relative to the base address at which the image is loaded, not an absolute address. Once the DLL is loaded into memory, adding the RVA to the base address where the DLL was loaded gives the actual address of a function in memory. You can pass either an ordinal or a function name to GetProcAddress() to get a function's address in memory.
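The base-plus-RVA arithmetic is small enough to write down as a C sketch. The function name rva_to_va() and the sample numbers in the usage note are made up for illustration; they do not come from any real DLL.

```c
#include <assert.h>
#include <stdint.h>

/* Virtual address = base address where the module was loaded + RVA.
   rva_to_va is a hypothetical helper name used only in this sketch. */
uintptr_t rva_to_va(uintptr_t load_base, uint32_t rva)
{
    assert(load_base <= UINTPTR_MAX - rva);  /* guard against wrap-around */
    return load_base + rva;
}
```

For example, a function with RVA 0x1000 in a DLL loaded at 0x10000000 ends up at 0x10001000. If the same DLL is loaded at 0x20000000 in another process, the same RVA gives 0x20001000. The RVA stays fixed and only the base changes.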

Importing a function from a DLL

You can import an exported function of a DLL into your module (which can be another DLL or an EXE file). The prototype should look like the following:

__declspec(dllimport) void TestFunc(int i, int j, int k);
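A common way to keep the export and import prototypes in sync is to hide the __declspec behind one macro in a shared header. The sketch below shows that idiom. The names MYLIB_BUILD, MYLIB_API and AddThree are hypothetical and chosen only for this example (AddThree returns a value so that its effect is easy to check). The header part and the definition are shown together to keep the sketch self-contained.

```c
/* MYLIB_BUILD and MYLIB_API are hypothetical names chosen for this sketch. */
#define MYLIB_BUILD 1   /* normally defined only in the DLL project's build settings */

#if defined(_WIN32) && defined(MYLIB_BUILD)
#  define MYLIB_API __declspec(dllexport)   /* compiling the DLL itself */
#elif defined(_WIN32)
#  define MYLIB_API __declspec(dllimport)   /* compiling a client of the DLL */
#else
#  define MYLIB_API                         /* non-Windows: no decoration needed */
#endif

/* Shared declaration: the same text is seen by the DLL and by its clients. */
MYLIB_API int AddThree(int i, int j, int k);

/* Definition, compiled into the DLL. */
MYLIB_API int AddThree(int i, int j, int k)
{
    return i + j + k;
}
```

A client would include only the declaration and would not define MYLIB_BUILD, so the very same prototype expands to the dllimport form on its side.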

If your test.exe imports TestFunc(), the linker records in test.exe which functions it imports and from which DLL. You can see this with dumpbin /IMPORTS:

After loading a DLL into memory, the loader modifies the Import Address Table (IAT) of the importing module. The loader fills this table based on the location where the DLL was loaded in memory: the base address of the DLL plus the RVA of a function gives the starting address of that function. This is how the module can call the function as if it were an internal function. For example, when I call the Win32 function GetCurrentThread(), the linker places a stub like the following inside the EXE file:

The slot __imp__GetCurrentThread (location: 0x405000) holds the valid run-time address, written by the loader while loading the DLL. When the code calls the function, it jumps to the address stored at 0x405000. These are the 4 bytes found at address 0x405000:

0x77E6C242 (x86 systems store data in reverse, little-endian byte order) is the actual address of the Win32 function GetCurrentThread() in kernel32.dll. The following is the disassembled code of that function:
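Setting the disassembly aside, the two ideas in this passage (an address stored low byte first, and a call that goes through a slot patched by the loader) can be tried out in plain C. This is a toy model, not what the Windows loader actually runs. All names are hypothetical, fake_dll_function() merely stands in for the real function inside the DLL, and the byte sequence in the usage note is derived from the address quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reassemble a 32-bit value from 4 bytes stored low byte first (little-endian). */
uint32_t read_le32(const unsigned char bytes[4])
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* Toy import slot: plays the role of __imp__GetCurrentThread. */
typedef int (*ImportedFn)(void);
static ImportedFn import_slot = NULL;

/* Hypothetical stand-in for the real function that lives inside the DLL. */
static int fake_dll_function(void)
{
    return 42;
}

/* What the loader does at load time: store the resolved address in the slot. */
void toy_loader_patch_slot(void)
{
    import_slot = fake_dll_function;
}

/* What the calling code does: an indirect call through the slot. */
int call_through_slot(void)
{
    assert(import_slot != NULL);  /* the slot must be patched before the first call */
    return import_slot();
}
```

Feeding read_le32() the bytes 42 C2 E6 77 gives back 0x77E6C242, which is why the bytes in memory look reversed compared with the address. The calling code never needs to know where the DLL was loaded; it only reads whatever the loader stored in the slot.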

Loading Library

Each DLL loaded by a process is mapped into the virtual address space of that process. Once it has loaded successfully, the process can call any of the functions located inside the DLL. The system maintains a per-process reference count for each DLL. The count is decremented when the process terminates or unloads the DLL. When it reaches zero, the system unloads the DLL from memory and releases the physical memory allocated for it. Like any other function, an exported function runs in the context of the calling thread. Therefore the following conditions apply:

  • The process can use any handle opened inside a function of the DLL
  • The DLL uses the calling thread's stack
  • The DLL uses the virtual address space of the calling process
  • A DLL function allocates memory from the calling process
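The reference counting described before this list can be modelled in a few lines of C. This is only a toy model of the bookkeeping, with hypothetical names (ToyModule, toy_load, toy_free); the real counters live inside the system and are not exposed like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the per-process bookkeeping of one DLL. */
typedef struct {
    int  ref_count;  /* outstanding loads in this process */
    bool mapped;     /* is the DLL currently mapped into the process? */
} ToyModule;

/* Models a load: the first one maps the DLL, later ones only bump the count. */
void toy_load(ToyModule *m)
{
    assert(m != NULL);
    if (m->ref_count == 0)
        m->mapped = true;
    m->ref_count++;
}

/* Models an unload: the DLL is unmapped only when the count reaches zero. */
void toy_free(ToyModule *m)
{
    assert(m != NULL && m->ref_count > 0);
    m->ref_count--;
    if (m->ref_count == 0)
        m->mapped = false;
}
```

Two loads followed by one free leave the DLL mapped; only the second free brings the count to zero and unmaps it. This is why every successful load should be paired with exactly one unload.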

Windows NT DLL Management

When the first process loads a DLL into memory, the system makes all of the DLL's data and code pages read-only. If another process loads the same DLL, it shares the same physical pages, as long as neither process writes to these pages, as shown in the following figure:

If process1 writes any data to DLL memory, the OS copies the content of the physical page to a new physical page and updates the virtual memory map of process1. Both processes now have their own copy of the physical page in memory. For this reason neither can write to the other's memory and both are protected, as in the following figure:

The same physical pages are shared only if both processes map the DLL at the same virtual address. The Copy-On-Write protection flag lets the system detect when either process tries to write to DLL memory. If for some reason one of them fails to load the DLL at its default base address and loads it somewhere else, Copy-On-Write forces some of the DLL's code pages to be copied to different physical pages, because the address fix-ups for jump instructions are written into the DLL's pages and differ for each process. If the code section contains many references to the data section, this can cause the entire code section to be copied to new physical pages.
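The copy-on-write behaviour can be imitated with ordinary heap memory. The sketch below is a user-mode toy, not how the Windows memory manager is implemented. The types and functions (ToyPage, ToyMapping, toy_write and so on) are hypothetical, and the "page" is only 16 bytes so the example stays readable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 16  /* tiny "page" so the sketch stays readable */

typedef struct {
    unsigned char bytes[TOY_PAGE_SIZE];
    int sharers;  /* how many process mappings point at this physical page */
} ToyPage;

typedef struct {
    ToyPage *page;  /* physical page behind this process's virtual page */
} ToyMapping;

/* Allocate a zero-filled physical page with one sharer; NULL on failure. */
ToyPage *toy_page_new(void)
{
    ToyPage *p = calloc(1, sizeof *p);
    if (p != NULL)
        p->sharers = 1;
    return p;
}

/* Map an existing physical page into another process (shared, no copy). */
void toy_map_shared(ToyMapping *m, ToyPage *p)
{
    assert(m != NULL && p != NULL);
    m->page = p;
    p->sharers++;
}

/* Write one byte; copy the page first if it is shared. Returns 0 on success. */
int toy_write(ToyMapping *m, size_t offset, unsigned char value)
{
    assert(m != NULL && m->page != NULL && offset < TOY_PAGE_SIZE);
    if (m->page->sharers > 1) {
        ToyPage *copy = malloc(sizeof *copy);
        if (copy == NULL)
            return -1;
        memcpy(copy->bytes, m->page->bytes, TOY_PAGE_SIZE);
        copy->sharers = 1;
        m->page->sharers--;   /* the writer leaves the shared page */
        m->page = copy;       /* remap the writer to its private copy */
    }
    m->page->bytes[offset] = value;
    return 0;
}
```

As long as both mappings only read, they point at the same ToyPage. The first toy_write() through one mapping gives that mapping a private copy and leaves the other mapping's bytes untouched, which mirrors the process1 case described above.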

DLL versions

Checking DLL versions is especially important when tracking a bug that occurs at a client site but cannot be reproduced on the development system. First collect the list of dependent DLLs loaded along with the binary, and the version of each, from the client system; then compare them with the development system. This should be the default first step in debugging such bugs. If the DLL versions all match, you can continue debugging. You can use the Dependency Walker utility (www.dependencywalker.com), which provides a lot of information about DLLs:

The green-outlined window shows the binary and all the DLLs it depends on. The brown-outlined window shows the functions a module imports from each DLL. The blue-outlined window shows all the functions a particular DLL exports. The red-outlined window lists all the DLLs loaded in memory along with additional information such as file time stamp, link time stamp, file size, DLL versions, subsystem, CPU, etc.
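Once the version numbers from the client and development systems are written down, comparing them is mechanical. The C sketch below assumes versions in the usual four-part dotted form and uses hypothetical names (ToyVersion, toy_parse_version, toy_compare_versions). It does not read the version from a DLL file; it only compares strings you have already collected.

```c
#include <assert.h>
#include <stdio.h>

/* Four-part version in the form major.minor.build.revision. */
typedef struct {
    unsigned part[4];
} ToyVersion;

/* Parse "a.b.c.d"; returns 0 on success, -1 if the text is malformed. */
int toy_parse_version(const char *text, ToyVersion *out)
{
    char extra;  /* catches trailing junk after the fourth number */
    int fields;

    assert(text != NULL && out != NULL);
    fields = sscanf(text, "%u.%u.%u.%u%c",
                    &out->part[0], &out->part[1],
                    &out->part[2], &out->part[3], &extra);
    return fields == 4 ? 0 : -1;
}

/* Returns <0, 0 or >0 when a is older than, equal to, or newer than b. */
int toy_compare_versions(const ToyVersion *a, const ToyVersion *b)
{
    assert(a != NULL && b != NULL);
    for (int i = 0; i < 4; i++) {
        if (a->part[i] != b->part[i])
            return a->part[i] < b->part[i] ? -1 : 1;
    }
    return 0;
}
```

With made-up version strings such as "4.2.0.6068" from the client and "4.2.0.6256" from the development machine, toy_compare_versions() reports that the client copy is older, which is exactly the kind of mismatch worth ruling out before debugging further.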

Installation of a Product

In earlier days, a common problem was that a product's installer would overwrite a DLL in the Windows directory just to make its own product work; this replaced the existing DLL and could make other products unusable, including OS applications.

Suppose a third party develops a compression library and sells it to various companies, which use the DLL to solve their business problems and ship products to customers. App1 and App2, both of which use this compression DLL, are installed on an end user's system. After some time the compression library changes, through bug fixes or new features. A new version of App1 ships with the new compression library and is installed on the same system, overwriting the old compression DLL. App2 may now fail to run for several reasons: the prototype of a function exported from the compression DLL might have changed, a return value might have changed, the calling convention might have changed, or the function name itself might have changed. This is the main disadvantage of DLLs: they fail to maintain versions. One option is to give each version of a DLL a different name, such as mfc40.dll, mfc42.dll, mfc71.dll, etc. This kills the advantages that DLLs provide, such as sharing the same physical pages in memory and space on disk. This is how COM/DCOM was born, to solve the versioning problem along with other problems.

To prevent system DLLs from being overwritten, Windows 2000 introduced Windows File Protection (WFP). Only OS packages can update these DLLs, not third-party applications. When you replace a DLL in the windows\system directory with a different version, no error message is displayed, but the system silently restores the original DLL. The original DLLs are kept in windows\system32\dllcache. This reduces the DLL hell problem (where installing one program breaks another) but may not eliminate it, since WFP can be bypassed by changing a registry key, and many malware programs do so.
