We carried out this project to determine whether a Microsoft Access application (version 2013 or later) can be controlled from a Java program, in order to enable its migration.
We followed a two-step approach:
- First, we made the Application Programming Interface (API) of the Access software usable from Java.
- Then, we studied this API to manipulate an Access application through the Access software.
The final objective is to make the architecture presented in the figure below (Figure 1) feasible.
DLL Reverse Engineering
The first step was to enable control of the Access software from Java. The main Java project for working with Microsoft Office is Apache POI. However, it does not support MS Access. The only project we know of that allows Access to be manipulated from outside is the interoperability project developed by Microsoft, but it targets the C# programming language and not Java, which is what we needed.
We therefore had two options for controlling the Access software from Java:
- Write the binding in C# and use it from Java
- Reverse engineer the Access DLL and the C# binding, and reuse them to create the same API in Java
DLL reverse engineering relies on a DLL exploration tool on Windows and on an existing application (TLBCodeGenerator) that generates the Java code needed to call the ActiveX DLL.
To convert the C# API to Java, we went through the following steps:
- Identify the ActiveX DLL of MS Access (MSACC.dll).
- Extract the definition file (.tlb) with the “Resource Hacker” tool. By default, an ActiveX DLL embeds its definition file but does not expose it.
- Run TLBCodeGenerator on the .tlb file of the main Access DLL: this step generates Java code that contains errors for every referenced DLL not yet included in the Java project.
- Each error indicates the Windows registry location of the DLL it depends on. We find the DLL with the Windows registry tool.
- Repeat the process for each missing DLL identified.
- Once no DLL is missing, add the newly migrated DLL to the original project as a Maven dependency.
In this way, the main Access DLL and its dependencies were migrated to Java.
We went through these steps manually, and five DLLs had to be processed to obtain a usable, compilable binding. It is nevertheless conceivable to automate the process for larger projects that need more DLLs.
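The automation mentioned above could take the form of a worklist loop: process the main DLL, collect the DLLs reported as missing, and repeat until nothing new appears. The following is only a minimal sketch under assumptions. The `generateBinding` function is hypothetical and stands in for the manual steps (extracting the .tlb, running the generator, reading the errors, looking up the registry); the DLL names in `main` are made up.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Sketch: generate bindings for a main library and everything it references. */
public class BindingWorklist {

    /**
     * Processes the root library, then every library reported as missing,
     * until no new dependency appears. Returns the libraries in processing order.
     *
     * @param root            library to start from (for example the main Access DLL)
     * @param generateBinding hypothetical step: generates the Java binding for one
     *                        library and returns the libraries it found missing
     */
    public static List<String> migrate(String root,
                                       Function<String, List<String>> generateBinding) {
        Set<String> processed = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(root);
        while (!pending.isEmpty()) {
            String library = pending.poll();
            if (!processed.add(library)) {
                continue; // already migrated, skip (also protects against cycles)
            }
            pending.addAll(generateBinding.apply(library));
        }
        return new ArrayList<>(processed);
    }

    public static void main(String[] args) {
        // Made-up dependency graph standing in for the generator's error output.
        Map<String, List<String>> missing = Map.of(
                "main.dll", List.of("a.dll", "b.dll"),
                "a.dll", List.of("b.dll"),
                "b.dll", List.of());
        // prints [main.dll, a.dll, b.dll]
        System.out.println(migrate("main.dll", lib -> missing.getOrDefault(lib, List.of())));
    }
}
```

Tracking the processed set means a DLL referenced by several others is migrated only once, which matters as the number of dependencies grows.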
Use of Java JNI/JNA
Once the DLL reverse engineering needed to control the MS Access software was done, we analyzed the interoperability API to work out how to manipulate an application. (The Access software is not an Access application: it is an IDE, an Integrated Development Environment, used to develop Access applications.)
We defined the following requirements to verify that an application can be manipulated:
- Launch an application
a. Launch an MS Access application from Java
b. Launch the same MS Access application several times from the same Java application
- Call VBA (Visual Basic for Applications) functions
a. Call VBA functions without arguments
b. Call VBA functions with simple arguments (int, boolean, …)
c. Call VBA functions with complex arguments (user-defined types)
d. Call methods of VBA classes
To understand how the API works, we relied on the Microsoft documentation for C# and Access. Despite some differences, it made the reverse-engineered Java version easier to use.
The following sections describe how to use our binding once it is installed.
Launch an application (and stop it)
Opening an application takes three steps:
1. Initialize communication over the COM protocol through the definition file STDOLE2.tlb
2. Create an Application instance
3. Open the database
Once these steps are done, VBA functions can be called. (Note: when several applications are launched together, they run in separate memory spaces, which avoids any variable access conflict between them.)
To stop the application, several actions must be performed to avoid leaving “zombie” processes in Windows.
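One way to make sure the shutdown actions always run is to tie them to Java's try-with-resources. The sketch below is illustrative only: the `AccessApplication` interface is a hypothetical subset of the generated binding, its method names simply mirror `OpenCurrentDatabase`, `CloseCurrentDatabase` and `Quit` from the Access object model, and the exact shutdown actions required by the real binding are an assumption. COM initialization and the creation of the Application instance are assumed to happen before the wrapper is built.

```java
import java.util.ArrayList;
import java.util.List;

public class AccessSessionSketch {

    /** Hypothetical subset of the generated binding (names are assumptions). */
    public interface AccessApplication {
        void openCurrentDatabase(String path);
        void closeCurrentDatabase();
        void quit();
    }

    /** Opens the database on creation and runs the shutdown actions on close. */
    public static final class AccessSession implements AutoCloseable {
        private final AccessApplication app;

        public AccessSession(AccessApplication app, String databasePath) {
            this.app = app;
            try {
                app.openCurrentDatabase(databasePath);
            } catch (RuntimeException e) {
                app.quit(); // do not leave the process behind if opening fails
                throw e;
            }
        }

        @Override
        public void close() {
            try {
                app.closeCurrentDatabase();
            } finally {
                app.quit(); // always attempted, even if closing the database throws
            }
        }
    }

    /** In-memory stand-in that records calls, so the sketch runs without Access. */
    public static final class RecordingApplication implements AccessApplication {
        public final List<String> calls = new ArrayList<>();
        @Override public void openCurrentDatabase(String path) { calls.add("open " + path); }
        @Override public void closeCurrentDatabase() { calls.add("close"); }
        @Override public void quit() { calls.add("quit"); }
    }

    public static void main(String[] args) {
        RecordingApplication app = new RecordingApplication();
        try (AccessSession session = new AccessSession(app, "demo.accdb")) {
            app.calls.add("work"); // VBA calls would go here
        }
        // prints [open demo.accdb, work, close, quit]
        System.out.println(app.calls);
    }
}
```

Because `close()` runs even when the body of the `try` block throws, a failing VBA call would not leave an Access process running under this design.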
Launch VBA’s functions
So far, we have covered only part of the VBA function calls: functions without arguments and functions with simple arguments. In both cases, the VBA functions must be public (a private function can easily be made public by declaring it with the Public modifier).
A function can then be called through the Java API with the “Run” method.
This method takes the name of the function to call and its parameters.
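As an illustration, a thin Java wrapper around “Run” could check the arguments before forwarding the call. This is a sketch under assumptions, not the binding's actual API: the `VbaRunner` interface and its signature are hypothetical, treating `String` as a simple argument is our assumption, and the limit of 30 arguments is the one documented for `Application.Run` in the Access object model.

```java
import java.util.Arrays;

public class VbaRunSketch {

    /** Hypothetical view of the binding's "Run" entry point. */
    public interface VbaRunner {
        Object run(String procedure, Object... args);
    }

    /** Application.Run in the Access object model accepts at most 30 arguments. */
    public static final int MAX_ARGS = 30;

    /** Accepts only simple argument types (String is included as an assumption). */
    static boolean isSimple(Object arg) {
        return arg instanceof Number || arg instanceof Boolean || arg instanceof String;
    }

    /** Validates the call, then forwards the function name and its parameters. */
    public static Object call(VbaRunner runner, String procedure, Object... args) {
        if (procedure == null || procedure.isEmpty()) {
            throw new IllegalArgumentException("A public VBA function name is required");
        }
        if (args.length > MAX_ARGS) {
            throw new IllegalArgumentException("Too many arguments: " + args.length);
        }
        for (Object arg : args) {
            if (!isSimple(arg)) {
                String type = (arg == null) ? "null" : arg.getClass().getName();
                throw new IllegalArgumentException("Unsupported argument type: " + type);
            }
        }
        return runner.run(procedure, args);
    }

    public static void main(String[] args) {
        // Stand-in runner that echoes the call instead of reaching Access.
        VbaRunner echo = (name, a) -> name + Arrays.toString(a);
        System.out.println(call(echo, "ComputeTotal", 2, true)); // hypothetical function name
    }
}
```

Rejecting unsupported argument types on the Java side gives a clearer error than a failure inside the COM call, and the check can be relaxed once complex arguments are handled.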
We propose an approach and a tool to manipulate Microsoft Access applications under the following constraints:
- The MS Access DLLs must be installed on the system where the tool runs
- The MS Access version must be 2013 or later
- The entry points of the Access application must be exposed as public VBA methods
There are several ways to continue this project:
- Finalize the function call work by handling VBA classes and functions with complex arguments.
- Build a proof of concept (POC) on a small application, or on a part of a larger application that is clearly separated from the rest.
- Study the deployment of our approach in Docker containers (or another deployment system), in which DLLs may be located elsewhere or be missing.
- We currently use non-compiled MS Access files (.accdb); we will have to make sure they can still be used at deployment time.
- To finalize the project, we have to move on to controlling the application rather than the Access software. This should be possible manually on the basis of our work, but a semi-automated process for creating the Java application (Figure 1) would add value to the interoperability migration work. One lead is to build on tools already developed in the DRIT for the study of MS Access applications (Santiago Bragagnolo, Ph.D. candidate in the DRIT, is working on this in his thesis).