arcpy functions, variables, and classes outside modules - arcpy

arcpy is a package that holds different modules, including __init__.py.
When I look into general Python documentation, they mention that classes, functions, and variables can be defined inside a module, so a programmer can import them using:
from packageName.moduleName import funcA, varB, ClassC
But in arcpy, there are functions, variables, and classes that are not inside modules, e.g. env, Intersect_analysis, etc. Where are they implemented? Are they stored inside modules that Esri links to through __init__.py, for instance?
I tried reading the code inside the arcpy package but it is not clear to me.
Thanks in advance

Esri software (e.g. ArcMap) primarily uses C++ for its components, such as the Intersect (Analysis) tool.
The ArcPy library enables those tools to be executed as part of a script. The actual calculations, for example when you run arcpy.Intersect_analysis, are still done by the C++-based ArcObjects Intersect tool.
ArcPy is a wrapper that enables Python access to those proprietary functions.
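For example, a script like the following (the workspace path and feature class names are made up for illustration) drives the C++ tool entirely from Python:

import arcpy

# Point the geoprocessing environment at a workspace (hypothetical path)
arcpy.env.workspace = r"C:\data\demo.gdb"

# The package-level function just forwards the call; the intersection
# itself is computed by the underlying C++ tool.
arcpy.Intersect_analysis(["roads", "parcels"], "roads_parcels_intersect")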

Related

Is it possible to split a SWIG module for compilation, but rejoin it when linking?

I hit this issue about two years ago when I first implemented our SWIG bindings. As soon as we exposed a large amount of code we got to the point where SWIG would output C++ files so large the compiler could not handle them. The only way I could get around the issue was to split up the interfaces into multiple modules and to compile them separately.
This has several downsides:
• Each module must know about dependencies in other modules. I have a script to generate the interface files which handles this side of things, but it adds extra complexity.
• Each additional module increases the time that the dynamic linker requires to load the code. I have added an __init__.py file that imports all the submodules, so that the fact that the code is split up is transparent to the user (a rough sketch follows this list), but the long load times are always visible.
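As a rough sketch of what that __init__.py looks like (the submodule names here are placeholders, not the real ones):

# package/__init__.py
# Pull every split SWIG wrapper module into the package namespace so
# callers never have to know the code is split up.
from ._part1 import *
from ._part2 import *
from ._part3 import *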
I'm currently reviewing our build scripts / build process and I wanted to see if I could find a solution to this issue that was better than what I have now. Ideally, I'd have one shared library containing all the wrapper code.
Does anyone know how I can achieve this with SWIG? I've seen some custom code written in Ruby for a specific project, where the output is post-processed to make this possible, but when I looked at the feasibility for Python wrappers it did not look so easy.
I just did an equivalent hack for a TCL library: I use several SWIG modules, generating several .cpp files that are compiled into several .o files, but link them all into a single .so file that is loaded by a single TCL "load" command.
The idea is to create a top SWIG module (Top) that calls the initialization functions of all sub-modules (Sub1 and Sub2):
%module Top
%header %{
extern "C" {
SWIGEXPORT int Sub1_Init(Tcl_Interp *);
SWIGEXPORT int Sub2_Init(Tcl_Interp *);
}
%}
%init %{
if (Sub1_Init(interp) != TCL_OK) {return TCL_ERROR;}
if (Sub2_Init(interp) != TCL_OK) {return TCL_ERROR;}
%}
There's nothing special in the submodules files.
I end up with a file Top.so that I load from TCL with the command "load ./Top.so".
I don't know Python, but it's likely to be similar. You may need to understand how Python extensions are loaded, though.
If split properly, the modules don't necessarily need to have the same dependencies as the others - just what's necessary to do compilation. If you break things up appropriately, you can have libraries without cyclic dependencies. The issue with using multiple libraries is that by default SWIG declares its runtime code statically, and as a result has problems passing objects from one module to another. You need to enable a shared version of the SWIG runtime code.
From the documentation (the documentation link on the SWIG web page is broken):
The runtime functions are private to each SWIG-generated module. That is, the runtime functions are declared with "static" linkage and are visible only to the wrapper functions defined in that module. The only problem with this approach is that when more than one SWIG module is used in the same application, those modules often need to share type information. This is especially true for C++ programs where SWIG must collect and share information about inheritance relationships that cross module boundaries.
Check out that section in your downloaded documentation (section 16.2 The SWIG runtime code), and it'll give you details on how to enable this so that objects can be properly handled when passed from one module to the other.
FWIW, I've not worked with Python SWIG, but have done Tcl SWIG.

Generate .py stubs from Python C/C++ module

When I examined, for example, the __builtin__.py module source, or some modules from numpy, I found that they actually contain only stubs of classes and methods with documentation. As far as I understand, they were generated by 'something'. This 'something' is mentioned in the header of such modules; for example, __builtin__.py contains '# from (built-in) by generator 1.96'.
So, what is this 'generator' and where can it be obtained? Such stubs are handy: I can look up the signature of a method right in my IDE without reading separate web documentation.
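(For illustration only, and not the generator mentioned in that header: a very rough approximation of such stub text can be produced with the standard library alone, e.g.:)

# Rough sketch: print stub-like lines for a compiled module's members.
import inspect
import math  # any built-in / C extension module works here

for name, obj in sorted(vars(math).items()):
    if name.startswith("_"):
        continue
    if callable(obj):
        doc = (inspect.getdoc(obj) or "").splitlines()
        summary = doc[0] if doc else ""
        print(f"def {name}(*args, **kwargs):  # {summary}")
    else:
        print(f"{name} = {obj!r}")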

Retrieving the list of all the Python APIs given by an application

I would like to retrieve the list of all the APIs that an application provides to its users.
The application is written in C/C++ for the biggest part.
I was hoping that Python has a standard function for that. I was also trying to approach this in terms of namespaces, since I'm not interested in all the names but only in the ones provided with the APIs, but I simply do not know where to start; I do not know of any functions that do something related to what I'm trying to achieve.
The application uses Python 3.x to provide APIs.
Python doesn't have a notion of an API (or interface) as a language primitive. A module or package will expose some of its members (functions and variables) and hide others, so if you know which modules you are interested in, "exposing" in this sense is AFAIK the most meaningful concept.
The exposed members are the same ones that will be imported if you run from <module> import *. As you probably know, member names that begin with a single underscore, or begin with two underscores and do not end with two, are not meant to be part of the API and will not be exported; by default everything else will be exposed, but a module can customize its API by listing what should be exported in the __all__ variable-- see Importing * from a package.
So, to find the APIs you are looking for, you must first know which top-level modules you are interested in. If the application in question is available to Python as a single package, start with it. If it has an __all__ variable, its contents are the API for the package. If it does not, look through the contents of dir(<package>) and exclude anything that starts with only a single underscore, or starts with two underscores but does not end with two. If you're looking at a large package, some of what you'll find will themselves be modules or packages. Examine them the same way, recursively.
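A rough sketch of that recursive walk (the "public" filter below follows the convention described above, and app_pkg is a placeholder for the application's top-level package):

import types

def exposed(module):
    # __all__ wins when present; otherwise fall back to filtering dir()
    if hasattr(module, "__all__"):
        return list(module.__all__)
    return [n for n in dir(module)
            if not n.startswith("_")
            or (n.startswith("__") and n.endswith("__"))]

def walk(module, prefix, seen=None):
    seen = set() if seen is None else seen
    if module in seen:  # guard against import cycles
        return
    seen.add(module)
    for name in exposed(module):
        member = getattr(module, name, None)
        print(prefix + name)
        if isinstance(member, types.ModuleType):
            walk(member, prefix + name + ".", seen)

# Usage (app_pkg is hypothetical):
#   import app_pkg
#   walk(app_pkg, "app_pkg.")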

DLL monitoring

Is there an application which allows me to see what is being sent to a DLL from a process?
I have a process and I have a DLL and I would like to monitor the parameters that are being sent to the functions so that I can use the DLL myself.
The exports of the DLL are:
??0CCPCompressor@@AAE@XZ
??0CCPExpandor@@AAE@XZ
??1CCPCompressor@@AAE@XZ
??1CCPExpandor@@AAE@XZ
?Clear@CCPCompressor@@QAEHXZ
?Clear@CCPExpandor@@QAEHXZ
..Compress@CCPCompressor..
..Delete@CCPCompressor..
..Delete@CCPExpandor..
..Expand@CCPExpandor..
..Free@CCPCompressor..
..Free@CCPExpandor..
..Init@CCPCompressor..
..Init@CCPExpandor..
..New@CCPCompressor..
..New@CCPExpandor..
In general, this is a bad idea. Even if you have some set of captured parameters, without deep analysis of the DLL code you don't know what to do with those parameters and what ranges of parameters are accepted by certain methods. Example: if I call a method DoMathOperation(Add, 1, 2), you can mimic this call, but you won't be able to do DoMathOperation(Multiply, 2, 2) as you don't know that this is possible.
The simplest approach has traditionally been to relocate the original dll and create a new dll yourself with the same exports. This new dll would LoadLibrary the old dll from the alternate location.
This doesn't quite apply here - the dll is exporting c++ class members, which has two consequences: c++ classes have to be linked statically, as there is no c++ mechanism to 'glue' c++ function pointers (obtained via GetProcAddress) into a class instance.
This means your shim dll would be in the unfortunate position of having to both import and export an identical set of symbols.
The only way around this is to write your shim dll in two parts:
Shim1:
One part would get the name of the original dll, and would export the same class definition the original dll exported:
class __declspec(dllexport) CCPCompressor {
...
Depends can crack the name decoration, or you can use Undname.exe, which is distributed with Visual Studio.
This part would LoadLibrary() using an explicit path to shimdll2.dll located in some other folder, along with the original dll. GetProcAddress() would be needed to import functions exported by shimdll2.dll
Shim2:
The other shim dll would be located in a folder with the dll you are trying to intercept. This dll would have to import the class from the original compressor dll:
class __declspec(dllimport) CCPCompressor {
...
You can use the dll import library made by the first dll to actually link the symbols.
Then it's a case of exporting functions from shim2.dll that shim1.dll will call whenever a CCPCompressor method is called.
NB. Other things: your version of the CCPCompressor class will need to have, at least, a large dummy array as you can't know from the dll exports how big the application expects the class to be (unless you happen to have an actual header file describing the class).
To decompose the exported names to build a class definition:
Open up the Visual Studio 20XX Command Prompt from the Start > Programs > Visual Studio 20XX > Tools menu.
c:\...\VC>undname ?Clear@CCPCompressor@@QAEHXZ
Microsoft (R) C++ Name Undecorator
Undecoration of :- "?Clear@CCPCompressor@@QAEHXZ"
is :- "public: int __thiscall CCPCompressor::Clear(void)"
c:\...\VC>_
Do that for each function exported from the original dll (undname accepts some kind of text file to speed this process up) to find out how to declare a matching class def.
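If there are many exports, a small script can feed them through undname one at a time (this is just a sketch of how you might automate it; run it from a Visual Studio command prompt so undname is on the PATH):

import subprocess

decorated = [
    "?Clear@CCPCompressor@@QAEHXZ",
    "?Clear@CCPExpandor@@QAEHXZ",
    # ... the remaining exports from the original dll
]

for name in decorated:
    result = subprocess.run(["undname", name], capture_output=True, text=True)
    print(result.stdout.strip())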
Is using Detours compatible with your requirements?
From the site:
Overview
Innovative systems research hinges on the ability to easily instrument and extend existing operating system and application functionality. With access to appropriate source code, it is often trivial to insert new instrumentation or extensions by rebuilding the OS or application. However, in today's world systems researchers seldom have access to all relevant source code.
Detours is a library for instrumenting arbitrary Win32 functions on x86, x64, and IA64 machines. Detours intercepts Win32 functions by re-writing the in-memory code for target functions. The Detours package also contains utilities to attach arbitrary DLLs and data segments (called payloads) to any Win32 binary.
Detours preserves the un-instrumented target function (callable through a trampoline) as a subroutine for use by the instrumentation. Our trampoline design enables a large class of innovative extensions to existing binary software.
We have used Detours to create an automatic distributed partitioning system, to instrument and analyze the DCOM protocol stack, and to create a thunking layer for a COM-based OS API. Detours is used widely within Microsoft and within the industry.
The only reliable way is to debug your program (using any debugger, like OllyDbg) and set a breakpoint on the required export function. Then you can simply trace the stack parameters sent to the called function. This is only the start; you need to fully analyze the function's instructions within a debugger or disassembler to see what each parameter does and what its type is.

How to link to existing boost python module

I have wondered about this on and off but I never really got a definite answer. Is it possible within the boost.python framework to link against another boost.python module?
For example, I have exported class A within BOOST_PYTHON_MODULE(libA) and function B(A a) within BOOST_PYTHON_MODULE(libB). Is it possible to specify in libB that it should link against A of libA?
The other way of looking at this problem would be that right now I have to generate all my bindings in one shot within one module. Is it possible to generate bindings incrementally over several BOOST_PYTHON_MODULE declarations?
The Boost.Python way to handle what you are asking for is to divide your package in compilation units as explained in the tutorial and later do a merge in a main compilation unit that actually declares the modules.
You cannot link independent modules in Boost.Python because they declare specific Python entry points that are executed by Python when you load the module. For example, if the binary module name is mod.so, the Python interpreter will look for a function called initmod (or PyInit_mod under Python 3; that is what BOOST_PYTHON_MODULE(mod) declares) and execute the code of that function. Within the code of that function, it expects to find Python C-API declarations of objects (instances, classes, etc.).
If you link, for example, the mod.so binary to another module binary (say, foo.so), when Python loads mod.so it will only find and execute initmod and will ignore initfoo.
I don't know shared libraries well, but what works for me is to import all my modules, which can reference each other, within Python: import libA; import libB.
It is of course possible to put these imports in an __init__.py file, so that within Python you just have to do: import myLib.
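A minimal sketch of that layout (myLib, libA and libB are the names from the question; the file contents are an assumption):

# myLib/__init__.py
# Importing libA first registers class A's converters with Boost.Python,
# so the functions exposed by libB can accept and return A instances.
from . import libA
from . import libB

# Client code then only needs:
#   import myLib
#   a = myLib.libA.A()
#   myLib.libB.B(a)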
