Monday, June 19, 2017

How can I create a language independent library using Python?


If I create a package in Python, then another Python user can import that package and interface with it.

How can I create a package so that it doesn't matter what language the other user is calling the library with?

I could specify input and output file formats so that another language can interface with my Python code by merely providing the input files and reading the output files. However, creating the input and output files is very expensive computationally. Is there an easier solution?

7 Answers

Answers 1

If you want another language to use your library directly (without a remote service or IPC, which is a whole different kettle of fish), you need to write language bindings for it: a layer in the target language that calls your package under the hood. There are various toolkits for creating bindings, but they are usually aimed at calling a C or C++ library from a higher-level scripting language like Python. For example, the SWIG project makes it possible to call C and C++ code from Python, PHP, Ruby, etc.

This page gives a bunch of intro links. It's a big and hard topic, to be honest, and I've only done it from C to Python myself. https://wiki.python.org/moin/IntegratingPythonWithOtherLanguages

Answers 2

Providing an API over a common protocol like HTTP, with a common format for calls and responses (in other words, a REST service), is probably what you want to do. There are many resources to help you get started writing a REST web service with Python, like this:

https://blog.miguelgrinberg.com/post/designing-a-restful-api-with-python-and-flask
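To make the idea concrete, here is a minimal sketch that uses only the Python standard library instead of Flask. It is not a full REST design, just a single JSON endpoint over HTTP. The `add` function, the `FUNCTIONS` table, the `/call` path and the request shape `{"name": ..., "args": [...]}` are all assumptions made for illustration. The client half is written in Python here, but any language with an HTTP client could send the same request.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def add(a, b):
    """Hypothetical library function exposed over HTTP."""
    return a + b


# Hypothetical table of functions that remote callers may invoke by name.
FUNCTIONS = {"add": add}


class CallHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body: {"name": "add", "args": [1, 2]}
        length = int(self.headers.get("Content-Length", 0))
        try:
            request = json.loads(self.rfile.read(length))
            result = FUNCTIONS[request["name"]](*request["args"])
            status, body = 200, {"result": result}
        except Exception as exc:
            status, body = 400, {"error": repr(exc)}
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep the example quiet


def call_remote(port, name, args):
    """Client side: what a caller in any language would do over HTTP."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    body = json.dumps({"name": name, "args": args})
    conn.request("POST", "/call", body=body,
                 headers={"Content-Type": "application/json"})
    reply = json.loads(conn.getresponse().read())
    conn.close()
    return reply["result"]


# Start the server on a free port, make one call, then shut it down.
server = HTTPServer(("127.0.0.1", 0), CallHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
remote_sum = call_remote(server.server_port, "add", [1, 2])
server.shutdown()
server.server_close()
print(remote_sum)
```

In a real service the server would keep running and the callers would live in other processes. The round trip is done in one script here only so the sketch can be run as-is.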

If you want to keep it local to the machine and make your Python functionality available to other programs running there, python.org gives you a primer on embedding Python here:

https://docs.python.org/2/extending/embedding.html

Answers 3

In the general case, two different languages can't live in the same process, so you have to make one language call the other through interprocess communication (IPC).

The simplest and usually effective way to do so is via the input/output of the callee "library". It usually has some serialisation overhead, but in a typical MATLAB/Python interaction it should not be noticeable.

In this configuration the slowest part is the startup of the Python process, which can be avoided by keeping the same process alive between calls.

Here is an example of IPC between a main program in Python (which could be written in any language) and a library in Python, using stdin/stdout with JSON as the serialisation format:


# main program (stands in for a program written in the foreign language)
import mylibwrapper

print(mylibwrapper.call("add", [1, 2]))
mylibwrapper.exit()

# mylibwrapper.py (stands in for a wrapper written in the foreign language)
import json
import subprocess

# Start the library once and keep it alive between calls.
process = subprocess.Popen(
    ["python3", "mylib.py"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    encoding="utf8")

def call(name, args):
    # Send one JSON request per line, then decode the one-line JSON reply.
    process.stdin.write(json.dumps([name, args]) + "\n")
    process.stdin.flush()
    return json.loads(process.stdout.readline())

def exit():
    process.terminate()

# mylib.py
import json, sys

def add(arg1, arg2):
    return arg1 + arg2

if __name__ == "__main__":
    # Serve one JSON request per input line until stdin is closed.
    for line in sys.stdin:
        name, args = json.loads(line)
        function = {"add": add}[name]
        sys.stdout.write(json.dumps(function(*args)))
        sys.stdout.write("\n")
        sys.stdout.flush()

Answers 4

@bluprince13 There is no way to make a library callable from every language, at least not directly without wrapper code. A COM interface on Windows comes close, since it can be imported by most programs (such as Excel, MATLAB, Java), but it is a huge pain to write.

When you say that reading and writing files is expensive, you must not be using the pandas read_csv and to_csv functions; they are blazing fast because they are backed by compiled code. Binary HDF5 files (read_hdf and to_hdf, see http://pandas.pydata.org/pandas-docs/version/0.20/io.html) are faster still, although they are harder for most users to work with; the format is supported by plenty of languages (https://en.wikipedia.org/wiki/Hierarchical_Data_Format). Using input and output files will also make your program more portable.
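One way to check whether file exchange is really too expensive for your data is to time a round trip. The sketch below is an assumption-laden illustration: it uses the standard-library `csv` module rather than pandas so that it runs without extra packages, and the 10,000-row table and the file name `input.csv` are made up. Swapping in your own data and the pandas functions mentioned above would give a more relevant number.

```python
import csv
import os
import tempfile
import time


def write_rows(path, rows):
    """Write a list of numeric rows to a CSV file."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)


def read_rows(path):
    """Read the CSV file back into a list of rows of floats."""
    with open(path, newline="") as handle:
        return [[float(value) for value in row] for row in csv.reader(handle)]


# Hypothetical payload: 10,000 rows with two numeric columns.
rows = [[float(i), float(i) * 2.0] for i in range(10000)]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "input.csv")
    start = time.perf_counter()
    write_rows(path, rows)      # what the calling program would do
    loaded = read_rows(path)    # what the library would do
    elapsed = time.perf_counter() - start

print(f"round trip of {len(loaded)} rows took {elapsed:.4f} s")
```

If the measured time is small compared with the computation itself, plain files may be good enough. If not, a binary format or one of the in-memory approaches from the other answers is worth the extra effort.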

Using embedded Python (compiled), you can simply use whatever Python functions you've created in their .py form. The compiled embedpython.exe is at my Dropbox link at the end of this post, along with all the other files in the zip there. This is probably your best, easiest, and fastest route. For source code and a usage reference, see the question "Embedded Python does not work pointing to Python35.zip with NumPy - how to fix?". It is by far the easiest way to get your code working on any system, and switching between your Python scripts is easy.

When you call other libraries, the entire packages have to be available in a subfolder. In an Anaconda Python installation the files are in the respective installed package folder, usually C:\Anaconda3\Lib\site-packages\[packageName]\; in a typical Python installation they are at C:\Python\Lib\site-packages\[packageName]\. Put the entire installation directory of each package under the "extension_modules" directory, so that it looks like extension_modules\numpy\, extension_modules\pandas\, and so on for every library you import (along with the libraries those packages depend on).

Here are some examples of how to call the EXE from different languages:

Java: Process process = new ProcessBuilder("C:\\PathToExe\\embedpython.exe", "pyscript", "pyfunction", "input_param1", "input_param2").start();

MATLAB: system('"C:\PathToExe\embedpython.exe" pyscript pyfunction input_param1 input_param2');

VBA: Call Shell("C:\PathToExe\embedpython.exe pyscript pyfunction param1 param2", vbNormalFocus)

C++: see "How to call an external program with parameters?"

.NET C#: see "How to pass parameters to an exe?"

As you can see, the list goes on and on: pretty much any language can call an EXE file. Sure, you want maximum performance, but to get compatibility across all languages you have to compromise somewhere. With the suggestions above your performance should still be great, provided the .py functions are optimized.

To make everyone's life easier, here's the compiled version (x64, Python 3.5, Windows, with NumPy, SciPy, Pandas and Intel MKL included): https://www.dropbox.com/sh/2smbgen2i9ilf2e/AADI8A3pCAFU-EqNLTbOiUwJa?dl=0
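The calling convention used above (script name, function name, then parameters) can be sketched in plain Python. This is not the source of embedpython.exe, only a hypothetical dispatcher that illustrates the same idea: `parse_arg` and `dispatch` are names invented here, and treating each parameter as JSON when possible is an assumption.

```python
import importlib
import json


def parse_arg(text):
    """Interpret a command-line parameter as JSON if possible, else keep the string."""
    try:
        return json.loads(text)
    except ValueError:
        return text


def dispatch(argv):
    """Import the named module, look up the named function, and call it.

    argv mirrors the command line: [pyscript, pyfunction, param1, param2, ...]
    """
    module_name, function_name, *raw_args = argv
    module = importlib.import_module(module_name)
    function = getattr(module, function_name)
    return function(*[parse_arg(arg) for arg in raw_args])


# Stand-in for a command line such as: dispatcher math hypot 3 4
# (the standard-library math module plays the role of "pyscript" here).
example = dispatch(["math", "hypot", "3", "4"])
print(json.dumps(example))
```

A real entry point would pass `sys.argv[1:]` to `dispatch` and print the result, so that the calling program can read it from standard output.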

Answers 5

As has already been mentioned several times, one way is to create a REST API and send the input and output over HTTP.

However, there is another option, which is more complex: you can use CORBA (Common Object Request Broker Architecture). omniORB provides a CORBA implementation for Python. CORBA lets applications written in various languages interface with each other.

There are a number of examples on using CORBA with python on the web.

Answers 6

In a similar spirit to what Nurzhan mentions above regarding CORBA, you could use OPC UA: https://en.m.wikipedia.org/wiki/OPC_Unified_Architecture

It is an architecture oriented towards device control via server-to-client communication, but it may suit your needs. In my work we have used licensed C/C++ (Unified Automation) and Java (Prosys) SDKs, explored Python options and embedded solutions on PLCs, and the inter-platform communication works well.

There are several open source projects for OPC UA in Python on the web, e.g. freeopcua.

Answers 7

You can use Cython to extend your Python code relatively easily so that it can be compiled as a C library. Mostly, this just involves replacing the def keyword with cdef public for the functions you want to expose, and annotating the types of your variables.

Here's an example of such Cython code:

from libc.stdio cimport printf

cdef public void grail(int num):
    printf("Ready the holy hand grenade and count to %i", num)

Once the code is compiled to a C library, it can be called from many other languages, since most of them have a Foreign Function Interface (FFI) to C code.
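To show what an FFI call looks like in practice, here is a small sketch using `ctypes`, Python's own FFI to C. Since the compiled Cython library is not available here, the example calls `strlen` from the system C library instead; that substitution is an assumption made purely for illustration, and the library lookup shown is expected to work on POSIX systems rather than everywhere. A caller in another language would declare and call an exported function in the analogous way through its own FFI.

```python
import ctypes
import ctypes.util

# Locate the platform's C library; if it cannot be found by name, fall back
# to the symbols already loaded into this process (POSIX behaviour).
libc_name = ctypes.util.find_library("c")
libc = ctypes.CDLL(libc_name) if libc_name else ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s);
strlen = libc.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

# Call the C function with a byte string, as a foreign caller would.
length = strlen(b"holy hand grenade")
print(length)
```

Declaring `argtypes` and `restype` up front is the important step: the FFI has no way to discover the C signature by itself, so the caller must state it.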
