mbl
mbl

Reputation: 925

How does COM achieve language interop?

I understand how COM can achieve compiler agnostic C++ code, since it defines an ABI by being careful what features of the C++ language to use. It's just C++ code talking to C++ code in a really clever way. However I still don't understand how it can allow for language interop with C# or Javascript for example.

Where is the boundary? The only explanation I have right now is that the language compiler itself must have special support for COM so that it can generate the proper assembly code to allow for accurate communication between caller/callees.

Upvotes: 0

Views: 294

Answers (3)

Kenny Kerr
Kenny Kerr

Reputation: 3115

Since you've tagged your question with WinRT, I assume you're asking specifically about how this is achieved by WinRT language projections. In that case, all languages must have some way to map their natural language constructs to the COM ABI that WinRT defines. That ABI is derived from metadata encoded in the ECMA 335 standard and special rules are applied to transform the abstract metadata into a concrete ABI. There are naturally different ways to achieve this. The CLR itself was updated to support WinRT in C#. The Visual C++ compiler was (sadly) updated with language extensions to support WinRT via C++/CX. The C++/WinRT approach is very different in that it requires only a standard C++ compiler and all of the knowledge about WinRT is delivered via a standard C++ header-only library. Other languages might take different approaches, but at the end of the day they must agree on the way that types expressed in metadata are transformed into objects and virtual function calls on the ABI based on COM.

And while this process is not well documented at the moment, C++/WinRT is one of the only open source language projections and thus acts as a useful reference implementation for those who need to understand how WinRT works under the hood.

https://github.com/microsoft/cppwinrt

Upvotes: 1

Euro Micelli
Euro Micelli

Reputation: 33998

It’s not magic, of course.

COM sets up rules for language interop. It’s just a contract, with some helpful tooling. Each language that wants to support COM has to find a way to abide by the rules on its own. They all have to provide their own compatible mechanism one way or another.

In the case of C++, the rules appear to come for free as you mentioned, but be aware there is one caveat: the language standard does not specify the layout and mechanism of classes and virtual functions. The method mimicked by COM is one extremely common implementation of virtual calling (“the VTable”), and COM follows the exact layout used by the Microsoft compiler. But you can have a perfectly valid C++ compiler where classes with virtual functions would not be compatible with the COM layout. It’s just that nobody does that, at least not in Windows compilers. So even in C++ there is some “meeting in the middle” by the compiler.

In C, you have to do the whole thing by hand. Other languages might allow you to do the same thing (assembler of course).

To help compiled languages exchange information about specific contracts, COM provides Type Libraries and mechanisms to read them. A compiler or language that want to take advantage of them also has to “meet in the middle” and learn how to process them (for example, the Microsoft C++ #import directive; the VB6 Libraries menu).

No every language will support everything you can do in COM, because there is a point (in more obscure features) where the return on investment in implementing support in the language doesn’t pan out. Each language has to pick its own limitations. There is plenty of stuff that you can do in COM (read the IDL specs) that VB6 cannot do.

Because following COM rules in a script-like language is between inpractical and impossible, COM offers a higher-level approach (Automation) that is more amenable to dynamic languages, even if more limited. But a language implementer that wants to provide client support for Automation has to implement an understanding of the IDispatch interface, an activation mechanism, and a translation to its language’s proper facilities. And a scripting language wanting to provide support for creating COM servers has to work even harder to implement a valid COM IDispatch implementation and a standalone host engine on behalf of the user scripts. Even VBScript couldn’t do this at the beginning, until Microsoft added .SCR support with the Windows Scripting Host. “Meeting in the middle” again.

If a language wants to support both pure COM and Automation, they need to work double hard; support for one does not automatically give you support for the other.

For .NET languages like C#, most of the work is done for both native COM and Automation inside of the .NET Runtime, which provides the implementation of the COM Callable Wrappers (CCW) and Runtime Callable Wrappers (RCW) necessary to interact with COM, and handling the conflicts between the Reference Count approach of COM and the GC approach of .NET. Microsoft did all the work in one place so individual .NET language designers didn’t have to.

So, yes, the language implementer has to work extra to give the language special support for COM: following the binary layout rules, implementing a translation layer when needed, and/or possibly providing tooling to read Type Libraries.

Language Interop requires both sides (the caller and the callee) to “meet in the middle” somewhere. COM is just a specification that gives designers that middle ground, “a place where all can meet”.

Upvotes: 1

selbie
selbie

Reputation: 104514

The "Type Library" is what enables interop of COM components between different languages.

https://learn.microsoft.com/en-us/windows/desktop/midl/com-dcom-and-type-libraries

A type library (.tlb) is a binary file that stores information about a COM or DCOM object's properties and methods in a form that is accessible to other applications at runtime. Using a type library, an application or browser can determine which interfaces an object supports, and invoke an object's interface methods. This can occur even if the object and client applications were written in different programming languages. The COM/DCOM run-time environment can also use a type library to provide automatic cross-apartment, cross-process, and cross-machine marshaling for interfaces described in type libraries.

The other approach for language interop (e.g. C++ projecting objects to Javascript) is that a COM object can implement IDispatch.

Upvotes: 1

Related Questions