DMSPool and parallelism in connection with SAPALink

The tia® Content Server (abbreviated as "CS" below) can process a large number of incoming requests. Several internal mechanisms make this possible.

 

When one or more SAP systems send ArchiveLink requests to our CS, each request is processed independently as an individual HTTP servlet request. The web server itself comes first: it is responsible for strictly separating the requests.
The CS is stateless, which means that it keeps no information from one request for reuse by a later one.
The CS always responds with the appropriate ArchiveLink HTTP response codes. This allows the CS to be used in high availability environments, as individual instances can act independently of each other and each can respond to a request on its own.

To communicate with a physical archive system we use our own interface/SDK, the so-called SAPALink SDK.
Our own tia® Store (supports: Filesystem, S3, Azure Blob, NetApp, iCAS, DB Blob, EMC Centera) uses it as well. This ensures that partners can use the same interface/SDK to connect their own store or archive system.

SAPALink calls predefined methods (specified in the SDK) via reflection. Two of them are LinkOpen() and LinkClose().
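
To illustrate what "calling via reflection" means, here is a minimal sketch. The class DemoLink and the no-argument signatures of LinkOpen() and LinkClose() are assumptions made for this example; the real signatures are defined by the SAPALink SDK and are not shown here.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a partner's SAPALink implementation.
class DemoLink {
    final List<String> calls = new ArrayList<>();

    public void LinkOpen()  { calls.add("LinkOpen"); }
    public void LinkClose() { calls.add("LinkClose"); }
}

class ReflectionSketch {
    // Looks up a public no-argument method by name and invokes it on the target.
    static void call(Object target, String methodName) {
        try {
            Method method = target.getClass().getMethod(methodName);
            method.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot call " + methodName, e);
        }
    }

    public static void main(String[] args) {
        DemoLink link = new DemoLink();
        call(link, "LinkOpen");   // the caller only knows the method name
        call(link, "LinkClose");
        System.out.println(link.calls);
    }
}
```

The caller needs no compile-time dependency on the implementation class, only agreement on the method names, which is why the names in the SDK are fixed.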

In the CS we create connections to the SAPALink implementation via the so-called DMSPool.
This is based on the Apache "Commons Pool" library (https://commons.apache.org/proper/commons-pool/). The pooling works as follows:

An HTTP servlet request reaches the web server, which forwards it to the CS.
With the very first request (directly after the CS has started), the CS creates a DMSObject.
It then calls the SAPALink method "LinkOpen()" on this object. This happens once per DMSObject. Now the request is processed (e.g. a Create calls componentCreate in the SAPALink interface).
During processing the DMSObject is in use and cannot be used by any other request.
When the request is finished (regardless of success or failure), the DMSObject is "released" and thus returned to the DMSPool.
The next incoming request checks whether an existing DMSObject is available and, if so, uses it.
The connection is thus reused. (DB pools follow the same principle and use the same library.)

If a request comes in while the DMSObject is still in use, the DMSPool creates a new DMSObject on its own. This leads to a new SAPALink connection (and thus to another LinkOpen()).
The same principle applies here: at the end of the request the object is returned to the pool.
A high volume of parallel requests (e.g. due to a migration or thousands of SAP users) results in several parallel DMSObjects. This is intended. With a particularly high request volume (200-300 requests per second), 30-50 DMSObjects may be created and returned to the pool. However, they consume almost no resources, since they are only "logical" connection objects.
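
The borrow/return cycle described above can be sketched in a few lines. This is not the CS code and not the Commons Pool API (which offers borrowObject() and returnObject() for the same purpose); it is a simplified stand-in using only the Java standard library. The class DmsPoolSketch and its method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class DmsPoolSketch {
    private final Deque<Object> idle = new ArrayDeque<>();
    private int created = 0;

    // Reuse an idle object if there is one; otherwise create a new one.
    // Creating an object is the point where LinkOpen() would be called.
    synchronized Object borrow() {
        Object dmsObject = idle.pollFirst();
        if (dmsObject == null) {
            dmsObject = new Object();
            created++;
        }
        return dmsObject;
    }

    // Called at the end of every request, on success and on failure.
    synchronized void release(Object dmsObject) {
        idle.addFirst(dmsObject);
    }

    synchronized int created()   { return created; }
    synchronized int idleCount() { return idle.size(); }

    public static void main(String[] args) {
        DmsPoolSketch pool = new DmsPoolSketch();

        Object first = pool.borrow();   // first request: object is created
        pool.release(first);
        Object second = pool.borrow();  // next request: same object is reused
        Object third = pool.borrow();   // overlapping request: new object

        pool.release(second);
        pool.release(third);
        System.out.println("reused: " + (first == second));
        System.out.println("created: " + pool.created());
    }
}
```

Two sequential requests share one object, while an overlapping request forces a second one. Both end up idle in the pool afterwards, ready for reuse.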

Configuration

The pooling behavior can be influenced through settings.

Note that changing the pooling settings can have negative effects!

 

DMS Connection Keep Alive: This setting specifies whether the connection is returned to the DMSPool even after an error. Keeping it enabled is recommended, as ArchiveLink errors are legitimate.
If this checkbox is unchecked, an error causes the whole connection to be closed ("LinkClose()") and the DMSObject to be destroyed. Since creating a connection may take a while (depending on what happens in LinkOpen()), weigh carefully whether the connections should really be destroyed every time. Performance suffers in this case.
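
The effect of the checkbox can be sketched as a single decision at the end of a request. The class KeepAliveSketch, the nested Conn type and the method finish() are hypothetical names for this illustration only.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class KeepAliveSketch {
    // Hypothetical connection; the flag records whether LinkClose() was called.
    static class Conn {
        boolean closed = false;
        void linkClose() { closed = true; }
    }

    private final Deque<Conn> idle = new ArrayDeque<>();
    private final boolean keepAlive;

    KeepAliveSketch(boolean keepAlive) {
        this.keepAlive = keepAlive;
    }

    // Called when a request ends; 'failed' marks a request that ended with an error.
    void finish(Conn conn, boolean failed) {
        if (failed && !keepAlive) {
            conn.linkClose();     // connection is closed, object is discarded
        } else {
            idle.addFirst(conn);  // object goes back to the pool for reuse
        }
    }

    int idleCount() { return idle.size(); }

    public static void main(String[] args) {
        KeepAliveSketch checked = new KeepAliveSketch(true);
        checked.finish(new Conn(), true);
        System.out.println("checked, idle objects: " + checked.idleCount());

        KeepAliveSketch unchecked = new KeepAliveSketch(false);
        unchecked.finish(new Conn(), true);
        System.out.println("unchecked, idle objects: " + unchecked.idleCount());
    }
}
```

With the checkbox unchecked, every failed request costs a LinkClose() now and a fresh LinkOpen() for a later request.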

Timeout aquiring DMS Object (ms): The maximum time the DMSPool waits for a response from the SAPALink implementation (LinkOpen()) before the connection attempt is automatically terminated. If a new DMSObject is created and the underlying backend is not available, the request is held (default: 5 seconds) and then reported back to the caller as an error (timeout). This used to occur frequently with slow backend systems, but no longer plays a noticeable role with today's systems (SAPALink implementations usually respond within milliseconds).
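
A bounded wait of this kind can be sketched with the Java standard library as follows. How the CS implements the timeout internally is not described here; the class AcquireTimeoutSketch, the method openWithin() and the helper sleeper() (which simulates a slow LinkOpen()) are assumptions for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class AcquireTimeoutSketch {
    // Runs 'open' (standing in for LinkOpen()) and gives up after timeoutMs.
    // Returns true if the call finished in time, false otherwise.
    static boolean openWithin(Runnable open, long timeoutMs) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<?> result = executor.submit(open);
            result.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            return false;  // reported to the caller as an error
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();  // interrupts a call that is still running
        }
    }

    // Simulates a backend that needs 'ms' milliseconds to answer.
    static Runnable sleeper(long ms) {
        return () -> {
            try {
                Thread.sleep(ms);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
    }

    public static void main(String[] args) {
        // Backend answers within milliseconds, limit is the 5 second default.
        System.out.println(openWithin(sleeper(10), 5000));
        // Backend is slower than the configured limit.
        System.out.println(openWithin(sleeper(2000), 100));
    }
}
```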

On the "Common" tab, some settings also apply to the DMSObject.

Max Process Count: Specifies the maximum number of DMSObjects (connections).
Default: 0 = unlimited. This setting limits the number of DMSObjects. For example, if you specify "1", only one connection (and thus only one LinkOpen()) to the SAPALink backend system is opened.
If many requests come in parallel, they have to wait, because a DMSObject can serve only one request at a time.

Attention: If more requests come in parallel than DMSObjects can be created or used, the most recent ones always run into a timeout, because the request queue eventually saturates (and the TTL of the requests is exceeded). It is recommended to leave the value at "0".
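
Why a limit leads to waiting and, eventually, timeouts can be shown with a counting semaphore. This is a simplified model, not the CS implementation; the class MaxCountSketch and its methods are hypothetical, and the wait time passed to acquire() stands in for whatever timeout applies to a waiting request.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class MaxCountSketch {
    // null models the value 0 (unlimited) described in the text.
    private final Semaphore slots;

    MaxCountSketch(int maxProcessCount) {
        this.slots = maxProcessCount > 0 ? new Semaphore(maxProcessCount, true) : null;
    }

    // Tries to get a DMSObject slot. Returns false if none became free in time.
    boolean acquire(long waitMs) {
        if (slots == null) {
            return true;
        }
        try {
            return slots.tryAcquire(waitMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Called when a request is finished and its DMSObject is free again.
    void release() {
        if (slots != null) {
            slots.release();
        }
    }

    public static void main(String[] args) {
        MaxCountSketch limited = new MaxCountSketch(1);
        System.out.println(limited.acquire(10));  // first request gets the object
        System.out.println(limited.acquire(10));  // second request waits, then times out
        limited.release();
        System.out.println(limited.acquire(10));  // object is free again
    }
}
```

With a limit of 1, the second parallel request only succeeds if the first one finishes within the wait time.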

Backend Read Timeout(s): Specifies the time in seconds after which the CS aborts the request if the backend (the SAPALink implementation) does not respond in time to a GET (read) command.
Example: An attempt is made to read a 100 GB file, but the read timeout is set to 1 second.
The CS aborts the request after one second (with a timeout), because the response (100 GB probably takes more than a few seconds) did not come back within this one second.
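
As a rough plausibility check for the example, the following sketch estimates the transfer time. The throughput of 1 GB per second is an assumed figure chosen only for this calculation, and the sketch assumes, as the example implies, that the timeout covers the complete response. The class ReadTimeoutEstimate is hypothetical.

```java
class ReadTimeoutEstimate {
    // Seconds needed to transfer sizeBytes at bytesPerSecond, rounded up.
    static long secondsNeeded(long sizeBytes, long bytesPerSecond) {
        return (sizeBytes + bytesPerSecond - 1) / bytesPerSecond;
    }

    // True if the transfer cannot finish within the configured timeout.
    static boolean wouldTimeOut(long sizeBytes, long bytesPerSecond, long timeoutSeconds) {
        return secondsNeeded(sizeBytes, bytesPerSecond) > timeoutSeconds;
    }

    public static void main(String[] args) {
        long size = 100L * 1_000_000_000L;  // 100 GB
        long rate = 1_000_000_000L;         // assumed: 1 GB per second

        System.out.println(secondsNeeded(size, rate));    // 100
        System.out.println(wouldTimeOut(size, rate, 1));  // true
    }
}
```

Under this assumption the read needs about 100 seconds, so a timeout of 1 second aborts it long before it can complete.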

Backend Write Timeout(s): Analogous to Backend Read Timeout, but for write commands (POST, PUT).

Min Process Count: We have a partner that requires an open connection at all times.
If the parameter "Min Process Count" is set to a value > 0, the connections cannot be closed.
This is a special case that should be avoided. It is recommended to keep the default of 0.

Closing connections

Connections are closed by the CS only during shutdown.
That is, as soon as the OSGi bundle receives a stop status (e.g. because the web application is stopped or the web server is shut down), no new DMSObjects are created. The DMSPool is "closed". Requests that are already running are processed to completion (the system actively blocks the shutdown until they are done).
New requests are rejected. Once all DMSObjects that are "in use" have finished, a "terminate" is called on each connection. This triggers the "LinkClose()" method in SAPALink. After all DMSObjects have been destroyed and all connections have been closed, the CS stops and the web server / application can shut down.
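
The shutdown order described above (close the pool, reject new requests, wait for running ones, then close every connection) can be sketched as follows. This is a simplified model using plain Java monitors, not the OSGi or Commons Pool code of the CS; the class ShutdownSketch and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ShutdownSketch {
    private final List<Integer> idle = new ArrayList<>();  // ids of pooled objects
    private final List<String> log = new ArrayList<>();
    private int inUse = 0;
    private int nextId = 1;
    private boolean closed = false;

    // Returns an object id, or -1 if the pool is closed and the request is rejected.
    synchronized int borrow() {
        if (closed) {
            return -1;
        }
        inUse++;
        if (idle.isEmpty()) {
            return nextId++;
        }
        return idle.remove(idle.size() - 1);
    }

    synchronized void release(int id) {
        inUse--;
        idle.add(id);
        notifyAll();  // wake up a shutdown that is waiting for this object
    }

    // Closes the pool, waits for running requests, then closes each connection.
    synchronized void shutdown() {
        closed = true;
        while (inUse > 0) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
        for (int id : idle) {
            log.add("LinkClose#" + id);  // stands in for terminate / LinkClose()
        }
        idle.clear();
    }

    synchronized List<String> log() {
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        ShutdownSketch pool = new ShutdownSketch();
        int id = pool.borrow();
        new Thread(() -> pool.release(id)).start();  // request finishes elsewhere
        pool.shutdown();                             // blocks until the object is back
        System.out.println(pool.borrow());           // rejected after shutdown
        System.out.println(pool.log());
    }
}
```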

During normal operation, connections are closed only when "DMS Connection Keep Alive" is unchecked. It is recommended to keep the connections open (the default), because, depending on the partner's implementation, a LinkClose() during running operation is undesirable.