Configuration (Store) - S3 ManagedBuckets (13)

The S3 ManagedBuckets storage type distributes new S3 objects (documents) across a configured number of buckets. This configuration is useful for very large data volumes that should be distributed evenly across S3 buckets. It is highly recommended to set AllowCreateBuckets to TRUE.

The KGS Store hashes the incoming DocID (identifier name) to determine the bucket in which the document is created. If that bucket does not exist and the parameter “AllowCreateBuckets = TRUE” is set, a new bucket with the derived name (and, if applicable, the prefix) is created. The range of buckets in which a document may be created is determined by the parameter “BucketsPerGroup”.

Attached you will find a statistic about the distribution algorithm.

There are a few things to keep in mind:

  • Bucket names often need to follow specific guidelines, for example the AWS bucket naming rules. Please ensure that your prefixes follow these rules.

  • There are often limits on the number of buckets or on bucket sizes. AWS, for example, limits the number of buckets per account (100 by default at the time of writing; check the current quota for your account). AWS does not specify a maximum bucket size. Nevertheless, make sure you do not hit any restrictions of your S3 backend system.

Especially when prefixes are defined for different repositories, the number of created buckets can grow very quickly. Max = BucketGroups * BucketsPerGroup * number of repository prefixes
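As a worked illustration of this formula (the parameter values are made up for the example and are not recommendations): with three repository prefixes, a small setup stays well below a limit of 100 buckets, the AWS default cited above, while a moderately larger one already exceeds it. The symbol for the number of repository prefixes is introduced only for this example.

```latex
\[
N_{\max} = \text{BucketGroups} \times \text{BucketsPerGroup} \times N_{\text{prefixes}}
\]
\[
2 \times 5 \times 3 = 30 \le 100, \qquad 4 \times 10 \times 3 = 120 > 100
\]
```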

[COMMON SETTINGS]

To use S3 ManagedBuckets, set the parameter StorageType = 13 in the [COMMON SETTINGS] section.
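A minimal sketch of the corresponding entry, assuming the configuration file uses the INI-style sections named on this page. Any other keys that may belong in this section are omitted here.

```ini
[COMMON SETTINGS]
StorageType = 13
```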

[STORAGE_S3_MANAGEDBUCKETS]

| Parameter | DataType | Description | Default |
| --- | --- | --- | --- |
| ConnectionUser | String | Depends on the S3 endpoint used. Example: Amazon AWS → Access key ID | |
| ConnectionPass | String | Depends on the S3 endpoint used. The value of this parameter is encoded during the first startup. Example: Amazon AWS → Secret access key | |
| Region | String | Used for Amazon AWS. Specify your AWS region here. If you use an S3 endpoint, the region is used as the SignerRegion. | |
| Protocol | String | http or https | https |
| AllowCreateBuckets | Boolean | Should be set to TRUE; otherwise the buckets whose names are derived from the incoming ObjectID/DocID cannot be created automatically. | TRUE |
| RequestSigner | | Possible values: NoOpSignerType, QueryStringSignerType, AWS3SignerType, AWS4SignerType, AWS4UnsignedPayloadSignerType. To use Signature Version 2 (SigV2), set QueryStringSignerType; this enables SigV2 as the signer type by default. | AWS4UnsignedPayloadSignerType |
| HashCheck | Boolean | Enables or disables the hash-key check during document checkout (get). Can be specified per content repository by appending "_<ContentRepository>". Example: HashCheck_FI = FALSE | FALSE |
| RetentionPeriod | Integer | Defines the period (in months), counted from the creation date, during which documents are protected from modification. Can be specified per content repository by appending "_<ContentRepository>". Example: RetentionPeriod_FI = 12 | |
| LockFiles | Boolean | If enabled, all objects are locked in compliance mode during the create process. | |
| BucketName | String | Defines the name of the bucket in which files will be archived. Every bucket name has to be unique throughout all Amazon regions! | |
| EndPoint | String | URL of your S3 endpoint. Should not be used with Amazon AWS; use the Region parameter instead. | |
| ClientOptions | String | Possible values: PathStyleAccess. Values can be combined with a semicolon. Example: ClientOptions = PathStyleAccess:true;Parameter2:false | |
| MaxConnections | Integer | Defines the maximum number of open connections. | |
| ConnectionTimeout | Integer | Timeout for connections and read requests, in seconds. | |
| ContrepInPath | Boolean | TRUE: A directory is created for every content repository, and documents and / or the corresponding sub-directory trees are placed in this directory. FALSE: The leading content repository directory is not created. | FALSE |
| DocumentMetaDataRepresentation | Boolean | FALSE: The metadata of a filed document is stored as the content of the file. TRUE: The metadata of a filed document is stored as metadata of a 0-byte file. | TRUE |
| UseFlatStylePath | Boolean | TRUE: KGS Store does not build a sub-directory tree but saves all files in a single directory, the repository. FALSE: KGS Store builds a structure of sub-directories that represents the DocumentID. For Hitachi Vantara HCP storage systems we recommend setting this parameter to FALSE! | TRUE |
| BucketGroups | Integer | Number of groups. Should always start with 1. Later, when a fresh group of empty buckets is needed (e.g. because the old group holds a huge number of objects), the parameter should be incremented by one (2, 3, 4, …). | 1 |
| BucketsPerGroup | Integer | Number of buckets per group, i.e. the number of buckets among which documents are distributed depending on the ObjectID/DocID. Example: With BucketGroups = 1 and BucketsPerGroup = 5, the Store distributes all incoming S3 objects among these five buckets. It is important to keep BucketsPerGroup at a fixed value, even when the BucketGroups parameter is incremented. | 5 |
| BucketPrefix | String | Determines the prefix of the automatically created buckets. The prefix needs to follow the S3 guidelines for bucket naming. Can be specified per content repository by appending "_<ContentRepository>". Example: BucketPrefix_FI = FI. Be aware: each BucketPrefix gets its own set of buckets according to BucketsPerGroup, so the total number of created buckets is (number of BucketPrefix values) * BucketsPerGroup * BucketGroups. | |
| BucketNameFormat | String | Specifies the format rule for the bucket name. It follows the Java String.format() rules: https://docs.oracle.com/javase/8/docs/api/java/util/Formatter.html Caution: if the format contains a percent sign ( % ), the string must also end with a percent sign. Example: BucketNameFormat = %6.6s% This leads to a bucket name with a minimum length of 6 and a maximum length of 6. | |
| OnBucketCreate | String | User exit that calls a script whenever a new bucket is created. Java ProcessBuilder is used: https://docs.oracle.com/javase/7/docs/api/java/lang/ProcessBuilder.html Syntax: comma-separated values. $bucketName is a variable that is replaced by the actual name of the bucket being created. $repository is a variable that is replaced by the actual (content) repository. Windows example: cmd.exe,/c,C:/myScripts/testScript.bat,$bucketName,$repository Linux example: sh,-c, script.sh,$bucketName,$repository The path to the script can be absolute or relative (relative: in combination with “OnBucketCreateWorkingDir”). Can be specified per content repository by appending "_<ContentRepository>". Example: OnBucketCreate_FI = cmd.exe,/c,script.bat,$bucketName | |
| OnBucketCreateWorkingDir | String | Folder from which the script defined in “OnBucketCreate” is executed. If empty, the script is called at the actual script location. May also be defined for one or more specific repositories. Examples: OnBucketCreateWorkingDir_FI = Z:\WorkingDirectories\Create_FI (Windows) or OnBucketCreateWorkingDir_FI = /mnt/working_directories/create_fi/ (Linux) | |
| HCPMode | Boolean | Changes how the content type is committed. Please note: HCPMode is not related to HCPRetentionMode! TRUE: The content type is committed as a user-metadata field. FALSE: The content type is committed as a normal content type. | FALSE |
| HCPRetentionMode | Boolean | Enables or disables hardware-based retention on HCP. Please note: to use HCP retention, the following is required (depending on the version of your HCP and on whether it runs on-premise or in the cloud): (1) the HCP bucket must have file locking enabled; (2) the parameter “LockFiles” must be TRUE; (3) retention must be set. TRUE: Enables hardware-based retention on HCP. FALSE: Disables hardware-based retention on HCP. | FALSE |
| CleanVersions | Boolean | TRUE: Deletes all previous versions on update and delete operations in buckets with versioning. FALSE: All versions remain. | TRUE |
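Putting several of the parameters above together, a section might look like the following sketch. All values are placeholders or illustrative assumptions: the credentials are dummies, eu-central-1 is only an example AWS region, and FI is a hypothetical content repository. The prefix is written in lowercase because AWS bucket names must not contain uppercase letters. The lines starting with ; are explanatory only; this page does not state whether the Store's configuration parser accepts comments, so drop them when copying.

```ini
[STORAGE_S3_MANAGEDBUCKETS]
; Placeholder credentials (assumption): replace with your own values
ConnectionUser = YOUR_ACCESS_KEY_ID
ConnectionPass = YOUR_SECRET_ACCESS_KEY
; Example region only
Region = eu-central-1
Protocol = https
AllowCreateBuckets = TRUE
; One group of five buckets (the documented defaults)
BucketGroups = 1
BucketsPerGroup = 5
; Overrides for the hypothetical content repository FI
BucketPrefix_FI = fi
HashCheck_FI = FALSE
RetentionPeriod_FI = 12
```

With these values and a single prefix, the formula given for BucketPrefix yields at most 1 * 5 * 1 = 5 buckets for repository FI.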

 

 

 

