Repository configuration

From OpenKM Documentation
Revision as of 19:43, 25 January 2010 by Pavila (talk | contribs)

OpenKM uses Apache Jackrabbit to handle the document repository. From the Jackrabbit site:

Apache Jackrabbit is a fully conforming implementation of the Content Repository for Java Technology API (JCR). A content repository is a hierarchical content store with support for structured and unstructured content, full text search, versioning, transactions, observation, and more. Typical applications that use content repositories include content management, document management, and records management systems.

This means that configuring an OpenKM repository amounts to configuring a Jackrabbit repository. Jackrabbit offers several storage options: the repository can be kept in the local filesystem, in a remote database, or even in the Amazon Web Services (AWS) cloud.

Configuration parameters

The repository configuration file, typically called repository.xml, specifies global options like security, versioning and clustering settings. A default workspace configuration template is also included in the repository configuration file. The top-level structure of the repository configuration file is shown below:

<!DOCTYPE Repository
          PUBLIC "-//The Apache Software Foundation//DTD Jackrabbit 1.4//EN"
          "http://jackrabbit.apache.org/dtd/repository-1.4.dtd">
<Repository>
    <FileSystem .../>
    <Security .../>
    <Workspaces .../>
    <Workspace .../>
    <Versioning .../>
    <SearchIndex .../>    <!-- optional -->
    <DataStore .../>      <!-- optional -->
</Repository>

The repository configuration elements are:

  • FileSystem: The virtual file system used by the repository to store things like registered namespaces and node types.
  • Security: Authentication and authorization configuration.
  • Workspaces: Configuration on where and how workspaces are managed.
  • Workspace: Default workspace configuration template.
  • Versioning: Configuration of the repository-wide version store.
  • SearchIndex: Configuration of the search index that covers the repository-wide /jcr:system content tree.
  • DataStore: Data store configuration.

Bean configuration elements

Most of the entries in the configuration file are based on the following generic JavaBean configuration pattern. Such configuration specifies that the repository should use an instance of the specified class with the specified properties for the named functionality.

<ConfigurationElement class="fully.qualified.ClassName">
    <param name="property1" value="..."/>
    <param name="property2" value="..."/>
</ConfigurationElement>
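
As a concrete instance of this pattern, the following sketch configures the repository file system. It assumes the LocalFileSystem class bundled with Jackrabbit and its path property; the directory name is only illustrative, and ${rep.home} is a configuration variable that expands to the repository home directory.

```xml
<!-- Illustrative only: store the repository's virtual file system
     in a "repository" directory under the repository home -->
<FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
    <param name="path" value="${rep.home}/repository"/>
</FileSystem>
```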

Configuration variables

Jackrabbit supports configuration variables of the form ${name}. These variables can be used to avoid hardcoding specific options in the configuration files. The following variables are available in all Jackrabbit versions:

  • ${rep.home}: Repository home directory.
  • ${wsp.name}: Workspace name. Only available in workspace configuration.
  • ${wsp.home}: Workspace home directory. Only available in workspace configuration.
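
For instance, a version store can be placed relative to the repository home rather than at a fixed path. The fragment below is a sketch modelled on the kind of defaults shipped with Jackrabbit; the Derby persistence manager and the version_ prefix are assumptions and should be checked against the Jackrabbit version bundled with your OpenKM installation.

```xml
<!-- ${rep.home} is resolved when the repository starts, so the same
     file works wherever the repository directory is located -->
<Versioning rootPath="${rep.home}/version">
    <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
        <param name="path" value="${rep.home}/version"/>
    </FileSystem>
    <!-- Assumed choice: embedded Derby database for version history -->
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.DerbyPersistenceManager">
        <param name="url" value="jdbc:derby:${rep.home}/version/db;create=true"/>
        <param name="schemaObjectPrefix" value="version_"/>
    </PersistenceManager>
</Versioning>
```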

Security configuration

The security configuration element is used to specify authentication and authorization settings for the repository. The structure of the security configuration element is:

<Security appName="Jackrabbit">
    <SecurityManager .../>   <!-- optional, available since 1.5 -->
    <AccessManager .../>     <!-- mandatory until 1.4, optional since 1.5 -->
    <LoginModule .../>       <!-- optional -->
</Security>

By default Jackrabbit uses the Java Authentication and Authorization Service (JAAS) to authenticate users who try to access the repository. The appName parameter in the <Security/> element is used as the JAAS application name of the repository.

If JAAS authentication is not available or (as is often the case) too complex to set up, Jackrabbit allows you to specify a repository-specific JAAS LoginModule that is then used for authenticating repository users. The default SimpleLoginModule class included in Jackrabbit implements a trivially simple authentication mechanism that accepts any username and any password as valid authentication credentials.

Once a user has been authenticated, Jackrabbit will use the configured AccessManager to control what parts of the repository content the user is allowed to access and modify. The default SimpleAccessManager class included in Jackrabbit implements a trivially simple authorization mechanism that grants full read access to all users and write access to everyone except anonymous users. The slightly more advanced SimpleJBossAccessManager class was added in Jackrabbit 1.3 (see JCR-650). This class is designed for use with the JBoss Application Server, where it maps JBoss roles to Jackrabbit permissions.
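
A minimal security block using the two simple classes described above might look like the sketch below. The package names are assumed to match the Jackrabbit 1.4 DTD referenced earlier (later releases may use different package names), and the anonymousId value is only an example.

```xml
<Security appName="Jackrabbit">
    <!-- Read access for everyone, write access for non-anonymous users -->
    <AccessManager class="org.apache.jackrabbit.core.security.SimpleAccessManager"/>
    <!-- Accepts any username and password; not suitable for production use -->
    <LoginModule class="org.apache.jackrabbit.core.security.SimpleLoginModule">
        <!-- Name treated as the anonymous user (illustrative value) -->
        <param name="anonymousId" value="anonymous"/>
    </LoginModule>
</Security>
```

Because SimpleLoginModule accepts any credentials, a setup like this is useful for testing only.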

Workspace configuration

A Jackrabbit repository contains one or more workspaces, each configured in a separate workspace.xml configuration file. The Workspaces element of the repository configuration specifies where and how the workspaces are managed. The repository configuration also contains a default workspace configuration template that is used to create the workspace.xml file of a new workspace, unless a more specific configuration is given when the workspace is created. See the createWorkspace methods in the JackrabbitWorkspace interface for more details on creating workspaces.

The workspace settings in the repository configuration file are:

<!-- configRootPath and maxIdleTime are optional -->
<Workspaces rootPath="${rep.home}/workspaces"
            defaultWorkspace="default"
            configRootPath="..."
            maxIdleTime="..."/>
<Workspace .../>   <!-- default workspace configuration template -->

The following global workspace configuration options are specified in the Workspaces element:

  • rootPath: The native file system directory for workspaces. A subdirectory is automatically created for each workspace, and the path of that subdirectory can be used in the workspace configuration as the ${wsp.home} variable.
  • defaultWorkspace: Name of the default workspace. This workspace is automatically created when the repository is first started.
  • configRootPath: By default the configuration of each workspace is stored in a workspace.xml file within the workspace directory within the rootPath directory. If this option is specified, then the workspace configuration files are stored within the specified path in the virtual file system (see above) configured for the repository.
  • maxIdleTime: By default Jackrabbit only releases resources associated with an opened workspace when the entire repository is closed. This option, if specified, sets the maximum number of seconds that a workspace can remain unused before the workspace is automatically closed.
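
As a sketch of how these options combine, the following element keeps workspaces under the repository home and closes any workspace left unused for an hour. The one-hour value is an arbitrary illustration, not a recommended setting.

```xml
<!-- maxIdleTime is given in seconds: 3600 = one hour (illustrative value).
     configRootPath is omitted, so each workspace.xml stays in its own
     workspace directory under rootPath. -->
<Workspaces rootPath="${rep.home}/workspaces"
            defaultWorkspace="default"
            maxIdleTime="3600"/>
```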

The workspace configuration template and all workspace.xml configuration files have the following structure:

<Workspace name="${wsp.name}">
    <FileSystem .../>
    <PersistenceManager .../>
    <SearchIndex .../>           <!-- optional -->
    <ISMLocking .../>            <!-- optional, available since 1.4 -->
</Workspace>

The workspace configuration elements are:

  • FileSystem: The virtual file system passed to the persistence manager and search index.
  • PersistenceManager: Persistence configuration for workspace content. For more info, read http://wiki.apache.org/jackrabbit/PersistenceManagerFAQ.
  • SearchIndex: Configuration of the workspace search index.
  • ISMLocking: Locking configuration for concurrent access to workspace content.
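
Putting these elements together, a workspace template could look like the following sketch. It assumes the local file system, Derby bundle persistence manager and Lucene search index classes bundled with Jackrabbit; the db and index subdirectory names are illustrative, and the optional ISMLocking element is left out.

```xml
<Workspace name="${wsp.name}">
    <!-- Workspace files live directly under the workspace home directory -->
    <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
        <param name="path" value="${wsp.home}"/>
    </FileSystem>
    <!-- Assumed choice: embedded Derby database for workspace content -->
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.DerbyPersistenceManager">
        <param name="url" value="jdbc:derby:${wsp.home}/db;create=true"/>
        <param name="schemaObjectPrefix" value="${wsp.name}_"/>
    </PersistenceManager>
    <!-- Lucene-based search index stored next to the workspace data -->
    <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
        <param name="path" value="${wsp.home}/index"/>
    </SearchIndex>
</Workspace>
```

Since ${wsp.name} and ${wsp.home} are resolved per workspace, the same template can be reused for every new workspace without editing paths by hand.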

Warning: To modify the configuration of an existing workspace, you need to change the workspace.xml file of that workspace. Changing the <Workspace/> element in the repository configuration file will not affect existing workspaces.