(11 intermediate revisions by 3 users not shown) |
Line 1: |
Line 1: |
− | == Apache ==
| + | To extend its functionality, OpenKM can be integrated with external software that adds new features and improves the user experience. We are working to expand this list of applications, so stay tuned! |
− | Exposing OpenKM directly from JBoss can be dangerous if the application must be accessible from the Internet. Also, port 8080 may be blocked by a firewall. For these reasons, it is a good idea to expose your OpenKM installation through the standard web port 80. The following steps explain how to configure Apache to handle these requests and forward them to the JBoss application server using the AJP13 protocol.
| |
| | | |
− | From the Apache documentation: The AJP13 protocol is packet-oriented. A binary format was presumably chosen over the more readable plain text for reasons of performance. The web server communicates with the servlet container over TCP connections. To cut down on the expensive process of socket creation, the web server will attempt to maintain persistent TCP connections to the servlet container, and to reuse a connection for multiple request/response cycles.
| + | * [[Apache]] |
| + | * [[Nginx]] |
| + | * [[OCR]] |
| + | * [[OpenOffice.org]] |
| + | * [[SWFTools]] |
| + | * [[Antivirus]] |
| + | * [[Acme CAD Converter]] |
| | | |
− | The first step is to install the required Apache software. On Debian / Ubuntu you can install Apache with a single command:
| + | [[Category:Installation Guide]] |
− | | |
− | $ sudo aptitude install apache2
| |
− | | |
− | Edit the file called /etc/apache2/apache2.conf and configure a ServerName to prevent warnings in the Apache startup process:
| |
− | | |
− | <source lang="apache">
| |
− | ServerRoot "/etc/apache2"
| |
− | ServerName "your-domain.com"
| |
− | </source>
| |
− | | |
− | Enable the proxy module, needed to forward requests to JBoss:
| |
− | | |
− | $ sudo a2enmod proxy_ajp
| |
− | | |
− | Now create the configuration file /etc/apache2/sites-available/openkm.cfg with this content:
| |
− | | |
− | <source lang="apache">
| |
− | <VirtualHost *>
| |
− | ServerName openkm.your-domain.com
| |
− | RedirectMatch ^/$ /OpenKM
| |
− | <Location /OpenKM>
| |
− | ProxyPass ajp://127.0.0.1:8009/OpenKM
| |
− | ProxyPassReverse http://openkm.your-domain.com/OpenKM
| |
− | </Location>
| |
− | CustomLog /var/log/apache2/openkm-access.log combined
| |
− | </VirtualHost>
| |
− | </source>
| |
− | | |
− | The VirtualHost ServerName must differ from the ServerName in the main Apache configuration. Enable this site configuration:
| |
− | | |
− | $ sudo a2ensite openkm.cfg
| |
− | | |
− | You have to explicitly enable proxy access by editing the Apache configuration file ''/etc/apache2/mods-available/proxy.conf'':
| |
− | | |
− | <source lang="apache">
| |
− | <Proxy *>
| |
− | AddDefaultCharset off
| |
− | Order deny,allow
| |
− | Allow from all
| |
− | Deny from all
| |
− | #Allow from .example.com
| |
− | </Proxy>
| |
− | </source>
| |
− | | |
− | Finally restart Apache:
| |
− | | |
− | $ sudo /etc/init.d/apache2 restart
| |
− | | |
− | Now you can access your OpenKM installation from http://openkm.your-domain.com/. Another advantage of using Apache is that you can log OpenKM access and generate web statistics.
| |
− | | |
− | For more info, visit:
| |
− | * http://httpd.apache.org/docs/2.2/mod/mod_proxy.html
| |
− | * http://httpd.apache.org/docs/2.2/mod/mod_proxy_ajp.html
| |
− | | |
− | == OCR ==
| |
− | Tesseract is an open source OCR engine adopted by Google, and it works really well. It natively reads TIFF documents and achieves a high recognition rate on images scanned at 300 dpi and converted to line art (1-bit color).
| |
− | | |
− | You can download the source code from http://code.google.com/p/tesseract-ocr/ and compile it yourself. Also download the language files you need and uncompress them in the same folder as the application.
| |
− | | |
− | If you are using a computer with Debian / Ubuntu, the installation is much simpler:
| |
− | | |
− | $ aptitude install tesseract-ocr
| |
− | | |
− | And
| |
− | | |
− | $ aptitude install tesseract-ocr-eng
| |
− | | |
− | The second command adds support for the English language. Now you have to tell OpenKM to use this OCR application. Edit the file OpenKM.cfg:
| |
− | | |
− | $ vim OpenKM.cfg
| |
− | | |
− | And set the system.ocr property to the path of the tesseract executable:
| |
− | | |
− | <source lang="java">
| |
− | system.ocr=/usr/local/bin/tesseract
| |
− | </source>
| |
− | | |
− | For more info, go to http://code.google.com/p/tesseract-ocr/.
| |
− | | |
− | There is also another interesting free OCR application called OCRopus. It has many improvements over Tesseract but is at an early stage of development. The last released version (0.3.1) is quite usable and works very well, but it has to be compiled, which is currently a difficult task. Visit http://code.google.com/p/ocropus/ for more info.
| |
− | | |
− | == OpenOffice.org ==
| |
− | OpenKM can convert some document types to PDF. This is a great help if you need to read a Microsoft Office / OpenOffice.org document and don't have the software installed on your computer.
| |
− | | |
− | You need an OpenOffice.org installation on the OpenKM server, and it has to be running in server mode (also known as headless). On Debian / Ubuntu, depending on your OpenOffice.org version, you may have to install an X11 virtual server:
| |
− | | |
− | $ apt-get install xvfb
| |
− | | |
− | And start it using this command:
| |
− | | |
− | $ xvfb-run /usr/lib/openoffice/program/soffice -headless -accept="socket,host=127.0.0.1,port=8100;urp;" -nofirststartwizard
| |
− | | |
− | From OpenOffice.org 2.3 onward, the X11 virtual server is not necessary, but you should install these packages:
| |
− | | |
− | $ aptitude install openoffice.org-headless openoffice.org-java openoffice.org
| |
− | | |
− | But before doing this, you must enable a couple of repositories:
| |
− | | |
− | <source lang="text">
| |
− | deb http://en.archive.ubuntu.com/ubuntu/ hardy-updates universe
| |
− | deb http://en.archive.ubuntu.com/ubuntu/ hardy-updates multiverse
| |
− | </source>
| |
− | | |
− | This script simplifies the start process (for security reasons, you should not start OpenOffice.org as root):
| |
− | | |
− | <source lang="bash">
| |
− | #!/bin/sh
| |
− | unset DISPLAY
| |
− | /usr/lib/openoffice/program/soffice "-accept=socket,host=localhost,port=8100;urp;StarOffice.ServiceManager" -nologo -headless -nofirststartwizard
| |
− | </source>
| |
− | | |
− | OpenOffice.org will listen on port 8100, so you can check that the application has started by running:
| |
− | | |
− | $ netstat -putan | grep 8100
| |
− | | |
− | You can also configure OpenOffice.org as a service with this script:
| |
− | | |
− | <source lang="bash">
| |
− | #!/bin/bash
| |
− | # openoffice.org headless server script
| |
− | #
| |
− | # chkconfig: 2345 80 30
| |
− | # description: headless openoffice server script
| |
− | # processname: openoffice
| |
− | #
| |
− | # Author: Vic Vijayakumar
| |
− | # Modified by Paco Avila and Federico Ch. Tomasczik
| |
− | #
| |
− | SOFFICE=/usr/bin/soffice
| |
− | PIDFILE=/var/run/openoffice-server.pid
| |
− | set -e
| |
− | case "$1" in
| |
− | start)
| |
− | if [ -f $PIDFILE ]; then
| |
− | echo "OpenOffice headless server has already started."
| |
− | sleep 5
| |
− | exit
| |
− | fi
| |
− | echo "Starting OpenOffice headless server"
| |
− | $SOFFICE -headless -nologo -nofirststartwizard -accept="socket,host=127.0.0.1,port=8100;urp" > /dev/null 2>&1 &
| |
− | touch $PIDFILE
| |
− | ;;
| |
− | stop)
| |
− | if [ -f $PIDFILE ]; then
| |
− | echo "Stopping OpenOffice headless server."
| |
− | killall -9 soffice && killall -9 soffice.bin
| |
− | rm -f $PIDFILE
| |
− | exit
| |
− | fi
| |
− | echo "OpenOffice headless server is not running."
| |
− | exit
| |
− | ;;
| |
− | *)
| |
− | echo "Usage: $0 {start|stop}"
| |
− | exit 1
| |
− | esac
| |
− | exit 0
| |
− | </source>
| |
− | | |
− | Change the permissions to this file:
| |
− | | |
− | $ chmod 0755 /etc/init.d/openoffice
| |
− | | |
− | Install openoffice init script links:
| |
− | | |
− | $ update-rc.d openoffice defaults
| |
− | | |
− | This script will launch OpenOffice.org on every system boot. You can also launch it manually this way:
| |
− | | |
− | $ /etc/init.d/openoffice start
| |
− | | |
− | More info at:
| |
− | * http://www.artofsolving.com/node/10
| |
− | * http://www.oooforum.org/forum/viewtopic.phtml?t=11890
| |
− | * http://code.google.com/p/openmeetings/wiki/OpenOfficeConverter
| |
− | | |
− | == Antivirus ==
| |
− | OpenKM can check whether a submitted document is infected. It works with the open source antivirus ClamAV. Edit OpenKM.cfg and add this line:
| |
− | | |
− | <source lang="java">
| |
− | system.antivir=/path/to/clamscan
| |
− | </source>
| |
− | | |
− | This screenshot shows an error message from OpenKM because the submitted document is infected by a virus:
| |
− | | |
− | To install ClamAV on a Debian / Ubuntu distribution:
| |
− | | |
− | $ sudo aptitude install clamav
| |
− | | |
− | Installing ClamAV on CentOS 5.2 takes more work. First create a file named ''/etc/yum.repos.d/dag.repo'' with this content:
| |
− | | |
− | <source lang="text">
| |
− | [dag]
| |
− | name=Dag RPM Repository for Red Hat Enterprise Linux
| |
− | baseurl=http://apt.sw.be/redhat/el$releasever/en/$basearch/dag/
| |
− | gpgcheck=1
| |
− | gpgkey=http://dag.wieers.com/packages/RPM-GPG-KEY.dag.txt
| |
− | enabled=1
| |
− | </source>
| |
− | | |
− | Now install the program as root:
| |
− | | |
− | $ yum install clamd.i386
| |
− | | |
− | Start the daemon:
| |
− | | |
− | $ /etc/init.d/clamd start
| |
− | | |
− | And update the virus database:
| |
− | | |
− | $ freshclam
| |
To extend its functionality, OpenKM can be integrated with external software that adds new features and improves the user experience. We are working to expand this list of applications, so stay tuned!