Friday, October 8, 2010

Pentaho 3.6 Connection Pooling with MySQL using c3p0

This post is in relation to connection timeouts when using MySQL 5.x with Pentaho.  The default setting is for MySQL DB connections to timeout after 8 hrs, causing Tomcat to lose connections overnight, and resulting in strange errors within the Pentaho User Console the next day. One solution to this problem is to use c3p0 for connection pooling, and also to change some of the default connection settings for MySQL.

I had some issues setting this up correctly, and the steps outlined in the Pentaho Solutions book are slightly incorrect, so I figured I would document what I did to resolve this. I set this up on Ubuntu 10.04.

Download c3p0 for linux from sourceforge and follow the instructions below.

Copy c3p0-x.x.x.x.jar to the following locations (and verify proper permissions):

<pentaho_install_directory>/biserver-ce/tomcat/webapps/pentaho/WEB-INF/lib/
<pentaho_install_directory>/biserver-ce/tomcat/common/lib/
<pentaho_install_directory>/administration-console/lib/

File:

<pentaho_install_directory>/biserver-ce/pentaho-solutions/system/hibernate/mysql5.hibernate.cfg.xml

Edit as follows:

1. Insert the text below immediately before the <!-- mySQL configuration --> comment.

<!-- c3p0 connection pooling settings -->
<property name="connection.provider_class">org.hibernate.connection.C3P0ConnectionProvider
</property>
<property name="hibernate.c3p0.acquire_increment">3</property>
<property name="hibernate.c3p0.idle_test_period">14400</property>
<property name="hibernate.c3p0.min_size">5</property>
<property name="hibernate.c3p0.max_size">75</property>
<property name="hibernate.c3p0.max_statements">0</property>
<property name="hibernate.c3p0.timeout">25200</property>
<property name="hibernate.c3p0.preferredTestQuery">select 1</property>
<property name="hibernate.c3p0.testConnectionOnCheckout">true</property>

Monday, October 4, 2010

Startup Script for Pentaho 3.6 on Ubuntu 10.04

#!/bin/sh

# Startup script for Pentaho 3.6 BI Server and Administrative Console
# Usage: ./pentaho-init.sh start | stop | restart
# Based on original startup script provided by Roland Bouman & Jos van Dongen in Pentaho Solutions

# set up command for pentaho user, set java environment
cmd="sudo -u pentaho JAVA_HOME=/usr/lib/jvm/java-6-sun"
JAVA_OPTS="-Djava.awt.headless=true"

case "$1" in
start)
# go to the pentaho biserver directory, or biserver will not be able to find config files
cd /opt/pentaho/biserver-ce
# run startup script for pentaho biserver
$cmd ./start-pentaho.sh >> pentaho-biserver.log &
# go to pentaho admin console directory, or PAC will not be able to find config files
cd /opt/pentaho/administration-console
# run startup script for pentaho admin console
cd /opt/pentaho/administration-console
$cmd ./start-pac.sh >> pentaho-admin-console.log &
;;
stop)
# go to pentaho admin console directory, or PAC will not be able to find config files
cd /opt/pentaho/administration-console
# run shutdown script for pentaho admin console
$cmd ./stop-pac.sh >> pentaho-admin-console.log &
# go to the pentaho biserver directory, or biserver will not be able to find config files
cd /opt/pentaho/biserver-ce
# run shutdown script for pentaho biserver
$cmd ./stop-pentaho.sh >> pentaho-biserver.log &
;;
restart)
$0 stop
$0 start
;;

*)
echo "Usage: $0 {start|stop|restart}"
exit 1
esac

exit 0

Pentaho 3.6 Admin Console Configuration on Ubuntu 10.04

I thought I would post a note on the configuration of the Pentaho Administrative Console (PAC)... if you have modified the base-url property in the ../biserver-ce/tomcat/webapps/pentaho/WEB-INF/web.xml file, make sure that the entry is also contained in your /etc/hosts file. Otherwise you will get a very uninformative error message when trying to open the Pentaho Administrative Console, even within a browser on the actual host running the Pentaho server. The message I got was Unable to connect to BI Server. Confirm user credentials and web setting XML configuration. Alternatively, you may get an error message at the command line when executing the PAC startup script (../administration-console/start-pac.sh) such as Unable to resolve host <hostname> Seems a bit ridiculous to me, but that's what happens.

For example, my server name is jupiter, and it's running the Pentaho biserver on a local subnet. In order for browser clients from other hosts to connect to the Pentaho biserver, I modified the base-url property from the default of localhost to jupiter.

Related configuration file entries:

../biserver-ce/tomcat/webapps/pentaho/WEB-INF/web.xml

<context-param>
     <param-name>base-url</param-name>
     <param-value>http://jupiter:8080/pentaho/</param-value>
</context-param>

/etc/hosts

192.168.0.192     jupiter

This issue is covered in this thread in the Pentaho Community forums. I also recommend setting up the PAC server to run as a service at startup, in the same way as the Pentaho biserver.

Friday, October 1, 2010

Pentaho 3.6 on Ubuntu 10.04

I now have Pentaho 3.6 up and running on Ubuntu 10.04.  I'm running it on a Gateway Tower with an Intel Pentium 4 CPU @ 2.4GHz, with 2GB RAM, and 40GB hard drive.  Pentaho is performing well on this system, though I'm admittedly not running a lot of other services and haven't really taxed it yet.  We'll see how it does when I start to load some data on it.


Pentaho is a fairly straightforward install on Ubuntu, however I reconfigured it to run on MySQL instead of using the standard HSQLDB database that is bundled with the install package.  In addition to using the install and setup instructions in the Pentaho Solutions book, I also followed the MySQL 5.x guide located here by an Australian named Prashant Raju, which I found to be very useful.  I may convert it to PostgreSQL in the future, though for now I'm going to leave it as is.

A couple of details that I think are worth calling out within the installation and configuration steps:
  1. If you want remote hosts to be able to access the server and run reports you must change 'localhost' to a resolvable hostname or IP address in the base-url property of the web.xml file located in the ../biserver-ce/tomcat/webapps/pentaho/WEB-INF directory.  Otherwise remote hosts will be able to access the server but reports, dashboards, etc. will not execute and display properly.  This is covered in the on-line install guide I referenced above, but is an important step that is easy to miss.
  2. The JDBC pentaho security files and JDBC configuration files for the tomcat server have changed since the writing of Pentaho Solutions, follow the instructions in the on-line MySQL 5.x guide referenced above.
  3. I noticed that Pentaho/Tomcat did not react well when I was playing with the xml configuration files.  Although I bounced the server via the included start/stop scripts for Pentaho, I found that I sometimes needed to bounce the system after making config changes.  Otherwise I would get very strange errors in the browser when accessing and running reports, dashboards, charts, etc.
  4. I'd recommend caution if you're considering moving the ../biserver-ce/pentaho-solutions directory from it's standard location.  The on-line guide recommended doing this, I believe just for convenience of access, but it caused some issues for me and I changed it back to it's original location.
  5. I do recommend setting up a pentaho user account and running the pentaho and tomcat services under that account, and configuring the Pentaho BI server to start automatically at boot.  This is not covered within the online guide referenced above, but can be done in the standard way by creating a startup script and placing it into /etc/init.d and adding it to the services startup list.  This is covered well in the Pentaho Solutions book, and also in plenty of places online.  If you need help with this, here's the standard Ubuntu community doc link.
Well, I think that's it for now.