Step 1 - Get a nice clean Linux distribution focused on server use.
I am using Ubuntu JeOS, as it will be hosted on a VMware virtual machine. This also means that the walkthrough that follows will be very Debian-specific.
See the following pages for help:
This page aims at documenting how to create virtual appliance using Ubuntu Server Edition's JeOS.
This page has snapshots of the entire install process and also information on how to set up a LAMP stack, but if you want to follow this guide, don't install any of the applications it asks about. It is only useful for our purposes up until the first reboot, before the guide does things we don't need, like activating the root user account or adding PHP.
Note that the user name I will be using in this guide is simply 'user' and I will refer to this as either 'user' or 'username'. Replace this with whatever the username was that you chose during installation.
Step 2 - install it and set up networking and firewalls
Now, firewalls aren't as critical as you might think, especially if you have installed something like JeOS which has nothing really running as default. But there are some very handy tricks to help stop abuse from malicious script kiddies.
(For example, my favorite two-liner, which I always add to the iptables firewall script, rate-limits new ssh connection attempts to 3 every 180 seconds. Note that the following doesn't immediately ACCEPT the connection; it passes it into an iptables chain called 'TRUSTED', which deals with what may be genuine attempts at access. If you wanted to just accept it, change TRUSTED to ACCEPT.)
# Rate limit SSH attempts.
iptables -A INPUT -p tcp -m tcp --dport ssh -m state --state NEW \
-m recent --hitcount 3 --seconds 180 --update -j DROP
# Allow first attempts through
iptables -A INPUT -p tcp -m tcp --dport ssh -m state --state NEW \
-m recent --set -j TRUSTED
NB Something along these lines would be fine to rate-limit upload attempts to Fedora as well.
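If the interplay of the two rules isn't obvious, here is a toy model of it in Python. It is only a sketch of how I understand the 'recent' match to behave (the first rule drops and refreshes the timestamp once a source has 3 hits inside the window, otherwise the second rule records the attempt and passes it on). The class and method names are made up for illustration, and the real module also caps how many timestamps it remembers per address, which is ignored here.

```python
from collections import defaultdict


class RecentList:
    """Toy model of the iptables 'recent' match as used by the two rules above."""

    def __init__(self, hitcount=3, seconds=180):
        self.hitcount = hitcount
        self.seconds = seconds
        self.stamps = defaultdict(list)  # source address -> timestamps seen

    def new_connection(self, source, now):
        """Return True if the NEW packet is passed on, False if it is dropped."""
        window = [t for t in self.stamps[source] if t > now - self.seconds]
        if len(window) >= self.hitcount:
            # Rule 1 matched: --update refreshes the timestamp, then DROP.
            self.stamps[source].append(now)
            return False
        # Rule 2: --set records the attempt, then the packet goes to TRUSTED.
        self.stamps[source].append(now)
        return True


limiter = RecentList()
# Five attempts from one (documentation-range) address, ten seconds apart.
results = [limiter.new_connection("203.0.113.9", t) for t in (0, 10, 20, 30, 40)]
print(results)  # [True, True, True, False, False]
```

Because a dropped attempt also refreshes the timestamp, a script that keeps hammering away stays locked out; it has to go quiet for the full 180 seconds before it gets another go.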
[Edit: seems there was something awry with the following script - as illustrated here. Thanks to the Rubric team. I've added their fix :) But their fix may not be enough, so I'd advise not applying this until everything is installed and working correctly! I'll test it out as soon as I can. ]
Full example firewall script, including this snippet and port opening for tomcat, http/https, and ssh. (As this is from my home server, it'll include a few other services that you may not need for this walkthrough).
If you wish to use the SSL connection to Fedora on port 8443, remember to open that port as well!
Step 3 - Install all updates and get the basic applications we will need
Get a root prompt on a commandline:
e.g.
[user@server]$ sudo -s
[root@server]#
Then make sure that a) you can connect to the internet and b) the server is up to date:
[root@server]# apt-get update
[..... lots of lines of stuff ....]
Hit http://gb.archive.ubuntu.com gutsy-updates/multiverse Sources [1708B]
Fetched 278kB in 0s (319kB/s)
Reading package lists... Done
[root@server]# apt-get upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be upgraded:
[ whatever packages that need to be upgraded will be listed here]
X upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 2497kB of archives.
After unpacking 16.4kB of additional disk space will be used.
Do you want to continue [Y/n]? Y
[ .... lots of lines of packages installing hopefully without error ..... ]
Hopefully, once those are installed, your machine will be up to date. Now to install all the necessary packages:
[root@server]# apt-get install build-essential python-dev mysql-server sun-java5-jdk openssh-server python-mysqldb python-pysqlite2
Just let those install, but be aware that certain packages should ask you for information during installation, such as the required default root password for MySQL and a prompt will ask you if you agree with the Sun licence for Java.
Also, note that at this point, you should be able to SSH into the machine to continue working on it. It makes it a lot easier to cut and paste from guides if you do!
Step 4 - install some python libraries using Easy Install
Go here: http://peak.telecommunity.com/DevCenter/EasyInstall and download and run the ez_setup.py script as it shows. Simply running it as root should do the trick:
[Edit: I did originally write the first part of this for Fedora 2.2, but since the REST api for Fedora 3.0 looks pretty damn usable, I've re-written this guide for version 3. Removing the installation/configuration for the SOAP client drastically reduces the library dependencies this needs.]
[root@server]# wget http://peak.telecommunity.com/dist/ez_setup.py
[root@server]# python ez_setup.py
So, we need to install some python libraries for later, iCalendar format (vobject) , OpenID consumer library (python-openid), and also install other miscellaneous things, such as a library that can generate UUIDs and a very good web framework called Pylons:
[root@server]# easy_install python-openid
[root@server]# easy_install uuid
[root@server]# easy_install vobject
[root@server]# easy_install pylons
(NB we have already installed the python libraries to interact with MySQL and SQLite with the apt-get install command earlier. It is best to install the latest stable packages for the items above, which is why they are installed through easy_install.)
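A quick way to confirm the four easy_install commands did their job is to ask Python whether it can find the modules. This is a minimal sketch, assuming a current Python 3 interpreter rather than the 2.5 that ships with Gutsy, and assuming the usual module names (python-openid provides 'openid'); the helper name is just for illustration.

```python
import importlib.util

# Package name given to easy_install -> module name it is expected to provide.
EXPECTED_MODULES = {
    "python-openid": "openid",
    "uuid": "uuid",
    "vobject": "vobject",
    "pylons": "pylons",
}


def find_missing(modules=EXPECTED_MODULES):
    """Return the package names whose module cannot be located."""
    missing = []
    for package, module in sorted(modules.items()):
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing


print("Missing packages:", find_missing() or "none")
```

Anything it lists is worth re-running through easy_install before carrying on.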
Step 5 - Get Fedora-Commons and Apache Solr.
Either just blindly download the packages I tell you to:
[user@server]$ wget http://downloads.sourceforge.net/fedora-commons/fedora-3.0b1-installer.jar
[user@server]$ wget http://apache.rmplc.co.uk/lucene/solr/1.2/apache-solr-1.2.0.tgz
Or better, download them from the homepages of the projects themselves, using links2.
Install this text-based web browser, then browse to and download the packages that way (A manual page for links2):
[root@server]# apt-get install links2
When that has finished installing, you can drop out of your root session (press Ctrl+D, or type 'exit') and download the relevant applications:
[root@server]# exit
[user@server]$ links2
Don't be alarmed, it's meant to blank the screen! Press the letter 'g' and a location bar prompt will appear.
First let's go to the Fedora commons site so type in 'http://www.fedora-commons.org/' and press enter. Use the cursor keys to go down and click (press return) on the 'Download Fedora 3.0 beta 1' link (24/02/2008). Scroll down a bit, and you should see a link to download the installer. You will be presented with the 'save jar file' dialog, so save the fedora installer jar file.
Now, let's get the search appliance, Solr. Go to 'http://lucene.apache.org/solr/' and click on the 'download' link. Choose a mirror, go into the 1.2/ folder on that mirror and download the 'apache-solr-1.2.0.tgz' file. Press 'q' to quit links2.
Step 6 - Make the server environment ready for Fedora Commons
If you now list the home directory, you should see something like this:
[user@server]:~$ ls
apache-solr-1.2.0.tgz doc ez_setup.py fedora-3.0b1-installer.jar
We will need the following:
- A directory to store Fedora's root directory (config files, logs, libraries, and default Tomcat instance)
- A mysql database and account for Fedora to use
- (Optional) A large filesystem to hold Fedora's data storage directory
[user@server]$ sudo -s
[root@server]# mkdir /opt/fedora30b1
Let the user own it: (Remember to change 'user' to whatever your user is actually called!)
[root@server]# chown user:user /opt/fedora30b1
(Optional) And to aid upgrading, create a symlink at /opt/fedora to this folder:
[root@server]# ln -s /opt/fedora30b1 /opt/fedora
Fedora needs certain environment variables to be set up now, FEDORA_HOME and JAVA_HOME at the very least. Open up the system wide profile (/etc/profile) and add them in there. (I'm using the nano editor, vim is also available from a default JeOS install.)
[root@server]# nano -w /etc/profile
And add the following lines to the end of the file (also, note that there *must not* be any gaps either side of the '=' character, as tempting as it might be to press space to space it out to look better.):
# If you did not create the symlink, just point directly at your Fedora root
FEDORA_HOME=/opt/fedora30b1
# or if you did do the 'ln -s ...' step, use this instead:
FEDORA_HOME=/opt/fedora
export FEDORA_HOME
# If you did not create the symlink, just point directly at your tomcat root
CATALINA_HOME=/opt/fedora30b1/tomcat
# or if you did do the 'ln -s ...' step, use this instead:
CATALINA_HOME=/opt/fedora/tomcat
export CATALINA_HOME
JAVA_HOME=/usr/lib/jvm/java-1.5.0-sun
export JAVA_HOME
Save the file and exit (in nano, Ctrl-O writes the file and Ctrl-X exits, offering to save first).
Now, to check that this has worked, type the command 'exit' a few times to logout and then log back in again as your default user. If things have worked well, the following commands should work:
[user@server]$ echo $FEDORA_HOME
/opt/fedora30b1
[Or '/opt/fedora' depending on what you chose.]
[user@server]$ echo $JAVA_HOME
/usr/lib/jvm/java-1.5.0-sun
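If you'd rather check all three variables in one go, something like the following would do. It is a small sketch, assuming Python 3, and the function name is my own invention; it simply reports any variable that is unset or that doesn't point at a real directory.

```python
import os

REQUIRED = ("FEDORA_HOME", "CATALINA_HOME", "JAVA_HOME")


def check_env(environ=None, required=REQUIRED):
    """Return a list of human-readable problems; an empty list means all is well."""
    environ = os.environ if environ is None else environ
    problems = []
    for name in required:
        value = environ.get(name)
        if not value:
            problems.append("%s is not set" % name)
        elif not os.path.isdir(value):
            problems.append("%s points at %s, which is not a directory" % (name, value))
    return problems


for problem in check_env():
    print(problem)
```

Note that a complaint about CATALINA_HOME is expected at this point, since the tomcat directory only appears once the Fedora installer has run in Step 7.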
Now to sort out MySQL. Remember that default root password you set for MySQL? You'll need it now.
[user@server]$ mysql -uroot -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 10
Server version: 5.0.45-Debian_1ubuntu3.1-log Debian etch distribution
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql>
Now issue the following commands:
mysql> create database fedora30;
Query OK, 1 row affected (0.00 sec)
mysql> grant all on fedora30.* to 'fedoraAdmin'@'localhost' identified by 'PUTYOURPASSWORDHERE';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
mysql> ALTER DATABASE fedora30 DEFAULT CHARACTER SET utf8;
Query OK, 1 row affected (0.00 sec)
mysql> ALTER DATABASE fedora30 DEFAULT COLLATE utf8_bin;
Query OK, 1 row affected (0.00 sec)
mysql> exit
Bye
[user@server]$
(NB You may or may not need to add the utf-8 configuration lines for your particular version of MySQL, but as far as I know, the commands are harmless if you don't need them and utterly crucial if you do. Well, crucial unless you are dealing purely with ascii, but could you really guarantee that?)
Step 7 - install Fedora commons 3.0b1
(Note - Official installation guide is here)
Go to the location where you saved the fedora installer, probably the user's home directory and run the installer. I'll include the entire installation dialog here. Where the response is blank, I simply pressed enter to accept the default.
[user@server]$ cd /home/user
[user@server]$ java -jar fedora-3.0b1-installer.jar
***********************
Fedora Installation
***********************
To install Fedora, please answer the following questions.
Enter CANCEL at any time to abort the installation.
Detailed installation instructions are available at:
http://www.fedora.info/download/
Installation type
-----------------
The 'quick' install is designed to get you up and running with Fedora
as quickly and easily as possible. It will install Tomcat and an
embedded version of the McKoi database. SSL support and XACML policy
enforcement will be disabled.
For more options, including the choice of hostname, ports, security,
and databases, select 'custom'.
To install only the Fedora client software, enter 'client'.
Options : quick, custom, client
Enter a value ==> custom
Fedora home directory
---------------------
This is the base directory for Fedora scripts, configuration files, etc.
Enter the full path where you want to install these files.
Enter a value [default is /opt/fedora] ==>
Fedora administrator password
-----------------------------
Enter the password to use for the Fedora administrator (fedoraAdmin) account.
Enter a value ==> PUTTHEPASSWORDYOUDLIKEHERE
Fedora server host
------------------
The host Fedora will be running on.
If a hostname (e.g. www.example.com) is supplied, a lookup will be
performed and the IP address of the host (not the host name) will be used
in the default Fedora XACML policies.
Enter a value [default is localhost] ==>
Authentication requirement for API-A
------------------------------------
Fedora's management (API-M) interface always requires user authentication.
Require user authentication for Fedora's access (API-A) interface?
Options : true, false
Enter a value [default is false] ==>
SSL availability
----------------
Should Fedora be available via SSL? Note: this does not preclude
regular HTTP access; it just indicates that it should be possible for
Fedora to be accessed over SSL.
Options : true, false
Enter a value [default is true] ==>
SSL required for API-A
----------------------
Should API-A be accessible exclusively via SSL? If true, requests
to access API-A URLs will be automatically redirected to the secure port.
Options : true, false
Enter a value [default is false] ==>
SSL required for API-M
----------------------
Should API-M be accessible exclusively via SSL? If true, requests
to access API-M URLs will be automatically redirected to the secure port.
Options : true, false
Enter a value [default is true] ==> false
Servlet engine
--------------
Which servlet engine will Fedora be running in?
Enter 'included' to use the bundled Tomcat 5.5.23 server.
To use your own, existing installation of Tomcat, enter 'existingTomcat'.
Enter 'other' to use a different servlet container.
Options : included, existingTomcat, other
Enter a value [default is included] ==> included
Tomcat home directory
---------------------
Please provide the full path to your existing Tomcat installation, or
the path where you plan to install the bundled Tomcat.
Enter a value [default is /opt/fedora/tomcat] ==>
Tomcat HTTP port
----------------
Which HTTP port (non-SSL) should Tomcat listen on? This can be changed
later in Tomcat's server.xml file.
Enter a value [default is 8080] ==>
Tomcat shutdown port
--------------------
Which port should Tomcat use for shutting down? Make sure this doesn't
conflict with an existing service. This can be changed later in Tomcat's
server.xml file.
Enter a value [default is 8005] ==>
Tomcat Secure HTTP port
-----------------------
Which port (SSL) should Tomcat listen on? This can be changed
later in Tomcat's server.xml file.
Enter a value [default is 8443] ==>
Keystore file
-------------
For SSL support, Tomcat requires a keystore file.
If the keystore file is located in the default location expected by
Tomcat (a file named .keystore in the user home directory under which
Tomcat is running), enter 'default'.
Otherwise, please enter the full path to your keystore file, or, enter
'included' to use the the sample, self-signed certificate) provided by
the installer.
For more information about the keystore file, please consult:
http://tomcat.apache.org/tomcat-5.5-doc/ssl-howto.html.
Enter a value ==> included
Policy enforcement enabled
--------------------------
Should XACML policy enforcement be enabled? Note: This will put a set of
default security policies in play for your Fedora server.
Options : true, false
Enter a value [default is true] ==> false
Enable Resource Index
---------------------
Enable the Resource Index?
Options : true, false
Enter a value [default is false] ==> true
Enable REST-API
---------------
Enable the REST-API? The REST-API is an EXPERIMENTAL feature that exposes
the Fedora API with a REST-style interface. In particular, URL endpoints
should not be considered final, nor has policy enforcement been evaluated.
For more information about the REST-API, see
http://www.fedora.info/wiki/index.php/RESTful_Fedora_Proposal
Options : true, false
Enter a value [default is false] ==> true
Database
--------
Please select the database you will be using with
Fedora. The supported databases are McKoi, MySQL, Oracle and Postgres.
If you do not have a database ready for use by Fedora or would prefer to
use the embedded version of McKoi bundled with Fedora, enter 'included'.
Options : mckoi, mysql, oracle, postgresql, included
Enter a value ==> mysql
MySQL JDBC driver
-----------------
You may either use the included JDBC driver or your own copy.
Enter 'included' to use the included JDBC driver, or, enter the location
(full path) of the driver.
Enter a value [default is included] ==>
Database username
-----------------
Enter the database username Fedora will use to connect to the Fedora database.
Enter a value ==> fedoraAdmin
Database password
-----------------
Enter the database password Fedora will use to connect to the Fedora database.
Enter a value ==> PUTYOURDBPASSWORDHERE
JDBC URL
--------
Please enter the JDBC URL.
Enter a value [default is jdbc:mysql://localhost/fedora30?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true] ==>
JDBC DriverClass
----------------
Please enter the JDBC driver class.
Enter a value [default is com.mysql.jdbc.Driver] ==>
Successfully connected to MySQL
Deploy local services and demos
-------------------------------
Several sample back-end services are included with this distribution.
These are required if you want to use the demonstration objects.
If you'd like these to be automatically deployed, enter 'true'.
Otherwise, the installer will put the files in your FEDORA_HOME/install
directory in case you want to deploy them later.
Options : true, false
Enter a value [default is true] ==>
Preparing FEDORA_HOME...
Configuring fedora.fcfg
Installing beSecurity
Installing Tomcat...
Preparing fedora.war...
Processing web.xml
Deploying fedora.war...
Deploying fop.war...
Deploying imagemanip.war...
Deploying saxon.war...
Deploying fedora-demo.war...
Installation complete.
----------------------------------------------------------------------
Before starting Fedora, please ensure that any required environment
variables are correctly defined
(e.g. FEDORA_HOME, JAVA_HOME, JAVA_OPTS, CATALINA_HOME).
For more information, please consult the Installation & Configuration
Guide, located online at
http://www.fedora.info/download/ or locally at
/opt/fedora/docs/userdocs/distribution/installation.html
----------------------------------------------------------------------
And that should merrily go away and install and set up Fedora and the bundled Tomcat server for you. Unlike other services you may install, this won't start the Fedora service, nor will it create a handy startup/shutdown script that integrates with your Linux startup scripts in /etc/init.d. We will create one later on.
Step 8 - Further configuration of Fedora 3.0
!IMPORTANT! Fix the broken 'mail.jar' library! (Broken, as in the REST api will not work correctly with the version released in 3.0b1)
Get it from here: http://python-fedoracommons-webarchive.googlecode.com/files/mail.jar and use it to replace the mail.jar found at $FEDORA_HOME/tomcat/webapps/fedora/WEB-INF/lib/mail.jar. Restart Tomcat if you need to.
I am keen on UUIDs, and I cannot see a good reason for not using them. I suggest using the fedora id 'namespace' of uuid, so that a fedora URI will look like <info:fedora/uuid:d3733f61-1083-4a3e-b914-5a853c42189b>
It is also trivial to generate these in python, consider the following code:
[user@server]$ python
Python 2.5.1 (r251:54863, Oct 5 2007, 13:36:32)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from uuid import uuid4
>>> uuid4().urn[4:]
'uuid:d3733f61-1083-4a3e-b914-5a853c42189b'
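For use in scripts later on, the same trick can be wrapped up in a couple of helpers. This is just a sketch, assuming Python 3; the function names are mine and not part of any Fedora library.

```python
from uuid import UUID, uuid4

FEDORA_URI_PREFIX = "info:fedora/"


def new_pid():
    """Return a fresh PID in the 'uuid' namespace, e.g. 'uuid:d3733f61-...'."""
    return uuid4().urn[4:]  # strip the leading 'urn:' from 'urn:uuid:...'


def pid_to_uri(pid):
    """Return the info:fedora URI for a PID."""
    return FEDORA_URI_PREFIX + pid


def is_uuid_pid(pid):
    """True if pid looks like 'uuid:' followed by a well-formed UUID."""
    namespace, _, value = pid.partition(":")
    if namespace != "uuid":
        return False
    try:
        return str(UUID(value)) == value.lower()
    except ValueError:
        return False


pid = new_pid()
print(pid, pid_to_uri(pid), is_uuid_pid(pid))
```

The validation helper is handy for rejecting hand-typed PIDs before they ever reach Fedora.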
To get Fedora to accept these though, the 'uuid' namespace needs to be added to the retainPIDs parameter in Fedora's configuration file.
[user@server]$ nano -w /opt/fedora/server/config/fedora.fcfg
Press Ctrl-W and search for retainPIDs. Add uuid to the list of namespaces (the ordering is not important):
<param name="retainPIDs" value="demo uuid test changeme ...
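To double-check the edit without squinting at the file, you could parse it and list the retained namespaces. The sketch below assumes Python 3 and assumes the value is a simple space-separated list, as in the line above; the sample document is a hypothetical, heavily trimmed stand-in for the real fedora.fcfg, and the function name is made up.

```python
import xml.etree.ElementTree as ET


def retained_namespaces(fcfg_text):
    """Return the namespaces listed in the retainPIDs param of a fedora.fcfg document."""
    root = ET.fromstring(fcfg_text)
    for element in root.iter():
        # Compare on the local tag name so an XML namespace on the file does not matter.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "param" and element.get("name") == "retainPIDs":
            return element.get("value", "").split()
    return []


# Hypothetical, heavily trimmed stand-in for the real file.
SAMPLE = """<server>
  <module role="example">
    <param name="retainPIDs" value="demo uuid test changeme"/>
  </module>
</server>"""

print("uuid" in retained_namespaces(SAMPLE))  # True
```

Against the real thing, read the file in binary mode, e.g. retained_namespaces(open("/opt/fedora/server/config/fedora.fcfg", "rb").read()).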
Step 9 - Installing Solr
[NB you will only have to follow the guide below, but here are the official docs, should you get in trouble
http://wiki.apache.org/solr/SolrInstall - Basic installation
http://wiki.apache.org/solr/SolrTomcat - Tomcat specific things to bear in mind]
Extract the whole archive somewhere on disc (for example, run 'tar xzf apache-solr-1.2.0.tgz' in your home directory) and you will see something like this in the apache-solr-1.2.0 folder:
~/apache-solr-1.2.0$ ls
build.xml CHANGES.txt dist docs example KEYS.txt lib LICENSE.txt NOTICE.txt README.txt src
~/apache-solr-1.2.0$ ls dist
apache-solr-1.2.0.jar apache-solr-1.2.0.war
The easiest thing is to install Solr straight into the instance of Tomcat that Fedora has installed. One thing to be aware of is that search applications eat RAM and heap for breakfast, so make sure you install it onto a server with plenty of RAM. It would also be wise to increase the amount of heap space available to the Tomcat instance by making sure that the environment variable CATALINA_OPTS is set to "-Xmx512m". You can do this inside the catalina.sh script in your /opt/fedora/tomcat/bin directory.
[i.e. just add CATALINA_OPTS="-Xmx512m" at the beginning of the file if it doesn't already exist.]
One final bit of advice before I point you at the rather good installation docs is that you might want to rename the .war file to match with the URL pathname you desire, as the guide relies on Tomcat automatically unpacking the archive:
So, a war called "apache-solr-1.2.0.war" will result in the final app being accessible at http://tomcat-hostname:8080/apache-solr-1.2.0/. We will rename ours when we copy it into Tomcat's webapps directory.
Finally, Solr needs a place to keep its configuration files and its indexes. The indexes themselves have the capability to get huge (1GB is not unheard of) and need somewhere to be stored. The official documentation linked above refers to this location as 'your solr home', so it would be wise to make sure that this location has the space to expand. (NB this is not the directory inside Tomcat where the application was unbundled.)
So, let's create a solr home in /opt as we did for fedora (NB change user):
[user@server]$ sudo -s
[root@server]# mkdir /opt/solr
[root@server]# chown user:user /opt/solr
Place the solr.war into Fedora's Tomcat instance:
[root@server]# exit
[user@server]$ pwd
/home/user/apache-solr-1.2.0
[user@server]$ cp dist/apache-solr-1.2.0.war $CATALINA_HOME/webapps/solr.war
Finally, we have to make sure a variable is available in Tomcat's environment; the location of the Solr home directory. Remember that CATALINA_OPTS line we added before? Amend that now to look like:
(E.g. via nano -w $CATALINA_HOME/bin/catalina.sh )
CATALINA_OPTS="-Xmx512m -Dsolr.solr.home=/opt/solr"
Now, as we will shape the Solr search service later on (i.e. choosing the fields to be indexed, and how to index them for faceted searching) we will just copy across the basic solr example, to make sure everything is running fine.
[Make sure you are in the unpacked solr directory:]
[user@server]$ pwd
/home/user/apache-solr-1.2.0
[user@server]$ cp -a example/solr/* /opt/solr
[user@server]$ ls /opt/solr
bin conf README.txt
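If you want a slightly stronger check than eyeballing the ls output, the following sketch looks for the two configuration files that the Solr 1.2 example ships in its conf directory (solrconfig.xml and schema.xml). It assumes Python 3, and the helper name is my own.

```python
import os

# Files the Solr 1.2 example configuration is expected to provide.
EXPECTED_FILES = ("conf/solrconfig.xml", "conf/schema.xml")


def missing_solr_files(solr_home, expected=EXPECTED_FILES):
    """Return the expected files that are absent from solr_home."""
    return [
        relative
        for relative in expected
        if not os.path.isfile(os.path.join(solr_home, *relative.split("/")))
    ]


# An empty list means the copy into the Solr home worked.
print(missing_solr_files("/opt/solr"))
```

If either file is reported missing, re-run the cp command from inside the unpacked apache-solr-1.2.0 directory.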
Adding HTTP authentication to Solr update
First add a username/password to tomcat/conf/tomcat-users.xml:
<tomcat-users>
...
<user username="solradmin" password="XXXXXXXX" roles="solradmin"/>
...
</tomcat-users>
Then, in your Solr context, in tomcat/webapps/solr/WEB-INF/web.xml, add the following:
<web-app>
.... usual stuff ....
<security-constraint>
<web-resource-collection>
<web-resource-name>
SolrUpdate
</web-resource-name>
<url-pattern>/update/*</url-pattern>
</web-resource-collection>
<auth-constraint>
<role-name>solradmin</role-name>
</auth-constraint>
</security-constraint>
<!-- Define the Login Configuration for this Application -->
<login-config>
<auth-method>BASIC</auth-method>
<realm-name>Auth needed</realm-name>
</login-config>
</web-app>
NB BASIC authentication sends the password over the wire effectively in plain text (it is only base64-encoded), so this isn't too great, but it is suitable for a localhost updater. Change this to DIGEST to increase the security, but bear in mind you may need to set the Realm for the Tomcat container and the Digest hash mechanism (SHA1, MD5, etc).
(Some good guides to securing Tomcat services are but a Google search away - for example: http://www.unidata.ucar.edu/projects/THREDDS/tech/reference/TomcatSecurity.html )
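Once the update handler is protected, whatever posts to it has to send the credentials along. Here is a minimal sketch of what a localhost updater might do, assuming Python 3's urllib, the solradmin user from tomcat-users.xml, and the /solr/update path that results from naming the war solr.war. The function name and the placeholder password are mine. It only builds the request; nothing is sent until you pass it to urllib.request.urlopen().

```python
import base64
import urllib.request


def build_update_request(update_url, username, password, xml_body):
    """Build (but do not send) a POST to Solr's update handler with BASIC credentials."""
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(update_url, data=xml_body.encode("utf-8"))
    request.add_header("Authorization", "Basic " + token)
    request.add_header("Content-Type", "text/xml; charset=utf-8")
    return request


# Placeholder credentials; use the ones you put in tomcat-users.xml.
request = build_update_request(
    "http://localhost:8080/solr/update", "solradmin", "XXXXXXXX", "<commit/>"
)
print(request.get_method(), request.full_url)
```

Without the Authorization header, the same POST should come back as a 401 once the security constraint is in place.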
Step 10 - Test your foundation
Now, we need to start up Fedora, and hopefully, it will all go smoothly:
[user@server]$ cd /opt/fedora/tomcat/bin/
[user@server]$ ./startup.sh
Using CATALINA_BASE: /opt/fedora/tomcat
Using CATALINA_HOME: /opt/fedora/tomcat
Using CATALINA_TMPDIR: /opt/fedora/tomcat/temp
Using JRE_HOME: /usr/lib/jvm/java-1.5.0-sun
Now try these links:
http://localhost:8080/fedora/search
http://localhost:8080/fedora/describe - make sure 'uuid' is one of the retainPIDs
http://localhost:8080/solr/admin Should look like a whole heap of options and bells and whistles.
Any 404 or 500 Server errors mean that something has come unstuck. But, if you've followed this guide using Ubuntu Gutsy, you should be all set without a problem - I just followed it on my home computer without a hitch :)
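As the JeOS box has no graphical browser, it can be easier to poke the three links from a script. The following is a rough sketch, assuming Python 3 and the default port 8080 chosen during installation; the function names are made up, and the fetch function is passed in as a parameter so the logic can be tried out without a running server.

```python
import urllib.error
import urllib.request

URLS = (
    "http://localhost:8080/fedora/search",
    "http://localhost:8080/fedora/describe",
    "http://localhost:8080/solr/admin",
)


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.getcode()
    except urllib.error.HTTPError as error:
        return error.code  # e.g. 404 or 500
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, and so on


def smoke_test(urls=URLS, fetch=fetch_status):
    """Map each URL to True when it answered with a 2xx status."""
    results = {}
    for url in urls:
        status = fetch(url)
        results[url] = status is not None and 200 <= status < 300
    return results
```

Calling smoke_test() once Tomcat has had a moment to start should give True for all three; it won't tell you whether 'uuid' made it into the retainPIDs, so that part of the describe page still wants a look by eye.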
Next up, we are going to build a pylons interface to do basic CRUD type functionality with the ability to link together items semantically, using the SIOC project's (http://sioc-project.org/ontology) namespace, at http://rdfs.org/sioc/ns#