In my spare time I work on the eXtensible Catalog Drupal Toolkit project. It is a next-generation open source library* discovery interface, i.e. the end-user interface of a catalog. We have three types of Drupal projects: a set of Drupal modules (more than 20), a theme, and a distribution with an installation profile. All three projects are stored permanently in drupal.org's Git repository. If you want to try out the Toolkit, it is now quite simple, thanks to the distribution. This post is about the practices I follow when developing this distribution.
In normal module/theme development I set up a new Drupal site, go to the modules and themes directories, check out the Git repos, and start working. When I fix a problem, I commit it and push it back to the shared repository, so other developers can use my code. In distribution development, however, we work with a Drupal instance that has no Git directories inside. Our main task this time is to check whether there are issues during the installation process.
The eXtensible Catalog is quite unusual in several respects. It has to handle millions of records (usually every reading material has a node, and since we work with the FRBR module, we have 4 other record types, so an academic library with 3 million items produces 12 million metadata records, 3 million nodes, and 3 million Solr documents). We don't use CCK or the field/entity framework (we created our own solution, but that is not the topic of this entry). One of our design principles was that librarians should be able to modify (almost) every aspect of the catalog. The consequence of all this is that much of the business logic is described in the database, and behind the relatively small number of end-user pages we have dozens of admin pages that govern the display and behaviour of the end-user pages (this is the biggest difference between XC and my other project, Europeana, where the business logic is described in Java code). If you had to fill in these admin screens from scratch, you would probably give up before reaching the end of this huge task, so we provide default values. These default values are stored in comma-separated values or XML files, and are injected as part of the installation process of the individual submodules.
Some modules depend on others. For example, we have an xc_metadata module to handle metadata management, and some metadata schema modules: one for the RDA-like XC schema, and another for the Dublin Core schema. When the XC schema module is installed, it calls the metadata module's API to store and register the XC schema in the system, so the order of installation is crucial in this case.
The distribution project does two things:
- with the help of Drush make, it selects and downloads the components a library needs to set up a fully working XC Drupal instance, and
- it has an installation profile, which installs these components in the right order and sets some site-wide values, like the default theme and the name of the site. It also prepares the harvester to fetch some demo data from another XC component (the Metadata Services Toolkit), so the library has something to start working with.
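To make the first point more concrete, here is a minimal sketch of what a Drush make file for such a distribution might look like, written out by a small shell function. The project names (xc, xc_theme) and the version numbers are assumptions for illustration only; they are not copied from the real make file of the distribution.

```shell
#!/usr/bin/env bash
# Sketch only: writes a minimal Drush make file to the given path.
# The project names and versions below are illustrative assumptions.
write_makefile() {
  makefile=$1
  cat > "$makefile" <<'EOF'
; Drush make file (INI-like syntax)
core = 7.x
api = 2

; Hypothetical module project, pinned to a tagged release, not to -dev.
projects[xc][version] = 1.0-alpha3

; Hypothetical theme project.
projects[xc_theme][type] = theme
projects[xc_theme][version] = 1.0-alpha3
EOF
}
```

After something like `write_makefile xc.make`, running `drush make xc.make some-target-directory` would download Drupal core and the listed projects into that directory. Pinning each project to a tagged version is what makes a build reproducible.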
From this picture you might guess that a lot of things happen during the installation of Drupal, and that these processes are triggered and orchestrated by the installation profile. However, if something goes wrong, it usually happens inside the modules (sometimes as the result of a combination of several modules). As I said earlier, in this phase of development we concentrate on the issues that happen during installation, and we enter this phase only if the software works perfectly in the post-installation period of its life cycle (if that's not true, we have to go back to our previous development model). In this distribution-development phase we have to fix the module, then launch the installation again. But since we lack the advantages of Git here, we modify the code in another directory, where we do have Git, and copy the modified versions over with rsync. I personally use Eclipse, and setting up a new project in that tool is not painless, so I usually work with my code in an established directory and copy it over when needed. Others prefer to fix the code in place and sync with the repo in a different way. Either way is OK, and might work for you. Here is the script I use for cleaning up the database and syncing the code:
# clean_db.sql drops all the tables of the Drupal database
mysql -u XXXXX -p xc710 < clean_db.sql
# sync the code bases, skipping Git metadata
# (INSTALL_WORKSPACE, MODULE_WORKSPACE and INSTALL_SITE must be set beforehand)
rsync -azv --exclude '.git' "$INSTALL_WORKSPACE" "$INSTALL_SITE"
rsync -azv --exclude '.git' "$MODULE_WORKSPACE" "$INSTALL_SITE/modules/xc"
It asks for your database password, deletes all the tables, and updates the code base, skipping the Git-related directories. Then I go to Drupal, open http://localhost/xc-7.x-test-installation/install.php, select the installation profile I'd like to test (which is always my favorite one: "eXtensible Catalog Drupal Toolkit installation" ;-), and click the big button. (With Drush I don't need to delete the tables or use the browser to launch the installation; see the end of this post.) This iteration (code change, deployment, testing) goes on until all current issues have been solved.
Then another cycle begins: making my code available, and testing whether the solution works for others. So I commit my changes with Git, and push them to the Drupal.org repository. Building another distribution release requires a couple of steps:
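The Drush variant of this iteration can be sketched roughly as below. This is not the script I actually use, only an illustration: the function name, its arguments, and the DRY_RUN switch are made up for this example, and the paths and database URL you pass in are your own.

```shell
#!/usr/bin/env bash
# Sketch of one "sync and reinstall" iteration driven by Drush.
# reinstall_xc, run and DRY_RUN are hypothetical names for this example.

# With DRY_RUN=1 the commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage: reinstall_xc INSTALL_WORKSPACE MODULE_WORKSPACE INSTALL_SITE DB_URL
reinstall_xc() {
  install_workspace=$1
  module_workspace=$2
  install_site=$3
  db_url=$4

  # Copy the working copies into the test site, skipping Git metadata.
  run rsync -azv --exclude .git "$install_workspace" "$install_site" || return 1
  run rsync -azv --exclude .git "$module_workspace" "$install_site/modules/xc" || return 1

  # drush si drops the existing tables itself, so clean_db.sql is not needed.
  run drush -r "$install_site" si xc_installation --db-url="$db_url" -y
}
```

Calling it with `DRY_RUN=1` first is a cheap way to check that the paths are what you expect before anything is overwritten.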
- create a new release of the module/theme
  - create a new tag in the repo
  - create a release on the Drupal.org project page
- create a new release of the distribution
  - modify the Drush make file to reflect the new module/theme version (and commit it)
  - create a new tag
  - create a new release on the Drupal.org project page of the distribution
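The tagging steps are easy to get wrong by a typo, so a small helper can check the tag name before anything is pushed. The sketch below is an assumption-laden illustration: the function names are invented, the remote is assumed to be called origin, and the pattern is a simplified version of Drupal's naming rules that only accepts forms like 7.x-1.0, 7.x-1.0-alpha3 or 7.x-1.0-rc2.

```shell
#!/usr/bin/env bash
# Sketch: validate a release tag, then tag and push it.
# is_release_tag, tag_release, run and DRY_RUN are hypothetical names.

# Simplified pattern: <core>.x-<major>.<patch> with an optional
# -alphaN, -betaN or -rcN suffix. Drupal.org's full rules are broader.
is_release_tag() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9]+\.x-[0-9]+\.[0-9]+(-(alpha|beta|rc)[0-9]+)?$'
}

# With DRY_RUN=1 the commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage: tag_release TAG   (assumes the Drupal.org remote is "origin")
tag_release() {
  tag=$1
  if ! is_release_tag "$tag"; then
    echo "not a valid release tag: $tag" >&2
    return 1
  fi
  run git tag "$tag" || return 1
  run git push origin "$tag"
}
```

For example, `DRY_RUN=1 tag_release 7.x-1.0-alpha3` prints the two Git commands without running them. Creating the release on the Drupal.org project page remains a manual step.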
While the project is not mature enough (which in practice means we know there are still annoying bugs), we create alpha versions, like 7.x-1.0-alpha3. Once all those bugs are killed, we can start creating release candidates such as 7.x-1.0-rc2. The tags, and therefore the release names, should follow Drupal's naming conventions. I personally don't like building a distribution on dev versions, because it is hard to tell which state of the software we are testing against (dev versions are generated twice a day on Drupal.org from the current state of the given Git branch, so today's dev version differs from yesterday's if development is continuous, and ours is). Creating distribution releases takes longer than creating module/theme releases, so a full cycle of this process might take half an hour of boring, repetitive tasks and simple waiting. But when it's done, we can test with a few commands whether our development fixed the problems:
$ cd /var/www
$ drush dl xc_installation-7.x-1.0-alpha3 --drupal-project-rename=xc-7.x-test-installation
$ cd xc-7.x-test-installation
$ drush si xc_installation --db-url=mysqli://[mysql user]:[mysqlpw]@localhost/xc --site-name="eXtensible Catalog" --account-pass=admin
Then we can log in at http://localhost/xc-7.x-test-installation/ as admin/admin, and check whether it works. When everything is working, not only for us developers but also for the friends and colleagues we asked to test the release, we can create a new release with the final name, which is 7.x-1.0 without the alpha/rc suffixes. The process is exactly the same, except that at the end you might want to open a bottle of champagne**, since you have finished half a year's work.
* library as an institution that holds collections of reading materials
** I don't drink alcohol, so I will make myself some tea.