Journal Articles

CVu Journal Vol 32, #2 - May 2020 + Internet Topics

Title: Diving into the ACCU Website

Author: Bob Schmidt

Date: Sat, 02 May 2020 18:12:16 +01:00

Summary: Matthew Jones gives us an insight into the paddling going on under the water beneath the swan that is our website.

Body: 

This article is mostly a war story, but is also a guide for anyone who wants to work on our website in the privacy of their own computer.

The requirement

Last autumn, the ACCU committee decided it would be a good idea to offer a free 6-month trial membership to anyone attending the conference who was not already a member. The membership would only allow access to the online versions of the magazines, and would expire in September.

As the membership secretary, it naturally fell to me to make this happen. Assuming maybe 100 sign-ups, it would not be out of the question to simply gather names, addresses and emails, and enter them all by hand. It turns out that would have been quicker and easier, though an incredibly tedious day’s work.

I already have to poke around [1] in the SQL database that lives behind the website, but to do this I blindly repeat magic spells worked out by my predecessor, Mick Brookes. This gives me little knowledge about how it works, and how the new requirement fits. So I said I would have a go at adding the new feature…

A disclaimer

Many programmers will know that this is a classic LAMP [2] setup, and some readers might already know how to tackle this task. But my background is embedded, real time, and C++. I use Linux at home but as a user, not a hacker. My command line Linux fu is weak. I have never administered or written a proper website. I know only the very basics of SQL, and I have never programmed in PHP. In fact, I only know enough [3] to truthfully rate myself at 0 or 1 out of 10 at all of these. I very quickly entered the ‘valley of despair’ stage and stayed there. Almost everything in this article is probably the wrong way to do it!

This is real code archaeology. It is also a classic ‘how not to do it’ story. I should have gathered all the information I could, read it and digested it, until I fully understood the system. Then, with skill and judgement, made a few perfectly crafted tweaks. But this is said with hindsight. When you start an apparently small job like this you just wade in assuming you can bash it out in a few evenings. Some of the references listed below I only discovered whilst writing this article. With a wry smile, I freely admit I only have myself to blame.

The system

The website was written, and for a long time supported, by Tim Pushman [4]. It was written over 10 years ago, when the chosen technology was current and appropriate. But about five years ago, he wisely decided to move on and left it to us.

The website uses a content management system called Xaraya [5]. It is so old, it doesn’t even have a Wikipedia entry. It is no longer being developed or supported. Not even Stack Overflow is any use. If you search for ‘Xaraya’, there are just five, yes FIVE, hits.

Xaraya is written in PHP and is built up from modules. There are many modules, including one for managing articles, one for book reviews, and one for Worldpay, our credit card payment system. And, of course, there is one for managing subscriptions to the site. Every member of ACCU has encountered this, whether or not you were aware of it. The subscriptions module manipulates our membership database. It is a cut & paste clone of the original Xaraya code, with many modifications for ACCU-specific features. The changes for our new trial membership fell entirely within this module; after all that is the whole point of a modular system.

Xaraya and SQL

Every Xaraya module has a xarinit.php file. Predictably, this gets run at startup. One of its jobs is to initialise the module’s database tables if they don’t exist. So the very first time the website runs, provided an empty accuorg_xar database exists, it will be set up for us. As I write this, I have just tried starting the website with an empty database but it stayed empty. After actually reading [5], it turns out there is an installation process (again, a PHP module) that a new site must invoke.

Another initialisation task is the creation of global variables. These live in a database table called xar_module_vars, indexed by module and name, allowing each module to ask for one of its global variables by name through xarModGetVar(). This has an annoying side effect. Some of the website content is stored as strings in these global variables. So when you’re trying to find some code and your only way in is a string on a website page, you have to grep the source code and the database. The original initialisation values are in xarinit.php, but the values might have been edited later, in SQL, meaning the live content can diverge from the source code.
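That double search can be sketched in shell. This is only an illustration: it assumes a checkout of the site and a SQL dump of the database are both on disk, and the helper name find_site_string and its path arguments are hypothetical.

```shell
#!/bin/sh
# Search both the source tree and a SQL dump for a string seen on a page.
# find_site_string is a hypothetical helper; pass your own paths.
find_site_string() {
    needle="$1"      # text copied from the web page
    src_dir="$2"     # checkout of the site's source
    dump_file="$3"   # SQL dump of the database

    echo "== source tree =="
    # -r recurse, -n show line numbers, -F treat the needle as a fixed string
    grep -rnF -- "$needle" "$src_dir" || echo "(no match)"

    echo "== database dump =="
    # -o prints only the matched text, since one dump line can hold many rows
    grep -noF -- "$needle" "$dump_file" || echo "(no match)"
}
```

A hit in the dump but not in the source suggests the text lives only in the database, possibly in xar_module_vars.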

Xaraya pages

I haven’t been involved in the publications and book review aspect of the site, so I don’t know how dynamic content works.

Static pages, such as those relating to subscriptions, come in two parts. There is a PHP file which contains a function that returns an array of data. The function will typically access the database, and maintain any state needed if the page interacts with it through buttons. The page appearance is described by a .xd template file. These templates are XML, which Xaraya transforms into HTML by combining them with the dynamic data from the PHP function.

Error logging

If the web server has a problem, it will log to /var/log/apache2/error.log or similar. This is entirely standard and predictable.

If Xaraya has a problem, it will log to /home/accu/public_html/var/logs/log.txt. Yes, that’s var/logs, but ‘logs’ plural, and /home/.../var, not /var. And it will only log if the file already exists, and with the right permissions. All very fiddly and frustrating when you’re expecting a file to appear automatically in /var/log. It took a while to work out this particular wrinkle.
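A minimal sketch of creating the log file up front follows. The helper name ensure_xar_log is hypothetical, and the permission bits are an assumption, since "the right permissions" depends on which user the web server runs as.

```shell
#!/bin/sh
# Create Xaraya's log file in advance, because it will not create it itself.
# ensure_xar_log is a hypothetical helper name.
ensure_xar_log() {
    log_file="$1"
    mkdir -p "$(dirname "$log_file")" || return 1
    touch "$log_file" || return 1
    # Assumption: owner and group both need write access. Adjust as required.
    chmod 664 "$log_file"
}

# On the layout described above this would be something like:
#   ensure_xar_log /home/accu/public_html/var/logs/log.txt
#   sudo chown www-data /home/accu/public_html/var/logs/log.txt
# where www-data is the Apache user on Debian and Ubuntu (an assumption here).
```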

Xaraya has two printf logging functions: xarLogMessage() and xarLogVariable(). They both do what they say, and are pretty much the only way to debug a live site. I spent a great deal of time instrumenting the code.

The work

After poking around in the PHP for a few evenings, I thought I had got the measure of the membership module and was ready to try some tentative changes. This immediately broke the live website, resulting in a couple of emails to the webmaster from puzzled members trying to renew. It was clear that working live was out of the question, especially as this is ACCU’s busiest time of year: the 3 months leading up to the conference is when most of our new members join, and consequently a large number of existing members renew.

I took a step back and decided that no matter how hard it was going to be, I had to work on an offline version of the site. Time to saddle up for a long ride.

Thankfully, about 18 months ago Jim Hague put the entire website into git [6]. In principle, cloning that repo, and cloning the SQL database should get us our own private version of the ACCU website. But first we need a web server. There are many, many how-to guides for this on the internet. It boils down to this:

1. Clone the git repo

  1. Clone the repo to /home/accu/public_html. This is the same root path as on the real webserver. Keeping it the same gives the highest chance of it working right away, and allows us to focus on real work, rather than discovering later on that this was a mistake. Who knows how many hard coded paths there are in the code? (Answer: not zero!)
  2. Create a new branch and check it out immediately.

2. Install MySQL

  1. Set up an admin user, root, and the Xaraya user accuorg_xarad.
  2. Install MySQL workbench.
  3. Use workbench to check that our SQL server is alive and kicking.

3. Install Apache

  1. Check http://localhost for signs of life.
  2. Tell the Apache2 config file to serve a site from /home/accu/public_html.
  3. Change the site config to use a port other than 80, just in case we accidentally expose something to the wider internet whilst making changes.

At this point, we might see a web page but we have no PHP so we just see index.php as raw text. (If you can’t get to the git repo, use one of the trivial index.php examples below.)
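The Apache settings from step 3 might look something like the config written by this sketch. The helper name write_site_conf, the default port 8080 and the output path are all hypothetical; binding to 127.0.0.1 is an extra precaution that the steps above do not mention.

```shell
#!/bin/sh
# Write a minimal Apache site config for the local copy of the website.
# write_site_conf, the port and the output path are all hypothetical.
write_site_conf() {
    conf_file="$1"
    port="${2:-8080}"
    cat > "$conf_file" <<EOF
# Listen on the loopback interface only, on a non-standard port
Listen 127.0.0.1:${port}
<VirtualHost 127.0.0.1:${port}>
    DocumentRoot /home/accu/public_html
</VirtualHost>
EOF
}
```

On Debian-style layouts the stock Listen 80 line lives in /etc/apache2/ports.conf, so it would need commenting out separately if port 80 is to stay closed.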

4. Install PHP 5

At the time of writing, the default PHP version you get from apt-get install php is 7. Xaraya needs 5.x and does not work with 7, offering nothing but HTTP 500 errors.

The internet tells us to do this:

  add-apt-repository ppa:ondrej/php
  apt-get update
  apt-get install php5.6 libapache2-mod-php5.6

At which point, the errors should change to 403 Forbidden. Progress of sorts. Again, the internet helped me work out that we need to add this:

  <Directory "/home/accu/public_html">
    Require all granted
  </Directory>

in /etc/apache2/sites-available/000-default.conf

Now we can see this trivial index.php OK, but our full Xaraya home page still doesn’t work.

  <html>
    <head>
      <title>PHP Test</title>
    </head>
    <body>
      <?php echo '<p>Hello World</p>'; ?>
    </body>
  </html>

Using this index.php:

  <?php
    phpinfo();
  ?>

we get a very useful and detailed PHP information page that tells us everything we need to know. We just have to work out exactly what it is trying to tell us...

5. Install PHP modules

We have some missing PHP modules. Each step forward results in a new and puzzling error message. We need to decipher each one to work out which module might be missing, then install it, and repeat until everything starts to work:

  apt-get install php5.6-mysql
  apt-get install php5.6-xml

But there is a shortcut! We can run phpinfo on the live website and compare the modules very carefully to find out what we are missing on our machine. Whatever the technique, by a process of iteration we end up getting Xaraya to sit happily at the top of our LAMP stack.
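The comparison can be automated once the two module lists are saved as text. The sketch below assumes each list has one module name per line, for example captured with php -m on each machine; that is an assumption, since the shortcut above compares phpinfo pages. The helper name missing_modules is hypothetical.

```shell
#!/bin/sh
# List PHP modules present on the live server but missing locally.
# Both inputs are plain text files with one module name per line.
# missing_modules is a hypothetical helper name.
missing_modules() {
    live_list="$1"
    local_list="$2"
    work_dir=$(mktemp -d) || return 1
    # comm needs sorted input; lower-case first so "PDO" matches "pdo"
    tr '[:upper:]' '[:lower:]' < "$live_list"  | sort -u > "$work_dir/live"
    tr '[:upper:]' '[:lower:]' < "$local_list" | sort -u > "$work_dir/local"
    # -2 drops lines found only in the local list, -3 drops lines in both
    comm -23 "$work_dir/live" "$work_dir/local"
    rm -rf "$work_dir"
}
```

Each name it prints is a candidate for another apt-get install, though the package name will not always match the module name exactly.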

6. Install the database

Once the HTTP errors stopped, Xaraya error pages took their place. One of the messages was Unknown database 'accuorg_xar'. At least this is not cryptic. Luckily “here’s one I made earlier” [7]: the real website. It just takes a few lines to copy it over:

  ssh <username>@dennis.accu.org
  mysqldump -u accuorg_xarad -p accuorg_xar > /tmp/accu.sql
  logout
  scp <username>@dennis.accu.org:/tmp/accu.sql .
  mysql -u root -p -e 'create database if not exists accuorg_xar'
  mysql -u root -p accuorg_xar < accu.sql
  mysql -u root -p -e 'show databases'

Keeping this accu.sql file handy also allows us to erase any mistakes, or newly created test users, by simply repeating the import, thus turning back time:

  mysql -u root -p accuorg_xar < accu.sql
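Wrapped up as a small helper, the reset could look like the sketch below. The function name reset_db and the MYSQL_CMD override are hypothetical conveniences. The only thing added to the command above is a check that the dump file exists and is not empty.

```shell
#!/bin/sh
# Restore the local database from a saved dump.
# reset_db and MYSQL_CMD are hypothetical names used for this sketch.
DB_NAME="accuorg_xar"              # database name used throughout the article
MYSQL_CMD="${MYSQL_CMD:-mysql}"    # overridable, e.g. for a dry run

reset_db() {
    dump_file="$1"
    # -s is true only if the file exists and is not empty
    if [ ! -s "$dump_file" ]; then
        echo "reset_db: '$dump_file' is missing or empty" >&2
        return 1
    fi
    # Same import as above; -p prompts for the root password
    "$MYSQL_CMD" -u root -p "$DB_NAME" < "$dump_file"
}
```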

Nice. At this point we have a snapshot of the real ACCU website running on our local host. Now the real ‘fun’ can begin.

The changes

I’m not going to go into detail on the changes for adding the trial membership. It isn’t particularly interesting. It’s pretty poor code, and I’m not proud of it, but it does the job. If you really are interested, ask for access to the git repo. Here is a summary:

  1. Add a new subscription type so we can treat trial subscriptions as a special case.
  2. Add a new tab to the user account page so new subscribers can choose to sign up for trial or full membership.
  3. Make the trial subscription page bypass the Worldpay payment system, because it’s free.
  4. Find every place in the existing code where the type of subscription is important and consider whether trial subscriptions need to be included (e.g. expiry reminders, renewals, etc).
  5. Add a special case to subscription expiry so that trial members all expire on the 30th September (rather than the usual case of 365 days after they renewed).
  6. Add a switch to the database so we can easily show and hide the trial membership feature around conference time.
  7. Do an awful lot of testing with test users.

Conclusion

I have tried to explain the basics of the ACCU website system. I have shown how running your own local experimental copy of the system should be well within the capabilities of anyone familiar with the Linux command prompt.

Due to its age and the difficulties in maintaining it, we tend to leave the website well alone. Routine activities centre around adding content. Actual functional changes such as this are very rare, although this work should give us some confidence that we could do more if pushed. However, the site is obviously nearing the end of its life, and it might be kinder to just call the RSPCA [8].

I hope this has given an insight into what it takes to maintain our beloved website, and by implication, what it would take to replace it. If this has piqued your interest, why not try following the instructions, or getting involved in some other way?

References

[1] Amongst other things, we generate the list of e-voters for the AGM with a SQL query.

[2] The LAMP software bundle, Wikipedia: https://en.wikipedia.org/wiki/LAMP_(software_bundle)

[3] The Dunning-Kruger effect: https://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect

[4] gnomedia codeworks: https://www.gnomedia.com/

[5] The Official Xaraya User Guide: http://www.xaraya.hu/files/xarayaguide.html

[6] Git repository of the ACCU website: https://git.accu.org/jim/xaraya-website [Note: Not publicly viewable since it also contains members-only content like CVu.]

[7] Here’s One I Made Earlier: Classic Blue Peter Makes: https://www.amazon.co.uk/Heres-One-Made-Earlier-Classic/dp/0857835130 (With apologies to our non-UK readers: this is a very British cultural reference!)

[8] RSPCA (Royal Society for the Prevention of Cruelty to Animals): https://www.rspca.org.uk/. Again, possibly a cultural reference. See https://en.wikipedia.org/wiki/Animal_euthanasia

Matthew Jones started programming with BBC Basic, and then learned C during a VI form summer job. He has been programming professionally for over 20 years, having moved on to C++, and usually works on large embedded systems.
