Utilities in {HOME}/bin/
bin/create_tables script
bin/backup_tables script
bin/restore_table_backup script
bin/rid script
bin/updareq script
bin/sid_base script
bin/clean-up script
bin/apu script
bin/events_archiving script
The short-ids
Logging. Tracking and debugging problems
Web interface and users’ interactions
Automatic profile update (APU)
Background searches for documents and other resources
Sessions
Metadata processing
The log of short-ids
MySQL log
Archiving and rotating the logs
The events database
What to back up and how to restore
Critical data
Other data
MySQL tables
Update daemon database
Restoring a service after a crash
Managing user accounts
Userdata structure: owner branch
Userdata structure: records branch. Personal record
Administrator’s screens
Access
/adm/sql
/adm/events
Checking a particular date
Filters
/adm/events/recent
/adm/events/pref
/adm/search
/adm/sessions
Customizing an ACIS installation
Phrases
Building a site around an ACIS installation
Files in an ACIS installation
Other related files:
{HOME}/bin/
bin/setup — update system’s configuration.
bin/create_tables script
bin/create_tables creates the MySQL tables for ACIS. If a table already exists, the script will not delete it. To force re-creation of the tables, give it the -f command-line switch.
bin/backup_tables script
Usage:
bin/backup_tables [database table1 table2 ...]
bin/backup_tables creates backups of the specified tables from the specified database. If no database and no tables are specified, it backs up the most important tables of the ACIS database. Currently these are:
acis.sysprof acis.suggestions acis.cit_suggestions
sid.sid_id_to_handle sid.sid_last_numbers
“acis” and “sid” here stand for your configuration values acis-db-name and sid-db-name respectively if they are defined, or db-name if they are not.
A directory is created for the backup files, based on backup-directory and the current date. For example, if backup-directory is set to “/opt/ACIS/backup” and the script is run on 1 September 2006, the directory would be “/opt/ACIS/backup/2006/09/01/”. The backup-directory must already exist, but its subdirectories will be created if they do not exist.
To be able to use this utility, the ACIS MySQL user must have the FILE and LOCK TABLES privileges on the server.
You don’t have to shut down the service to create backups with this utility.
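The dated directory layout described above can be sketched in a few lines of Python. This is only an illustration of the naming scheme, not the script's actual code; the function name backup_dir_for is made up for this example.

```python
from datetime import date
from pathlib import PurePosixPath


def backup_dir_for(backup_directory: str, day: date) -> str:
    """Build the dated backup path: <backup-directory>/YYYY/MM/DD/."""
    path = (
        PurePosixPath(backup_directory)
        / f"{day.year:04d}"
        / f"{day.month:02d}"
        / f"{day.day:02d}"
    )
    # A trailing slash is added to match the way the guide writes the path.
    return f"{path}/"


# The example from the text: backup-directory /opt/ACIS/backup, 1 September 2006.
print(backup_dir_for("/opt/ACIS/backup", date(2006, 9, 1)))
```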
bin/restore_table_backup script
Usage:
bin/restore_table_backup backup_file [...]
Restores one or several backups previously created by bin/backup_tables.
bin/rid script
bin/rid start starts and bin/rid stop stops the update daemon. bin/rid restart stops and then restarts it.
bin/updareq script
bin/updareq COLLECTION PATH [TOO_OLD]
Sends the update daemon a request to process PATH in collection COLLECTION.
TOO_OLD is a time in seconds. If a file was last processed more than TOO_OLD seconds ago, the daemon will process it again, even if it has not changed since. By default, TOO_OLD is 86400*12 seconds, which means 12 days. Read more about the updates in the update daemon section.
If the script fails to send a request, it will complain.
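The TOO_OLD rule can be written down as a small decision function. This is a sketch of the rule as the text describes it, assuming that a changed file is always processed; the function and parameter names are invented for this example and are not taken from the daemon's code.

```python
DEFAULT_TOO_OLD = 86400 * 12  # 1,036,800 seconds, i.e. 12 days


def needs_processing(changed: bool, seconds_since_last_run: float,
                     too_old: int = DEFAULT_TOO_OLD) -> bool:
    """Decide whether a file should be processed again.

    Assumption: a changed file is always processed; an unchanged file is
    processed only if its last processing is more than too_old seconds ago.
    """
    return changed or seconds_since_last_run > too_old


# An unchanged file processed 13 days ago is picked up again by default.
print(needs_processing(False, 13 * 86400))
```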
bin/sid_base script
The short-id database management utility.
bin/sid_base clear
bin/sid_base backup
Stores a backup in the {HOME}/SID/backup/YYYY/MM-DD/ directory, where YYYY, MM and DD are the current year, month and day number.
bin/sid_base import file
bin/sid_base import-dup file
bin/clean-up script
Expired sessions cleaner. It goes through the session files in {HOME}/sessions/, looks into each file, checks how old it is and what session type it is, and closes the session if its time has come.
A new-user session’s time has come if it is a week old. Such sessions are wiped out without regret.
A regular user session’s time has come if it is older than session-lifetime seconds, as specified in the configuration. When a regular user session is closed, the user’s data (possibly changed) is saved and her profile page is updated.
ACIS needs this script for normal operation. Run it regularly, e.g. as a cron job. As it prints out some information about each session it looks at, you had better capture its output into a log file. For example:
*/11 * * * * /home/user/acis/bin/clean-up >> /home/user/acis/clean-up.log
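The two expiry rules above amount to a simple check. The sketch below only restates them in Python; the session type label "new-user" and the function name are assumptions made for illustration, and the real script may represent session types differently.

```python
WEEK_SECONDS = 7 * 24 * 60 * 60


def session_expired(session_type: str, age_seconds: float,
                    session_lifetime: int) -> bool:
    """Apply the expiry rules described for bin/clean-up.

    session_type: "new-user" is a hypothetical label for registration
    sessions; anything else is treated as a regular user session.
    session_lifetime: the session-lifetime configuration value, in seconds.
    """
    if session_type == "new-user":
        # New-user sessions are dropped once they are a week old.
        return age_seconds >= WEEK_SECONDS
    # Regular sessions are closed once older than session-lifetime.
    return age_seconds > session_lifetime


# A regular session idle for two hours, with a one-hour lifetime, is closed.
print(session_expired("user", 2 * 3600, session_lifetime=3600))
```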
bin/apu script
See bin/apu.
bin/events_archiving script
The script incrementally processes the events database, which is an alternative logging method used by ACIS. The nature of the database makes it difficult (slow and expensive) to browse through older events. This script packs session data into a single database row, thus making access to this data via the /adm/events screen quicker.
It makes sense to configure a cronjob to run this script once a day or once a week.
The short-ids
Short-ids are short, unique alphanumeric identifiers that ACIS assigns to personal and other records. They are used extensively in ACIS and are designed to stay unique and persistent. The short-ids also serve as shorter aliases for the longer external identifiers. There is a strict one-to-one relationship between each short-id and its longer counterpart.
The assigned short-ids are kept in the short-id database and logged into SID/short-id.log.
Logging. Tracking and debugging problems
ACIS is a complicated piece of software and keeps a number of different logs. Each subsystem maintains its own log, and sometimes several. The logs let you see what is going on and track down problems, if any. A later section lists the files in an ACIS installation, including the logs. Here is a description of what is logged and into which files.
Web interface and users’ interactions
The users’ interactions with the system are logged into the acis.log file. This is the main log, which shows what users were or are doing with the system. Problems and errors that happen during these interactions are additionally logged to the acis-err.log file.
Additionally, all the interactions are stored into the events database. This database is browsable online via the /adm/events screen.
Each incoming request may be logged into the so-called requests log. This is disabled by default and is governed by the requests-log configuration parameter. It is a brief log that contains only the trailing URL part of each request, which should give you an idea of which screen the request was for.
It is also important to monitor the web server’s error log. If something goes wrong in ACIS and its code crashes before or while handling a request, only the web server’s logs will contain details about the problem.
If the ACIS web interface seriously misbehaves, the debugging log may be useful. To enable it, set the debug-log parameter in main.conf to a file name. For each incoming request handled by ACIS, the log will contain a detailed trace of everything that happens internally. One request may generate about 15-25 lines of processing details, sometimes many more. Note that this slows things down considerably and eats disk space quickly.
Automatic profile update (APU)
APU-related activity (see bin/apu) is logged into the autoprofileupdate.log file.
Background searches for documents and other resources
When a user enters her research profile after a while, or after she changes her name variations, ACIS runs an automatic search for research. APU processing for a record also runs this search. In both cases it runs in the background, not under the web-server process as the CGI requests do, so it is logged into a separate log: the back.log file. It shows which name variations were searched and how many matching items were found.
Sessions
Sessions are created by the web interface of ACIS, so they are usually logged together with other events of the web interface. Abandoned sessions, however, are taken care of by the bin/clean-up script. It closes the expired sessions, saving the changes to the users’ accounts if necessary. While it does not itself write a log, we recommend that you configure the cron job to redirect its output into the clean-up.log file.
If you need to check the currently active sessions and possibly peek into one of them, use the /adm/sessions administrator’s screen.
Metadata processing
Metadata is processed by the update daemon, which writes several logs. One log is general and covers the update requests coming from applications (including the ACIS web interface) and utilities (bin/updareq). It documents the essential details of each request: the collection id, the path to update, the TOO_OLD parameter and, of course, the time of the request. It also logs the number of the channel that the request was given to. All that is in the RI/daemon.log file.
There are 5 channels in the update daemon and they run in parallel. Each channel processes at most one request at any given time. If all 5 channels are busy when a new request comes in, the request has to wait until one of the channels becomes available.
Each of the channels has its own log file. These are named from RI/update_ch0.log for channel no. 0 through RI/update_ch4.log for channel no. 4. A channel’s log includes all the details of the requests taken and of their processing. Any problems or errors that happen during processing are normally reported to the channel’s log.
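To make the channel model concrete, here is a rough Python analogy: a pool of five workers, where a sixth request waits until a worker is free, plus the log file naming scheme. The update daemon itself is not written this way; process is a hypothetical stand-in for the real processing and all names are invented for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor

CHANNEL_COUNT = 5  # channels 0 through 4, as described above


def channel_log(channel: int) -> str:
    """Return the log file name of one update daemon channel."""
    if not 0 <= channel < CHANNEL_COUNT:
        raise ValueError(f"no such channel: {channel}")
    return f"RI/update_ch{channel}.log"


def process(request: tuple) -> str:
    """Hypothetical stand-in for the real processing of one request."""
    collection, path = request
    return f"processed {path} in {collection}"


def run_all(requests: list) -> list:
    """Process requests with at most CHANNEL_COUNT running at once."""
    # Requests beyond the fifth wait in the pool's queue until a worker is free.
    with ThreadPoolExecutor(max_workers=CHANNEL_COUNT) as pool:
        return list(pool.map(process, requests))


print([channel_log(n) for n in range(CHANNEL_COUNT)])
```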
The actual metadata processing also involves ARDB. That work is logged into the ardb.log file.
The log of short-ids
All created and assigned short-ids are logged into the SID/short-id.log file. This log can be used to recreate the short-ids database; see the bin/sid_base script.
MySQL log
If an SQL query or statement fails, it is logged into the sql.log file with all the details. Normally this log will only contain messages about database connections having been established or re-established.
Archiving and rotating the logs
For an actively used service, the volume of logs generated is large, and no matter how big your hard drive is, one day the logs will fill it up. To avoid that and to stop worrying about it, we recommend configuring log rotation with the logrotate tool.
Here is what your logrotate.conf file could look like:
compress
/home/user/ACIS/*.log /home/user/ACIS/RI/daemon.log /home/user/ACIS/RI/ri.log {
rotate 300
olddir /home/user/ACIS/oldlog/
daily
copytruncate
delaycompress
}
If your version of logrotate supports the dateext directive, you may use that as well.
You will probably want to rotate your web-server’s logs in a similar fashion.
The events database
The events database is an alternative to the acis.log file. It contains details about what happened to your ACIS service over time. It is not a text file as the usual logs are, but a MySQL database table with a number of columns for capturing different kinds of events in detail. You can browse that table with your favorite MySQL database management tool, such as phpMyAdmin (the table is, surprisingly, named “events”), but some details will be hidden from you because of the data packing method used. So we suggest using the /adm/events screen instead, because it was designed for this purpose.
This screen lets you see what was happening in your ACIS service at a certain hour of a certain day, who registered recently, and which works that person claimed. You can monitor things as they happen.
The database also covers the work of some of the scripts, like bin/apu, because such scripts alter the user accounts.
To keep the database usable, it helps to run the bin/events_archiving script regularly (or at least occasionally). It packs parts of the database and makes loading data from it quicker.
What to back up and how to restore
Critical data
There are two critical things you should back up regularly in a service: the userdata files and the short-id log. The userdata files are all the files below the userdata/ directory in your ACIS home. They should be archived with the whole directory structure. The short-id log is the SID/short-id.log file.
You should also back up your primary metadata, but that should be obvious.
Also obvious, but stated here for completeness: in addition to the userdata, the short-id log and your primary metadata, keep a safe backup of:
the site/ directory (see Building a site around an ACIS installation),
your local phrases file (presentation/default/phrase-local.xml).
This will enable you to reconstruct a severely damaged service with moderate effort in moderate time.
Other data
To allow even faster and smoother recovery after a severe system failure with data loss, you may also keep backups of the databases. This is optional, because all the crucial data can be reconstructed from the critical pieces listed in the previous section.
So do this only if you have a cheap backup facility with enough space. In that case additionally create copies of:
MySQL tables
See the bin/backup_tables and bin/restore_table_backup scripts.
Consult the MySQL documentation for additional ways to back up the MySQL tables: Database Backups and the whole Backup and Recovery section. (These links are for MySQL version 4.1.)
Update daemon database
The update daemon uses Berkeley DB for storing its data. Consult its documentation for backup instructions: Database and log file archival.
If you want to back that up, you should integrate it into the database housekeeping. It will be much easier then.
Restoring a service after a crash
Of course, how you restore a service depends greatly on what damage was done. If you have lost the machine with all its setup, you’ll have to recreate the installation and its configuration first. Then restore from backups all the files that you can.
If you have lost your MySQL tables, use bin/create_tables to recreate the database structure or restore the databases from your backups if you have them.
If you still have the tables after the crash, check and try to recover them with MySQL’s table integrity checking utilities.
Restore the userdata/ directory’s contents from your backups.
Restore the SID/short-id.log file from your backups.
Recreate the short-ids database by calling the bin/sid_base import SID/short-id.log command.
Restore your primary metadata.
Start the update daemon by calling the bin/rid start command.
Make sure MySQL server is running and is configured properly.
Request the update daemon to process your primary metadata, using the bin/updareq utility.
Make sure the web-server (Apache) is running and all the cron-jobs are restored.
Check as much as you can to see if it all works well.
Managing user accounts
The user accounts in ACIS are XML files in the {HOME}/userdata/ directory. We call them userdata files. These files use the ACIS::Data::DumpXML format, which means such a file represents a Perl data structure. Having it in XML makes it both human-readable and human-editable. By editing an account (i.e. a userdata) file you directly edit the user’s information, so you have the power to change it in any way.
The userdata/ directory has a two-level structure. The userdata file for the account vasya@pupkin.com will be in userdata/v/a/vasya@pupkin.com.xml. “v” and “a” are the first two English letter characters of the user’s account name (email address).
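The mapping from an account name to its file can be sketched as follows. This only follows the rule stated above; skipping non-letter characters and lowercasing the two directory letters are assumptions of this sketch, and userdata_path is a made-up name rather than an ACIS function.

```python
import string


def userdata_path(login: str) -> str:
    """Map an account name (email address) to its userdata file path."""
    # Take the first two English letters of the login, skipping anything else.
    letters = [c for c in login if c in string.ascii_letters][:2]
    if len(letters) < 2:
        raise ValueError(f"login has fewer than two English letters: {login!r}")
    # Assumption: the directory names are lowercase.
    first, second = (c.lower() for c in letters)
    return f"userdata/{first}/{second}/{login}.xml"


print(userdata_path("vasya@pupkin.com"))
```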
The userdata file for each account consists of two main parts: the user part (the owner element) and the records part (the records element). In XML terms, both are children of the document element data:
<data>
<owner>
...
</owner>
<records>
...
</records>
</data>
In XPath abbreviated syntax, the first is /data/owner and the second is /data/records. Each of these elements has its own purpose and structure. Therefore below I’ll call them the owner branch and the records branch respectively.
Userdata structure: owner branch
The owner branch of a userdata file describes the account and the user. Its primary content is the login, name and password elements, whose purpose won’t be a great puzzle for you. To be clear, password is the verbatim password that the user has to enter to log into the account, with no encryption applied. The name element contains the human name of the user.
The type element is optional. It may contain zero, one or many elements that specify additional privileges of the user. For instance, when a user is supposed to manage several personal records, she will have the advanced privilege, i.e. an advanced element in the type element. Administrator users will have an admin element in type; more about that later.
A typical owner branch of a userdata file may look like this:
<owner>
<IP>81.25.33.145</IP>
<last-change-date>2004-08-25 14:14:52 +0300</last-change-date>
<login>vasya@pupkin.com</login>
<password>ngi5Go</password>
<name>Vasya Pupkin</name>
<type><empty-hash /></type>
</owner>
The order of elements in owner does not matter. The IP and last-change-date elements are added by ACIS and updated on each user login, but are not used for anything.
Userdata structure: records branch. Personal record
The records branch contains a list of the records that the user has created. Most users have only one record in the list, but the structure is still a list:
<data>
...
<records>
<list-item>
...
</list-item>
...
</records>
</data>
Usually the record in a userdata file describes the same person as the owner branch does, but technically it doesn’t have to be that way.
The only possible record type at this stage is the person record type.
A person record is a hash in Perl terms. In XML terms it is an unordered list of elements, none of which can be repeated. The id, type and sid elements represent the record’s identifier, type (with the text value “person”) and short-id respectively.
The contact element contains the contact info of the person and name contains the name data. contributions holds the research profile data, and affiliations holds, you guessed it, the affiliations.
The items which need mentioning are:
contact/email-pub
Contains “true” if the user gave her permission to publish her email address. Contains <undef/> otherwise.
name/additional-variations
contributions/accepted
contributions/refused
about-owner
Contains “yes” if the record describes the account owner. The system adds this to every record normally created by the registration process and uses it to build appropriate elements of the user interface (“your profile” vs. “Vasya Pupkin’s profile”).
profile
profile/url contains the personal profile URL; profile/file, the profile file’s path in the local filesystem; profile/export, the names of the files to which the record is exported.
Administrator’s screens
Some administrator functions are available through the ACIS web interface itself. This makes monitoring and administering the service easier. Several /adm/… screens exist for these functions.
Administrator screens are potentially a huge security risk. Be sure that only trusted individuals have access to them.
Access
There are two ways a user can get access to an administrator screen. First, she can enter a special password, admin-access-pass. If:
then the system lets her in.
When the system asks the user for the admin password, there is an option to save the password in a cookie. The browser will store the cookie for a month from its creation. After that the browser will drop the cookie and you’ll have to enter the password again.
Second, a currently logged-in user might have the administrator privilege. A user has the administrator privilege if there is an admin element in the type element of her userdata’s owner branch (XPath: /data/owner/type/admin). There is no way a user can get the administrator privilege other than by a manual edit of her userdata file.
The second way to access the administrator screens is preferable, especially when you access them from a shared computer, because it doesn’t leave the password cookie.
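If you want to check from a script which accounts carry the administrator privilege, the XPath given above translates directly into a few lines of Python. This is a minimal sketch that only tests for the presence of the admin element; how ACIS::Data::DumpXML writes that element's content is not covered here, so the sample document is an assumption.

```python
import xml.etree.ElementTree as ET


def has_admin_privilege(userdata_xml: str) -> bool:
    """Return True if /data/owner/type/admin exists in a userdata document."""
    root = ET.fromstring(userdata_xml)
    if root.tag != "data":
        return False
    # ElementTree paths are relative to the root element, i.e. to <data>.
    return root.find("owner/type/admin") is not None


# Hypothetical sample; a real userdata file has more elements.
SAMPLE = """
<data>
  <owner>
    <login>vasya@pupkin.com</login>
    <type><admin/></type>
  </owner>
  <records/>
</data>
"""

print(has_admin_privilege(SAMPLE))
```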
/adm/sql
This screen lets you enter and execute any SQL query on behalf of the ACIS MySQL user, with acis-db-name as the default database.
It prints out the query results, if any.
/adm/events
The screen provides access to the events database, which contains a log of everything that happened within the ACIS web interface.
The screen itself works in two modes: overview and data retrieval. The overview mode shows you for which years, months and days there is data to browse and offers you access to that data (i.e. hyperlinks). The data retrieval mode shows the actual events logged during a certain period of time.
There is also a subscreen for setting your preferences for the data retrieval mode; see below.
Both modes of the screen are linked to each other and provide navigational cues which, we hope, make the screen straightforward to use.
Checking a particular date
The overview mode offers a form (at the bottom of the screen), which you can use to browse events for a particular date or a period of time. Just enter the starting date and (optionally) the finishing date and click the “SHOW” button.
Alternatively, you can directly request the screen to show events for a particular date, via the URL. Append a date in the format YYYY-MM-DD to the URL of the /adm/events/ screen. E.g.,
http://acis.super.edu/adm/events/2005-11-04
If you want to check events for a day starting from a particular time, add “/HH:MM:SS” to a URL that already has a date:
http://acis.super.edu/adm/events/2005-11-04/12:03:45
Filters
There are several filters which you can enable via the preferences screen, or explicitly by adding “?filter” or “?filter1+filter2” to the URL.
At this time there are two filters defined:
hidemagic
onlyresearch
For example,
http://acis.super.edu/adm/events/2005-11-04?onlyresearch
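The URL conventions above (date, optional start time, optional filters) can be combined in a small helper. This is an illustrative sketch built only from the URL examples in this section; events_url is a hypothetical helper, not part of ACIS.

```python
from datetime import date, time
from typing import Iterable, Optional


def events_url(base_url: str, day: date, start: Optional[time] = None,
               filters: Iterable[str] = ()) -> str:
    """Build an /adm/events URL for a date, optional start time and filters."""
    url = f"{base_url.rstrip('/')}/adm/events/{day.isoformat()}"
    if start is not None:
        url += f"/{start.strftime('%H:%M:%S')}"
    names = list(filters)
    if names:
        # Several filters are joined with "+", as in "?filter1+filter2".
        url += "?" + "+".join(names)
    return url


print(events_url("http://acis.super.edu", date(2005, 11, 4),
                 filters=["onlyresearch"]))
```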
/adm/events/recent
Shows the most recent events logged. What is considered recent depends on the preferences with a default of 12 hours.
/adm/events/pref
The screen to choose and set your preferences for working with the /adm/events screens. It requires cookies and JavaScript in your browser.
The preferences govern the maximum amount of data to load on a page (it can be huge), the filtering options, and an option for the presentation of user sessions.
/adm/search
This screen lets you search in some of the important tables of ACIS. It contains three search forms: for documents, for records (personal records) and for users.
You choose which field you want to search by, enter the search key and push [SEARCH]. If the key expression includes a “%” sign, the search assumes that you are using MySQL’s simple pattern matching syntax (the LIKE operator). In that syntax % means zero or more of any character. If your expression doesn’t include a percent character, the search assumes that you want a full field value match, for instance when you search by a known email address of a user and enter her complete address.
The search works differently when you search for documents by title. If you do not use the percent character, it searches the full-text index, so it is a word search. If you do use the percent character, it searches by substring/phrase match.
Generally, this is a simple search utility. It is not supposed to provide complete information about the documents, personal records and users in the system, just the most basic info.
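The matching rules above can be summarized as a choice between three SQL conditions. The sketch below is an assumption about how such a choice could be coded, not the actual ACIS implementation; the field names in ALLOWED_FIELDS are hypothetical, and the conditions use the %s placeholder style common to Python MySQL drivers.

```python
# Hypothetical column names; the real ACIS tables may differ.
ALLOWED_FIELDS = {"email", "name", "title"}


def search_condition(field: str, key: str, by_title: bool = False):
    """Pick the SQL condition and parameters for one search key."""
    if field not in ALLOWED_FIELDS:
        # Field names are interpolated into SQL, so accept only known ones.
        raise ValueError(f"unknown search field: {field}")
    if "%" in key:
        # The key is a pattern: use the LIKE operator.
        return f"{field} LIKE %s", (key,)
    if by_title:
        # Title search without a pattern: word search via the full-text index.
        return f"MATCH ({field}) AGAINST (%s)", (key,)
    # Otherwise require a full field value match.
    return f"{field} = %s", (key,)


print(search_condition("email", "vasya@%"))
```

The key is always passed as a query parameter rather than pasted into the SQL text, which keeps the sketch safe against injection.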
/adm/sessions
This screen lists all currently open sessions, their type and user name, and how old the session is.
You can look into each session (see all the data that is stored in it) and delete it, if you think it has to be deleted.
Customizing an ACIS installation
An ideal web application is so flexible that you can build it into an existing website seamlessly and customize its wording and any other presentational aspects. ACIS cannot do that: it is too much to ask from us.
Instead, ACIS offers you two features to give you some control over your site: phrases and semi-static pages. They support your need to communicate with your users effectively.
Phrases
Phrases are bits of (X)HTML markup which some of the ACIS XSLT templates insert into the final pages in the appropriate context. A phrase may be a single word or a whole bunch of paragraphs; size doesn’t matter. Each phrase has its own identifier and a default value, but you can override it.
For instance, the service-intro phrase is a slogan that is displayed on the ACIS homepage below the site-name-long. The default for this phrase is:
<p><big>Through our service you tell the world
about yourself.</big></p>
and it is defined in presentation/default/phrase.xml.
{HOME}/presentation/default/phrase.xml is the default phrases file. You are welcome to take a look into this file to see what phrases are there. Sometimes, though, the XSLT stylesheet itself defines the default content of a phrase. That happens when the default content needs to include run-time specific values, and thus cannot be written into a ready-to-use phrase.xml file.
If you want to override a phrase, you need to create the {HOME}/presentation/default/phrase-local.xml file:
<phrasing>
</phrasing>
and create a <phrase id='…'> … </phrase> element inside it, with the id of the phrase you want to override. For example, phrase-local.xml may look like this:
<phrasing>
<phrase id='service-intro'>
<p><big>Register and be cool!</big></p>
</phrase>
</phrasing>
That will redefine the above-mentioned service-intro phrase.
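The override rule is easy to picture as a lookup in which local phrases win over the defaults. The sketch below illustrates that idea only; it assumes the default phrases file uses the same phrasing/phrase layout as phrase-local.xml, and the function names are invented here.

```python
import xml.etree.ElementTree as ET


def load_phrases(xml_text: str) -> dict:
    """Map phrase ids to their <phrase> elements."""
    root = ET.fromstring(xml_text)
    return {p.get("id"): p for p in root.findall("phrase") if p.get("id")}


def effective_phrases(default_xml: str, local_xml: str = "") -> dict:
    """Start from the defaults, then let local phrases replace them by id."""
    phrases = load_phrases(default_xml)
    if local_xml:
        phrases.update(load_phrases(local_xml))
    return phrases


DEFAULT = """<phrasing>
<phrase id='service-intro'>
<p><big>Through our service you tell the world
about yourself.</big></p>
</phrase>
</phrasing>"""

LOCAL = """<phrasing>
<phrase id='service-intro'>
<p><big>Register and be cool!</big></p>
</phrase>
</phrasing>"""

print(effective_phrases(DEFAULT, LOCAL)["service-intro"].find("p/big").text)
```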
The other important phrases are:
news
[...] news phrase right after the service-intro phrase, before the New registration section.
announcements
homepage-title
[...] presentation/default/index.xsl.
page-footer
[...] page-footer after the main content div, inside <div class='footer'>. By default it includes an administrator’s email link, a home link (home-url) and a link to the page top. The default is defined in presentation/default/page.xsl.
email-footer
email-footer-html
new-registration-intro
[...] <div>, right above the “Introduce yourself” header. Default: empty.
institution-search-instructions
[...] “Enter main words of your institution name. You may try acronyms, too.”
metadata-idenfitifer
[...] “metadata identifiers”.
research-main-epilog
[...] phrases.xml.
not-satisfied-with-automatic-search
[...] phrases.xml.
email-confirmation-about-registering
email-confirmation-no-works-claimed
[...] email-confirmation-about-registering phrase. Default: empty.
Before ACIS displays the content of a phrase to the user, it processes it in a specific way. All simple XHTML is passed through. In addition to XHTML, a phrase might include some special ACIS markup. This special markup is not explained here; actually it is not documented anywhere. One interesting case of special markup that you need to know about is that you can include other phrases in your phrase. Example:
<phrasing>
<phrase id='news-brief'>
<p><big>We celebrate the 100-year anniversary of the
service!</big></p>
</phrase>
<phrase id='announcement'>
<phrase ref='news'/>
</phrase>
<phrase id='news'>
<h2>News</h2>
<phrase ref='news-brief'/>
</phrase>
</phrasing>
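To show what including one phrase in another amounts to, here is a rough Python sketch that expands the ref attributes recursively. It is only a model of the idea: ACIS does this inside its own templates, the function names are made up, and this simplified renderer drops element attributes.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def load_phrases(xml_text: str) -> dict:
    """Map phrase ids to their <phrase> elements."""
    root = ET.fromstring(xml_text)
    return {p.get("id"): p for p in root.findall("phrase")}


def render(element, phrases: dict, seen: tuple = ()) -> str:
    """Render an element's content, replacing <phrase ref='x'/> with phrase x."""
    parts = [escape(element.text or "")]
    for child in element:
        ref = child.get("ref") if child.tag == "phrase" else None
        if ref is not None:
            if ref in seen:
                raise ValueError(f"circular phrase reference: {ref}")
            parts.append(render(phrases[ref], phrases, seen + (ref,)))
        else:
            # Simplification: attributes of ordinary elements are not kept.
            inner = render(child, phrases, seen)
            parts.append(f"<{child.tag}>{inner}</{child.tag}>")
        parts.append(escape(child.tail or ""))
    return "".join(parts)


SAMPLE = """<phrasing>
<phrase id='news-brief'><p><big>We celebrate!</big></p></phrase>
<phrase id='announcement'><phrase ref='news'/></phrase>
<phrase id='news'><h2>News</h2><phrase ref='news-brief'/></phrase>
</phrasing>"""

phrases = load_phrases(SAMPLE)
print(render(phrases["announcement"], phrases))
```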
Semi-static pages of an ACIS site
The second customization feature is the ability to create web pages in ACIS. The system will serve these pages as if they were a natural part of the service. Thus you can publish news, user instructions and FAQs on an ACIS site and develop them over time.
When ACIS gets a request, it first checks the screens configuration (the screens.xml file). If the request does not match any of the screens in screens.xml, ACIS checks the site/ directory of your ACIS home. If the directory contains an XML file with a matching pathname, ACIS will load this file and show it to the user. A file has a matching pathname if its pathname relative to site/ equals the requested address plus “.xml”. For instance, if base-url is “http://web.site.org” and a request came for “http://web.site.org/about”, ACIS will check the site/ directory for an about.xml file.
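The pathname matching rule can be sketched like this. The sketch covers only the mapping from a requested address to a file under site/; it does not model the screens.xml check that comes first, and site_file_for is a hypothetical name, not an ACIS function.

```python
from pathlib import PurePosixPath
from typing import Optional


def site_file_for(base_url: str, request_url: str) -> Optional[str]:
    """Map a requested address to the matching file under site/, if any."""
    base = base_url.rstrip("/")
    if not request_url.startswith(base + "/"):
        return None
    relative = request_url[len(base) + 1:].strip("/")
    # Refuse empty paths and attempts to climb out of site/.
    if not relative or ".." in PurePosixPath(relative).parts:
        return None
    return f"site/{relative}.xml"


# The example from the text.
print(site_file_for("http://web.site.org", "http://web.site.org/about"))
```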
The structure of such a file is simple:
<page>
<title>page title</title>
<content>
<p>Once upon a time... In a far, far end of the
world... There lived a page.</p>
</content>
</page>
The page element may contain an optional name attribute and optional style elements. The name attribute is for hiding links to a page on that same page. (Remind me later to expand on this.) The style elements are copied to the head of the resulting HTML page. The content of the content element will go into the page body.
The pages will look like all other ACIS pages to the user. They’ll have the same standard header and footer, and will use the same styling.
In the header they’ll have links appropriate to the user’s context. If the user is undergoing the initial registration process, the header will invite her to continue the registration. To an already registered, logged-in user, the header will offer to return to her account or to log out.
When creating such pages, make sure that the page name does not conflict with any of the screens defined in ACIS. Check the screens.xml file for that.
Files in an ACIS installation
It is useful to know a little about which files and directories an ACIS installation consists of. An ACIS home directory will contain these items:
directory | what’s in it |
---|---|
userdata/ | userdata files — files, describing the users’ accounts |
sessions/ | files of currently open sessions |
lib/ | ACIS Perl libraries (source code) |
bin/ | ACIS utility scripts |
unconfirmed/ | finished, but not yet confirmed registration files |
deleted-userdata/ | the userdata files of users, who removed their accounts |
presentation/ | the XSLT templates, which generate HTML pages and emails; also some non-XSLT, but related files. |
RI/ | Update daemon’s files: configuration, logs and data |
SID/ | short-id database logs and backups |
site/ | the semi-static page files, which ACIS will serve as native screens, see Building a site around an ACIS installation |
Other related files:
filename | description |
---|---|
main.conf | main configuration file |
ardb.conf | ARDB configuration file, generated from main.conf |
acis.conf | ACIS::Web configuration file, generated from main.conf |
thisconf.sh | shell script, containing all the configuration parameters from main.conf as environment variables; generated from main.conf |
screens.xml | ACIS::Web (Web::App) screens configuration. See Internals Guide. |
configuration.xml | ARDB data processing configuration. |
contributions.conf.xml | Research contribution roles per research work type. (in ACIS::Data::DumpXML format). |
ri-socket | Update daemon’s unix-domain socket; the daemon listens to it. |
RI/collections | RePEc::Index collections configuration, generated from main.conf |
acis.log | ACIS web-interface general log |
acis-err.log | same as acis.log, but only problems and errors are logged |
sql.log | MySQL queries log. Normally, only client connections and failed queries are logged. |
ardb.log | ARDB log |
SID/short-id.log | Short-ids assignment log. Very important file, never delete it and back up regularly. It is important for short-id database recovery in case of trouble. Short-id database consistency and continuity is essential to ACIS work and the metadata it produces. |
RI/daemon.log | Update daemon log. Includes all update requests, and shows which channel each request was given to for processing. |
RI/update_ch0.log … RI/update_ch4.log | Update daemon logs of processing the actual requests. Each of the files corresponds to one update daemon channel. They show which datafiles and which records were found, and show conflicts and other problems or findings about the data. They grow quickly. |
RI/sql.log | MySQL queries log for connections and queries done by the update daemon. |
back.log | ACIS background research searches log. Each automatic research search is logged there with details on which particular name variations were searched and how many hits there were. |
presentation/default/phrases-local.xml | Phrases local replacement file. |
presentation/default/phrases.xml | Default phrases file. |
Generated: Wed Aug 29 22:59:09 2007
ACIS project, acis@openlib.org