Introduction & overview
Research Profile screens
Identified Items
Automatic Suggestions
Refused Items
Manual Search
Auto Update Preferences
Document-to-Document Links
Full-text URLs
bin/export_fturls_choices script
Main Research Profile screen
Automatic research search
Exact searches
Additional searches
Fuzzy name search
bin/fuzzy_search_table script
Configuration of the Research Profile and its screens
Automatic search
Document-to-document links configuration
Full-text URLs configuration
Fuzzy search
Selected technical details
Technical: database tables
resources
rp_suggestions
ft_urls
ft_urls_choices
Technical: code structure
A research profile (RP) is one of the main parts of a personal record in ACIS. It lists the research works that the person has authored or otherwise helped to create. Research works are usually documents, such as articles or papers, but a work can also be a book, a chapter in a book, a software component, a series, et cetera.
At the same time, the research profile is the part of the ACIS web interface that lets users manage their list of research works.
When a person includes a work in his or her RP, we often refer to the event as “claiming”; we say, for instance, that the user claimed a document.
ACIS maintains its own database of documents and other research items. (We sometimes use the general word “resource” to refer to them.) Users cannot add their own items to the document database directly. A personal RP can only include items that are already present in the resource database.
The Identified Items screen lists all the currently claimed works of a person. It also lets the user remove items from the list, for example to correct the mistake of adding a wrong item.
Automatic search is the main procedure that we execute to find works for a person’s RP. The Automatic Suggestions screen is where we show the results of the automatic search and let the user accept or reject each of them individually.
The refused items list contains research items which should not be suggested for inclusion in the person’s RP. It is a blacklist of sorts.
The Refused Items screen lets the user review the list and delete items from it, if desired.
While automatic search should find every resource an ACIS service has for a person, sometimes the metadata is not accurate. For this and other reasons, automatic search does not always find everything. Therefore, we let users run their own searches by several different criteria: by work title, by author/editor name, and by record identifier.
On the Manual search screen users do those searches and handle their results.
ACIS provides APU (automatic profile update), a service which executes automatic research searches for a person even when the user is not directly asking for them. It may automatically add closely matching items to the person’s RP. A user who does not want this service can disable it on the Auto Update Preferences screen.
Document-to-document links is an advanced feature of RP. It lets users connect the works of their RP with each other, specifying the type of relation between them. For instance, many different works may be different versions of the same research report. Some work is a continuation of an earlier one. And so on.
The range of possible relation types is defined by the system administrator.
On the Document-to-document links screen users can review and delete the links they have previously created and they can create new ones.
The link data is then exported in AMF with the user profile (if AMF export is configured with metadata-amf-output-dir). It may look like this:
<text ref="repec:wop:cirano:96s14">
<follow-up xmlns="http://acis.openlib.org/2007/doclinks-relations">
<text xmlns="http://amf.openlib.org" ref="repec:mit:worpap:382"/>
</follow-up>
</text>
<text ref="repec:wop:epruwp:9701">
<isreferencedby>
<text ref="repec:wop:cirano:97s41"/>
</isreferencedby>
</text>
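As an illustration of how a consumer of the exported AMF might read these links, here is a minimal Python sketch. It is not part of ACIS. It assumes the fragment above sits inside an `amf` root element in the `http://amf.openlib.org` namespace; the function names `local_name` and `extract_links` are hypothetical. Relation element names are compared without their namespace, because relation types are defined by the administrator and may live in different namespaces.

```python
import xml.etree.ElementTree as ET

# The fragment from above, wrapped in an assumed <amf> root element.
SAMPLE = """<amf xmlns="http://amf.openlib.org">
<text ref="repec:wop:cirano:96s14">
  <follow-up xmlns="http://acis.openlib.org/2007/doclinks-relations">
    <text xmlns="http://amf.openlib.org" ref="repec:mit:worpap:382"/>
  </follow-up>
</text>
<text ref="repec:wop:epruwp:9701">
  <isreferencedby>
    <text ref="repec:wop:cirano:97s41"/>
  </isreferencedby>
</text>
</amf>"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_links(xml_text):
    """Return (source ref, relation name, target ref) triples."""
    links = []
    root = ET.fromstring(xml_text)
    for source in root:
        if local_name(source.tag) != "text":
            continue
        for relation in source:          # e.g. <follow-up>, <isreferencedby>
            for target in relation:      # the linked <text ref="..."/>
                if local_name(target.tag) == "text" and target.get("ref"):
                    links.append((source.get("ref"),
                                  local_name(relation.tag),
                                  target.get("ref")))
    return links


if __name__ == "__main__":
    for link in extract_links(SAMPLE):
        print(link)
```

Matching on local names keeps the sketch independent of which namespace a given relation type was configured with; a stricter reader could compare full qualified names instead.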
Full-text URLs is another advanced and optional feature of RP. If you have full-text links for your research works (articles, papers, etc.) but the data is not 100% authoritative, you may ask the authors to review the links and flag them as right or wrong. At the same time, you may ask them for their permission to archive the full-text file (if it is correct). Please refer to the Textilshchiki document, section Full-text file recognition, for a better description of the rationale for this feature.
The Full-text URLs screen shows the currently known URLs for each of the RP items. (There may be several URLs per item.) For each URL it shows the current status. If the user has made no decision about a URL yet, the assumed default status is shown. Otherwise the screen shows the latest decision made by the user. Thus users can review their previous decisions and change them.
See below for instructions on how to configure the feature and for its input data format.
The collected data of users’ decisions can then be exported out of ACIS in a simple format:
bin/export_fturls_choices script
The script exports data from the ft_urls_choices table (and some related fields in other tables). It writes the data to standard output in a simple tab-delimited, one-record-per-line format. The following fields are included (in this order):
authoritative (supplied in the primary metadata for the document) or automatic (automatically found via third-party tools)
correct, abstractpage, wrong, anotherversion
mayarchive, checkupdates, notarchive or an empty string
The script may optionally accept one or two date parameters on the command line. With such parameters, the script outputs only the decisions taken in the given period. If only one date is supplied, the script outputs all data from that day on. The dates are expected in the YYYY-MM-DD format.
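The command-line convention and the output format described above can be sketched in Python as follows. This is an illustrative wrapper, not part of ACIS. The helper names `build_command` and `parse_records` are hypothetical, and the script path is assumed to be relative to the ACIS home directory. Records are returned as plain lists of strings, so no column names are assumed.

```python
import re
from datetime import datetime

SCRIPT = "bin/export_fturls_choices"   # assumed relative to the ACIS home directory
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def build_command(start=None, end=None):
    """Build the argument list for the export script.

    No dates: export everything. One date: everything from that day on.
    Two dates: only decisions taken within the period.
    """
    if end is not None and start is None:
        raise ValueError("an end date needs a start date")
    argv = [SCRIPT]
    for value in (start, end):
        if value is None:
            continue
        if not DATE_RE.fullmatch(value):
            raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
        datetime.strptime(value, "%Y-%m-%d")  # rejects impossible dates such as 2007-02-30
        argv.append(value)
    return argv


def parse_records(output):
    """Split tab-delimited, one-record-per-line output into lists of fields.

    Lines are not stripped, so a trailing empty field (a field may be an
    empty string) is preserved.
    """
    return [line.split("\t") for line in output.splitlines() if line]
```

The argument list from `build_command` can be passed to `subprocess.run(argv, capture_output=True, text=True)`, and the resulting `stdout` string to `parse_records`.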
The main Research Profile screen displays a menu of all the screens, with a brief introduction to each and some general status information. It also provides a button to force an automatic search for the person with her current name variations.
Exact search looks for the person’s name variations in the names of the document authors (and editors). As its name states, it finds exact matches only.
Additional searches are features to find mistyped author (editor) names in the document metadata.
Fuzzy name search requires running the bin/fuzzy_search_table utility every once in a while, plus some configuration.
Find a detailed explanation of how this is supposed to work in the Textilshchiki document, section Fuzzy searching.
bin/fuzzy_search_table script
The script initializes the database tables which are needed for the fuzzy name search to work. It should be run regularly. Depending on the size of your document database, it may take a while to do its job.
Takes no arguments and prints out its progress (the executed database statements) to standard output.
See all research profile parameters.
The whole feature has to be enabled with a document-document-links-profile parameter.
The relation types have to be specified in an XML file, doclinks.conf.xml, in the ACIS installation directory. The file has a simple structure; a self-explanatory example file is supplied in doclinks.conf.xml.eg.
The whole feature won’t be there unless you have enabled it with a full-text-urls-recognition parameter.
The input data format is AMF-based. The authoritative URLs:
<text id="..">
<file>
<url>url</url>
</file>
</text>
Automatically found URLs:
<text id="..">
<hasversion>
<text>
<file>
<url>url</url>
</file>
</text>
</hasversion>
</text>
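The difference between the two input forms can be illustrated with a short Python sketch. This is not ACIS code. The function names, the record id and the example.org URLs are hypothetical, and the sample assumes a record may carry both forms at once. A `url` inside a `file` directly under the record counts as authoritative, while one nested under `hasversion` counts as automatically found.

```python
import xml.etree.ElementTree as ET

# Hypothetical record that carries one URL of each kind.
SAMPLE = """<text id="example:paper:1">
  <file><url>http://example.org/paper.pdf</url></file>
  <hasversion>
    <text>
      <file><url>http://example.org/mirror/paper.pdf</url></file>
    </text>
  </hasversion>
</text>"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def file_urls(parent):
    """Collect the text of <url> elements found in <file> children of parent."""
    urls = []
    for child in parent:
        if local_name(child.tag) == "file":
            urls += [u.text for u in child if local_name(u.tag) == "url"]
    return urls


def classify_urls(text_elem):
    """Return (authoritative, automatic) URL lists for one <text> record."""
    authoritative = file_urls(text_elem)
    automatic = []
    for child in text_elem:
        if local_name(child.tag) == "hasversion":
            for version in child:
                if local_name(version.tag) == "text":
                    automatic += file_urls(version)
    return authoritative, automatic


if __name__ == "__main__":
    print(classify_urls(ET.fromstring(SAMPLE)))
```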
If you have full-text URLs data separate from the document data, configure it as a special metadata collection in main.conf. Use FullTextUrlsAMF as its type. For example, here the collection is named ‘URLs’:
metadata-collections="Papers URLs ..."
metadata-Papers-home=/path/to/Papers
metadata-Papers-type=AMF
metadata-URLs-home=/path/to/URLs/data
metadata-URLs-type=FullTextUrlsAMF
...
Before this data becomes available to users, it has to be processed with the update daemon. You have to explicitly request an update (see bin/updareq).
resources
+----------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------+--------------+------+-----+---------+-------+
| id | varchar(255) | | PRI | | |
| sid | varchar(15) | | MUL | | |
| type | varchar(20) | | | | |
| title | varchar(255) | | MUL | | |
| classif | varchar(50) | YES | | NULL | |
| location | text | YES | | NULL | |
| authors | text | YES | | NULL | |
| urlabout | text | YES | | NULL | |
+----------+--------------+------+-----+---------+-------+
rp_suggestions
+--------+----------+------+-----+---------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+----------+------+-----+---------------------+-------+
| psid | char(15) | | PRI | | |
| dsid | char(15) | | PRI | | |
| role | char(15) | | | | |
| reason | char(30) | | | | |
| time | datetime | | | 0000-00-00 00:00:00 | |
+--------+----------+------+-----+---------------------+-------+
ft_urls
PRIMARY KEY( dsid, checksum ), index url_i(url(30)), index source_i(source(50))
ft_urls_choices
primary key prim(dsid, checksum, psid), index t_i(time), index psid_i(psid)
Core modules:
APU modules:
Document to document links:
Full-text URLs:
Generated: Wed Aug 29 22:59:09 2007
ACIS project, acis@openlib.org