GeneNetwork is based on a combination of Python, JavaScript, and HTML code. The original GN was built in Python by a single programmer, Jintao Wang, who was the only programmer through 2004. Between 2004 and 2006, a small number of other programmers worked on the Python code, including Stephen Pitts, Alex Williams, Hongqiang Li, Zhaohui Sun, and Robert Crowell. At present (May 2007), the GN code consists of approximately 25000 lines of Python (.py), 4200 lines of JavaScript (.js), and 27000 lines of HTML (.htm or .html).
The current lead programmer at UTHSC is Hongqiang Li (May 2007).
The current lead programmer at UWA in Perth is Munish Mehta (May 2007).
The current lead programmer at the HZI in Braunschweig, Germany is Evan Williams (May-Aug 2007).
Some issues were discussed in July-August 2006 with the goal of making the code base more maintainable, for example the migration of all code to Subversion. This topic has been moved to a new page, GNCodeBaseInitialIssues.
Zhaohui Sun built a more portable version of the GN code base. He
- removed almost all site-specific elements, so it is now easier to migrate or install GN at a new location.
- identified all dependencies (packages, environments) that are also important for installing GN on a new machine.
- migrated all necessary components to Subversion.
- described in detail the installation procedure in a README file.
- set up a GN beta site to test the Subversion output code.
The Subversion GN code base now allows programmers to install a fully functional GeneNetwork on other machines (primarily Linux, but also Mac and PC). This new version puts the code base under more formal management and also allows other research groups to contribute more easily and effectively to GN.
GeneNetwork Codebase
The GeneNetwork code base is written in Python and is maintained in a Subversion repository on the WebqtlMachine. The code consists of a set of CGI programs and a set of mod-python programs developed mainly by Jintao Wang. Stephen Pitts also wrote two loosely connected tools, NetworkGraph and CompareCorrelates, that provide additional functionality. Most functions are now performed by mod-python, except those written by Stephen.
Code Management
As of early 2007, all code is in Subversion (see SubversionSetup). Standard operating procedures for code development using Subversion are described at GNCodebasePlan.
Structure of the code (and data)
The GeneNetwork code and data are organized in the directory structure described below. If you check out a copy of the trunk of the GN project from Subversion, you should have these directories (from the top level):
- support/: some important Python packages developed at UTHSC and used by GN
- thirdparty/: some third-party Python packages
- scripts/: some important scripts for installation and for updating the database
- data/: genotype data sets
- tests/: unit testing scripts
- web/: the major work directory
- web/cgi-bin: CGI Python code, including some of Stephen's code
- web/webqtl: the static HTML pages and data, and a working directory
- web/webqtl/webqtl: mod-python code (trait.py is from Stephen)
- web/webqtl/javascripts: JavaScript code
- web/webqtl/changable_html: a copy of the HTML pages and data that are editable from the GN web site
- web/webqtl/image: a place for generating temporary images
- web/webqtl/images: images and icons used by the code
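The layout listed above can be checked mechanically after a checkout. The following is only a minimal sketch and is not part of the GN code base: the function name `missing_dirs` and the constant `EXPECTED_DIRS` are hypothetical, and the directory names are copied from the list above.

```python
import os

# Directories listed above, relative to the top of a trunk checkout.
# (Names taken from this page; the helper itself is a hypothetical sketch.)
EXPECTED_DIRS = [
    "support",
    "thirdparty",
    "scripts",
    "data",
    "tests",
    "web",
    "web/cgi-bin",
    "web/webqtl",
    "web/webqtl/webqtl",
    "web/webqtl/javascripts",
    "web/webqtl/changable_html",
    "web/webqtl/image",
    "web/webqtl/images",
]


def missing_dirs(checkout_root):
    """Return the expected directories that are absent under checkout_root."""
    return [
        rel
        for rel in EXPECTED_DIRS
        if not os.path.isdir(os.path.join(checkout_root, *rel.split("/")))
    ]


if __name__ == "__main__":
    import sys

    # Use the path given on the command line, or the current directory.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for rel in missing_dirs(root):
        print("missing: " + rel)
```

Run from the top of a working copy, the script prints nothing when every listed directory is present.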
John suggests that we should have a list-of-programs page to show basic functions and the flowchart. --JohnShi, Oct 4, 2006.
Installations
Detailed information about the GN installation, including prerequisites, methods of checking out code, and methods of configuration, is given in the README file (attached).
Routine data update / backup
A few scripts run on the headmaster, Opteron, and webqtl computers. These scripts keep data files on the main site and beta sites up to date. Some scripts run as cron jobs in the root account. The scripts are kept in the scripts/ directory of the gn project in the Subversion repository.
- bkupMysql.sh. This script backs up (dumps) all tables of the production webqtl database (db_webqtl). It runs on Opteron as a cron job. Ken Manly has set up a Retrospect server to back up the data dumped by the script.
- generif/addRif.py. This script downloads data from NCBI to update our GeneRIF_Basic table. It runs every day on webqtl as a cron job.
- copyData2beta.csh. This script copies all editable HTML files from the main site to the beta site, so as to update the data in web2qtl. It runs every weekend as a cron job on headmaster.
- checkinhtml.csh. This script submits the changes in those editable HTML files to Subversion. It needs to be run manually on webqtl or opteron because a password needs to be entered.
- A script that updates the web2qtl database from the dump of the opteron database. This has not been implemented (as of May 2007?). The purpose of this script is to keep the main database (opteron) and the test database (web2qtl) synchronized. But note that this process should also happen from the main db to the test db.
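To illustrate the kind of dump step that bkupMysql.sh performs, here is a minimal sketch in Python. It is not the actual script, whose contents are not shown on this page. The function name `build_dump_command`, the dated file-name pattern, and the output directory are all assumptions; only the database name db_webqtl comes from the text above. The sketch only builds the `mysqldump` command line and assumes credentials are supplied elsewhere (for example in a MySQL option file).

```python
import datetime
import os


def build_dump_command(db_name, out_dir, when):
    """Return (argv, dump_path) for a full mysqldump of db_name.

    The file-name pattern <db>-<YYYYMMDD>.sql is an assumption made for
    this sketch, not the naming used by bkupMysql.sh.
    """
    stamp = when.strftime("%Y%m%d")
    dump_path = os.path.join(out_dir, "%s-%s.sql" % (db_name, stamp))
    # --result-file makes mysqldump write the dump straight to dump_path.
    argv = ["mysqldump", "--result-file=" + dump_path, db_name]
    return argv, dump_path


if __name__ == "__main__":
    # "/tmp" is a placeholder output directory for the example.
    argv, path = build_dump_command("db_webqtl", "/tmp", datetime.date.today())
    print(" ".join(argv))
```

Keeping the command construction separate from its execution makes the file naming easy to test without touching a live database. A cron job would then pass `argv` to something like `subprocess.call`.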
Related Issues
This beta site was built completely from the Subversion GN repository. The trunk version of the GN project was checked out onto web2qtl at /gnshare/gnbeta (actually a symbolic link), and Apache was configured accordingly. So far this site has been tested comprehensively, although more tests may be needed. Note that the beta site uses a MySQL database installed locally on the Web2qtlMachine.
Future Plans
- Further test the beta site and then release it to the main site.
- Install GN on other machines (Mac) for development use.