The hard coding in GN is a serious problem. Almost everything is hard-bound to the domain name www.genenetwork.org.

For example, www.systemsgenomics.com maps to www.genenetwork.org but does not work, because the GN bundle code works properly only under www.genenetwork.org. This setup really confuses me; there is no special benefit to doing it this way.

The dehardcoding procedures will modify the code to accomplish the following:

  • The GN HTML and Python code can be put in any directory. The default Apache directory on the latest Linux systems is "/var/www/html", and with the appropriate httpd configuration the code can be put anywhere. The current GN setup only allows the bundle code to live in /gnshare/gn/web/webqtl, which is not good. For example, in a modern server configuration the OS is usually put on / on RAID 1, and the working code base is put on RAID 10 if there is a big data requirement. Another benefit of keeping the two apart is that the data will still be usable if the server crashes.

  • Foreign databases will be synchronized to one server.

  • Distributing GN mirrors requires that GN not be hard-bound to a particular server or directory; otherwise every mirror setup will require a large amount of work.
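
The first goal above, a code base that runs from any directory, can be sketched in Python as follows. This is only an illustration: GN_BASE_DIR is a hypothetical environment variable, and the helper names are not part of the existing GN code.

```python
import os


def resolve_base_dir(environ, fallback):
    """Return the directory that holds the GN code base.

    GN_BASE_DIR is a hypothetical setting used for illustration only;
    `fallback` is used when it is not set.
    """
    return os.path.abspath(environ.get("GN_BASE_DIR", fallback))


def gn_path(base_dir, *parts):
    """Build a path below the code base instead of hard coding it."""
    return os.path.join(base_dir, *parts)
```

A caller would pass os.environ and a default such as the current /gnshare/gn/web/webqtl location, so existing installations keep working while mirrors can point elsewhere.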

Record of Dehardcoding

Kev has been working on commenting the code, and Fan has been working on pulling the database connection modules out of the code and using a unified connection to handle all DB I/O.

Most of the comments are on the machine 132.192.47.13. They're labeled Kev Adler, or just KA. For most files, I tried to summarize what they do and, in relevant cases, what they're called by. A few files were already well commented, but by and large that wasn't the case. Although I don't understand how all the functions work, I've tried to summarize those as well, in terms of what input/output a programmer can expect.

Taking off the 777 privileges

It is ridiculous to make the whole GN code base directory readable, writable and executable by everyone (777)! It is a miracle that the GN code base has not been hacked or modified accidentally. On the bundle exp machine the GN code base is set to 644 and owned by the apache user. After the dehardcoding process is over, a privilege table will be presented.
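
As a rough sketch of how the world-writable bit could be cleared across a directory tree, assuming Python 3 and a POSIX file system (the function names are illustrative, and this is not the procedure that was actually used on the GN servers):

```python
import os
import stat


def without_world_write(mode):
    """Return the permission bits with the 'others may write' bit cleared."""
    return mode & ~stat.S_IWOTH


def drop_world_write(root):
    """Clear the world-writable bit below root and return the changed paths."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        for path in candidates:
            if os.path.islink(path):
                continue  # never chmod through a symbolic link
            mode = stat.S_IMODE(os.lstat(path).st_mode)
            if mode & stat.S_IWOTH:
                os.chmod(path, without_world_write(mode))
                changed.append(path)
    return changed
```

The returned list can be reviewed afterwards, which fits the plan of presenting a privilege table once the dehardcoding is finished.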

Unbind code connection with the URL www.genenetwork.org

Almost all html code has a tag in the code forcing the base directory for all files to www.genenetwork.org. Currently, when a mirror is set up, a python script is run to replace the old URL with the new one. Better would be to comment it out; if the significant directories are structured appropriately, the default, machine-dependent base directory should work.

  • SNAG-0002.jpg (image attachment)
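
A minimal Python sketch of the "comment it out" approach is given here. It assumes the tag in question is an HTML <base ...> element, which this page does not spell out; the function name is illustrative.

```python
import re

# Matches a <base ...> tag that is not already wrapped by this script.
# Assumption: the hard-coded tag is an HTML <base> element.
_BASE_TAG = re.compile(r"(?<!<!-- )<base\b[^>]*>", re.IGNORECASE)


def comment_out_base(html):
    """Wrap every <base ...> tag in an HTML comment, leaving the rest untouched."""
    return _BASE_TAG.sub(lambda match: "<!-- " + match.group(0) + " -->", html)
```

Because already-commented tags are skipped, the function can be run over the same file more than once without nesting comments.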

Taking out the hard link "/"

The "/" needs to be removed and the appropriate directories relocated to "DocumentRoot", defined in httpd.conf. By taking this out we allow the GN code base to be put in any directory that httpd can access.

  • SNAG-0003.jpg (image attachment)

Additionally, folders such as /images and /genotypes need to be taken care of, as well as all the /*.html files. A related problem is that the HTML is itself generated by code. Some of the generated tags are really strange; they need to be replaced by the most common equivalents to maintain compatibility.
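
One possible way to take out the leading "/" is to rewrite root-absolute href and src values as paths relative to the page that contains them. The Python sketch below assumes that interpretation; the function name and the example page paths are hypothetical.

```python
import posixpath
import re

# href="/..." or src="/..." (but not protocol-relative "//host/..." values).
_ROOT_LINK = re.compile(r'''\b(href|src)=(["'])(/(?!/)[^"']*)\2''', re.IGNORECASE)


def relativize(html, page_path):
    """Rewrite root-absolute links relative to page_path.

    page_path is the page's own location below the document root,
    for example "webqtl/main.html" (an illustrative name).
    """
    page_dir = posixpath.dirname("/" + page_path.lstrip("/"))

    def fix(match):
        attr, quote, target = match.groups()
        return attr + "=" + quote + posixpath.relpath(target, page_dir) + quote

    return _ROOT_LINK.sub(fix, html)
```

This is a regex-level sketch rather than a full HTML parser: links built at run time by the Python code would not be caught, and a trailing slash on a directory link is dropped by the path normalisation.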

Separate the Beta code from Production code

In the production code base the beta code has been removed. Production code is only ever updated from beta code, and beta code will be kept only on an experimental platform.

For example, accountX.html has been removed.

dbdoc problem

Many files are not in the appropriate directory. For example, most of the links in http://www.genenetwork.org/advancedSearch3.html do not work. The info files need to be collected from the database and regenerated.

  • SNAG-0004.jpg (image attachment)

The hard link to genenetwork.org has also been removed.

blatinfo problem

This is either a hard-coding problem or a serious case of a missing directory. We need to find out exactly which it is.

  • SNAG-0005.jpg (image attachment)

Unified Glossary.html and References.html

The GeneNetwork references will be kept at www.genenetwork.org/references.html on the bundle server. Any other server gets its own references.html, the aim being that only one references.html has to be updated and is then synced worldwide.

  • SNAG-0006.jpg (image attachment)

Directory Setup

For example, images/upload is a directory for important genotype and phenotype information, while image/upload is a temp directory that contains mid-process files. I reckon images/upload should be changed to another directory whose name makes sense.

  • SNAG-0007.jpg (image attachment)

Dynamic Content Management

The following code shows that the selections in the GUI are hard coded in the html file, which makes them hard to extend. Dynamic content management would be the optimal choice, but it is a big job.

  • SNAG-0008.jpg (image attachment)
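
To illustrate what generating the selections from data could look like, here is a small Python sketch. The field name and the choices are placeholders; in practice the list would presumably come from the database rather than from a literal.

```python
from html import escape


def render_select(name, choices, selected=None):
    """Render a <select> element from (value, label) pairs instead of hard coding it."""
    lines = ['<select name="%s">' % escape(name)]
    for value, label in choices:
        mark = " selected" if value == selected else ""
        lines.append('  <option value="%s"%s>%s</option>' % (escape(value), mark, escape(label)))
    lines.append("</select>")
    return "\n".join(lines)
```

Adding a new choice then means adding one row of data, not editing every html file that contains the menu.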

List of base URLs that have been taken off:

  • www.genenetwork.org
  • www.webqtl.org
  • web2qtl.utmem.edu
  • webqtl.utmem.edu

In total, 613 hard links were removed from the GN code base.
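
A count like this can be reproduced with a small scanner. The Python sketch below looks for absolute http or https links to the four hosts listed in this section; it is an illustration, not the script that produced the figure of 613.

```python
import re

# The four base hosts named in this section.
BASE_HOSTS = [
    "www.genenetwork.org",
    "www.webqtl.org",
    "web2qtl.utmem.edu",
    "webqtl.utmem.edu",
]

_HARD_LINK = re.compile(
    r"https?://(?:%s)" % "|".join(re.escape(host) for host in BASE_HOSTS),
    re.IGNORECASE,
)


def find_hard_links(text):
    """Return every hard-coded base URL prefix found in text."""
    return _HARD_LINK.findall(text)


def count_hard_links(text):
    """Count the hard-coded base URLs in text."""
    return len(find_hard_links(text))
```

Running count_hard_links over each html and Python file after a change gives a quick check that no hard link has crept back in.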


Unifying DB Link to MySQL database

In the old code there are 5 types of connection to the MySQL database, which is totally unnecessary. Some of the connections do not even allow username and password authorization. Here we unify all DB links into one and set it in the webqtlConfig file. In future we need to modify the webqtlConfig file only once if the database configuration changes.
  • SNAG-0009.jpg (image attachment)

  • SNAG-0010.jpg (image attachment)
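
The idea of a single connection point can be sketched in Python as below. The setting names and values are hypothetical stand-ins, not the actual contents of webqtlConfig; the connect argument would be something like MySQLdb.connect, which accepts host, user, passwd and db keyword arguments.

```python
# Hypothetical stand-in for the settings kept in webqtlConfig;
# the real names and values may differ.
DB_SETTINGS = {
    "host": "localhost",
    "user": "gn_user",
    "passwd": "change-me",
    "db": "gn_db",
}


def open_connection(connect, settings=None):
    """Single entry point for all DB I/O.

    `connect` is the driver's connect function (for example MySQLdb.connect);
    passing it in keeps this sketch independent of any one driver.
    """
    return connect(**(DB_SETTINGS if settings is None else settings))
```

With every module calling open_connection, a change of server or credentials touches only the settings, which is the point of moving them into one config file.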

Analysis of .htaccess file

The Python code requires detailed configuration of the code base directory. This section analyses the current .htaccess file and changes it so that the GN code base can be moved around.

Another purpose is to simplify the .htaccess setting.

In the home directory there is a .htaccess file like the following:

  • SNAG-0000.png (image attachment)

Options +Includes adds server-side includes (SSI) on top of the previous httpd.conf setting; with Includes, as opposed to IncludesNOEXEC, SSI pages may also execute CGI scripts. The GN code base httpd.conf is complicated, and each directory is given its own permissions.

Options -Indexes disables directory indexing and browsing, which is useful.

XBitHack on: XBitHack tells Apache to parse files for SSI directives if they have the execute bit set. So, to add SSI directives to an existing page, rather than having to change the file name, you just need to make the file executable using chmod. I have doubts about this particular setting.

The .htaccess file of the webqtl directory (the Python code directory) is set as follows:

Each Python file has to tell Apache its PythonHandler; otherwise Apache does not know what to do with it (a bit immature).

  • SNAG-0001.png (image attachment)

  • SNAG-0002.png (image attachment)

PythonPath defines the library directories that mod_python searches. It will be changed from an absolute path to a relative path plus sys.path, like this:

PythonPath "['../webqtl'] + sys.path"

One caveat is that the required libraries have to be installed in a directory on sys.path.

Then the GN code base can be moved around.
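
One caveat with a relative entry such as '../webqtl' is that Python resolves it against the process's current working directory, which may not be the code base directory. A hedged alternative, sketched below, is to compute the library directory from the location of a known file; the function name and the example paths are hypothetical.

```python
import os
import sys


def add_library_dir(anchor_file, relative_dir, path=None):
    """Put a directory, given relative to anchor_file, at the front of the search path.

    Resolving against the file's own location keeps the result independent
    of the current working directory.
    """
    if path is None:
        path = sys.path
    anchor_dir = os.path.dirname(os.path.abspath(anchor_file))
    lib_dir = os.path.normpath(os.path.join(anchor_dir, relative_dir))
    if lib_dir not in path:
        path.insert(0, lib_dir)
    return lib_dir
```

A handler module could call add_library_dir(__file__, "../webqtl") at import time, so moving the whole tree requires no path edits.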

Topic attachments

Attachment      Size      Date                  Who
SNAG-0000.png   13.2 K    05 Jun 2008 - 16:31   FanZhang
SNAG-0001.png   21.3 K    05 Jun 2008 - 16:31   FanZhang
SNAG-0002.jpg   80.3 K    05 Jun 2008 - 16:31   FanZhang
SNAG-0002.png   2.2 K     05 Jun 2008 - 16:31   FanZhang
SNAG-0003.jpg   48.6 K    05 Jun 2008 - 16:31   FanZhang
SNAG-0004.jpg   69.5 K    05 Jun 2008 - 16:31   FanZhang
SNAG-0005.jpg   53.0 K    05 Jun 2008 - 16:31   FanZhang
SNAG-0006.jpg   84.1 K    05 Jun 2008 - 16:31   FanZhang
SNAG-0007.jpg   117.4 K   05 Jun 2008 - 16:31   FanZhang
SNAG-0008.jpg   183.4 K   05 Jun 2008 - 16:31   FanZhang
SNAG-0009.jpg   29.9 K    05 Jun 2008 - 16:31   FanZhang
SNAG-0010.jpg   56.8 K    05 Jun 2008 - 16:31   FanZhang
Topic revision: r13 - 02 Sep 2008 - 20:05:34 - RobWilliams
 