Future Ideas

From Mouse BIRN Team

Mouse BIRN Atlas Tool (MBAT) (Some use case comments based on the VTC between UCLA/LONI and Rob Williams last week)

  • The existing GeneNetwork GUI in MBAT allows a user to submit queries to the GeneNetwork scriptable interface based on the gene ID. (There are some hard-coded tables in MBAT that contain the gene IDs available for different strains and diseases.) An example of such a query is http://www.genenetwork.org/cgi-bin/WebQTL.py?cmd=sch&gene=Adora2a&format=text . The results of the initial query are parsed to obtain expression values for the different structures within each probeset, and the expression values for the highest-scoring probeset are then displayed in the atlas image. The user can request additional annotation information, in which case a second query is submitted: http://www.genenetwork.org/cgi-bin/WebQTL.py?cmd=sch&gene=Adora2a&format . When "format=text" is omitted, the server returns HTML output containing a lot more annotation information. This HTML is then parsed so that the information can be displayed to the user.

  • At a recent all-hands meeting, Rob Williams presented a mock-up drawing of an enhanced version of the MBAT GeneNetwork GUI (GeneNetworkClient.jpg). We would like to implement the enhanced version, but it does not currently seem possible to submit expanded queries that contain "Array" and "Transform" information. Rob suggested that it may be necessary to hard-code some of this information into tables. To do this, I just need to know where I can find a list of the different Array and Transform values.

  • We would also like to be able to query the GeneNetwork scriptable interface to obtain annotation information that can be used to filter probeset records. For example, a user could select keywords that appear in the annotation, and MBAT would then submit a query to obtain the probesets whose annotation matches those keywords. (Rob Williams mentioned that he has such annotation data stored in a MySQL database, but it is not available via the GeneNetwork scriptable interface.) I'm not familiar enough with GeneNetwork or the origins of the data to know what the appropriate commands might be, but maybe something like http://www.genenetwork.org/cgi-bin/WebQTL.py?cmd=annot&geneSym=calb1&format=text . The server would then search the entries in Rob's annotation database and return information for the gene symbol "calb1". Perhaps queries for keywords, UniGene ID and accession number could be submitted in the same way.
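
As a rough illustration of the query patterns described in the bullets above, here is a minimal Python sketch that only builds the URLs; it does not contact the server or parse any response. The "sch" query with "gene" and "format=text" comes from the examples above. The helper names are made up for this sketch, the "annot" command with "geneSym" is only the proposal from the last bullet (not an existing GeneNetwork command), and the HTML variant here simply leaves the "format" parameter out, which is an assumption about what the server accepts.

```python
from urllib.parse import urlencode

# Base address of the GeneNetwork scriptable interface, as quoted above.
BASE_URL = "http://www.genenetwork.org/cgi-bin/WebQTL.py"


def build_query(cmd, text_format=True, **params):
    """Return a query URL for the given command and parameters.

    Hypothetical helper for this sketch; it only assembles the URL.
    """
    query = {"cmd": cmd, **params}
    if text_format:
        # Plain-text output; leaving this out is assumed to yield HTML.
        query["format"] = "text"
    return BASE_URL + "?" + urlencode(query)


def gene_search_url(gene, text_format=True):
    """URL for the existing gene search (cmd=sch)."""
    return build_query("sch", text_format, gene=gene)


def annotation_url(gene_symbol):
    """URL for the PROPOSED annotation query (cmd=annot).

    This command does not exist yet; it mirrors the suggestion above.
    """
    return build_query("annot", True, geneSym=gene_symbol)


if __name__ == "__main__":
    print(gene_search_url("Adora2a"))                     # text results
    print(gene_search_url("Adora2a", text_format=False))  # HTML results
    print(annotation_url("calb1"))                        # proposed query
```

Keeping the URL construction in one helper would let Array, Transform, keyword, UniGene ID or accession number parameters be added later as extra keyword arguments, if the server comes to support them.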

Use UML modeling for Databases

We would use a modeling tool with round-trip support for our databases, that is, one that can convert a model to DDL and back again, so that the SQL run on the server is generated from the model.

From Bill Bug's 8/1/06 email: I've started using UML (as opposed to standard E-R diagramming) for data modeling, because the open source tools support this much better.

Here are two nice write-ups on how to do this: http://www.tomjewett.com/dbdesign/dbdesign.php http://www.agiledata.org/essays/dataModeling101.html#WhatIsDataModeling

Several aggregation web sites list a lot of great open source tools for round-trip and/or reverse-engineering (model <--> DDL SQL) design:

  • http://www.gnome.org/projects/dia/links.html (I need to go back to this; I'd not looked in about a year, and clearly there are many new tools. Some may be a lot better than my current kludgey approach, which works well for me because I don't mind tweaking Perl, Python, or Ruby scripts a bit when I need to.)
  • http://www.schemamania.org/
  • http://erw.dsi.unimi.it/ (web-based RDBMS data modeling)
  • http://www.databaseanswers.com/modelling_tools.htm (commercial & open source tools)
  • http://www.cs.uiowa.edu/~rlawrenc/teaching/144/design.html (mostly commercial tools)

Umbrello is my favorite open source UML modeling tool in terms of ease of use, use of XMI, and nice graphics. Unfortunately, some of the idiosyncrasies of how you need to relate entities for RDBMS data models are much better supported in DIA than in Umbrello. I also find that the DIA layers significantly augment what you can achieve in DIA in a short amount of time, even for fairly complex models. Unfortunately, the current scripts for converting DIA UML data model diagrams into DDL SQL that can be sent directly to the RDBMS to build the database don't take layers into account. I've been working on a Ruby port of several pieces from various DDL SQL generation scripts that can take the layers into account (e.g., allow you to specify which layers to include in the DDL SQL produced), but I've just not had time to go back and finish this work.
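
To make the layer idea concrete, here is a minimal Python sketch of layer-aware DDL generation. It is not the Ruby port mentioned above and it does not read DIA or XMI files; the in-memory model, the class and function names, and the example tables and layer names are all assumptions made up for illustration. It only shows the selection step: each table carries a layer name, and the generator emits CREATE TABLE statements for the requested layers only.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    sql_type: str
    primary_key: bool = False
    nullable: bool = True


@dataclass
class Table:
    name: str
    layer: str  # name of the diagram layer the table was drawn on
    columns: list = field(default_factory=list)


def table_ddl(table):
    """Return a CREATE TABLE statement for one table."""
    lines = []
    for col in table.columns:
        parts = [col.name, col.sql_type]
        if col.primary_key:
            parts.append("PRIMARY KEY")
        elif not col.nullable:
            parts.append("NOT NULL")
        lines.append("  " + " ".join(parts))
    return "CREATE TABLE {} (\n{}\n);".format(table.name, ",\n".join(lines))


def model_ddl(tables, layers=None):
    """Return DDL for all tables, or only those on the given layers."""
    selected = [t for t in tables if layers is None or t.layer in layers]
    return "\n\n".join(table_ddl(t) for t in selected)


# Hypothetical two-layer model, invented for this example.
MODEL = [
    Table("gene", "core", [
        Column("gene_id", "INTEGER", primary_key=True),
        Column("symbol", "VARCHAR(32)", nullable=False),
    ]),
    Table("probeset", "expression", [
        Column("probeset_id", "INTEGER", primary_key=True),
        Column("gene_id", "INTEGER", nullable=False),
        Column("score", "FLOAT"),
    ]),
]

if __name__ == "__main__":
    # Emit DDL for the "core" layer only.
    print(model_ddl(MODEL, layers={"core"}))
```

Calling model_ddl(MODEL) with no layers argument would emit every table; a real converter would also need to handle foreign keys and the ordering of dependent tables, which this sketch leaves out.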

It's really critical to be able to go round-trip from a graphic view to an XMI view to DDL SQL. I've not gotten all of this yet from open source tools, but very nearly. Some of the commercial, enterprise-level tools support this, but we just can't afford to have every person who needs to interact at this level across all of these neuroinformatics projects using such tools. The learning curve for the integrated commercial products is not trivial, so the added hoops of bringing the open source pieces together are nearly a wash.

-- StephenPitts - 01 Aug 2006

Topic revision: r4 - 14 Aug 2006 - 19:23:07 - SteveAnderson
 