BDF2 genotype dataset (this can be used as a sample for adding new genotype data sets):
- InbredSet? Table: insert a record to explain the new dataset
- GenoFreeze? Table: insert a record to create the relationship between the Inbred and the Genotype file
- Strain Table: insert the strains
- StrainXRef? Table: insert the relationship of Strain and InbredSet?
- GenoXRef? Table: insert the relationship of GenoFreeze? , Genotype, and Data
- Data Table: insert the marker's value of each strain
- Create a Genotype file under /node_apps/www/html/webqtl/genotypes
--
HongqiangLi - 28 Sep 2006
Illumina microarray dataset (this is an example for adding a new type of microarray data set):
1.
GeneChip? Table: insert a record to explain the new array type or chip
2.
ProbeFreeze? Table: insert a record to explain the probe or probe set data set.
Note: even we don't have probe data actually, we also need add these records (these records do not relate with real data, since we do not use the Illumina's single bead/probe level data)
3.
ProbeSetFreeze? Table: insert a record to explain the Microarray data set (
ProbeSetData? set), the
ProbeFreezeId? item should be the value that step 1 inserted.
4.
ProbeSetXRef? and Data Table: insert the expression values to Data Table and the statistic values(mean, lrs, pvalue, etc) in
ProbeSetXRef? , the
ProbeSetFreezeId? item should be the value that step 2 inserted.
Ensembl Probe Location Tables:
At first, download Ensembl's dataset (it should be mus_musculus_core_*, * means the version), insert them to the 'test' database using the included sql script files), then use the follow scripts to create the Tables in 'db_webqtl' database
1. use db_webqtl;
2. create table EnsemblChip (Id int not NULL, ProbeSetSize int not NULL, Name varchar(40) not NULL, Type enum('AFFY', 'OLIGO'));
3. insert into EnsemblChip (select oligo_array_id, probe_setsize, name, type from test.oligo_array);
4. create table EnsemblProbe (Id int not NULL, ChipId int not NULL, ProbeSet varchar(40), Name Varchar(40), length int not NULL);
5. insert into EnsemblProbe (select oligo_probe_id, oligo_array_id, probeset, name, length from test.oligo_probe);
6. create table EnsemblProbeLocation (Chr char(2) not NULL, Start int not NULL, End int not NULL,
Strand int not NULL, MisMataches int , ProbeId int not NULL);
7. insert into EnsemblProbeLocation(select name, seq_region_start, seq_region_end, seq_region_strand, mismatches, oligo_probe_id
from test.seq_region as seq_region, test.oligo_feature as oligo_feature
where seq_region.coord_system_id=1 and seq_region.seq_region_id=oligo_feature.seq_region_id);
8. update EnsemblProbe
set name=concat(right(concat('XX', left(name,instr(name,':')-1)),3),
right(concat('YY', mid(name, instr(name, ':')+1, instr(name, ';')-instr(name, ':')-1)),3))
where length(left(name,instr(name,':')-1))<=3 and
length(mid(name, instr(name, ':')+1, instr(name, ';')-instr(name, ':')-1))<=3;
9. update EnsemblProbe
set name=concat(left(name,instr(name,':')-1),
right(concat('YY', mid(name, instr(name, ':')+1, instr(name, ';')-instr(name, ':')-1)),3))
where length(left(name,instr(name,':')-1))>3;
10. update EnsemblProbe
set name=concat(right(concat('XX', left(name,instr(name,':')-1)),3),
mid(name, instr(name, ':')+1, instr(name, ';')-instr(name, ':')-1))
where length(mid(name, instr(name, ':')+1, instr(name, ';')-instr(name, ':')-1))>3;
11. create table GeneChipEnsemblXRef (GeneChipId int not null, EnsemblChipId int not null);
12. insert into GeneChipEnsemblXRef values(1,3);
13. insert into GeneChipEnsemblXRef values(2,11);
14. insert into GeneChipEnsemblXRef values(2,12);
Note: this script is based on ensembl's dataset version 41_36b, for the other version, maybe the table name and fields be changed.
A sample script of getting Ensembl Probe Location:
SELECT EnsemblProbeLocation.*
FROM EnsemblProbeLocation, EnsemblProbe, EnsemblChip, GeneChipEnsemblXRef, ProbeFreeze
WHERE EnsemblProbeLocation.ProbeId=EnsemblProbe.Id and
EnsemblProbe.ChipId=GeneChipEnsemblXRef.EnsemblChipId and
GeneChipEnsemblXRef.GeneChipId=ProbeFreeze.ChipId and
EnsemblProbe.Name=%s1 and
EnsemblProbe.ProbeSet=%s2 and
ProbeFreeze.Name=%s3
group by Chr, Start, End
Note:
%s1: the value match the field 'Name' of the table Probe
%s2: the value match the field 'Name' of the table ProbeSet,
if the name formats '*****_at_A' or '*****_at_B' (ProbeSet is M430AB),
the value should cut '_A'/'_B', it would just be '*****_at'
A The scripts of triggers on table DBList, let it match the updating of table ProbeSetFreeze?
delimiter |
CREATE TRIGGER DBListInsert BEFORE INSERT ON ProbeSetFreeze
FOR EACH ROW BEGIN
INSERT INTO DBList SET DBTypeId=1, FreezeId=NEW.Id, Name=NEW.Name, Code=NEW.Name;
END;
|
CREATE TRIGGER DBListDelete BEFORE DELETE ON ProbeSetFreeze
FOR EACH ROW BEGIN
DELETE FROM DBList where DBTypeId=1 and FreezeID=OLD.Id;
END;
|
CREATE TRIGGER DBListUpdate BEFORE UPDATE ON ProbeSetFreeze
FOR EACH ROW BEGIN
UPDATE DBList set FreezeId=NEW.Id, Name=NEW.Name, Code=NEW.Name
where DBTypeId=1 and FreezeID=OLD.Id;
END;
|
delimiter ;
Note: the scripts only on
local machine and
root privilege
--
HongqiangLi - 24 DEC 2007
Strain name correction
- Data table, update old id to new id
- SE table, update old id to new id
- StrainXRef? table, update old id to new id if new id doesn't exist, delete old id if new id exist
| old name |
old id |
new name |
new id |
| BTBR T+ tf/J |
1872 |
BTBRT<+>tf/J |
940 |
| BTBR_T+_tf/J |
2155 |
BTBRT<+>tf/J |
940 |
| CAST/Ei |
914 |
CAST/EiJ |
947 |
| BALB/cBy |
347 |
BALB/cByJ |
913 |
| C57BL? /6By |
1890 |
C57BL? /6ByJ |
348 |
| ILSXISSF1 |
2156 |
LSF1 |
1098 |
| ISSXILSF1 |
2157 |
SLF1 |
1099 |
| ILS/Bg |
1930 |
ILS |
369 |
| ISS/Bg |
1931 |
ISS |
370 |
| B6D2F1? /J |
1864 |
B6D2F1? |
1 |
| AXB14 |
164 |
AXB13a |
164 |
| Add new strain, BXD101, BXD102, BXD103 |
--
HongqiangLi - Jan04 2008
Automatically update
1.
GeneRIF? , run the scripts file gn/scripts/generif/addRIF.py periodicly to update the content.
-- Main.Hongqiang Li
Upload Human Data (this can be used as a sample for adding new kinkd of data sets):
- GeneChip? Table: insert a record to descript the Chip platform
- InbredSet? Table: insert a record to descript the Inbred Set
- Strain Table: insert the strains
- StrainXRef? Table: insert the relationship of Strain and InbredSet?
- ProbeSet Table: insert the ProbeSets
- ProbeSetXRef? Table: insert the summary of traits
- Data Table: insert the expression values
- Create a Genotype file under gn/web/webqtl/genotypes
--
HongqiangLi - Oct 17, 2008