ArrayExpress consists of the database itself, a data loader, and a data access interface. ArrayExpress runs on the Oracle RDBMS. However, we use very few Oracle-specific features, so porting to other DBMS platforms is possible; on the database side, only the DDL scripts would have to be adapted to a different syntax. MAGEloader uses an Oracle sequence to generate unique object identifiers, so where the underlying RDBMS does not provide sequences, some methods (localized inside a single class) would need to be changed to generate identifiers in some other way. We have been contacted by groups who intend to port ArrayExpress to other RDBMSs, and we know of several such efforts. The database E/R model was auto-generated from a modified MAGE-OM by our own tool. The database contains more than 200 tables, derived from around 150 classes in the MAGE-OM.
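As an illustration of the identifier-generation point above, the following is a minimal sketch of how identifier creation could be kept behind a single class and backed by a counter table when the RDBMS has no sequences. It is not MAGEloader code: the names IdGenerator, TableIdGenerator and id_counter are hypothetical, and SQLite is used only because it ships with Python and lacks sequences.

```python
import sqlite3
from abc import ABC, abstractmethod


class IdGenerator(ABC):
    """Hypothetical interface: the only place that knows how IDs are made."""

    @abstractmethod
    def next_id(self) -> int:
        ...


class TableIdGenerator(IdGenerator):
    """Assumed fallback for an RDBMS without sequences: a counter table."""

    def __init__(self, conn: sqlite3.Connection, name: str = "object_id"):
        self.conn = conn
        self.name = name
        conn.execute(
            "CREATE TABLE IF NOT EXISTS id_counter ("
            "name TEXT PRIMARY KEY, value INTEGER NOT NULL)"
        )
        # Create the counter row once; an existing counter is left untouched.
        conn.execute(
            "INSERT OR IGNORE INTO id_counter (name, value) VALUES (?, 0)",
            (name,),
        )
        conn.commit()

    def next_id(self) -> int:
        # Increment and read back inside one transaction; the connection
        # context manager commits on success and rolls back on error.
        with self.conn:
            self.conn.execute(
                "UPDATE id_counter SET value = value + 1 WHERE name = ?",
                (self.name,),
            )
            row = self.conn.execute(
                "SELECT value FROM id_counter WHERE name = ?", (self.name,)
            ).fetchone()
        return row[0]


conn = sqlite3.connect(":memory:")
ids = TableIdGenerator(conn)
first, second = ids.next_id(), ids.next_id()
```

Under this design, a port to another platform would swap in a different IdGenerator implementation (for example, one that reads from a native sequence) without touching the code that requests identifiers.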
The mapping used is relatively straightforward: classes are mapped to tables one-to-one, so the data for a single object may be distributed across several tables according to its inheritance hierarchy; 1-to-1 and 1-to-many associations are mapped to foreign keys, while many-to-many associations are mapped to link tables. Some local modifications of the object model were made to improve the performance of common queries. The database has been described in: U. Sarkans, H. Parkinson, G. Garcia Lara, A. Oezcimen, A. Sharma, N. Abeygunawardena, S. Contrino, E. Holloway, P. Rocca-Serra, G. Mukherjee, M. Shojatalab, M. Kapushesky, S. Sansone, A. Farne, T. Rayner, and A. Brazma. The ArrayExpress gene expression database: a software engineering and implementation perspective. Bioinformatics, 2005, Vol. 21, No. 8: 1495-1501. The database and all detailed documentation are available from http://www.ebi.ac.uk/arrayexpress
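The class-to-table mapping rules described above can be sketched on a toy model. This is only an illustration under stated assumptions, not the tool used to generate the ArrayExpress schema: the class names (Base, Sample, Study, Contact), the association kinds, and the table and column naming are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Association:
    target: str  # name of the associated class
    kind: str    # "to_one", "to_many" or "many_to_many"


@dataclass
class ModelClass:
    name: str
    parent: Optional[str] = None
    associations: List[Association] = field(default_factory=list)


def map_to_tables(classes: List[ModelClass]) -> Dict[str, List[str]]:
    """Return {table name: column descriptions} for a toy object model."""
    # One table per class. A subclass table reuses the parent's id value,
    # so one object is stored as one row in each table of its hierarchy.
    tables = {
        c.name: [f"id -> {c.parent}.id" if c.parent else "id"]
        for c in classes
    }
    for c in classes:
        for a in c.associations:
            if a.kind == "to_one":
                # Foreign key held by the referring class.
                tables[c.name].append(f"{a.target}_id -> {a.target}.id")
            elif a.kind == "to_many":
                # Foreign key held by the "many" side, pointing back.
                tables[a.target].append(f"{c.name}_id -> {c.name}.id")
            elif a.kind == "many_to_many":
                # Separate link table with one foreign key per side.
                tables[f"{c.name}_{a.target}"] = [
                    f"{c.name}_id -> {c.name}.id",
                    f"{a.target}_id -> {a.target}.id",
                ]
            else:
                raise ValueError(f"unknown association kind: {a.kind}")
    return tables


# Hypothetical model: two subclasses of Base, one association of each kind.
model = [
    ModelClass("Base"),
    ModelClass("Contact"),
    ModelClass("Sample", parent="Base"),
    ModelClass(
        "Study",
        parent="Base",
        associations=[
            Association("Contact", "to_one"),
            Association("Sample", "many_to_many"),
        ],
    ),
]
tables = map_to_tables(model)
for table, columns in tables.items():
    print(table, columns)
```

In this sketch, four classes yield five tables because the many-to-many association adds a link table, and loading a Study object means writing one row to Base and one to Study that share the same identifier.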