Gareth Williams - October 2013


SPIEDw
version 2.0 October 2013

SPIEDw
SPIEDw (SPIEDw paper) is a web tool designed to facilitate fast and simple quantitative querying of publicly available gene expression data. The new resource is a development of the previously published SPIED (version 1.0 for reference), which was designed to be downloaded and queried locally with the SPIED software. SPIEDw features an enhanced search algorithm for speedy real-time querying; an extended database of over 200,000 array samples, now covering Agilent and Illumina technologies; and an abridged dataset comprising the most regulated genes, included for a speedier search and for gene set enrichment searches. SPIEDw is simple to use, requires no expertise in microarray analysis, and produces output that is straightforward to interpret.

Querying SPIEDw
SPIEDw is searched with query profiles consisting of genes and their associated expression levels. Query files are txt files with two or more columns: the first column lists the gene names and the second column lists the expression levels. The gene names are human gene names in line with the HUGO Gene Nomenclature Committee (HGNC, www.genomes.org). Expression levels are usually defined relative to a control, and SPIEDw queries assign opposite signs to up- and down-regulated genes. If t is the treatment/condition level and c the associated control level, then the expression level, r, can be the sign of the change (r = +1 if t > c, r = -1 if t < c), the log ratio r = log(t/c), the normalized difference r = (t-c)/(t+c), etc. Example query expression profiles: profileX, profileY. SPIEDw can also be queried for gene set enrichment, for example with pathway gene sets. Here, each query gene is assigned a fold of unity and SPIEDw is queried in 'FAST' mode.
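As an illustration of the query format described above, the following minimal Python sketch computes the three example expression levels and formats a two-column query. The function names, the tab delimiter, the four-decimal formatting and the measurement values are assumptions made for this example; the page only states that query files are txt files with gene names in the first column and expression levels in the second.

```python
import math

def expression_levels(t, c):
    """Return three alternative signed expression levels for a
    treatment level t and control level c (both assumed positive)."""
    sign = (t > c) - (t < c)        # +1 if up, -1 if down, 0 if unchanged
    log_ratio = math.log(t / c)     # r = log(t/c)
    contrast = (t - c) / (t + c)    # r = (t-c)/(t+c), lies between -1 and 1
    return sign, log_ratio, contrast

def query_lines(measurements, column=1):
    """Format one 'gene<TAB>level' line per gene.
    column selects the level: 0 = sign, 1 = log ratio, 2 = (t-c)/(t+c).
    The tab delimiter is an assumption, not a documented requirement."""
    lines = []
    for gene, (t, c) in measurements.items():
        level = expression_levels(t, c)[column]
        lines.append(f"{gene}\t{level:.4f}")
    return lines

# Hypothetical measurements: gene name -> (treatment level, control level)
example = {"TP53": (200.0, 100.0), "MYC": (50.0, 100.0), "EGFR": (120.0, 120.0)}

if __name__ == "__main__":
    # Print the query; redirect to a .txt file to obtain an upload candidate.
    print("\n".join(query_lines(example, column=1)))
```

Whichever definition of r is chosen, up-regulated genes receive positive values and down-regulated genes negative values, which is the property the query relies on.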

SPIEDw output
The correlation of the query profile with a given SPIEDw entry is measured by a simple Pearson regression score, and entries are ranked according to the significance of that score. Query scores against the abridged database are determined by the Fisher exact test. The output lists the significantly correlating SPIEDw entries in rank order. Each output entry consists of the series id, the sample id, the regression score and the significance. Each entry also has a web link to the NCBI GEO pages for the given series and sample, and a 'magnifying glass' link button to access the query scores against every sample in the series. This functionality enables the user to assess whether there is a correlation between the skew of the enrichment and the treatment or condition being assayed in the series. For example, if the query corresponds to the expression changes upon treatment with a given drug and this query picks out another instance of the same drug in SPIEDw, then one expects a positive enrichment skew for the drug treatment samples and a negative skew for the control samples. The search can be restricted to subsets of SPIEDw based on species. In addition, there is a separate dataset comprising drug treatments. At present this consists of the connectivity map data, but it will subsequently be populated with data from other sources.
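The two scoring ideas named above can be sketched in a few lines of Python. This is not the SPIEDw implementation: the function names, the sample ids and the toy profiles are hypothetical, ranking by absolute correlation is a simplified stand-in for ranking by significance, and the way the Fisher exact test is set up here (a one-sided test on the overlap between an unsigned query gene set and an entry's regulated genes) is an assumption.

```python
import math

def pearson(query, entry):
    """Pearson correlation over the genes present in both profiles.
    Returns None when fewer than 3 genes are shared or a profile is flat."""
    genes = sorted(set(query) & set(entry))
    n = len(genes)
    if n < 3:
        return None
    x = [query[g] for g in genes]
    y = [entry[g] for g in genes]
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

def rank_entries(query, database):
    """Score every entry and sort by |r|, strongest first (simplified ranking)."""
    scored = []
    for sample_id, profile in database.items():
        r = pearson(query, profile)
        if r is not None:
            scored.append((sample_id, r))
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)

def enrichment_p(universe, regulated, query_set):
    """One-sided Fisher exact p-value (hypergeometric upper tail) for the
    overlap between query_set and regulated, both restricted to universe."""
    regulated = regulated & universe
    query_set = query_set & universe
    N, K, n = len(universe), len(regulated), len(query_set)
    k = len(query_set & regulated)
    tail = sum(math.comb(K, i) * math.comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return tail / math.comb(N, n)

# Toy series with hypothetical gene and sample names.
query = {"A": 1.0, "B": -1.0, "C": 0.5, "D": -0.5}
series = {
    "treated_sample": {"A": 2.0, "B": -2.0, "C": 1.0, "D": -1.0},
    "control_sample": {"A": -1.0, "B": 1.0, "C": -0.5, "D": 0.5},
    "unrelated_sample": {"A": 0.1, "B": 0.2, "C": -0.1, "D": 0.0},
}

if __name__ == "__main__":
    for sample_id, r in rank_entries(query, series):
        print(sample_id, round(r, 3))
```

In this toy series the treated sample scores positively and the control sample negatively, which mirrors the skew one would look for when inspecting every sample of a series through the 'magnifying glass' link.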

All queries should be sent to:

Gareth Williams
Bioinformatics
Wolfson Centre for Age-Related Diseases
King's College London
London SE1 1UL
gareth.2.williams@kcl.ac.uk

To start a SPIEDw search, first upload a query gene expression profile:
For an exhaustive search against all genes, select 'FULL'; for a speedy search against an abridged database of the top responders in SPIEDw, select 'FAST':

The search can be restricted to a subset of SPIEDw based on species type or restricted to drug treatments only:

If required, the number of output results can be increased beyond the default of 100:

Hit the 'start' button to commence the SPIEDw search: