Monthly Archives: February 2010

GeneSpring GX 11, or how to make the simple – undocumentable..

I divide my time about equally between GeneSpring GX 11 from Agilent and Bioconductor (free, from awesome people) for microarray analysis. I use the latter for all the neat tools that GeneSpring doesn’t have, and the former because sometimes it is nice to lead a researcher visually through their data without having to type into a green-on-black terminal window.

GeneSpring GX 11 is the third iteration since Agilent bought up Silicon Genetics, then decided to throw the unwieldy, quirky, but very functional GeneSpring product in the trash and start again with something built on Strand Life Sciences’ AVADIS platform. We’ve been through versions 9 and 10, and now we’re on 11. There have been plenty of bugs along the way, the most serious (to me) being the one where GeneSpring 10 managed to miscall the quality flags on Illumina data in 50% of cases. Not good, but at least fixed.

Many people have been griping on mailing lists about functionality missing in the new GeneSpring that existed in the old version.  I always think it’s a matter of familiarity with the software really.  I hadn’t really come across anything I couldn’t do in GeneSpring 11 that I could do in GeneSpring 7.  That was until yesterday.

I sat down with a customer yesterday to look at some microbial Nimblegen data. GeneSpring doesn’t really deal with Nimblegen data very well; you are left with the choice of not analysing it in GeneSpring, or accepting there’s going to be a bit of fudging and some extra annotation steps in order to make the data useable as a ‘Custom technology’. The customer, quite reasonably, asked if we could get the biological genome information (effectively gene annotations that are independent of the chip technology you’re using) loaded into GeneSpring. And thus started a morning of fun and games.

GeneSpring 11 has a very handy import feature for biological genomes under Annotations>Create Biological Genome. That is, provided you want to choose one of their predefined organisms and download the information from NCBI. There is *NO* route in the software to add another organism to this list, or to do anything other than use one of the organisms in their limited set of check boxes. This is not a bug, apparently, because in a separate part of the software (dealing with Pathways for an organism) you can pull this information directly from NCBI using the Taxon ID of the organism you’re interested in. So why can’t you use it to download a biological genome? Who knows…

One of the things I really liked about the old GeneSpring was that it came with a manual a foot thick. It told you how to do every single operation in the UI; it didn’t tell you anything about the order in which to apply them, but you could generally rely on it for an answer. There was no such answer to this issue in the GeneSpring manual.

It transpires that if you really want to do this, the following, slightly insane process needs to take place:

1) Take this snippet of XMLishness:

<hexff version="1.0">

 type
 plugin.product.TaxID
 id
 TaxID
 data
 
 
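 <key>Homo sapiens</key><string>9606</string>
 <key>Mus musculus</key><string>10090</string>
 <key>Rattus norvegicus</key><string>10116</string>
 <key>Anopheles gambiae</key><string>7165</string>
 <key>Arabidopsis thaliana</key><string>3702</string>
 <key>Bacillus subtilis</key><string>1423</string>
 <key>Bos taurus</key><string>9913</string>
 <key>Caenorhabditis elegans</key><string>6239</string>
 <key>Canis lupus familiaris</key><string>9615</string>
 <key>Citrus sinensis</key><string>2711</string>
 <key>Danio rerio</key><string>7955</string>
 <key>Drosophila melanogaster</key><string>7227</string>
 <key>Equus caballus</key><string>9796</string>
 <key>Escherichia coli</key><string>562</string>
 <key>Felis catus</key><string>9685</string>
 <key>Gallus gallus</key><string>9031</string>
 <key>Glycine max</key><string>3847</string>
 <key>Gossypium hirsutum</key><string>3635</string>
 <key>Hordeum vulgare</key><string>4513</string>
 <key>Macaca mulatta</key><string>9544</string>
 <key>Magnaporthe grisea</key><string>148305</string>
 <key>Medicago sativa</key><string>3879</string>
 <key>Medicago truncatula</key><string>3880</string>
 <key>Nicotiana tabacum</key><string>4097</string>
 <key>Oryctolagus cuniculus</key><string>9986</string>
 <key>Oryza sativa</key><string>4530</string>
 <key>Ovis aries</key><string>9940</string>
 <key>Pan troglodytes</key><string>9598</string>
 <key>Plasmodium falciparum</key><string>5833</string>
 <key>Pongo abelii</key><string>9601</string>
 <key>Poplar mosaic virus</key><string>12166</string>
 <key>Populus sp.</key><string>3697</string>
 <key>Pseudomonas aeruginosa</key><string>287</string>
 <key>Saccharomyces cerevisiae</key><string>4932</string>
 <key>Saccharum officinarum</key><string>4547</string>
 <key>Salmo salar</key><string>8030</string>
 <key>Schizosaccharomyces pombe</key><string>4896</string>
 <key>Staphylococcus aureus</key><string>1280</string>
 <key>Sus scrofa</key><string>9823</string>
 <key>Takifugu rubripes</key><string>31033</string>
 <key>Lycopersicon esculentum</key><string>4081</string>
 <key>Triticum aestivum</key><string>4565</string>
 <key>Vitis vinifera</key><string>29760</string>
 <key>Xenopus laevis</key><string>8355</string>
 <key>Xenopus tropicalis</key><string>8364</string>
 <key>Zea mays</key><string>4577</string>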

 
 

</hexff>

2) Add an entry

<key>Your organism name</key><string>NCBI Taxon ID</string>

after the Zea mays line

3) In your GeneSpring directory under this tree:

GeneSpring  GX11binpackagesmarrayproject2.1

Create a folder called ‘plugins’ and save the edited XML above as a file called TaxID.plg

4) Restart GeneSpring and proceed to update your newly added Biological genome, which now appears in the list!
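
If you'd rather not hand-edit the file each time, steps 2 and 3 can be roughed out in a few lines of Python. This is only a sketch under some assumptions: the snippet from step 1 is already in a string, new entries use the key/string form shown in step 2, and the package directory you pass in is wherever your own install keeps that tree. The function names `add_organism` and `write_plugin` are made up for illustration and are nothing to do with GeneSpring itself.

```python
from pathlib import Path
from xml.sax.saxutils import escape


def add_organism(xml_text: str, name: str, taxon_id: int,
                 anchor: str = "Zea mays") -> str:
    """Return xml_text with a new key/string entry added after the anchor line.

    The anchor defaults to "Zea mays" because the steps above say to add
    the new entry after that line.
    """
    lines = xml_text.splitlines()
    for index, line in enumerate(lines):
        if anchor in line:
            # Reuse the anchor line's indentation so the new entry lines up.
            indent = line[: len(line) - len(line.lstrip())]
            entry = f"{indent}<key>{escape(name)}</key><string>{taxon_id}</string>"
            lines.insert(index + 1, entry)
            return "\n".join(lines) + "\n"
    raise ValueError(f"no line mentioning {anchor!r} found")


def write_plugin(xml_text: str, package_dir: Path) -> Path:
    """Create a 'plugins' folder under package_dir and save TaxID.plg in it."""
    plugin_dir = package_dir / "plugins"
    plugin_dir.mkdir(parents=True, exist_ok=True)
    target = plugin_dir / "TaxID.plg"
    target.write_text(xml_text, encoding="utf-8")
    return target
```

Something like `write_plugin(add_organism(snippet, "Your organism name", 12345), Path("..."))` would then do the job, where 12345 is a placeholder rather than a real Taxon ID and the path is left for you to fill in from step 3. You still have to restart GeneSpring afterwards.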

Actually, I have to say, I’m not sure I ever want to see that in the manual for a piece of software as expensive as GeneSpring… And besides, this still doesn’t work for me as advertised, because GeneSpring, whilst aware of what an HTTP proxy might conceivably be, has no concept of what an FTP proxy might be, which is problematic when you need to connect to ftp.ncbi.nlm.nih.gov. Brilliant!