I divide my time about equally between GeneSpring GX 11 from Agilent and BioConductor (free, from awesome people) for microarray analysis. I use the latter for all the neat tools that GeneSpring doesn’t have, and the former because sometimes it is nice to lead a researcher visually through their data without having to type into a green-on-black terminal window.
GeneSpring GX 11 is the third iteration since Agilent bought up Silicon Genetics, then decided to throw the unwieldy, quirky, but very functional GeneSpring product in the trash and start again with something built on Strand Life Sciences’ AVADIS platform. We’ve been through versions 9 and 10, and now we’re on 11. There have been plenty of bugs along the way, the most serious (to me) being the one where GeneSpring 10 managed to miscall the quality flags on Illumina data in 50% of cases. Not good, but at least fixed.
Many people have been griping on mailing lists about functionality missing in the new GeneSpring that existed in the old version. I always think it’s a matter of familiarity with the software really. I hadn’t really come across anything I couldn’t do in GeneSpring 11 that I could do in GeneSpring 7. That was until yesterday.
I sat down with a customer yesterday to look at some microbial Nimblegen data. GeneSpring doesn’t really deal with Nimblegen data very well: you are left with the choice of not analysing it in GeneSpring, or accepting there’s going to be a bit of fudging and some extra annotation steps in order to make the data useable as a ‘Custom technology’. The customer, quite reasonably, asked if we could get the biological genome information (effectively gene annotations that are independent of the chip technology you’re using) loaded into GeneSpring. And thus started a morning of fun and games.
GeneSpring 11 has a very handy import feature for biological genomes under Annotations>Create Biological Genome. That is, providing you want to choose one of their predefined organisms to download the information from NCBI. There is *NO* route in the software to add another organism to this list, or to do anything other than use one of the limited set of organisms offered as check boxes. This is not a bug apparently, which is odd, because in a separate part of the software (dealing with Pathways for an organism) you can pull this information directly from NCBI using the Taxon ID of the organism you’re interested in. So why can’t you use it to download a biological genome? Who knows…
One of the things I really liked about the old GeneSpring was the fact that it came with a manual a foot thick. It told you how to do every single operation in the UI. It didn’t tell you anything about the order in which to apply them, but you could generally rely on it for an answer. There was no such answer to this issue in the GeneSpring manual.
It transpires that if you really want to do this, the following, slightly insane process needs to take place:
1) Take this snippet of XMLishness:
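<hexff version="1.0">
type plugin.product.TaxID
id TaxID
data
<key>Homo sapiens</key><string>9606</string>
<key>Mus musculus</key><string>10090</string>
<key>Rattus norvegicus</key><string>10116</string>
<key>Anopheles gambiae</key><string>7165</string>
<key>Arabidopsis thaliana</key><string>3702</string>
<key>Bacillus subtilis</key><string>1423</string>
<key>Bos taurus</key><string>9913</string>
<key>Caenorhabditis elegans</key><string>6239</string>
<key>Canis lupus familiaris</key><string>9615</string>
<key>Citrus sinensis</key><string>2711</string>
<key>Danio rerio</key><string>7955</string>
<key>Drosophila melanogaster</key><string>7227</string>
<key>Equus caballus</key><string>9796</string>
<key>Escherichia coli</key><string>562</string>
<key>Felis catus</key><string>9685</string>
<key>Gallus gallus</key><string>9031</string>
<key>Glycine max</key><string>3847</string>
<key>Gossypium hirsutum</key><string>3635</string>
<key>Hordeum vulgare</key><string>4513</string>
<key>Macaca mulatta</key><string>9544</string>
<key>Magnaporthe grisea</key><string>148305</string>
<key>Medicago sativa</key><string>3879</string>
<key>Medicago truncatula</key><string>3880</string>
<key>Nicotiana tabacum</key><string>4097</string>
<key>Oryctolagus cuniculus</key><string>9986</string>
<key>Oryza sativa</key><string>4530</string>
<key>Ovis aries</key><string>9940</string>
<key>Pan troglodytes</key><string>9598</string>
<key>Plasmodium falciparum</key><string>5833</string>
<key>Pongo abelii</key><string>9601</string>
<key>Poplar mosaic virus</key><string>12166</string>
<key>Populus sp.</key><string>3697</string>
<key>Pseudomonas aeruginosa</key><string>287</string>
<key>Saccharomyces cerevisiae</key><string>4932</string>
<key>Saccharum officinarum</key><string>4547</string>
<key>Salmo salar</key><string>8030</string>
<key>Schizosaccharomyces pombe</key><string>4896</string>
<key>Staphylococcus aureus</key><string>1280</string>
<key>Sus scrofa</key><string>9823</string>
<key>Takifugu rubripes</key><string>31033</string>
<key>Lycopersicon esculentum</key><string>4081</string>
<key>Triticum aestivum</key><string>4565</string>
<key>Vitis vinifera</key><string>29760</string>
<key>Xenopus laevis</key><string>8355</string>
<key>Xenopus tropicalis</key><string>8364</string>
<key>Zea mays</key><string>4577</string>
</hexff>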
2) Add an entry
<key>Your organism name</key><string>NCBI Taxon ID</string>
after the Zea mays line
3) In your GeneSpring directory under this tree:
Create a folder called ‘plugins’ and save the edited XML above as a file called TaxID.plg
4) Restart GeneSpring and proceed to update your newly added Biological genome, which now appears in the list!
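If you'd rather not hand-edit the file, steps 2 and 3 can be roughly sketched in Python. This is only a sketch under a few assumptions: the file has one entry per line (as "after the Zea mays line" implies), it is treated as plain text rather than parsed as XML, and the function names `add_organism` and `write_plugin` are my own inventions, not anything GeneSpring provides. You still have to supply the directory in which the ‘plugins’ folder belongs.

```python
from pathlib import Path
from xml.sax.saxutils import escape


def add_organism(plg_text: str, name: str, taxon_id: str,
                 anchor: str = "Zea mays") -> str:
    """Return plg_text with a new <key>/<string> entry after the anchor line.

    Assumes one entry per line; the file is edited as text, not parsed.
    """
    if not str(taxon_id).isdigit():
        raise ValueError(f"Taxon ID should be numeric, got {taxon_id!r}")
    # escape() guards against '&', '<' or '>' in the organism name
    entry = f"<key>{escape(name)}</key><string>{taxon_id}</string>"
    lines = plg_text.splitlines()
    for i, line in enumerate(lines):
        if anchor in line:
            # Reuse the anchor line's leading whitespace for the new entry
            indent = line[: len(line) - len(line.lstrip())]
            lines.insert(i + 1, indent + entry)
            return "\n".join(lines) + "\n"
    raise ValueError(f"No line containing {anchor!r} found")


def write_plugin(plg_text: str, parent_dir: Path) -> Path:
    """Create a 'plugins' folder under parent_dir and save TaxID.plg in it."""
    plugins = parent_dir / "plugins"
    plugins.mkdir(parents=True, exist_ok=True)
    target = plugins / "TaxID.plg"
    target.write_text(plg_text, encoding="utf-8")
    return target


if __name__ == "__main__":
    # Placeholder organism and Taxon ID; substitute your own values.
    snippet = "<key>Zea mays</key><string>4577</string>\n</hexff>\n"
    print(add_organism(snippet, "Example organism", "999999"))
```

Working on lines rather than a parsed tree is deliberate: it mirrors the manual instruction exactly and leaves everything else in the file untouched.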
Actually, I have to say, I’m not sure I ever want to see that in a manual of a piece of software as expensive as GeneSpring… And besides this still doesn’t work for me as advertised because GeneSpring, whilst aware of what an HTTP proxy might conceivably be, has no concept of what an FTP proxy might be – which is problematic when you need to connect to ftp.ncbi.nlm.nih.gov. Brilliant!