I divide my time about equally between GeneSpring GX 11 from Agilent and Bioconductor (free, from awesome people) for microarray analysis. I use the latter for all the neat tools that GeneSpring doesn’t have, and the former because sometimes it is nice to lead a researcher visually through their data without having to type into a green-on-black terminal window.
GeneSpring GX 11 is the third iteration since Agilent bought up Silicon Genetics, then decided to throw the unwieldy, quirky, but very functional GeneSpring product in the trash and start again with something built on Strand Life Sciences’ AVADIS platform. We’ve been through versions 9 and 10, and now we’re on 11. There have been plenty of bugs along the way, the most serious (to me) being the one where GeneSpring 10 managed to miscall the quality flags on Illumina data in 50% of cases. Not good, but at least it has been fixed.
Many people have been griping on mailing lists about functionality missing in the new GeneSpring that existed in the old version. I always think it’s a matter of familiarity with the software really. I hadn’t really come across anything I couldn’t do in GeneSpring 11 that I could do in GeneSpring 7. That was until yesterday.
I sat down with a customer yesterday to look at some microbial NimbleGen data. GeneSpring doesn’t really deal with NimbleGen data very well: you are left with the choice of not analysing it in GeneSpring, or accepting that there’s going to be a bit of fudging and some extra annotation steps in order to make the data useable as a ‘Custom technology’. The customer, quite reasonably, asked if we could get the biological genome information (effectively gene annotations that are independent of the chip technology you’re using) loaded into GeneSpring. And thus started a morning of fun and games.
GeneSpring 11 has a very handy import feature for biological genomes under Annotations>Create Biological Genome. That is, provided you want to choose one of their predefined organisms, for which it downloads the information from NCBI. There is *NO* route in the software to add another organism to this list, or to do anything other than pick from their limited set of check-box organisms. This is apparently not a bug, which is odd, because in a separate part of the software (dealing with Pathways for an organism) you can pull this information directly from NCBI using the Taxon ID of the organism you’re interested in. So why can’t you use the Taxon ID to download a biological genome? Who knows…
One of the things I really liked about the old GeneSpring was the fact that it came with a manual a foot thick. It told you how to do every single operation in the UI. It didn’t tell you anything about the order in which to apply them, but you could generally rely on it for an answer. There was no such answer to this issue in the GeneSpring 11 manual.
It transpires that if you really want to do this, the following, slightly insane process needs to take place:
1) Take this snippet of XMLishness:
<key>Canis lupus familiaris</key><string>9615</string>
<key>Poplar mosaic virus</key><string>12166</string>
<key>Zea mays</key><string>4577</string>
2) Add an entry
<key>Your organism name</key><string>NCBI Taxon ID</string>
after the Zea mays line
3) In your GeneSpring directory under this tree:
Create a folder called ‘plugins’ and save the edited XML above as a file called TaxID.plg
4) Restart GeneSpring and proceed to update your newly added Biological genome, which now appears in the list!
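If you have more than one organism to add, the edit in step 2 can be scripted. What follows is only a rough sketch in Python, not anything Agilent documents: it treats the file as plain text, assumes every entry sits on its own line in the <key>name</key><string>taxon ID</string> form shown above, and drops the new entry in after the last existing one. The function name add_organism is mine, and Escherichia coli (NCBI Taxon ID 562) is just an example organism. Saving the result as TaxID.plg in the plugins folder is still up to you.

```python
import re
from xml.sax.saxutils import escape

# One entry per line, e.g. <key>Zea mays</key><string>4577</string>
# (assumption: this is the only kind of line we need to recognise)
ENTRY = re.compile(r"\s*<key>(.*?)</key>\s*<string>(\d+)</string>\s*$")


def add_organism(plg_text, name, taxon_id):
    """Return plg_text with a new organism entry after the last existing one."""
    lines = plg_text.splitlines()
    entry_rows = [i for i, line in enumerate(lines) if ENTRY.match(line)]
    if not entry_rows:
        raise ValueError("no <key>/<string> entries found in the snippet")

    # Leave the text untouched if this taxon ID is already listed.
    for i in entry_rows:
        if ENTRY.match(lines[i]).group(2) == str(taxon_id):
            return plg_text

    last = entry_rows[-1]
    # Reuse the indentation of the last entry so the new line matches.
    indent = lines[last][: len(lines[last]) - len(lines[last].lstrip())]
    new_line = f"{indent}<key>{escape(name)}</key><string>{int(taxon_id)}</string>"
    lines.insert(last + 1, new_line)
    return "\n".join(lines) + "\n"


# Usage on an in-memory copy of the snippet from step 1.
snippet = (
    "<key>Canis lupus familiaris</key><string>9615</string>\n"
    "<key>Poplar mosaic virus</key><string>12166</string>\n"
    "<key>Zea mays</key><string>4577</string>\n"
)
print(add_organism(snippet, "Escherichia coli", 562))
```

Working on the text rather than parsing it as XML is deliberate: I only have a fragment of the file, so I would rather not assume anything about what wraps these entries.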
Actually, I have to say, I’m not sure I ever want to see a procedure like that in the manual for a piece of software as expensive as GeneSpring… And besides, this still doesn’t work for me as advertised, because GeneSpring, whilst aware of what an HTTP proxy might conceivably be, has no concept of what an FTP proxy might be, which is problematic when you need to connect to ftp.ncbi.nlm.nih.gov. Brilliant!