Tag Archives: microarray

Computational biology post open at OGT

Edit: The position is now filled, and we’ve welcomed Luke Goodsell to the group!

Continuing the theme of job adverts as blog posts, we’re currently seeking someone to join the growing Computational Biology group at OGT.  Microarrays are a big part of OGT’s portfolio, and this post is all about the arrays.  This is a new post, and will report to our design expert Duarte Molha:

A Computational Biologist is required to join Oxford Gene Technology’s (OGT) Computational Biology group. The successful applicant will deliver microarray designs for internal R&D projects, Genefficiency Services and CytoSure products. The individual will also be involved in advancing OGT’s expertise in probe and array designs.

We are seeking a highly motivated and innovative individual with a numerate background combined with a solid understanding of molecular biology. Programming skills (Perl is essential; Java or Python are desirable) and scripting skills (awk, sed, bash) applied in a heterogeneous computing environment (Windows, Linux) are essential, as are good attention to detail and customer focus. Experience with MySQL, PHP and version control (git) is desirable, as is experience in handling large datasets and working in a commercially led environment.

You must be a team player with strong communication skills. Well-developed time-management skills and good organisation are important to the role.

An undergraduate degree in Computational Biology, Molecular Biology or a related discipline is required; a Master’s degree is preferred.

We offer a competitive salary together with an excellent benefits package.

If you are interested in applying for this position, please complete the online application form, attaching your CV and stating your salary expectations in the covering letter section. Alternatively, email the required information to hr@ogt.com. We will only accept applications from candidates who are legally permitted to work in the UK.

GeneSpring GX 11, or how to make the simple – undocumentable..

I divide my time about equally between GeneSpring GX 11 from Agilent and BioConductor (free, from awesome people) for microarray analysis.  I use the latter for all the neat tools that GeneSpring doesn’t have, and the former because sometimes it is nice to lead a researcher visually through their data without having to type into a green-on-black terminal window.

GeneSpring GX 11 is the third iteration after Agilent bought up Silicon Genetics, then decided to throw the unwieldy, quirky, but very functional GeneSpring product in the trash and start again with something built on Strand Life Sciences’ AVADIS platform.  We’ve been through versions 9, 10 and now we’re on 11.  There have been plenty of bugs on the way, the most serious (to me) being the one where GeneSpring 10 managed to miscall the quality flags on Illumina data in 50% of cases.  Not good, but at least fixed.

Many people have been griping on mailing lists about functionality missing in the new GeneSpring that existed in the old version.  I always think it’s a matter of familiarity with the software really.  I hadn’t really come across anything I couldn’t do in GeneSpring 11 that I could do in GeneSpring 7.  That was until yesterday.

I sat down with a customer yesterday to look at some microbial Nimblegen data.  GeneSpring doesn’t really deal with Nimblegen data very well: you are left with the choice of not analysing it in GeneSpring, or accepting there’s going to be a bit of fudging and some extra annotation steps in order to make the data usable as a ‘Custom technology’.  The customer, quite reasonably, asked if we could get the biological genome information (effectively gene annotations that are independent of the chip technology you’re using) loaded into GeneSpring.  And thus started a morning of fun and games.

GeneSpring 11 has a very handy import feature for biological genomes under Annotations>Create Biological Genome.  That is, provided you want to choose one of their predefined organisms and download the information from NCBI.  There is *NO* route in the software to add another organism to this list, or to do anything other than use one of their checkbox-limited organisms.  This is not a bug apparently, because in a separate part of the software (dealing with Pathways for an organism) you can pull this information directly from NCBI using the Taxon ID of the organism you’re interested in.  So why can’t you use it to download a biological genome?  Who knows…

One of the things I really liked about the old GeneSpring was the fact that it came with a manual a foot thick.  It told you how to do every single operation in the UI; it didn’t tell you anything about the order in which to apply them, but you could generally rely on it for an answer.  There was no such answer to this issue in the GeneSpring manual.

It transpires that if you really want to do this, the following, slightly insane process needs to take place:

1) Take this snippet of XMLishness:

<hexff version="1.0">

 <key>type</key>
 <string>plugin.product.TaxID</string>
 <key>id</key>
 <string>TaxID</string>
 <key>data</key>
 
 
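 <key>Homo sapiens</key><string>9606</string>
 <key>Mus musculus</key><string>10090</string>
 <key>Rattus norvegicus</key><string>10116</string>
 <key>Anopheles gambiae</key><string>7165</string>
 <key>Arabidopsis thaliana</key><string>3702</string>
 <key>Bacillus subtilis</key><string>1423</string>
 <key>Bos taurus</key><string>9913</string>
 <key>Caenorhabditis elegans</key><string>6239</string>
 <key>Canis lupus familiaris</key><string>9615</string>
 <key>Citrus sinensis</key><string>2711</string>
 <key>Danio rerio</key><string>7955</string>
 <key>Drosophila melanogaster</key><string>7227</string>
 <key>Equus caballus</key><string>9796</string>
 <key>Escherichia coli</key><string>562</string>
 <key>Felis catus</key><string>9685</string>
 <key>Gallus gallus</key><string>9031</string>
 <key>Glycine max</key><string>3847</string>
 <key>Gossypium hirsutum</key><string>3635</string>
 <key>Hordeum vulgare</key><string>4513</string>
 <key>Macaca mulatta</key><string>9544</string>
 <key>Magnaporthe grisea</key><string>148305</string>
 <key>Medicago sativa</key><string>3879</string>
 <key>Medicago truncatula</key><string>3880</string>
 <key>Nicotiana tabacum</key><string>4097</string>
 <key>Oryctolagus cuniculus</key><string>9986</string>
 <key>Oryza sativa</key><string>4530</string>
 <key>Ovis aries</key><string>9940</string>
 <key>Pan troglodytes</key><string>9598</string>
 <key>Plasmodium falciparum</key><string>5833</string>
 <key>Pongo abelii</key><string>9601</string>
 <key>Poplar mosaic virus</key><string>12166</string>
 <key>Populus sp.</key><string>3697</string>
 <key>Pseudomonas aeruginosa</key><string>287</string>
 <key>Saccharomyces cerevisiae</key><string>4932</string>
 <key>Saccharum officinarum</key><string>4547</string>
 <key>Salmo salar</key><string>8030</string>
 <key>Schizosaccharomyces pombe</key><string>4896</string>
 <key>Staphylococcus aureus</key><string>1280</string>
 <key>Sus scrofa</key><string>9823</string>
 <key>Takifugu rubripes</key><string>31033</string>
 <key>Lycopersicon esculentum</key><string>4081</string>
 <key>Triticum aestivum</key><string>4565</string>
 <key>Vitis vinifera</key><string>29760</string>
 <key>Xenopus laevis</key><string>8355</string>
 <key>Xenopus tropicalis</key><string>8364</string>
 <key>Zea mays</key><string>4577</string>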

 
 

</hexff>

2) Add an entry

<key>Your organism name</key><string>NCBI Taxon ID</string>

after the Zea mays line

3) In your GeneSpring directory under this tree:

GeneSpring  GX11binpackagesmarrayproject2.1

Create a folder called ‘plugins’ and save the edited XML above as a file called TaxID.plg

4) Restart GeneSpring and proceed to update your newly added Biological genome, which now appears in the list!
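If you would rather not hand-edit the file every time, steps 2 and 3 can be sketched in a few lines of Python. This is only a rough sketch, not anything Agilent supplies: the function names `add_organism` and `save_plugin` are made up for illustration, it assumes the organism entries are written as `<key>…</key><string>…</string>` pairs as in step 2, it assumes UTF-8 is an acceptable encoding for the file, and it treats the file as plain text rather than parsing the XML.

```python
from pathlib import Path
from xml.sax.saxutils import escape


def add_organism(plg_text, name, taxon_id, anchor="<key>Zea mays"):
    """Return plg_text with a new organism entry added after the anchor line.

    Hypothetical helper: the anchor defaults to the Zea mays entry, which is
    the last organism in the list quoted in step 1.
    """
    lines = plg_text.splitlines()
    for i, line in enumerate(lines):
        if anchor in line:
            # Reuse the anchor line's indentation so the new entry lines up.
            indent = line[: len(line) - len(line.lstrip())]
            entry = (
                f"{indent}<key>{escape(name)}</key>"
                f"<string>{int(taxon_id)}</string>"
            )
            lines.insert(i + 1, entry)
            return "\n".join(lines) + "\n"
    raise ValueError(f"no line containing {anchor!r} found")


def save_plugin(project_dir, plg_text):
    """Create a 'plugins' folder under project_dir and write TaxID.plg there.

    project_dir is whatever the directory from step 3 is on your machine;
    it is passed in rather than hard-coded here.
    """
    plugins = Path(project_dir) / "plugins"
    plugins.mkdir(parents=True, exist_ok=True)
    target = plugins / "TaxID.plg"
    target.write_text(plg_text, encoding="utf-8")  # encoding is an assumption
    return target
```

For example, `add_organism(snippet, "Mycobacterium tuberculosis", 1773)` returns the snippet with one extra line after the Zea mays entry, and `save_plugin(project_dir, edited)` writes it out as `plugins/TaxID.plg`. You still need to restart GeneSpring afterwards, as in step 4.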

Actually, I have to say, I’m not sure I ever want to see that in a manual of a piece of software as expensive as GeneSpring…  And besides, this still doesn’t work for me as advertised because GeneSpring, whilst aware of what an HTTP proxy might conceivably be, has no concept of what an FTP proxy might be – which is problematic when you need to connect to ftp.ncbi.nlm.nih.gov.  Brilliant!

PLoS Computational Biology : ‘Getting started in’

I notice that PLoS Computational Biology is publishing a series of ‘Getting Started in’ articles for bioinformatics/computational biology.

“The aim of each article in the “Getting Started in…” series is to introduce the essentials: define the area and what it is about, highlight the debates and issues of relevance, and provide directions to the most relevant books, articles, or Web sites to find out more. The series will not include review articles or detailed tutorials; these are available in the Education section of the Journal. Rather, each “Getting Started in…” article will aim to be a cache of “go to” information for someone for whom the field is completely new.”

The first one is a neat little excursion into tiling array analysis for platforms such as Nimblegen.  It is relatively brief, but supported by excellent references which people new to the field will find extremely useful.