Wednesday, August 3, 2011

A Bird's-Eye View of the NCBI GEO Database

The NCBI GEO database is the world’s largest public online repository for transcriptome datasets. It includes transcriptome data from several types of experiments – arrays, next-gen sequencing, MPSS, SAGE, RT-PCR, etc. – although the major share of the data comes from array measurements. For short-read (NGS) datasets, some researchers prefer to use the NCBI SRA database as a repository. The SRA database includes both transcriptomic and genomic sequences, and we will cover its transcriptomic component in a later post.

Most GEO users download and analyze only one or two measurement sets related to their own research. Here we plan to look at the entire collection of measurements stored in GEO. This post is introductory, but over the next few days we will present various charts of GEO data to show trends in transcriptomics.

For non-users, let me first explain the structure of GEO. If you go to the GEO website, you will notice the following stats in the top right-hand corner. They show the current contents of the GEO database.

