Wednesday, February 25, 2015

Gene Networks and Gene Clustering : Recent Advancements

Human genetic clustering applies mathematical cluster analysis to the degree of genetic similarity between individuals and groups, in order to infer population structure and assign individuals to groups that often correspond with their self-identified geographical ancestry. A gene network, as used in this post, is the complex network relating the genes of people around the world.
Later in the post, we illustrate that similar techniques can be used to analyze human genetic disease networks, which are of great practical use.

Current state of the art:

A gene network is an example of a small-world network. So I will first discuss what exactly a small-world network is, and then how thinking in terms of networks leads to some much-needed applications.

"A small world network is a class of random graphs where most nodes are not neighbors of one another, but most nodes can be reached from every other by a small number of hops or steps (Perhaps the best known example of this is the 'six degrees of separation' feature discovered by Stanley Milgram in 1967.) . A small world network, where nodes represent people and edge connect people that know each other, captures the small world phenomenon of strangers being linked by a mutual acquaintance. The social network, the connectivity of the Internet, and gene networks all exhibit small-world network characteristics. The small world phenomenon (also known as the small world effect) is the hypothesis that everyone in the world can be reached through a short chain of social acquaintances. The small world problem asks for the probability that two people picked at random have at least one acquaintance in common."


                   Six degrees of separation.
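To make the "small number of hops" idea concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: it starts from a ring where each node knows its nearest neighbours and adds a few random long-range links (a simplified, shortcut-adding variant of the Watts-Strogatz model, not anything specific to gene data). The sizes 200, 4 and 20 are arbitrary choices.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Each node is linked to its k nearest neighbours on a ring (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def add_shortcuts(adj, count, rng):
    """Return a copy of the graph with `count` random long-range edges added."""
    new = {node: set(nbrs) for node, nbrs in adj.items()}
    nodes = list(new)
    added = 0
    while added < count:
        a, b = rng.sample(nodes, 2)
        if b not in new[a]:
            new[a].add(b)
            new[b].add(a)
            added += 1
    return new

def average_hops(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

rng = random.Random(0)  # fixed seed so the run is repeatable
lattice = ring_lattice(200, 4)
small_world = add_shortcuts(lattice, 20, rng)
print("ring lattice:   ", round(average_hops(lattice), 2))
print("with shortcuts: ", round(average_hops(small_world), 2))
```

Most nodes are still not neighbours of one another after the shortcuts are added, yet the average number of hops between them drops, which is the small-world effect described above.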

Now let's discuss a few ideas on how to construct gene networks.

Taking people as nodes, the degree of similarity between their genes determines whether an edge is drawn between two nodes.
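A rough sketch of this construction, assuming each person is represented by a short vector of allele counts (0, 1 or 2 at each site) and that "similarity" simply means the fraction of sites that match. The people, the genotype values and the 0.75 threshold are all made up for illustration; real studies use far more sites and more careful similarity measures.

```python
from itertools import combinations

# Toy genotypes: 0, 1 or 2 copies of a variant allele at each of 8 sites.
genotypes = {
    "P1": [0, 1, 2, 0, 0, 1, 2, 2],
    "P2": [0, 1, 2, 0, 1, 1, 2, 2],
    "P3": [2, 0, 0, 1, 2, 0, 0, 1],
    "P4": [2, 0, 0, 1, 2, 0, 1, 1],
}

def similarity(a, b):
    """Fraction of sites at which two genotype vectors agree."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)

def build_network(genotypes, threshold):
    """Link two people when their similarity reaches the threshold."""
    edges = {}
    for p, q in combinations(sorted(genotypes), 2):
        score = similarity(genotypes[p], genotypes[q])
        if score >= threshold:
            edges[(p, q)] = score
    return edges

network = build_network(genotypes, threshold=0.75)
for (p, q), score in network.items():
    print(p, q, score)
```

Lowering the threshold adds more edges and makes the network denser, so the choice of threshold directly shapes whatever structure is found later.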

It has been found that people cluster together by community and region, which lets us analyze relationships between communities and estimate the origin of a particular community from them. We can also look for traits that are common within a cluster; for example, the average height in some regions is greater than in others. Analyzing these patterns and relationships can reveal some interesting results.
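As a small illustration of finding clusters and comparing a trait across them, the sketch below groups people by connected components of a similarity network and then averages height per group. Connected components are used here only as a simple stand-in for a proper clustering method, and the edges and heights are hypothetical.

```python
from collections import defaultdict

# Hypothetical similarity edges and heights in centimetres.
edges = [("A", "B"), ("B", "C"), ("D", "E")]
height_cm = {"A": 170, "B": 174, "C": 172, "D": 160, "E": 164}

def find_clusters(nodes, edges):
    """Group nodes into connected components using union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the root, shortening the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = defaultdict(list)
    for n in nodes:
        groups[find(n)].append(n)
    return sorted(sorted(group) for group in groups.values())

clusters = find_clusters(height_cm, edges)
mean_height = {
    tuple(cluster): sum(height_cm[n] for n in cluster) / len(cluster)
    for cluster in clusters
}
for cluster, mean in mean_height.items():
    print(cluster, mean)
```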

Recent Advances have made Gene Clustering More Practical:

1) As of 2014, more than 200,000 people had already had their genomes sequenced. Illumina, a San Diego-based leader in this field, says that this year 228,000 people will have their DNA sequenced. More sequencing means more data in which to recognize patterns.

2) One argument for quick action is that the amount of genome data is exploding. The largest labs can now sequence human genomes to a high polish at the pace of two per hour. (The first genome took about 13 years.)

3) The cost of genome sequencing has fallen from the order of a billion dollars to a few thousand dollars; see the graph below:

A few more implications and proposed applications of gene networks:

Suppose a friend of mine, Rahul, is suffering from a genetic disorder without a name. Doctors have tried to solve the case but are finding it difficult to cure him. There might be people around the world who have suffered from a similar disorder, but how do we find them? Since the disorder is genetic, we can hope to find related genetic data. In this case, clustering and gene networks might help to find similar patients. Once similar patients are found, doctors can look at their histories: how they were treated, whether they were cured, how long they lived on average, and so on. Recent studies have shown that the Human Disease Network and the Disease Gene Network, shown below, are useful (I found them in Nature magazine).
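The patient-matching idea can be sketched as a nearest-neighbour lookup. This is only a toy under stated assumptions: each patient is reduced to a set of variant labels, similarity is the Jaccard index (shared variants divided by all variants seen in either patient), and the patient IDs and variant labels such as "v1" are invented placeholders, not real genetic data.

```python
def jaccard(a, b):
    """Shared variants divided by all variants seen in either patient."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def most_similar(query, database, top_n=2):
    """Return the top_n patients ranked by similarity to the query."""
    scored = [(jaccard(query, variants), pid) for pid, variants in database.items()]
    # Highest score first; ties are broken by patient ID for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(pid, round(score, 2)) for score, pid in scored[:top_n]]

# Hypothetical database of previously seen patients.
database = {
    "patient_1": {"v1", "v2", "v3", "v4"},
    "patient_2": {"v2", "v3", "v9"},
    "patient_3": {"v7", "v8"},
}
rahul = {"v1", "v2", "v3"}

print(most_similar(rahul, database))
```

The patients at the top of the ranking are the ones whose medical histories a doctor might want to look at first.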

    Human disease network (HDN).

In the HDN, each node corresponds to a distinct disorder, colored based on the disorder class to which it belongs, the name of the 22 disorder classes being shown on the right. A link between disorders in the same disorder class is colored with the corresponding dimmer color and links connecting different disorder classes are gray. The size of each node is proportional to the number of genes participating in the corresponding disorder (see key), and the link thickness is proportional to the number of genes shared by the disorders it connects. We indicate the name of disorders with 10 associated genes.

Disease gene network (DGN).

In the DGN, each node is a gene, with two genes being connected if they are implicated in the same disorder. The size of each node is proportional to the number of disorders in which the gene is implicated. Nodes representing genes with links to multiple classes are colored dark grey, whereas unclassified genes are colored light grey.  Only nodes with at least one link are shown.
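Both networks can be derived from a single table of which genes are associated with which disorder. The sketch below shows that derivation on placeholder data; the disorder and gene names are hypothetical, and the function names are my own rather than anything from the original study.

```python
from itertools import combinations

# Hypothetical disorder -> associated genes table.
disorder_genes = {
    "disorder_A": {"gene_1", "gene_2"},
    "disorder_B": {"gene_2", "gene_3"},
    "disorder_C": {"gene_4"},
}

def human_disease_network(disorder_genes):
    """Link two disorders if they share a gene; weight = number of shared genes."""
    links = {}
    for d1, d2 in combinations(sorted(disorder_genes), 2):
        shared = disorder_genes[d1] & disorder_genes[d2]
        if shared:
            links[(d1, d2)] = len(shared)
    return links

def disease_gene_network(disorder_genes):
    """Link two genes if they are implicated in the same disorder."""
    links = set()
    for genes in disorder_genes.values():
        links.update(combinations(sorted(genes), 2))
    return links

hdn = human_disease_network(disorder_genes)
dgn = disease_gene_network(disorder_genes)

# HDN node size: number of genes participating in each disorder.
node_size = {d: len(genes) for d, genes in disorder_genes.items()}

print(hdn)
print(sorted(dgn))
print(node_size)
```

Here the link weight in `hdn` plays the role of link thickness, and `node_size` the role of node size, as described in the HDN caption above.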
