Bioinformatics Asked by i_dont_know on July 10, 2021
I am trying to start a side project to verify the claim that humans and mice (or monkeys) share 99% of their genes. Along the way, I hope to learn the basics of genomic analysis.
I would like guidance on where I can find the sequences I need, and on which programming languages or applications I should use. For the sequences, I would like, for example, a list of all genes for each species. If the programming language or application is unfamiliar to me, some basic guidance would be helpful.
I have programming experience with MATLAB and Python, and I learnt R previously but have since forgotten it. I would be able to learn a new language.
Alternatively, I realise this may be a very large subject. If there is a link to a tutorial or a set of lectures, that would also be helpful.
Thanks in advance.
For downloading lists of genes, together with associated features, I like using Ensembl Biomart:
http://ensembl.org/biomart/martview/
In this case, you can "CHOOSE DATABASE -> Ensembl Genes", then "CHOOSE DATASET -> Human Genes" or "CHOOSE DATASET -> Chimpanzee genes" to get to a table selection.
Clicking on "Attributes" in the left-hand sidebar brings you to a selector for the different fields that can be added to the table.
Clicking on the "Results" button at the top left brings you to a form for choosing how results should be output.
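Since you mentioned Python: once a table has been exported from the Results page as a tab-separated file, it can be read with the standard library alone. This is only a minimal sketch. The column headers ("Gene stable ID", "Gene name", "Gene type") are assumptions about which attributes were selected, and the inline rows are illustrative stand-ins for a real download.

```python
import csv
import io


def read_biomart_table(handle):
    """Read a tab-separated export into a list of dicts keyed by column header."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Illustrative stand-in for a downloaded file; with a real export you would use
# open("mart_export.txt", newline="") instead (the file name is an assumption).
example_tsv = (
    "Gene stable ID\tGene name\tGene type\n"
    "ENSG00000139618\tBRCA2\tprotein_coding\n"
    "ENSG00000141510\tTP53\tprotein_coding\n"
    "ENSG00000229807\tXIST\tlncRNA\n"
)

genes = read_biomart_table(io.StringIO(example_tsv))

# Keep only the protein-coding genes, assuming a "Gene type" column was exported.
protein_coding = [g["Gene name"] for g in genes if g["Gene type"] == "protein_coding"]

print(len(genes), "genes read;", len(protein_coding), "protein coding")
```

The same function works for the table of any species, as long as the headers used in the filter match the attributes chosen in BioMart.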
Bear in mind that gene names are not likely to match between species, so simply intersecting the two name lists will not work. Finding the intersection of gene sets between species is a complicated process, usually requiring knowledge about gene copies, paralogs, orthologs, and sequence comparison.
It's also likely that the human genes are more annotated than the chimpanzee genes, and there might be genes present in the human list that are also in the chimpanzee genome, but have not yet been discovered.
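As a rough illustration of why orthologs matter more than names, here is a sketch that counts how many genes in one species have at least one annotated ortholog in the other. It assumes a table with one row per gene/ortholog pair; the column names ("Chimpanzee gene stable ID", "Chimpanzee homology type"), the "ortholog_one2one" label, and all IDs in the example rows are assumptions or placeholders, not real data.

```python
from collections import defaultdict


def ortholog_summary(rows,
                     id_col="Gene stable ID",
                     ortholog_col="Chimpanzee gene stable ID",
                     type_col="Chimpanzee homology type"):
    """Count genes with any ortholog and with a one-to-one ortholog.

    `rows` is an iterable of dicts, one per (gene, ortholog) pair; genes
    without an ortholog appear once with an empty ortholog column.
    """
    orthologs = defaultdict(set)
    for row in rows:
        gene = row[id_col]
        orthologs[gene]  # register the gene even if it has no ortholog
        target = row.get(ortholog_col) or ""
        if target:
            orthologs[gene].add((target, row.get(type_col) or ""))

    total = len(orthologs)
    with_any = sum(1 for hits in orthologs.values() if hits)
    one_to_one = sum(
        1 for hits in orthologs.values()
        if any(kind == "ortholog_one2one" for _, kind in hits)
    )
    return {
        "total": total,
        "with_any": with_any,
        "one_to_one": one_to_one,
        "fraction_with_any": with_any / total if total else 0.0,
    }


# Placeholder rows (hypothetical IDs) showing the expected table shape.
example_rows = [
    {"Gene stable ID": "HUMAN_A", "Chimpanzee gene stable ID": "CHIMP_1",
     "Chimpanzee homology type": "ortholog_one2one"},
    {"Gene stable ID": "HUMAN_B", "Chimpanzee gene stable ID": "CHIMP_2",
     "Chimpanzee homology type": "ortholog_one2many"},
    {"Gene stable ID": "HUMAN_B", "Chimpanzee gene stable ID": "CHIMP_3",
     "Chimpanzee homology type": "ortholog_one2many"},
    {"Gene stable ID": "HUMAN_C", "Chimpanzee gene stable ID": "",
     "Chimpanzee homology type": ""},
    {"Gene stable ID": "HUMAN_D", "Chimpanzee gene stable ID": "CHIMP_4",
     "Chimpanzee homology type": "ortholog_one2one"},
]

summary = ortholog_summary(example_rows)
print(summary)
```

A number produced this way reflects the state of the annotation rather than the biology alone: a gene with no listed ortholog may simply not have been annotated yet in the other species.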
Correct answer by gringer on July 10, 2021