Bioinformatics Asked by Pasted on May 14, 2021
I currently find Harvard's RESTful API for ExAC extremely useful, and I was hoping that a similar resource is available for gnomAD.
Does anyone know of a public-access API for gnomAD, or of any plans to integrate gnomAD into the Harvard API?
As far as I know, no, but the vcf.gz files are behind an HTTP server that supports byte-range requests, so you can use tabix or any related API:
$ tabix "https://storage.googleapis.com/gnomad-public/release-170228/vcf/exomes/gnomad.exomes.r2.0.1.sites.vcf.gz" "22:17265182-17265182"
22 17265182 . A T 762.04 PASS AC=1;AF=4.78057e-06;AN=209180;BaseQRankSum=-4.59400e+00;ClippingRankSum=2.18000e+00;DP=4906893;FS=1.00270e+01;InbreedingCoeff=4.40000e-03;MQ=3.15200e+01;MQRankSum=1.40000e+00;QD=1.31400e+01;ReadPosRankSum=2.23000e-01;SOR=9.90000e-02;VQSLOD=-5.12800e+00;VQSR_culprit=MQ;GQ_HIST_ALT=0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|1;DP_HIST_ALT=0|0|0|0|0|0|0|0|0|0|0|1|0|0|0|0|0|0|0|0;AB_HIST_ALT=0|0|0|0|0|0|0|0|0|0|0|0|1|0|0|0|0|0|0|0;GQ_HIST_ALL=1591|589|120|301|650|589|1854|2745|1815|4297|5061|2921|10164|1008|6489|1560|7017|457|6143|52950;DP_HIST_ALL=2249|1418|6081|11707|16538|9514|28624|23829|7391|853|95|19|1|0|0|1|0|1|0|0;AB_HIST_ALL=0|0|0|0|0|0|0|0|0|0|0|0|1|0|0|0|0|0|0|0;AC_AFR=0;AC_AMR=0;AC_ASJ=0;AC_EAS=0;AC_FIN=1;AC_NFE=0;AC_OTH=0;AC_SAS=0;AC_Male=1;AC_Female=0;AN_AFR=11994;AN_AMR=31324;AN_ASJ=7806;AN_EAS=13112;AN_FIN=20076;AN_NFE=94516;AN_OTH=4656;AN_SAS=25696;AN_Male=114366;AN_Female=94814;AF_AFR=0.00000e+00;AF_AMR=0.00000e+00;AF_ASJ=0.00000e+00;AF_EAS=0.00000e+00;AF_FIN=4.98107e-05;AF_NFE=0.00000e+00;AF_OTH=0.00000e+00;AF_SAS=0.00000e+00;AF_Male=8.74386e-06;AF_Female=0.00000e+00;GC_AFR=5997,0,0;GC_AMR=15662,0,0;GC_ASJ=3903,0,0;GC_EAS=6556,0,0;GC_FIN=10037,1,0;GC_NFE=47258,0,0;GC_OTH=2328,0,0;GC_SAS=12848,0,0;GC_Male=57182,1,0;GC_Female=47407,0,0;AC_raw=1;AN_raw=216642;AF_raw=4.61591e-06;GC_raw=108320,1,0;GC=104589,1,0;Hom_AFR=0;Hom_AMR=0;Hom_ASJ=0;Hom_EAS=0;Hom_FIN=0;Hom_NFE=0;Hom_OTH=0;Hom_SAS=0;Hom_Male=0;Hom_Female=0;Hom_raw=0;Hom=0;POPMAX=FIN;AC_POPMAX=1;AN_POPMAX=20076;AF_POPMAX=4.98107e-05;DP_MEDIAN=58;DREF_MEDIAN=5.01187e-84;GQ_MEDIAN=99;AB_MEDIAN=6.03448e-01;AS_RF=9.18451e-01;AS_FilterStatus=PASS;CSQ=T|missense_variant|MODERATE|XKR3|ENSG00000172967|Transcript|ENST00000331428|protein_coding|4/4||ENST00000331428.5:c.707T>A|ENSP00000331704.5:p.Phe236Tyr|810|707|236|F/Y|tTc/tAc||1||-1||SNV|1|HGNC|28778|YES|||CCDS42975.1|ENSP00000331704|Q5GH77||UPI000013EFAE||deleterious(0)|benign(0.055)|hmmpanther:PTHR14297&hmmpanther:PTHR14297:SF7&Pfam_domain:PF09815||||||||||||||||||||||||||||||,T|regulatory_region_variant|MODIFIER|||RegulatoryFeature|ENSR00000672806|TF_binding_site|||||||||||1||||SNV|1||||||||||||||||||||||||||||||||||||||||||||,T|regulatory_region_variant|MODIFIER|||RegulatoryFeature|ENSR00001729562|CTCF_binding_site|||||||||||1||||SNV|1||||||||||||||||||||||||||||||||||||||||||||
UPDATE (2019): the current gnomAD server doesn't support byte-range requests.
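Once you have a record like the one above (from tabix or from a downloaded VCF), the per-population counts sit in the semicolon-separated INFO column. Here is a minimal Python sketch of pulling them out. The helper names `parse_info` and `population_frequencies` are made up for illustration, the INFO string is a shortened excerpt of the record above, and the code assumes a single-ALT record (multi-allelic sites carry comma-separated counts, which this does not handle).

```python
def parse_info(info):
    """Parse a VCF INFO string into a dict; flags without '=' map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        # Split on the first '=' only, so values that contain '=' stay intact.
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def population_frequencies(fields, populations):
    """Compute AC/AN per population; None when no alleles were called."""
    freqs = {}
    for pop in populations:
        ac = int(fields.get("AC_" + pop, 0))
        an = int(fields.get("AN_" + pop, 0))
        freqs[pop] = ac / an if an else None
    return freqs


# Shortened excerpt of the INFO column from the tabix output above.
info = "AC=1;AN=209180;AC_FIN=1;AN_FIN=20076;AC_NFE=0;AN_NFE=94516;POPMAX=FIN"
fields = parse_info(info)
freqs = population_frequencies(fields, ["FIN", "NFE"])
print(fields["POPMAX"], freqs)
```

The FIN value computed this way (1/20076) agrees with the AF_FIN=4.98107e-05 reported in the record itself, which is a quick sanity check on the parsing.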
Correct answer by Pierre on May 14, 2021
You can browse gnomAD variants with the ClinGen Allele Registry (an API specification is available).
Answered by user1690 on May 14, 2021
The new gnomAD site (as of August 2019) says there is no API yet:
How do I query a batch of variants? Do you have an API?
We currently do not have a way to submit batch queries on the browser, but we are actively working on developing an API for ExAC/gnomAD. If you would like to learn about GraphQL, which we will use to work with the API, an overview can be found at https://graphql.org. You can also obtain information on all variants from the VCFs and Hail Tables available on our downloads page.
But the web interface itself already makes POST requests to https://gnomad.broadinstitute.org/api to send and receive JSON/GraphQL, so you can make those same queries programmatically right now, even if it's not officially a public API.
For example, a Python query for basic info on the variants of a particular gene returns simple nested Python objects to work with:
{ 'consequence': 'intron_variant',
  'pos': 6928442,
  'rsid': 'rs782435448',
  'variant_id': '12-6928442-C-A'},
{ 'consequence': 'splice_region_variant',
  'pos': 6928462,
  'rsid': None,
  'variant_id': '12-6928462-C-A'},
{ 'consequence': 'splice_acceptor_variant',
  'pos': 6928464,
  'rsid': 'rs782577109',
  'variant_id': '12-6928464-G-A'},
{ 'consequence': 'missense_variant',
  'pos': 6928466,
  'rsid': 'rs782208003',
  'variant_id': '12-6928466-C-T'},
(I found it useful to go this route because the full metadata visible in the gnomAD web interface is then available, including per-variant details like allele counts by population. I couldn't find this information in the other APIs described here.)
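To make the "allele counts by population" part concrete, here is a rough Python sketch of what a per-variant request body and the handling of its response might look like. Everything schema-related is an assumption on my part and not confirmed by the gnomAD documentation quoted above: the field names (`variant`, `exome`, `populations`, `id`, `ac`, `an`) and the type names (`String!`, `DatasetId!`) are guesses. The helper names are hypothetical, and the response is a hand-written placeholder with made-up counts, so check the real schema before relying on any of it.

```python
# Assumed query shape: field and type names are guesses at the schema.
VARIANT_QUERY = """
query ($variantId: String!, $dataset: DatasetId!) {
  variant(variantId: $variantId, dataset: $dataset) {
    variantId
    exome {
      ac
      an
      populations { id ac an }
    }
  }
}
"""


def build_variant_request(variant_id, dataset="gnomad_r2_1"):
    """Return the JSON-serializable body for one variant lookup."""
    return {
        "query": VARIANT_QUERY,
        "variables": {"variantId": variant_id, "dataset": dataset},
    }


def frequencies_from_response(response):
    """Map population id -> allele frequency (None when AN is zero)."""
    exome = response["data"]["variant"]["exome"]
    return {
        pop["id"]: (pop["ac"] / pop["an"] if pop["an"] else None)
        for pop in exome["populations"]
    }


# The variant ID is taken from the example output above.
request_body = build_variant_request("12-6928466-C-T")

# Hand-written placeholder response; the counts are invented to show the shape.
mock_response = {
    "data": {
        "variant": {
            "variantId": "12-6928466-C-T",
            "exome": {
                "ac": 1,
                "an": 1000,
                "populations": [
                    {"id": "POP_A", "ac": 1, "an": 1000},
                    {"id": "POP_B", "ac": 0, "an": 0},
                ],
            },
        }
    }
}
freqs = frequencies_from_response(mock_response)
print(freqs)
```

Passing the variant ID through GraphQL variables rather than pasting it into the query text avoids quoting problems. The body returned by `build_variant_request` is what you would POST as JSON to the API endpoint.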
Answered by Jesse on May 14, 2021
I faced the same issue recently and found these links and a Python script:
gnomAD GraphQL API: https://gnomad.broadinstitute.org/api. It works great, but GraphQL is a rather different query language. Please check the docs here: https://graphql.org/learn/queries/
gnomAD Python API: https://github.com/furkanmtorun/gnomad_python_api
Answered by John t_eckerd on May 14, 2021
I found Jesse's code quite useful! For those who try to reproduce it, you should now add the reference genome ID, like this:
#!/usr/bin/env python
import pprint

import requests

prettyprint = pprint.PrettyPrinter(indent=2).pprint


def fetch(jsondata, url="https://gnomad.broadinstitute.org/api"):
    # The server gives a generic error message if the content type isn't
    # explicitly set
    headers = {"Content-Type": "application/json"}
    response = requests.post(url, json=jsondata, headers=headers)
    result = response.json()
    if "errors" in result:
        raise Exception(str(result["errors"]))
    return result


def get_variant_list(gene_id, dataset="gnomad_r2_1"):
    # Note that this is GraphQL, not JSON.
    # reference_genome is the newly required argument. It presumably has to
    # match the build of the chosen dataset; gnomad_r2_1 was called against
    # GRCh37, so GRCh37 may be the value needed for that dataset.
    fmt_graphql = """
    {
        gene(gene_id: "%s", reference_genome: GRCh38) {
          variants(dataset: %s) {
            consequence
            pos
            rsid
            variant_id: variantId
          }
        }
    }
    """
    # This part will be JSON encoded, but with the GraphQL part left as a
    # glob of text.
    req_variantlist = {"query": fmt_graphql % (gene_id, dataset)}
    response = fetch(req_variantlist)
    return response["data"]["gene"]["variants"]


prettyprint(get_variant_list("ENSG00000010610"))
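Since get_variant_list returns a plain list of dicts, the usual Python tools apply directly. A small sketch of summarizing such a list, using the four records from Jesse's example output as sample data so it runs without a network call (the helper names `count_by_consequence` and `with_rsid` are made up for illustration):

```python
from collections import Counter

# Sample records copied from the example output earlier in this thread.
variants = [
    {"consequence": "intron_variant", "pos": 6928442,
     "rsid": "rs782435448", "variant_id": "12-6928442-C-A"},
    {"consequence": "splice_region_variant", "pos": 6928462,
     "rsid": None, "variant_id": "12-6928462-C-A"},
    {"consequence": "splice_acceptor_variant", "pos": 6928464,
     "rsid": "rs782577109", "variant_id": "12-6928464-G-A"},
    {"consequence": "missense_variant", "pos": 6928466,
     "rsid": "rs782208003", "variant_id": "12-6928466-C-T"},
]


def count_by_consequence(variants):
    """Tally how many variants fall under each consequence term."""
    return Counter(v["consequence"] for v in variants)


def with_rsid(variants):
    """Keep only variants that have an rsID (rsid is None otherwise)."""
    return [v for v in variants if v["rsid"] is not None]


counts = count_by_consequence(variants)
known = with_rsid(variants)
print(counts.most_common())
print([v["variant_id"] for v in known])
```

In real use you would replace the hard-coded `variants` list with the return value of get_variant_list.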
Answered by BretSnoop on May 14, 2021