Stack Overflow Asked by Henry Joseph on January 9, 2021
That is the URL I am trying to get using Beautiful Soup, but instead I get the one shown in the other (dark) code block.
The code is:
import time

import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

def get_dynamic(url):
    # Load the page in a real Chrome session and return the rendered HTML
    driver = webdriver.Chrome('/Users/hnry_jsph/Downloads/chromedriver')
    driver.get(url)
    time.sleep(5)
    html = driver.page_source
    return html

# Category page: take the first book card and follow its link
url = 'https://b-ok.africa/Web-design-cat70'
source = requests.get(url).text
soup = BeautifulSoup(source, 'lxml')
card = soup.find('table', attrs={'class': 'resItemTable'})
link = card.table.tr.td.h3.a['href']
url_link = f'https://b-ok.africa/{link}'
# print(url_link)

# Book page fetched with requests: cover, title, author, description
source_2 = requests.get(url_link).text
soup_2 = BeautifulSoup(source_2, 'lxml')
image = soup_2.find('div', class_='details-book-cover-container')
image_url = image.a['href']
title = soup_2.find('div', class_='col-sm-9')
title_text = title.h1.text
author = title.i.a.text
description = title.div.p.text

# Same book page fetched again through Selenium to read the download button
html_for_download = get_dynamic(url_link)
soup_3 = BeautifulSoup(html_for_download, 'lxml')
download_link = soup_3.find('div', class_='book-details-button')
print(download_link)
When I run that code, the download link I get does not match the one I see when I inspect the page in the browser.
This is what I get when I run the code:
<a class="btn btn-primary dlButton addDownloadedBook" data-book_id="1304587" href="/dl/1304587/d67ba1" rel="nofollow" target="">
And this is what I get from the browser when I inspect the page:
<a class="btn btn-primary dlButton addDownloadedBook" data-book_id="1304587" href="/dl/1304587/cb6983" rel="nofollow" target="">
When I use the link from the browser, the file downloads,
but when I use the link from the scraper code, it gives me
https://b-ok.africa/book/1304587/768897/?wrongHash
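To narrow down what actually differs between the two tags, a small comparison like the sketch below may help. It uses only the standard library and the two tags quoted above; `HrefCollector` and `extract_href` are hypothetical helper names made up for this illustration. It shows that the book id is identical and only the last path segment changes. The `?wrongHash` redirect hints that this segment might be a token tied to the session that loaded the page, but that is an assumption, not something the site confirms.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Hypothetical helper: collect the href of every <a> tag fed in."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)


def extract_href(anchor_html):
    """Return the first href found in an HTML fragment, or None."""
    parser = HrefCollector()
    parser.feed(anchor_html)
    return parser.hrefs[0] if parser.hrefs else None


# The two tags quoted in the question.
scraped_tag = (
    '<a class="btn btn-primary dlButton addDownloadedBook" '
    'data-book_id="1304587" href="/dl/1304587/d67ba1" rel="nofollow" target="">'
)
browser_tag = (
    '<a class="btn btn-primary dlButton addDownloadedBook" '
    'data-book_id="1304587" href="/dl/1304587/cb6983" rel="nofollow" target="">'
)

scraped_parts = extract_href(scraped_tag).strip("/").split("/")
browser_parts = extract_href(browser_tag).strip("/").split("/")

# Pair up the path segments and keep only the positions that differ.
differing = [
    (index, scraped, browser)
    for index, (scraped, browser) in enumerate(zip(scraped_parts, browser_parts))
    if scraped != browser
]
print(differing)  # [(2, 'd67ba1', 'cb6983')]
```

If the last segment really is session-bound (again, an assumption), then a link read in one session would not be expected to work when opened from a different one, which would be consistent with the redirect shown above.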