
Can't get the href from a div even though I select it by class

I'm trying to get the links to all products on this page: https://www.officeworks.com.au/shop/officeworks/c/technology/audio-speakers/voice-assistant-speakers

For example, for the Google Home Mini Chalk I should get https://www.officeworks.com.au/shop/officeworks/p/google-home-mini-chalk-sygminiwe

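However, I can't even reach the div class that wraps the href link. I tried several approaches, all using bs4. Here are the two snippets I was sure would work, but neither does: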

First snippet

from bs4 import BeautifulSoup
from urllib.request import Request, urlopen

url_products = []
url = "https://www.officeworks.com.au/shop/officeworks/c/technology/audio-speakers/voice-assistant-speakers"
req = Request(url)
html_page = urlopen(req)
soup = BeautifulSoup(html_page, "lxml")
data = soup.find_all('div', {'class': 'ProductTile__ProductImageWrapper-sc-1dlojg1-2 gRQAGx'})
for div in data:
    links = div.find_all('a')
    for a in links:
        print('https://www.officeworks.com.au/' + a['href'])
        url_products.append('https://www.officeworks.com.au/' + a['href'])

Second snippet

from bs4 import BeautifulSoup
import requests

r = requests.get('https://www.officeworks.com.au/shop/officeworks/c/technology/audio-speakers/voice-assistant-speakers')
soup = BeautifulSoup(r.content, 'lxml')
links = [item['href'] for item in soup.select('.gRQAGx > a')]

I believe I'm not targeting the right class, but I can't figure out which one it should be. Thanks in advance!

Question source: Stack Overflow

Asked by is大龙, 2020-03-25 00:18:41
1 answer
  • You can't get the expected output because the page content is loaded by JavaScript, so the product data is not in the HTML until the JS has been rendered.
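One way to confirm this before reaching for a heavier tool is to check whether the class being selected appears anywhere in the HTML the server actually returns. The sketch below uses only the standard library. `ClassCollector`, `collect_classes` and the sample markup are hypothetical, illustrative names: the markup is a stand-in for a JavaScript shell page, not the real Officeworks response.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every CSS class name that appears in a chunk of HTML."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def collect_classes(html):
    """Return the set of class names present in the static HTML."""
    parser = ClassCollector()
    parser.feed(html)
    parser.close()
    return parser.classes


# Hypothetical stand-in for a page whose product tiles are built by JavaScript:
# the server sends an empty mount point plus a script tag.
static_html = """
<html><body>
  <div id="root" class="app-shell"></div>
  <script src="/static/js/main.js"></script>
</body></html>
"""

classes = collect_classes(static_html)
print("gRQAGx" in classes)  # False
```

If the class is missing from the text returned by a plain `requests.get`, no BeautifulSoup selector will find it, however the selector is written.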

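    You could use Selenium, but I don't recommend it because it will slow your work down.

    Or use HTMLSession from requests_html to render the page dynamically.

    Otherwise, go straight to the source of the data that the JavaScript renders, which is an API.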

    It can be found by tracing the XHR requests in the Network tab of the browser's developer tools (Ctrl+Shift+E in Firefox).
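The `params` value captured from such a request is a URL-encoded query string, which is hard to read as pasted. A small sketch like the following, using the standard library's `urllib.parse`, decodes it into a dictionary and re-encodes it after a change. The shortened `captured` string is an excerpt of the payload used in this answer, and `describe_params` is a hypothetical helper name. Whether the API accepts a trimmed-down request is an assumption that has not been verified here.

```python
from urllib.parse import parse_qs, urlencode


def describe_params(params):
    """Decode a URL-encoded `params` string into a plain dict of single values."""
    decoded = parse_qs(params, keep_blank_values=True)
    # parse_qs returns a list per key; each key appears once in this excerpt.
    return {key: values[0] for key, values in decoded.items()}


captured = (
    "query=&hitsPerPage=24&page=0"
    "&filters=(categorySeoPaths%3A%22technology%2Faudio-speakers"
    "%2Fvoice-assistant-speakers%22)"
)

decoded = describe_params(captured)
print(decoded["filters"])
# (categorySeoPaths:"technology/audio-speakers/voice-assistant-speakers")

# Re-encode after changing a value, e.g. to ask for the second page.
decoded["page"] = "1"
print(urlencode(decoded))
```

Reading the decoded form makes it clear that the category is selected by the `filters` value, while `page` and `hitsPerPage` control paging.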

    So here we can call it directly:

    import requests
    
    # Payload copied from the browser's XHR request. It must stay on one line:
    # a line break inside the "params" string literal is a syntax error.
    json = {"requests": [{"indexName": "prod-product-wc-bestmatch-personal", "params": "query=&hitsPerPage=24&maxValuesPerFacet=10&page=0&highlightPreTag=%3Cais-highlight-0000000000%3E&highlightPostTag=%3C%2Fais-highlight-0000000000%3E&clickAnalytics=true&optionalFilters=%5B%5D&sumOrFiltersScores=true&filters=(categorySeoPaths%3A%22technology%2Faudio-speakers%2Fvoice-assistant-speakers%22)&facets=%5B%22rangedOnline%22%2C%22forestProductSchemeName%22%2C%22hardDriveType%22%2C%22bagStyle%22%2C%22socketType%22%2C%22fullSizeInnerDimensions%22%2C%22stapleSize%22%2C%22connectivity%22%2C%22smartHomeCompatibility%22%2C%22industryType%22%2C%22sizeCapacity%22%2C%22performancePrintResolution%22%2C%22handsetIncludedHandsets%22%2C%22usbFlashLidType%22%2C%22videoResolution%22%2C%22maximumPunchingCapacity%22%2C%22rangedRetail%22%2C%22protectionType%22%2C%22rulerLength%22%2C%22sizeNumber%22%2C%22deviceConnectivityTechnology%22%2C%22unitsOfMeasure%22%2C%22selfAdhesive%22%2C%22interfaceHardDrive%22%2C%22sharpenerSize%22%2C%22connectivityWifiBands%22%2C%22microphoneType%22%2C%22labellerKeyboardLayout%22%2C%22numberOfUsb30Ports%22%2C%22operatingSystemEdition%22%2C%22ringRingSize%22%2C%22performanceHealthMonitoringFunctions%22%2C%22connectivityTechnology%22%2C%22dualSimCompatible%22%2C%22audioSource%22%2C%22totalNumberOfLabels%22%2C%22brushShape%22%2C%22maxProcessorClockSpeed%22%2C%22operatingHand%22%2C%22powerBatteryTechnology%22%2C%22travelRegion%22%2C%22capacityBinder%22%2C%22licenceValidityPeriod%22%2C%22storageHardDriveCapacity%22%2C%22spineSize%22%2C%22rollLength%22%2C%22numberOfRings%22%2C%22lightBulbType%22%2C%22colour%22%2C%222SidedCopying%22%2C%22automaticDocumentFeederCapacity%22%2C%22automaticPaperFeed%22%2C%22performanceShredderCutType%22%2C%22performanceBrightness%22%2C%22displayResolution%22%2C%22labellingOfficeUseFacet%22%2C%22securityLevel%22%2C%22maxSupportedDocumentSize%22%2C%22bulkbuyOnline%22%2C%22staplingCapacity%22%2C%22storageIncludedFlashMemory%22%2C%22compatibabilityCustomFitAndroid%22%2C%22drawerNumberOfDrawers%22%2C%22storageInternalMemorySize%22%2C%22ramInstalledSize%22%2C%22100RecycledProduct%22%2C%22placementPlacingMounting%22%2C%22earPlacement%22%2C%22foldedDimensions%22%2C%22portsTotalNumberOfNetworkingPorts%22%2C%22powerBatteryChargeAmpHours%22%2C%22noiseCancelling%22%2C%22surfaceShape%22%2C%22labellingHomeUseFacet%22%2C%22sizeDescription%22%2C%22maxLoadWeight%22%2C%22numberOfPowerPorts%22%2C%22compatibabilityCustomFitApple%22%2C%22tsaApproved%22%2C%22chassisType%22%2C%22surgeSuppression%22%2C%22printingTechnologyPrinters%22%2C%22placementVesaMountCompatibility%22%2C%22boardSizeFacet%22%2C%22frameStyle%22%2C%22serviceProvider%22%2C%22bluetoothCompatibility%22%2C%22scannerType%22%2C%22photoCapacityQuantity%22%2C%22numberOfUsb20Ports%22%2C%22rulingType%22%2C%22learningSkillsFocus%22%2C%22licenceType%22%2C%22connectivityDisplayConnections%22%2C%22performanceMaxThickness%22%2C%22performanceResolution%22%2C%22paperWeightGsm%22%2C%22numberOfProcessorCores%22%2C%22fitsDevice%22%2C%22brushhairtype%22%2C%22opticalZoom%22%2C%22processorClockSpeed%22%2C%22labellingIndustrialUseFacet%22%2C%22performanceApproximateNumberOfImpressions%22%2C%222SidedPrinting%22%2C%22powerPowerType%22%2C%22interfaceType%22%2C%22printerConnectivityTechnology%22%2C%22numberOfReamsPerCarton%22%2C%22baseWheels%22%2C%22performanceEstimatedCartridgeYieldSheets%22%2C%22papersize%22%2C%22processorType%22%2C%22wallStrengthThickness%22%2C%22storageHardDriveCapacityComputingDevices%22%2C%22ciewhiteness%22%2C%22runTime%22%2C%22stampInking%22%2C%22switched%22%2C%22processorManufacturer%22%2C%22deviceCaseCompatibility%22%2C%22caseFeaturesNumberOfCompartments%22%2C%22displaySize%22%2C%222sidedScanning%22%2C%22glutenFree%22%2C%22restTime%22%2C%22operatingPlatformCompatibility%22%2C%22powerSource%22%2C%22touchScreen%22%2C%22displayPanelType%22%2C%22secondaryProcessorType%22%2C%22wastebinCapacityRange%22%2C%22softwareDistributionMedia%22%2C%22learningAgeRange%22%2C%22tapeWidth%22%2C%22storageStorageCapacity%22%2C%22cableLength%22%2C%22skillLevel%22%2C%22flightTime%22%2C%22energyRating%22%2C%22maximumRecommendedDailyUsage%22%2C%22contentLayout%22%2C%22deviceLocation%22%2C%22brand%22%2C%22numberOfUsb31Ports%22%2C%22lidIncluded%22%2C%22scannerScanResolution%22%2C%22portsNumberOfUsbChargePorts%22%2C%22envelopeSize%22%2C%22keyboardCompatibility%22%2C%22primaryCameraVideo%22%2C%22supportedMemoryCards%22%2C%22connectivityDisplayConnectionsPanels%22%2C%22up1Category%22%2C%22price%22%2C%22categorySeoPaths%22%2C%22rangedRetail%22%2C%22rangedOnline%22%2C%22price%22%2C%22brand%22%2C%22colour%22%2C%22audioSource%22%2C%22cableLength%22%2C%22up1Category%22%2C%22bulkbuyOnline%22%2C%22microphoneType%22%2C%22noiseCancelling%22%2C%22bluetoothCompatibility%22%2C%22powerBatteryTechnology%22%2C%22smartHomeCompatibility%22%5D&tagFilters=&facetFilters=%5B%5B%22categorySeoPaths%3Atechnology%2Faudio-speakers%2Fvoice-assistant-speakers%22%5D%5D"}, {"indexName": "prod-product-wc-bestmatch-personal", "params": "query=&hitsPerPage=1&maxValuesPerFacet=10&page=0&highlightPreTag=%3Cais-highlight-0000000000%3E&highlightPostTag=%3C%2Fais-highlight-0000000000%3E&clickAnalytics=false&optionalFilters=%5B%5D&sumOrFiltersScores=true&filters=(categorySeoPaths%3A%22technology%2Faudio-speakers%2Fvoice-assistant-speakers%22)&attributesToRetrieve=%5B%5D&attributesToHighlight=%5B%5D&attributesToSnippet=%5B%5D&tagFilters=&analytics=false&facets=categorySeoPaths"}]}
    # Algolia's multi-query endpoint is /1/indexes/*/queries (the pasted "\*queries" was an escaping artifact).
    r = requests.post("https://k535caawve-dsn.algolia.net/1/indexes/*/queries?x-algolia-agent=Algolia%20for%20JavaScript%20(3.35.1)%3B%20Browser%20(lite)%3B%20react-instantsearch%205.4.0%3B%20JS%20Helper%202.26.1&x-algolia-application-id=K535CAAWVE&x-algolia-api-key=8a831febe0110932cfa06ff0e2024b4f", json=json).json()
    
    for item in r['results'][0]['hits']:
        print("Name: {:<65}, Url: {}".format(
            item['name'], f"https://www.officeworks.com.au/shop/officeworks/p/{item['urlKeyword']}"))
    

    Output:

    Name: Google Home Mini Chalk                                           , Url: https://www.officeworks.com.au/shop/officeworks/p/google-home-mini-chalk-sygminiwe
    Name: Google Home Mini Charcoal                                        , Url: https://www.officeworks.com.au/shop/officeworks/p/google-home-mini-charcoal-sygminibk
    Name: Google Nest Hub Max Charcoal                                     , Url: https://www.officeworks.com.au/shop/officeworks/p/google-nest-hub-max-charcoal-sygnhmaxbk
    Name: Google Nest Hub Max Chalk                                        , Url: https://www.officeworks.com.au/shop/officeworks/p/google-nest-hub-max-chalk-sygnhmaxwe
    Name: Google Home                                                      , Url: https://www.officeworks.com.au/shop/officeworks/p/google-home-sygghome
    Name: Ultimate Ears Megablast Wireless Speaker with Alexa Graphite     , Url: https://www.officeworks.com.au/shop/officeworks/p/ultimate-ears-megablast-wireless-speaker-with-alexa-graphite-inmblastbk
    Name: Google Nest Mini 2nd Generation Charcoal                         , Url: https://www.officeworks.com.au/shop/officeworks/p/google-nest-mini-2nd-generation-charcoal-sygnmini2c
    Name: Google Nest Mini 2nd Generation Chalk                            , Url: https://www.officeworks.com.au/shop/officeworks/p/google-nest-mini-2nd-generation-chalk-sygnmini2w
    Name: Ultimate Ears Blast Wireless Speaker with Alexa Graphite         , Url: https://www.officeworks.com.au/shop/officeworks/p/ultimate-ears-blast-wireless-speaker-with-alexa-graphite-imblastbk
    Name: Amazon 5.5" Echo Show 5 Charcoal                                 , Url: https://www.officeworks.com.au/shop/officeworks/p/amazon-5-5-echo-show-5-charcoal-syecosh5cl
    Name: Amazon Echo 3rd Generation Charcoal                              , Url: https://www.officeworks.com.au/shop/officeworks/p/amazon-echo-3rd-generation-charcoal-syaedotclc
    Name: JBL Flip Essential Bluetooth Speaker Gun Metal                   , Url: https://www.officeworks.com.au/shop/officeworks/p/jbl-flip-essential-bluetooth-speaker-gun-metal-imjblfless
    Name: Ultimate Ears Megablast Wireless Speaker with Alexa Blue         , Url: https://www.officeworks.com.au/shop/officeworks/p/ultimate-ears-megablast-wireless-speaker-with-alexa-blue-inmblastbe
    Name: Amazon Echo Dot 3rd Gen With Clock Sandstone                     , Url: https://www.officeworks.com.au/shop/officeworks/p/amazon-echo-dot-3rd-gen-with-clock-sandstone-syaedotcls
    Name: Ultimate Ears Megablast Wireless Speaker with Alexa Merlot       , Url: https://www.officeworks.com.au/shop/officeworks/p/ultimate-ears-megablast-wireless-speaker-with-alexa-merlot-inmblastrd
    Name: Amazon Echo Dot 3rd Gen Heather Grey                             , Url: https://www.officeworks.com.au/shop/officeworks/p/amazon-echo-dot-3rd-gen-heather-grey-syamdot3ng
    Name: Lenovo Smart Clock E27 Starter Pack                              , Url: https://www.officeworks.com.au/shop/officeworks/p/lenovo-smart-clock-e27-starter-pack-sylsmcbun2
    Name: Amazon 5.5" Echo Show 5 Sandstone                                , Url: https://www.officeworks.com.au/shop/officeworks/p/amazon-5-5-echo-show-5-sandstone-syecosh5ss
    Name: Amazon Echo Studio Black                                         , Url: https://www.officeworks.com.au/shop/officeworks/p/amazon-echo-studio-black-syastudiob
    Name: Lenovo Smart Clock B22 Starter Pack                              , Url: https://www.officeworks.com.au/shop/officeworks/p/lenovo-smart-clock-b22-starter-pack-sylsmcbun1
    Name: JBL Link View Speaker with Google Assistant                      , Url: https://www.officeworks.com.au/shop/officeworks/p/jbl-link-view-speaker-with-google-assistant-injblinkvw
    Name: Ultimate Ears Blast Wireless Speaker with Alexa Blue Steel       , Url: https://www.officeworks.com.au/shop/officeworks/p/ultimate-ears-blast-wireless-speaker-with-alexa-blue-steel-imblastbe
    Name: LG WK7 ThinQ WiFi/Bluetooth Speaker with Google Assistant        , Url: https://www.officeworks.com.au/shop/officeworks/p/lg-wk7-thinq-wifi-bluetooth-speaker-with-google-assistant-inlgthinkq
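The request above asks for `page=0` with `hitsPerPage=24`, so a larger category would need more than one call. The sketch below shows one possible way to walk the pages and build the product URLs. It assumes each Algolia result carries an `nbPages` field next to `hits` (part of Algolia's usual search response, though not shown in the output above). `product_urls`, `fetch_page` and `fake_fetch` are hypothetical names, and the network call is passed in as a function so the logic can be tried offline with canned data.

```python
BASE_URL = "https://www.officeworks.com.au/shop/officeworks/p/"


def product_urls(fetch_page):
    """Collect (name, url) pairs from every page of results.

    `fetch_page(page)` must return one Algolia result dict, i.e. the
    equivalent of r['results'][0] for the given page number.
    """
    products = []
    page = 0
    while True:
        result = fetch_page(page)
        for item in result["hits"]:
            products.append((item["name"], BASE_URL + item["urlKeyword"]))
        page += 1
        # Assumption: `nbPages` is the total page count; stop if it is absent.
        if page >= result.get("nbPages", 1):
            break
    return products


def fake_fetch(page):
    """Canned two-page response used instead of a real HTTP request."""
    pages = [
        [{"name": "Google Home Mini Chalk",
          "urlKeyword": "google-home-mini-chalk-sygminiwe"}],
        [{"name": "Google Home", "urlKeyword": "google-home-sygghome"}],
    ]
    return {"hits": pages[page], "nbPages": len(pages)}


for name, url in product_urls(fake_fetch):
    print(f"{name}: {url}")
```

In real use, `fetch_page` would wrap the `requests.post` call with the desired page number substituted into the `params` string.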
    

    Answer source: Stack Overflow

    2020-03-25 00:18:56