Get all hrefs for a specific author

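Hello, I am trying to get all the /pubmed/ numbers that link to the abstracts of a specific author's articles. The problem is that when I try, I only get the same number over and over until the for loop ends.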

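The hrefs I want should come from the output of the "for line in lines" loop (a specific href is shown in the output sample). That loop seems to work fine, but the "for abstract in abstracts" loop then just repeats the same href.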

Any suggestions or ideas about what I am missing or doing wrong? I don't have much experience with bs4, so I may not be using the library well.

    # Obtain all the papers of a scientific author and write their abstracts to a new file

    from bs4 import BeautifulSoup
    import re
    import requests

    url = "https://www.ncbi.nlm.nih.gov/pubmed/?term=valvano"
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'lxml')

    lines = soup.find_all("div", {"class": "rslt"})
    authors = soup.find_all("p", {"class": "desc"})

    scientist = []
    for author in authors:
        #print('\n', author.text)
        scientist.append(author.text)

    s = []
    for i in scientist:
        L = i.split(',')
        s.append(L)

    n = 0
    for line in lines:
        if ' Valvano MA' in s[n] or 'Valvano MA' in s[n]:
            print('\n', line.text)
            found = soup.find("a", {"class": "status_icon nohighlight"})
            web_abstract = 'https://www.ncbi.nlm.nih.gov{}'.format(found['href'])
            response0 = requests.get(web_abstract)
            sopa = BeautifulSoup(response0.content, 'lxml')
            abstracts = sopa.find("div", {"class": "abstr"})
            for abstract in abstracts:
                #print (abstract.text)
                print('https://www.ncbi.nlm.nih.gov{}'.format(found['href']))

            n = n + 1

        else:

            n = n + 1

Part of one output of print(line.text):

    <a href="/pubmed/32146294" ...

Output of the inner loop:

    https://www.ncbi.nlm.nih.gov/pubmed/31919170
    https://www.ncbi.nlm.nih.gov/pubmed/31919170
    https://www.ncbi.nlm.nih.gov/pubmed/31919170
    https://www.ncbi.nlm.nih.gov/pubmed/31919170
    https://www.ncbi.nlm.nih.gov/pubmed/31919170
    https://www.ncbi.nlm.nih.gov/pubmed/31919170
    https://www.ncbi.nlm.nih.gov/pubmed/31919170
    https://www.ncbi.nlm.nih.gov/pubmed/31919170
    https://www.ncbi.nlm.nih.gov/pubmed/31919170
    https://www.ncbi.nlm.nih.gov/pubmed/31919170
    https://www.ncbi.nlm.nih.gov/pubmed/31919170

Expected output:

    https://www.ncbi.nlm.nih.gov/pubmed/32146294
    https://www.ncbi.nlm.nih.gov/pubmed/32064693
    https://www.ncbi.nlm.nih.gov/pubmed/31978399
    https://www.ncbi.nlm.nih.gov/pubmed/31919170
    https://www.ncbi.nlm.nih.gov/pubmed/31896348
    https://www.ncbi.nlm.nih.gov/pubmed/31866961
    https://www.ncbi.nlm.nih.gov/pubmed/31722994
    https://www.ncbi.nlm.nih.gov/pubmed/31350337
    https://www.ncbi.nlm.nih.gov/pubmed/31332863
    https://www.ncbi.nlm.nih.gov/pubmed/31233657
    https://www.ncbi.nlm.nih.gov/pubmed/31133642
    https://www.ncbi.nlm.nih.gov/pubmed/30913267
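
One likely explanation for the repeated link, offered as a sketch rather than a definitive answer: soup.find(...) searches the whole page, so it returns the same first matching <a> on every iteration, whereas calling find on the current result element limits the search to that result. The example below uses a small, made-up HTML sample (SAMPLE_HTML) in place of the live PubMed page and a hypothetical helper named author_links. The real page's markup and class names may differ, so the selectors here are assumptions.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the PubMed results page.
# The real markup may differ; only the "rslt" and "desc" class names
# are taken from the question.
SAMPLE_HTML = """
<div class="rslt">
  <p class="title"><a href="/pubmed/32146294">First paper</a></p>
  <p class="desc">Smith J, Valvano MA.</p>
</div>
<div class="rslt">
  <p class="title"><a href="/pubmed/32064693">Second paper</a></p>
  <p class="desc">Doe J, Roe R.</p>
</div>
<div class="rslt">
  <p class="title"><a href="/pubmed/31978399">Third paper</a></p>
  <p class="desc">Valvano MA, Lee K.</p>
</div>
"""

BASE_URL = "https://www.ncbi.nlm.nih.gov"


def author_links(html, author):
    """Return absolute URLs of the results whose author list contains `author`."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for result in soup.find_all("div", {"class": "rslt"}):
        desc = result.find("p", {"class": "desc"})
        if desc is None:
            continue
        # Split the author line and drop surrounding spaces and periods.
        names = [name.strip(" .") for name in desc.text.split(",")]
        if author not in names:
            continue
        # Search inside this result only; soup.find(...) would always
        # return the first matching <a> of the whole page.
        link = result.find("a", href=True)
        if link is not None:
            links.append(BASE_URL + link["href"])
    return links


if __name__ == "__main__":
    for url in author_links(SAMPLE_HTML, "Valvano MA"):
        print(url)
```

In the original code the print call also sits inside the loop over the children of the abstract div, so each link is printed once per child node. Collecting one link per result, as above, avoids that repetition. Looking up the author line inside each result also removes the need for the separate counter n.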

Source: Stack Overflow
