NLTK Essentials Study Notes (Part 12)

Introduction:

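Building your first NLP application:
Summarization: given an article, passage, or story, automatically generate a summary of its content. Summarization requires understanding not just the structure of individual sentences but the structure of the whole text, as well as its genre and main topic.
What follows walks through creating a personal version of Google News.
Sentences that contain more entities and nouns generally tend to be more important. The task is to compute an importance score using some uniform logic that can be normalized; to keep only the top n sentences, we then choose an importance-score threshold.
Because I could not find the news material used in the original text, I substitute a passage from a wiki that introduces Saber: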

# Read the source text and print it; the with-block closes the file for us.
with open('new.txt', 'r') as f:
    new_content = f.read()
print(new_content)

Output:

Saber's full name is Altria Pendragon, a character inspired by the legends of King Arthur. At her nativity, Uther decides to not publicly 
announce Altria's birth or gender, fearing his subjects will never accept a woman as a legitimate ruler. She is entrusted by Merlin to a loyal knight, Sir 
Ector, who raises her as a surrogate son. When Altria is fifteen, King Uther dies leaving no known eligible heir to the throne. Britain enters a period of 
turmoil following the growing threat of invasion by the Saxons. Merlin soon approaches Altria, explaining that the British people will recognize her as a 
destined ruler if she withdraws Caliburn, a ceremonial sword embedded in a large slab of stone. However, pulling this sword is symbolic of accepting the 
hardships of a monarch, and Altria will be responsible for preserving the welfare of her people. Without hesitation and despite her gender, she draws Caliburn 
and shoulders Britain's mantle of leadership.
Altria rules Britain from her stronghold in Camelot and earns the reputation of a just, yet distant king. Under the guidance of Merlin and with the aid of her 
Knights of the Round Table, she guides Britain into an era of prosperity and tranquillity. Caliburn is destroyed, but Altria soon acquires her holy sword, 
Excalibur, and Avalon, Excalibur's blessed sheath, from Vivian, the Lady of the Lake. While Avalon is in her possession, Altria never ages and is immortal in battle.
Despite her immense strength and fighting abilities, Altria is plagued by feelings of guilt and inferiority throughout her reign; she sacrifices her emotions for 
the good of Britain, yet many of her subjects and knights become critical of her lack of humanity and cold calculation. Excalibur's scabbard is stolen while she 
repels an assault along her country's borders; when Altria returns inland, she discovers Britain is being torn asunder by civil unrest. Despite her valiant efforts 
to placate the dissent, Altria is mortally wounded by a traitorous knight, a homunculus born of her blood named Mordred, during the Battle of Camlann. Her dying body 
is escorted to a holy isle by Morgan le Fay and Sir Bedivere. Altria orders a grieving Bedivere to dispose of Excalibur by throwing it back to Vivian; in her absence, 
she reflects on her personal failures, regretting her life as king. Before her last breath, she appeals to the world; in exchange for services as a Heroic Spirit, she 
asks to be given an opportunity to relive her life, where someone more suitable and effective would lead Britain in her stead.

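To analyze the text, we first convert the article into a list of sentences. A sentence tokenizer splits the content into sentences, and each sentence is given a number so that it can be identified and ranked. Once we have the sentences, each one is run through the word tokenizer and then through the POS tagger and the NER tagger.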

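import nltk

# Requires the NLTK tokenizer, POS tagger, and NE chunker data to be installed.
with open('new.txt', 'r') as f:
    new_content = f.read()

results = []
for sent_no, sentence in enumerate(nltk.sent_tokenize(new_content)):
    # Tokenize and POS-tag once, then reuse the result below.
    tokens = nltk.word_tokenize(sentence)
    no_of_tokens = len(tokens)
    tagged = nltk.pos_tag(tokens)
    # Count singular common nouns (NN) and singular proper nouns (NNP).
    no_of_nouns = len([word for word, pos in tagged if pos in ["NN", "NNP"]])
    # Named-entity chunks are subtrees with a label(); plain tokens are tuples.
    ners = nltk.ne_chunk(tagged)
    no_of_ners = len([chunk for chunk in ners if hasattr(chunk, 'label')])
    # Importance score: (entities + nouns) / tokens.
    score = (no_of_ners + no_of_nouns) / float(no_of_tokens)
    results.append((sent_no, no_of_tokens, no_of_ners, no_of_nouns, score, sentence))

# Print the sentences from highest to lowest score.
for sent in sorted(results, key=lambda x: x[4], reverse=True):
    print(sent[5])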

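The code above iterates over the list of sentences and computes a score for each one. The score is a simple ratio: the number of named entities plus the number of nouns as the numerator, and the total number of tokens in the sentence as the denominator. Each result is stored as a tuple. Printed in descending order of score, the result is: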

Caliburn is destroyed, but Altria soon acquires her holy sword, 
Excalibur, and Avalon, Excalibur's blessed sheath, from Vivian, the Lady of the Lake.
Her dying body 
is escorted to a holy isle by Morgan le Fay and Sir Bedivere.
Saber's full name is Altria Pendragon, a character inspired by the legends of King Arthur.
Britain enters a period of 
turmoil following the growing threat of invasion by the Saxons.
Altria rules Britain from her stronghold in Camelot and earns the reputation of a just, yet distant king.
Without hesitation and despite her gender, she draws Caliburn 
and shoulders Britain's mantle of leadership.
Under the guidance of Merlin and with the aid of her 
Knights of the Round Table, she guides Britain into an era of prosperity and tranquillity.
While Avalon is in her possession, Altria never ages and is immortal in battle.
Excalibur's scabbard is stolen while she 
repels an assault along her country's borders; when Altria returns inland, she discovers Britain is being torn asunder by civil unrest.
Despite her valiant efforts 
to placate the dissent, Altria is mortally wounded by a traitorous knight, a homunculus born of her blood named Mordred, during the Battle of Camlann.
When Altria is fifteen, King Uther dies leaving no known eligible heir to the throne.
She is entrusted by Merlin to a loyal knight, Sir 
Ector, who raises her as a surrogate son.
Merlin soon approaches Altria, explaining that the British people will recognize her as a 
destined ruler if she withdraws Caliburn, a ceremonial sword embedded in a large slab of stone.
Altria orders a grieving Bedivere to dispose of Excalibur by throwing it back to Vivian; in her absence, 
she reflects on her personal failures, regretting her life as king.
At her nativity, Uther decides to not publicly 
announce Altria's birth or gender, fearing his subjects will never accept a woman as a legitimate ruler.
Before her last breath, she appeals to the world; in exchange for services as a Heroic Spirit, she 
asks to be given an opportunity to relive her life, where someone more suitable and effective would lead Britain in her stead.
Despite her immense strength and fighting abilities, Altria is plagued by feelings of guilt and inferiority throughout her reign; she sacrifices her emotions for 
the good of Britain, yet many of her subjects and knights become critical of her lack of humanity and cold calculation.
However, pulling this sword is symbolic of accepting the 
hardships of a monarch, and Altria will be responsible for preserving the welfare of her people.
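
To make the ratio concrete, here is a minimal sketch that computes the same score from counts alone. The helper name importance_score and the counts passed to it are made up for illustration; they were not produced by NLTK or taken from the text above.

```python
def importance_score(no_of_ners, no_of_nouns, no_of_tokens):
    """Same ratio as the loop above: (entities + nouns) / tokens."""
    if no_of_tokens == 0:
        # Guard against empty sentences, which the loop above does not handle.
        return 0.0
    return (no_of_ners + no_of_nouns) / float(no_of_tokens)


# Hypothetical counts, chosen only to show how the score behaves.
dense = importance_score(no_of_ners=3, no_of_nouns=5, no_of_tokens=16)
sparse = importance_score(no_of_ners=0, no_of_nouns=2, no_of_tokens=20)
print(dense, sparse)
```

A short sentence packed with names and nouns scores higher than a long sentence with few of them, which is why the entity-heavy sentences appear at the top of the ranking.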

That completes the sentence ranking. Once we have no_of_nouns and no_of_ners for every sentence, we can build more sophisticated rules on top of them.
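
As one possible next step, the sketch below turns the ranked tuples into a summary, either by keeping the n best sentences or by applying a score threshold. The function names summarize and above_threshold and the sample rows are hypothetical; the rows only mimic the (sent_no, no_of_tokens, no_of_ners, no_of_nouns, score, sentence) layout of results.

```python
def summarize(results, n=3):
    """Return the n highest-scoring sentences, restored to document order."""
    top = sorted(results, key=lambda r: r[4], reverse=True)[:n]
    return [r[5] for r in sorted(top, key=lambda r: r[0])]


def above_threshold(results, threshold):
    """Return every sentence whose score is at least the threshold."""
    return [r[5] for r in results if r[4] >= threshold]


# Hypothetical rows in the same layout as `results`; not real NLTK output.
sample = [
    (0, 10, 1, 3, 0.4, "First sentence."),
    (1, 12, 0, 2, 2 / 12.0, "Second sentence."),
    (2, 8, 2, 2, 0.5, "Third sentence."),
    (3, 20, 0, 1, 0.05, "Fourth sentence."),
]
print(summarize(sample, n=2))
print(above_threshold(sample, 0.15))
```

Putting the selected sentences back in their original order usually reads better than listing them by score, since the summary then follows the flow of the source text.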