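There's nothing technically deep in this post; if you've ever written a scraper, you already know how to do this.
"QQ唠叨" (QQ Laodao, roughly "QQ chatter") is QQ Taotao (滔滔)'s name for the short everyday remarks you post. You can post them from the web, from QQ, through your QQ signature, or by SMS. For details, see the Taotao site: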
http://www.taotao.com/
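What we want to do here is "forward" whatever we post on QQ Laodao to our blog. Once we post a remark, the blog picks it up almost immediately (the code below caches results for up to an hour), so visitors can see what we've been chattering about lately. Pretty cool, right? (You can see the effect in the header of my blog.)
OK, enough talk. Here's the code: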
<%@ Import Namespace="System" %>
<%@ Import Namespace="System.Net" %>
<%@ Import Namespace="System.Text" %>
<%@ Import Namespace="System.Text.RegularExpressions" %>
<script language="C#" runat="server">
    // Escapes text so it can sit inside a single-quoted JavaScript string literal
    private static string EscapeJs(string s)
    {
        return s.Replace("\\", "\\\\").Replace("'", "\\'").Replace("\r", "").Replace("\n", " ");
    }

    protected override void OnLoad(EventArgs e)
    {
        string qq = Request.QueryString["qq"];      // QQ number to look up
        string strNum = Request.QueryString["num"]; // number of entries to output
        // The QQ number is pasted into the URL below, so accept digits only
        if (string.IsNullOrEmpty(qq) || !Regex.IsMatch(qq, @"\A[0-9]+\z")) return;
        int num;
        // Try the cache first
        string cacheKey = "TAOTAO_CACHE_CONTENTS_QQ_" + qq;
        MatchCollection contents = Cache[cacheKey] as MatchCollection;
        if (contents == null)
        {
            // Cache miss: read the data from the Taotao page
            string taotaoUrl = "http://www.taotao.com/v1/space/" + qq + "/t.1/p.1"; // URL of this user's Taotao entries
            string findPattern = @"<div class=[^>]+><p><span>(?<content>.*?)</span>\s*<a[^>]+>(?<time>[^<]+)</a>"; // regex that finds each entry block
            using (WebClient client = new WebClient())
            {
                client.Encoding = Encoding.UTF8;
                string html = client.DownloadString(taotaoUrl);
                contents = Regex.Matches(html, findPattern, RegexOptions.IgnoreCase);
                // Cache the matches (they expire after one hour)
                Cache.Insert(cacheKey, contents, null, DateTime.Now.AddHours(1), TimeSpan.Zero);
            }
        }
        // Output at most num entries; if num is missing or invalid, output them all
        num = (int.TryParse(strNum, out num) ? Math.Min(num, contents.Count) : contents.Count);
        // Write the entries out as JavaScript document.write calls
        Response.Clear();
        Response.Buffer = true;
        for (int i = 0; i < num; i++)
        {
            Match m = contents[i];
            Response.Write("document.write('");
            Response.Write(EscapeJs(m.Groups["content"].Value));
            Response.Write(string.Format("[<font color=\"#FFFF00\">{0}</font>]", EscapeJs(m.Groups["time"].Value)));
            Response.Write("<br />");
            Response.Write("');");
        }
        Response.Flush();
    }
</script>
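Save the code above as an .aspx file (mind the file encoding! It is usually UTF-8), upload the file to your server, and then reference its URL from your blog.
Note: a blog generally has to call it as a script. For example, the code I added to my blog header is:
Latest QQ Laodao: <script src="http://www.xxx.com/taotao.aspx?qq=YOUR_QQ_NUMBER&num=1" charset="gb2312"></script>
Note: 博客园 (cnblogs) pages are encoded in UTF-8. If your page's output is not UTF-8, you must declare its actual encoding in the script tag's charset attribute, as in the example above; otherwise the text shows up garbled!
One more note: because this is scraping, if the Taotao site changes its page layout you have to update the regular expression accordingly. Otherwise the entries can no longer be extracted :(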
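A quick way to notice a layout change is to run the pattern against a saved copy of the page and see whether it still finds anything. Below is a minimal sketch of that idea. The class name TaotaoPatternCheck, the Extract helper, and the SampleHtml fragment are all made up for illustration; the fragment is hand-written to fit the pattern and is not real Taotao markup.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for checking the scraping pattern offline.
public static class TaotaoPatternCheck
{
    // Same pattern the page uses to find each entry block.
    public const string FindPattern =
        @"<div class=[^>]+><p><span>(?<content>.*?)</span>\s*<a[^>]+>(?<time>[^<]+)</a>";

    // Assumed sample fragment, written by hand so that the pattern matches it once.
    public const string SampleHtml =
        @"<div class=""item""><p><span>Hello from QQ</span> <a href=""#"">10:30</a></p></div>";

    // Returns one "content [time]" string per entry the pattern finds in html.
    public static List<string> Extract(string html)
    {
        var entries = new List<string>();
        foreach (Match m in Regex.Matches(html, FindPattern, RegexOptions.IgnoreCase))
        {
            entries.Add(m.Groups["content"].Value + " [" + m.Groups["time"].Value + "]");
        }
        return entries;
    }
}
```

Calling TaotaoPatternCheck.Extract(TaotaoPatternCheck.SampleHtml) gives a single entry, "Hello from QQ [10:30]". If you feed it the HTML of a real page and get an empty list back, that is a hint the layout has changed and the pattern needs updating.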
This article is reposted from Kingthy's blog on 博客园 (cnblogs). Original link: http://www.cnblogs.com/kingthy/archive/2008/04/22/1165489.html
To repost it, please contact the original author.