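I have a program that needs to parse the directory listings returned by FTP servers. I assumed this would be fairly simple and looked through a few posts online, but none of them covered the problem in any depth. So I downloaded the FileZilla source code. Its source files
directorylistingparser.h
directorylistingparser.cpp
are the ones that parse directory listings. If you have the same need, they are worth a look. The job takes real effort, because each platform needs its own special handling. Here is the header file: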
#ifndef __DIRECTORYLISTINGPARSER_H__
#define __DIRECTORYLISTINGPARSER_H__
/* This class is responsible for parsing the directory listings returned by
* the server.
* Unfortunately, RFC959 did not specify the format of directory listings, so
* each server uses its own format. In addition to that, in most cases the
* listings were not designed to be machine-parsable, they were meant to be
* human readable by users of that particular server.
* By far the most common format is the one returned by the Unix "ls -l"
* command. However, legacy systems are still in place, especially in big
* companies. These often use very exotic listing styles.
* Another problem are localized listings containing date strings. In some
* cases these listings are ambiguous and cannot be distinguished.
* Example for an ambiguous date: 04-05-06. All of the 6 permutations for
* the location of year, month and day are valid dates.
* Some servers send multiline listings where a single entry can span two
* lines, this has to be detected as well, as far as possible.
*
* Some servers send MVS style listings which can consist of just the
* filename without any additional data. In order to prevent problems, this
* format is only parsed if the server is in fact recognized as an MVS server.
*
* Please see tests/dirparsertest.cpp for a list of supported formats and the
* expected parser result.
*
* If adding data to the parser, it first decomposes the raw data into lines,
* which then are processed further. Each line gets consecutively tested for
* different formats, starting with the most common Unix style format.
* Lines not containing a recognized format (e.g. a part of a multiline
* entry) are remembered and if the next line cannot be parsed either, they
* get concatenated to be parsed again (and discarded if not recognized).
*/
class CLine;
class CToken;
class CControlSocket;

class CDirectoryListingParser
{
public:
    CDirectoryListingParser(CControlSocket* pControlSocket, const CServer& server);
    ~CDirectoryListingParser();

    CDirectoryListing Parse(const CServerPath &path);

    void AddData(char *pData, int len);
    void AddLine(const wxChar* pLine);

    void Reset();

    void SetTimezoneOffset(const wxTimeSpan& span) { m_timezoneOffset = span; }

    void SetServer(const CServer& server) { m_server = server; };

protected:
    CLine *GetLine(bool breakAtEnd = false);

    void ParseData(bool partial);

    bool ParseLine(CLine *pLine, const enum ServerType serverType, bool concatenated);

    bool ParseAsUnix(CLine *pLine, CDirentry &entry, bool expect_date);
    bool ParseAsDos(CLine *pLine, CDirentry &entry);
    bool ParseAsEplf(CLine *pLine, CDirentry &entry);
    bool ParseAsVms(CLine *pLine, CDirentry &entry);
    bool ParseAsIbm(CLine *pLine, CDirentry &entry);
    bool ParseOther(CLine *pLine, CDirentry &entry);
    bool ParseAsWfFtp(CLine *pLine, CDirentry &entry);
    bool ParseAsIBM_MVS(CLine *pLine, CDirentry &entry);
    bool ParseAsIBM_MVS_PDS(CLine *pLine, CDirentry &entry);
    bool ParseAsIBM_MVS_PDS2(CLine *pLine, CDirentry &entry);
    bool ParseAsIBM_MVS_Migrated(CLine *pLine, CDirentry &entry);
    bool ParseAsMlsd(CLine *pLine, CDirentry &entry);
    bool ParseAsOS9(CLine *pLine, CDirentry &entry);

    // Only call this if servertype set to ZVM since it conflicts
    // with other formats.
    bool ParseAsZVM(CLine *pLine, CDirentry &entry);

    // Only call this if servertype set to HPNONSTOP since it conflicts
    // with other formats.
    bool ParseAsHPNonstop(CLine *pLine, CDirentry &entry);

    // Date / time parsers
    bool ParseUnixDateTime(CLine *pLine, int &index, CDirentry &entry);
    bool ParseShortDate(CToken &token, CDirentry &entry, bool saneFieldOrder = false);
    bool ParseTime(CToken &token, CDirentry &entry);

    // Parse file sizes given like this: 123.4M
    bool ParseComplexFileSize(CToken& token, wxLongLong& size, int blocksize = -1);

    bool GetMonthFromName(const wxString& name, int &month);

    CControlSocket* m_pControlSocket;

    static std::map<wxString, int> m_MonthNamesMap;

    struct t_list
    {
        char *p;
        int len;
    };
    int m_currentOffset;

    std::list<t_list> m_DataList;
    std::list<CDirentry> m_entryList;

    CLine *m_prevLine;

    CServer m_server;

    bool m_fileListOnly;
    std::list<wxString> m_fileList;

    bool m_maybeMultilineVms;

    wxTimeSpan m_timezoneOffset;
};
#endif
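
The comment at the top of the header describes the overall strategy: each line is tested against the known formats in turn, and a line that matches nothing is remembered so that it can be joined with the following line and tried again. The sketch below illustrates only that control flow, and it is not FileZilla's implementation. `Entry`, `ParseUnixLike`, `ParseDosLike` and `ParseListing` are hypothetical names made up for this example, and the two matchers are deliberately simplistic stand-ins for the real `ParseAsUnix` and `ParseAsDos`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, heavily simplified entry; the real CDirentry holds much more.
struct Entry {
    std::string name;
    long long size = 0;
    bool isDir = false;
};

using LineParser = std::function<bool(const std::string&, Entry&)>;

// Split a line into whitespace-separated tokens.
std::vector<std::string> Tokenize(const std::string& line) {
    std::istringstream in(line);
    std::vector<std::string> tokens;
    for (std::string t; in >> t;) tokens.push_back(t);
    return tokens;
}

// Join tokens[first..] with single spaces (file names may contain spaces).
std::string JoinFrom(const std::vector<std::string>& t, std::size_t first) {
    std::string out;
    for (std::size_t i = first; i < t.size(); ++i) {
        if (i > first) out += ' ';
        out += t[i];
    }
    return out;
}

// True for a short, purely numeric token that std::stoll can convert safely.
bool IsDigits(const std::string& s) {
    return !s.empty() && s.size() <= 18 &&
           s.find_first_not_of("0123456789") == std::string::npos;
}

// Toy "ls -l" matcher: perms links owner group size month day time|year name
bool ParseUnixLike(const std::string& line, Entry& e) {
    const std::vector<std::string> t = Tokenize(line);
    if (t.size() < 9 || t[0].size() != 10) return false;
    const char type = t[0][0];
    if (type != '-' && type != 'd' && type != 'l') return false;
    if (!IsDigits(t[4])) return false;
    e.isDir = (type == 'd');
    e.size = std::stoll(t[4]);
    e.name = JoinFrom(t, 8);
    return true;
}

// Toy DOS-style matcher: MM-DD-YY time <DIR>|size name
bool ParseDosLike(const std::string& line, Entry& e) {
    const std::vector<std::string> t = Tokenize(line);
    if (t.size() < 4 || t[0].size() != 8 || t[0][2] != '-' || t[0][5] != '-')
        return false;
    const bool dir = (t[2] == "<DIR>");
    if (!dir && !IsDigits(t[2])) return false;
    e.isDir = dir;
    e.size = dir ? 0 : std::stoll(t[2]);
    e.name = JoinFrom(t, 3);
    return true;
}

// Try every parser on every line. An unrecognized line is remembered; if the
// next line fails too, the two are concatenated and tried once more.
std::vector<Entry> ParseListing(const std::vector<std::string>& lines,
                                const std::vector<LineParser>& parsers) {
    assert(!parsers.empty());
    std::vector<Entry> entries;
    std::string pending;      // the previous unrecognized line, if any
    bool hasPending = false;

    auto tryParse = [&parsers](const std::string& line, Entry& e) {
        for (const auto& parse : parsers) {
            if (parse(line, e)) return true;
        }
        return false;
    };

    for (const auto& line : lines) {
        Entry e;
        if (tryParse(line, e)) {
            entries.push_back(e);
            hasPending = false;   // a remembered line is discarded
            continue;
        }
        Entry joined;
        if (hasPending && tryParse(pending + " " + line, joined)) {
            entries.push_back(joined);
            hasPending = false;
            continue;
        }
        pending = line;           // remember this line for the next round
        hasPending = true;
    }
    return entries;
}
```

Calling `ParseListing` with `{ParseUnixLike, ParseDosLike}` as the parser list handles a listing that mixes both styles, and an entry split across two lines is recovered once the halves are joined. The order of the parser list matters: as the header comment notes, the most common format goes first, and formats that conflict with others (such as ZVM or HP NonStop) should only be tried when the server type is already known.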
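This article is reposted from h2appy's 51CTO blog. Original link: http://blog.51cto.com/h2appy/122279. If you wish to repost it, please contact the original author yourself.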