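The fileinput module provides helpers for processing one or more text files: a single for..in loop reads through the contents of one file, or of several files in sequence.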
The examples below use two files.
1.txt
1a
2a
3a
4a
2.txt
1b
2b
DESCRIPTION
Typical use is:
import fileinput
for line in fileinput.input():
    process(line)
This iterates over the lines of all files listed in sys.argv[1:],
defaulting to sys.stdin if the list is empty. If a filename is '-' it
is also replaced by sys.stdin. To specify an alternative list of
filenames, pass it as the argument to input(). A single file name is
also allowed.
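[Paraphrase]
fileinput.input() iterates over the lines of every file named in sys.argv[1:]. If that list is empty it reads standard input, and a file name of "-" also means standard input. To read a different set of files, pass a list of file names to input(); a single file name is accepted as well.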
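[Example]
(1)
#!/usr/bin/env python
# Python 2 syntax (print statement), as in every listing in this post.
import fileinput,sys
# Read every file named on the command line.
for line in fileinput.input(sys.argv[1:]):
    pass
print fileinput.lineno(),
On the command line, run:
python test.py 1.txt 2.txt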
(2)
#!/usr/bin/env python
import fileinput
# Pass an explicit list of file names.
for line in fileinput.input(['1.txt','2.txt']):
    pass
print fileinput.lineno(),
(3)
#!/usr/bin/env python
import fileinput
# A single file name is also accepted.
for line in fileinput.input("1.txt"):
    pass
print fileinput.lineno(),
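The list form and the single-name form can be checked without touching the real 1.txt and 2.txt. What follows is only a minimal sketch in Python 3 syntax (the listings in this post are Python 2); the scratch directory and the helper name count_lines are assumptions made for illustration.

```python
import fileinput
import os
import tempfile


def count_lines(files):
    """Return the cumulative line count for a list of names or a single name."""
    with fileinput.input(files=files) as stream:
        for _ in stream:
            pass
        return stream.lineno()


# Recreate the two sample files in a scratch directory.
workdir = tempfile.mkdtemp()
one = os.path.join(workdir, "1.txt")
two = os.path.join(workdir, "2.txt")
with open(one, "w") as handle:
    handle.write("1a\n2a\n3a\n4a\n")
with open(two, "w") as handle:
    handle.write("1b\n2b\n")

total_for_list = count_lines([one, two])  # a list of file names
total_for_name = count_lines(one)         # a single file name
print(total_for_list, total_for_name)
```

Passing files explicitly keeps the sketch independent of sys.argv, which is what input() falls back to when no file names are given.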
Functions filename(), lineno() return the filename and cumulative line
number of the line that has just been read; filelineno() returns its
line number in the current file; isfirstline() returns true iff the
line just read is the first line of its file; isstdin() returns true
iff the line was read from sys.stdin. Function nextfile() closes the
current file so that the next iteration will read the first line from
the next file (if any); lines not read from the file will not count
towards the cumulative line count; the filename is not changed until
after the first line of the next file has been read. Function close()
closes the sequence.
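[Paraphrase]
filename() and lineno() return the name of the file being read and the cumulative number of lines read so far; filelineno() returns the line number within the current file. isfirstline() is true when the line just read is the first line of its file, and isstdin() is true when the line came from standard input. nextfile() closes the current file so that the next iteration starts at the first line of the next file; the lines that were skipped are not added to the cumulative count. The file name does not change until the first line of the next file has been read. close() closes the whole sequence.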
[Example]
(1)
#!/usr/bin/env python
import fileinput
for line in fileinput.input(['1.txt']):
    pass
# After the loop: the last file read and the cumulative line count.
print fileinput.filename(),fileinput.lineno()
[root@newpatch3 /home/python]#python test.py
1.txt 4
(2)
#!/usr/bin/env python
import fileinput
for line in fileinput.input(['1.txt','2.txt']):
    pass
print fileinput.filename(),":",fileinput.filelineno()
print "1.txt and 2.txt total line:",fileinput.lineno()
[root@newpatch3 /home/python]#python test.py
2.txt : 2
1.txt and 2.txt total line: 6
Note the difference: filelineno() restarts for each file (2 is the last line of 2.txt), while lineno() keeps counting across files (4 + 2 = 6).
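To watch the two counters diverge line by line rather than only after the loop, something like the following sketch may help. It uses Python 3 syntax (unlike the Python 2 listings here), recreates the sample files in a scratch directory, and the variable names are illustrative assumptions.

```python
import fileinput
import os
import tempfile

# Recreate the two sample files in a scratch directory.
workdir = tempfile.mkdtemp()
one = os.path.join(workdir, "1.txt")
two = os.path.join(workdir, "2.txt")
with open(one, "w") as handle:
    handle.write("1a\n2a\n3a\n4a\n")
with open(two, "w") as handle:
    handle.write("1b\n2b\n")

records = []
with fileinput.input(files=[one, two]) as stream:
    for line in stream:
        records.append((
            os.path.basename(stream.filename()),  # file the line came from
            stream.filelineno(),                  # position inside that file
            stream.lineno(),                      # position across all files
        ))

for name, in_file, overall in records:
    print(name, in_file, overall)
```

The second column starts over when 2.txt begins, while the third column keeps increasing.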
(3)
#!/usr/bin/env python
import fileinput,sys
for line in fileinput.input(['1.txt']):
    if fileinput.isfirstline():
        print line,
        sys.exit(0)
[root@newpatch3 /home/python]#python test.py
1a
1.txt contains four lines (1a, 2a, 3a, 4a); the isfirstline() test prints only the first one and then exits.
(4)
#!/usr/bin/env python
import fileinput
# With no arguments and no file names on the command line, input() reads stdin.
for line in fileinput.input():
    if fileinput.isstdin():
        print "isstdin"
[root@newpatch3 /home/python]#python test.py
This is stdin
isstdin
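The examples above do not exercise nextfile(). A rough sketch of how it might be used, in Python 3 syntax and with a hypothetical helper first_lines operating on scratch copies of the sample files, could look like this:

```python
import fileinput
import os
import tempfile


def first_lines(files):
    """Collect the first line of each file, skipping the rest via nextfile()."""
    heads = []
    with fileinput.input(files=files) as stream:
        for line in stream:
            heads.append(line.rstrip("\n"))
            stream.nextfile()  # abandon the remaining lines of this file
        counted = stream.lineno()  # skipped lines are not included
    return heads, counted


# Recreate the two sample files in a scratch directory.
workdir = tempfile.mkdtemp()
one = os.path.join(workdir, "1.txt")
two = os.path.join(workdir, "2.txt")
with open(one, "w") as handle:
    handle.write("1a\n2a\n3a\n4a\n")
with open(two, "w") as handle:
    handle.write("1b\n2b\n")

heads, counted = first_lines([one, two])
print(heads, counted)
```

Only the lines actually read are counted, so the cumulative count here is 2 even though the files hold 6 lines in total.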
Before any lines have been read, filename() returns None and both line
numbers are zero; nextfile() has no effect. After all lines have been
read, filename() and the line number functions return the values
pertaining to the last line read; nextfile() has no effect.
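[Paraphrase]
Before any line has been read, filename() returns None, both line numbers are 0, and nextfile() has no effect. After all lines have been read, filename() and the line-number functions keep returning the values for the last line read, and nextfile() again has no effect.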
[Example]
Try this one yourself.
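One possible way to test this, sketched in Python 3 syntax: a FileInput object is queried before the first line and again after the last one. The scratch files and the variable names before and after are assumptions made for illustration.

```python
import fileinput
import os
import tempfile

# Recreate the two sample files in a scratch directory.
workdir = tempfile.mkdtemp()
one = os.path.join(workdir, "1.txt")
two = os.path.join(workdir, "2.txt")
with open(one, "w") as handle:
    handle.write("1a\n2a\n3a\n4a\n")
with open(two, "w") as handle:
    handle.write("1b\n2b\n")

with fileinput.FileInput(files=[one, two]) as stream:
    # Nothing has been read yet.
    before = (stream.filename(), stream.lineno(), stream.filelineno())
    for _ in stream:
        pass
    # Everything has been read: values refer to the last line read.
    after = (
        os.path.basename(stream.filename()),
        stream.lineno(),
        stream.filelineno(),
    )

print(before)
print(after)
```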
All files are opened in text mode. If an I/O error occurs during
opening or reading a file, the IOError exception is raised.
[Paraphrase]
All files are opened in text mode. If an I/O error occurs while opening or reading a file, an IOError exception is raised.
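A small sketch of handling that error, assuming Python 3, where IOError is an alias of OSError and a missing file surfaces as its subclass FileNotFoundError. The helper read_all and the file name missing.txt are hypothetical.

```python
import fileinput
import os
import tempfile


def read_all(files):
    """Return (lines, error); error is None when every file could be read."""
    try:
        with fileinput.input(files=files) as stream:
            return list(stream), None
    except OSError as exc:  # IOError is the same class in Python 3
        return [], exc


workdir = tempfile.mkdtemp()
present = os.path.join(workdir, "1.txt")
with open(present, "w") as handle:
    handle.write("1a\n2a\n")
missing = os.path.join(workdir, "missing.txt")  # deliberately never created

good_lines, good_error = read_all([present])
bad_lines, bad_error = read_all([missing])
print(good_lines, good_error)
print(bad_lines, type(bad_error).__name__)
```

Files are opened lazily, so the exception appears while iterating rather than when input() is called.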
If sys.stdin is used more than once, the second and further use will
return no lines, except perhaps for interactive use, or if it has been
explicitly reset (e.g. using sys.stdin.seek(0)).
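[Paraphrase]
If standard input is used more than once, the second and later uses return no lines, except possibly in interactive use or after it has been explicitly reset (for example with sys.stdin.seek(0)).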
[Example]
#!/usr/bin/env python
import fileinput,sys
# First pass over standard input.
for line in fileinput.input():
    print "line1:",line,
fileinput.close()
# Reset standard input before reading it a second time.
sys.stdin.seek(0)
for line in fileinput.input():
    print "line2:",line,
fileinput.close()
[root@newpatch3 /home/python]#python test.py
a
b
c
line1: a
line1: b
line1: c
e
f
g
line2: e
line2: f
line2: g
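The transcript above is an interactive session, which is the exception the documentation mentions. To see the non-interactive behavior, the sketch below swaps sys.stdin for an in-memory io.StringIO object. That substitution, the helper read_stdin, and the Python 3 syntax are assumptions of this sketch, not something the post itself does.

```python
import fileinput
import io
import sys


def read_stdin():
    """Read everything currently available on sys.stdin through fileinput."""
    with fileinput.input(files=["-"]) as stream:  # "-" means standard input
        return list(stream)


real_stdin = sys.stdin
sys.stdin = io.StringIO("a\nb\nc\n")  # stand-in for redirected input
try:
    first = read_stdin()   # consumes all three lines
    second = read_stdin()  # stdin is exhausted, so nothing comes back
    sys.stdin.seek(0)      # explicit reset
    third = read_stdin()   # the same three lines again
finally:
    sys.stdin = real_stdin

print(first, second, third)
```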
Empty files are opened and immediately closed; the only time their
presence in the list of filenames is noticeable at all is when the
last file opened is empty.
It is possible that the last line of a file doesn't end in a newline
character; otherwise lines are returned including the trailing
newline.
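Both remarks can be observed with a sketch such as the following (Python 3 syntax; the file names a.txt, empty.txt and b.txt are made up for this illustration). The empty file contributes nothing, and the last line of a.txt comes back without a newline.

```python
import fileinput
import os
import tempfile

workdir = tempfile.mkdtemp()
contents = {
    "a.txt": "x\ny",   # last line has no trailing newline
    "empty.txt": "",   # opened and immediately closed by fileinput
    "b.txt": "z\n",
}
paths = []
for name, text in contents.items():
    path = os.path.join(workdir, name)
    with open(path, "w") as handle:
        handle.write(text)
    paths.append(path)

seen = []
with fileinput.input(files=paths) as stream:
    for line in stream:
        seen.append((os.path.basename(stream.filename()), line))

print(seen)
```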
Class FileInput is the implementation; its methods filename(),
lineno(), filelineno(), isfirstline(), isstdin(), nextfile() and close()
correspond to the functions in the module. In addition it has a
readline() method which returns the next input line, and a
__getitem__() method which implements the sequence behavior. The
sequence must be accessed in strictly sequential order; sequence
access and readline() cannot be mixed.
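[Paraphrase]
The FileInput class is the implementation. Its methods filename(), lineno(), filelineno(), isfirstline(), isstdin(), nextfile() and close() correspond to the module-level functions. It also has a readline() method, which returns the next input line, and a __getitem__() method, which implements the sequence behavior. The sequence must be accessed in strictly sequential order, and sequence access must not be mixed with readline().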
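As an illustration of using the class directly, here is a minimal sketch that drives a FileInput object with readline() only. It is written in Python 3 syntax and uses scratch copies of the sample files; the variable names are assumptions.

```python
import fileinput
import os
import tempfile

# Recreate the two sample files in a scratch directory.
workdir = tempfile.mkdtemp()
one = os.path.join(workdir, "1.txt")
two = os.path.join(workdir, "2.txt")
with open(one, "w") as handle:
    handle.write("1a\n2a\n3a\n4a\n")
with open(two, "w") as handle:
    handle.write("1b\n2b\n")

reader = fileinput.FileInput(files=[one, two])
lines = []
try:
    while True:
        line = reader.readline()
        if not line:  # an empty string signals the end of all input
            break
        lines.append(line)
    total = reader.lineno()
finally:
    reader.close()

print(lines, total)
```

The loop sticks to readline() throughout, in line with the rule that the two access styles should not be mixed.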
Optional in-place filtering: if the keyword argument inplace=1 is
passed to input() or to the FileInput constructor, the file is moved
to a backup file and standard output is directed to the input file.
This makes it possible to write a filter that rewrites its input file
in place. If the keyword argument backup=".<some extension>" is also
given, it specifies the extension for the backup file, and the backup
file remains around; by default, the extension is ".bak" and it is
deleted when the output file is closed. In-place filtering is
disabled when standard input is read. XXX The current implementation
does not work for MS-DOS 8+3 filesystems.
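[Paraphrase]
In short: with inplace=1, the input file is first moved to a backup file and standard output is redirected to the original file name, so whatever the loop prints becomes the new content of the file. If backup is also given, the backup file uses that extension and is kept; otherwise the extension is ".bak" and the backup is deleted when the output file is closed.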
[Example]
(1A)
#!/usr/bin/env python
import fileinput,sys
# inplace=0: output goes to the terminal and 1.txt is left untouched.
for line in fileinput.input("1.txt", inplace=0):
    print line,
[root@newpatch3 /home/python]#python test.py
1a
2a
3a
4a
[root@newpatch3 /home/python]#more 1.txt
1a
2a
3a
4a
(1B)
#!/usr/bin/env python
import fileinput,sys
# inplace=1: everything printed inside the loop is written into 1.txt.
for line in fileinput.input("1.txt",inplace=1):
    print "test",
[root@newpatch3 /home/python]#python test.py
[root@newpatch3 /home/python]#more 1.txt
test test test test
Comparing 1A and 1B: with inplace=1 and no backup argument, the output is written straight into 1.txt and no backup copy is kept.
(2)
#!/usr/bin/env python
import fileinput,sys
# backup=".bak" keeps the original content in 1.txt.bak.
for line in fileinput.input("1.txt",inplace=1,backup=".bak"):
    print "test\n",
[root@newpatch3 /home/python]#ls
1.txt  1.txt.bak  2.txt  test.py
[root@newpatch3 /home/python]#more 1.txt
test
test
test
test
[root@newpatch3 /home/python]#more 1.txt.bak
1a
2a
3a
4a
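A more typical in-place filter transforms each line instead of replacing it with a constant. The sketch below is written in Python 3 syntax and works on a scratch copy of 1.txt; the helper name upper_in_place is hypothetical.

```python
import fileinput
import os
import tempfile


def upper_in_place(path, backup_ext=".bak"):
    """Upper-case every line of `path` in place, keeping a backup copy."""
    with fileinput.input(files=[path], inplace=True, backup=backup_ext) as stream:
        for line in stream:
            # While inplace is active, print() writes into `path`.
            print(line.upper(), end="")


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "1.txt")
with open(path, "w") as handle:
    handle.write("1a\n2a\n3a\n4a\n")

upper_in_place(path)

with open(path) as handle:
    rewritten = handle.read()
with open(path + ".bak") as handle:
    original = handle.read()
```

Because the lines already end in a newline, end="" plays the role of the trailing comma in the Python 2 print statements above.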
Performance: this module is unfortunately one of the slower ways of
processing large numbers of input lines. Nevertheless, a significant
speed-up has been obtained by using readlines(bufsize) instead of
readline(). A new keyword argument, bufsize=N, is present on the
input() function and the FileInput() class to override the default
buffer size.
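[Paraphrase]
This module is not an ideal choice for processing a large number of input lines. A significant speed-up has nevertheless been obtained by using readlines(bufsize) instead of readline(). A new keyword argument, bufsize=N, is available on the input() function and the FileInput() class to override the default buffer size. In short, the bufsize keyword argument may help when processing large inputs.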
This article is reposted from hahazhu0634's 51CTO blog. Original link: http://blog.51cto.com/5ydycm/305488. To repost it, please contact the original author.