https://pypi.org/project/hdfs3 (no longer maintained)
PyArrow
https://pypi.org/project/hdfs/
https://pypi.org/project/snakebite/ works well on Python 2 but has poor Python 3 support.
hdfs and PyArrow are the most commonly used; hdfs is used as the example here:
Quick start
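from hdfs import InsecureClient
# Connect to the NameNode's WebHDFS endpoint (50070 is the default HTTP port on Hadoop 2.x).
client = InsecureClient('http://localhost:50070', user='hduser_')
# List the entries under the HDFS root directory.
fs_folders_list = client.list("/")
print(fs_folders_list)
# Read a text file line by line.
with client.read('/user/hduser/input.txt', encoding='utf-8') as reader:
    for line in reader:
        print(line)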
Output:
['user']
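
The quick start can be wrapped in a few small helpers so that the listing and reading logic is reusable and can be tried without a live cluster. This is only a sketch: the function names namenode_url, list_dir, read_lines and main are hypothetical and not part of the hdfs package. It assumes the NameNode's default HTTP port (50070 on Hadoop 2.x, 9870 on Hadoop 3.x) and reuses the host, user and file path from the example above.

```python
def namenode_url(host, hadoop_major=2):
    """Build the WebHDFS base URL for a NameNode (hypothetical helper).

    Assumes the default NameNode HTTP port: 50070 on Hadoop 2.x,
    9870 on Hadoop 3.x. Clusters with a custom port need the URL spelled out.
    """
    port = 9870 if hadoop_major >= 3 else 50070
    return "http://{}:{}".format(host, port)


def list_dir(client, path="/"):
    """Return the sorted names of the entries directly under `path`."""
    return sorted(client.list(path))


def read_lines(client, path, encoding="utf-8"):
    """Return the lines of a text file, without trailing newlines.

    The whole file is loaded into memory, so this suits small files only.
    """
    with client.read(path, encoding=encoding) as reader:
        return reader.read().splitlines()


def main():
    # Needs `pip install hdfs` and a reachable NameNode with WebHDFS enabled.
    from hdfs import InsecureClient

    client = InsecureClient(namenode_url("localhost"), user="hduser_")
    print(list_dir(client))
    for line in read_lines(client, "/user/hduser/input.txt"):
        print(line)
```

Calling main() runs the same steps as the quick start. Because the helpers accept any object that provides list and read, they can also be exercised with a stand-in client in unit tests.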