《Python极客项目编程》 (Python Playground): 1.4 The Complete Code

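Summary:

This section is excerpted from Chapter 1, Section 1.4 of 《Python极客项目编程》, published by 异步社区, written by Mahesh Venkitachalam (US) and translated by 王海鹏. More chapters are available through the "异步社区" official account on 云栖社区.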

1.4 The Complete Code

Here is the complete program. You can also find the code for this project, along with some test data, at https://github.com/electronut/pp/tree/master/playlist/.

import argparse
import plistlib

import numpy as np
from matplotlib import pyplot

def findCommonTracks(fileNames):
    """
    Find common tracks in given playlist files,
    and save them to common.txt.
    """
    # a list of sets of track names
    trackNameSets = []
    for fileName in fileNames:
        # create a new set
        trackNames = set()
        # read in playlist (plistlib.readPlist() was removed in Python 3.9)
        with open(fileName, 'rb') as fp:
            plist = plistlib.load(fp)
        # get the tracks
        tracks = plist['Tracks']
        # iterate through the tracks
        for trackId, track in tracks.items():
            try:
                # add the track name to a set
                trackNames.add(track['Name'])
            except KeyError:
                # ignore tracks without a name
                pass
        # add to list
        trackNameSets.append(trackNames)
    # once every file has been read, get the set of common tracks
    commonTracks = set.intersection(*trackNameSets)
    # write to file
    if len(commonTracks) > 0:
        # open in text mode with an explicit encoding and write str, not bytes
        with open("common.txt", 'w', encoding="UTF-8") as f:
            for val in commonTracks:
                f.write("%s\n" % val)
        print("%d common tracks found. "
              "Track names written to common.txt." % len(commonTracks))
    else:
        print("No common tracks!")

def plotStats(fileName):
    """
    Plot some statistics by reading track information from playlist.
    """
    # read in a playlist
    with open(fileName, 'rb') as fp:
        plist = plistlib.load(fp)
    # get the tracks from the playlist
    tracks = plist['Tracks']
    # create lists of song ratings and track durations
    ratings = []
    durations = []
    # iterate through the tracks
    for trackId, track in tracks.items():
        try:
            # read both fields before appending so the lists stay the same length
            rating = track['Album Rating']
            duration = track['Total Time']
            ratings.append(rating)
            durations.append(duration)
        except KeyError:
            # ignore tracks missing either field
            pass
    # ensure that valid data was collected
    if ratings == [] or durations == []:
        print("No valid Album Rating/Total Time data in %s." % fileName)
        return

    # scatter plot
    x = np.array(durations, np.int32)
    # convert milliseconds to minutes
    x = x/60000.0
    y = np.array(ratings, np.int32)
    pyplot.subplot(2, 1, 1)
    pyplot.plot(x, y, 'o')
    pyplot.axis([0, 1.05*np.max(x), -1, 110])
    pyplot.xlabel('Track duration')
    pyplot.ylabel('Track rating')

    # plot histogram
    pyplot.subplot(2, 1, 2)
    pyplot.hist(x, bins=20)
    pyplot.xlabel('Track duration')
    pyplot.ylabel('Count')
    # show plot
    pyplot.show()

def findDuplicates(fileName):
    """
    Find duplicate tracks in given playlist.
    """
    print('Finding duplicate tracks in %s...' % fileName)
    # read in playlist
    with open(fileName, 'rb') as fp:
        plist = plistlib.load(fp)
    # get the tracks from the Tracks dictionary
    tracks = plist['Tracks']
    # create a track name dictionary
    trackNames = {}
    # iterate through tracks
    for trackId, track in tracks.items():
        try:
            name = track['Name']
            duration = track['Total Time']
            # look for existing entries
            if name in trackNames:
                # if a name and duration match, increment the count
                # durations (in ms) are compared as whole seconds
                if duration//1000 == trackNames[name][0]//1000:
                    count = trackNames[name][1]
                    trackNames[name] = (duration, count+1)
            else:
                # add dictionary entry as tuple (duration, count)
                trackNames[name] = (duration, 1)
        except KeyError:
            # ignore tracks missing a name or duration
            pass
    # store duplicates as (count, name) tuples
    dups = []
    for k, v in trackNames.items():
        if v[1] > 1:
            dups.append((v[1], k))
    # save duplicates to a file
    if len(dups) > 0:
        print("Found %d duplicates. Track names saved to dups.txt" % len(dups))
    else:
        print("No duplicate tracks found!")
    with open("dups.txt", 'w', encoding="UTF-8") as f:
        for val in dups:
            f.write("[%d] %s\n" % (val[0], val[1]))

# gather our code in a main() function
def main():
    # create parser
    descStr = """
    This program analyzes playlist files (.xml) exported from iTunes.
    """
    parser = argparse.ArgumentParser(description=descStr)
    # add a mutually exclusive group of arguments
    group = parser.add_mutually_exclusive_group()

    # add expected arguments
    group.add_argument('--common', nargs='*', dest='plFiles', required=False)
    group.add_argument('--stats', dest='plFile', required=False)
    group.add_argument('--dup', dest='plFileD', required=False)

    # parse args
    args = parser.parse_args()

    if args.plFiles:
        # find common tracks
        findCommonTracks(args.plFiles)
    elif args.plFile:
        # plot stats
        plotStats(args.plFile)
    elif args.plFileD:
        # find duplicate tracks
        findDuplicates(args.plFileD)
    else:
        print("These are not the tracks you are looking for.")

# main method
if __name__ == '__main__':
    main()
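
If you do not have an iTunes export at hand, a small test playlist can be generated with plistlib itself. The following is only a sketch, not part of the book's listing: the helper names writeTestPlaylist and readTrackNames and the sample track data are made up for illustration, and the files contain only the 'Tracks' dictionary with the three keys the program reads ('Name', 'Total Time', 'Album Rating'), not everything a real iTunes export holds.

```python
import os
import plistlib
import tempfile

def writeTestPlaylist(path, tracks):
    """Write a minimal playlist file; tracks is a list of (name, ms, rating).

    Hypothetical helper: only the keys used by the program are written.
    """
    plist = {'Tracks': {}}
    for i, (name, totalTime, rating) in enumerate(tracks, start=1):
        # plist dictionary keys must be strings
        plist['Tracks'][str(i)] = {
            'Name': name,
            'Total Time': totalTime,   # milliseconds
            'Album Rating': rating,
        }
    with open(path, 'wb') as fp:
        plistlib.dump(plist, fp)       # XML format by default

def readTrackNames(path):
    """Return the set of track names stored in a playlist file."""
    with open(path, 'rb') as fp:
        plist = plistlib.load(fp)
    return {track['Name'] for track in plist['Tracks'].values()}

# made-up sample data, written to a temporary directory
tmpDir = tempfile.mkdtemp()
fileA = os.path.join(tmpDir, 'a.xml')
fileB = os.path.join(tmpDir, 'b.xml')
writeTestPlaylist(fileA, [('Song One', 215000, 80), ('Song Two', 187000, 60)])
writeTestPlaylist(fileB, [('Song Two', 187000, 60), ('Song Three', 240000, 100)])

# the same set intersection that findCommonTracks() relies on
common = set.intersection(readTrackNames(fileA), readTrackNames(fileB))
print(common)  # {'Song Two'}
```

Files produced this way can then be passed to the program through the --common, --stats, or --dup options.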