Machine and Deep Learning with Python
Education
Tutorials and courses
- Supervised learning superstitions cheat sheet
- Introduction to Deep Learning with Python
- How to implement a neural network
- How to build and run your first deep learning network
- Neural Nets for Newbies by Melanie Warrick
- Data Science 101: Interactive Analysis with Jupyter, Pandas and Treasure Data
- Deep Learning Lecture - University of Oxford
- Deep Learning Tutorial
- Python Programming for the Humanities - Course for Python programming for the Humanities, assuming no prior knowledge. Heavy focus on text processing / NLP.
- Oxford: Machine Learning Course
Pyplot
Material Databases
- Materials for Learning Machine Learning
- On Deep Learning A Tweeted Bibliography
- Continually updated Data Science Python Notebooks
- http://people.duke.edu/~ccc14/sta-663/index.html
- Stanford Reports for 2015
- Data Science Specialization
- Unsupervised Feature Learning and Deep Learning
- Awesome Deep Vision
- nut - Natural language Understanding Toolkit
- SuperLearner and subsemble - Multi-algorithm ensemble learning packages.
- Bolt - Bolt Online Learning Toolbox
- Shogun - The Shogun Machine Learning Toolbox
Algorithms
- Boruta - Boruta: A wrapper algorithm for all-relevant feature selection
Cheatsheets
Theory and Use Cases
Astronomy
- astroML - Machine Learning and Data Mining for Astronomy.
Law
- Machine Learning and Law - Harry Surden
- eBrevia Applies Machine Learning To Contract Review
- Professor Harry Surden Discusses Machine Learning within Law
Fraud Detection
Chat
Business and money
- Estimating a Real Business Cycle DSGE Model by Maximum Likelihood in Python
- Predicting Heavy and Extreme Losses in Real-Time for Portfolio Holders
Bullying
Gaming
Recommendations
- Collaborative filtering recommendation engine implementation in python
- NLP in python -- predicting HN upvotes from headlines
Text Analysis
- Adam Palay - "Words, words, words": Reading Shakespeare with Python - PyCon 2015
- High-quality XML versions of the complete works of Shakespeare
- The Unreasonable Effectiveness of Recurrent Neural Networks
- Document Clustering with Python
Natural Language Processing
- BLLIP Parser - BLLIP Natural Language Parser (also known as the Charniak-Johnson parser)
- TextBlob - Providing a consistent API for diving into common natural language processing (NLP) tasks. Stands on the giant shoulders of NLTK and Pattern, and plays nicely with both.
Sport
Image Recognition
- Generative Image Modeling Using Spatial LSTMs
- Suddenly, a leopard print sofa appears
- What’s in This Picture? AI Becomes as Smart as a Toddler
- Bringing Deep Learning to the Grocery Store
- PyImageSearch and Computer Vision
Kaggle Competition
- spaCy - Industrial strength NLP with Python and Cython.
- PyStanfordDependencies - Python interface for converting Penn Treebank trees to Stanford Dependencies.
- wiki challenge - An implementation of Dell Zhang's solution to Wikipedia's Participation Challenge on Kaggle
- kaggle insults - Kaggle Submission for "Detecting Insults in Social Commentary"
- kaggle_acquire-valued-shoppers-challenge - Code for the Kaggle acquire valued shoppers challenge
- kaggle-cifar - Code for the CIFAR-10 competition at Kaggle, uses cuda-convnet
- kaggle-blackbox - Deep learning made easy
- kaggle-accelerometer - Code for Accelerometer Biometric Competition at Kaggle
- kaggle-advertised-salaries - Predicting job salaries from ads - a Kaggle competition
- kaggle amazon - Amazon access control challenge
- kaggle-bestbuy_big - Code for the Best Buy competition at Kaggle
- kaggle-bestbuy_small
- Kaggle Dogs vs. Cats - Code for Kaggle Dogs vs. Cats competition
- Kaggle Galaxy Challenge - Winning solution for the Galaxy Challenge on Kaggle
- Kaggle Gender - A Kaggle competition: discriminate gender based on handwriting
- Kaggle Merck - Merck challenge at Kaggle
- Kaggle Stackoverflow - Predicting closed questions on Stack Overflow
- wine-quality - Predicting wine quality
General-Purpose Machine Learning
- gensim - Topic Modelling for Humans.
- Restricted Boltzmann Machines - Restricted Boltzmann Machines in Python.
- CoverTree - Python implementation of cover trees, near-drop-in replacement for scipy.spatial.kdtree
- nilearn - Machine learning for NeuroImaging in Python
- SKLL - A wrapper around scikit-learn that makes it simpler to conduct experiments.
- neurolab - https://code.google.com/p/neurolab/
- Pebl - Python Environment for Bayesian Learning
- yahmm - Hidden Markov Models for Python, implemented in Cython for speed and efficiency.
- pydeep - Deep Learning In Python
Data Analysis / Data Visualization
- pycascading
- SparklingPandas - Pandas on PySpark (POPS)
- ahaz - ahaz: Regularization for semiparametric additive hazards regression
- arules - arules: Mining Association Rules and Frequent Itemsets
- bigrf - bigrf: Big Random Forests: Classification and Regression Forests for Large Data Sets
- bst - bst: Gradient Boosting
- C50 - C50: C5.0 Decision Trees and Rule-Based Models
- Clever Algorithms For Machine Learning
- CORElearn - CORElearn: Classification, regression, feature evaluation and ordinal evaluation
- CoxBoost - CoxBoost: Cox models by likelihood based boosting for a single survival endpoint or competing risks
- Cubist - Cubist: Rule- and Instance-Based Regression Modeling
- e1071 - e1071: Misc Functions of the Department of Statistics (e1071), TU Wien
- earth - earth: Multivariate Adaptive Regression Spline Models
- elasticnet - elasticnet: Elastic-Net for Sparse Estimation and Sparse PCA
- ElemStatLearn - ElemStatLearn: Data sets, functions and examples from the book: "The Elements of Statistical Learning, Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani and Jerome Friedman
- evtree - evtree: Evolutionary Learning of Globally Optimal Trees
- fpc - fpc: Flexible procedures for clustering
- frbs - frbs: Fuzzy Rule-based Systems for Classification and Regression Tasks
- GAMBoost - GAMBoost: Generalized linear and additive models by likelihood based boosting
- gamboostLSS - gamboostLSS: Boosting Methods for GAMLSS
- gbm - gbm: Generalized Boosted Regression Models
- glmnet - glmnet: Lasso and elastic-net regularized generalized linear models
- glmpath - glmpath: L1 Regularization Path for Generalized Linear Models and Cox Proportional Hazards Model
- GMMBoost - GMMBoost: Likelihood-based Boosting for Generalized mixed models
- grplasso - grplasso: Fitting user specified models with Group Lasso penalty
- grpreg - grpreg: Regularization paths for regression models with grouped covariates
- hda - hda: Heteroscedastic Discriminant Analysis
- Introduction to Statistical Learning
- ipred - ipred: Improved Predictors
- kernlab - kernlab: Kernel-based Machine Learning Lab
- klaR - klaR: Classification and visualization
- lars - lars: Least Angle Regression, Lasso and Forward Stagewise
- lasso2 - lasso2: L1 constrained estimation aka ‘lasso’
- LogicReg - LogicReg: Logic Regression
- Machine Learning For Hackers
- maptree - maptree: Mapping, pruning, and graphing tree models
- mboost - mboost: Model-Based Boosting
- medley - medley: Blending regression models, using a greedy stepwise approach
- mvpart - mvpart: Multivariate partitioning
- ncvreg - ncvreg: Regularization paths for SCAD- and MCP-penalized regression models
- nnet - nnet: Feed-forward Neural Networks and Multinomial Log-Linear Models
- oblique.tree - oblique.tree: Oblique Trees for Classification Data
- pamr - pamr: Pam: prediction analysis for microarrays
- party - party: A Laboratory for Recursive Partytioning
- partykit - partykit: A Toolkit for Recursive Partytioning
- penalized - penalized: L1 (lasso and fused lasso) and L2 (ridge) penalized estimation in GLMs and in the Cox model
- penalizedLDA - penalizedLDA: Penalized classification using Fisher's linear discriminant
- penalizedSVM - penalizedSVM: Feature Selection SVM using penalty functions
- quantregForest - quantregForest: Quantile Regression Forests
- randomForest - randomForest: Breiman and Cutler's random forests for classification and regression
- randomForestSRC - randomForestSRC: Random Forests for Survival, Regression and Classification (RF-SRC)
- rattle - rattle: Graphical user interface for data mining in R
- rda - rda: Shrunken Centroids Regularized Discriminant Analysis
- rdetools - rdetools: Relevant Dimension Estimation (RDE) in Feature Spaces
- REEMtree - REEMtree: Regression Trees with Random Effects for Longitudinal (Panel) Data
- relaxo - relaxo: Relaxed Lasso
- rgenoud - rgenoud: R version of GENetic Optimization Using Derivatives
- Rmalschains - Rmalschains: Continuous Optimization using Memetic Algorithms with Local Search Chains (MA-LS-Chains) in R
- rminer - rminer: Simpler use of data mining methods (e.g. NN and SVM) in classification and regression
- ROCR - ROCR: Visualizing the performance of scoring classifiers
- RoughSets - RoughSets: Data Analysis Using Rough Set and Fuzzy Rough Set Theories
- rpart - rpart: Recursive Partitioning and Regression Trees
- RPMM - RPMM: Recursively Partitioned Mixture Model
- RSNNS - RSNNS: Neural Networks in R using the Stuttgart Neural Network Simulator (SNNS)
- RWeka - RWeka: R/Weka interface
- RXshrink - RXshrink: Maximum Likelihood Shrinkage via Generalized Ridge or Least Angle Regression
- sda - sda: Shrinkage Discriminant Analysis and CAT Score Variable Selection
- SDDA - SDDA: Stepwise Diagonal Discriminant Analysis
- svmpath - svmpath: The SVM Path algorithm
- tgp - tgp: Bayesian treed Gaussian process models
- tree - tree: Classification and regression trees
- varSelRF - varSelRF: Variable selection using random forests
Video Streaming
- Target acquired: Finding targets in drone and quadcopter video streams using Python and OpenCV
- Visualization of taxi trip end points
- Basic motion detection and tracking with Python and OpenCV
- Home surveillance and motion detection with the Raspberry Pi, Python, OpenCV, and Dropbox
Time
Audio
Python & Machine Learning
- Python-Powered Machine Learning in the Cloud
Generic and unclassified
- Introduction to Neural Machine Translation with GPUs
- On the accuracy of self-normalized log-linear models
- Bayesian Dark Knowledge
- Humans Need Not Apply
- Robot Economics
- Machines that think for themselves
- How Artificial Intelligence Will Make Technology Disappear
- Deep Learning Machine Beats Humans in IQ Test
Misc Scripts / iPython Notebooks / Codebases
- BioPy - Biologically-Inspired and Machine Learning Algorithms in Python.
- pattern_classification
- thinking stats 2
- hyperopt
- numpic
- 2012-paper-diginorm
- A gallery of interesting IPython notebooks
- ipython-notebooks
- decision-weights
- Sarah Palin LDA - Topic Modeling the Sarah Palin emails.
- Diffusion Segmentation - A collection of image segmentation algorithms based on diffusion methods
- Scipy Tutorials - SciPy tutorials. This is outdated, check out scipy-lecture-notes
- Crab - A recommendation engine library for Python
- BayesPy - Bayesian Inference Tools in Python
- scikit-learn tutorials - Series of notebooks for learning scikit-learn
- sentiment-analyzer - Tweets Sentiment Analyzer
- sentiment_classifier - Sentiment classifier using word sense disambiguation.
- group-lasso - Some experiments with the coordinate descent algorithm used in the (Sparse) Group Lasso model
- jProcessing - Kanji / Hiragana / Katakana to Romaji converter. Edict dictionary and parallel sentence search. Sentence similarity between two Japanese sentences. Sentiment analysis of Japanese text. Runs Cabocha (ISO-8859-1 configured) in Python.
- mne-python-notebooks - IPython notebooks for EEG/MEG data processing using mne-python
- pandas cookbook - Recipes for using Python's pandas library
- Bayesian Methods for Hackers - Book/iPython notebooks on Probabilistic Programming in Python
Tools
Deep Learning Frameworks
- deap - Evolutionary algorithm framework.
- NErvana's pythON based Deep Learning Framework
- Pyevolve - Genetic algorithm framework.
- Caffe - A deep learning framework developed with cleanliness, readability, and speed in mind.
- DLib - A suite of ML tools designed to be easy to embed in other applications
- encog-cpp
- shark
- Vowpal Wabbit (VW) - A fast out-of-core learning system.
- sofia-ml - Suite of fast incremental algorithms.
- Shogun - The Shogun Machine Learning Toolbox
- CXXNET - Yet another deep learning framework with less than 1000 lines core code [DEEP LEARNING]
- XGBoost - A parallelized optimized general purpose gradient boosting library.
- Stan - A probabilistic programming language implementing full Bayesian statistical inference with Hamiltonian Monte Carlo sampling
- BanditLib - A simple Multi-armed Bandit library.
Libraries
- A library to build and test machine learning features
- deepy - Highly extensible deep learning framework based on Theano
- Featureforge - A set of tools for creating and testing machine learning features, with a scikit-learn compatible API
- scikit-learn - A Python module for machine learning built on top of SciPy.
- SimpleAI - Python implementation of many of the artificial intelligence algorithms described in the book "Artificial Intelligence, a Modern Approach". It focuses on providing an easy to use, well documented and tested library.
- graphlab-create - A library with various machine learning models (regression, clustering, recommender systems, graph analytics, etc.) implemented on top of a disk-backed DataFrame.
- BigML - A library that contacts external servers.
- pattern - Web mining module for Python.
- NuPIC - Numenta Platform for Intelligent Computing.
Environment Management
- p - Dead Simple Interactive Python Version Management.
- pyenv - Simple Python version management.
- virtualenv - A tool to create isolated Python environments.
- virtualenvwrapper - A set of extensions to virtualenv.
- virtualenv-api - An API for virtualenv and pip.
- pew - A set of tools to manage multiple virtual environments.
- Vex - Run a command in the named virtualenv.
- PyRun - A one-file, no-installation-needed version of Python.
Package Management
- pip - The Python package and dependency manager.
- conda - Cross-platform, Python-agnostic binary package manager.
- Curdling - Curdling is a command line tool for managing Python packages.
- wheel - The new standard of Python distribution, intended to replace eggs.
Package Repositories
- warehouse - Next generation Python Package Repository (PyPI).
- devpi - PyPI server and packaging/testing/release tool.
- localshop - PyPI server which mirrors official packages on-demand, and also supports local (private) package uploads.
- bandersnatch - PyPI mirroring tool provided by Python Packaging Authority (PyPA)
Distribution
- cx-Freeze - Freezes Python scripts (cross-platform).
- py2exe - Freezes Python scripts (Windows).
- pynsist - A tool to build Windows installers; the installers bundle Python itself.
- py2app - Freezes Python scripts (Mac OS X).
- PyInstaller - Converts Python programs into stand-alone executables (cross-platform).
- dh-virtualenv - Build and distribute a virtualenv as a Debian package.
- Nuitka - Compile scripts, modules, packages to an executable or extension module.
Build Tools
- buildout - A build system for creating, assembling and deploying applications from multiple parts, some of which may be non-Python-based.
- SCons - A software construction tool.
- PlatformIO - A console tool to build code with different development platforms.
- BitBake - A make-like build tool with a special focus on distributions and packages for embedded Linux.
- fabricate - A build tool that finds dependencies automatically for any language.
Interactive Interpreter
- IPython - A rich toolkit to help you make the most out of using Python interactively.
- bpython - A fancy interface to the Python interpreter.
- ptpython - Advanced Python REPL built on top of the python-prompt-toolkit.
Files
- mimetypes - (Python standard library) Map filenames to MIME types.
- imghdr - (Python standard library) Determine the type of an image.
- python-magic - A Python interface to the libmagic file type identification library.
- path.py - A module wrapper for os.path.
- watchdog - API and shell utilities to monitor file system events.
- Unipath - An object-oriented approach to file/directory operations.
- pathlib - (Python standard library in Python 3.4+) A cross-platform, object-oriented path library.
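As a quick illustration of the two standard-library entries above, the sketch below combines pathlib and mimetypes to describe a file name without touching the disk. The helper name `describe` and the sample file name are made up for this example; only `PurePosixPath` and `mimetypes.guess_type` come from the standard library.

```python
import mimetypes
from pathlib import PurePosixPath


def describe(filename):
    """Return (stem, suffix, MIME type) for a file name; no file access is needed."""
    path = PurePosixPath(filename)
    # guess_type looks only at the extension and returns (type, encoding).
    mime_type, _encoding = mimetypes.guess_type(path.name)
    return path.stem, path.suffix, mime_type


# Hypothetical file name, used only for illustration.
print(describe("reports/summary.html"))
```

For content-based detection (when the extension is missing or untrustworthy), python-magic from the list above is the usual alternative.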
Date and Time
- arrow - Better dates & times for Python.
- Chronyk - A Python 3 library for parsing human-written times and dates.
- dateutil - Extensions to the standard Python datetime module.
- delorean - A library for clearing up the inconvenient truths that arise dealing with datetimes.
- when.py - Providing user-friendly functions to help perform common date and time actions.
- moment - A Python library for dealing with dates/times. Inspired by Moment.js.
- pytz - World timezone definitions, modern and historical. Brings the tz database into Python.
- PyTime - An easy-to-use Python module for operating on date/time/datetime values by string.
Text Processing
- XGBoost - Python bindings for eXtreme Gradient Boosting (Tree) Library
- difflib - (Python standard library) Helpers for computing deltas.
- Levenshtein - Fast computation of Levenshtein distance and string similarity.
- fuzzywuzzy - Fuzzy String Matching.
- esmre - Regular expression accelerator.
- shortuuid - A generator library for concise, unambiguous and URL-safe UUIDs.
- ftfy - Makes Unicode text less broken and more consistent automagically.
- unidecode - ASCII transliterations of Unicode text.
- chardet - Python 2/3 compatible character encoding detector.
- xpinyin - A library to translate Chinese hanzi (漢字) to pinyin (拼音).
- pangu.py - Spacing texts for CJK and alphanumerics.
- pyfiglet - An implementation of figlet written in Python.
- uniout - Print readable chars instead of the escaped string.
- Slugify
- awesome-slugify - A Python slugify library that can preserve unicode.
- python-slugify - A Python slugify library that translates unicode to ASCII.
- PLY - Implementation of lex and yacc parsing tools for Python
- phonenumbers - Parsing, formatting, storing and validating international phone numbers.
- python-user-agents - Browser user agent parser.
- sqlparse - A non-validating SQL parser.
- Pygments - A generic syntax highlighter.
- python-nameparser - Parsing human names into their individual components.
- pyparsing - A general purpose framework for generating parsers.
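For readers who want to try fuzzy matching and deltas before installing anything, here is a minimal sketch using only difflib from the standard library. The helper names `closest` and `line_delta` and the sample package names are assumptions made for this example, not part of any library listed above.

```python
import difflib


def closest(name, candidates, cutoff=0.6):
    """Return the candidate most similar to name, or None if nothing clears the cutoff."""
    matches = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def line_delta(old, new):
    """Return a unified diff between two multi-line strings as a list of lines."""
    return list(difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm=""))


# Hypothetical inputs, used only for illustration.
packages = ["scikit-learn", "scikit-image", "scipy", "statsmodels"]
print(closest("scikit-lern", packages))
print("\n".join(line_delta("alpha\nbeta", "alpha\ngamma")))
```

difflib is convenient but not fast; for large inputs, the Levenshtein and fuzzywuzzy entries above are likely better choices.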
Specific Formats Processing
Libraries for parsing and manipulating specific text formats.
- General
- tablib - A module for Tabular Datasets in XLS, CSV, JSON, YAML.
- Office
- python-docx - Reads, queries and modifies Microsoft Word 2007/2008 docx files.
- xlwt / xlrd - Writing and reading data and formatting information from Excel files.
- XlsxWriter - A Python module for creating Excel .xlsx files.
- xlwings - A BSD-licensed library that makes it easy to call Python from Excel and vice versa.
- openpyxl - A library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files.
- Marmir - Takes Python data structures and turns them into spreadsheets.
- unoconv - Convert between any document format supported by LibreOffice/OpenOffice.
- Markdown
- Python-Markdown - A Python implementation of John Gruber’s Markdown.
- Mistune - A fast, full-featured pure Python Markdown parser.
- YAML
- PyYAML - YAML implementations for Python.
- CSV
- csvkit - Utilities for converting to and working with CSV.
- Archive
- unp - A command line tool that can unpack archives easily.
Natural Language Processing
- CRF++ - Open source implementation of Conditional Random Fields (CRFs) for segmenting/labeling sequential data & other Natural Language Processing tasks.
- frog - Memory-based NLP suite developed for Dutch: PoS tagger, lemmatiser, dependency parser, NER, shallow parser, morphological analyzer.
- NLTK - A leading platform for building Python programs to work with human language data.
- Pattern - A web mining module for the Python programming language. It has tools for natural language processing, machine learning, among others.
- Quepy - A python framework to transform natural language questions to queries in a database query language
- YAlign - A sentence aligner, a friendly tool for extracting parallel sentences from comparable corpora.
- jieba - Chinese Words Segmentation Utilities.
- SnowNLP - A library for processing Chinese text.
- loso - Another Chinese segmentation library.
- genius - A Chinese segmentation library based on Conditional Random Fields.
- Rosetta - Text processing tools and wrappers (e.g. Vowpal Wabbit)
- BLLIP Parser - Python bindings for the BLLIP Natural Language Parser (also known as the Charniak-Johnson parser)
- PyNLPl - Python Natural Language Processing Library. General purpose NLP library for Python. Also contains some specific modules for parsing common NLP formats, most notably for FoLiA, but also ARPA language models, Moses phrasetables, GIZA++ alignments.
- python-ucto - Python binding to ucto (a unicode-aware rule-based tokenizer for various languages)
- python-frog - Python binding to Frog, an NLP suite for Dutch. (pos tagging, lemmatisation, dependency parsing, NER)
- colibri-core - Python binding to a C++ library for extracting and working with basic linguistic constructions such as n-grams and skipgrams in a quick and memory-efficient way.
- TextBlob - Providing a consistent API for diving into common NLP tasks. Stands on the giant shoulders of NLTK and Pattern.
- langid.py - Stand-alone language identification system.
Documentation
- Sphinx - Python Documentation generator.
- reStructuredText - Markup Syntax and Parser Component of Docutils.
- MkDocs - Markdown friendly documentation generator.
- Pycco - The original quick-and-dirty, hundred-line-long, literate-programming-style documentation generator.
- pdoc - Epydoc replacement to auto generate API documentation for Python libraries.
Configuration
- ConfigParser - (Python standard library) INI file parser.
- ConfigObj - INI file parser with validation.
- config - Hierarchical config from the author of logging.
- profig - Config from multiple formats with value conversion.
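A minimal sketch of INI parsing with the standard-library parser listed above. Note that the module is named `configparser` in Python 3 (it was `ConfigParser` in Python 2). The section name, keys, and the `load_settings` helper are invented for this example.

```python
import configparser

# Hypothetical INI content, used only for illustration.
SAMPLE = """
[training]
learning_rate = 0.01
epochs = 20
shuffle = yes
"""


def load_settings(text):
    """Parse INI text and return the [training] section with typed values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["training"]
    return {
        "learning_rate": section.getfloat("learning_rate"),
        "epochs": section.getint("epochs"),
        "shuffle": section.getboolean("shuffle"),  # accepts yes/no, on/off, true/false, 1/0
    }


print(load_settings(SAMPLE))
```

The standard parser converts types only when asked, one key at a time; ConfigObj from the list above adds schema-style validation if that is needed.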
Command-line Tools
- Command-line Application Development
- cement - Cement provides a light-weight and fully featured foundation to build anything from single file scripts to complex and intricately designed applications.
- click - A package for creating beautiful command line interfaces in a composable way.
- clint - Python Command-line Application Tools.
- cliff - A framework for creating command-line programs with multi-level commands.
- Clime - Clime lets you convert any module into a multi-command CLI program without any configuration.
- docopt - Pythonic command line arguments parser.
- colorama - Cross-platform colored terminal text.
- pyCLI - Command-line applications supporting standard command line parsing, logging, unit and functional testing.
- Gooey - Turn command line programs into a full GUI application with one line
- python-prompt-toolkit - A Library for building powerful interactive command lines.
- Productivity Tools
- cookiecutter - A command-line utility that creates projects from cookiecutters (project templates). E.g. Python package projects, jQuery plugin projects.
- httpie - A command line HTTP client, a user-friendly cURL replacement.
- percol - Adds flavor of interactive selection to the traditional pipe concept on UNIX.
- RainbowStream - Smart and nice Twitter client on terminal.
- caniusepython3 - Determine what projects are blocking you from porting to Python 3.
- thefuck - Correcting your previous console command.
- doitlive - A tool for live presentations in the terminal.
- PathPicker - Select files out of bash output.
- bashplotlib - Making basic plots in the terminal. It's a quick way to visualize data without GUI.
Downloader
- s3cmd - A command line tool for managing Amazon S3 and CloudFront.
- s4cmd - Super S3 command line tool, good for higher performance.
- youtube-dl - A small command-line program to download videos from YouTube.
- you-get - A YouTube/Youku/Niconico video downloader written in Python 3.
- coursera - Script for downloading Coursera.org videos and naming them.
- WikiTeam - Tools for downloading and preserving wikis.
- subliminal - Library and command line tool to search and download subtitles.
Imagery
- pillow - Pillow is the friendly PIL fork.
- wand - Python bindings for MagickWand, C API for ImageMagick.
- thumbor - A smart imaging service. It enables on-demand crop, re-sizing and flipping of images.
- imgSeek - A project for searching a collection of images using visual similarity.
- python-qrcode - A pure Python QR Code generator.
- pyBarcode - Create barcodes in Python without needing PIL.
- pygram - Instagram-like image filters.
- Quads - Computer art based on quadtrees.
- nude.py - Nudity detection.
- scikit-image - A Python library for (scientific) image processing.
- hmap - Image histogram remapping.
OCR
- python-tesseract - A wrapper class for Google Tesseract OCR.
- pytesseract - Another wrapper for Google Tesseract OCR.
- pyocr - A wrapper for Tesseract and Cuneiform.
Audio
- audiolazy - Expressive Digital Signal Processing (DSP) package for Python.
- audioread - Cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding.
- beets - A music library manager and MusicBrainz tagger.
- dejavu - Audio fingerprinting and recognition.
- eyeD3 - A tool for working with audio files, specifically MP3 files containing ID3 metadata.
- id3reader - A Python module for reading MP3 meta data.
- mutagen - A Python module to handle audio metadata.
- pydub - Manipulate audio with a simple and easy high level interface.
- pyechonest - Python client for the Echo Nest API.
- talkbox - A Python library for speech/signal processing.
- TimeSide - Open web audio processing framework.
- tinytag - A library for reading music meta data of MP3, OGG, FLAC and Wave files.
- m3u8 - A module for parsing m3u8 file.
- ggplot2 - A data visualization package based on the grammar of graphics.
Video
- moviepy - A module for script-based movie editing with many formats, including animated GIFs.
- shorten.tv - Video summarization.
- scikit-video - Video processing routines for SciPy.
Geolocation
- geopy - Python Geocoding Toolbox.
- pygeoip - Pure Python GeoIP API.
- GeoIP - Python API for MaxMind GeoIP Legacy Database.
- geojson - Python bindings and utilities for GeoJSON.
HTTP
- requests - HTTP Requests for Humans™.
- grequests - requests + gevent for asynchronous HTTP requests.
- urllib3 - An HTTP library with thread-safe connection pooling and file post support; sanity friendly.
- httplib2 - Comprehensive HTTP client library.
- treq - A Python requests-like API built on top of Twisted's HTTP client.
Database
- ZODB - A native object database for Python. A key-value and object graph database.
- pickleDB - A simple and lightweight key-value store for Python.
- TinyDB - A tiny, document-oriented database.
Database Drivers
- Relational Databases
- mysql-python - The MySQL database connector for Python.
- mysqlclient - mysql-python fork supporting Python 3.
- PyMySQL - Pure Python MySQL driver compatible with mysql-python.
- mysql-connector-python - A pure Python MySQL driver from Oracle.
- oursql - A better MySQL connector with support for native prepared statements and BLOBs.
- psycopg2 - The most popular PostgreSQL adapter for Python.
- txpostgres - Twisted based asynchronous driver for PostgreSQL.
- queries - A wrapper of the psycopg2 library for interacting with PostgreSQL.
- dataset - Store Python dicts in a database - works with SQLite, MySQL, and PostgreSQL.
- apsw - Another Python SQLite wrapper.
- NoSQL Databases
- cassandra-python-driver - Python driver for Cassandra.
- pycassa - Python Thrift driver for Cassandra.
- HappyBase - A developer-friendly library for Apache HBase.
- PyMongo - The official Python client for MongoDB.
- Plyvel - A fast and feature-rich Python interface to LevelDB.
- redis-py - The Redis Python Client.
- py2neo - Python wrapper client for Neo4j's restful interface.
- telephus - Twisted based client for Cassandra.
- txRedis - Twisted based client for Redis.
ORM
- Relational Databases
- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper.
- peewee - A small, expressive ORM.
- PonyORM - ORM that provides a generator-oriented interface to SQL.
- NoSQL Databases
- MongoEngine - A Python Object-Document-Mapper for working with MongoDB.
- django-mongodb-engine - Django MongoDB Backend.
- redisco - A Python Library for Simple Models and Containers Persisted in Redis.
- flywheel - Object mapper for Amazon DynamoDB.
- Others
- butterdb - A Python ORM for Google Drive Spreadsheets.
Computer Vision
- SimpleCV - An open source computer vision framework that gives access to several high-powered computer vision libraries, such as OpenCV. Written in Python and runs on Mac, Windows, and Ubuntu Linux.
- Vigranumpy - Python bindings for the VIGRA C++ computer vision library.
Web Frameworks
- Django - The most popular web framework in Python.
- Flask - A microframework for Python.
- Bottle - A fast, simple and lightweight WSGI micro web-framework.
- Pyramid - A small, fast, down-to-earth, open source Python web framework.
- web2py - A full stack web framework and platform focused on ease of use.
- web.py - A web framework for Python that is as simple as it is powerful.
- TurboGears - The Web Framework that starts as a microframework and scales up to a full stack solution.
- CherryPy - A Minimalist Python Web Framework, HTTP/1.1-compliant and WSGI thread-pooled.
- Grok - A framework built on the existing Zope 3 libraries.
- Bluebream - An open-source web application server, framework and library, formerly known as Zope 3.
- guava - A lightweight and high performance web framework for Python written in C.
Permissions
- django-guardian - Implementation of per object permissions for Django 1.2+
- django-rules - A tiny but powerful app providing object-level permissions to Django, without requiring a database.
- Carteblanche - Module to align code with thoughts of users and designers. Also magically handles navigation and permissions.
CMS
- django-cms - An open source enterprise CMS based on Django.
- djedi-cms - A lightweight but yet powerful Django CMS with plugins, inline editing and performance in mind.
- FeinCMS - One of the most advanced Content Management Systems built on Django.
- Kotti - A high-level, Pythonic web application framework built on Pyramid.
- Mezzanine - A powerful, consistent, and flexible content management platform.
- Opps - A Django-based CMS for high-traffic magazines, newspaper websites and portals.
- Plone - A CMS built on top of the open source application server Zope.
- Quokka - Flexible, extensible, small CMS powered by Flask and MongoDB.
- Wagtail - A Django content management system.
- Widgy - Last CMS framework, based on Django.
E-commerce
- django-oscar - An open-source e-commerce framework for Django.
- django-shop - A Django based shop system.
- merchant - A Django app to accept payments from various payment processors.
- money - Money class with optional CLDR-backed locale-aware formatting and an extensible currency exchange solution.
- python-currencies - Display money format and its filthy currencies.
- alipay - Unofficial Alipay API for Python.
RESTful API
- cornice - A REST framework for Pyramid.
- django-rest-framework - A powerful and flexible toolkit that makes it easy to build Web APIs.
- django-tastypie - Creating delicious APIs for Django apps.
- django-formapi - Create JSON APIs with HMAC authentication and Django form-validation.
- flask-api - An implementation of the same web browsable APIs that django-rest-framework provides.
- flask-restful - An extension for Flask that adds support for quickly building REST APIs.
- flask-restless - A Flask extension for generating ReSTful APIs for database models defined with SQLAlchemy (or Flask-SQLAlchemy).
- flask-api-utils - Flask extension that takes care of API representation and authentication.
- falcon - A high-performance Python framework for building cloud APIs and web app backends.
- eve - REST API framework powered by Flask, MongoDB and good intentions.
- sandman - Automated REST APIs for existing database-driven systems.
- restless - Framework agnostic REST framework based on lessons learned from TastyPie.
- savory-pie - REST API building library (django, and others)
- ripozo - A tool for quickly creating REST/HATEOAS/Hypermedia APIs with extensions for Flask and Django.
Authentication
- OAuth
- Authomatic - Simple but powerful framework agnostic authentication/authorization client package.
- OAuthLib - A generic, spec-compliant, thorough implementation of the OAuth request-signing logic.
- rauth - A Python library for OAuth 1.0/a, 2.0, and Ofly.
- python-oauth2 - A fully tested, abstract interface to creating OAuth clients and servers.
- python-social-auth - An easy-to-setup social authentication mechanism.
- django-oauth-toolkit - OAuth2 goodies for the Djangonauts.
- django-oauth2-provider - Providing OAuth2 access to Django app.
- django-allauth - Authentication app for Django that "just works."
- Flask-OAuthlib - OAuth 1.0/a, 2.0 implementation of client and provider for Flask.
- sanction - A dead simple OAuth2 client implementation.
- Others
- PyJWT - Implementation of the JSON Web Token draft 01.
- python-jwt - Module for generating and verifying JSON Web Tokens.
- python-jws - Implementation of JSON Web Signatures draft 02.
Template Engine
- Jinja2 - A modern and designer friendly templating language.
- Genshi - Python templating toolkit for generation of web-aware output.
- Mako - Hyperfast and lightweight templating for the Python platform.
- Chameleon - An HTML/XML template engine. Modeled after ZPT, optimized for speed.
- Spitfire - A very fast Python template compiler.
Queue
- celery - An asynchronous task queue/job queue based on distributed message passing.
- huey - Little multi-threaded task queue.
- mrq - Mr. Queue - A distributed worker task queue in Python using Redis & gevent.
- rq - Simple job queues for Python.
- simpleq - A simple, infinitely scalable, Amazon SQS based queue.
Search
- django-haystack - Modular search for Django.
- elasticsearch-py - The official low-level Python client for Elasticsearch.
- elasticsearch-dsl-py - The official high-level Python client for Elasticsearch.
- solrpy - A Python client for solr.
- Whoosh - A fast, pure Python search engine library.
News Feed
- Feedly - A library to build newsfeed and notification systems using Cassandra and Redis.
- django-activity-stream - Generate generic activity streams from the actions on your site.
Asset Management
- jinja-assets-compressor - A Jinja extension to compile and compress your assets.
- webassets - Bundles, optimizes, and manages unique cache-busting URLs for static resources.
- fanstatic - Packages, optimizes, and serves static file dependencies as Python packages.
- fileconveyor - Monitors changes, processes, and transports assets to CDNs and file storage systems.
- django-storages - A collection of custom storage back ends for Django.
- glue - Glue is a simple command line tool to generate CSS sprites.
- libsass-python - A Python binding of libsass, the reference implementation of SASS/SCSS.
- Flask-Assets - Helps you integrate webassets into your Flask app.
Caching
- Beaker - A library for caching and sessions for use with web applications and stand-alone Python scripts and applications.
- dogpile.cache - A next-generation replacement for Beaker, made by the same authors.
- HermesCache - Python caching library with tag-based invalidation and dogpile effect prevention.
- django-cache-machine - Automatic caching and invalidation for Django models through the ORM.
- django-cacheops - A slick ORM cache with automatic granular event-driven invalidation.
- johnny-cache - A caching framework for django applications.
- django-viewlet - Render template parts with extended cache control.
- pylibmc - A Python wrapper around the libmemcached interface.
Email
- inbox.py - Python SMTP Server for Humans.
- imbox - Python IMAP for Humans.
- inbox - The open source email toolkit.
- lamson - Pythonic SMTP Application Server.
- flanker - An email address and MIME parsing library.
- marrow.mailer - High-performance extensible mail delivery framework.
- django-celery-ses - Django email back end with AWS SES and Celery.
- modoboa - A mail hosting and management platform including a modern and simplified Web UI.
- envelopes - Mailing for human beings.
- mailjet - Mailjet API implementation for batch mailing, statistics and more.
- Talon - Mailgun library to extract message quotations and signatures.
- pyzmail - Compose, send and parse emails.
Internationalization
URL Manipulation
- furl - A small Python library that makes manipulating URLs simple.
- purl - A simple, immutable URL class with a clean API for interrogation and manipulation.
- pyshorteners - A pure Python URL shortening lib.
- short_url - Python implementation for generating Tiny URL and bit.ly-like URLs.
- webargs - A friendly library for parsing HTTP request arguments, with built-in support for popular web frameworks, including Flask, Django, Bottle, Tornado, and Pyramid.
HTML Manipulation
- BeautifulSoup - Providing Pythonic idioms for iterating, searching, and modifying HTML or XML.
- lxml - A very fast, easy-to-use and versatile library for handling HTML and XML.
- html5lib - A standards-compliant library for parsing and serializing HTML documents and fragments.
- pyquery - A jQuery-like library for parsing HTML.
- cssutils - A CSS library for Python.
- MarkupSafe - Implements an XML/HTML/XHTML markup-safe string for Python.
- bleach - A whitelist-based HTML sanitization and text linkification library.
- xmltodict - Makes working with XML feel like working with JSON.
- xhtml2pdf - HTML/CSS to PDF converter.
- untangle - Converts XML documents to Python objects for easy access.
Web Crawling
- Scrapy - A fast high-level screen scraping and web crawling framework.
- portia - Visual scraping for Scrapy.
- feedparser - Universal feed parser.
- RoboBrowser - A simple, Pythonic library for browsing the web without a standalone web browser.
- MechanicalSoup - A Python library for automating interaction with websites.
- mechanize - Stateful programmatic web browsing.
- Demiurge - PyQuery-based scraping micro-framework.
- cola - A distributed crawling framework.
- pyspider - A powerful spider system.
- Grab - Site scraping framework.
Web Content Extracting
- newspaper - News extraction, article extraction and content curation in Python.
- html2text - Convert HTML to Markdown-formatted text.
- python-goose - HTML Content/Article Extractor.
- lassie - Web Content Retrieval for Humans.
- micawber - A small library for extracting rich content from URLs.
- sumy - A module for automatic summarization of text documents and HTML pages.
- Haul - An Extensible Image Crawler.
- python-readability - Fast Python port of arc90's readability tool.
- opengraph - A Python module to parse the Open Graph Protocol
- textract - Extract text from any document, Word, PowerPoint, PDFs, etc.
- sanitize - Bringing sanity to the world of messed-up data.
Forms
- WTForms - A flexible forms validation and rendering library.
- WTForms-JSON - A WTForms extension for JSON data handling.
- Deform - Python HTML form generation library influenced by the formish form generation library.
- django-bootstrap3 - Bootstrap 3 integration with Django.
- django-crispy-forms - A Django app which lets you create beautiful forms in a very elegant and DRY way.
- django-remote-forms - A platform independent Django form serializer.
Data Validation
- Cerberus - A mappings-validator with a variety of rules, normalization-features and simple customization that uses a pythonic schema-definition.
- voluptuous - A Python data validation library. It is primarily intended for validating data coming into Python as JSON, YAML, etc.
- colander - A system for validating and deserializing data obtained via XML, JSON, an HTML form post or any other equally simple data serialization.
- schema - A library for validating Python data structures.
- Schematics - Data Structure Validation.
- kmatch - A language for matching/validating/filtering Python dictionaries.
- valideer - Lightweight extensible data validation and adaptation library.
Anti-spam
- django-simple-spam-blocker - Simple spam blocker for Django.
- django-simple-captcha - A simple and highly customizable Django app to add captcha images to any Django form.
Tagging
- django-taggit - Simple tagging for Django.
Admin Panels
- Ajenti - The admin panel your servers deserve.
- Grappelli - A jazzy skin for the Django Admin-Interface.
- django-suit - Alternative Django Admin-Interface (free only for Non-commercial use).
- django-xadmin - Drop-in replacement of Django admin comes with lots of goodies.
- flask-admin - Simple and extensible administrative interface framework for Flask.
- flower - Real-time monitor and web admin for Celery.
Static Site Generator
- Pelican - Uses Markdown or ReST for content and Jinja 2 for themes. Supports DVCS, Disqus. AGPL.
- Cactus - Static site generator for designers.
- Hyde - Jinja2-based static web site generator.
- Nikola - A static website and blog generator.
- Tinkerer - Tinkerer is a blogging engine/static website generator powered by Sphinx.
Processes and Threads
- multiprocessing - (Python standard library) Process-based "threading" interface.
- threading - (Python standard library) Higher-level threading interface.
- envoy - Python Subprocesses for Humans™.
- sh - A full-fledged subprocess replacement for Python.
- sarge - A wrapper for subprocess.
Concurrency and Networking
- asyncio - (Python standard library in Python 3.4+) Asynchronous I/O, event loop, coroutines and tasks.
- gevent - A coroutine-based Python networking library that uses greenlet.
- Twisted - An event-driven networking engine.
- Tornado - A Web framework and asynchronous networking library.
- pulsar - Event-driven concurrent framework for Python.
- diesel - Greenlet-based event I/O Framework for Python.
- eventlet - Asynchronous framework with WSGI support.
- pyzmq - A Python wrapper for the 0MQ message library.
- txZMQ - Twisted based wrapper for the 0MQ message library.
- Crossbar - Open-source Unified Application Router (Websocket & WAMP for Python on Autobahn).
WebSocket
- AutobahnPython - WebSocket & WAMP for Python on Twisted and asyncio.
- WebSocket-for-Python - WebSocket client and server library for Python 2 and 3 as well as PyPy.
WSGI Servers
- uwsgi - A project aiming to develop a full stack for building hosting services, written in C.
- Werkzeug - A WSGI utility library for Python that powers Flask and can easily be embedded into your own projects.
- paste - Multi-threaded, stable, tried and tested.
- rocket - Multi-threaded.
- waitress - Multi-threaded, powers Pyramid.
- netius - Asynchronous, very fast.
- gunicorn - Pre-forked, partly written in C.
- fapws3 - Asynchronous (network side only), written in C.
- meinheld - Asynchronous, partly written in C.
- bjoern - Asynchronous, very fast and written in C.
RPC Servers
- SimpleXMLRPCServer - (Python standard library) Simple XML-RPC server implementation, single-threaded.
- SimpleJSONRPCServer - This library is an implementation of the JSON-RPC specification.
- zeroRPC - zerorpc is a flexible RPC implementation based on ZeroMQ and MessagePack.
Cryptography
- PyCrypto - The Python Cryptography Toolkit.
- Paramiko - A Python (2.6+, 3.3+) implementation of the SSHv2 protocol, providing both client and server functionality.
- cryptography - A package designed to expose cryptographic primitives and recipes to Python developers.
- PyNacl - Python binding to the Networking and Cryptography (NaCl) library.
- hashids - Implementation of hashids in Python.
- Passlib - Secure password storage/hashing library, very high level.
GUI
- PyQt - Python bindings for the Qt cross-platform application and UI framework, with support for both Qt v4 and Qt v5 frameworks.
- PySide - Python bindings for the Qt cross-platform application and UI framework, supporting the Qt v4 framework.
- wxPython - A blending of the wxWidgets C++ class library with Python.
- kivy - A library for creating NUI applications, running on Windows, Linux, Mac OS X, Android and iOS.
- curses - Built-in wrapper for ncurses used to create terminal GUI applications.
- urwid - A library for creating terminal GUI applications with strong support for widgets, events, rich colors, etc.
- pyglet - A cross-platform windowing and multimedia library for Python.
- Tkinter - Tkinter is Python's de-facto standard GUI package.
- enaml - Creating beautiful user interfaces with declarative syntax like QML.
- Toga - A Python native, OS native GUI toolkit.
Game Development
- Pygame - Pygame is a set of Python modules designed for writing games.
- Cocos2d - cocos2d is a framework for building 2D games, demos, and other graphical/interactive applications. It is based on pyglet.
- PySDL2 - A ctypes based wrapper for the SDL2 library.
- Panda3D - 3D game engine developed by Disney and maintained by Carnegie Mellon's Entertainment Technology Center. Written in C++, completely wrapped in Python.
- PyOgre - Python bindings for the Ogre 3D render engine, can be used for games, simulations, anything 3D.
- PyOpenGL - Python ctypes bindings for OpenGL and its related APIs.
- PySFML - Python bindings for SFML
- RenPy - A Visual Novel engine.
Logging
- logging - (Python standard library) Logging facility for Python.
- logbook - Logging replacement for Python.
- Sentry - A realtime logging and aggregation server.
- Raven - The Python client for Sentry.
- Eliot - Logging for complex & distributed systems.
Testing
- Testing Frameworks
- unittest - (Python standard library) Unit testing framework.
- nose - nose extends unittest.
- pytest - A mature full-featured Python testing tool.
- mamba - The definitive testing tool for Python. Born under the banner of BDD.
- contexts - A BDD framework for Python 3.3+. Inspired by C#'s Machine.Specifications.
- pyshould - Should style asserts based on PyHamcrest.
- pyvows - BDD style testing for Python. Inspired by Vows.js.
- hypothesis - Hypothesis is an advanced Quickcheck style property based testing library.
- Web Testing
- Mock
- mock - A Python Mocking and Patching Library for Testing.
- responses - A utility library for mocking out the requests Python library.
- doublex - Powerful test doubles framework for Python.
- freezegun - Travel through time by mocking the datetime module.
- httpretty - HTTP request mock tool for Python.
- httmock - A mocking library for requests for Python 2.6+ and 3.2+.
- Code Coverage
- coverage - Code coverage measurement.
- Fake Data
- faker - A Python package that generates fake data.
- fake2db - Fake database generator.
- factory_boy - A test fixtures replacement for Python.
- mixer - Another fixtures replacement. Supports Django, Flask, SQLAlchemy, Peewee, etc.
- model_mommy - Creating random fixtures for testing in Django.
- radar - Generate random datetime / time.
- Error Handler
- FuckIt.py - FuckIt.py uses state-of-the-art technology to make sure your Python code runs whether it has any right to or not.
Code Analysis and Linter
- Code Analysis
- pysonar2 - A type inferencer and indexer for Python.
- pycallgraph - A library that visualises the flow (call graph) of your Python application.
- code2flow - Turn your Python and JavaScript code into DOT flowcharts.
- Linter
Debugging Tools
- pdb - (Python standard library) The Python Debugger.
- ipdb - IPython-enabled pdb.
- wdb - An improbable web debugger through WebSockets.
- winpdb - A Platform Independent Python Debugger with GUI, capable of remote debugging based on rpdb2.
- pudb - A full-screen, console-based Python debugger.
- pyringe - Debugger capable of attaching to and injecting code into Python processes.
- python-statsd - Python Client for the statsd server.
- memory_profiler - Monitor Memory usage of Python code.
- profiling - An interactive Python profiler.
- django-debug-toolbar - Display various debug information about the current request/response.
- pyelftools - A pure-Python library for parsing and analyzing ELF files and DWARF debugging information.
Science and Data Analysis
- SciPy - A Python-based ecosystem of open-source software for mathematics, science, and engineering.
- NumPy - A fundamental package for scientific computing with Python.
- Numba - Python JIT (just in time) compiler to LLVM aimed at scientific Python by the developers of Cython and NumPy.
- NetworkX - High-productivity software for complex networks.
- Pandas - A library providing high-performance, easy-to-use data structures and data analysis tools.
- Open Mining - Business Intelligence (BI) in Python (Pandas web interface)
- PyMC - Markov Chain Monte Carlo sampling toolkit.
- zipline - A Pythonic algorithmic trading library.
- PyDy - Short for Python Dynamics, used to assist with workflow in the modeling of dynamic motion based around NumPy, SciPy, IPython, and matplotlib.
- SymPy - A Python library for symbolic mathematics.
- statsmodels - Statistical modeling and econometrics in Python.
- astropy - A community Python library for Astronomy.
- orange - Data mining, data visualization, analysis and machine learning through visual programming or Python scripting.
- RDKit - Cheminformatics and Machine Learning Software.
- Open Babel - A chemical toolbox designed to speak the many languages of chemical data.
- cclib - A library for parsing and interpreting the results of computational chemistry packages.
- Biopython - Biopython is a set of freely available tools for biological computation.
- bccb - Collection of useful code related to biological analysis.
- bcbio-nextgen - A toolkit providing best-practice pipelines for fully automated high throughput sequencing analysis.
- blaze - NumPy and Pandas interface to Big Data.
Data Visualization
- matplotlib - A Python 2D plotting library.
- bokeh - Interactive Web Plotting for Python.
- plotly - Collaborative web plotting for Python and matplotlib.
- vincent - A Python to Vega translator.
- d3py - A plotting library for Python, based on D3.js.
- ggplot - Same API as ggplot2 for R.
- Kartograph.py - Rendering beautiful SVG maps in Python.
- pygal - A Python SVG Charts Creator.
- pygraphviz - Python interface to Graphviz.
- PyQtGraph - Interactive and realtime 2D/3D/Image plotting and science/engineering widgets.
- VisPy - High-performance scientific visualization based on OpenGL.
Computer Vision
- OpenCV - Open Source Computer Vision Library.
- SimpleCV - An open source framework for building computer vision applications.
Machine Learning
- scikit-learn - A Python module for machine learning built on top of SciPy.
- pattern - Web mining module for Python.
- NuPIC - Numenta Platform for Intelligent Computing.
- Pylearn2 - A Machine Learning library based on Theano.
- hebel - GPU-Accelerated Deep Learning Library in Python.
- gensim - Topic Modelling for Humans.
- PyBrain - Another Python Machine Learning Library.
- Crab - A flexible, fast recommender engine.
- python-recsys - A Python library for implementing a Recommender System.
- vowpal_porpoise - A lightweight Python wrapper for Vowpal Wabbit.
MapReduce
- PySpark - The Spark Python API.
- dpark - Python clone of Spark, a MapReduce-like framework in Python.
- luigi - A module that helps you build complex pipelines of batch jobs.
- mrjob - Run MapReduce jobs on Hadoop or Amazon Web Services.
- dumbo - Python module that allows one to easily write and run Hadoop programs.
- streamparse - Run Python code against real-time streams of data. Integrates with Apache Storm.
Functional Programming
- fn.py - Functional programming in Python: implementation of missing features to enjoy FP.
- funcy - Fancy and practical functional tools.
- Toolz - A collection of functional utilities for iterators, functions, and dictionaries.
- CyToolz - Cython implementation of Toolz: High performance functional utilities.
Third-party APIs
- apache-libcloud - One Python library for all clouds.
- boto - Python interface to Amazon Web Services.
- twython - A Python wrapper for the Twitter API.
- google-api-python-client - Google APIs Client Library for Python.
- gspread - Google Spreadsheets Python API.
- facebook-sdk - Facebook Platform Python SDK.
- facepy - Facepy makes it really easy to interact with Facebook's Graph API
- gmail - A Pythonic interface for Gmail.
DevOps Tools
- OpenStack - Open source software for building private and public clouds.
- Ansible - A radically simple IT automation platform.
- SaltStack - Infrastructure automation and management system.
- Fabric - A simple, Pythonic tool for remote execution and deployment.
- Fabtools - Tools for writing awesome Fabric files.
- cuisine - Chef-like functionality for Fabric.
- psutil - A cross-platform process and system utilities module.
- pexpect - Controlling interactive programs in a pseudo-terminal like GNU expect.
- provy - An easy-to-use provisioning system in Python.
- honcho - A Python port of Foreman, a tool for managing Procfile-based applications.
- gunnery - Multipurpose task execution tool for distributed systems with web-based interface.
- Docker-Compose - Fast, isolated development environments using Docker.
- hgapi - Pure-Python API for Mercurial.
- gitapi - Pure-Python API for git.
- supervisor - Supervisor process control system for UNIX.
Job Scheduler
- APScheduler - A light but powerful in-process task scheduler that lets you schedule functions.
- doit - A task runner/build tool.
- Joblib - A set of tools to provide lightweight pipelining in Python.
- Plan - Writing crontab file in Python like a charm.
- Spiff - A powerful workflow engine implemented in pure Python.
- schedule - Python job scheduling for humans.
- TaskFlow - A Python library that helps to make task execution easy, consistent and reliable.
Foreign Function Interface
- ctypes - (Python standard library) Foreign Function Interface for Python calling C code.
- cffi - Foreign Function Interface for Python calling C code.
- SWIG - Simplified Wrapper and Interface Generator.
- PyCUDA - A Python wrapper for Nvidia's CUDA API.
High Performance
- Cython - Optimizing Static Compiler for Python. Uses type mixins to compile Python into C or C++ modules resulting in large performance gains.
- PyPy - An implementation of Python in Python. The interpreter uses black magic to make Python very fast without having to add in additional type information.
- Stackless Python - An enhanced version of Python.
- Pyston - A Python implementation built using LLVM and modern JIT techniques with the goal of achieving good performance.
Microsoft Windows
- PyWin32 - Python Extensions for Windows.
- PythonNet - Python Integration with the .NET Common Language Runtime (CLR).
- pythonlibs - Unofficial Windows binaries for Python extension packages.
- spyder - IDE for the Python language with advanced editing, interactive testing, debugging and introspection features (also comes with Anaconda, WinPython).
- Python(x,y) - Scientific-applications-oriented Python Distribution based on Qt and Spyder.
- WinPython - Portable development environment for Windows 7/8.
Network Virtualization and SDN
- Mininet - A popular network emulator and API written in Python.
- POX - An open source development platform for Python-based Software Defined Networking (SDN) control applications, such as OpenFlow SDN controllers.
- Pyretic - A member of the Frenetic family of SDN programming languages that provides powerful abstractions over network switches or emulators.
- SDX Platform - SDN based IXP implementation that leverages Mininet, POX and Pyretic.
Hardware
- PyUserInput - A module for cross-platform control of the mouse and keyboard.
- wifi - A Python library and command line tool for working with WiFi on Linux.
- scapy - A brilliant packet manipulation library.
- ino - Command line toolkit for working with Arduino.
- Pyro - Python Robotics.
Compatibility
- Six - Python 2 and 3 compatibility utilities.
- Python-Future - The missing compatibility layer between Python 2 and Python 3.
- Python-Modernize - Modernizes Python code for eventual Python 3 migration.
Miscellaneous
- pluginbase - A simple but flexible plugin system for Python.
- itsdangerous - Various helpers to pass trusted data to untrusted environments.
- blinker - A fast Python in-process signal/event dispatching system.
- Pychievements - A framework for creating and tracking achievements.
Algorithms and Design Patterns
- python-patterns - A collection of design patterns in Python.
- algorithms - A module of algorithms for Python.
Editor Plugins
- Vim
- Python-mode - An all in one plugin for turning Vim into a Python IDE.
- Jedi-vim - Vim bindings for the Jedi auto-completion library for Python.
- YouCompleteMe - Includes Jedi-based completion engine for Python
- Emacs
- Elpy - Emacs Python Development Environment.
- Sublime Text
- SublimeJEDI - A Sublime Text plugin to the awesome auto-complete library Jedi.
- Anaconda - Anaconda turns your Sublime Text 3 into a full-featured Python development IDE.
- Atom
- Linter - A static code analysis tool for Atom.
- Linter-flake8 - An add-on to linter that acts as an interface for flake8.
- virtualenv - Atom package for virtualenv management.
- keras - Modular neural network library based on Theano.
- breze - Theano based library for deep and recurrent neural networks
- pyhsmm - library for approximate unsupervised inference in Bayesian Hidden Markov Models (HMMs) and explicit-duration Hidden semi-Markov Models (HSMMs), focusing on the Bayesian Nonparametric extensions, the HDP-HMM and HDP-HSMM, mostly with weak-limit approximations.
- mrjob - A library that lets Python programs run on Hadoop.
- Spearmint - Spearmint is a package to perform Bayesian optimization according to the algorithms outlined in the paper: Practical Bayesian Optimization of Machine Learning Algorithms. Jasper Snoek, Hugo Larochelle and Ryan P. Adams. Advances in Neural Information Processing Systems, 2012.
- XGBoost.R - R binding for eXtreme Gradient Boosting (Tree) Library
- rgp - rgp: R genetic programming framework
- h2o - A framework for fast, parallel, and distributed machine learning algorithms at scale -- Deeplearning, Random forests, GBM, KMeans, PCA, GLM
- caret - Classification and Regression Training: Unified interface to ~150 ML algorithms in R.
- caretEnsemble - caretEnsemble: Framework for fitting multiple caret models as well as creating ensembles of such models.
- bigRR - bigRR: Generalized Ridge Regression (with special advantage for p >> n cases)
- bmrm - bmrm: Bundle Methods for Regularized Risk Minimization Package
- GreatCircle - Library for calculating great circle distance.
- climin - Optimization library focused on machine learning, pythonic implementations of gradient descent, LBFGS, rmsprop, adadelta and others
- Allen Downey’s Data Science Course - Code for Data Science at Olin College, Spring 2014.
- Allen Downey’s Think Bayes Code - Code repository for Think Bayes.
- Allen Downey’s Think Complexity Code - Code for Allen Downey's book Think Complexity.
- Allen Downey’s Think OS Code - Text and supporting code for Think OS: A Brief Introduction to Operating Systems.
- python-timbl - A Python extension module wrapping the full TiMBL C++ programming interface. Timbl is an elaborate k-Nearest Neighbours machine learning toolkit.
- mlxtend - A library consisting of useful tools for data science and machine learning tasks.
- neon - Nervana's high-performance Python-based Deep Learning framework
- Theano - An optimizing compiler for array-oriented math expressions in Python, with GPU code generation.
- Petrel - Tools for writing, submitting, debugging, and monitoring Storm topologies in pure Python.
- emcee - The Python ensemble sampling toolkit for affine-invariant MCMC.
- windML - A Python Framework for Wind Energy Analysis and Prediction
- vispy - GPU-based high-performance interactive OpenGL 2D/3D data visualization library
- cerebro2 - A web-based visualization and debugging platform for NuPIC.
- NuPIC Studio - An all-in-one NuPIC Hierarchical Temporal Memory visualization and debugging super-tool!
- PyQtGraph - A pure-python graphics and GUI library built on PyQt4 / PySide and NumPy.
Sequence Analysis
- ToPS - An object-oriented framework that facilitates the integration of probabilistic models for sequences over a user-defined alphabet.
IDEs
- PyCharm - Commercial Python IDE based on the IntelliJ platform by JetBrains. Has free community edition available.
- Komodo - Commercial polyglot IDE with support for Python.
- LiClipse - Free polyglot IDE based on Eclipse. Uses PyDev for Python support.
- Spyder - Open Source Python IDE.
- WingIDE - Commercial IDE for Python.
Resources
Websites
- r/Python - News about Python.
- Python 3 Wall of Superpowers - Too many popular Python packages don't support Python 3.
- Trending Python repositories on GitHub today - Good place to find new Python libraries.
- Python Hackers - List of top 400 projects on GitHub.
- CoolGithubProjects - Sharing cool github projects just got easier!
- Full Stack Python - Plain English explanations for every layer of the Python web application stack.
Weekly
Blogs/Podcasts
- Hacker News for Data Science
- The O'Reilly Data Show
- Partially Derivative
- The Talking Machines
- The Data Skeptic
- Linear Digressions
- Data Stories
- Learning Machines 101
Data Science / Statistics
- http://iamtrask.github.io/
- http://blog.explainmydata.com/
- http://andrewgelman.com/
- http://simplystatistics.org/
- http://www.evanmiller.org/
- http://jakevdp.github.io/
- http://blog.yhathq.com/
- http://blog.wesmckinney.com/
- http://www.overkillanalytics.net/
- http://newton.cx/~peter/
- http://mbakker7.github.io/exploratory_computing_with_python/
- http://sebastianraschka.com/articles.html
- http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/
- http://colah.github.io/
- http://snippyhollow.github.io/
- http://www.thomasdimson.com/
- http://blog.smellthedata.com/
- http://sebastianraschka.com/
- http://dogdogfish.com/
- http://www.johnmyleswhite.com/
- http://drewconway.com/zia/
- http://bugra.github.io/
- http://opendata.cern.ch/
- http://alexanderetz.com/
- http://www.sumsar.net/
- http://countbayesie.com
- http://karpathy.github.io/
- http://blog.dato.com/
- http://blog.kaggle.com/
- http://www.danvk.org/
- http://hunch.net/
Maths
- http://www.sumsar.net/
- http://allendowney.blogspot.ca/
- http://healthyalgorithms.com/
- http://petewarden.com/
- http://mrtz.org/blog/
Security Related
Books
Machine-Learning / Data Mining
- An Introduction To Statistical Learning - Book + R Code
- Elements of Statistical Learning
- Probabilistic Programming & Bayesian Methods for Hackers - Book + IPython Notebooks
- Think Bayes - Book + Python Code
- Information Theory, Inference, and Learning Algorithms
- Gaussian Processes for Machine Learning
- Data Intensive Text Processing w/ MapReduce
- Reinforcement Learning: An Introduction
- Mining Massive Datasets
- A First Encounter with Machine Learning
- Pattern Recognition and Machine Learning
- Machine Learning & Bayesian Reasoning
- Introduction to Machine Learning - Alex Smola and S.V.N. Vishwanathan
- A Probabilistic Theory of Pattern Recognition
- Introduction to Information Retrieval
- Forecasting: principles and practice
- Introduction to Machine Learning - Amnon Shashua
- Reinforcement Learning
- Machine Learning
- A Quest for AI
- Introduction to Applied Bayesian Statistics and Estimation for Social Scientists
- Bayesian Modeling, Inference and Prediction
- A Course in Machine Learning
- Machine Learning, Neural and Statistical Classification
Natural Language Processing
- Coursera Course Book on NLP
- NLTK
- NLP w/ Python
- Foundations of Statistical Natural Language Processing
Neural Networks
Probability & Statistics
- Think Stats - Book + Python Code
- From Algorithms to Z-Scores
- The Art of R Programming
- All of Statistics
- Introduction to Statistical Thought
- Basic Probability Theory
- Introduction to probability
- Principle of Uncertainty
- Probability & Statistics Cookbook
- Advanced Data Analysis From An Elementary Point of View
- Introduction to Probability - Book and course by MIT
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction.
- An Introduction to Statistical Learning with Applications in R
- Learning Statistics Using R
Linear Algebra
- Linear Algebra Done Wrong
- Linear Algebra, Theory, and Applications
- Convex Optimization
- Applied Numerical Computing
- Applied Numerical Linear Algebra