Python Data Science Handbook

Three-Dimensional Plotting in Matplotlib
October 26, 2020
Matplotlib was initially designed with only two-dimensional plotting in mind. Around the time of the 1.0 release, some three-dimensional plotting utilities were built on top of Matplotlib’s two-dimensional display, and the result is a convenient (if somewhat limited) set of tools for three-dimensional data visualization. Three-dimensional plots are enabled by importing the mplot3d toolkit, included with the main Matplotlib installation (since Matplotlib 3.2 the '3d' projection is registered automatically, so the explicit import is required only on older versions):
In [1]:
from mpl_toolkits import mplot3d
Once this submodule is imported, a three-dimensional axes can be created by passing the keyword projection='3d' to any of the normal axes creation routines:
In [2]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
In [3]:
fig = plt.figure()
ax = plt.axes(projection='3d')
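The phrase "any of the normal axes creation routines" can be made concrete with a minimal sketch. It assumes the three routines most readers will reach for (plt.axes, Figure.add_subplot, and plt.subplots); the variable names are chosen for illustration only.

```python
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d  # only required on Matplotlib older than 3.2

# Option 1: pyplot-style creation, as used throughout this section
fig1 = plt.figure()
ax1 = plt.axes(projection='3d')

# Option 2: object-oriented creation on an explicit figure
fig2 = plt.figure()
ax2 = fig2.add_subplot(1, 1, 1, projection='3d')

# Option 3: a row of 3D axes; subplot_kw is forwarded to each subplot
fig3, axes3 = plt.subplots(1, 2, subplot_kw={'projection': '3d'})

# Every axes created this way reports the projection name '3d'
print(ax1.name, ax2.name, [a.name for a in axes3])
```

Which routine to use is largely a matter of taste; the plt.subplots form is convenient when several three-dimensional panels should sit side by side.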
With this three-dimensional axes enabled, we can now plot a variety of three-dimensional plot types. Three-dimensional plotting is one of the functionalities that benefits immensely from viewing figures interactively rather than statically in the notebook; recall that to use interactive figures, you can use %matplotlib notebook rather than %matplotlib inline when running this code.
Three-Dimensional Points and Lines
The most basic three-dimensional plot is a line or a scatter plot created from sets of (x, y, z) triples. In analogy with the more common two-dimensional plots discussed earlier, these can be created using the ax.plot3D and ax.scatter3D functions. The call signature for these is nearly identical to that of their two-dimensional counterparts, so you can refer to Simple Line Plots and Simple Scatter Plots for more information on controlling the output. Here we’ll plot a trigonometric spiral, along with some points drawn randomly near the line:
In [4]:
ax = plt.axes(projection='3d')

# Data for a three-dimensional line
zline = np.linspace(0, 15, 1000)
xline = np.sin(zline)
yline = np.cos(zline)
ax.plot3D(xline, yline, zline, 'gray')

# Data for three-dimensional scattered points
zdata = 15 * np.random.random(100)
xdata = np.sin(zdata) + 0.1 * np.random.randn(100)
ydata = np.cos(zdata) + 0.1 * np.random.randn(100)
ax.scatter3D(xdata, ydata, zdata, c=zdata, cmap='Greens');
Notice that by default, the scatter points have their transparency adjusted to give a sense of depth on the page. While the three-dimensional effect is sometimes difficult to see within a static image, an interactive view can lead to some nice intuition about the layout of the points.
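The depth cue mentioned here is controlled by the depthshade argument of the three-dimensional scatter method. As a rough sketch, the following draws the same points with and without it; the fixed random seed and the side-by-side layout are assumptions made for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d  # only required on Matplotlib older than 3.2

rng = np.random.RandomState(0)  # fixed seed, chosen only for reproducibility
zdata = 15 * rng.random_sample(100)
xdata = np.sin(zdata) + 0.1 * rng.randn(100)
ydata = np.cos(zdata) + 0.1 * rng.randn(100)

fig, (ax_shaded, ax_flat) = plt.subplots(1, 2, subplot_kw={'projection': '3d'})

# Default: marker opacity varies with distance from the viewer
ax_shaded.scatter3D(xdata, ydata, zdata, c=zdata, cmap='Greens')
ax_shaded.set_title('depthshade=True (default)')

# depthshade=False draws every marker with the same opacity
ax_flat.scatter3D(xdata, ydata, zdata, c=zdata, cmap='Greens',
                  depthshade=False)
ax_flat.set_title('depthshade=False')
```

Turning the shading off can be useful when the colormap itself already encodes depth and the extra transparency would make the colors harder to read.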
Three-Dimensional Contour Plots
Analogous to the contour plots we explored in Density and Contour Plots, mplot3d contains tools to create three-dimensional relief plots using the same inputs. Like two-dimensional ax.contour plots, ax.contour3D requires all the input data to be in the form of two-dimensional regular grids, with the Z data evaluated at each point. Here we’ll show a three-dimensional contour diagram of a three-dimensional sinusoidal function:
In [5]:
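def f(x, y):
    return np.sin(np.sqrt(x ** 2 + y ** 2))

# Sample a square region centered on the origin
x = np.linspace(-6, 6, 30)
y = np.linspace(-6, 6, 30)

# Expand the 1D coordinates into 2D grids and evaluate f on them
X, Y = np.meshgrid(x, y)
Z = f(X, Y)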
In [6]:
fig = plt.figure()
ax = plt.axes(projection='3d')
ax.contour3D(X, Y, Z, 50, cmap='binary')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z');
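The requirement that all inputs be two-dimensional regular grids can be checked directly. The sketch below deliberately uses a different number of samples along x and y (an assumption made purely for illustration) so that the orientation of the grids is visible in their shapes.

```python
import numpy as np

def f(x, y):
    return np.sin(np.sqrt(x ** 2 + y ** 2))

x = np.linspace(-6, 6, 30)  # 30 sample positions along x
y = np.linspace(-6, 6, 40)  # 40 along y, deliberately different from x

# meshgrid expands the 1D coordinate vectors into matching 2D grids
X, Y = np.meshgrid(x, y)
Z = f(X, Y)

# All three arrays share the shape (len(y), len(x))
print(X.shape, Y.shape, Z.shape)
```

Rows of the grids therefore correspond to y values and columns to x values, which is the layout the gridded plotting functions in this section expect.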
Sometimes the default viewing angle is not optimal, in which case we can use the view_init method to set the elevation and azimuthal angles. In the following example, we’ll use an elevation of 60 degrees (that is, 60 degrees above the x-y plane) and an azimuth of 35 degrees (that is, rotated 35 degrees counter-clockwise about the z-axis):
In [7]:
ax.view_init(60, 35)
fig
Again, note that this type of rotation can be accomplished interactively by clicking and dragging when using one of Matplotlib’s interactive backends.
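When setting the view from code, keyword arguments make the role of each angle explicit, and the current view can be read back from the axes. This is a small sketch on an empty axes; the elev and azim attribute names are those exposed by the mplot3d axes object.

```python
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d  # only required on Matplotlib older than 3.2

fig = plt.figure()
ax = plt.axes(projection='3d')

# Same view as ax.view_init(60, 35), with the angles named
ax.view_init(elev=60, azim=35)

# The current viewing angles are stored on the axes, in degrees
print(ax.elev, ax.azim)
```

Reading the angles back is handy after rotating a figure interactively: once a good view is found, its values can be copied into a view_init call to reproduce it in a static figure.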
Wireframes and Surface Plots
Two other types of three-dimensional plots that work on gridded data are wireframes and surface plots. These take a grid of values and project it onto the specified three-dimensional surface, and can make the resulting three-dimensional forms quite easy to visualize. Here’s an example of using a wireframe:
In [8]:
fig = plt.figure()
ax = plt.axes(projection='3d')
ax.plot_wireframe(X, Y, Z, color='black')
ax.set_title('wireframe');
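On dense grids a wireframe can become too crowded to read. One way to thin it, sketched here under the assumption that drawing every third grid line is acceptable, is to pass the rstride and cstride arguments, which set the row and column step used when drawing the mesh.

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d  # only required on Matplotlib older than 3.2
from mpl_toolkits.mplot3d.art3d import Line3DCollection

x = np.linspace(-6, 6, 30)
y = np.linspace(-6, 6, 30)
X, Y = np.meshgrid(x, y)
Z = np.sin(np.sqrt(X ** 2 + Y ** 2))

fig = plt.figure()
ax = plt.axes(projection='3d')

# Draw only every third grid row and column to thin out the mesh
wire = ax.plot_wireframe(X, Y, Z, rstride=3, cstride=3, color='black')
ax.set_title('wireframe, every third grid line')

# The returned artist is a collection of 3D line segments
print(type(wire).__name__)
```

The strides only affect which grid lines are drawn; the underlying X, Y, and Z arrays are left unchanged.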
A surface plot is like a wireframe plot, but each face of the wireframe is a filled polygon. Adding a colormap to the filled polygons can aid perception of the topology of the surface being visualized:
In [9]:
ax = plt.axes(projection='3d')
ax.plot_surface(X, Y, Z, rstride=1, cstride=1,
                cmap='viridis', edgecolor='none')
ax.set_title('surface');
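Because the surface colors encode the z values, a colorbar makes the mapping readable. The following sketch keeps the object returned by plot_surface and passes it to fig.colorbar; the shrink factor and the label text are arbitrary choices for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d  # only required on Matplotlib older than 3.2
from mpl_toolkits.mplot3d.art3d import Poly3DCollection

x = np.linspace(-6, 6, 30)
y = np.linspace(-6, 6, 30)
X, Y = np.meshgrid(x, y)
Z = np.sin(np.sqrt(X ** 2 + Y ** 2))

fig = plt.figure()
ax = plt.axes(projection='3d')

# Keep the returned polygon collection so it can drive the colorbar
surf = ax.plot_surface(X, Y, Z, rstride=1, cstride=1,
                       cmap='viridis', edgecolor='none')
ax.set_title('surface with colorbar')

# Attach a colorbar that maps the surface colors back to z values
cbar = fig.colorbar(surf, shrink=0.6)
cbar.set_label('z')
```

The colorbar is drawn in its own axes next to the three-dimensional one, so it stays fixed while the surface is rotated.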