Python Data Science Handbook
Further Resources
April 18, 2021
In this chapter, we’ve covered many of the basics of using Pandas effectively for data analysis. Still, much has been omitted from our discussion. To learn more about Pandas, I recommend the following resources:
- Pandas online documentation: This is the go-to source for complete documentation of the package. While the examples in the documentation tend to be small generated datasets, the description of the options is complete and generally very useful for understanding the use of various functions.
- Python for Data Analysis: Written by Wes McKinney (the original creator of Pandas), this book contains much more detail on the Pandas package than we had room for in this chapter. In particular, he takes a deep dive into tools for time series, which were his bread and butter as a financial consultant. The book also has many entertaining examples of applying Pandas to gain insight from real-world datasets. Keep in mind, though, that the book is now several years old, and the Pandas package has quite a few new features that this book does not cover (but be on the lookout for a new edition in 2017).
- Stack Overflow: Pandas has so many users that any question you have has likely been asked and answered on Stack Overflow. Using Pandas is a case where some Google-Fu is your best friend. Simply go to your favorite search engine and type in the question, problem, or error you’re coming across; more than likely you’ll find your answer on a Stack Overflow page.
- Pandas on PyVideo: From PyCon to SciPy to PyData, many conferences have featured tutorials from Pandas developers and power users. The PyCon tutorials in particular tend to be given by very well-vetted presenters.
Using these resources, combined with the walk-through given in this chapter, my hope is that you’ll be poised to use Pandas to tackle any data analysis problem you come across!