Learning Pandas 2, Second Edition: Master Data Wrangling, NLP, Geospatial Analysis, and Production ML...

Matthew Rosch

Price: 89.59 AUD
Availability: OutOfStock
ISBN: 9789349174665

Master Data Wrangling, NLP, Geospatial Analysis, and Production ML Pipelines using pandas 2.3...

$111.95 $89.59

Paperback

Not in-store but you can order this

English

Gitforgits
10 April 2026

Operational research; Enterprise software; Data capture & analysis; Data mining; Machine learning

Updated for pandas 2.3, this book is written for ML engineers, data scientists, and data engineers. It is a hands-on desk guide full of worked solutions, and an up-to-date, production-ready guide to the most widely used data manipulation library in the Python ecosystem.

The book covers the major changes in pandas 2.3, including Copy-on-Write semantics, PyArrow-backed types that can cut memory use by more than 50%, the new default StringDtype, and the deprecated frequency aliases that break existing time series pipelines. Every chapter builds on a single, growing application that uses a real Customer Churn dataset, so each technique appears in a context you can trace and carry into production.

Once you are comfortable with pandas, you will go deep into feature engineering with feature_engine and scikit-learn's set_output API, handle class imbalance with SMOTE and ADASYN, scale out to distributed computing with Dask, and write JIT-compiled custom functions with Numba and JAX. You will also build full NLP pipelines, from TF-IDF to LDA topic modelling, and perform geospatial analysis with GeoPandas.

Whether you are building ML pipelines, scaling data infrastructure, or connecting pandas to TensorFlow, PyTorch, or JAX, this book gives you the practical depth and modern patterns to do it correctly on pandas 2.3 today and stay forward-compatible with pandas 3.0.

Key Features

Build memory-efficient pipelines using PyArrow backends and targeted dtype choices.

Write Copy-on-Write-safe assignment patterns that work on pandas 2.3 and 3.0.

Engineer rich ML features using ratios, bins, group statistics, and interaction terms.

Handle class imbalance with SMOTE, ADASYN, and quantified pandas-based profiling.

Scale datasets beyond RAM using Dask lazy evaluation and distributed cluster computing.

Accelerate custom scoring functions with Numba JIT and JAX-compiled batch operations.

Extract sentiment, topics, and clusters from raw text using TF-IDF and LDA pipelines.

Perform spatial joins, buffer analysis, and geocoding with GeoPandas and geopy.

Preserve named DataFrames throughout sklearn Pipelines using the set_output API.

Migrate confidently from legacy pandas patterns to pandas 2.3 production standards.

Table of Contents

Getting Started with Pandas 2.3

Data Read, Storage, and File Formats

Indexing and Selecting Data

Data Manipulation and Transformation

Time Series and DateTime Operations

Performance Optimization and Scaling

Machine Learning with Pandas 2.3

Text Mining and NLP

Geospatial Data Analysis

By: Matthew Rosch
Imprint: Gitforgits
Edition: 2nd ed.
Dimensions: Height: 235mm, Width: 191mm, Spine: 12mm
Weight: 386g
ISBN: 9789349174665
ISBN 10: 9349174669
Pages: 220
Publication Date: 10 April 2026
Audience: General/trade, ELT Advanced
Format: Paperback
Publisher's Status: Active