[O’R] BigDataFr recommends: Handling Missing Data

‘The difference between data found in many tutorials and data in the real world is that real-world data is rarely clean and homogeneous. In particular, many interesting datasets will have some amount of data missing. To make matters even more complicated, different data sources may indicate missing data in different ways.

In this section, we will discuss some general considerations for missing data, discuss how Pandas chooses to represent it, and demonstrate some built-in Pandas tools for handling missing data in Python. Here and throughout the book, we’ll refer to missing data in general as “null”, “NaN”, or “NA” values.’ […]
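To give a sense of the built-in tools the excerpt mentions, here is a minimal sketch. It is not taken from the excerpt itself; the sample values are made up for illustration. It assumes NumPy and Pandas are installed and uses only the standard `isnull()`, `dropna()`, and `fillna()` methods of a Pandas `Series`.

```python
import numpy as np
import pandas as pd

# Illustrative data (not from the source): a Series containing both
# missing-value markers, the floating-point NaN and Python's None.
# In a numeric Series, Pandas stores both as NaN.
data = pd.Series([1.0, np.nan, 3.5, None])

# isnull() returns a boolean mask that is True where a value is missing.
mask = data.isnull()

# dropna() returns a new Series with the missing entries removed.
cleaned = data.dropna()

# fillna() returns a new Series with missing entries replaced,
# here by 0.0 (the fill value is an arbitrary choice for the example).
filled = data.fillna(0.0)

print(mask.tolist())
print(cleaned.tolist())
print(filled.tolist())
```

Whether to drop or fill missing entries depends on the dataset and the analysis; neither is a universally correct default.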

By Jake VanderPlas
Source: beta.oreilly.com
