Data Wrangling with Python Datatable - Transformations Within a GroupBy
Link to Source data
- Task: Group data by
Date
andZip
and get the ratio of Price to maximum Price per group.
from datatable import dt, f, update, by
df = dt.Frame( { "Date": ["2019-01-01", "2019-01-01",
"2019-01-01", "2019-01-01",
"2019-01-01"],
"Zip": [90102, 90102, 90102, 90102, 90103],
"Price": [58.02, 81.55, 11.97, 93.23, 13.68]})
| Date Zip Price
| str32 int32 float64
-- + ---------- ----- -------
0 | 2019-01-01 90102 58.02
1 | 2019-01-01 90102 81.55
2 | 2019-01-01 90102 11.97
3 | 2019-01-01 90102 93.23
4 | 2019-01-01 90103 13.68
[5 rows x 3 columns]
SOLUTION
df[:, update(Ratio=f.Price / dt.max(f.Price)), by("Date", "Zip")]
df
| Date Zip Price Ratio
| str32 int32 float64 float64
-- + ---------- ----- ------- --------
0 | 2019-01-01 90102 58.02 0.622332
1 | 2019-01-01 90102 81.55 0.874718
2 | 2019-01-01 90102 11.97 0.128392
3 | 2019-01-01 90102 93.23 1
4 | 2019-01-01 90103 13.68 1
[5 rows x 4 columns]
Resources: