Task: Create a new column by mapping the values in the city column through the city_to_state dictionary
from datatable import dt, f
raw_data = {"first_name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
            "last_name": ["Miller", "Jacobson", "Ali", "Milner", "Cooze"],
            "age": [42, 52, 36, 24, 73],
            "city": ["San Francisco", "Baltimore", "Miami", "Douglas", "Boston"]}
df = dt.Frame(raw_data)
df
   | first_name  last_name    age  city
   | str32       str32      int32  str32
-- + ----------  ---------  -----  -------------
 0 | Jason       Miller        42  San Francisco
 1 | Molly       Jacobson      52  Baltimore
 2 | Tina        Ali           36  Miami
 3 | Jake        Milner        24  Douglas
 4 | Amy         Cooze         73  Boston
city_to_state = {"San Francisco": "California",
                 "Baltimore": "Maryland",
                 "Miami": "Florida",
                 "Douglas": "Arizona",
                 "Boston": "Massachusetts"}
city_to_state
{'San Francisco': 'California',
'Baltimore': 'Maryland',
'Miami': 'Florida',
'Douglas': 'Arizona',
'Boston': 'Massachusetts'}
Solution:
- Create a temporary dataframe to hold the values in city.
m = df['city']
m
| city
| str32
-- + -------------
0 | San Francisco
1 | Baltimore
2 | Miami
3 | Douglas
4 | Boston
[5 rows x 1 column]
- Replace the values in m with their states from city_to_state by using the replace method. Note that replace does not require assignment, as the computation is done in place:
m.replace(city_to_state)
m
| city
| str32
-- + -------------
0 | California
1 | Maryland
2 | Florida
3 | Arizona
4 | Massachusetts
[5 rows x 1 column]
- Assign m to a new column state in df:
df["state"] = m
df
   | first_name  last_name    age  city           state
   | str32       str32      int32  str32          str32
-- + ----------  ---------  -----  -------------  -------------
 0 | Jason       Miller        42  San Francisco  California
 1 | Molly       Jacobson      52  Baltimore      Maryland
 2 | Tina        Ali           36  Miami          Florida
 3 | Jake        Milner        24  Douglas        Arizona
 4 | Amy         Cooze         73  Boston         Massachusetts
[5 rows x 5 columns]
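The replace step in the solution above amounts to a per-value dictionary lookup. The following is a minimal plain-Python sketch of that lookup, written without datatable so it can be run anywhere. It assumes that a value with no matching key in the dictionary is left unchanged. The extra city "Seattle" and the names cities and states are hypothetical, added only to illustrate that case; they are not part of the original data.

```python
# Plain-Python sketch of the lookup done in the replace step (no datatable needed).
city_to_state = {"San Francisco": "California",
                 "Baltimore": "Maryland",
                 "Miami": "Florida",
                 "Douglas": "Arizona",
                 "Boston": "Massachusetts"}

# "Seattle" is a hypothetical value with no entry in city_to_state.
cities = ["San Francisco", "Baltimore", "Miami", "Douglas", "Boston", "Seattle"]

# Look up each city; fall back to the original value when there is no match
# (assumption: unmatched values are left as they are).
states = [city_to_state.get(city, city) for city in cities]

print(states)
```

If unmatched values should be flagged rather than passed through, one option is to use city_to_state.get(city) instead, which yields None for cities missing from the dictionary.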