Pushback to the Future: Predict Pushback Time at US Airports - Benchmark

Coordinating our nation’s airways is the role of the National Airspace System (NAS). The NAS is one of the most complex transportation systems in the world. Operational changes can save or cost airlines, taxpayers, consumers, and the economy at large thousands to millions of dollars on a regular basis. It is critical that decisions to change procedures are made with as much lead time and certainty as possible. One significant source of uncertainty comes right at the beginning of a flight: the pushback time. A more accurate pushback time can lead to better predictability of takeoff time from the runway.

For this competition, your task is to train a machine learning model to automatically predict pushback time from public air traffic and weather data. In the Open Arena, you will work with 2 years of data to train a model and submit predictions for a validation set. In the Prescreened Arena, you will submit your trained models and inference code to run on a test set of held out data.

In this post, we'll give a quick tour of the features and labels and demonstrate a simple benchmark. That should give you an idea for how to start creating your own solutions and submitting to the Open Arena and Prescreened Arena.

The competition includes data for 10 airports spread throughout the continental US. Here's a map showing their locations.

Location of airports

In [1]:

from datetime import timedelta
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

airports = [
    "KATL",
    "KCLT",
    "KDEN",
    "KDFW",
    "KJFK",
    "KMEM",
    "KMIA",
    "KORD",
    "KPHX",
    "KSEA",
]

Let's start by looking at a single airport KATL in Atlanta, GA.

In [2]:

airport = "KATL"

Get the data

To run this notebook, you'll first need to download the following files from the Data download page (available after you've joined the competition):

Airport data (<airport>.tar): A tar archive containing labels and air traffic and weather features for a single airport from 2020 - 2022. Each feature is saved as a bzip2-compressed CSV. There is one file per airport.
Submission format (submission_format.csv): The simplest valid submission to the Open Arena that predicts 0 minutes to pushback for all flights. Use this as an example of a properly formatted submission.

Note that in the Prescreened Arena, you may use the Open Arena's validation set as training data. Those training labels are available as the "Training labels" file via the Prescreened Arena's Data download page.

Once you have downloaded the data files, you can unpack the tar archives. For example, unpack KATL with:

tar -xvf KATL.tar

# or to extract all of them at once
find . -name 'K*.tar' -exec tar xvf {} \;

which should result in the following files:

├── KATL
│   ├── KATL_config.csv.bz2
│   ├── KATL_etd.csv.bz2
│   ├── KATL_first_position.csv.bz2
│   ├── KATL_lamp.csv.bz2
│   ├── KATL_mfs.csv.bz2
│   ├── KATL_runways.csv.bz2
│   ├── KATL_standtimes.csv.bz2
│   └── KATL_tbfm.csv.bz2
└── train_labels_KATL.csv.bz2

Then you can delete the tar file. The rest of this notebook assumes that the data are extracted to a directory named data in the same directory as this notebook.
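If you'd rather unpack the archives from Python than from the shell, something like the following sketch should work. The function name extract_airport_archives and the idea of extracting everything into one target directory are our own choices for illustration, not part of the competition tooling.

```python
import tarfile
from pathlib import Path


def extract_airport_archives(archive_dir: Path, data_dir: Path) -> list[str]:
    """Extract every K*.tar archive found in archive_dir into data_dir.

    Returns the file names of the archives that were extracted.
    """
    data_dir.mkdir(parents=True, exist_ok=True)
    # Newer Python versions support extraction filters; use the safe "data"
    # filter when it is available and fall back to the default otherwise.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    extracted = []
    for archive_path in sorted(archive_dir.glob("K*.tar")):
        with tarfile.open(archive_path) as archive:
            # Only extract archives you trust, such as the competition downloads.
            archive.extractall(data_dir, **kwargs)
        extracted.append(archive_path.name)
    return extracted
```

For example, extract_airport_archives(Path("."), Path("./data")) would unpack every airport archive in the current directory into ./data.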

In [3]:

DATA_DIRECTORY = Path("./data")

We can start by looking at the prediction target: pushback time.

Let's look at the first few labels for KATL:

In [4]:

airport = "KATL"
pushback = pd.read_csv(DATA_DIRECTORY / f"train_labels_{airport}.csv.bz2")
pushback

Out[4]:

	gufi	timestamp	airport	minutes_until_pushback
0	AAL1008.ATL.DFW.210403.1312.0051.TFM_TFDM	2021-04-03 19:30:00	KATL	114
1	AAL1008.ATL.DFW.210403.1312.0051.TFM_TFDM	2021-04-03 19:45:00	KATL	99
2	AAL1008.ATL.DFW.210403.1312.0051.TFM_TFDM	2021-04-03 20:00:00	KATL	84
3	AAL1008.ATL.DFW.210403.1312.0051.TFM_TFDM	2021-04-03 20:15:00	KATL	69
4	AAL1008.ATL.DFW.210403.1312.0051.TFM_TFDM	2021-04-03 20:30:00	KATL	54
...	...	...	...	...
3194027	XSR729.ATL.TLH.210426.2354.0037.TFM	2021-04-27 17:45:00	KATL	75
3194028	XSR729.ATL.TLH.210426.2354.0037.TFM	2021-04-27 18:00:00	KATL	60
3194029	XSR729.ATL.TLH.210426.2354.0037.TFM	2021-04-27 18:15:00	KATL	45
3194030	XSR729.ATL.TLH.210426.2354.0037.TFM	2021-04-27 18:30:00	KATL	30
3194031	XSR729.ATL.TLH.210426.2354.0037.TFM	2021-04-27 18:45:00	KATL	15

3194032 rows × 4 columns

Check out the Problem Description for an explanation of the labels and features.

The "Fuser ETD minus 15 minutes benchmark"

For this benchmark, we'll use the existing Fuser estimated time of departure (ETD) as the basis for our solution. Fuser is a data processing platform designed by NASA as part of the ATD-2 project that processes the FAA's raw data stream and distributes cleaned, real-time data on the status of individual flights nationwide.

Let's take a look at Fuser ETD data. This table tracks the estimated departure time for flights departing from an airport. It typically contains many estimates for each flight:

gufi: GUFI (Global Unique Flight Identifier)
timestamp: The time that the prediction was generated
departure_runway_estimated_time: Estimated time that the flight will depart from the runway

Note that the ETD in this table refers to the time the flight will depart from the runway, whereas the prediction target we're after is when the flight pushes back from the gate. To account for that, we'll subtract 15 minutes from departure_runway_estimated_time as an estimate of pushback time. This is just a rough estimate; a slightly less simple model could learn the proper adjustment from the data!
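To make "learn the proper adjustment from the data" a bit more concrete, here is a minimal sketch that replaces the fixed 15 minutes with a constant fitted to the training labels. It assumes you care about absolute error, in which case the median residual is the best constant offset. The function name and the toy numbers are ours and are not taken from the competition data.

```python
import numpy as np


def fit_runway_to_pushback_offset(etd_minutes, actual_minutes) -> float:
    """Return the constant offset (in minutes) that minimizes mean absolute error.

    etd_minutes: minutes from the prediction time to the Fuser runway ETD.
    actual_minutes: labeled minutes_until_pushback for the same rows.
    """
    residual = np.asarray(etd_minutes, dtype=float) - np.asarray(
        actual_minutes, dtype=float
    )
    # The median of the residuals is the constant that minimizes absolute error.
    return float(np.nanmedian(residual))


# Toy numbers for illustration only (not competition data).
toy_etd_minutes = [60, 45, 90, 30, 75]
toy_actual_minutes = [47, 30, 70, 18, 61]
offset = fit_runway_to_pushback_offset(toy_etd_minutes, toy_actual_minutes)
# The residuals are 13, 15, 20, 12, 14, so the fitted offset is 14.0 minutes.
```

In practice you would compute etd_minutes for the training rows with the same lookup the benchmark uses, and you could fit one offset per airport.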

In [6]:

etd = pd.read_csv(
    DATA_DIRECTORY / airport / f"{airport}_etd.csv.bz2",
    parse_dates=["departure_runway_estimated_time", "timestamp"],
)
etd

Out[6]:

	gufi	timestamp	departure_runway_estimated_time
0	FFT17.ATL.MBJ.211031.1050.0029.TFM	2021-11-01 07:00:13	2021-11-01 11:06:00
1	AAR2513.ATL.ICN.211101.0200.0185.TMA	2021-11-01 07:00:23	2021-11-01 05:01:00
2	FFT100.ATL.SJU.211031.1625.0067.TFM	2021-11-01 07:00:29	2021-11-01 16:41:00
3	FFT419.ATL.DEN.211031.1625.0073.TFM	2021-11-01 07:00:45	2021-11-01 16:39:00
4	FFT421.ATL.DEN.211101.0140.0090.TFM	2021-11-01 07:00:49	2021-11-02 01:52:00
...	...	...	...
13327016	FFT1516.ATL.MIA.211030.2150.0049.TFM	2021-10-31 22:59:06	2021-10-31 22:11:00
13327017	SWA3427.ATL.MIA.211031.0150.0072.TFM	2021-10-31 22:59:45	2021-11-01 03:08:00
13327018	RPA4778.ATL.ORD.211030.2055.0058.TFM	2021-10-31 22:59:52	2021-10-31 22:39:00
13327019	RPA4778.ATL.ORD.211030.2055.0058.TFM	2021-10-31 22:59:56	2021-10-31 22:39:00
13327020	DAL663.ATL.SNA.211031.0015.0043.TFM	2021-10-31 22:59:59	2021-11-01 00:27:00

13327021 rows × 3 columns

Submission format

The submission format gives us a list of flights and times for which we'll need to make predictions.

In [5]:

submission_format = pd.read_csv(
    DATA_DIRECTORY / "submission_format.csv", parse_dates=["timestamp"]
)
submission_format

Out[5]:

	gufi	timestamp	airport	minutes_until_pushback
0	AAL1008.ATL.DFW.210607.2033.0110.TFM	2021-06-08 19:15:00	KATL	0
1	AAL1008.ATL.DFW.210607.2033.0110.TFM	2021-06-08 19:30:00	KATL	0
2	AAL1008.ATL.DFW.210607.2033.0110.TFM	2021-06-08 19:45:00	KATL	0
3	AAL1008.ATL.DFW.210607.2033.0110.TFM	2021-06-08 20:00:00	KATL	0
4	AAL1008.ATL.DFW.210607.2033.0110.TFM	2021-06-08 20:15:00	KATL	0
...	...	...	...	...
2042718	XOJ760.SEA.SJC.210606.0435.0007.TFM	2021-06-06 20:45:00	KSEA	0
2042719	XOJ760.SEA.SJC.210606.0435.0007.TFM	2021-06-06 21:00:00	KSEA	0
2042720	XOJ760.SEA.SJC.210606.0435.0007.TFM	2021-06-06 21:15:00	KSEA	0
2042721	XOJ760.SEA.SJC.210606.0435.0007.TFM	2021-06-06 21:30:00	KSEA	0
2042722	XOJ760.SEA.SJC.210606.0435.0007.TFM	2021-06-06 21:45:00	KSEA	0

2042723 rows × 4 columns

We can begin to prototype our solution using a single flight and prediction time. Let's use the row at index 200 of the submission format.

In [7]:

row = submission_format.iloc[200]
row

Out[7]:

gufi                      AAL1008.ATL.DFW.211204.2135.0163.TFM
timestamp                                  2021-12-05 21:00:00
airport                                                   KATL
minutes_until_pushback                                       0
Name: 200, dtype: object

This row says that it is 2021-12-05 21:00:00 and we need a time-to-pushback prediction for the flight with GUFI AAL1008.ATL.DFW.211204.2135.0163.TFM.

Now let's look at the ETD dataframe for entries for that flight:

In [8]:

etd.loc[etd.gufi == row.gufi]

Out[8]:

	gufi	timestamp	departure_runway_estimated_time
596631	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-04 21:35:50	2021-12-05 21:39:00
596946	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-04 21:40:45	2021-12-05 21:56:00
604259	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-05 21:02:25	2021-12-05 21:56:00
606150	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-05 21:39:33	2021-12-05 21:56:00
606951	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-05 21:51:38	2021-12-05 21:51:00
607097	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-05 21:52:58	2021-12-05 21:51:00
609143	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-05 22:26:46	2021-12-05 21:51:00
609490	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-05 22:32:10	2021-12-05 21:51:00
609707	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-05 22:37:10	2021-12-05 21:51:00
610873	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-05 22:51:22	2021-12-05 21:51:00
611009	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-05 22:53:03	2021-12-05 21:51:00
611365	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-05 22:58:53	2021-12-05 21:51:00
617975	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-05 23:32:27	2021-12-05 21:51:00
619770	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-05 23:55:36	2021-12-05 21:51:00

Notice how the departure time was estimated many times during the lifespan of this flight.

How we handle time is a critical aspect of this competition: we must only use features from 30 hours before the prediction time up until the prediction time itself. The following cell will filter the ETD features to just that time period (as well as just those entries that relate to the flight we are predicting).

In [9]:

now_etd = etd.loc[
    (etd.timestamp > row.timestamp - timedelta(hours=30))
    & (etd.timestamp <= row.timestamp)
    & (etd.gufi == row.gufi)
]
now_etd

Out[9]:

	gufi	timestamp	departure_runway_estimated_time
596631	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-04 21:35:50	2021-12-05 21:39:00
596946	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-04 21:40:45	2021-12-05 21:56:00

It looks like there are two ETD estimates in that time range. We will use the most recent, since we might expect it to be the most accurate.
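Because the same window applies to every feature table, it can be handy to wrap the filter in a small helper. The sketch below is one way to do that; filter_to_window is a name we made up, and it assumes the table has a timestamp column like the ETD data does.

```python
from datetime import timedelta

import pandas as pd


def filter_to_window(
    features: pd.DataFrame, now: pd.Timestamp, hours: int = 30
) -> pd.DataFrame:
    """Keep rows whose timestamp satisfies now - hours < timestamp <= now."""
    window_start = now - timedelta(hours=hours)
    in_window = (features["timestamp"] > window_start) & (features["timestamp"] <= now)
    return features.loc[in_window]


# Toy table to show which rows survive the filter.
toy = pd.DataFrame(
    {
        "gufi": ["A", "A", "A", "A"],
        "timestamp": pd.to_datetime(
            [
                "2021-12-04 14:59:00",  # more than 30 hours before: dropped
                "2021-12-04 21:40:45",  # inside the window: kept
                "2021-12-05 21:00:00",  # exactly at the prediction time: kept
                "2021-12-05 21:02:25",  # after the prediction time: dropped
            ]
        ),
    }
)
now = pd.Timestamp("2021-12-05 21:00:00")
kept = filter_to_window(toy, now)
```

The last toy row is the important one: it is only a couple of minutes after the prediction time, but using it would leak information from the future.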

Finally we subtract 15 minutes since pushback is typically about 15 minutes before the ETD.

In [10]:

flight_pushback = now_etd.iloc[-1].departure_runway_estimated_time - timedelta(minutes=15)
flight_pushback

Out[10]:

Timestamp('2021-12-05 21:41:00')

The submission format expects a prediction of the number of minutes (as an integer) from the current prediction time to the estimated pushback time.

In [11]:

flight_minutes_to_pushback = np.round(
    (flight_pushback - row.timestamp).total_seconds() / 60
).astype(int)
flight_minutes_to_pushback

Out[11]:

41

Simple, right? Now we can just repeat this process for each of the rows of the submission format and each airport in our dataset.

But first, we'll want to consider some speed optimizations for our solution. Your submission to the Prescreened Arena will need to process 1,800 time points and run in under 10 hours. It would take a very long time to run this ETD lookup for each of the >300k rows for this airport. We can speed things up by processing all flights for a prediction time at once and by taking advantage of parallelization.

Let's get all of the flights for one prediction time. (As you'll see later in this post, that's how the code execution environment processes your submission.) As you can see, we need to make predictions for 79 flights at KATL for this prediction time.

In [12]:

now_submission_format = submission_format.loc[
    (submission_format.timestamp == row.timestamp)
    & (submission_format.airport == airport)
].reset_index(drop=True)
now_submission_format

Out[12]:

	gufi	timestamp	airport	minutes_until_pushback
0	AAL1008.ATL.DFW.211204.2135.0163.TFM	2021-12-05 21:00:00	KATL	0
1	AAL378.ATL.MIA.211204.2035.0063.TFM	2021-12-05 21:00:00	KATL	0
2	DAL1017.ATL.DEN.211204.2220.0072.TFM	2021-12-05 21:00:00	KATL	0
3	DAL1039.ATL.LAS.211204.2200.0167.TFM	2021-12-05 21:00:00	KATL	0
4	DAL1204.ATL.JAN.211204.2145.0094.TFM	2021-12-05 21:00:00	KATL	0
...	...	...	...	...
74	SWA4105.ATL.MKE.211204.2215.0031.TFM	2021-12-05 21:00:00	KATL	0
75	SWA668.ATL.DCA.211204.2050.0136.TFM	2021-12-05 21:00:00	KATL	0
76	SWA803.ATL.MIA.211204.2105.0118.TFM	2021-12-05 21:00:00	KATL	0
77	UAL225.ATL.DEN.211204.2115.0037.TFM	2021-12-05 21:00:00	KATL	0
78	UAL279.ATL.EWR.211204.2100.0196.TFM	2021-12-05 21:00:00	KATL	0

79 rows × 4 columns

A fast way to get the latest ETD for a bunch of flights is to use pandas.DataFrame.groupby to group by GUFI, then take the last estimate for each flight (making sure we've sorted ETD by timestamp).
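Here is a toy illustration of why the sort matters, using made-up flights "A" and "B" and the same column names as the ETD table. Without sorting, last() simply returns whichever row happens to appear last for each flight.

```python
import pandas as pd

# Toy ETD table, deliberately out of chronological order.
toy_etd = pd.DataFrame(
    {
        "gufi": ["A", "B", "A", "B"],
        "timestamp": pd.to_datetime(
            [
                "2021-12-05 20:30",
                "2021-12-05 20:10",
                "2021-12-05 20:00",
                "2021-12-05 20:50",
            ]
        ),
        "departure_runway_estimated_time": pd.to_datetime(
            [
                "2021-12-05 21:56",
                "2021-12-05 21:10",
                "2021-12-05 21:39",
                "2021-12-05 21:13",
            ]
        ),
    }
)

# Sort first so that "last" means "most recent estimate" for each flight.
latest = (
    toy_etd.sort_values("timestamp")
    .groupby("gufi")
    .last()
    .departure_runway_estimated_time
)

# Without the sort, flight A picks up its older 20:00 estimate instead.
unsorted_latest = toy_etd.groupby("gufi").last().departure_runway_estimated_time
```

With the sort, flight A resolves to the 21:56 estimate made at 20:30; without it, A resolves to the stale 21:39 estimate.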

In [13]:

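# sort by timestamp so that groupby(...).last() returns the most recent estimate
etd.sort_values("timestamp", inplace=True)
# keep only estimates from the 30 hours leading up to the prediction time
now_etd = etd.loc[
    (etd.timestamp > row.timestamp - timedelta(hours=30))
    & (etd.timestamp <= row.timestamp)
]
latest_now_etd = now_etd.groupby("gufi").last().departure_runway_estimated_time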

Finally, we can merge the submission format and latest ETDs on GUFI and generate our predictions for this batch.

In [14]:

departure_runway_estimated_time = now_submission_format.merge(
    latest_now_etd, how="left", on="gufi"
).departure_runway_estimated_time
departure_runway_estimated_time

Out[14]:

0    2021-12-05 21:56:00
1    2021-12-05 21:13:00
2    2021-12-05 22:30:00
3    2021-12-05 22:03:00
4    2021-12-05 21:47:00
             ...        
74   2021-12-05 22:29:00
75   2021-12-05 21:10:00
76   2021-12-05 21:28:00
77   2021-12-05 21:34:00
78   2021-12-05 21:10:00
Name: departure_runway_estimated_time, Length: 79, dtype: datetime64[ns]

In [15]:

estimated_pushback = (
    (departure_runway_estimated_time - now_submission_format.timestamp).dt.total_seconds()
    / 60
) - 15

Since subtracting 15 minutes could put us in a situation where we are predicting a pushback time before the prediction time, we can clip our predictions to never be negative. We'll also cast the predictions to integer as required by the submission format.
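One caveat worth sketching: because the merge is a left join, a flight with no ETD in the 30-hour window comes through as NaN, and pandas will not cast NaN to an integer. The toy series below shows the failure and one possible fix. The 30-minute fallback mirrors the value used in the code submission later in this post; treat it as a placeholder rather than a tuned choice.

```python
import numpy as np
import pandas as pd

# Toy predictions: one negative value and one flight with no ETD (NaN).
raw_minutes = pd.Series([41.0, -5.0, np.nan, 74.0])

try:
    raw_minutes.clip(lower=0).astype(int)
    cast_failed = False
except ValueError:
    # pandas refuses to convert NaN to an integer dtype
    cast_failed = True

# Fill missing predictions before casting. FALLBACK_MINUTES is a placeholder.
FALLBACK_MINUTES = 30
clean_minutes = raw_minutes.clip(lower=0).fillna(FALLBACK_MINUTES).astype(int)
```

If every flight in your batch has an ETD, the plain clip-and-cast works as is; the fillna step only matters when some flights are missing from the ETD table.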

In [16]:

estimated_pushback = estimated_pushback.clip(lower=0).astype(int)
estimated_pushback

Out[16]:

0     41
1      0
2     75
3     48
4     32
      ..
74    74
75     0
76    13
77    19
78     0
Length: 79, dtype: int64

Now let's wrap that all up in a function that takes a timestamp as input and outputs all of the predictions at an airport for that timestamp.

In [28]:

def estimate_pushback(now: pd.Timestamp) -> pd.DataFrame:
    # NOTE: relies on the module-level `airport_submission_format` and `etd`
    # subset submission format to the current prediction time
    now_submission_format = airport_submission_format.loc[
        airport_submission_format.timestamp == now
    ].reset_index(drop=True)

    # filter features to 30 hours before prediction time to prediction time
    now_etd = etd.loc[
        (etd.timestamp > now - timedelta(hours=30)) & (etd.timestamp <= now)
    ]

    # get the latest ETD for each flight
    latest_now_etd = now_etd.groupby("gufi").last().departure_runway_estimated_time

    # merge the latest ETD with the flights we are predicting
    departure_runway_estimated_time = now_submission_format.merge(
        latest_now_etd, how="left", on="gufi"
    ).departure_runway_estimated_time

    now_prediction = now_submission_format.copy()

    now_prediction["minutes_until_pushback"] = (
        (
            departure_runway_estimated_time - now_submission_format.timestamp
        ).dt.total_seconds()
        / 60
    ) - 15

    return now_prediction

We'll use tqdm.contrib.concurrent.process_map to parallelize the function and show a helpful progress bar.

In [18]:

from tqdm import tqdm
from tqdm.contrib.concurrent import process_map

In [41]:

airport_submission_format = submission_format.loc[submission_format.airport == airport]
predictions = process_map(
    estimate_pushback,
    pd.to_datetime(airport_submission_format.timestamp.unique()),
    chunksize=10,
)

pd.concat(predictions, ignore_index=True)

Out[41]:

	gufi	timestamp	airport	minutes_until_pushback
0	AAL1006.SEA.DFW.210827.1300.0052.TFM_TFDM	2021-08-28 00:00:00	KSEA	74.0
1	ASA107.SEA.ANC.210827.2140.0020.TFM	2021-08-28 00:00:00	KSEA	29.0
2	ASA1092.SEA.LAX.210827.2157.0001.TFM	2021-08-28 00:00:00	KSEA	66.0
3	ASA1146.SEA.AUS.210827.2140.0044.TFM	2021-08-28 00:00:00	KSEA	57.0
4	ASA123.SEA.FAI.210827.1442.0018.TFM	2021-08-28 00:00:00	KSEA	65.0
...	...	...	...	...
157315	UAL2436.SEA.DEN.210211.1321.0008.TFM	2021-02-12 12:00:00	KSEA	83.0
157316	UAL2436.SEA.DEN.210213.1321.0001.TFM	2021-02-14 12:00:00	KSEA	83.0
157317	UAL2436.SEA.DEN.210213.1321.0001.TFM	2021-02-14 12:15:00	KSEA	68.0
157318	UAL2436.SEA.DEN.210213.1321.0001.TFM	2021-02-14 12:30:00	KSEA	53.0
157319	UAL2436.SEA.DEN.210214.1321.0010.TFM	2021-02-15 12:00:00	KSEA	83.0

157320 rows × 4 columns

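That's all of the predictions for one airport in about 2 minutes! (The output above shows KSEA rows rather than KATL, presumably because the cell was last executed with airport set to "KSEA".)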
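If you want to avoid looping over prediction times altogether, pandas.merge_asof can look up the latest earlier ETD for every row in a single pass. This is an alternative sketch rather than the approach the benchmark takes. The function name estimate_pushback_asof and the helper columns _order and etd_timestamp are our own, and the handling of an estimate that is exactly 30 hours old may differ from the explicit filter used above.

```python
import numpy as np
import pandas as pd


def estimate_pushback_asof(
    submission: pd.DataFrame,
    etd: pd.DataFrame,
    window_hours: int = 30,
    offset_minutes: int = 15,
) -> pd.Series:
    """Minutes until pushback for each submission row, from the latest prior ETD."""
    # merge_asof requires both frames to be sorted by the key being matched.
    left = submission.assign(_order=np.arange(len(submission))).sort_values(
        "timestamp"
    )
    right = etd.sort_values("timestamp").rename(
        columns={"timestamp": "etd_timestamp"}
    )

    merged = pd.merge_asof(
        left,
        right,
        left_on="timestamp",
        right_on="etd_timestamp",
        by="gufi",  # only match estimates for the same flight
        direction="backward",  # only look at estimates made at or before "now"
        tolerance=pd.Timedelta(hours=window_hours),
    )
    # Restore the original row order of the submission format.
    merged = merged.sort_values("_order").reset_index(drop=True)
    return (
        merged["departure_runway_estimated_time"] - merged["timestamp"]
    ).dt.total_seconds() / 60 - offset_minutes


# Toy data: flight A has two estimates, flight B has none.
toy_submission = pd.DataFrame(
    {
        "gufi": ["A", "B", "A"],
        "timestamp": pd.to_datetime(
            ["2021-12-05 11:15", "2021-12-05 10:30", "2021-12-05 10:30"]
        ),
    }
)
toy_etd = pd.DataFrame(
    {
        "gufi": ["A", "A"],
        "timestamp": pd.to_datetime(["2021-12-05 10:00", "2021-12-05 11:00"]),
        "departure_runway_estimated_time": pd.to_datetime(
            ["2021-12-05 12:00", "2021-12-05 12:30"]
        ),
    }
)
toy_minutes = estimate_pushback_asof(toy_submission, toy_etd)
```

Flights with no matching estimate come back as NaN, so you would still need to fill, clip, and cast the result before writing a submission.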

Now let's iterate over all the airports and generate predictions. We'll save individual airport predictions out and concatenate them together as a final step.

In [20]:

for airport in airports:
    print(f"Processing {airport}")
    airport_predictions_path = Path(f"validation_predictions_{airport}.csv.bz2")
    if airport_predictions_path.exists():
        print(f"Predictions for {airport} already exist.")
        continue

    # subset submission format to current airport
    airport_submission_format = submission_format.loc[
        submission_format.airport == airport
    ]

    # load airport's ETD data and sort by timestamp
    etd = pd.read_csv(
        DATA_DIRECTORY / airport / f"{airport}_etd.csv.bz2",
        parse_dates=["departure_runway_estimated_time", "timestamp"],
    ).sort_values("timestamp")

    # process all prediction times in parallel
    predictions = process_map(
        estimate_pushback,
        pd.to_datetime(airport_submission_format.timestamp.unique()),
        chunksize=20,
    )

    # concatenate individual prediction times to a single dataframe
    predictions = pd.concat(predictions, ignore_index=True)
    predictions["minutes_until_pushback"] = predictions.minutes_until_pushback.clip(
        lower=0
    ).astype(int)

    # reindex the predictions to match the expected ordering in the submission format
    predictions = (
        predictions.set_index(["gufi", "timestamp", "airport"])
        .loc[airport_submission_format.set_index(["gufi", "timestamp", "airport"]).index]
        .reset_index()
    )

    # save the predictions for the current airport
    predictions.to_csv(airport_predictions_path, index=False)

Processing KATL

Processing KCLT

Processing KDEN

Processing KDFW

Processing KJFK

Processing KMEM

Processing KMIA

Processing KORD

Processing KPHX

Processing KSEA

In [59]:

predictions = []

for airport in airports:
    airport_predictions_path = Path(f"validation_predictions_{airport}.csv.bz2")
    predictions.append(pd.read_csv(airport_predictions_path, parse_dates=["timestamp"]))

predictions = pd.concat(predictions, ignore_index=True)
predictions["minutes_until_pushback"] = predictions.minutes_until_pushback.astype(int)

In [60]:

with pd.option_context("float_format", "{:.2f}".format):
    display(predictions.minutes_until_pushback.describe())

count   2042723.00
mean         47.64
std          31.54
min           0.00
25%          23.00
50%          45.00
75%          69.00
max        1469.00
Name: minutes_until_pushback, dtype: float64

Most of the predictions fall between 0 and 70 minutes. The peak at 0 is due to a number of negative predictions that we set to 0.

In [67]:

fig, ax = plt.subplots(figsize=(6, 4), dpi=150)
# one bin per minute; edges run to 201 so values clipped to 200 are not dropped
predictions.minutes_until_pushback.clip(lower=0, upper=200).hist(
    bins=np.arange(0, 202), ax=ax
)
ax.set_title("Distribution of predicted minutes to pushback")
ax.set_ylabel("Number of predictions")
ax.set_xlabel("Minutes to pushback")
_ = plt.show()

It is a good idea to do a few final checks to make sure our prediction conforms to the proper submission format. The platform will reject submissions that do not match the provided submission format, but it's nice to check locally just to be sure.

In [22]:

assert (predictions.columns == submission_format.columns).all()
assert len(predictions) == len(submission_format)
assert predictions[["gufi", "timestamp", "airport"]].equals(
    submission_format[["gufi", "timestamp", "airport"]]
)
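
If you'd like a reusable version of these checks that reports every problem instead of stopping at the first failed assertion, here is one possible sketch. The function check_submission is hypothetical. It flags non-integer values because the submission format expects integers, and it flags negative values only because this benchmark clips them; we have not confirmed that the platform rejects negatives.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype


def check_submission(
    predictions: pd.DataFrame, submission_format: pd.DataFrame
) -> list[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if list(predictions.columns) != list(submission_format.columns):
        problems.append("columns do not match the submission format")
    if len(predictions) != len(submission_format):
        problems.append("row count does not match the submission format")

    target = predictions.get("minutes_until_pushback")
    if target is None:
        problems.append("minutes_until_pushback column is missing")
    else:
        if target.isna().any():
            problems.append("predictions contain missing values")
        if not is_integer_dtype(target):
            problems.append("predictions are not integers")
        if (target < 0).any():
            problems.append("predictions contain negative values")
    return problems


# Toy submission format with two rows.
toy_format = pd.DataFrame(
    {
        "gufi": ["A", "B"],
        "timestamp": pd.to_datetime(["2021-12-05 21:00", "2021-12-05 21:00"]),
        "airport": ["KATL", "KATL"],
        "minutes_until_pushback": [0, 0],
    }
)
good = toy_format.assign(minutes_until_pushback=[41, 0])
bad = toy_format.assign(minutes_until_pushback=[41.5, -3.0])
good_problems = check_submission(good, toy_format)
bad_problems = check_submission(bad, toy_format)
```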

Submit predictions to the Open Arena

With the individual airport predictions loaded and concatenated, we can submit them to the Open Arena for scoring!

We highly recommend saving your submission as a zipped CSV to drastically reduce the file size and upload time. pandas.DataFrame.to_csv can do this automatically if you give it a file path with the .zip extension.

In [23]:

predictions.to_csv("validation_predictions.zip", index=False)

Code submissions in the Prescreened Arena

In the Prescreened Arena, rather than submit predictions themselves, you'll submit your trained model and inference code, and we will compute predictions in our code execution environment.

The runtime repository has a ton of information including code examples you can use when constructing your own code submission, instructions for how to test your submission locally, and much more. For now, we'll simply show how to turn the Fuser ETD baseline solution into a valid code submission.

A code submission must include solution.py that implements two functions:

load_model that returns any model assets that are needed for prediction. This solution does not have any model assets, so the function can just return None.
predict that takes the features as inputs and outputs predictions for a set of flights at a single prediction time.

The code execution environment does a lot of the work for you: it loads the features, and subsets them to the valid time ranges for each prediction time. All you need to do is provide a function predict that takes as input:

A set of feature dataframes: These are already filtered to only the valid time range from 30 hours before the prediction time up until the prediction time. You can use these features without restriction to generate your prediction.
A "partial submission format": A subset of rows from the full submission format corresponding to all of the flights at a single prediction time for one airport.
Model assets: Whatever your load_model function returns, most likely your trained model.

and outputs predictions for all the flights in the partial submission format.
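For a solution that does have model assets, load_model might look something like the sketch below. The file name model.pkl and the use of pickle are assumptions for illustration; the runtime repository is the place to check which formats and paths are actually supported.

```python
import pickle
from pathlib import Path
from typing import Any

# Hypothetical file name; use whatever you saved your model as.
MODEL_FILENAME = "model.pkl"


def load_model(solution_directory: Path) -> Any:
    """Load a pickled model from the solution directory, if one exists."""
    model_path = solution_directory / MODEL_FILENAME
    if not model_path.exists():
        # No assets to load, as in the Fuser ETD benchmark.
        return None
    with model_path.open("rb") as f:
        return pickle.load(f)
```

Whatever this function returns is what predict receives as its model argument, so loading once here avoids re-reading the file for every prediction time.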

Now we'll show how to turn the Fuser ETD benchmark into a valid code submission.

In [24]:

"""Solution for the NASA Pushback to the Future competition."""
from pathlib import Path
from typing import Any

from loguru import logger
import pandas as pd


def load_model(solution_directory: Path) -> Any:
    """Load any model assets from disk."""
    return


def predict(
    config: pd.DataFrame,
    etd: pd.DataFrame,
    first_position: pd.DataFrame,
    lamp: pd.DataFrame,
    mfs: pd.DataFrame,
    runways: pd.DataFrame,
    standtimes: pd.DataFrame,
    tbfm: pd.DataFrame,
    tfm: pd.DataFrame,
    airport: str,
    prediction_time: pd.Timestamp,
    partial_submission_format: pd.DataFrame,
    model: Any,
    solution_directory: Path,
) -> pd.DataFrame:
    """Make predictions for a set of flights at a single airport and prediction time."""
    logger.info("Computing prediction based on Fuser ETD")

    latest_etd = (
        etd.sort_values("timestamp")
        .groupby("gufi")
        .last()
        .departure_runway_estimated_time
    )
    departure_runway_estimated_time = partial_submission_format.merge(
        latest_etd, how="left", on="gufi"
    ).departure_runway_estimated_time

    prediction = partial_submission_format.copy()
    prediction["minutes_until_pushback"] = (
        (
            departure_runway_estimated_time - partial_submission_format.timestamp
        ).dt.total_seconds()
        / 60
    ) - 15

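    # clip negative predictions, fall back to 30 minutes when a flight has no ETD,
    # and cast to integer minutes
    prediction["minutes_until_pushback"] = (
        prediction.minutes_until_pushback.clip(lower=0).fillna(30).astype(int)
    )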

    return prediction

In [25]:

!zip solution.zip solution.py

updating: solution.py (deflated 60%)

Now we have a submission that we can upload to the Prescreened Arena (provided that you've already been prescreened)!

Since it can take several hours to run the submission on the full test data, it's a good idea to first submit a "smoke test" version, which runs on a few hours of data from the training set. Smoke tests are only for helping quickly debug your submission, and scores on the smoke test are not counted in the leaderboard.

Once the smoke test has completed successfully, go ahead and submit your solution for evaluation on the full test set.

Even this simple solution takes about 5 hours to run, so you should definitely consider ways to optimize how your solution processes features and performs inference.

That concludes our benchmark! Head over to the competition home page to get started building your own solution. We're looking forward to seeing what you come up with!
