Synthefy Tabular predicts continuous values from tabular data. You give it some labeled rows as examples, and it predicts on new rows, with no training or fine-tuning required.

GitHub  ·  🤗 Hugging Face

There are two ways to use it:

Run locally

Install the Python package and run inference on your own machine or server.

Call the API

Send a request to our hosted endpoint. No setup, no GPU required.

Local

Install the package:
pip install synthefy-tabular
Model weights download automatically from Hugging Face on first use — no API key needed.

Your first prediction

from synthefy_tabular import SynthefyTabularRegressor

X_train = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [0.2, 0.8]]
y_train = [0.1, 0.9, 0.5, 0.3]
X_test  = [[0.3, 0.7], [0.8, 0.2]]

model = SynthefyTabularRegressor()
model.fit(X_train, y_train)
pred = model.predict(X_test)
print(pred)
# Output: [0.36 0.74]
Or use the one-shot helper:
from synthefy_tabular import predict

pred = predict(X_train, y_train, X_test, task="regression")
print(pred)
# Output: [0.36 0.74]

With a DataFrame

import pandas as pd
from synthefy_tabular import SynthefyTabularRegressor

df = pd.read_csv("data.csv")
target_col = "price"
feature_cols = [c for c in df.columns if c != target_col]

train = df.sample(frac=0.8, random_state=42)
test  = df.drop(train.index)

model = SynthefyTabularRegressor()
model.fit(train[feature_cols].values, train[target_col].values)
predictions = model.predict(test[feature_cols].values)
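The snippet above stops at the predictions. To compare them against the held-out targets, a small scoring helper along these lines may be useful. It is a minimal sketch in plain Python, not part of the synthefy_tabular package; the name holdout_metrics is hypothetical, and it only assumes that the predictions are an iterable of numbers.

```python
import math


def holdout_metrics(y_true, y_pred):
    """Return mean absolute error and root mean squared error.

    Hypothetical helper for illustration; not part of synthefy_tabular.
    """
    y_true = [float(v) for v in y_true]
    y_pred = [float(v) for v in y_pred]
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one row to score")

    # Per-row signed error: prediction minus target.
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"mae": mae, "rmse": rmse}
```

With the variables from the DataFrame example, the call would look like holdout_metrics(test[target_col].values, predictions).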

Handle missing values

You don’t need to fill in missing values beforehand — the model handles them automatically.
import numpy as np
from synthefy_tabular import SynthefyTabularRegressor

X_train = [[0.0, 1.0], [1.0, np.nan], [0.5, 0.5], [np.nan, 0.8]]
y_train = [0.1, 0.9, 0.5, 0.3]
X_test  = [[0.3, 0.7], [0.8, np.nan]]

model = SynthefyTabularRegressor()
model.fit(X_train, y_train)
print(model.predict(X_test))
# Output: [0.37 0.69]
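The example marks missing entries with np.nan. Rows loaded from JSON or a database often use None instead, and this page does not say whether the package accepts None directly. A cautious option is to convert before calling fit or predict. The helpers below are a hypothetical sketch in plain Python, not part of the package.

```python
import math


def none_to_nan(rows):
    """Return a copy of a 2-D list with None replaced by float('nan').

    Hypothetical helper; every other value is converted with float().
    """
    return [
        [float("nan") if value is None else float(value) for value in row]
        for row in rows
    ]


def count_missing(rows):
    """Count NaN entries in a 2-D list of floats."""
    return sum(1 for row in rows for value in row if math.isnan(value))
```

For instance, none_to_nan([[0.0, 1.0], [1.0, None]]) yields rows that can be passed to model.fit in the same way as the np.nan example above.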
The model uses a GPU automatically if one is available, and falls back to CPU otherwise.

API

The hosted API runs the same model on our infrastructure — no installation or GPU required. Set up your API key in the API key guide.

Make a request

Send your labeled rows (X_train, y_train) and the rows you want to predict (X_test) in a single call:
curl -X POST https://model-3m5j7y9w.api.baseten.co/environments/production/predict \
  -H "Authorization: Api-Key $SYNTHEFY_TABULAR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "X_train": [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [0.2, 0.8]],
    "y_train": [0.1, 0.9, 0.5, 0.3],
    "X_test":  [[0.3, 0.7], [0.8, 0.2]]
  }'
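The same request can be sent from Python. The sketch below mirrors the curl call using only the standard library; the URL, header format, and environment variable name are taken from the example above, while the function names and the 120-second default timeout are assumptions made for illustration.

```python
import json
import os
import urllib.request

# Endpoint URL copied from the curl example above.
ENDPOINT = "https://model-3m5j7y9w.api.baseten.co/environments/production/predict"


def build_request(api_key, X_train, y_train, X_test):
    """Build the POST request without sending it (hypothetical helper)."""
    payload = {"X_train": X_train, "y_train": y_train, "X_test": X_test}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Api-Key {api_key}",
            "Content-Type": "application/json",
        },
    )


def predict_remote(X_train, y_train, X_test, timeout=120):
    """Send the request and return the parsed JSON response as a dict.

    The timeout value is an assumption, chosen to be generous.
    """
    api_key = os.environ["SYNTHEFY_TABULAR_API_KEY"]
    request = build_request(api_key, X_train, y_train, X_test)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling predict_remote(X_train, y_train, X_test)["predictions"] would then return the list of predicted values, given a valid key in SYNTHEFY_TABULAR_API_KEY.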

Response

{
  "task": "regression",
  "predictions": [0.36, 0.74],
  "usage": {
    "input_tokens": 16,
    "output_tokens": 2,
    "total_tokens": 18
  }
}
predictions contains one value per row in X_test. The usage field follows the OpenAI token counting convention: input_tokens counts every non-null value sent, and output_tokens is one per predicted row.
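The usage numbers in the sample response can be reproduced by hand: 8 values in X_train, 4 in y_train, and 4 in X_test give 16 input tokens, and 2 test rows give 2 output tokens. The sketch below does the same count in code. It assumes that y_train counts toward input_tokens (inferred from the sample numbers, not stated outright) and that a missing value is sent as JSON null, which is None in Python. The function names are hypothetical.

```python
def count_non_null(values):
    """Count entries that are not None in a flat or nested list."""
    total = 0
    for item in values:
        if isinstance(item, (list, tuple)):
            total += count_non_null(item)
        elif item is not None:
            total += 1
    return total


def expected_usage(payload):
    """Estimate the usage block for a request payload (illustrative only)."""
    input_tokens = sum(
        count_non_null(payload[key]) for key in ("X_train", "y_train", "X_test")
    )
    # One output token per row to predict.
    output_tokens = len(payload["X_test"])
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
    }
```

Applied to the request shown earlier, expected_usage returns 16 input tokens, 2 output tokens, and 18 total, matching the sample response.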
The first request after the model has been idle may take ~90 seconds while it warms up. Subsequent requests return in 1–2 seconds.
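This page does not say how a cold start appears to the client: the request may simply take longer, or it may time out. If timeouts do occur, a small retry wrapper is one way to absorb them. The following is a generic sketch, not an official client; call_with_retry and its defaults are hypothetical.

```python
import time


def call_with_retry(fn, attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call fn() and retry on OSError, waiting delay_seconds between tries.

    OSError covers TimeoutError and urllib.error.URLError. Note that it also
    covers urllib.error.HTTPError, so HTTP error statuses are retried as well.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except OSError as exc:
            last_error = exc
            # Wait only if another attempt remains.
            if attempt < attempts - 1:
                sleep(delay_seconds)
    raise last_error
```

Any zero-argument callable that performs the request can be passed as fn. A stricter version could re-raise authentication errors immediately instead of retrying them.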

Resources

White paper

Coming soon.

Product page

Coming soon.

GitHub

Source code for training, inference, and evaluation.

Hugging Face

Pretrained model weights.