→ AIQC is an open source framework for MLOps.

Accelerating scientific research with a simple API for data preprocessing, experiment tracking, & model evaluation.



[framework diagram]






→   Goodbye, boilerplate scripts (X_train, y_test).   Hello, object-oriented machine learning.



Low-Level API

Dataset()
Feature()
Label()
Splitset()
Encoder()
Algorithm()
Hyperparamset()
Job()
Queue()
Prediction()
etc.


High-Level API

Pipeline()
Experiment()
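
To make the object-oriented style concrete, here is a rough sketch of how the high-level classes could wrap the low-level ones. The class names come from the listing above; the import path, constructor arguments, and method names are assumptions for illustration only, not AIQC's documented signatures - consult the API reference for the real ones.

    # Illustrative sketch only: argument and method names below are assumed,
    # not taken from the AIQC reference documentation.
    from aiqc import Dataset, Pipeline, Experiment   # import path assumed

    dataset = Dataset.Tabular.from_path("iris.csv")  # hypothetical constructor

    # A Pipeline bundles the low-level objects (Feature, Label, Splitset, Encoder...)
    # so splits, folds, and encoders are registered once and reused consistently.
    pipeline = Pipeline(
        dataset=dataset,
        label_column="species",        # assumed parameter name
        size_test=0.20,                # assumed parameter name
        size_validation=0.10,          # assumed parameter name
    )

    # An Experiment sweeps an Algorithm over a Hyperparamset, queueing one Job
    # per combination and persisting the resulting metrics and Predictions.
    experiment = Experiment(
        pipeline=pipeline,
        fn_build=build_model,          # user-defined Keras/PyTorch model function
        hyperparameters={"units": [8, 16], "epochs": [25, 50]},  # assumed format
    )
    experiment.run()                   # assumed method name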


↳ How does AIQC compare to other experiment trackers?





→ Write 98% less code with AIQC's simple yet rigorous workflows.


                           Tabular                       Sequence                      Image
                           (2D: array, df, file,         (3D: files, channels,         (4D: multi image,
                           single-site time series)      multi-site time series)       grayscale video)

Classification             Keras (binary, multi)         Keras (binary, multi)         Keras (binary, multi)
(binary, multi)            PyTorch (binary, multi)       PyTorch (binary, multi)       PyTorch (binary, multi)

Quantification             Keras                         Keras                         Keras
(regression)               PyTorch                       PyTorch                       PyTorch

Forecasting                Keras                         Keras                         Keras
(multivariate              PyTorch                       PyTorch                       PyTorch
walk forward)

Supports the most popular data types, analytical use cases, and deep learning libraries.
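
For example, the "Keras (binary, multi)" cells above correspond to a model-building function like the one below, written in plain Keras. Only the Keras code itself is definitive here; the function signature is an illustrative convention, not a documented AIQC contract.

    # Plain Keras binary classifier of the kind the classification row above refers to.
    from tensorflow import keras
    from tensorflow.keras import layers

    def build_binary_classifier(feature_count: int, units: int = 16) -> keras.Model:
        model = keras.Sequential([
            layers.Input(shape=(feature_count,)),
            layers.Dense(units, activation="relu"),
            layers.Dropout(0.2),
            layers.Dense(1, activation="sigmoid"),   # single probability => binary output
        ])
        model.compile(
            optimizer="adam",
            loss="binary_crossentropy",
            metrics=["accuracy"],
        )
        return model

    # e.g. a model for 8 tabular features:
    model = build_binary_classifier(feature_count=8, units=16)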





→ Discover the insight hidden in your raw data with deep learning.


AIQC makes deep learning more accessible by solving the following data wrangling challenges:


  1. Preprocessing - Data must be encoded into a machine-readable format, yet most encoders don't handle multiple dimensions, columns, & types. Leakage occurs if splits/folds aren't encoded separately, and skipping a validation split biases evaluation. Which samples were actually used for training? (See the scikit-learn sketch after this list.)
  2. Experiment Tracking - Tuning parameters and architectures requires evaluating many training runs with metrics and charts, yet leading tools are designed around a single run and don't keep track of performance across runs. Validation splits and/or cross-validation folds compound these problems.
  3. Postprocessing - If the encoder-decoder pairs weren't saved, how should new samples be encoded and predictions decoded? Do new samples have the same schema as the training samples? Did the encoders spawn extra columns? Multiple encoders compound these problems.
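
As a point of reference, the split-aware encoding discipline described in items 1 and 3 looks roughly like this when done by hand with plain scikit-learn (no AIQC involved). Fitting encoders on the training split only, reusing them on the other splits, and keeping them around to decode predictions is the bookkeeping that otherwise has to be rewritten for every analysis.

    # Minimal sketch of split-aware encoding with scikit-learn (stand-in data).
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler, LabelEncoder

    X = np.random.rand(150, 4)                       # stand-in feature matrix
    y = np.random.choice(["a", "b", "c"], size=150)  # stand-in string labels

    # Hold out validation *and* test splits to avoid evaluation bias.
    X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.3, random_state=0)
    X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

    scaler = StandardScaler().fit(X_train)   # fit on the training split only => no leakage
    X_train_enc = scaler.transform(X_train)
    X_val_enc = scaler.transform(X_val)      # reuse, never re-fit, on the other splits
    X_test_enc = scaler.transform(X_test)

    labeler = LabelEncoder().fit(y_train)
    y_train_enc = labeler.transform(y_train)

    # Postprocessing: the saved encoder decodes predictions back to class names.
    fake_predictions = np.array([0, 2, 1])
    print(labeler.inverse_transform(fake_predictions))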


Adding to the complexity, different protocols are required based on: analysis type (e.g. categorize, quantify, generate), data type (e.g. spreadsheet, sequence, image), and data dimensionality (e.g. timepoints per sample).
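
In NumPy terms, those dimensionalities map onto array shapes like the following (sizes chosen arbitrarily for illustration):

    import numpy as np

    tabular  = np.zeros((100, 8))          # 2D: (samples, features)
    sequence = np.zeros((100, 30, 8))      # 3D: (samples, timepoints, features/channels)
    images   = np.zeros((100, 1, 28, 28))  # 4D: (samples, channels, height, width), grayscale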

The DIY approach of patching together custom code and toolsets for each analysis is not maintainable because it places the dual burden of data science and software engineering skillsets upon the research team.



Overview



→ Automated visualizations for evaluating each split & fold of every model.


visualizations.gif


Let's get started!


Use Cases & Tutorials