PFA Format
Category: Data Interchange Format
Portable Format for Analytics (PFA) is a JSON-based predictive model interchange format developed by the Data Mining Group (DMG). It is an emerging standard for statistical models and data transformation engines. It focuses on the scoring procedure and is intended to support big data processing.
Predictive Modeling Format
Statistical Algorithms for Analyzing Data
PFA provides a way for analytical applications to describe and exchange predictive models produced by statistical modeling and machine learning algorithms. It supports common predictive models, such as logistic regression and decision trees.
A PFA document is JSON- or YAML-formatted text that describes an executable called a scoring engine. Each engine has a well-defined input, a well-defined output and functions that combine inputs to construct the output, arranged as an expression-centric syntax tree.
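As a minimal sketch, the document below describes an engine that takes a double and returns that value plus 100. The "input", "output" and "action" fields follow the PFA document layout. The `evaluate` and `score` helpers are hypothetical: a toy interpreter written only for this illustration, covering just "+" and "*", and not a PFA implementation.

```python
import json

# A minimal PFA document: input and output are both doubles, and the
# action is a single expression that adds 100 to the input.
PFA_DOCUMENT = """
{
  "input": "double",
  "output": "double",
  "action": [
    {"+": ["input", 100]}
  ]
}
"""


def evaluate(expr, scope):
    """Toy evaluator (hypothetical, for illustration only).

    Handles numeric literals, symbol references and the binary
    functions "+" and "*"; a real PFA engine supports far more.
    """
    if isinstance(expr, (int, float)):
        return float(expr)
    if isinstance(expr, str):
        return scope[expr]  # symbol reference, e.g. "input"
    if isinstance(expr, dict) and len(expr) == 1:
        (name, args), = expr.items()  # {"function": [arg1, arg2]}
        left, right = (evaluate(arg, scope) for arg in args)
        if name == "+":
            return left + right
        if name == "*":
            return left * right
    raise ValueError(f"unsupported expression: {expr!r}")


def score(document, datum):
    """Evaluate each expression in "action"; the last value is the output."""
    result = None
    for expr in document["action"]:
        result = evaluate(expr, {"input": datum})
    return result


engine = json.loads(PFA_DOCUMENT)
print(score(engine, 3.0))
```

Because the whole engine is plain JSON, the same text can be handed to any conforming PFA implementation without translation.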
The specification defines a suite of mathematical and statistical functions for transforming data, but it does not define any means of communication with an operating system, file system or network. A PFA engine must be embedded in a larger system that has these capabilities and can manage data input-output pipelines.
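The division of labour between engine and host might look like the following sketch. Here `engine_action` is a hypothetical stand-in for a compiled PFA engine (it only computes), while `run_pipeline` is an assumed host-side loop that owns all reading and writing. In-memory streams replace real files or sockets so the example is self-contained.

```python
import io
import json


def engine_action(datum: float) -> float:
    """Stand-in for a PFA scoring engine: pure computation, no I/O."""
    return datum + 100.0


def run_pipeline(source, sink):
    """Host-side loop: the host, not the engine, owns the data streams."""
    for line in source:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        datum = json.loads(line)        # host deserializes the input
        result = engine_action(datum)   # engine only transforms data
        sink.write(json.dumps(result) + "\n")  # host serializes the output


# In-memory streams stand in for files, sockets or message queues.
source = io.StringIO("1.0\n2.5\n")
sink = io.StringIO()
run_pipeline(source, sink)
print(sink.getvalue())
```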
PFA combines the ease of portability across systems with algorithmic flexibility: models, pre-processing and post-processing are all functions that can be arbitrarily composed, chained or built into complex data mining workflows.
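The idea of composing pre-processing, a model and post-processing can be illustrated in ordinary Python. This is a sketch of the composition pattern only, not PFA syntax. The `compose` helper, the stage names, and the mean, scale and regression weights are all hypothetical values chosen for the example.

```python
import math
from functools import reduce


def compose(*stages):
    """Chain stages left to right into a single scoring function."""
    return lambda datum: reduce(lambda value, stage: stage(value), stages, datum)


def standardize(x):
    """Pre-processing: center and scale (illustrative mean 10, std 2)."""
    return (x - 10.0) / 2.0


def logistic(z):
    """Model: one-feature logistic regression with hypothetical weights."""
    intercept, weight = 0.0, 1.5
    return 1.0 / (1.0 + math.exp(-(intercept + weight * z)))


def to_label(p):
    """Post-processing: threshold the probability into a class label."""
    return "positive" if p >= 0.5 else "negative"


score = compose(standardize, logistic, to_label)
print(score(12.0))
print(score(8.0))
```

Each stage is an ordinary function, so stages can be reordered, replaced or reused in other workflows without changing the others.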