December 16, 2016
Version 2016.12.16.0 Released

MLDB is the Machine Learning Database. It’s the best way to get machine learning or AI into your applications or personal projects. Head on over to MLDB.ai to try it right now or see Running MLDB for installation details.

We’re happy to announce the immediate availability of MLDB version 2016.12.16.0.

This release contains 228 new commits that modified 823 files. On top of many bug fixes and performance improvements, here are some of the highlights of this release:

  • It’s now possible to make multiple predictions per REST call by using the /v1/functions/<function>/batch REST route.
  • New Identifying Biased Features Tutorial
  • New signal processing functions, including the fft(data [,direction='forward' [,type='real']]) function, which performs a fast Fourier transform on the given data.
  • Added the devices configuration argument to the tensorflow.graph function to specify on which device the graph is allowed to run.
  • MLDB now contains CUDA kernels for compute capabilities 5.2 and 5.3 (Maxwell), 6.0 (P100) and 6.1 (Titan X)
  • Improved support for aarch64 and ARM architectures. CUDA is now supported on the Jetson TX1.
  • It is now possible to track the progress of long-running procedures, as well as interrupt them. Check the Intro to Procedures page for more details.
  • New numerical functions:
    • sin(x), cos(x) and tan(x) are the normal trigonometric functions
    • asin(x), acos(x) and atan(x) are the normal inverse trigonometric functions
    • atan2(x, y) returns the two-argument arctangent of x and y, in other words the angle (in radians) between the positive x axis and the line from the origin to the point defined by x and y
    • sinh(x), cosh(x) and tanh(x) are the normal hyperbolic functions
    • asinh(x), acosh(x) and atanh(x) are the normal inverse hyperbolic functions
    • pi() returns the value of pi, the ratio of a circle’s circumference to its diameter, as a double precision floating point number.
    • e() returns the value of e, the base of natural logarithms, as a double precision floating point number.
  • New concat(x, ...) function that takes several embeddings with identical sizes in all but their last dimension and joins them together on the last dimension.
  • The import.json procedure now supports the arrays configuration argument to specify how arrays should be encoded in the JSON output.
  • The import.text procedure now returns a rowCount field representing the number of rows that were imported, just as the import.json procedure does.
  • Fixes to the import.text procedure:
    • Fixed an issue where trailing whitespace in a CSV file with numbers in the last column made MLDB treat that column as strings (because it kept the trailing whitespace)
    • Fixed a crash with a “cannot seek” exception when attempting to open a file with autoGenerateHeaders
  • The reshape() function now has a three-argument form. reshape(val, shape, newel) is similar to the two-argument version of reshape, but allows the number of elements to differ. If the number of elements increases, the new elements are filled in with the value of the newel parameter.
  • Updated the uap-core library to the latest version, improving the user agent patterns used by the http.useragent function.
  • Fixed wide rows causing data corruption in tabular datasets
  • Many fixes to the JOIN operators
  • The merge() dataset function now accepts a single dataset
  • Random forest speedups and improvements
  • The svd.train procedure now supports all select expressions to specify its input data, instead of the restricted form of select statements.
  • Fixed regexes being recompiled for every row when using a LIKE operator
  • The user function infrastructure has been modified to be more like built-in functions. In particular, inputs and outputs no longer need to be rows.
  • The row_dataset has been modified to return one row per column, and an atom_dataset construct has been added with semantics similar to the original row_dataset. The types of these datasets have been improved: the value type is now inferred, and the column type is now path, not string.
  • The sql.expression object has been improved to allow raw and autoInput parameters to be passed, bypassing the requirement for a row on output and input respectively.
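
To make the described behaviour of the three-argument reshape() and the new concat() more concrete, here is a minimal Python sketch that mimics their semantics on nested lists. It is an illustration only, not MLDB's implementation: the helper names reshape_fill and concat_last are hypothetical, elements are assumed to be laid out in row-major order, and dropping surplus elements when the new shape is smaller is an assumption that the notes above do not confirm.

```python
from itertools import chain, islice, repeat


def flatten(value):
    """Flatten a nested list into a flat list (row-major order assumed)."""
    if not isinstance(value, list):
        return [value]
    return [leaf for item in value for leaf in flatten(item)]


def build(flat_iter, shape):
    """Consume items from flat_iter to build a nested list of the given shape."""
    if len(shape) == 1:
        return list(islice(flat_iter, shape[0]))
    return [build(flat_iter, shape[1:]) for _ in range(shape[0])]


def reshape_fill(value, shape, newel):
    """Hypothetical stand-in for reshape(val, shape, newel).

    Pads with newel when the new shape needs more elements.
    Assumption: surplus elements are dropped when it needs fewer.
    """
    padded = chain(flatten(value), repeat(newel))
    return build(padded, list(shape))


def concat_last(*embeddings):
    """Hypothetical stand-in for concat(x, ...): join on the last dimension."""
    first = embeddings[0]
    if not isinstance(first[0], list):
        # Innermost dimension reached: append the vectors end to end.
        return [x for emb in embeddings for x in emb]
    if len({len(emb) for emb in embeddings}) != 1:
        raise ValueError("all dimensions except the last must match")
    # Recurse into matching sub-embeddings, one slice at a time.
    return [concat_last(*parts) for parts in zip(*embeddings)]
```

With these helpers, reshape_fill([[1, 2], [3, 4]], [2, 3], 0) gives [[1, 2, 3], [4, 0, 0]], and concat_last([[1, 2], [3, 4]], [[5], [6]]) gives [[1, 2, 5], [3, 4, 6]]. The MLDB documentation remains the authority on the exact behaviour of the real functions.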