January 24, 2017
Version 2017.01.24.0 Released

MLDB is the Machine Learning Database. It’s the best way to get machine learning or AI into your applications or personal projects. Head on over to MLDB.ai to try it right now or see Running MLDB for installation details.

We’re happy to announce the immediate availability of MLDB version 2017.01.24.0.

This release contains 123 new commits and modifies 1653 files. In addition to many bug fixes and performance improvements, here are some of the highlights of this release:

  • Significantly reduced query execution time by evaluating all constant expressions once at binding time instead of re-evaluating them for each row.
  • Added the blob_length(x) function that returns the length (in bytes) of the blob x.
  • Added the parse_exif(blob) function that takes a JPEG image blob and returns basic EXIF information from it.
  • Added the split_part(str, splitChars) function that splits the string str on the provided splitChars parameter and returns an embedding of all the resulting tokens.
  • The fetcher() function now works with UTF-8 paths.
  • Fixed the fetcher() function returning the wrong error code in cases where it should return a 404.
  • Fixed an issue with the fetcher() function that could make MLDB hang for a long time.
  • The number of lines returned by the /logs/mldb endpoint has been increased from 1024 to 8192 lines.
  • Improved the error message returned by the columnPathElement() function when using an out of bounds index.
  • It is now possible to transpose a row_dataset. It is also now possible to merge two row_dataset datasets together.
  • Fixed an issue where the WHERE clause would not be properly applied when used with a dataset of type UNION.
  • When running MLDB in a Docker container, if the mldb_runner process exits with a non-zero exit code, the docker run command will also exit with a non-zero exit code.
  • Added the atom return format for query execution. It returns a single atomic value, without the row name or the column name. The query fails if the result is anything other than a single row and a single column. This format is available when using the /v1/query endpoint or pymldb.
  • Logging improvements
  • Improved the handling of CUDA launch failures when using TensorFlow by changing the default behaviour from assert() to throwing an exception, so that these failures are recoverable.
  • Lowered the curl connect(2) timeout from 300s to 20s. This allows for ~3 SYN retransmits on a default Linux configuration. The idea is to avoid being stuck in connect(2) for too long while still having a chance of success when going through flaky networks.
  • Updated svdlibc to version 1.4 which includes bug fixes.
  • Fixed an issue with Python plugins where a POST to a route that returned no data with a 200 status code was reported by MLDB as a 404.
  • Fixed an issue when running a transform procedure with no input.
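
To give a feel for the binding-time optimization in the first highlight, here is a minimal Python sketch of the general idea of folding constant subexpressions once before any row is processed. It is an illustration only, not MLDB's implementation: the tuple-based expression format and the bind() and evaluate() helpers are hypothetical names invented for this example.

```python
# Illustrative sketch only; this is NOT how MLDB represents or binds expressions.
# Hypothetical expression format (nested tuples):
#   ("const", value), ("col", name), ("add" | "mul", left, right)
import operator

OPS = {"add": operator.add, "mul": operator.mul}


def bind(expr):
    """Fold constant subexpressions once, before any row is processed."""
    kind = expr[0]
    if kind in ("const", "col"):
        return expr
    left, right = bind(expr[1]), bind(expr[2])
    if left[0] == "const" and right[0] == "const":
        # Both operands are known at binding time: compute the result now.
        return ("const", OPS[kind](left[1], right[1]))
    return (kind, left, right)


def evaluate(expr, row):
    """Evaluate a (bound) expression against a single row."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "col":
        return row[expr[1]]
    return OPS[kind](evaluate(expr[1], row), evaluate(expr[2], row))


# x * (2 + 3): the (2 + 3) part does not depend on the row.
expr = ("mul", ("col", "x"), ("add", ("const", 2), ("const", 3)))
bound = bind(expr)  # the addition is folded into a single constant here

rows = [{"x": 1}, {"x": 2}, {"x": 3}]
results = [evaluate(bound, row) for row in rows]
```

After binding, the per-row work for this expression is one column lookup and one multiplication; the addition is no longer repeated for every row.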
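
The rule behind the atom return format (exactly one row and one column, otherwise the query fails) can be sketched in a few lines of Python. This is a client-side illustration of the rule only: the as_atom() helper and the list-of-dicts row layout are assumptions made for this example, not MLDB's or pymldb's actual API or response format.

```python
# Illustrative sketch of the "atom" rule; as_atom() and the row layout
# (a list of dicts mapping column name -> value) are hypothetical.
def as_atom(rows):
    """Return the single value of a one-row, one-column result, or fail."""
    if len(rows) != 1 or len(rows[0]) != 1:
        raise ValueError("atom format requires exactly one row and one column")
    (value,) = rows[0].values()
    return value


single = as_atom([{"count": 42}])  # one row, one column: returns the bare value
```

Passing a result with more than one row, more than one column, or no rows at all raises an error, mirroring the way the query itself fails in that situation.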