MLDB is the Machine Learning Database. It’s the best way to get machine learning or AI into your applications or personal projects. Head on over to MLDB.ai to try it right now or see Running MLDB for installation details.
We’re happy to announce the immediate availability of MLDB version 2016.08.31.0.
This release contains 114 new commits and modifies 366 files. On top of many bug fixes and performance improvements, here are some of the highlights of this release:
We’re very excited to present MLPaint, the Real-Time Handwritten Digit Recognizer, a web app that runs on MLDB. It was made by Jonathan, the awesome intern we had with us this summer. Check out the video demo below:
The two demos below go into the technical details of how this plugin was built. The plugin is hosted on Github if you want to check out the implementation.
Weighting examples correctly is a crucial part of training machine learning models that will generalize well. It can be used to compensate for sampling bias, class imbalance, etc. This is well supported for training in two ways:
weight column in the
equalizationFactor parameter that specifies the amount by which to adjust weights so that all classes have an equal total weight
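To make the idea of equalizing class weights concrete, here is a minimal Python sketch. MLDB's exact formula is not given in this post, so the function below is a hypothetical illustration: it assumes a factor between 0 and 1, where 0 leaves every example at weight 1 and 1 gives every class the same total weight.

```python
from collections import Counter


def equalized_weights(labels, equalization_factor=1.0):
    """Return one weight per example (hypothetical formula, not MLDB's code).

    Each example of a class with n members gets weight (1 / n) ** factor.
    """
    if not 0.0 <= equalization_factor <= 1.0:
        raise ValueError("equalization_factor must be between 0 and 1")
    counts = Counter(labels)
    return [(1.0 / counts[label]) ** equalization_factor for label in labels]


def total_weight_per_class(labels, weights):
    """Sum the weights of the examples belonging to each class."""
    totals = {}
    for label, weight in zip(labels, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return totals


# An imbalanced toy dataset: one "spam" example for three "ham" examples.
labels = ["spam", "ham", "ham", "ham"]
full = total_weight_per_class(labels, equalized_weights(labels, 1.0))
none = total_weight_per_class(labels, equalized_weights(labels, 0.0))
```

With a factor of 1 both classes end up with the same total weight, while a factor of 0 keeps the original 1-to-3 imbalance.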
Weights can also be useful for testing. For instance, the cost of making mistakes for certain examples can be much less than for others. Having the metrics take that into consideration will help deliver a clearer picture of the performance expectations you can have for the model.
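As a rough illustration of why weights matter at test time, the Python sketch below compares plain accuracy with a weighted version. It is a generic example, not MLDB's implementation; the function name and the toy data are made up for this purpose.

```python
def weighted_accuracy(y_true, y_pred, weights=None):
    """Fraction of the total weight that falls on correctly predicted examples."""
    if weights is None:
        # Without weights, every example counts equally.
        weights = [1.0] * len(y_true)
    if not (len(y_true) == len(y_pred) == len(weights)):
        raise ValueError("all inputs must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    correct = sum(w for t, p, w in zip(y_true, y_pred, weights) if t == p)
    return correct / total


y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]                    # the third example is misclassified
example_weights = [1.0, 1.0, 5.0, 1.0]   # and it is the costly one

unweighted = weighted_accuracy(y_true, y_pred)
weighted = weighted_accuracy(y_true, y_pred, example_weights)
```

Here the single mistake lands on the heaviest example, so the weighted score (3 out of a total weight of 8) is much lower than the unweighted one (3 out of 4).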
All the metrics reported by the classifier.test procedure now fully take the weight of each example into account. You can specify the weight of each example by using the weight column in the
MLDB makes it very easy to access secured resources using a variety of protocols like sftp or even s3. MLDB can store credentials and supply them transparently whenever required when accessing protected files.
We fixed an issue that caused problems when a credentials file was loaded from a remote resource when launching MLDB from the command line with the add-credentials-from-url flag. This is mostly used in production scenarios. Error messages related to the handling of credentials files were also improved so they’re clearer.
The pymldb library is an open-source, pure-Python module that makes it easy to work with MLDB from Python. Version 0.7.1 is a minor update that changes the way the query function sends requests to MLDB. Instead of passing the query in the query string, it now sends it in the JSON payload. This makes it possible to send big feature vectors without hitting the query-string size limit.
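The difference between the two ways of sending a query can be sketched with the Python standard library alone. This is not pymldb's actual code: the host, the /v1/query route and the "q" key are assumptions made for illustration, and nothing is sent over the network.

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080"  # hypothetical MLDB host


def as_query_string(sql):
    """Old style: the whole query is encoded into the URL."""
    return BASE_URL + "/v1/query?" + urlencode({"q": sql})


def as_json_payload(sql):
    """New style: the URL stays short and the query travels in the body."""
    url = BASE_URL + "/v1/query"
    body = json.dumps({"q": sql}).encode("utf-8")
    return url, body


# A query carrying a large feature vector (5000 made-up features).
features = ", ".join(f"{i * 0.5} AS f{i}" for i in range(5000))
sql = "SELECT " + features

long_url = as_query_string(sql)
short_url, body = as_json_payload(sql)
print(len(long_url), len(short_url), len(body))
```

The URL in the first form grows with the query, which is what runs into server and proxy limits on URL length. In the second form the URL has a fixed length no matter how many features are sent.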
Check out the Using pymldb Tutorial notebook for more info.
MLDB allows its functionality to be extended with plugins. While we often showcase Python plugins, like MLPaint mentioned at the top of this post, it’s also possible to write plugins in C++.
And so C++ plugin developers rejoice! It is now easier to take advantage of MLDB’s powerful SQL engine from C++ by using the new eval_sql function. It makes running queries easier and faster.
You can now also specify built-in functions by using SQL from C++. This allows for much more compact code and less boilerplate.
We’d like to shout out to ZzEeKkAa who developed a very nice Golang interface for MLDB. Check it out if you’re into Golang!
If you created a plugin or library that works with MLDB, make sure to reach out!
We have been hard at work on a new LiDAR MLDB plugin. This enables MLDB to process 3D point cloud data and do voxel rendering. It makes it possible to visualize raw and voxelized data from any point of view. Combined with our existing TensorFlow integration, it opens the door to solving cutting-edge deep learning image recognition problems with MLDB.
It is now also possible to build MLDB on 32-bit and 64-bit ARM architectures. This will enable us to target a wider range of hardware. Think of smartphones, Raspberry Pi, or even Nvidia’s Jetson TX1. This is a stepping stone toward having MLDB run on-device.
ln(dp or numeric): natural logarithm
log(dp or numeric): base 10 logarithm
log(b numeric, x numeric): logarithm to base b
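The semantics of these three signatures can be mirrored in a few lines of Python. This is only an illustration of the math described in the list above, not MLDB code; note that the two-argument form takes the base first, which is the reverse of Python's own math.log(x, base).

```python
import math


def ln(x):
    """Natural logarithm, as in ln(x)."""
    return math.log(x)


def log(*args):
    """log(x) is base 10; log(b, x) is the logarithm of x to base b."""
    if len(args) == 1:
        return math.log10(args[0])
    if len(args) == 2:
        b, x = args  # base comes first, matching the signature listed above
        return math.log(x) / math.log(b)
    raise TypeError("log() takes one or two arguments")


print(ln(math.e))   # natural logarithm of e
print(log(1000))    # base 10 logarithm of 1000
print(log(2, 8))    # logarithm of 8 to base 2
```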
/v1/datasets/<dataset_name>/query) has been removed. Use the