algorithmic predictions

“Big data” are not (more precisely: must not be allowed to be) merely vast piles of data that clever machines dig through on demand, extracting now one statistical conclusion, now another. According to some ambitious technologists, these “piles” also contain the raw material for predictions… Yet which correlations in the data can “give birth” to accurate forecasts? How can the digitally stored past be harnessed to represent the future reliably?

At MIT, in the US, they seem to be concerned with this issue as well. Last year they built some algorithms, fed them into the Data Science Machine, and entered it against 906 human teams in three separate “prediction” or “intuition” contests. With no preset time limit for producing the final “prediction”, the D.S.M. finished ahead of its human rivals in 615 of the 906 cases. In two of the three contests the machine’s predictions were 94 % and 96 % as accurate as the human ones. In the third contest its performance dropped to 87 %.

Not bad at all. “Very good” in fact, considering that the machine needed between 2 and 12 hours to reach a conclusion, while the human teams in some cases needed months. Kalyan Veeramachaneni, a researcher at MIT’s Computer Science and Artificial Intelligence Laboratory, said he was delighted: “What we have seen from our experience in solving various data science problems is that one of the critical steps is what is called feature engineering.”
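
As a rough illustration of what “feature engineering” means in practice (a minimal sketch, not the Data Science Machine’s actual method): raw records are collapsed into a few summary variables that a predictive model can then consume. The order data, the `engineer_features` name and the three chosen features are all hypothetical, invented here for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw records: (customer_id, order_amount).
# Invented for illustration; not taken from the MIT contests.
orders = [
    ("a", 20.0),
    ("a", 40.0),
    ("b", 15.0),
    ("a", 30.0),
    ("b", 5.0),
]


def engineer_features(rows):
    """Collapse raw per-order rows into one feature row per customer."""
    amounts = defaultdict(list)
    for customer_id, amount in rows:
        amounts[customer_id].append(amount)

    # Each customer is now described by a few aggregate "features"
    # instead of a variable-length list of raw orders.
    return {
        customer_id: {
            "order_count": len(values),
            "mean_amount": mean(values),
            "max_amount": max(values),
        }
        for customer_id, values in amounts.items()
    }


features = engineer_features(orders)
for customer_id, row in sorted(features.items()):
    print(customer_id, row)
```

Deciding which aggregates to compute (counts, averages, maxima, or something else entirely) is the step that human teams usually perform by hand.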

Feature engineering? Why not?

cyborg #05 – 02/2016