My latest article on performing predictive modeling using document databases is now available on IBM developerWorks. The abstract:
Predictive analytics relies on collecting data from many different sources, collating it, and then processing it through several stages into usable data. This involves recording and storing data in different formats, and may require translating information into Predictive Model Markup Language (PMML). Although the information is complex and structured, and often comes from traditional RDBMS sources, other solutions offer some advantages. We can use the recent range of document-based NoSQL databases to collate the information in a structured format while coping with the flexible structure of the individual data points. Many NoSQL environments also support extensive map-reduce-style queries and processing, which makes them well suited to summarizing large volumes of data. In this article, we’ll look at the transfer, exchange, and formatting of information in NoSQL environments.
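To give a flavor of the map-reduce-style summarization the abstract mentions, here is a minimal sketch in plain Python. It is not taken from the article and does not use any particular database's API; the documents, field names (`customer`, `amount`), and helper functions are all hypothetical, chosen only to show how records with differing structure can be reduced to one summary per key.

```python
from collections import defaultdict

# Hypothetical documents from different sources. Each one carries a
# slightly different set of fields, as a document store would allow.
documents = [
    {"source": "web", "customer": "c1", "amount": 20.0},
    {"source": "store", "customer": "c1", "amount": 5.5, "coupon": "SPRING"},
    {"source": "web", "customer": "c2"},  # no amount recorded
    {"source": "crm", "customer": "c2", "amount": 12.0, "region": "EU"},
]


def map_doc(doc):
    """Map step: emit (customer, amount) only when the document has an amount."""
    if "amount" in doc:
        yield doc["customer"], doc["amount"]


def reduce_values(values):
    """Reduce step: collapse all amounts for one customer into a summary."""
    return {"count": len(values), "total": sum(values)}


def map_reduce(docs):
    """Group the mapped values by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    return {key: reduce_values(values) for key, values in grouped.items()}


summary = map_reduce(documents)
print(summary)
```

The map step simply skips documents that lack the field being summarized, which is one way to cope with flexible structure without forcing every record into a fixed schema first.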