Tag: ibmdeveloperworks

  • Process home monitoring data using the Time Series Database in Bluemix

I keep a lot of information about my house – I have had sensors and recording units in various parts of my house for years, recording info through a variety of different devices. Over the years I’ve built a number of different solutions for storing and displaying the information, and when the opportunity came up to […]

  • Harvest machine data using Hadoop and Hive

A new article has been published on IBM developerWorks, looking at the basics of processing machine data using Hadoop: extracting the core data, storing it, and then determining the baselines and trigger points required to identify worrying trends. From the intro: Machine data can come in many different formats and quantities. […]
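The kind of baseline-and-trigger analysis described here can be sketched in a few lines of Python. This is my own illustration, not code from the article – the readings and the two-sigma threshold are hypothetical:

```python
# Illustrative sketch: establish a baseline from machine-data readings
# and flag points that drift beyond a trigger threshold.
from statistics import mean, stdev

def find_triggers(readings, sigma=2.0):
    """Return indices of readings more than `sigma` standard
    deviations away from the baseline mean."""
    baseline = mean(readings)
    spread = stdev(readings)
    return [i for i, r in enumerate(readings)
            if abs(r - baseline) > sigma * spread]

temps = [41.0, 40.5, 41.2, 40.8, 55.3, 41.1, 40.9]
print(find_triggers(temps))  # only the 55.3 spike is flagged: [4]
```

In practice the baseline would be computed over historical windows rather than a single batch, but the principle – compare each point against an established norm – is the same.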

  • Process complex text for information mining

My latest article on data mining text information is now available: Text — an everyday component of nearly all social interaction, social networks, and social sites — is difficult to process. Even the basic task of picking out specific words, phrases, or ideas is challenging. String searches and regex tools don’t suffice. But the Annotation […]

  • Building flexible apps from big data sources

    My article on how to build flexible apps on top of the BigInsights platform has been published. This demonstrates a cool way to combine some client-end JavaScript and existing technologies to build a Big Data query interface without developing a specialised application for the purpose. It’s no secret that a significant proportion of the needs […]

  • Process big data with Big SQL in InfoSphere BigInsights

The ability to write an SQL statement against your Big Data stored in Hadoop provides some much-needed flexibility. Sure, using Hive or HBase you can perform some of those operations, but there are other alternatives that may suit your needs better, such as the Big SQL utility. My latest article on this tool is […]

  • SQL to Hadoop and back again, Part 3: Direct transfer and live data exchange

The third and final article in my series on migrating data to and from Hadoop and SQL databases is now available: Big data is a term that has been used regularly now for almost a decade, and it — along with technologies like NoSQL — is seen as a replacement for the long-successful RDBMS solutions […]

  • SQL to Hadoop and back again, Part 2: Leveraging HBase and Hive

The second article in a series covering Big Data and SQL interaction is available now: “Big data” is a term that has been used regularly now for almost a decade, and it — along with technologies like NoSQL — is seen as a replacement for the long-successful RDBMS solutions that use SQL. Today, DB2®, Oracle, […]

  • SQL to Hadoop and back again, Part 1: Basic data interchange techniques

I’ve got a new article, the first in a new three-part series, on moving data between SQL and Hadoop, covering both exporting data to Hadoop and importing processed content back into an SQL store. In this first article, we look at the basic mechanics and considerations before you start the migration of data, such as […]
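To give a flavour of the most basic interchange mechanic – not code from the series itself – exporting SQL rows to a delimited file is often the lowest-common-denominator step before loading into HDFS. The table and column names here are made up for illustration:

```python
# Illustrative sketch: dump a SQL table to CSV, a format Hadoop tools
# (Hive external tables, MapReduce jobs, etc.) can ingest directly.
import csv
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                 [(1, "east", 120.0), (2, "west", 75.5)])

# Write each row out as one CSV line.
with open("sales.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for row in conn.execute("SELECT id, region, amount FROM sales"):
        writer.writerow(row)
```

Dedicated tools (Sqoop, for example) automate and parallelise this kind of transfer, but the underlying idea is the same row-to-file conversion.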

  • Data Mining in a Document World

As databases evolve, learning how to get the best out of the different solutions is key to understanding and extracting the data you need from your chosen data store. Document databases – like MongoDB, CouchDB, Couchbase Server, and many others – provide a completely different model and set of problems for […]
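One reason the model is so different: documents need not share a schema, so any mining pass has to tolerate missing fields. A toy sketch of my own (the documents and field names are hypothetical):

```python
# Illustrative sketch: aggregate a field across schema-free documents.
# Unlike a SQL column, the "tags" field may simply be absent.
docs = [
    {"title": "Post A", "tags": ["hadoop", "sql"]},
    {"title": "Post B"},                      # no tags field at all
    {"title": "Post C", "tags": ["nosql"]},
]

# Count tag occurrences, defaulting to an empty list where absent.
counts = {}
for doc in docs:
    for tag in doc.get("tags", []):
        counts[tag] = counts.get(tag, 0) + 1

print(counts)  # {'hadoop': 1, 'sql': 1, 'nosql': 1}
```

In a real document store this aggregation would run server-side (map/reduce or an aggregation pipeline), but the shape of the problem – defensive extraction over irregular records – is the same.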

  • Data Mining Techniques

I have a new article on the basics of data mining techniques, to help you better understand the key principles behind the different methods of data mining. From the abstract: Many different data mining, query model, proce…